I have a list of keywords (sometimes with non-alphanumeric characters) that I’d like to find in a list of files. I can do that with the code below, but I want to avoid matching keywords if they are found inside another word, e.g.:

Lo.rem <-- Match if neither prefixed nor suffixed by a letter
is <-- Same
simply) <-- Match if not prefixed by a letter
printing. <-- Same
(text <-- Match if not suffixed with a letter
-and <-- Same


What I have tried:

Here's my code so far if useful:

$keywords = (Import-Csv "C:\Keywords.csv" | Where-Object Keywords).Keywords | ForEach-Object { [regex]::Escape($_) } # Import the keywords and escape every regex metacharacter
$paths = (Import-Csv "C:\Files.csv" | Where-Object Files).Files # Import the list of files to search for the keywords
$count = 0

ForEach ($path in $paths) {
    $file = [System.IO.FileInfo]$path
    $outFile = "C:\Matches\$($count)__$($file.BaseName)_Matches.txt"
    Add-Content -Path $outFile -Value $file.FullName # Create the output file in C:\Matches and write the path of the file being searched

    $hash = @{}
    Get-Content $file.FullName |
        Select-String -Pattern $keywords -AllMatches |
        ForEach-Object { $_.Matches.Value } |
        ForEach-Object { if ($null -eq $hash.$_) { $_ }; $hash.$_ = 1 } | # Pass each keyword through only the first time it is seen, so duplicates are dropped
        Out-File -FilePath $outFile -Append -Encoding UTF8 # Append the keywords that were found to the output file
    $count = $count + 1
}

I’ve tried playing with regex negative lookahead/lookbehind but did not get anywhere, especially since I’m a beginner in PowerShell, e.g.:

Select-String -Pattern "(?<![A-Za-z])$($keywords)(?![A-Za-z])" -AllMatches

Any suggestions? Much appreciated
Updated 12-Jul-22 4:40am
Peter_in_2780 11-Jul-22 18:47pm    
I don't know about PowerShell's regex engine, but all the others I know have atoms that match the edges of words. For example, "man grep" includes the following:
The Backslash Character and Special Expressions
The symbols \< and \> respectively match the empty string at the beginning and end of a word. The symbol \b matches the empty string at
the edge of a word, and \B matches the empty string provided it's not at the edge of a word. The symbol \w is a synonym for [_[:alnum:]]
and \W is a synonym for [^_[:alnum:]].
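For anyone trying this in PowerShell: the .NET regex engine it uses supports \b, \B, \w and \W, but not grep's \< and \> (there, a backslash before < or > just matches the literal character). A minimal sketch of how \b behaves, using made-up sample strings that are not from the question:

```powershell
# Sample strings are illustrative only.
$standalone = 'this is simple' -match '\bis\b'   # True: "is" stands alone as a word
$embedded   = 'this' -match '\bis\b'             # False: "is" only occurs inside "this"
$underscore = 'is_valid' -match '\bis\b'         # False: "_" counts as a word character, so there is no boundary after "is"
```

Note that \b is defined in terms of \w, so digits and underscores count as part of a word, which is slightly broader than "letter" as described in the question.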
Soskipic 12-Jul-22 10:46am    
@Peter_in_2780 Thanks a lot for your suggestion. I've received an answer on Stack Overflow which works fine.
Hope someone else in a similar situation may make use of your suggestion.
Soskipic 12-Jul-22 15:20pm    
I've actually ended up combining both your suggestion and Stack Overflow's to come up with the exact desired outcome.

Select-String -Pattern "\b($($keywords -join '|'))\b" -AllMatches

1 solution

Select-String -Pattern "\b($($keywords -join '|'))\b" -AllMatches
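
One caveat that may be worth checking against your own data: \b only matches between a word character and a non-word character, so a keyword that itself starts or ends with punctuation, such as (text, -and or simply), will not match when it is surrounded by spaces. The lookaround attempt from the question avoids that, provided the keywords are joined with | inside a group (interpolating the array directly joins the elements with spaces instead). Below is a rough sketch comparing the two; the keyword list and the sample sentence are invented for illustration, and [regex]::Escape is used here in place of the -replace escaping from the question.

```powershell
# Hypothetical keywords and sample text, for illustration only.
$rawKeywords = 'Lo.rem', 'is', 'simply)', '(text', '-and'
$escaped = $rawKeywords | ForEach-Object { [regex]::Escape($_) }

# Lookarounds: the keyword must not touch a letter on either side.
$lookaround = "(?<![A-Za-z])(?:$($escaped -join '|'))(?![A-Za-z])"
# Word boundaries, as in the solution above, for comparison.
$boundary = "\b(?:$($escaped -join '|'))\b"

$sample = 'This is (text simply) here -and Lo.rem, but island and Lo.remix are skipped.'

$lookaroundFound = ($sample | Select-String -Pattern $lookaround -AllMatches).Matches.Value
$boundaryFound   = ($sample | Select-String -Pattern $boundary -AllMatches).Matches.Value

$lookaroundFound   # is, (text, simply), -and, Lo.rem
$boundaryFound     # is, Lo.rem
```

If every keyword begins and ends with a letter or digit, the two patterns should behave the same for this purpose, and the \b version is shorter.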
