Assuming:
1. there may be more than one match present in the file
1.a. it is necessary to find all matches
2. it is better practice to read the file line-by-line: you might wish to halt reading the file once you've found a match, or the file may be quite large.
3. a goal is to minimize "high cost" operations, such as calling Split on a string.
4. it is more probable that the word preceding the match is in the same line as the match, rather than at the end of the previous line.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
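private readonly char[] splitChar = new char[] {' '};
private int matchLength;
private int matchIndex;

// Prints the word that precedes the first occurrence of matchMarker in each line.
private void ScanFileForMatches(string filePath, string matchMarker)
{
    string[] splitLine = null;
    string currentLine = null;
    string previousLine = null;
    string matchString = null;

    matchLength = matchMarker.Length;

    using (StreamReader sr = new StreamReader(filePath))
    {
        while (! sr.EndOfStream)
        {
            currentLine = sr.ReadLine();

            // cheap test first: only lines that contain the marker are split
            if (currentLine.Contains(matchMarker))
            {
                if (matchMarker != currentLine.Substring(0, matchLength))
                {
                    // marker is inside the line: the preceding word is in the same line
                    splitLine = currentLine.Split(splitChar);
                    matchIndex = Array.FindIndex(splitLine, str => str == matchMarker) - 1;
                }
                else if (previousLine != null)
                {
                    // marker starts the line: use the last word of the previous line
                    splitLine = previousLine.Split(splitChar);
                    matchIndex = splitLine.Length - 1;
                }
                else
                {
                    // marker is the first word in the file: there is no preceding word
                    matchIndex = -1;
                }

                // a negative index also covers a marker that appears only as part of
                // a longer token (e.g. "called,"), where FindIndex returns -1
                if (matchIndex >= 0)
                {
                    matchString = splitLine[matchIndex];
                    Console.WriteLine("word before match: {0} position of word before match in line: {1}", matchString, matchIndex);
                }
            }

            previousLine = currentLine;
        }
    }
}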
Disclaimer:
1. I've tested this with the match-marker string "called" and the following text file:
once upon a time
the man named Patrick
called the pest-control operator;
and he kept talking
until Maria called
and, so forth and so on,
but, then, David called to say
that
If he ever called Maria again,
that she would say she never called
him back.
It handled all of these cases correctly. However, be advised that this was extracted from code written years ago and quickly adapted for educational purposes; you should test it thoroughly before using it in production code. I take no responsibility for the performance of this code. If you find a bug in it, of course I'd like to know!
Possible to-do's:
1. an obvious omission here is handling a match-marker string that has a punctuation mark attached to it in the text: "called." "called," ... etc.
2. how about making the search work for any upper- or lower-case variation of the match-marker string?
3. what if the match-marker string is the first word in the file? Throw an error?
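One possible way to approach these to-do's is sketched below. It is not the code tested above: the class name WordBeforeMatch, the method FindWordsBefore, and the punctuation set are all made up for illustration. It trims punctuation from each word, compares case-insensitively, and simply skips a marker that is the first word of the input rather than throwing. Note the trade-off: it splits every line, so it gives up assumption 3 in exchange for simplicity, and it also reports every match in a line rather than only the first.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class WordBeforeMatch
{
    // Assumed punctuation set, trimmed from both ends of each word; extend as needed.
    private static readonly char[] Punctuation =
        { '.', ',', ';', ':', '!', '?', '"', '\'', '(', ')' };

    private static readonly char[] Separators = { ' ', '\t' };

    // Returns the word preceding each occurrence of marker, in reading order.
    // A marker that is the very first word of the input has no predecessor
    // and is skipped (one possible answer to to-do 3).
    public static List<string> FindWordsBefore(TextReader reader, string marker)
    {
        var results = new List<string>();
        string previousWord = null; // last word seen, carried across line breaks
        string line;

        while ((line = reader.ReadLine()) != null)
        {
            foreach (string token in line.Split(Separators, StringSplitOptions.RemoveEmptyEntries))
            {
                string word = token.Trim(Punctuation);
                if (word.Length == 0) continue; // token was punctuation only

                // case-insensitive comparison (to-do 2) on the trimmed word (to-do 1)
                if (previousWord != null &&
                    string.Equals(word, marker, StringComparison.OrdinalIgnoreCase))
                {
                    results.Add(previousWord);
                }
                previousWord = word;
            }
        }
        return results;
    }
}
```

To run it against a file, pass a StreamReader opened on the file path, wrapped in a using block as in the method above; a StringReader works the same way for quick experiments.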