The way I would do it is different: I'd create a "mapping" file which contained indexes into the big file. Probably two sets of indexes.
The first set would be the indexes and lengths of each sentence (plus a "Sentence ID" value to uniquely identify the sentence).
The second set would be each word. Or rather, each different word, with the Sentence ID it appears in, and the offset within the sentence.
So entries might look like this:
the 1,0;1,93;3,0;3,116;4,0;5,37;5,72;5,90
way 1,4
i 1,8
...
for the words in the sentences that start this reply.
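A minimal sketch of building that second mapping (the word index) in Python, assuming sentences are delimited by periods and ignoring real-world tokenization details like punctuation and encodings; the function name `build_index` is just for illustration:

```python
from collections import defaultdict

def build_index(text):
    # sentences: sentence_id -> (start offset in file, length)
    # words: word -> list of (sentence_id, offset within sentence)
    sentences = {}
    words = defaultdict(list)
    pos = 0
    for sid, sentence in enumerate(text.split('.'), start=1):
        sentences[sid] = (pos, len(sentence))
        search_from = 0
        for word in sentence.split():
            # locate this occurrence of the word within the sentence
            off = sentence.find(word, search_from)
            words[word.lower().strip(',;:')].append((sid, off))
            search_from = off + len(word)
        pos += len(sentence) + 1  # +1 for the '.' delimiter
    return sentences, words
```

Running it over the opening words of this reply produces entries like `the -> [(1, 0), ...]` and `way -> [(1, 4)]`, matching the layout shown above.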
When you want to find a word, you just look it up in the "Words" mapping table. It tells you immediately whether the word is in the text at all, and if so, which sentence(s) it appears in and at what offset within each sentence.
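The lookup itself is then a single dictionary access. A sketch, using a small hand-written index in the format described above (the entries here are illustrative, not real offsets):

```python
# Hypothetical word index as described above:
# word -> list of (sentence_id, offset within sentence)
word_index = {
    "the": [(1, 0), (1, 93), (3, 0)],
    "way": [(1, 4)],
}

def find_word(word, index):
    # O(1) average-case lookup: returns every occurrence,
    # or an empty list when the word is not in the text
    return index.get(word.lower(), [])
```

So `find_word("The", word_index)` returns all three occurrences at once, with no scan of the original file.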
Since the number of distinct words in any language is fairly small (the average active vocabulary of a native English speaker is around 20,000 words), this should be much quicker and easier to work with than scanning the whole file each time, and the index only needs to be rebuilt (or updated) when your input text file changes.