Start by extracting the code to "chunk" the file into a separate iterator method. You can also get rid of the StringBuilder, since you never read its contents.
NB: If you want to process the lines in parallel, you'll need a new list for each chunk.
Private Shared Iterator Function ChunkFile(ByVal reader As StreamReader) As IEnumerable(Of List(Of String))
    Dim chunkLines As New List(Of String)()
    While Not reader.EndOfStream
        Dim line As String = reader.ReadLine()
        ' A line containing "INDI" starts a new chunk, so emit the previous one first.
        If line.Contains("INDI") Then
            If chunkLines.Count <> 0 Then
                Yield chunkLines
                ' Allocate a fresh list; the yielded one may still be in use by another thread.
                chunkLines = New List(Of String)()
            End If
        End If
        chunkLines.Add(line)
    End While
    ' Emit the final chunk, if any.
    If chunkLines.Count <> 0 Then
        Yield chunkLines
    End If
End Function
Then use a Parallel.ForEach loop to process the chunks:
Using reader As New StreamReader(fileName)
    Dim chunks As IEnumerable(Of List(Of String)) = ChunkFile(reader)
    Parallel.ForEach(chunks, Sub(chunkLines) ProcessChunk(chunkLines))
End Using
That should significantly reduce the number of allocations compared to your second code block.
You'll then need to profile your code to see where the bottleneck is. If the ProcessChunk method is fairly quick, then the overhead of making the code multi-threaded could outweigh any potential benefits.
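As a rough first check before reaching for a full profiler, you could time a sequential loop against Parallel.ForEach with a Stopwatch. The sketch below is only illustrative: the ProcessChunk body, the TimeIt helper and the synthetic chunks are all placeholders I've made up so that it runs on its own, so substitute your real method and the output of ChunkFile.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Diagnostics
Imports System.Linq
Imports System.Threading.Tasks

Module ChunkTiming
    ' Hypothetical stand-in for the real per-chunk work.
    Private Sub ProcessChunk(chunkLines As List(Of String))
        Dim total As Integer = 0
        For Each line As String In chunkLines
            total += line.Length
        Next
    End Sub

    ' Hypothetical helper: runs the delegate once and returns the elapsed time.
    Private Function TimeIt(work As Action) As TimeSpan
        Dim watch As Stopwatch = Stopwatch.StartNew()
        work()
        watch.Stop()
        Return watch.Elapsed
    End Function

    Sub Main()
        ' Synthetic data (assumption): 1000 chunks of 50 identical lines each.
        Dim chunks As List(Of List(Of String)) =
            Enumerable.Range(0, 1000).
                Select(Function(i) Enumerable.Repeat("0 @I" & i.ToString() & "@ INDI", 50).ToList()).
                ToList()

        Dim sequentialTime As TimeSpan = TimeIt(Sub()
                                                    For Each chunk In chunks
                                                        ProcessChunk(chunk)
                                                    Next
                                                End Sub)

        Dim parallelTime As TimeSpan = TimeIt(Sub() Parallel.ForEach(chunks, Sub(chunk) ProcessChunk(chunk)))

        Console.WriteLine($"Sequential: {sequentialTime.TotalMilliseconds:F1} ms")
        Console.WriteLine($"Parallel:   {parallelTime.TotalMilliseconds:F1} ms")
    End Sub
End Module
```

A single run like this is noisy (JIT compilation and thread-pool warm-up affect the first measurement), so repeat it a few times and use a representative input file before drawing conclusions.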