When the XML file size is above 250 MB, loading it fails with an out-of-memory exception.
We are using the code below.

Dim Final as string
Final = File.ReadAllText(sFTPLogXml)
newFinal = Replace(Final, "&", "&amp;")


Please let me know if there is any other way to load large XML files.

What I have tried:

I have tried XmlTextReader and StreamReader.
Comments
Kevin Marois 10-Feb-16 15:02pm
   
What are you doing with the file after it loads? Can you load only part of the file?
Dave Kreskowiak 10-Feb-16 16:05pm
   
You're going to have to go down a different route, because you're not going to get an XML file that large loaded all at once.

What were you trying to do with the file once it was loaded?
Sergey Alexandrovich Kryukov 10-Feb-16 16:46pm
   
No matter how you try it, there are always sizes that cannot fit in memory. In those cases, you simply should not load the file into memory. What to do instead depends on the purpose of this file and what you want to achieve. That said, the question does not really make much sense.
—SA
Sinisa Hajnal 11-Feb-16 3:10am
   
Consider what you need from the file. Load only parts you really need.
Member 12319280 11-Feb-16 14:52pm
   
All the FTP logs are stored in the XML file. Once it reads the XML, it writes to the log file. Below is the code.
'newFinal = Replace(Final, "&", "&amp;") 'Added by SSIG on 10/29/2013 to read xml file

'Added By CTS 10072014 start to replace hexadecimal character
newFinal = Replace(newFinal, "", "") '


Dim sbOutput As System.Text.StringBuilder = New System.Text.StringBuilder()
Dim ch As Char
For chCount As Integer = 0 To newFinal.Length - 1
ch = newFinal(chCount)

If (Asc(ch) >= System.Convert.ToUInt32("0x0020", 16) AndAlso Asc(ch) <= System.Convert.ToUInt32("0xD7FF", 16)) Or (Asc(ch) >= System.Convert.ToUInt32("0xE000", 16) AndAlso Asc(ch) <= System.Convert.ToUInt32("0xFFFD", 16)) Or Asc(ch) = System.Convert.ToUInt32("0x0009", 16) Or Asc(ch) = System.Convert.ToUInt32("0x000A", 16) Or Asc(ch) = System.Convert.ToUInt32("0x000D", 16) Then
sbOutput.Append(ch)
End If
Next

newFinal = sbOutput.ToString()
'Added By CTS 10072014 end


'Write the above content into new file
'Code commented by CTS 09302014 Start
'TS = FSO.OpenTextFile(sLogCopy, IWshRuntimeLibrary.IOMode.ForWriting, True) '-sLogCopy => Copying File
'TS.Write(newFinal) 'Added by SSIG on 10/29/2013 to write copy of xml file
'TS.Close()
'Code commented by CTS 09302014 End

'Code added by CTS 09302014
File.WriteAllText(sLogCopy, newFinal)

TS = Nothing
FSO = Nothing

''End

lFilesize = FileLen(sTmpPath & sTmpXmlFile & sTmpFileExt)
colFTPLine = New colItems 'Initialize

'Code commented by CTS 10072014 start
'Do While (i < lFilesize)
' i = FileLen(sLogCopy)
'Loop
'oXML.load(sLogCopy)

'Code commented by CTS 10072014 end


'Code Added By CTS 09212014 Start
'This True block reads XML file and Constructs colFTPLine collection

Dim ds As DataSet

'Try
ds = New System.Data.DataSet()
ds.ReadXml(sLogCopy)

This code is set up to consume the entire file and then process it. We should consider making a few adjustments to the logic so that it consumes only smaller chunks, leaving much of the source unchanged. This code is in production and was developed before 2010.
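
The character filter in the code above can run block by block, so the whole file never has to sit in one string. Here is a minimal C# sketch of that idea. The names `XmlLogSanitizer`, `IsKeptChar` and `Sanitize`, and the 64 KB default block size, are assumptions made for illustration and are not part of the original code. It keeps the same character ranges as the loop above, which means characters above U+FFFF (stored as surrogate pairs) are dropped here too.

```csharp
using System.IO;
using System.Text;

public static class XmlLogSanitizer
{
    // Same ranges as the original loop: tab, LF, CR,
    // U+0020..U+D7FF and U+E000..U+FFFD. Everything else is dropped.
    public static bool IsKeptChar(char c)
    {
        return c == '\t' || c == '\n' || c == '\r'
            || (c >= '\u0020' && c <= '\uD7FF')
            || (c >= '\uE000' && c <= '\uFFFD');
    }

    // Copies text from reader to writer one block at a time,
    // so memory use is bounded by blockSize rather than the file size.
    public static void Sanitize(TextReader reader, TextWriter writer, int blockSize = 64 * 1024)
    {
        char[] buffer = new char[blockSize];
        var kept = new StringBuilder(blockSize);
        int charsRead;

        // Read returns 0 once the end of the input is reached
        while ((charsRead = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            kept.Clear();
            for (int i = 0; i < charsRead; i++)
            {
                if (IsKeptChar(buffer[i]))
                {
                    kept.Append(buffer[i]);
                }
            }
            writer.Write(kept.ToString());
        }
    }
}
```

To use it on files, wrap the source in a `StreamReader` and the copy in a `StreamWriter` (for example over `sFTPLogXml` and `sLogCopy`) and pass both to `Sanitize`. The filter looks at one character at a time, so a block boundary cannot change the result.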

Solution 1

If all you're trying to do is replace & with &amp;, then you don't need to read the whole file at once.

You only need to read it in manageable blocks.

I haven't tested the following code, but it should give you the gist of what you need to do to limit the impact of working with large files.

Once it is working, experiment with the block size and benchmark the performance so you can get your result in the best time.

string inputFile = "myFile.xml";
string tempFile = "tmp.xml";

//block size in characters; tune and benchmark this value
const int blockSize = 64 * 1024;
char[] buffer = new char[blockSize];

//open input and output files as text; StreamReader decodes across block
//boundaries, so a multi-byte UTF-8 character is never split in half,
//and the using blocks close both files even if an exception is thrown
using (var reader = new System.IO.StreamReader(inputFile, System.Text.Encoding.UTF8))
using (var writer = new System.IO.StreamWriter(tempFile, false, new System.Text.UTF8Encoding(false)))
{
  int charsRead;

  //read input block; Read returns 0 at end of file
  while ((charsRead = reader.Read(buffer, 0, buffer.Length)) > 0)
  {
    //do text replace on only the characters actually read
    string block = new string(buffer, 0, charsRead).Replace("&", "&amp;");

    //write output block
    writer.Write(block);
  }
}

//delete original
System.IO.File.Delete(inputFile);
//rename temp file
System.IO.File.Move(tempFile, inputFile);
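
Once the file has been cleaned up, the reading side can avoid loading everything as well, along the lines the comments suggest. Here is a rough sketch, assuming each log record is one repeated element. The class `XmlLogStreaming`, the method `StreamElements`, and the element names `entry` and `file` in the usage note are hypothetical, since the real schema isn't shown in the question. It uses `XmlReader` together with `XNode.ReadFrom`, so only one record is in memory at a time.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class XmlLogStreaming
{
    // Yields each element named elementName, one at a time,
    // without building the whole document in memory.
    public static IEnumerable<XElement> StreamElements(TextReader source, string elementName)
    {
        using (XmlReader reader = XmlReader.Create(source))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                {
                    // ReadFrom loads just this element and leaves the
                    // reader positioned on the node that follows it
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }
}
```

For example, `foreach (var e in XmlLogStreaming.StreamElements(new StreamReader("myFile.xml"), "entry"))` would visit each record in turn, and `(string)e.Element("file")` would read one child value from it. The loop calls `Read` only when the current node is not a match, because `ReadFrom` has already advanced the reader and an extra `Read` could skip a record that follows immediately.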
   

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



