Because you are using Parallel.ForEach(), the idea of the XML file "currently" being processed is not really meaningful. Depending on how much parallelism Parallel.ForEach achieves, several XML files could be in different stages of processing at the same time.
For starters, you load each file and then never refer to that filename again.
Should the file be "moved to another location" after the file load is completed? Or not until ReadJobsFromFeed() has completed?
If the former, then move the file inside the Parallel.ForEach() body, right after the Load. If the latter, then move it after ReadJobsFromFeed returns.
It might be simpler if ReadJobsFromFeed took the filename as its argument and loaded the XmlDocument itself. Then it could move the XML file whenever it was appropriate:
Parallel.ForEach(filenames, ReadJobsFromFeed);
and
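void ReadJobsFromFeed(string filename)
{
    XmlDocument xmlDocument = new XmlDocument();
    // Dispose the reader so the file handle is released before any move.
    using (XmlReader reader = XmlReader.Create(filename))
    {
        xmlDocument.Load(reader);
    }
    // ... process xmlDocument, then move the file when appropriate ...
}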
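To make the "latter" option concrete, here is a minimal sketch that moves the file only after processing has finished. The FeedProcessor class, the ProcessedDirectory folder, and the ProcessAll helper are hypothetical names introduced for illustration; they are not from your code, and the processing step itself is left as a placeholder.

```csharp
using System.IO;
using System.Threading.Tasks;
using System.Xml;

static class FeedProcessor
{
    // Hypothetical destination folder (an assumption, not from the original code).
    public static string ProcessedDirectory = @"D:\xmljobs\processed";

    public static void ReadJobsFromFeed(string filename)
    {
        var xmlDocument = new XmlDocument();
        using (XmlReader reader = XmlReader.Create(filename))
        {
            xmlDocument.Load(reader);
        }

        // ... existing processing of xmlDocument goes here ...

        // The reader has been disposed, so the file handle is released.
        // CreateDirectory does nothing if the folder already exists.
        Directory.CreateDirectory(ProcessedDirectory);
        string destination = Path.Combine(ProcessedDirectory, Path.GetFileName(filename));

        // File.Move throws if the destination file already exists.
        File.Move(filename, destination);
    }

    public static void ProcessAll(string sourceDirectory)
    {
        // Only the top-level directory is searched, so files that were
        // already moved into a subfolder are not picked up again.
        string[] filenames = Directory.GetFiles(sourceDirectory, "*.xml");
        Parallel.ForEach(filenames, ReadJobsFromFeed);
    }
}
```

Since each iteration only touches its own file, no extra locking should be needed for the move itself. If you want the "former" behavior instead, the three move lines would simply go directly after the using block.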
Other issues and questions about your code:
1. The XmlReader is not disposed correctly (see above, or):
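// In a verbatim string (@"..."), backslashes are not escaped, so use a single one.
var filenames = Directory.GetFiles(@"D:\xmljobs", "*.xml");
Parallel.ForEach(filenames, filename =>
{
    XmlDocument document = new XmlDocument();
    // The using block disposes the reader as soon as the load completes.
    using (XmlReader reader = XmlReader.Create(filename))
    {
        document.Load(reader);
    }
    ReadJobsFromFeed(document);
});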
2. channel is being set to the LAST node with a Name of WEBHARVY_DATA. Is that really what you want?
3. Several of the steps in ReadJobsFromFeed would be much simpler if implemented using LINQ. E.g., instead of:
int num = 0;
for (int i = 0; i < channel.ChildNodes.Count; i++)
{
    if (channel.ChildNodes[i].Name == "item")
    {
        num++;
    }
}
use:
int num = channel.ChildNodes.Cast<XmlNode>().Count(cn => cn.Name == "item");
The other loops can be simplified in the same way.
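On point 2, LINQ can also make the first-versus-last choice explicit instead of leaving it as a side effect of a loop. The following is a minimal sketch, assuming channel is picked from the child nodes of some parent node; the ChannelLookup class and its method names are hypothetical, since the original loop is not shown here.

```csharp
using System.Linq;
using System.Xml;

static class ChannelLookup
{
    // Returns the first child whose Name matches, or null if there is none.
    public static XmlNode FirstNamed(XmlNode parent, string name) =>
        parent.ChildNodes.Cast<XmlNode>().FirstOrDefault(n => n.Name == name);

    // Returns the last child whose Name matches, or null if there is none.
    // This mirrors what a loop that keeps overwriting channel ends up with.
    public static XmlNode LastNamed(XmlNode parent, string name) =>
        parent.ChildNodes.Cast<XmlNode>().LastOrDefault(n => n.Name == name);
}
```

If exactly one WEBHARVY_DATA node is expected, SingleOrDefault would throw when there is more than one match, which may be the more honest choice.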