Validate text written to a text file

Question

I need some advice on how to handle an unusual situation with one of my VB.NET apps. This app creates a new text file which is basically a subset of a different text file. The app works as follows:
• Copy source text file from a network drive to the local computer's drive
• Open copied source text file for reading
• Open a new text file for writing
• Read source file line #1 and store results in a variable then write what is in the variable to the new text file
• Repeat the above step X times
• Read source text file line by line and search for a line that contains “Y”
• Copy that line to the new text file
• Continue reading the source text file line by line and copying data to the new text file until it finds a line that contains “Z”
• Close both text files
• Move new text file to a different folder
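
The extraction steps above can be sketched as follows. This is a minimal illustration in Python rather than the original VB.NET code, which is not shown in the question. The names `extract_subset`, `header_count`, `start_marker` and `end_marker` are hypothetical, and the sketch assumes the line containing "Z" is not itself copied, which the description leaves open.

```python
from typing import Iterable, List


def extract_subset(lines: Iterable[str], header_count: int,
                   start_marker: str, end_marker: str) -> List[str]:
    """Return the first header_count lines plus the block that starts at
    the first line containing start_marker and runs up to end_marker."""
    it = iter(lines)
    out: List[str] = []

    # Copy the first header_count lines unchanged ("repeat X times").
    for _ in range(header_count):
        line = next(it, None)
        if line is None:
            return out  # source ended before all header lines were read
        out.append(line)

    # Read on until a line contains the start marker ("Y") and copy it.
    for line in it:
        if start_marker in line:
            out.append(line)
            break
    else:
        return out  # start marker never found

    # Keep copying until a line contains the end marker ("Z").
    for line in it:
        if end_marker in line:
            break  # assumption: the end-marker line is not copied
        out.append(line)
    return out
```

Keeping the result in a list before writing also means the intended output exists in memory, so it can be compared later against what ended up on disk.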

The above code has been running for at least 8 years on 30+ different computers and we have never had an issue until last week. Then last week we noticed out of the blue that one of the new text files was different on 1 line vs. the source file. A few days later, we noticed a 2nd file that was different as well. Both files came from the same computer. In both cases, the source file originally had a digit of “1” that got changed to “0” in the new text file.

To test this further, I created a test app that performed the same steps multiple times. I repeated the above process twice for each of multiple text files. I then ran the 2 (supposedly identical) text files through a hashing algorithm to validate that the files were identical. Out of approximately 200 attempts, it found 8 files that were different.
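
A hash comparison of this kind might look like the sketch below. The question does not say which hashing algorithm the test app used, so SHA-256 is an assumption here, and the function names `file_sha256` and `files_match` are hypothetical. Python is used only for illustration.

```python
import hashlib


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file's raw bytes in chunks so large files need little memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # f.read() returns b"" at end of file, which stops the iteration.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_match(path_a: str, path_b: str) -> bool:
    """True when both files have the same SHA-256 digest."""
    return file_sha256(path_a) == file_sha256(path_b)
```

Reading in binary mode matters: a single changed digit changes the digest, and text-mode newline translation cannot mask or invent a difference.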

I was shocked because I have never seen a case where the data I intended to write to the hard drive wasn't exactly the same as what was actually written to the drive. We checked the hard drive on the computer and it appeared to be OK, so we replaced the computer with a different unit, and the problem appears to be resolved. We still haven't figured out exactly what the issue is with the computer.

What I have tried:

The text files this app creates are G-code files used to drive a CNC machine that cuts metal parts. The bad text files it created caused us to scrap 2 expensive parts. Because of this, I am trying to figure out something I can do in my app that will prevent this. My first thought was to add the same logic I used in my test app, where I create the new tape twice and then compare the 2 files for equality. Besides doubling the processing time, this option is not 100% error-proof, because the same error could be duplicated in both text files, which would then match each other yet still differ from the source text file. Does anyone have any better ideas? Would it be possible to compare each line immediately after it gets written? I don't think you can read and write at the same time, so this would require opening and closing the file every time I switch from read to write. Any help would be appreciated.

Posted 31-Oct-16 6:56am

theskiguy

Updated 31-Oct-16 7:59am


This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Jochen Arndt · Accepted Answer · 2016-10-31T07:59:00

Because the problem disappeared after replacing the system, it was probably a hardware problem. The hard disk is not the only part that can fail; other parts such as memory (RAM), CPU, controllers, and cables can too. The cause may also be electromagnetic interference (a CNC environment usually has more of it than an office).

To guard against corrupted data caused by disk failures or by data transfers, you can do what you have already done:
Use checksums or hashes, which may be external (additional files) or part of the file itself (which does not seem to be possible for your files).
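
As a rough sketch of the external-checksum variant, the digest can be stored in a sidecar file next to the G-code file and rechecked before the file is used. The `.sha256` suffix, the choice of SHA-256, and the function names `write_with_checksum` and `verify_checksum` are assumptions made for illustration; Python is used here rather than VB.NET.

```python
import hashlib
from pathlib import Path


def write_with_checksum(path: Path, data: bytes) -> Path:
    """Write data to path and store its SHA-256 digest in a sidecar file."""
    path.write_bytes(data)
    # Hypothetical naming scheme: "part.nc" gets "part.nc.sha256".
    sidecar = path.with_name(path.name + ".sha256")
    # The digest is computed from the in-memory data, not from the file.
    sidecar.write_text(hashlib.sha256(data).hexdigest(), encoding="ascii")
    return sidecar


def verify_checksum(path: Path) -> bool:
    """Recompute the file's digest and compare it with the sidecar."""
    sidecar = path.with_name(path.name + ".sha256")
    expected = sidecar.read_text(encoding="ascii").strip()
    actual = hashlib.sha256(path.read_bytes()).hexdigest()
    return actual == expected
```

Running `verify_checksum` on the machine that finally consumes the file also covers the move to the other folder and any later transfer.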

Another option is to create the complete new file content in memory, write it to the file, read the file back in, and compare it with the data held in memory.
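
A minimal sketch of this write-then-read-back check, in Python for illustration; the function name `write_and_verify` is hypothetical. One caveat to keep in mind: the read-back may be served from the operating system's file cache rather than from the physical disk, so a matching result is not proof of what is stored on the device.

```python
import os


def write_and_verify(path: str, data: bytes) -> bool:
    """Write data to path, read it back, and compare with the original."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        # Ask the OS to push its buffers for this file to the device.
        os.fsync(f.fileno())
    with open(path, "rb") as f:
        return f.read() == data
```

If the function returns False, the app can delete the file and retry or raise an alarm instead of releasing the file to the machine.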

But both methods only detect data inconsistency caused by disk failures.

If you really need better checks, then there is only one option: redundancy.
That means using two (or more) systems that each create their own version of the file, and then comparing those versions.