I don't know if this has ever happened to you, but if you are a dial-up user like me, downloading huge files can be very time-consuming. Several times, while I was downloading a big file, my PC crashed or gave me a nice BSOD. Some download managers can handle this because they have a "rollback" function that discards the corrupted last part of the partly downloaded file, but sometimes the "rollback" setting isn't big enough and you find yourself with a 300MB+ white elephant. In the past, I used a file-splitting program to "split" the file just before the corrupted section and then continued downloading from that point onwards. This wasn't always very handy, because sometimes the corrupted section is quite close to the start of the file and you have to re-download practically the entire file. I ended up in exactly that position the other night and wasn't planning to download another 300MB.
The first thing I needed was a function to "cut" a piece out of the file, from a specific start point to a specific end point, so that I could recover the part of the file after the error. The same function is equally handy for getting the first part of the file, assuming there is only one corrupted section. The Cut function was probably the hardest part, as I had never worked on files at byte level. Now it looks pretty elementary, but I could not find any helpful examples and it took some time to make it work exactly as it should.
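To make the idea of a byte-level "cut" concrete, here is a minimal sketch in Python. It is not the library's actual Cut function (which is not shown in this article); the name `cut`, its parameters and the half-open `[start, end)` convention are assumptions made for illustration only.

```python
def cut(src_path, dst_path, start, end, chunk_size=64 * 1024):
    """Copy bytes [start, end) of src_path into dst_path.

    Hypothetical helper for illustration. Returns the number of bytes
    written, which is less than end - start if the source file is shorter
    than `end`.
    """
    if start < 0 or end < start:
        raise ValueError("need 0 <= start <= end")
    written = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        src.seek(start)                      # jump straight to the start point
        remaining = end - start
        while remaining > 0:
            # Copy in small chunks so a 300MB file never sits in memory at once.
            chunk = src.read(min(chunk_size, remaining))
            if not chunk:                    # end of file reached before `end`
                break
            dst.write(chunk)
            written += len(chunk)
            remaining -= len(chunk)
    return written
```

Copying in chunks rather than reading the whole range in one go keeps memory use flat, which matters for files of the size discussed here.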
Now to explain the recovery process. For example, say you have a 300MB file with 30KB corrupted at the 150MB point.
- Recover the first part, in other words "cut" from 0 to 150MB, less a couple of bytes to make sure.
- Recover the last part, in other words "cut" from 150MB + 30KB, plus a couple of bytes to make sure, to the end of the file.
- Resume the download on the first part until it is past the start point of the last part.
- Re-"cut" the first part from 0 to the start point of the last part.
- Finally, join the two parts.
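The offsets in the steps above can be worked out with a little arithmetic. The following Python sketch does so for the 300MB example; the function name `plan_recovery` and the 4KB safety margin are hypothetical choices for illustration (the article only says "a couple of bytes"), and the download itself still has to be resumed with your download manager.

```python
KB = 1024
MB = 1024 * KB

def plan_recovery(file_size, bad_start, bad_length, margin=4 * KB):
    """Return the two byte ranges worth keeping, as half-open (start, end) pairs.

    `margin` is an assumed safety distance kept on both sides of the
    corrupted section.
    """
    first_end = max(0, bad_start - margin)
    last_start = min(file_size, bad_start + bad_length + margin)
    return (0, first_end), (last_start, file_size)

first, last = plan_recovery(300 * MB, 150 * MB, 30 * KB)

# Steps 1 and 2: cut `first` and `last` out of the damaged file.
# Step 3: resume the download on the first part until it is longer than last[0].
# Step 4: re-cut the first part to the range (0, last[0]).
# Step 5: join the re-cut first part and the last part.
bytes_to_redownload = last[0] - first[1]
print(first, last, bytes_to_redownload)
```

With these assumed numbers only the corrupted 30KB plus the two margins has to be fetched again, instead of the whole file.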
Secondly, I needed a function to find all the corrupted parts of the file.
[Update: This only really works on CDs, floppy discs and with files on bad sectors]
This is done with the Scan function. The return value of this function is a multi-dimensional integer array in which int[x,0] is the start point and int[x,1] the end point of a corrupted part. This was probably the second hardest part, having to catch IOExceptions in a double
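As a rough illustration of how such a scan could work, the Python sketch below reads a file block by block and records every block whose read raises an I/O error, merging neighbouring bad blocks into one range. It is an assumption about the general technique, not the library's Scan function: the names `scan` and `scan_stream`, the 2048-byte block size and the list-of-pairs return value are all hypothetical, and the result is only accurate to one block.

```python
import os

def scan_stream(stream, size, block_size=2048):
    """Return [start, end] pairs (end exclusive) for the unreadable regions.

    `stream` is any object with seek() and read(); a failed read is expected
    to raise OSError, Python's counterpart of an I/O exception.
    """
    bad = []
    offset = 0
    while offset < size:
        length = min(block_size, size - offset)
        try:
            stream.seek(offset)          # re-seek each time, since a failed
            stream.read(length)          # read may leave the position undefined
            ok = True
        except OSError:
            ok = False
        if not ok:
            if bad and bad[-1][1] == offset:
                bad[-1][1] = offset + length        # extend the previous range
            else:
                bad.append([offset, offset + length])
        offset += length
    return bad

def scan(path, block_size=2048):
    """Scan a file on disk; unbuffered so each block is really read."""
    with open(path, "rb", buffering=0) as f:
        return scan_stream(f, os.path.getsize(path), block_size)
```

Each returned pair plays the role of int[x,0] and int[x,1] described above. A smaller block size narrows the reported ranges at the cost of more read calls.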
Then I created a library for these functions so I could interface with it via a command-line interface (included in the zip) or a Windows application.
Lastly, I added the following other helpful functions:
- Split: to split files into equal parts. Implementation was easy, as it just loops through the Cut function.
- Join: to combine several files together.
- Some overloads and a counter pointer in case you want to track progress in a Windows app or CL app.
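For readers who want to see the shape of these helpers, here is a self-contained Python sketch of a split and a join with optional progress reporting. The names, the `.000`, `.001` part naming and the `progress(done, total)` callback are assumptions for illustration; the library itself reports progress through a counter pointer rather than a callback.

```python
import os

def split(src_path, parts, chunk_size=64 * 1024, progress=None):
    """Split src_path into `parts` files (src_path.000, .001, ...); return their paths."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    size = os.path.getsize(src_path)
    part_size = -(-size // parts)            # ceiling division
    paths, done = [], 0
    with open(src_path, "rb") as src:
        for i in range(parts):
            path = f"{src_path}.{i:03d}"
            remaining = min(part_size, size - done)   # last part may be shorter
            with open(path, "wb") as dst:
                while remaining > 0:
                    chunk = src.read(min(chunk_size, remaining))
                    if not chunk:
                        break
                    dst.write(chunk)
                    remaining -= len(chunk)
                    done += len(chunk)
                    if progress:
                        progress(done, size)
            paths.append(path)
    return paths

def join(part_paths, dst_path, chunk_size=64 * 1024, progress=None):
    """Concatenate part_paths, in the order given, into dst_path; return bytes written."""
    total = sum(os.path.getsize(p) for p in part_paths)
    done = 0
    with open(dst_path, "wb") as dst:
        for p in part_paths:
            with open(p, "rb") as src:
                while True:
                    chunk = src.read(chunk_size)
                    if not chunk:
                        break
                    dst.write(chunk)
                    done += len(chunk)
                    if progress:
                        progress(done, total)
    return done
```

A caller could pass something like `progress=lambda done, total: print(done * 100 // total, "%")` to drive a progress bar in a console or GUI front end.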
The only problem I have is that I do not have a file with multiple corrupted areas to test on, but in theory it should work. Any suggestions are welcome.
Hope this proves helpful; it sure did for me. In the end, I only had to re-download a couple of MB. Other uses include extracting files from a damaged CD.
Quite fun writing my first article.