Hello people from CodeProject.

I need to create a tool in Visual C or Visual Studio (VB.NET) that uploads a .doc or .docx file and also obtains the hexadecimal values of that same file.

I already have the other part of the software, the translator from hexadecimal to text.

Any ideas? Thanks in advance.
Comments
Andreas Gieriet 11-Oct-15 18:13pm    
What is "upload" and what is "hexadecimal" values?
I mean, both terms tell me something, but not in the context you are asking.
Please explain in more detail.
E.g. do you mean downloading some file from a server and storing it locally?
When you download a file, you get a pile of bytes, so what do you mean by hexadecimal value? Do you mean the binary data, or is this a hash code you are talking about?
Regards
Andi
Member 11267069 11-Oct-15 18:51pm    
Hello Andi, thanks for your reply. This is exactly what I want to do:

1) Take a .doc file (i.e. Word) and obtain the text in it. Imagine that the .doc file is damaged and the only way to see the internal data is with a hexadecimal editor.

2) Once the hexadecimal code is displayed, transform it into text values and in that way retrieve the pure text inside the damaged .doc file.

This is a very small data recovery project; I am doing it to get my degree.

Could you please help me with it? Regards.
Andreas Gieriet 11-Oct-15 19:05pm    
How do you identify the doc file? Through a URL (e.g. http://...)?
If so, open a socket, get the data through that, and store it locally. Then you can do whatever you want with the file.
Google for C socket library and/or C socket tutorial.

If you do not need to download via a URL but rather mean by "upload" opening a local file and slurping it into memory, then you might consult http://www.cprogramming.com/tutorial/cfileio.html. Be aware that file IO on Windows platforms is "broken" in the sense that it does awkward magic assuming text files: in text mode, line endings are translated, which corrupts binary data. Therefore, you have to open the file in binary mode, as detailed in "what's the differences between r and rb in fopen".

Regards
Andi
PS: I agree with Philippe Mori: Doc is a binary file while docx is a zip file of a directory tree of XML files. You cannot handle both the same way.
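
As a minimal sketch of the local-file case described above, the following C function reads a whole file into memory using the "rb" mode. The name read_file_bytes is a hypothetical helper chosen for illustration, not something from the thread or from a library.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the whole file at `path` into a heap buffer.
   On success returns the buffer (the caller frees it) and stores the byte
   count in *out_len. On failure returns NULL and sets *out_len to 0. */
unsigned char *read_file_bytes(const char *path, size_t *out_len)
{
    assert(path != NULL && out_len != NULL);
    *out_len = 0;

    /* "rb" = binary mode: no newline translation on Windows */
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    size_t cap = 4096, len = 0;
    unsigned char *buf = malloc(cap);
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }

    for (;;) {
        len += fread(buf + len, 1, cap - len, fp);
        if (len < cap)              /* short read: end of file or error */
            break;
        unsigned char *grown = realloc(buf, cap * 2);
        if (grown == NULL) {
            free(buf);
            fclose(fp);
            return NULL;
        }
        buf = grown;
        cap *= 2;
    }

    if (ferror(fp)) {               /* distinguish a read error from EOF */
        free(buf);
        fclose(fp);
        return NULL;
    }
    fclose(fp);
    *out_len = len;
    return buf;
}
```

A caller would pass the path of the .doc file and then work on the returned bytes directly; the buffer is exactly what a hex editor would show, only not yet formatted.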
Philippe Mori 11-Oct-15 18:59pm    
Given that DOCX files are compressed, a hex editor will probably not be of much help...
Member 11267069 11-Oct-15 19:06pm    
True, but let's set that aside. Is there any way to design this? I mean, load the .doc file and display the hexadecimal content in a RichTextBox, then use a converter (which I have almost finished) and finally see the data...

Regards Philippe
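
The "display the hexadecimal content" step can be sketched as a classic hex dump: offset, sixteen hex pairs, and a printable-ASCII column. This is only an illustration in C writing to a FILE stream; hex_dump is a hypothetical name, and feeding the text into a RichTextBox would be a separate step on the UI side.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: writes `len` bytes of `data` to `out`, 16 bytes per
   line, as "offset  hex pairs  |printable characters|". */
void hex_dump(FILE *out, const unsigned char *data, size_t len)
{
    assert(out != NULL && (data != NULL || len == 0));

    for (size_t off = 0; off < len; off += 16) {
        fprintf(out, "%08lx  ", (unsigned long)off);

        /* Hex column: pad missing bytes on the last line with spaces */
        for (size_t i = 0; i < 16; i++) {
            if (off + i < len)
                fprintf(out, "%02X ", data[off + i]);
            else
                fputs("   ", out);
        }

        /* Text column: non-printable bytes are shown as '.' */
        fputs(" |", out);
        for (size_t i = 0; i < 16 && off + i < len; i++) {
            unsigned char c = data[off + i];
            fputc(isprint(c) ? c : '.', out);
        }
        fputs("|\n", out);
    }
}
```

Calling hex_dump(stdout, buf, len) on the bytes of a file prints the same view a hex editor gives; the hex pairs are only a way of writing the bytes down, the data itself is unchanged.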

1 solution

There is no solution to your question.
- First, I don't understand why you have a problem converting from text to hex, since you know how to convert from hex to text.
- Second, I don't understand why you want to use hex encoding, since it is of no help.

As reported in the comments, since .docx files are compressed, nothing will help you except decompression as a first step.
When I read a .doc file manually, I read it directly.
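
To illustrate the first point, converting bytes to hex is the mirror image of converting hex back to bytes. The C sketch below shows both directions; bytes_to_hex and hex_to_bytes are hypothetical names for illustration, and the uppercase, separator-free hex format is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Writes 2*len hex digits plus a terminating NUL into `out`.
   `out` must have room for 2*len + 1 characters. */
void bytes_to_hex(const unsigned char *in, size_t len, char *out)
{
    static const char digits[] = "0123456789ABCDEF";
    assert(out != NULL && (in != NULL || len == 0));

    for (size_t i = 0; i < len; i++) {
        out[2 * i]     = digits[in[i] >> 4];    /* high nibble */
        out[2 * i + 1] = digits[in[i] & 0x0F];  /* low nibble  */
    }
    out[2 * len] = '\0';
}

/* Returns the value of one hex digit, or -1 if `c` is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decodes a NUL-terminated hex string into `out` (capacity `out_cap`).
   Returns the number of bytes written, or -1 if the input is malformed
   (odd length, non-hex character) or does not fit. */
long hex_to_bytes(const char *hex, unsigned char *out, size_t out_cap)
{
    assert(hex != NULL && out != NULL);

    size_t n = 0;
    while (hex[0] != '\0') {
        int hi = hex_value(hex[0]);
        int lo = (hi < 0) ? -1 : hex_value(hex[1]);
        if (hi < 0 || lo < 0 || n == out_cap)
            return -1;
        out[n++] = (unsigned char)((hi << 4) | lo);
        hex += 2;
    }
    return (long)n;
}
```

Encoding and then decoding returns the original bytes, which is why the hex step by itself adds no information: it is only a different notation for the same data.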
 
Comments
Member 11267069 11-Oct-15 20:07pm    
Hello Sir, thanks for your interest.

Well, let's establish the issue before anything else:

1) I want to retrieve the text from a damaged .doc file (please forget the .docx files).

If I find a way (software) to retrieve the hexadecimal code from the damaged file, I can then feed that hex code into the second piece of software (the one that transforms hexadecimal to text and vice versa), and I will be able to see the pure text that was originally in the damaged Word file.

This is what I am trying to do, sir: a very basic piece of data recovery software.

I will be waiting for your reply. Regards.
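
One way to approximate the recovery described above, without going through hex at all, is to scan the raw bytes for runs of printable characters, similar to what the Unix strings tool does. The C sketch below is an illustration under assumptions: extract_text_runs and is_text_byte are hypothetical names, and it only finds text stored as 8-bit ASCII. A .doc file may also store text as 16-bit characters, which this sketch would not detect.

```c
#include <assert.h>
#include <stdio.h>

/* Printable ASCII plus tab, newline, and carriage return. */
static int is_text_byte(unsigned char c)
{
    return (c >= 0x20 && c <= 0x7E) || c == '\t' || c == '\n' || c == '\r';
}

/* Hypothetical helper: scans `data` and writes every run of at least
   `min_run` consecutive text bytes to `out`, one run per line.
   Returns the number of runs written. */
size_t extract_text_runs(const unsigned char *data, size_t len,
                         size_t min_run, FILE *out)
{
    assert(out != NULL && min_run > 0 && (data != NULL || len == 0));

    size_t runs = 0;
    size_t start = 0;               /* index where the current run began */

    for (size_t i = 0; i <= len; i++) {
        if (i < len && is_text_byte(data[i]))
            continue;               /* still inside a run */

        /* A non-text byte or the end of the buffer closes the run [start, i) */
        if (i - start >= min_run) {
            fwrite(data + start, 1, i - start, out);
            fputc('\n', out);
            runs++;
        }
        start = i + 1;
    }
    return runs;
}
```

The min_run threshold filters out short accidental matches in binary data; a value such as 4 is a common starting point, but the right value depends on the file.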
Patrice T 11-Oct-15 20:28pm    
"If I get the way (software)"
Which software?
Member 11267069 11-Oct-15 21:27pm    
Good night, Sir.

I want to build a piece of software (this is the software) that does the work I am looking for: obtain the hexadecimal code from a .doc file. That's it, only that.
Andreas Gieriet 13-Oct-15 2:47am    
What is the hexadecimal code of a file? Do you mean reading a file into memory? A file is a sequence of bytes. There is no such thing as "hexadecimal code from a file".
Regards
Andi

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


