Guys,
I am new to programming, and I am interested in studying file formats.
I want to know how some file formats, such as .txt and .doc, store their data,
byte by byte, with all the information about the format.

1 solution

Sorry, but this isn't a question we can answer - and there are several reasons.
Firstly, .TXT files don't have a structure: they are just text and you can store any form of text in them, but the text they contain may have structure if the application (or user) that created them applied some to the actual data. For example, you could store a whole book in a TXT file, and organise it into chapters, pages and paragraphs. Or you could store data separated by commas, using lines to delimit rows.
Even then, the .TXT extension doesn't mean that you can read it - extensions aren't "fixed" and any application can create a file with any extension regardless of the content.
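
To make that concrete, here is a minimal Python sketch. The sample text and the choice of UTF-8 are my own assumptions for illustration, not anything a .TXT file requires. It shows that what gets stored is just the encoded characters, with no header or metadata, and that any "structure" (here, comma-separated rows) exists only because a reader chooses to interpret the text that way.

```python
import csv
import io

# Hypothetical sample content: comma-separated data, one row per line.
text = "name,age\nAda,36\n"

# Encoding turns the text into the bytes that would be written to disk.
# UTF-8 is an assumption here; a .TXT file may use any encoding.
raw = text.encode("utf-8")

# Byte by byte: each value is a character code, nothing more.
first_line = list(raw[:9])
print(first_line)  # [110, 97, 109, 101, 44, 97, 103, 101, 10]

# The "structure" appears only when a reader decides to treat the text as CSV.
rows = list(csv.reader(io.StringIO(raw.decode("utf-8"))))
print(rows)  # [['name', 'age'], ['Ada', '36']]
```

Writing those same bytes to a file called data.txt, data.csv or data.xyz would give identical content: the extension is only a hint.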

And worse, .DOC is a common extension that could refer to the data of any of hundreds of applications, and even for files produced by Microsoft Word there have been a number of different (and incompatible) versions of the actual content - so there is no ".DOC" file format that is guaranteed to work with all files.
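
Because the extension can't be trusted, readers usually look at the first few bytes of the file instead. The sketch below is an illustration only: the function names are hypothetical, and it checks just three well-known signatures (the OLE2 compound file container used by Word 97-2003 .doc files, the ZIP container used by .docx, and RTF). A match tells you the container, not that the file is a valid Word document, since other applications use the same containers.

```python
# Well-known leading byte signatures ("magic numbers").
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # container used by Word 97-2003 .doc
ZIP_MAGIC = b"PK\x03\x04"                        # container used by .docx
RTF_MAGIC = b"{\\rtf"                            # Rich Text Format


def sniff_format(header: bytes) -> str:
    """Guess the container type from the first bytes of a file."""
    if header.startswith(OLE2_MAGIC):
        return "ole2"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    if header.startswith(RTF_MAGIC):
        return "rtf"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read only the first 8 bytes of the file and classify them."""
    with open(path, "rb") as f:
        return sniff_format(f.read(8))


print(sniff_format(b"{\\rtf1\\ansi Hello}"))  # rtf
print(sniff_format(b"Just some plain text"))  # unknown
```

A file named report.doc could come back as any of these, which is why a reader has to check before it starts parsing.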

If you need to know about a specific file format, then start here: List of file formats - Wikipedia, the free encyclopedia - but don't expect it to be a trivial matter to process them yourself!
 
 
Comments
AK96 1-Jan-16 10:23am    
Thanks for your help, but I still have a doubt.
Other applications can open .doc files, so how do they read the data in them?
OriginalGriff 1-Jan-16 10:35am    
Because they have read the format specification(s) and implemented a compatible reader, found a DOC reader component, or they have used an installed instance of Word to read it via Interop.
That doesn't mean that they can read anything other than .DOC files (a .DOCX may confuse them, for example).
AK96 1-Jan-16 10:43am    
I am wondering how that compatible reader works.
OriginalGriff 1-Jan-16 10:50am    
By reading data from the file as laid down in the file format specification (which is available for some, but nowhere near all, file formats).
Which means the equivalent of "Read 4 bytes and convert that to an integer: that's the number of file offset values that will follow it, so then use that number to read each of the offsets, which are stored as 8-byte values, MSB first. Then each of those will be a paragraph, which has this format"...
Only a heck of a lot more complicated, generally! :laugh:
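
As a rough sketch of that made-up example in Python: the layout is invented purely for illustration and is not the real .DOC format. The function name is hypothetical, and I'm assuming the 4-byte count is unsigned and also stored MSB first, since the description above doesn't say.

```python
import io
import struct


def read_offset_table(stream):
    """Read a 4-byte count, then that many 8-byte MSB-first offsets."""
    header = stream.read(4)
    if len(header) != 4:
        raise ValueError("file too short for the 4-byte count")
    # ">I" = big-endian (MSB first) unsigned 32-bit integer (an assumption).
    (count,) = struct.unpack(">I", header)

    table = stream.read(8 * count)
    if len(table) != 8 * count:
        raise ValueError("file too short for the offset table")
    # ">Q" = big-endian unsigned 64-bit integer, repeated `count` times.
    return list(struct.unpack(f">{count}Q", table))


# Build a tiny in-memory "file": a count of 3 followed by three offsets.
sample = struct.pack(">I", 3) + struct.pack(">3Q", 28, 100, 4096)
offsets = read_offset_table(io.BytesIO(sample))
print(offsets)  # [28, 100, 4096]
```

A real reader would then seek to each offset and decode whatever the specification says is stored there, checking lengths at every step in the same way.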
AK96 1-Jan-16 11:02am    
Thanks.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


