Hi,

I'm working on a cross-platform project that needs to convert documents to plain text, which I then use for further processing. Are there any platform-independent C++ libraries that can convert .docx and .odt files to text files?

Thanks in advance.
Updated 12-Oct-12 3:28am
Comments
[no name] 12-Oct-12 9:04am    
Using Word interop would be the obvious solution.
karthikraj8791 12-Oct-12 9:07am    
What is Word interop? Could you explain it?
[no name] 12-Oct-12 9:11am    
http://support.microsoft.com/kb/196776
karthikraj8791 12-Oct-12 9:16am    
I don't want to use a Microsoft application; I need a library that does the conversion programmatically. Also, since I'm building a cross-platform application, I cannot depend on a Microsoft service for the conversion.
[no name] 12-Oct-12 9:21am    
Maybe you should improve your "question" to include all of the information necessary for someone to help you.

1 solution

I understand why you don't want to depend on proprietary software such as Microsoft Office. The only open-source code I know of that handles these formats is OpenOffice itself (where .odt came from) and its fork, LibreOffice. Please see:
http://en.wikipedia.org/wiki/OpenOffice.org,
http://www.openoffice.org/,
http://en.wikipedia.org/wiki/LibreOffice,
http://www.libreoffice.org/.

You can download the source and find the code that reads nearly all versions of Microsoft Office documents and, of course, .odt and all other OpenOffice/LibreOffice formats.
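
Short of extracting code from the source tree, one way to reuse that conversion code from C++ is to call an installed LibreOffice in headless mode. This is a different approach from linking a library, and the sketch below is only a minimal illustration under several assumptions: the `soffice` executable is on the PATH, the shell is POSIX-style, and the helper names `shell_quote`, `build_convert_command`, and `convert_to_text` are hypothetical, not part of any LibreOffice API.

```cpp
#include <cstdlib>
#include <string>

// Wrap a string in single quotes for a POSIX shell, escaping embedded quotes.
// Assumption: POSIX shell quoting; Windows cmd.exe needs different rules.
std::string shell_quote(const std::string& s) {
    std::string out = "'";
    for (char c : s) {
        if (c == '\'') {
            out += "'\\''";  // close quote, escaped quote, reopen quote
        } else {
            out += c;
        }
    }
    out += "'";
    return out;
}

// Build the headless LibreOffice command line that exports a document
// (.docx, .odt, ...) as plain text into output_dir.
// Assumption: "soffice" is the launcher name on this system.
std::string build_convert_command(const std::string& input_path,
                                  const std::string& output_dir,
                                  const std::string& soffice = "soffice") {
    return soffice + " --headless --convert-to txt:Text --outdir " +
           shell_quote(output_dir) + " " + shell_quote(input_path);
}

// Run the conversion and report whether the command returned status 0.
bool convert_to_text(const std::string& input_path,
                     const std::string& output_dir) {
    const std::string cmd = build_convert_command(input_path, output_dir);
    return std::system(cmd.c_str()) == 0;
}
```

Calling `convert_to_text("report.docx", "out")` would be expected to leave the text in the `out` directory under the input's base name with a `.txt` extension. The trade-off is a runtime dependency on a LibreOffice installation on every target platform, in exchange for not having to build or maintain any of its import code yourself.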

—SA
Comments
Maciej Los 12-Oct-12 13:38pm    
Interesting collection of links, my 5!
Sergey Alexandrovich Kryukov 12-Oct-12 13:42pm    
Thank you, Maciej.
Not really a collection; this is essentially about one product, the only code I know of that readily works with all those formats directly, without any proprietary software.
--SA
Maciej Los 12-Oct-12 13:50pm    
Formally... 1 link is not a collection, but 2, 3 or more links are a collection... even if they are about one product ;)
Sergey Alexandrovich Kryukov 12-Oct-12 14:47pm    
OK, this is a collection of two: OpenOffice and LibreOffice, the second being a fork of the first. Well, a collection of two. Formally. :-)
Thank you.
--SA
Maciej Los 12-Oct-12 14:52pm    
He-he ;) We are a collection of two... formalists ;)

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)