See more: Java
Hi friends,
I am trying to extract the contents of ODT files for indexing.
Let me elaborate.

These are the steps I follow to extract the contents of an ODT file:

Steps
1 - Convert the ODT file into a temporary zip file.
2 - Loop through the files inside and retrieve the 'content.xml' file.
3 - Read the actual content of the ODT file, which resides in XML elements called <text:p>.
4 - Index the contents retrieved from <text:p>.

I am having trouble with step 3.
I do not have the schema for content.xml, and I need the schema to generate the Java classes for its elements.

Please guide me.
Posted 8 Mar '10 - 20:57
Edited 9 Mar '10 - 20:10


5 solutions

And which part of your program are you having trouble with?
I am using JAXB to extract the text from the 'content.xml' file in the ODT. I am unable to get the XML Schema for content.xml. I tried generating one from the XML using the hitsw site, but it doesn't work.
koolshiva wrote:
But it doesn't work.

 
Sorry, but that really does not help anyone guess what might be wrong. Take a look at this article for guidance on reading XML data.
Sorry for not being specific. Let me elaborate.

These are the steps I follow to extract the contents of an ODT file:

Steps
1 - Convert the ODT file into a temporary zip file.
2 - Loop through the files inside and retrieve the 'content.xml' file.
3 - Read the actual content of the ODT file, which resides in XML elements called <text:p>.
4 - Index the contents retrieved from <text:p>.

I am having trouble with step 3.
I do not have the schema for content.xml, and I need the schema to generate the Java classes for its elements.

Please guide me.
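
For steps 1 and 2, a temporary copy may not be needed: an ODT file is itself a ZIP archive, so java.util.zip can open it directly. Below is a minimal sketch of that idea. The class name OdtContentReader and the method readContentXml are made up for illustration; they are not from any library.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: reads the raw bytes of content.xml from an ODT file.
class OdtContentReader {

    static byte[] readContentXml(Path odtFile) throws IOException {
        // Open the .odt directly as a ZIP archive; no renaming or copying.
        try (ZipFile zip = new ZipFile(odtFile.toFile())) {
            // Look the entry up by name instead of looping over all entries.
            ZipEntry entry = zip.getEntry("content.xml");
            if (entry == null) {
                throw new IOException("No content.xml found in " + odtFile);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[8192];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
                return out.toByteArray();
            }
        }
    }
}
```

For large documents it would probably be better to hand the InputStream straight to the XML parser instead of buffering everything in memory.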
Hey friends,
 
I have found an alternative. I am using SAX instead of JAXB now. I already had this option, but I personally preferred JAXB for performance reasons.
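
For anyone landing here later, this is roughly what the SAX approach can look like. It is only a sketch, not the exact code used above: the class OdtParagraphExtractor and its extract method are invented names, and it assumes the text prefix is bound to the OpenDocument text namespace (urn:oasis:names:tc:opendocument:xmlns:text:1.0). SAX needs no schema or generated classes, which sidesteps the problem from step 3.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of every top-level <text:p> element.
class OdtParagraphExtractor extends DefaultHandler {

    // Namespace URI that the "text" prefix is bound to in ODF documents.
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    private final List<String> paragraphs = new ArrayList<>();
    private final StringBuilder current = new StringBuilder();
    private int depth = 0; // > 0 while inside a <text:p>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (TEXT_NS.equals(uri) && "p".equals(localName)) {
            if (depth == 0) {
                current.setLength(0); // start a fresh paragraph
            }
            depth++;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Also picks up text inside child elements such as <text:span>.
        if (depth > 0) {
            current.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (TEXT_NS.equals(uri) && "p".equals(localName)) {
            depth--;
            if (depth == 0) {
                paragraphs.add(current.toString());
            }
        }
    }

    static List<String> extract(InputStream contentXml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed to match on namespace URI
        SAXParser parser = factory.newSAXParser();
        OdtParagraphExtractor handler = new OdtParagraphExtractor();
        parser.parse(contentXml, handler);
        return handler.paragraphs;
    }
}
```

Note that this sketch only looks at <text:p>. Headings (<text:h>) and whitespace elements such as <text:s> and <text:tab> are not handled, so the indexed text may lose some headings and spacing unless those are added.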

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Web01 | 2.6.130516.1 | Last Updated 11 Mar 2010
Copyright © CodeProject, 1999-2013