Hello everyone,

I have created a program that converts .doc to .pdf/.html, based on code I found here: http://angelozerr.wordpress.com/2012/12/06/how-to-convert-docxodt-to-pdfhtml-with-java/. I use the XDocReport library, and its samples work fine. Of the other libraries, I could not get the first one to run, and the second one needs some configuration.

Running the samples, which convert the doc file included in the downloaded zip file, works: the file converts to PDF or HTML. But when I try to convert a doc file created on my computer, I get this error:
Java
Exception in thread "AWT-EventQueue-0" org.apache.poi.POIXMLException: org.apache.poi.openxml4j.exceptions.InvalidFormatException: Package should contain a content type part [M1.13]
        at org.apache.poi.util.PackageHelper.open(PackageHelper.java:41)
        at org.apache.poi.xwpf.usermodel.XWPFDocument.<init>(XWPFDocument.java:120)
        at docconverter.Convert.ConvertToPDF(Convert.java:32)


Convert Code:
Java
public static void ConvertToPDF(String docPath, String pdfPath) {
    // try-with-resources closes the streams even if the conversion fails
    try (InputStream doc = new FileInputStream(new File(docPath))) {
        XWPFDocument document = new XWPFDocument(doc);
        PdfOptions options = PdfOptions.create();
        try (OutputStream out = new FileOutputStream(new File(pdfPath))) {
            PdfConverter.getInstance().convert(document, out, options);
        }
    } catch (FileNotFoundException ex) {
        Logger.getLogger(Convert.class.getName()).log(Level.SEVERE, null, ex);
    } catch (IOException ex) {
        Logger.getLogger(Convert.class.getName()).log(Level.SEVERE, null, ex);
    }
}

public static void ConvertToHTML(String docPath, String htmlPath) {
    // try-with-resources closes the streams even if the conversion fails
    try (InputStream doc = new FileInputStream(new File(docPath))) {
        XWPFDocument document = new XWPFDocument(doc);
        XHTMLOptions options = XHTMLOptions.create();
        try (OutputStream out = new FileOutputStream(new File(htmlPath))) {
            XHTMLConverter.getInstance().convert(document, out, options);
        }
    } catch (FileNotFoundException ex) {
        Logger.getLogger(Convert.class.getName()).log(Level.SEVERE, null, ex);
    } catch (IOException ex) {
        Logger.getLogger(Convert.class.getName()).log(Level.SEVERE, null, ex);
    }
}


The error points to this line:
Java
XWPFDocument document = new XWPFDocument(doc);


I don't know if this is the cause of the error: what I'm trying to convert is a .doc file. If that is the cause, can someone give me code, an idea, or a URL for anything that can convert .doc/.docx to .pdf/.html?
Updated 11-Mar-14 22:12pm
Comments
Shubhashish_Mandal 12-Mar-14 8:44am    
This link may help you:
http://apache-poi.1045710.n5.nabble.com/org-apache-poi-POIXMLException-org-apache-poi-openxml4j-exceptions-InvalidFormatException-Package-sh-td5711375.html

The XDocReport docx->pdf converter works with .docx files, not with .doc files.

Note that a .doc file is a binary format, whereas a .docx file is a zip archive composed of XML entries.

So the error "Package should contain a content type part [M1.13]" means that your input is not a docx file.
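
If you want to check a file before handing it to XWPFDocument, here is a minimal sketch that uses only the JDK. It looks at the first bytes of the file (a binary .doc starts with the OLE2 signature, a .docx starts with the zip signature) and then checks whether the zip contains a [Content_Types].xml entry, which is the part the error message is complaining about. The class name DocxPreflight and its method names are made up for this example; they are not part of POI or XDocReport.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.zip.ZipFile;

// Hypothetical helper for illustration only; not part of POI or XDocReport.
class DocxPreflight {
    // Leading bytes of an OLE2 compound file, the container of a binary .doc
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // Leading bytes of a zip local file header, the container of a .docx
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    // Classifies the leading bytes as "doc", "zip", or "unknown".
    static String sniff(byte[] head) {
        if (startsWith(head, OLE2_MAGIC)) return "doc";
        if (startsWith(head, ZIP_MAGIC)) return "zip";
        return "unknown";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }

    // Reads up to 8 bytes from the file and classifies them.
    static String sniffFile(String path) throws IOException {
        try (InputStream in = Files.newInputStream(Paths.get(path))) {
            byte[] head = new byte[8];
            int n = 0;
            while (n < head.length) {
                int r = in.read(head, n, head.length - n);
                if (r < 0) break; // file is shorter than 8 bytes
                n += r;
            }
            return sniff(Arrays.copyOf(head, n));
        }
    }

    // True if the file opens as a zip and has a [Content_Types].xml entry.
    // Returns false for non-zip files (such as a binary .doc) and unreadable files.
    static boolean hasContentTypesPart(String path) {
        try (ZipFile zip = new ZipFile(path)) {
            return zip.getEntry("[Content_Types].xml") != null;
        } catch (IOException notAZipOrUnreadable) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        for (String path : args) {
            System.out.println(path + ": container=" + sniffFile(path)
                    + ", hasContentTypes=" + hasContentTypesPart(path));
        }
    }
}
```

Run it with the path of your file as the argument. A file that reports container=doc is in the old binary format, so the docx converter cannot read it, even if you rename it to .docx.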
 
 
You could add something like this to your code:

Java
package tcg.doc.web.managedBeans;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.poi.xwpf.converter.core.FileImageExtractor;
import org.apache.poi.xwpf.converter.core.FileURIResolver;
import org.apache.poi.xwpf.converter.xhtml.XHTMLConverter;
import org.apache.poi.xwpf.converter.xhtml.XHTMLOptions;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Scope;
import org.springframework.stereotype.Component;

@Component
@Scope("session")
@Qualifier("ConvertWord")


public class ConvertWord {
    private static final String docName = "TestDocx.docx";
    private static final String outputlFolderPath = "d:/";


    String htmlNamePath = "docHtml.html";
    String zipName="_tmp.zip";
    File docFile = new File(outputlFolderPath+docName);
    File zipFile = new File(zipName);




      public void ConvertWordToHtml() {

          // try-with-resources closes the input stream even if the conversion fails
          try (InputStream doc = new FileInputStream(new File(outputlFolderPath+docName))) {

                // 1) Load DOCX into XWPFDocument
                XWPFDocument document = new XWPFDocument(doc);

                // 2) Prepare XHTML options
                XHTMLOptions options = XHTMLOptions.create();

                // Extract images into a folder named after the document
                // (the original concatenated the InputStream object here by mistake)
                String root = "target";
                File imageFolder = new File( root + "/images/" + docName );
                options.setExtractor( new FileImageExtractor( imageFolder ) );
                // URI resolver so the generated HTML points at the extracted images
                options.URIResolver( new FileURIResolver( imageFolder ) );

                // 3) Convert the document to XHTML
                try (OutputStream out = new FileOutputStream(new File(htmlPath()))) {
                    XHTMLConverter.getInstance().convert(document, out, options);
                }

                System.out.println("HTML written to " + htmlPath());
            } catch (FileNotFoundException ex) {
                // Report the failure instead of silently swallowing it
                System.err.println("File not found: " + ex.getMessage());
            } catch (IOException ex) {
                System.err.println("Conversion failed: " + ex.getMessage());
            }
         }

      public static void main(String[] args) {
         ConvertWord cwoWord=new ConvertWord();
         cwoWord.ConvertWordToHtml();
         System.out.println();
    }



      public String htmlPath(){
        // d:/docHtml.html
          return outputlFolderPath+htmlNamePath;
      }

      public String zipPath(){
          // d:/_tmp.zip
          return outputlFolderPath+zipName;
      }

}


For Maven, add this dependency to pom.xml:

XML
<dependency>
   <groupId>fr.opensagres.xdocreport</groupId>
   <artifactId>org.apache.poi.xwpf.converter.xhtml</artifactId>
   <version>1.0.4</version>
</dependency>




Or download it from here: http://code.google.com/p/xdocreport/wiki/XWPFConverterXHTML
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


