See more: C#4.0 .NET4
I want to read a Word file without using the Interop.Word DLL, because I do not want to install Word on the IIS server. I implemented a keyword search by converting each Word file into a text file and reading from that. I tried the Open XML SDK, but it doesn't read old .doc files correctly. I also found Spire.Doc, but it is a paid product. Please provide complete code for a solution.
Code as follows:
  private void SearchWord(string[] str1)
        {
            string filename1 = "";
            string randomName = "";
            string fname = "";
            Session["cids"] = "";
            object missingType = Type.Missing;
            object readOnly = true;
            object isVisible = false;
            object documentFormat = 8;
            string s12 = "select id,docfilename from Uploadeddocsmaster";
            dt = cn.viewdatatable(s12);
            int dtcount = dt.Rows.Count * 2;
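            string[] ids = new string[dtcount];
            int mn = 0; // index into ids for the current document (used below but not declared in the posted code)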
 
            for (int k = 0; k < dt.Rows.Count; k++)
            {
                string id = dt.Rows[k]["id"].ToString();
                filename1 = dt.Rows[k]["docfilename"].ToString();
                string fileName = Server.MapPath("~/UploadedFiles/") + filename1;
                string ext = Path.GetExtension(fileName);
                if (ext == ".doc" || ext == ".docx")
                {
                    RichEditDocumentServer server = new RichEditDocumentServer();
                    server.LoadDocument("document.doc", DocumentFormat.Doc);
                    server.ExportToPdf(memoryStream);
 
                    Application applicationclass = new Application();
                    string[] crefids = filename1.Split('.');
                    for (int mj = 0; mj < crefids.Length; mj++)
                    {
                        randomName = crefids[0].ToString();
                    }
 
                    object Source = fileName;
                    object Target = Server.MapPath("~/Temp/" + randomName + ".txt");
                    fname = Target.ToString();
                    // object Target = @"D:\Alex\ResumeManager Dec 6,2012\ResumeManager\Uploaddocs\test1.txt";

                    //Upload the word document and save to Temp folder
                    // FileUpload1.SaveAs(Server.MapPath("~/Temp/") + Path.GetFileName(FileUpload1.PostedFile.FileName));

 
                    applicationclass.Documents.Open(ref Source,
                                                    ref readOnly,
                                                    ref missingType, ref missingType, ref missingType,
                                                    ref missingType, ref missingType, ref missingType,
                                                    ref missingType, ref missingType, ref isVisible,
                                                    ref missingType, ref missingType, ref missingType,
                                                    ref missingType, ref missingType);
                    applicationclass.Visible = false;
                    Document document = applicationclass.ActiveDocument;
                    object format = Microsoft.Office.Interop.Word.WdSaveFormat.wdFormatUnicodeText;
 
                    //Save the Word document as a Unicode text file
                    document.SaveAs(ref Target, ref format, ref missingType,
                                    ref missingType, ref missingType, ref missingType,
                                    ref missingType, ref missingType, ref missingType,
                                    ref missingType, ref missingType, ref missingType,
                                    ref missingType, ref missingType, ref missingType,
                                    ref missingType);
 
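                    //Close the Word document
                    document.Close(ref missingType, ref missingType, ref missingType);

                    //Quit Word so that a WINWORD.EXE process is not left running for every file
                    ((_Application)applicationclass).Quit(ref missingType, ref missingType, ref missingType);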
 

                    // Read the converted text once, then test every keyword against it
                    string szReadAll = File.ReadAllText(fname).ToLower();
                    foreach (string str in str1)
                    {
                        if (string.IsNullOrEmpty(str) == false)
                        {
                            // Escape the keyword so that characters such as "+" or "." are matched literally
                            if (Regex.IsMatch(szReadAll, Regex.Escape(str.ToLower())))
                            {
                                if (!ids.Contains(id))
                                {
                                    ids[mn] = id;
                                }
                                Session["ids"] = ids;
                            }
                        }
                    }
                }
 
                else if (ext == ".pdf")
                {
                    string randomName1 = DateTime.Now.Ticks.ToString();
                    string fname1 = "";
 

 
                    object Target1 = Server.MapPath("~/Temp/" + randomName1 + ".txt");
                    fname1 = Target1.ToString();
 
                    PDDocument doc = PDDocument.load(fileName);
                    PDFTextStripper stripper = new PDFTextStripper();
                    string s = stripper.getText(doc).ToLower();
                    System.IO.StreamWriter LogFile = new System.IO.StreamWriter(fname1, true);
                    LogFile.WriteLine(s);
                    LogFile.Close();
                    // Read the extracted text once, then test every keyword against it
                    string szReadAll = File.ReadAllText(fname1).ToLower();
                    foreach (string str in str1)
                    {
                        if (string.IsNullOrEmpty(str) == false)
                        {
                            // Escape the keyword so that characters such as "+" or "." are matched literally
                            if (Regex.IsMatch(szReadAll, Regex.Escape(str.ToLower())))
                            {
                                if (!ids.Contains(id))
                                {
                                    ids[mn] = id;
                                }
                                Session["ids"] = ids;
                            }
                        }
                    }
                }
                mn++;
 
            }
 

 

            //Upload the word document and save to Temp folder
            // FileUpload1.SaveAs(Server.MapPath("~/Temp/") + Path.GetFileName(FileUpload1.PostedFile.FileName));

        }
[Edit]Code block added[/Edit]
Posted 9-Jan-13 7:53am
Edited 9-Jan-13 7:55am
ProgramFOX150.6K
v2

Solution 1

I perfectly understand if you don't want to mess with a Microsoft Office installation and Office interop, but first of all, think about why you are dealing with Microsoft Office documents at all: a proprietary product is proprietary. These days, there are a number of other options.

Nevertheless, the latest versions of the Office document formats are not so proprietary. You can always learn them, as they are now standardized. Please see:
http://en.wikipedia.org/wiki/Office_Open_XML,
http://en.wikipedia.org/wiki/Microsoft_Office_XML_formats,
http://en.wikipedia.org/wiki/Office_Open_XML_file_formats.

(Don't mix them up with OpenDocument, http://en.wikipedia.org/wiki/OpenDocument.)
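
To illustrate what "learning the format" can look like for the newer files: a .docx is a ZIP-based package, and the visible text sits in WordprocessingML w:t elements inside paragraph (w:p) elements. The following is only a minimal sketch under several assumptions, not the Open XML SDK and not code from the question. It assumes .NET 4 with a reference to WindowsBase.dll (for System.IO.Packaging), assumes the main part is at the conventional /word/document.xml location instead of looking it up through the package relationships, and assumes the Transitional namespace. The names DocxText, ExtractText and ReadDocx are made up for this example. It does nothing for the old binary .doc format.

```csharp
using System;
using System.IO;
using System.IO.Packaging;   // assumption: project references WindowsBase.dll
using System.Text;
using System.Xml.Linq;

public static class DocxText
{
    // WordprocessingML (Transitional) namespace of the w:p and w:t elements
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Collects the text of every paragraph in a parsed word/document.xml,
    // one line per paragraph. Text inside text boxes may appear twice,
    // which does not matter for a keyword search.
    public static string ExtractText(XDocument documentXml)
    {
        var sb = new StringBuilder();
        foreach (XElement paragraph in documentXml.Descendants(W + "p"))
        {
            foreach (XElement run in paragraph.Descendants(W + "t"))
            {
                sb.Append(run.Value);
            }
            sb.AppendLine();
        }
        return sb.ToString();
    }

    // Opens a .docx package read-only and extracts the text of its main part.
    // Assumption: the main part is stored at /word/document.xml.
    public static string ReadDocx(string path)
    {
        using (Package package = Package.Open(path, FileMode.Open, FileAccess.Read))
        {
            PackagePart part = package.GetPart(
                new Uri("/word/document.xml", UriKind.Relative));
            using (Stream stream = part.GetStream(FileMode.Open, FileAccess.Read))
            {
                return ExtractText(XDocument.Load(stream));
            }
        }
    }
}
```

In a search like the one in the question, the string returned by DocxText.ReadDocx(fileName) could be lowercased and tested against the keywords directly, without writing a temporary .txt file first.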

Now, there is another approach to it. There are third-party products that work with Microsoft Office documents. If they can do it, you can, too. You just need to download the source code of an open-source product and find out how it works. The only open-source products I know of are OpenOffice itself (where .odt came from) and its fork, LibreOffice. Please see:
http://en.wikipedia.org/wiki/OpenOffice.org,
http://www.openoffice.org/,
http://en.wikipedia.org/wiki/LibreOffice,
http://www.libreoffice.org/.

You can download the source and find the code that works with nearly all versions of Office documents and, of course, with .odt and all other OpenOffice/LibreOffice document formats.

Please also see my past answers:
Convert Office-Documents to PDF without interop,
Hi how can i display word file in windows application using c#.net.

—SA
Comments
Maciej Los at 9-Jan-13 16:51pm
   
Agree, +5!
Sergey Alexandrovich Kryukov at 9-Jan-13 17:03pm
   
Thank you, Maciej.
—SA

Solution 4

See my comment to Sergey's answer, and read this: http://a.nnotate.com/server-installation-windows.html - section: Adding support for uploading DOC, PPT, XLS etc using OpenOffice.
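
As a rough illustration of the server-side conversion idea, here is a hedged sketch that shells out to an office suite to turn a .doc or .docx into plain text, which the existing keyword search could then read. It assumes LibreOffice specifically, whose soffice executable accepts --headless, --convert-to and --outdir; I have not checked that the OpenOffice setup described on the linked page accepts the same options. OfficeToText, BuildArguments, Convert and the sofficePath parameter are hypothetical names for this example, and the account running the IIS application pool would need permission to start the process and write to the output folder.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class OfficeToText
{
    // Builds the command line for a headless conversion to plain text.
    // Pass outputDir without a trailing backslash, otherwise the backslash
    // would escape the closing quote.
    public static string BuildArguments(string inputPath, string outputDir)
    {
        return string.Format(
            "--headless --convert-to txt:Text --outdir \"{0}\" \"{1}\"",
            outputDir, inputPath);
    }

    // Runs the conversion and returns the path where the .txt file is
    // expected to be written (input file name with a .txt extension).
    // sofficePath is an assumption: the caller supplies the install location.
    public static string Convert(string sofficePath, string inputPath, string outputDir)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = sofficePath,
            Arguments = BuildArguments(inputPath, outputDir),
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (Process process = Process.Start(startInfo))
        {
            process.WaitForExit();
            if (process.ExitCode != 0)
            {
                throw new InvalidOperationException(
                    "soffice exited with code " + process.ExitCode);
            }
        }

        return Path.Combine(
            outputDir, Path.GetFileNameWithoutExtension(inputPath) + ".txt");
    }
}
```

Starting a new process for every document on every search is slow, so it would probably make more sense to convert each file once at upload time and search the stored text afterwards.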

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 10 Jan 2013
Copyright © CodeProject, 1999-2015