Converting a Microsoft Word document to a text file in C#

ProgramFOX, 5 Jan 2014, CPOL
This tip explains how to convert a Microsoft Word document to a text file in C#, using the Microsoft Word Object Library.

Introduction

In this tip, I'll explain how to convert a Microsoft Word document to a text file in C#. The code automates Word itself, so Word must be installed on the machine that runs it.

Adding a reference to the Microsoft Word Object Library

The first step is to add a reference to the Microsoft Word Object Library. In Visual Studio, choose "Add Reference...", go to "COM", and select "Microsoft Word [version number here] Object Library".

I use the Microsoft Word 15.0 Object Library, which is the library of Word 2013. Your version number may differ from 15.0, depending on the version of Word you have installed.

The code

At the top of the code file, we add the following using directives:

using System.IO;
using Word = Microsoft.Office.Interop.Word;

With this alias, we can write Word.Document instead of Microsoft.Office.Interop.Word.Document, for example. Next, we ask the user which file they want to convert, using the following code:

Console.WriteLine("Please enter the full file path of your Word document (without quotes):");
object path = Console.ReadLine();
Console.WriteLine("Please enter the file path of the text document in which you want to store the text of your Word document (without quotes):");
string txtPath = Console.ReadLine();

As the prompt says, the Word document needs a full path. Word runs as a separate process with its own working directory, so if you just enter test.docx, Word will try to open C:\Windows\system32\test.docx instead of the test.docx file in the folder of the converter. For the text file, a relative path such as test.txt is fine, because File.WriteAllText resolves it against the working directory of the converter, so the file is created in the folder of the converter. The path to the Word file is declared as an object rather than a string, because the Open method takes its parameters as ref object. Now, we'll open the Word file and retrieve the text using the following code:

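// Start a Word instance that will open the document for us.
Word.Application app = new Word.Application();
Word.Document doc;
// Type.Missing tells Word to use the default value for an optional parameter.
object missing = Type.Missing;
object readOnly = true;
try
{
    // Arguments: 1 = file path, 3 = open as read-only; all others keep their defaults.
    doc = app.Documents.Open(ref path, ref missing, ref readOnly, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing);
    // Content.Text holds the text of the main document body.
    string text = doc.Content.Text;
    File.WriteAllText(txtPath, text);
    Console.WriteLine("Converted!");
}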

Here, we create a Word Application and use it to open the document. The first argument of the Open method is the file path, and the third argument specifies whether to open the file as read-only (yes, in this case). All other arguments are left at their defaults by passing Type.Missing. The text of the document is available in Content.Text, and the File.WriteAllText method writes it to the text file. Now, we'll create the catch and finally blocks:

catch
{
    Console.WriteLine("An error occurred. Please check the file path to your Word document, and whether the Word document is valid.");
}
finally
{
    // Runs whether or not the conversion succeeded, so Word never stays open in the background.
    object saveChanges = Word.WdSaveOptions.wdDoNotSaveChanges;
    app.Quit(ref saveChanges, ref missing, ref missing);
}

Because we don't want to save the changes (we didn't even make changes), we use WdSaveOptions.wdDoNotSaveChanges. The Application.Quit method closes all open documents, and quits the Word Application. If we merge all code snippets, we get this:

using System;
using System.IO;
using Word = Microsoft.Office.Interop.Word;

class Program
{
    static void Main()
    {
        Console.WriteLine("Please enter the full file path of your Word document (without quotes):");
        object path = Console.ReadLine();
        Console.WriteLine("Please enter the file path of the text document in which you want to store the text of your Word document (without quotes):");
        string txtPath = Console.ReadLine();
        Word.Application app = new Word.Application();
        Word.Document doc;
        object missing = Type.Missing;
        object readOnly = true;
        try
        {
            // Arguments: 1 = file path, 3 = open as read-only; all others keep their defaults.
            doc = app.Documents.Open(ref path, ref missing, ref readOnly, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing);
            string text = doc.Content.Text;
            File.WriteAllText(txtPath, text);
            Console.WriteLine("Converted!");
        }
        catch
        {
            Console.WriteLine("An error occurred. Please check the file path to your Word document, and whether the Word document is valid.");
        }
        finally
        {
            // Close all documents without saving and quit Word.
            object saveChanges = Word.WdSaveOptions.wdDoNotSaveChanges;
            app.Quit(ref saveChanges, ref missing, ref missing);
        }
    }
}
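The full-path requirement for the Word document can be relaxed by resolving the user's input before handing it to Word. The following is only a sketch of that idea, not part of the original converter: the class InputPaths and the method ToAbsolute are hypothetical names, and I assume that relative paths should be resolved against the working directory of the converter, which is what File.WriteAllText does for the text file.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the original tip.
public static class InputPaths
{
    // Turns the text the user typed into an absolute path.
    // Relative input is resolved against baseDirectory;
    // input that is already a full path is returned as a full path unchanged.
    public static string ToAbsolute(string input, string baseDirectory)
    {
        if (string.IsNullOrWhiteSpace(input))
            throw new ArgumentException("No path was entered.", nameof(input));

        // Remove surrounding whitespace and quotes, in case the path was pasted with them.
        string trimmed = input.Trim().Trim('"');

        // Path.Combine returns 'trimmed' as-is when it is already rooted.
        return Path.GetFullPath(Path.Combine(baseDirectory, trimmed));
    }
}
```

In the converter, this could be used as object path = InputPaths.ToAbsolute(Console.ReadLine(), Environment.CurrentDirectory); so that entering test.docx refers to the file in the working directory of the converter rather than a file in the working directory of Word.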

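One thing to be aware of when writing Content.Text to a file: as far as I know, Word ends each paragraph with a single carriage return (\r) and uses a vertical tab (\v) for manual line breaks, so some editors may show the result as one long line. The sketch below converts both to the line ending of the current platform. The class WordText and the method ToPlainText are hypothetical names, and the helper only handles these two characters; other control characters are left untouched.

```csharp
using System;

// Hypothetical helper, not part of the original tip.
public static class WordText
{
    // Converts paragraph marks and manual line breaks in Word text
    // into the line ending of the current platform.
    public static string ToPlainText(string wordText)
    {
        if (wordText == null)
            throw new ArgumentNullException(nameof(wordText));

        return wordText
            .Replace("\r\n", "\n")                 // keep existing CR+LF pairs from being doubled
            .Replace('\r', '\n')                   // paragraph mark
            .Replace('\v', '\n')                   // manual line break
            .Replace("\n", Environment.NewLine);   // platform line ending
    }
}
```

In the converter, the write would then become File.WriteAllText(txtPath, WordText.ToPlainText(doc.Content.Text));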
History

  • 5 Jan 2014: First version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

ProgramFOX
Belgium
I'm a hobbyist programmer. My favourite languages are C#, JavaScript, and Python. I also like chess; you can find me on Lichess.

Article Copyright 2014 by ProgramFOX