Posted 18 Dec 2003

Automating MS Word Using Visual Studio .NET

Vahe Karamian, 18 Dec 2003
This code demonstrates how to automate MS Word and retrieve content from a Word document.


One day you may be asked to write a parser that breaks a pile of documents down into a structured model and stores them in a relational database. Those documents will most likely be written in MS Word. The sad part is that they will have no structure, will follow no standard, and will include OLE/embedded objects. I was assigned such a task, and it was a very interesting experience for me.

Since this was my first project that involved automating an MS Office application, I had to do a lot of reading about automation. The good news is that there is plenty of material out there; the bad news is that nearly all of it is written for VB or VBA developers. I couldn’t find anything for C++ developers. Long story short, I am writing this article to make things a little easier for anyone else who is assigned a similar task.

The original code I worked on was written using Borland C++ Builder 5 Professional. This article, however, covers the C# version. If you are interested in seeing the C++ version, ask for it and I will post it. I encourage people to take a look at the C++ version sometime to start appreciating the simplicity of C#.


No special background is necessary; some hands-on experience with C# is enough.

Using the code

I am going to include some code that shows how to get what you need from a Word document. It doesn’t matter whether you are making a console application or a Windows application; the steps and the code are the same. So go ahead and create a new C# project. You may prefer a Windows Application, so that you have a button to click.

Once you have created the project, right-click References in the Solution Explorer and select Add Reference… When the Add Reference window comes up, select the COM tab. This lists all the components available on your machine. Since we are going to use MS Word, scroll down until you find Microsoft Word 9.0 Object Library.

Note: Yours might be a different version depending on the version of Office installed on your machine. This is for MS Word 2000.
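If tying the project to one Word version is a concern, late binding is one way around it. The following is only a sketch, assuming C# 4.0 or later (for the dynamic keyword) and a Windows machine; the class and method names are hypothetical and are not part of the article's project.

```csharp
using System;

static class LateBoundWord
{
    // Opens a document without referencing a version-specific object library.
    // Returns false if no Word installation is registered on this machine.
    public static bool TryShowDocument(string path)
    {
        // Look up whichever Word version is registered under this ProgID.
        Type wordType = Type.GetTypeFromProgID("Word.Application");
        if (wordType == null) return false;

        // 'dynamic' resolves members at run time through COM's IDispatch,
        // so no compile-time reference to a Word library is needed.
        dynamic word = Activator.CreateInstance(wordType);
        word.Visible = true;
        word.Documents.Open(path);   // optional arguments are simply left out
        return true;
    }
}
```

The trade-off is that you lose IntelliSense and compile-time checking, so a misspelled member name only fails when the line runs.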

using System;
using System.Drawing;
using System.Collections;
using System.ComponentModel;
using System.Windows.Forms;
using System.Data;

namespace sparser
{
  /// Summary description for Form1.
  public class frmParserMainUI : System.Windows.Forms.Form
  {
    /// User Interface Objects
    /// I have removed the user interface objects, since
    /// they have nothing to do with
    /// the actual code, and they take a lot of space.

    /// Required designer variable.
    private System.ComponentModel.Container components = null;
    private System.Windows.Forms.OpenFileDialog openFileDialog;

The following block creates the MS Word COM object. This is the object used to access the Word application's functions. You can browse the available functions either in the Visual Studio .NET IDE or in MS Word.

/// MS Word COM Object
/// This is where we create our WORD object
private Word.ApplicationClass vk_word_app = new Word.ApplicationClass();


To view the functions from MS Word, launch Word and press Alt+F11 to open the VBA window. Once there, press F1 to open the help window and search for "document object". This is the best source of documentation for the available functions. The documentation is written for VBA, but at least you learn what each function does and what parameters it takes.

All right, the code continues...


/// The main entry point for the application.
[STAThread]
static void Main()
{
  Application.Run(new frmParserMainUI());
}

/// Get source document. Open a FileDialog window for
/// user to select single/multiple files for
/// parsing.
private void butSourceDocument_Click(object sender, System.EventArgs e)
{
  if( openFileDialog.ShowDialog() == DialogResult.OK )
  {
    // The Word functions take "ref object" arguments, so box everything
    object  fileName = openFileDialog.FileName;
    object  saveFile = fileName + "_Vk.doc";

    object  vk_read_only  = false;
    object  vk_visible  = true;
    object  vk_false    = false;
    object  vk_true    = true;
    object  vk_dynamic  = 2;  // 2 == wdPasteText (used by PasteSpecial)

    // Stand-in for optional arguments we do not want to supply
    object  vk_missing  = System.Reflection.Missing.Value;

    // Let's make the Word application visible and activate it
    vk_word_app.Visible = true;
    vk_word_app.Activate();

    // Let's open the document
    Word.Document vk_my_doc = vk_word_app.Documents.Open(
      ref fileName, ref vk_missing, ref vk_read_only,
      ref vk_missing, ref vk_missing,
      ref vk_missing, ref vk_missing, ref vk_missing, ref vk_missing,
      ref vk_missing, ref vk_missing, ref vk_visible );


All right, so here the user is given a file open dialog where they can select a Word document. Notice that we store the file name as an object. This is because the Word functions we call take their parameters by reference to object (ref object).
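One side effect of building the save name as fileName + "_Vk.doc" is that a source file called report.doc is saved as report.doc_Vk.doc. If you would rather place the suffix before the extension, a small helper along these lines could be used. This is a sketch only; SavePathSketch and BuildSavePath are hypothetical names, not part of the article's code.

```csharp
using System;
using System.IO;

static class SavePathSketch
{
    // Inserts the suffix between the file name and its extension,
    // keeping the file in the same directory as the source document.
    public static string BuildSavePath(string sourcePath, string suffix = "_Vk")
    {
        string dir  = Path.GetDirectoryName(sourcePath) ?? "";
        string name = Path.GetFileNameWithoutExtension(sourcePath);
        string ext  = Path.GetExtension(sourcePath);   // includes the leading dot
        return Path.Combine(dir, name + suffix + ext);
    }
}
```

The result is a string, so it still has to be assigned to an object variable before it is passed by ref to SaveAs.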

So now we can use our Word object to start Word. This is achieved by vk_word_app.Visible = true; and vk_word_app.Activate();. The first statement makes sure that the instance of Word is visible, and the second one activates it. If you don't want the Word instance to be visible, just set the Visible property to false.

Next we create a Word Document object using Word.Document vk_my_doc = vk_word_app.Documents.Open( ... ); Notice all the parameters the function requires. Most of them are optional arguments that we have no value for, so we pass vk_missing, which holds System.Reflection.Missing.Value.
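Newer compilers can take most of this ceremony away. The sketch below assumes C# 4.0 or later and a reference to the Microsoft.Office.Interop.Word assembly (an assumption; the project in this article uses the Word namespace generated from the Word 9.0 library). For calls on COM interfaces, the compiler allows ref to be omitted and supplies the missing value for every optional parameter you leave out. OpenSketch and OpenVisible are hypothetical names.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

static class OpenSketch
{
    // Opens a document, naming only the arguments we actually care about.
    public static Word.Document OpenVisible(Word.Application app, string path)
    {
        // Every optional parameter that is not named here is filled in
        // by the compiler, so no vk_missing variable is required.
        return app.Documents.Open(FileName: path, ReadOnly: false, Visible: true);
    }
}
```

Because the arguments are named, the call no longer depends on how many parameters Open has in the installed version of the library.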

So now we have created a Word instance and opened a Word document. Now let's move on with the code...


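    // Let's create a new document
    Word.Document vk_new_doc = vk_word_app.Documents.Add(
    ref vk_missing, ref vk_missing, ref vk_missing, ref vk_visible );

    // Select and Copy from the original document
    // (these statements are absent from this listing; they are
    // reconstructed from the description that follows)
    vk_my_doc.Activate();
    vk_my_doc.Select();
    vk_word_app.Selection.Copy();

    // Paste into new document as unformatted text
    // (vk_dynamic is 2, which is wdPasteText)
    vk_new_doc.Activate();
    vk_word_app.Selection.PasteSpecial( ref vk_missing, ref vk_false,
      ref vk_missing, ref vk_false, ref vk_dynamic,
      ref vk_missing, ref vk_missing );

    // close the original document without saving changes
    vk_my_doc.Close( ref vk_false, ref vk_missing, ref vk_missing );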

Next we would like to create a new document. This is just like clicking the New Blank Document button on the toolbar. So we create another Word Document object, this time using vk_word_app.Documents.Add( ... ); which adds a new blank document that is also visible.

Next we select all the content of the document we opened and copy it.

Then we paste the content into the new document in a special format. The format used in the code is plain text, because we want to get rid of the formatting clutter Word attaches to the text. Finally we close the original document without saving any changes.


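    // Wrap argument for Find.Execute. Its declaration is not shown in
    // this listing; wdFindContinue is an assumed value.
    object vk_num = Word.WdFindWrap.wdFindContinue;

    // Replace every sequence of seven tabs with a single tab
    FindAndReplace( "^t^t^t^t^t^t^t", "^t", vk_num );

    // Save the new document
    vk_new_doc.SaveAs( ref saveFile, ref vk_missing,
      ref vk_missing, ref vk_missing, ref vk_missing,
      ref vk_missing, ref vk_missing, ref vk_missing,
      ref vk_missing, ref vk_missing, ref vk_missing );

    // close the new document
    vk_new_doc.Close( ref vk_false, ref vk_missing, ref vk_missing );

    // close word application
    vk_word_app.Quit( ref vk_false, ref vk_missing, ref vk_missing );
  }
}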


Now let's say we are interested in doing a find and replace operation. We basically select the whole document and use the Find and Replace function.

Note: The FindAndReplace function does not belong to the Word object or the Document object. It is user-defined, and it is shown in the following block.

Next we save our changes to the new document and close it.

And finally we quit Word.


    private void FindAndReplace( object vk_find, object vk_replace,
       object vk_num )
    {
      object  vk_false      = false;
      object  vk_true      = true;
      object  vk_dynamic    = 2;  // 2 == wdReplaceAll

      // Arguments in order: FindText, MatchCase, MatchWholeWord,
      // MatchWildcards, MatchSoundsLike, MatchAllWordForms, Forward,
      // Wrap, Format, ReplaceWith, Replace, then four language options
      vk_word_app.Selection.Find.Execute( ref vk_find,
        ref vk_false, ref vk_false,
        ref vk_false, ref vk_false, ref vk_false, ref vk_true,
        ref vk_num, ref vk_false,
        ref vk_replace, ref vk_dynamic, ref vk_false,
        ref vk_false, ref vk_false, ref vk_false );
    }

The above block shows the actual FindAndReplace function. Notice that it works through the Find object of vk_word_app.Selection, so it operates on the active selection. Find and Replace is simply the Find object's Execute function called with a replacement text and a Replace mode. These parameters are described in the documentation I mentioned earlier in the article.
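In Word's find syntax, ^t stands for a tab character, so the call made earlier replaces every sequence of seven tabs with one. If you have already pulled the text out of Word, the same cleanup can be done in plain .NET. The sketch below is not Word automation; TabCleanup and its methods are hypothetical names, and the second method is a broader variant that assumes the real goal is to collapse any run of tabs.

```csharp
using System;
using System.Text.RegularExpressions;

static class TabCleanup
{
    // Mirrors the article's call: each occurrence of seven
    // consecutive tabs is replaced with a single tab.
    public static string ReplaceSevenTabs(string text)
    {
        return text.Replace(new string('\t', 7), "\t");
    }

    // Broader variant (an assumption about intent): collapse any
    // run of two or more tabs into a single tab.
    public static string CollapseTabs(string text)
    {
        return Regex.Replace(text, "\t{2,}", "\t");
    }
}
```

The regular expression version does not care how many tabs a particular document happens to use between columns.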

The code above illustrates how to open a Word document, create a new Word document, select, copy, and paste content, and perform a Find and Replace.

As you can see, once you understand how Word operates and go through the documentation, you will be able to automate all the functions that Word has to offer.

// Let's get the content from the document
// (run this while vk_new_doc is still open)
Word.Paragraphs vk_my_doc_p = vk_new_doc.Paragraphs;
// Count number of paragraphs in the file
int p_count = vk_my_doc_p.Count;
// step through the paragraphs (Word collections are 1-based)
for( int i=1; i<=p_count; i++ )
{
  Word.Paragraph vk_p = vk_my_doc_p.Item( i );
  Word.Range vk_r = vk_p.Range;
  string text = vk_r.Text;

  MessageBox.Show( text );
}

If you take a look at the code above, you can see that we obtain the collection of paragraphs in the Word document and, inside the loop, an object that represents a single paragraph. This is how you extract text from a Word document: get the Paragraphs collection, then access each paragraph it contains. The code loops through every paragraph of the document and displays its text in a MessageBox.
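The text of a paragraph's range ends with the paragraph mark (a carriage return), and text taken from a table cell also ends with the end-of-cell mark (character 7). When the text is headed for a database rather than a MessageBox, you will probably want to strip those. A minimal sketch in plain C#, with hypothetical names, that could be applied to each vk_r.Text value:

```csharp
using System;
using System.Collections.Generic;

static class ParagraphText
{
    // Strips the trailing paragraph mark ('\r') and end-of-cell mark ('\a').
    public static string Clean(string rangeText)
    {
        return (rangeText ?? "").TrimEnd('\r', '\a');
    }

    // Cleans each raw paragraph and keeps only those that still contain text.
    public static List<string> NonEmpty(IEnumerable<string> rawParagraphs)
    {
        var result = new List<string>();
        foreach (string raw in rawParagraphs)
        {
            string text = Clean(raw);
            if (text.Trim().Length > 0) result.Add(text);
        }
        return result;
    }
}
```

Collecting the cleaned strings in a list also avoids clicking through one MessageBox per paragraph on a long document.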

Points of Interest

The new version of Office, Office 2003, is going to make things a little easier for Office developers. So if you are an Office developer, you should start looking into the features that Office 2003 has to offer. One feature that I like is the ability to export documents in XML format.


This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.



About the Author

Vahe Karamian
Software Developer Noorcon Inc.
United States
Published Books:

Introduction to Game Programing: Using C# and Unity 3D is designed and developed to help individuals who are interested in the field of computer science and game programming. It is intended to illustrate the concepts and fundamentals of computer programming, and it uses the design and development of simple games to illustrate and apply those concepts.


Vahé Karamian

Last Updated 19 Dec 2003
Article Copyright 2003 by Vahe Karamian