Posted 26 Sep 2007

Help System Automation

Word Document Automation


This simple program shows how to generate a help system from an existing Word document. The program generates HTML files and an XML file that can be added to a Web project.

In this article, I just give the main ideas. For more details, check the source code and the sample Word document.

Who This Article Is For

This article is for developers who would like to start writing Microsoft Word automation programs.

Microsoft Word 2000

This program relies on the styles and formatting in your Word document, which are later converted into XML files.

I used the interop DLL of the Word application; reference it in your project.

(Microsoft Office 11.0 Object Library)

Project in More Detail

Word 2000 has built-in styles such as TOC, TOCEntry, Heading, and so on. My program relies on these styles to drive the automation.

So anyone who wants to use my program must apply these styles in the document.

This program generates two XML files:

  1. Table of contents
  2. Document

The Document XML file is converted to HTML files, and the table of contents XML file can be used as a DataSource for a tree or any other navigation control.

Using Library Word Part 1

Add a reference to the Microsoft Word 11.0 Object Library.
Look at WordApp.cs.

Add the using alias:

using Word = Microsoft.Office.Interop.Word;

I use a Word.ApplicationClass instance to gain access to the Word document's properties, text, and so on:

Word.ApplicationClass wordApplication;

String WordFilePath; // this is the path of your document

To Open Word Document

//------ var
private Word.Document doc;
private Word.Paragraphs DocParagraphs;
public String WordFilePath;
private Word.InlineShapes Inshapes;

The following code opens the Word document and stores it in the doc object.

wordApplication = new Word.ApplicationClass();
object o_nullobject = System.Reflection.Missing.Value; // placeholder for optional arguments
object o_filePath = WordFilePath;
object o_visible = false;  // do not show the opened document window
object o_readOnly = true;  // open the document read-only
wordApplication.Visible = false; // make Microsoft Word work in the background
// Arguments: FileName, ConfirmConversions, ReadOnly, ..., Visible (12th), ...
doc = wordApplication.Documents.Open(ref o_filePath,
    ref o_nullobject, ref o_readOnly, ref o_nullobject, ref o_nullobject, ref o_nullobject,
    ref o_nullobject, ref o_nullobject, ref o_nullobject, ref o_nullobject, ref o_nullobject,
    ref o_visible, ref o_nullobject, ref o_nullobject, ref o_nullobject, ref o_nullobject);

Get Inline Shapes

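public Word.InlineShapes getInlineDocumentShape()
{
    // The original source also has a loop here:
    //   foreach (Word.Shape W in doc.Shapes)
    // Its body is not shown in this article; see WordApp.cs.

    // Inline shapes are the pictures embedded in the text flow.
    Inshapes = doc.InlineShapes;
    return Inshapes;
}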

Get Word Paragraphs

public Word.Paragraphs getDocumentParagraphs()
{
    DocParagraphs = doc.Paragraphs;
    return DocParagraphs;
}

Converting to XML

See DocumentParser.cs for more details.

The parser writes two files:

TableOfContent.xml: for the table of contents
Document.xml: for the Word document's paragraphs and images

public void ParsToXml() {...}
XmlTextWriter tocWriter; // table of contents writer
XmlTextWriter parWriter; // paragraph writer

Word.Paragraphs pars = getDocumentParagraphs(); // get the Word paragraphs
Word.InlineShapes inShapes = getInlineDocumentShape(); // get the Word images

Now start the loop for each paragraph to get style and text.

Paragraph Styles

  • Heading: every topic in the document starts with Heading(N)
    • N = 1: main topic
    • N > 1: subtopic
  • TOC: every topic in the table of contents starts with TOC(N)
    • N = 1: main topic
    • N > 1: subtopic
  • ImageStyle: every image in the document has this style.

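    // Word collections are 1-based, so the index starts at 1.
    for (index = 1; index < pars.Count; index++)
    {
        style = ((Word.Style)pars[index].get_Style()).NameLocal;
        // ... handle the paragraph according to its style
    }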
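
The branching on the style name can be sketched without Word at all. The following is a minimal sketch, assuming the style names follow the "Heading N" / "TOC N" pattern of Word's built-in English styles; the ParagraphKind enum and the StyleClassifier class are hypothetical names that do not appear in the article's source code.

```csharp
using System;

// Hypothetical helper types, not part of the article's source.
public enum ParagraphKind { Heading, Toc, Image, Other }

public static class StyleClassifier
{
    // Splits a style name such as "Heading 2" or "TOC 1" into its kind and level N.
    // Assumption: names follow the "<prefix> <N>" pattern of the built-in English styles.
    public static ParagraphKind Classify(string styleName, out int level)
    {
        level = 0;
        if (styleName == "ImageStyle")
            return ParagraphKind.Image;
        if (TryReadLevel(styleName, "Heading ", out level))
            return ParagraphKind.Heading;
        if (TryReadLevel(styleName, "TOC ", out level))
            return ParagraphKind.Toc;
        return ParagraphKind.Other;
    }

    // Returns true when styleName is the prefix followed by an integer level.
    private static bool TryReadLevel(string styleName, string prefix, out int level)
    {
        level = 0;
        return styleName.StartsWith(prefix, StringComparison.Ordinal)
            && int.TryParse(styleName.Substring(prefix.Length), out level);
    }
}
```

Inside the paragraph loop, the value read from NameLocal would be passed to Classify, and level 1 would mark a main topic while higher levels mark subtopics. Localized versions of Word use different style names, so the prefixes would need adjusting there.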

Format and Style of Table of Content

if(style.StartsWith("TOC ")) //this style of table of content

Every table of contents entry has a style that starts with [TOC ], followed by its level:


  • [TOC1] 1.Introduction
  • [TOC2] 1.1 Author
  • [TOC2] 1.2 About
  • [TOC3] 1.2.1 About book

{ ... take the difference between the current level and the next level }

A sample table of contents XML file looks like this:

<Topic level="4" name="INTERFACE REQUIREMENTS" page="6">
  <Topic level="4.1" name="User Interfaces" page="6">
    <Topic level="4.1.1" name="Accessibility" page="6" />
    <Topic level="4.1.2" name="System messages" page="6" />
    <Topic level="4.1.3" name="Paging" page="7" />
    <Topic level="4.1.4" name="Data lists and Data grids" page="7" />
  </Topic>
  <Topic level="4.2" name="Hardware Interfaces" page="8" />
  <Topic level="4.3" name="Software Interfaces" page="8">
    <Topic level="4.3.1" name="Operating Platform" page="8" />
    <Topic level="4.3.2" name="Storage engine" page="8" />
    <Topic level="4.3.3" name="External data sources" page="8" />
  </Topic>
</Topic>
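
One way to turn a flat list of TOC entries into nested Topic elements is to compare each entry's depth with the depths of the topics that are still open. The following is a minimal sketch of that idea, not the code from DocumentParser.cs: the TocEntry class, the TocXml class, and the TableOfContent root element are hypothetical, and the page attribute is left out for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Xml;

// Hypothetical record for one table of contents paragraph.
public class TocEntry
{
    public int Depth;     // N from the "TOC N" style
    public string Level;  // numbering such as "4.1.2"
    public string Name;   // topic title

    public TocEntry(int depth, string level, string name)
    {
        Depth = depth;
        Level = level;
        Name = name;
    }
}

public static class TocXml
{
    // Writes nested <Topic> elements; the root element name is an assumption.
    public static string Build(IEnumerable<TocEntry> entries)
    {
        StringBuilder sb = new StringBuilder();
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;
        using (XmlWriter w = XmlWriter.Create(sb, settings))
        {
            w.WriteStartElement("TableOfContent");
            Stack<int> open = new Stack<int>(); // depths of the Topic elements still open
            foreach (TocEntry e in entries)
            {
                // Close every open topic that is at the same depth or deeper.
                while (open.Count > 0 && open.Peek() >= e.Depth)
                {
                    w.WriteEndElement();
                    open.Pop();
                }
                w.WriteStartElement("Topic");
                w.WriteAttributeString("level", e.Level);
                w.WriteAttributeString("name", e.Name);
                open.Push(e.Depth);
            }
            w.WriteEndDocument(); // closes all elements that are still open
        }
        return sb.ToString();
    }
}
```

Keeping a stack of open depths, rather than a single counter, means the writer stays balanced even if the document skips a level (for example, a TOC 3 entry directly under a TOC 1 entry).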


If the Style is ImageStyle

InlineShapes inShapes = getInlineDocumentShape(); // get the inline shapes from the document
// mindex: index of the inline shape in the document
inShapes[mindex].Select(); // select the shape so that it can be copied to the clipboard
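
The article does not show how the copied picture ends up in the XML, but the Document.xml sample in this article stores what appears to be base64 text inside an Image element. The following is a minimal sketch of that last step only, assuming the image bytes are already available as a byte array; the ImageNode class is a hypothetical name, and the real code in DocumentParser.cs may work differently.

```csharp
using System;
using System.Text;
using System.Xml;

// Hypothetical helper, not part of the article's source.
public static class ImageNode
{
    // Writes <Image src="...">base64 data</Image> for the given raw image bytes.
    public static string Build(string src, byte[] imageBytes)
    {
        StringBuilder sb = new StringBuilder();
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;
        using (XmlWriter w = XmlWriter.Create(sb, settings))
        {
            w.WriteStartElement("Image");
            w.WriteAttributeString("src", src);
            // XmlWriter encodes the bytes as base64 text content.
            w.WriteBase64(imageBytes, 0, imageBytes.Length);
            w.WriteEndElement();
        }
        return sb.ToString();
    }
}
```

For example, ImageNode.Build("2.21", new byte[] { 1, 2, 3 }) returns <Image src="2.21">AQID</Image>.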

To Get Words of Paragraphs

If the style is Heading:

Word.Words words;
words = pars[index].Range.Words;//take words of paragraph

Check whether each word belongs to a list:

for (windex = 1; windex <= words.Count; windex++)
{
    if (words[windex].FormattedText.ListFormat.ListType.ToString() == "wdListNoNumbering")
    {
        // ... check the format of each word and write it to an XML Text node
        // using FormatingFunction(,)
    }
    // {... write it in a list node ...}
}

FormatingFunction returns the formats of one word as a comma-separated string:

public String FormatingFunction(Word.Words obj, int index)
{
    if (index > obj.Count)
        return "";
    String fr = "";
    // Word reports "true" for Bold and Italic as -1.
    if (obj[index].Bold.ToString() == "-1")
    {
        fr = "Bold";
    }
    if (obj[index].Italic.ToString() == "-1")
    {
        if (fr != "") { fr += "," + "Italic"; }
        else { fr = "Italic"; } // the else keeps an earlier "Bold" from being overwritten
    }
    if (obj[index].Underline.ToString() == "wdUnderlineSingle")
    {
        if (fr != "") { fr += "," + "UnderLine"; }
        else { fr = "UnderLine"; }
    }
    return fr;
}
An excerpt of the resulting Document.xml looks like this:

<Text Format="">is </Text>
<Text Format="Italic">Performance Management System </Text>
<Text Format="">that helps you collect different measures and make faster and smarter
decisions through a set of user friendly customizable dashboards and scorecards
targeted for each and every member of your organization.</Text>
<Text Format="Italic" />
<Topic Name="Product Features" Level="2.2">
<Text Format="" />
<Image src="2.21">j7B/wBND+VFFAB/Z4/56fpR9g/6aH8qKKAD7AP+eh/KroGABRRQB//Z</Image>
<Text Format="">The figure above provides a high level vision of </Text>
<Text Format="Bold">Cub </Text>
<Text Format="">solution. The vision includes the idea of hiding the complexity of
creating ETL (Extract, Transform &amp; Load) processes, a data warehouse and an
OLAP database for analysis from the end user.</Text>
<Text Format="" />
<Text Format="">It will provide end users with a sub-set of the features offered by
the underlying systems, taking into account the ability to extend this set
in future releases. As well as linking with existing DW and OLAP database
provided as part of an implementation service.</Text>
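
The Format attribute values in the sample are comma-separated names. As an alternative to chaining if/else blocks, the value can be assembled from a list and joined at the end. This is a minimal sketch with a hypothetical FormatNames class; it takes plain booleans so that it runs without Word, and in the article's code the flags would come from comparing Bold and Italic with -1 and Underline with wdUnderlineSingle.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the article's source.
public static class FormatNames
{
    // Builds the comma-separated Format value from three flags.
    public static string Build(bool bold, bool italic, bool underline)
    {
        List<string> parts = new List<string>();
        if (bold) parts.Add("Bold");
        if (italic) parts.Add("Italic");
        if (underline) parts.Add("UnderLine");
        // Joining at the end removes the need to check whether a comma is required.
        return string.Join(",", parts.ToArray());
    }
}
```

For example, FormatNames.Build(true, true, false) returns "Bold,Italic", and FormatNames.Build(false, false, false) returns an empty string, matching the Format="" entries in the sample.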

Converting XML to HTML

I built an HTML converter that converts the XML nodes to HTML. See HtmlConvertor.cs.
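
To give an idea of what such a conversion involves, here is a minimal sketch that turns a single Text node into an HTML fragment. The TextNodeToHtml class and the mapping of Bold, Italic, and UnderLine to the b, i, and u tags are assumptions made for illustration; the actual HtmlConvertor.cs in the download may produce different markup.

```csharp
using System;
using System.Net;
using System.Xml;

// Hypothetical helper, not part of the article's source.
public static class TextNodeToHtml
{
    // Converts <Text Format="Bold,Italic">...</Text> into an HTML fragment.
    public static string Convert(XmlElement text)
    {
        // Encode the text first so that characters such as & and < stay valid in HTML.
        string html = WebUtility.HtmlEncode(text.InnerText);
        string format = text.GetAttribute("Format"); // empty string when the attribute is missing
        foreach (string f in format.Split(new char[] { ',' }, StringSplitOptions.RemoveEmptyEntries))
        {
            if (f == "Bold") html = "<b>" + html + "</b>";
            else if (f == "Italic") html = "<i>" + html + "</i>";
            else if (f == "UnderLine") html = "<u>" + html + "</u>";
        }
        return html;
    }
}
```

A short usage example:

```csharp
XmlDocument doc = new XmlDocument();
doc.LoadXml("<Text Format=\"Bold\">Cub </Text>");
string html = TextNodeToHtml.Convert(doc.DocumentElement); // "<b>Cub </b>"
```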

Future Plans

I will expand this article with more explanation.

Watch for future articles:

  • Dynamic Online Flexible GridView
  • SpyWare


History

  • 26th September, 2007: Initial post


This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Amr M. K.
Web Developer
Jordan
- Need is the mother of the invitation .
- To be acknowledge is matter of time..
be skillfull..needs...

Last Updated 26 Sep 2007
Article Copyright 2007 by Amr M. K.