
Create simple Word 2007 documents without needing Word 2007 installed

Mark Focas, 31 Jul 2006
A simple class library that creates Word 2007 documents, packaged according to the Word 2007 specification, without requiring a copy of Word 2007 to be installed.

Resulting document displayed in Word 2007

Introduction

Word 2007 is becoming quite widespread even before its production release: as of July 2006 there have already been over 3 million downloads of Beta 2. This article demonstrates a system for creating simple Word 2007 documents based solely on the packaging and XML formats. It can run on the desktop or on a server and requires no installation of Word.

Background

It is quite common to see articles about using Word on a server to create or manipulate Microsoft Word documents. Often these applications are only tested on a developer's machine and then fail in a production environment, which usually requires running multiple threads. Only one instance of Word can run at a time, so the application must maintain a queue and process each request as Word becomes free. This is especially problematic in a web server environment.

In addition, many system administrators will either refuse to install Word on a server or require an in-depth business case before doing so, so this approach is often not as simple as it could be.

Using the code

A Word 2007 document usually has the extension .docx. There are exceptions for macro-enabled documents, templates and so on; this article looks only at .docx files. A .docx file is actually a group of files and folders zipped up. Open the file with a zip application and you will see a structure like this:

Folder structure of a word document

This is a very simple Word document, and the structure can get a lot more complex. The text content of the document is found in the word\document.xml file.

The code creates the package but takes a very simple approach: it generates the word\document.xml file and adds a standard, fixed set of files for all the auxiliary parts.
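To make the packaging idea concrete, here is a minimal sketch of a hand-built .docx containing one paragraph. It is not DocumentMaker's implementation: the class name MinimalDocx is hypothetical, it uses System.IO.Compression.ZipArchive (.NET 4.5 or later, which postdates this article), and the namespace and content-type URIs are those of the released Open XML standard, which may differ from the ones used by the Beta 2 builds discussed here.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper, not part of DocumentMaker.
public static class MinimalDocx
{
    // Declares the content type of every part in the package.
    const string ContentTypesXml =
        "<?xml version='1.0' encoding='UTF-8'?>" +
        "<Types xmlns='http://schemas.openxmlformats.org/package/2006/content-types'>" +
        "<Default Extension='rels' ContentType='application/vnd.openxmlformats-package.relationships+xml'/>" +
        "<Default Extension='xml' ContentType='application/xml'/>" +
        "<Override PartName='/word/document.xml' ContentType='application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml'/>" +
        "</Types>";

    // Points the package root at the main document part.
    const string RootRelsXml =
        "<?xml version='1.0' encoding='UTF-8'?>" +
        "<Relationships xmlns='http://schemas.openxmlformats.org/package/2006/relationships'>" +
        "<Relationship Id='rId1' Type='http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument' Target='word/document.xml'/>" +
        "</Relationships>";

    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Builds a package with a single plain-text paragraph and returns it as a stream.
    public static MemoryStream Build(string text)
    {
        // XElement escapes characters such as '&' and '<' in the text.
        var documentXml = new XElement(W + "document",
            new XAttribute(XNamespace.Xmlns + "w", W.NamespaceName),
            new XElement(W + "body",
                new XElement(W + "p",
                    new XElement(W + "r",
                        new XElement(W + "t", text)))));

        var output = new MemoryStream();
        // leaveOpen: true keeps the MemoryStream usable after the archive is closed.
        using (var zip = new ZipArchive(output, ZipArchiveMode.Create, true))
        {
            AddEntry(zip, "[Content_Types].xml", ContentTypesXml);
            AddEntry(zip, "_rels/.rels", RootRelsXml);
            AddEntry(zip, "word/document.xml",
                "<?xml version='1.0' encoding='UTF-8'?>" +
                documentXml.ToString(SaveOptions.DisableFormatting));
        }
        output.Position = 0;
        return output;
    }

    // Writes one UTF-8 text entry (without a byte order mark) into the archive.
    static void AddEntry(ZipArchive zip, string name, string content)
    {
        ZipArchiveEntry entry = zip.CreateEntry(name);
        using (var writer = new StreamWriter(entry.Open(), new UTF8Encoding(false)))
        {
            writer.Write(content);
        }
    }
}
```

Calling MinimalDocx.Build("Hello") returns a stream that can be written to a file with a .docx extension. A document produced by Word itself carries more parts (styles, settings, themes and so on), which is what the standard set of auxiliary files supplies.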

DocumentMaker supports a very limited subset of the Word object model. The only styles supported in this limited version are:

  • Heading 1
  • Heading 2
  • Heading 3
  • Paragraphs
  • Bold, italic and underlined text

We begin by creating a Document object. This is quite similar to the Document Object model that you might use in a standard Word application.

// The document object represents the Word document, without the Word 2007
// packaging
Document doc = new Document();

Now we can add objects that derive from Paragraph to the Paragraphs collection.

doc.Paragraphs.Add(new Heading1("Document maker"));

By default, a Paragraph contains only plain text. If formatted text is required, either use an overloaded version of the Paragraph constructor that accepts a formatting style, or add Run objects. A Run represents a contiguous piece of text with one style applied to it. If the text requires mixed styling, for example a sentence with a bold word in the middle of it, then separate Runs can be added to the Paragraph object.

// Add a paragraph that is italic, using an overloaded constructor
doc.Paragraphs.Add(new Paragraph("By Mark Focas, July 2006", 
        TextFormats.Format.Italic));

// Or create a Paragraph object and add Runs to it.
Paragraph p = new Paragraph();
p.Runs.Add(new Run("Text can have multiple format styles, they can be "));
p.Runs.Add(new Run("bold and italic", 
        TextFormats.Format.Bold | TextFormats.Format.Italic));
doc.Paragraphs.Add(p);

Note how the TextFormats.Format values can be OR-ed together to create combinations:

p.Runs.Add(new Run("bold and italic", 
        TextFormats.Format.Bold | TextFormats.Format.Italic));
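The definition of TextFormats.Format is not shown in this article. Combining values with the | operator suggests a flags-style enum, so the following is a sketch of how such an enum is typically declared and tested. The names Format and FormatDemo and the numeric values are assumptions made for illustration only.

```csharp
using System;

// Hypothetical stand-in for TextFormats.Format; the real definition may differ.
[Flags]
public enum Format
{
    None = 0,
    Bold = 1,       // each member uses a distinct bit
    Italic = 2,
    Underline = 4
}

public static class FormatDemo
{
    // Returns true when every bit of 'flag' is set in 'value'.
    public static bool IsSet(Format value, Format flag)
    {
        return (value & flag) == flag;
    }
}
```

With this declaration, Format.Bold | Format.Italic is a single value for which IsSet reports Bold and Italic as set and Underline as not set. Because of the [Flags] attribute, calling ToString() on it yields "Bold, Italic".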

Once the document has been created, it needs to be packaged to be of any use. For maximum flexibility, the DocumentPackager returns a Stream object:

// A document is of little use unless packaged in the Word 2007
// packaging format
DocumentPackager dp = new DocumentPackager();
Stream s = dp.Package(doc);
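What happens to the Stream next is up to the caller. As one possibility, the sketch below copies it to a file. StreamSaver is a hypothetical helper, not part of DocumentMaker, and it assumes the returned stream is readable. Since the article does not say where the stream's position is left, the helper rewinds it when it can. Stream.CopyTo requires .NET 4.0 or later.

```csharp
using System.IO;

// Hypothetical helper, not part of DocumentMaker.
public static class StreamSaver
{
    // Copies the whole of 'source' into a new file at 'path'.
    public static void SaveToFile(Stream source, string path)
    {
        // Rewind if possible, in case the producer left the position at the end.
        if (source.CanSeek)
        {
            source.Position = 0;
        }

        using (FileStream file = File.Create(path))
        {
            source.CopyTo(file);
        }
    }
}
```

For example, StreamSaver.SaveToFile(s, "output.docx") would write the packaged document to disk. The same CopyTo call works with any writable target stream.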

There are not many objects in the solution, and most of them have been described above.

Class diagram for DocumentMaker

Points of Interest

The Microsoft XML model for Word appears complex, but in some ways it is quite simple. Every block of content, including headings and list items, is a paragraph. This makes it extremely simple to generate Word documents, but extremely complex to process them using XSLT. The approach I took with the Run object is based on the approach the Word XML format takes.
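The sketch below illustrates the paragraph-and-run shape of the markup using LINQ to XML. It is not DocumentMaker's code: RunXml and its methods are hypothetical, the namespace URI is the one from the released Open XML standard (Beta 2 builds may have used another), and the "Heading1" style ID is assumed to be defined in the package's styles part.

```csharp
using System.Xml.Linq;

// Hypothetical helper, not part of DocumentMaker.
public static class RunXml
{
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // A run: optional run properties (w:rPr) followed by the text (w:t).
    public static XElement Run(string text, bool bold, bool italic)
    {
        var props = new XElement(W + "rPr");
        if (bold) props.Add(new XElement(W + "b"));
        if (italic) props.Add(new XElement(W + "i"));

        var run = new XElement(W + "r");
        if (props.HasElements) run.Add(props);
        // xml:space="preserve" keeps leading and trailing spaces in the text.
        run.Add(new XElement(W + "t",
            new XAttribute(XNamespace.Xml + "space", "preserve"), text));
        return run;
    }

    // A paragraph is simply a container of runs.
    public static XElement Paragraph(params XElement[] runs)
    {
        return new XElement(W + "p", runs);
    }

    // A heading is also a paragraph; only its paragraph style differs.
    public static XElement Heading1(string text)
    {
        return new XElement(W + "p",
            new XElement(W + "pPr",
                new XElement(W + "pStyle", new XAttribute(W + "val", "Heading1"))),
            Run(text, false, false));
    }
}
```

A mixed-format sentence then becomes RunXml.Paragraph(RunXml.Run("Text can be ", false, false), RunXml.Run("bold and italic", true, true)). A heading produced by Heading1 has the same w:p element name as any other paragraph, which is why generation is easy while structure-aware processing is harder.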

I remain completely neutral about, and do not wish to enter into any debate on, whether Word XML is better than the OpenDocument format or vice versa. This article should not be viewed as claiming Microsoft Word is any better or worse than Open Office. The code could easily be adapted to output the Open Office format if you require that.

Having said that, I am rather disappointed that Microsoft are intending to charge for downloads of Office 2007 beta 2 from August.

History

30 July 2006 - Version 1.0.0

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Mark Focas
Web Developer
Australia
Working in the educational arena, automating publishing processes, developing a single XML source, multiple output format publishing solution for a distributed environment
http://blog.focas.net.au

Article Copyright 2006 by Mark Focas