
Introduction
Word 2007 is already widespread, even before its production release: as of July 2006 there have been over 3 million downloads of Beta 2. This article demonstrates a system for creating simple Word 2007 documents based solely on the packaging and XML formats. It can run on the desktop or on a server and does not require Word to be installed.
Background
Articles about creating or manipulating Microsoft Word documents on a server commonly do so by automating Word itself. Such applications are often tested only on a developer's machine and then fail in production, where requests usually arrive on multiple threads. Only one instance of Word can run at a time, so the application must maintain a queue and process each request as Word becomes free. This is especially problematic in a web server environment.
Add to this the fact that many system administrators will either refuse to install Word on a server or require an in-depth business case first, and the approach is often not as simple as it could be.
Using the code
A Word 2007 document usually has the extension .docx. There are exceptions for macro-enabled documents, templates etc., but this article only looks at .docx files. A .docx file is actually a group of files and folders zipped up, which you can see by opening the file with a zip application.

Even a very simple Word document has this kind of structure, and it can get a lot more complex. The text content of the document is found in the word\document.xml file.
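To look inside a package yourself, the following minimal sketch lists the parts of a .docx. It assumes the System.IO.Compression classes from later versions of the .NET Framework (not available when this article was written), and `DocxInspector` is a hypothetical helper name, not part of DocumentMaker. Note that zip entries use forward slashes, so the main part shows up as word/document.xml.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

// Hypothetical helper for illustration only; not part of DocumentMaker.
public static class DocxInspector
{
    // Returns the full name of every entry in the package,
    // for example "word/document.xml".
    public static List<string> ListParts(Stream docx)
    {
        var names = new List<string>();
        // leaveOpen: true, so the caller keeps ownership of the stream.
        using (var archive = new ZipArchive(docx, ZipArchiveMode.Read, true))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                names.Add(entry.FullName);
            }
        }
        return names;
    }
}
```

Passing a `FileStream` opened on any .docx file should return names such as [Content_Types].xml and word/document.xml; the exact list depends on the document.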
The code creates the package but takes a very simple approach: it generates the word\document.xml file and adds a standard set of files for all the auxiliary parts.
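As a rough illustration of that idea (not DocumentMaker's actual implementation), the sketch below writes a three-part package: a content types file, a root relationships file and word/document.xml. `MinimalDocx` is a hypothetical name, the zip handling assumes System.IO.Compression from later .NET versions, and the namespaces shown are those of the released Office 2007 format, which may differ from Beta 2.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical sketch; DocumentMaker's own packaging code may differ.
public static class MinimalDocx
{
    const string ContentTypes =
        "<?xml version='1.0' encoding='UTF-8' standalone='yes'?>" +
        "<Types xmlns='http://schemas.openxmlformats.org/package/2006/content-types'>" +
        "<Default Extension='rels' ContentType='application/vnd.openxmlformats-package.relationships+xml'/>" +
        "<Default Extension='xml' ContentType='application/xml'/>" +
        "<Override PartName='/word/document.xml' ContentType='application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml'/>" +
        "</Types>";

    const string RootRels =
        "<?xml version='1.0' encoding='UTF-8' standalone='yes'?>" +
        "<Relationships xmlns='http://schemas.openxmlformats.org/package/2006/relationships'>" +
        "<Relationship Id='rId1' Type='http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument' Target='word/document.xml'/>" +
        "</Relationships>";

    // A one-paragraph document body, for trying the sketch out.
    public const string SampleDocument =
        "<w:document xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'>" +
        "<w:body><w:p><w:r><w:t>Hello</w:t></w:r></w:p></w:body></w:document>";

    // Zips the standard parts plus the supplied document XML into a stream.
    public static MemoryStream Create(string documentXml)
    {
        var output = new MemoryStream();
        using (var zip = new ZipArchive(output, ZipArchiveMode.Create, true))
        {
            AddPart(zip, "[Content_Types].xml", ContentTypes);
            AddPart(zip, "_rels/.rels", RootRels);
            AddPart(zip, "word/document.xml", documentXml);
        }
        output.Position = 0; // rewind so the caller can read from the start
        return output;
    }

    static void AddPart(ZipArchive zip, string name, string xml)
    {
        ZipArchiveEntry entry = zip.CreateEntry(name);
        // UTF-8 without a byte order mark; each entry is closed before the next opens.
        using (var writer = new StreamWriter(entry.Open(), new UTF8Encoding(false)))
        {
            writer.Write(xml);
        }
    }
}
```

Calling `MinimalDocx.Create(MinimalDocx.SampleDocument)` and saving the result with a .docx extension is the quickest way to experiment with the format by hand.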
DocumentMaker supports a very limited subset of the Word object model. The only styles supported in this limited version are:
- Heading 1
- Heading 2
- Heading 3
- Paragraphs
- Bold, italic and underlined text
We begin by creating a Document object. This is quite similar to the Document Object model that you might use in a standard Word application.
Document doc = new Document();
Now we can add objects that derive from Paragraph to the Paragraphs collection.
doc.Paragraphs.Add(new Heading1("Document maker"));
A Paragraph contains only plain text. If formatted text is required, either add Run objects or use the overloaded Paragraph constructor that takes a formatting style. A Run represents a contiguous piece of text with one style applied to it. If the text requires mixed styling, e.g. some text with a bold word in the middle of it, then separate Runs can be added to the Paragraph object.
doc.Paragraphs.Add(new Paragraph("By Mark Focas, July 2006",
    TextFormats.Format.Italic));
Paragraph p = new Paragraph();
p.Runs.Add(new Run("Text can have multiple format styles, they can be "));
p.Runs.Add(new Run("bold and italic",
    TextFormats.Format.Bold | TextFormats.Format.Italic));
doc.Paragraphs.Add(p);
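For readers curious what such a paragraph might look like inside word\document.xml, here is a minimal sketch using LINQ to XML. `RunXmlSketch` and its methods are hypothetical names, not part of DocumentMaker, and the library may well render its XML differently. The element names (`w:p`, `w:r`, `w:rPr`, `w:b`, `w:i`, `w:t`) are standard WordprocessingML.

```csharp
using System.Xml.Linq;

// Hypothetical sketch of the XML a Paragraph with Runs could map to.
public static class RunXmlSketch
{
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Builds a <w:r> element; <w:rPr> is added only when formatting is requested.
    public static XElement Run(string text, bool bold, bool italic)
    {
        var run = new XElement(W + "r");
        if (bold || italic)
        {
            var props = new XElement(W + "rPr");
            if (bold) props.Add(new XElement(W + "b"));
            if (italic) props.Add(new XElement(W + "i"));
            run.Add(props);
        }
        // xml:space="preserve" keeps leading and trailing spaces in the text.
        run.Add(new XElement(W + "t",
            new XAttribute(XNamespace.Xml + "space", "preserve"),
            text));
        return run;
    }

    // Wraps the runs in a <w:p> element that declares the w: prefix.
    public static XElement Paragraph(params XElement[] runs)
    {
        return new XElement(W + "p",
            new XAttribute(XNamespace.Xmlns + "w", W.NamespaceName),
            runs);
    }
}
```

Building a paragraph from `Run("they can be ", false, false)` and `Run("bold and italic", true, true)` gives one `w:p` element with two `w:r` children, the second carrying the bold and italic properties.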
Note how the TextFormats.Format values can be OR-ed together to create combinations:
p.Runs.Add(new Run("bold and italic",
TextFormats.Format.Bold | TextFormats.Format.Italic));
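Combining values with `|` works when each enum member occupies its own bit. The sketch below shows the general pattern with a hypothetical `SketchFormat` enum. The member values are an assumption for illustration and are not taken from the actual TextFormats.Format definition.

```csharp
using System;

// Hypothetical enum for illustration; each member uses a distinct bit.
[Flags]
public enum SketchFormat
{
    None = 0,
    Bold = 1,
    Italic = 2,
    Underline = 4
}

public static class SketchFormatDemo
{
    // True when every bit of 'flag' is also set in 'value'.
    public static bool IsSet(SketchFormat value, SketchFormat flag)
    {
        return (value & flag) == flag;
    }
}
```

With `SketchFormat combined = SketchFormat.Bold | SketchFormat.Italic;`, `IsSet(combined, SketchFormat.Bold)` is true and `IsSet(combined, SketchFormat.Underline)` is false, so a renderer can test each style independently.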
Once the document has been created, it needs to be packaged to be of any use. For maximum flexibility, DocumentMaker returns a Stream object:
DocumentPackager dp = new DocumentPackager();
Stream s= dp.Package(doc);
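What happens to the Stream next is up to the caller. As one possibility, this minimal sketch copies it to a file. `StreamSaver` is a hypothetical helper, not part of DocumentMaker, and `Stream.CopyTo` assumes .NET Framework 4 or later.

```csharp
using System.IO;

// Hypothetical helper for illustration only.
public static class StreamSaver
{
    // Writes the whole stream to the given path, for example "output.docx".
    public static void SaveToFile(Stream source, string path)
    {
        // Rewind first in case the packager left the position at the end.
        if (source.CanSeek)
        {
            source.Position = 0;
        }
        using (FileStream file = File.Create(path))
        {
            source.CopyTo(file);
        }
    }
}
```

On a web server the same stream could instead be copied to the response output stream, with a content type of application/vnd.openxmlformats-officedocument.wordprocessingml.document.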
There are not many objects in the solution, and most of them have been described above.

Points of Interest
The Microsoft XML model for Word appears complex, but in some ways it is quite simple. Every block of text is a paragraph, including headings, list items etc. This makes it extremely simple to generate Word documents, but extremely complex to process them using XSLT. The Run object follows the approach the Word XML format itself takes.
I remain completely neutral about, and do not wish to enter into any debate on, whether Word XML is better than OpenDocument Format or vice versa. This article should not be viewed as claiming Microsoft Word is any better or worse than OpenOffice. It could easily be adapted to output OpenOffice format if you require that.
Having said that, I am rather disappointed that Microsoft are intending to charge for downloads of Office 2007 beta 2 from August.
History
30 July 2006 - Version 1.0.0
 |
What license is associated with this code? Can the source code be modified and used in commercial applications?
 |
Hello 5972325,
You can use this code for any purpose you want, commercial or non-commercial. I only ask that you acknowledge my work, and I would appreciate a message here if you do use it, just so I can feel good about it!
Having said that, I believe that the nature of Code Project is a learning, teaching and sharing site, so if you are here looking for code to use in a commercial product then you should at least fill in your profile rather than having just a number and no details.
Good luck for your project, and please feel free to leave feedback about what you thought of the code.
 |
Hello,
I'm representing the company StatusInfo AB, located in Gothenburg, Sweden. It looks like we will soon use a small component based on this code in our commercial web application.
The code is really perfect for a first simple word exporter. We only had to add simple support for font, font size and color.
Great work!
Best regards, August Rydberg Developer, StatusInfo AB
 |
When I open a generated file in Word 2007, I get a warning that the file was created using a pre-release version of Word, or something similar. Do you know how to fix this?
 |
As mentioned in other replies, this was written before the Office Open XML SDK was available. Using the SDK is a better approach than extending this library. Thanks for your feedback though!
 |
How can I change the font from Eras Medium ITC to Times New Roman?
_____________________________
...and justice for all
APe
 |
Hi, I am Akila from Tamil Nadu. I have a question about my VB mini project, "Microsoft Word (Tomorrow technology)". Please help me: can I create or develop my own version of MS Word in Visual Basic?
R.Akilanda Nageswari
 |
Hi, it is a really nice article. I used it and it is working great.
I am trying to insert an image into a Word document, and the image should be inserted at a location specified by the user. How would I do this? Any ideas?
We need this urgently.
Regards, Tushar
 |
hi Tushar, Thanks for your comment. Images are a lot harder, as they need to be encoded, and relationships need to be created for them. Now there is actually a packaging namespace in the .NET framework which makes it easier to do such things. At the time of writing this article, that namespace was not available. I will look at the code over the weekend and see if I can offer you some pointers or perhaps make an update. Regards, Mark Focas
"Being in a minority of one doesn't make you insane" (George Orwell). However, in my case it does.
 |
Very nice article. I'm working with a docx document and I need to insert a header from my application, but after several attempts I have not been able to do it. Can you give me some tips on the right way to do it?
Thanks!!!
Hector
 |
Hi Hector, Thanks for your comments on my article. I will look into headers over the weekend and try and post a message about it after that. A header would not be difficult to implement, but it does depend upon what features you want to implement in a header, such as page number, images etc.
 |
Hi, Sorry I haven't had a chance to get back before now. To create a header, you need to create a header XML file and add a relationship to it in the relationships file. You can find a lot of very good information on Brian Jones's blog. He mentions headers in this entry: http://blogs.msdn.com/brian_jones/archive/2006/02/02/523469.aspx. You really should read the specs though. They are massive, but there are some clear examples as well. A lazy approach could be to create a control document, then save a copy and add a header, then extract the two documents to separate folders and run a comparison utility (WinMerge is Open Source and very good). This can help, but it is much better to do some research into the actual specifications. The specifications can be found here: http://openxmldeveloper.org/default.aspx. Like I mentioned earlier, there are already base classes to do this, so I see no reason to extend this early example. Try looking at this example (for Excel): http://www.codeguru.com/csharp/.net/net_asp/tutorials/article.php/c13123/
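As a rough sketch of the pieces described above, the code below builds the header part, the relationship entry and the reference that goes into the document's section properties. `HeaderSketch` and its method names are hypothetical and not part of DocumentMaker; the namespaces and the relationship type are those of the released Office Open XML format. The new part would also need a content type override of application/vnd.openxmlformats-officedocument.wordprocessingml.header+xml in [Content_Types].xml.

```csharp
using System.Xml.Linq;

// Hypothetical sketch of the XML involved in adding a header.
public static class HeaderSketch
{
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static readonly XNamespace R =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships";
    static readonly XNamespace Rel =
        "http://schemas.openxmlformats.org/package/2006/relationships";

    // Content of the new part, for example word/header1.xml.
    public static XElement HeaderPart(string text)
    {
        return new XElement(W + "hdr",
            new XAttribute(XNamespace.Xmlns + "w", W.NamespaceName),
            new XElement(W + "p",
                new XElement(W + "r",
                    new XElement(W + "t", text))));
    }

    // Entry for word/_rels/document.xml.rels that links the document to the header part.
    public static XElement Relationship(string id, string target)
    {
        return new XElement(Rel + "Relationship",
            new XAttribute("Id", id),
            new XAttribute("Type", R.NamespaceName + "/header"),
            new XAttribute("Target", target));
    }

    // Element to place inside <w:sectPr> in word/document.xml; its r:id must
    // match the Id of the relationship above.
    public static XElement HeaderReference(string id)
    {
        return new XElement(W + "headerReference",
            new XAttribute(W + "type", "default"),
            new XAttribute(R + "id", id));
    }
}
```

Comparing a document with and without a header, as suggested above, is a good way to check these pieces against what Word itself produces.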
 |
Hi Vb, Thanks for the comment. I will post some code soon regarding changing the font colour. It isn't too difficult; I didn't really need it, so I didn't implement it. At the moment I am attending Microsoft Tech-Ed in Sydney, so I don't have time to update the code, but I hope to do something in about a fortnight.
Images are a little harder, but once again, give me a little time and I will see what I can find. I have looked at the schema, and mostly it is a matter of adding the image to the package, but also Base64 encoding it and adding it to the document as well.
 |
I would love to do the work; I just need an idea. I tried appending a Color Value= 00025200 to the paragraph and attempted to change the value of the color tag in the style XML, but it didn't work.
 |
OK. What you need to do is to add a property tag to the run tag. In the property you set the colour. It is like this:
<w:r><w:rPr><w:color w:val="FF0000" /></w:rPr><w:t>The text that will be red</w:t></w:r>
So the 'r' tag is a run element. The 'rPr' tag is a run property. The colour value is expressed in RRGGBB format where RR = 00 to FF, etc.
If you implement the Colour as a property of the Run class then you can add it in to the rendering. Supplying some overloads of the constructor to allow for creating a run with colour will make it easier.
I have some working code, but haven't been able to find the time to re-write the article just yet.
The images are a little harder. I will try and find some reference material for you.
BTW Sorry, being from Down under makes it annoying talking about colours, as we spell it with a 'u', but the framework being from the states is lacking the full richness of that extra 'u'!
 |
So I did it and it works like a charm.
Run.cs
// First, added a private field for the colour
private string _color = string.Empty;
// Then a Color property (RRGGBB hex, e.g. "FF0000")
public string Color { get { return _color; } set { _color = value; } }
// Modified XmlStyle. The tag strings are the standard WordprocessingML run
// properties, assumed here because the literals did not survive in this post.
protected string XmlStyle
{
    get
    {
        // Nothing to emit when the run has neither formatting nor a colour.
        if (_format == 0 && _color == string.Empty)
        {
            return string.Empty;
        }
        string boldStyle = string.Empty;
        string italicStyle = string.Empty;
        string colorStyle = string.Empty;
        string underlineStyle = string.Empty;
        if ((_format & TextFormats.Format.Bold) == TextFormats.Format.Bold)
        {
            boldStyle = "<w:b />";
        }
        if ((_format & TextFormats.Format.Italic) == TextFormats.Format.Italic)
        {
            italicStyle = "<w:i />";
        }
        if ((_format & TextFormats.Format.Underline) == TextFormats.Format.Underline)
        {
            underlineStyle = "<w:u w:val=\"single\" />";
        }
        if (_color != string.Empty)
        {
            colorStyle = "<w:color w:val=\"" + _color + "\" />";
        }
        return "<w:rPr>" + boldStyle + italicStyle + colorStyle + underlineStyle + "</w:rPr>";
    }
}
// Added another Run overload that also takes the colour
public Run(string text, TextFormats.Format format, string strRRGGBB)
{
    _text = text;
    _format = format;
    _color = strRRGGBB;
}
 |
Thank you Nader