Edit Word Documents using OpenXML and C# Without Automation/Interop

28 May 2015, CPOL license, 5 min read
This tip explains how to work with Word documents programmatically using OpenXML and C#, without Word Interop.

Introduction

Are you looking for a way to automate Word? Do you want to work with Word documents programmatically? This tip shows how to do that with C# and OpenXML, without using Interop.

After going through this tip, you will know:

  1. What Open XML is
  2. Why to use Open XML
  3. How to create Word documents using C# and the OpenXML API
  4. How to create a Word table using OpenXML and C#

Background

I have seen many developers struggle to work with Word documents programmatically. There are a couple of ways to do it:

  1. Using the COM Interop object (a Winword instance) (for Interop sample code, check this article)
  2. Using the OpenXML API (Word does not have to be installed on the machine)

Using the Code

Things We Need

Before we start cooking with OpenXML, we need the following things ready:

  1. Visual Studio with C# (any version that can target the .NET Framework release your Open XML SDK requires; SDK 2.5 requires .NET Framework 4.5)
  2. The OpenXML API (can be downloaded from here: Open XML SDK 2.5 for Microsoft Office)

That's it. (Wow! No Word installation needed.)

Getting Started with OpenXML

Nowadays DOCX files are becoming more popular by the day because they are lightweight and fast to process. A DOCX file is the result of combining ZIP and XML, so if we can manage XML, we can manage DOCX too. To manage WordprocessingML we need an API, and that API is the Open XML SDK for Microsoft Office. MSDN says: "API simplifies the task of manipulating Open XML packages and the underlying Open XML schema elements within a package. The Open XML SDK encapsulates many common tasks that developers perform on Open XML packages, so that you can perform complex operations with just a few lines of code."

Open XML Advantages over Interop

  1. Open XML is an open standard for word-processing documents, presentations, and spreadsheets that can be freely implemented by multiple applications on different platforms.
  2. The purpose of the Open XML standard is to de-couple documents created by Microsoft Office applications so that they can be manipulated by other applications independent of proprietary formats and without the loss of data.
  3. Because the files are manipulated directly and no Word instance has to be started, processing is faster than with Interop objects.
  4. It has good interoperability, backwards compatibility and programmability.
  5. Because the files are smaller, they are easier to manage in all kinds of document stores, including Exchange servers, SharePoint, and of course network file storage.
  6. It is an ISO/IEC 29500 standard, free for all to use, and extremely well documented.

You can unzip DOCX file

Did you know you can unzip a DOCX file? A DOCX file is a combination of several well-structured XML files. An Open XML file is stored in a ZIP archive for packaging and compression, so you can view the structure of any Open XML file using a ZIP viewer. An Open XML document is built from multiple document parts, and the relationships between the parts are themselves stored in document parts. A typical DOCX file has the following parts.

See the image below for the different XML parts:

The body is the main part of the document, and it contains many different elements, as shown in the figure above.
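Since a DOCX file is an ordinary ZIP archive, its parts can also be listed from code. The following is a minimal sketch using the standard System.IO.Compression classes (available from .NET Framework 4.5). The class name DocxInspector and the method name ListParts are hypothetical names chosen for this illustration, and the default path is only an example.

```csharp
using System;
using System.Collections.Generic;
using System.IO.Compression; // on .NET Framework, reference System.IO.Compression and System.IO.Compression.FileSystem
using System.Linq;

static class DocxInspector
{
    // Returns the full name of every entry (part) stored in the DOCX package.
    public static List<string> ListParts(string docxPath)
    {
        using (ZipArchive archive = ZipFile.OpenRead(docxPath))
        {
            return archive.Entries.Select(entry => entry.FullName).ToList();
        }
    }

    static void Main(string[] args)
    {
        // Example path only; pass a real DOCX path as the first argument.
        string path = args.Length > 0 ? args[0] : "D:\\test11.docx";
        foreach (string name in ListParts(path))
        {
            Console.WriteLine(name);
        }
    }
}
```

Running it against any DOCX file prints one line per part, which is the same list a ZIP viewer would show.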

Working with Paragraphs (First Assignment)

A paragraph is the most basic unit of block-level content within a WordprocessingML document. Paragraphs are stored using the <p> element. A paragraph has different sub-elements such as ParagraphProperties (optional), Run and Text.

Paragraph Properties

Paragraph properties control the formatting of a paragraph. Examples of paragraph properties are alignment, border, hyphenation override, indentation, line spacing, shading and text direction. The OXML SDK ParagraphProperties class represents the <pPr> element.
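As an illustration of paragraph properties, here is a small sketch that builds a centered paragraph with some spacing after it. It assumes the DocumentFormat.OpenXml assembly is referenced; the class name ParagraphPropertiesSketch and the method name BuildCenteredParagraph are hypothetical. The typed property setters are used because they keep the child elements of <pPr> in schema order.

```csharp
using System;
using DocumentFormat.OpenXml.Wordprocessing;

static class ParagraphPropertiesSketch
{
    // Builds <w:p> with a <w:pPr> that centers the text and adds spacing after it.
    public static Paragraph BuildCenteredParagraph(string text)
    {
        var properties = new ParagraphProperties
        {
            // Line spacing values are expressed in twentieths of a point.
            SpacingBetweenLines = new SpacingBetweenLines { After = "200" },
            Justification = new Justification { Val = JustificationValues.Center }
        };

        var paragraph = new Paragraph();
        paragraph.AppendChild(properties);                 // <w:pPr> comes first inside <w:p>
        paragraph.AppendChild(new Run(new Text(text)));    // then the run that holds the text
        return paragraph;
    }

    static void Main()
    {
        // Prints the generated WordprocessingML for inspection.
        Console.WriteLine(BuildCenteredParagraph("Centered text").OuterXml);
    }
}
```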

Run

The run element demarcates a region of text that shares a common set of properties, such as formatting. The OXML SDK Run class represents the <r> element.

Text

This element contains the actual text of a document. Within the <r> element, the text (<t>) element is the container for the text that makes up the document content.
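To see how runs split a paragraph into regions, the sketch below puts two runs in one paragraph: a plain one and a bold one. It assumes the DocumentFormat.OpenXml assembly is referenced; RunSketch and BuildMixedParagraph are hypothetical names used only for this example.

```csharp
using System;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

static class RunSketch
{
    // Builds one paragraph with two runs: the first plain, the second bold.
    public static Paragraph BuildMixedParagraph()
    {
        // xml:space="preserve" keeps the trailing space of the first text element.
        var plainRun = new Run(
            new Text("Plain text, then ") { Space = SpaceProcessingModeValues.Preserve });

        // <w:rPr> carries the formatting that applies to this run only.
        var boldRun = new Run(
            new RunProperties(new Bold()),
            new Text("bold text"));

        return new Paragraph(plainRun, boldRun);
    }

    static void Main()
    {
        // Prints the generated WordprocessingML for inspection.
        Console.WriteLine(BuildMixedParagraph().OuterXml);
    }
}
```

Each run carries its own properties, so changing the formatting in the middle of a sentence means starting a new run.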

Start with the Code (Create new word document and write in it)

Open Visual Studio and start with the first OpenXML assignment.

Create a new project/application and add a reference to the following DLL (it should exist in the installed OpenXML API folder, e.g., C:\Program Files\Open XML SDK\V2.0\lib). On .NET Framework projects, also add a reference to WindowsBase, which the SDK uses for packaging.

1. DocumentFormat.OpenXml

See the snippet below, where we create a new Word document with the help of OpenXML.

C#
// Namespaces needed at the top of the file:
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Inside a method:
using (WordprocessingDocument doc = WordprocessingDocument.Create(
    "D:\\test11.docx", WordprocessingDocumentType.Document))
{
    // Add a main document part.
    MainDocumentPart mainPart = doc.AddMainDocumentPart();

    // Create the document structure: Document > Body > Paragraph > Run.
    mainPart.Document = new Document();
    Body body = mainPart.Document.AppendChild(new Body());
    Paragraph para = body.AppendChild(new Paragraph());
    Run run = para.AppendChild(new Run());

    // Add the text that will appear in the document.
    run.AppendChild(new Text("New text in document"));
}

In the above simple snippet:

  • We use the 'WordprocessingDocument' class to create a new document
  • Add a MainDocumentPart to the document
  • Then append a Body to the main document part
  • Then add a Paragraph to the Body element
  • Then add a Run to the Paragraph element
  • Then add a Text to the Run element

That's it. There is no need to save the document explicitly; the changes are written when the using block disposes it.

Now if you open 'test11.docx', you will see that it contains the text 'New text in document'.
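The same check can be done from code instead of opening the file in Word. This is a minimal sketch, assuming the DocumentFormat.OpenXml assembly is referenced; DocxReader and ReadBodyText are hypothetical names for this illustration. Note that InnerText concatenates all text in the body without paragraph separators.

```csharp
using System;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static class DocxReader
{
    // Opens the document read-only and returns the concatenated text of its body.
    public static string ReadBodyText(string path)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(path, false))
        {
            Body body = doc.MainDocumentPart?.Document?.Body;
            return body == null ? string.Empty : body.InnerText;
        }
    }

    static void Main(string[] args)
    {
        // Defaults to the file created by the snippet above.
        string path = args.Length > 0 ? args[0] : "D:\\test11.docx";
        Console.WriteLine(ReadBodyText(path));
    }
}
```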

Now try to unzip that DOCX file. You will see folders such as _rels, docProps and word, along with the [Content_Types].xml file (docProps is present when the document carries properties, as documents saved by Word do).

Open the word folder and check document.xml. You will see something like the screenshot below:

In the above image, you can see that <w:body> represents the main body of the document, <w:p> is the paragraph element, <w:r> is the run element, and <w:t> is the text element.
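If you prefer not to unzip the file by hand, the raw markup can be read straight out of the package. The sketch below uses only System.IO.Compression and assumes the main part lives at the conventional location word/document.xml (strictly speaking, the package relationships decide where it is). DocumentXmlDumper and ReadDocumentXml are hypothetical names.

```csharp
using System;
using System.IO;
using System.IO.Compression; // on .NET Framework, reference System.IO.Compression and System.IO.Compression.FileSystem

static class DocumentXmlDumper
{
    // Returns the raw XML of the main document part as a string.
    public static string ReadDocumentXml(string docxPath)
    {
        using (ZipArchive archive = ZipFile.OpenRead(docxPath))
        {
            // Assumption: the main part is stored at the conventional path.
            ZipArchiveEntry entry = archive.GetEntry("word/document.xml");
            if (entry == null)
            {
                throw new FileNotFoundException("word/document.xml not found in package", docxPath);
            }

            using (var reader = new StreamReader(entry.Open()))
            {
                return reader.ReadToEnd();
            }
        }
    }

    static void Main(string[] args)
    {
        // Defaults to the file created earlier in this tip.
        string path = args.Length > 0 ? args[0] : "D:\\test11.docx";
        Console.WriteLine(ReadDocumentXml(path));
    }
}
```

For the document created earlier, the output should contain a <w:t> element holding 'New text in document', nested inside <w:r>, <w:p> and <w:body>.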

This is how OpenXML works.
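The same building blocks can be used to edit an existing document rather than create a new one. The following is a sketch under stated assumptions, not a definitive implementation: it assumes the DocumentFormat.OpenXml assembly is referenced and that the file already has a main document part with a body. DocxEditor, AppendParagraph and AppendParagraphToFile are hypothetical names.

```csharp
using System;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static class DocxEditor
{
    // Adds a paragraph at the end of the body, keeping <w:sectPr> (if any) as the last child.
    public static Paragraph AppendParagraph(Body body, string text)
    {
        var paragraph = new Paragraph(new Run(new Text(text)));

        SectionProperties sectionProps = body.GetFirstChild<SectionProperties>();
        if (sectionProps != null)
        {
            body.InsertBefore(paragraph, sectionProps);
        }
        else
        {
            body.AppendChild(paragraph);
        }
        return paragraph;
    }

    // Opens an existing document for editing and appends one paragraph.
    public static void AppendParagraphToFile(string path, string text)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(path, true))
        {
            AppendParagraph(doc.MainDocumentPart.Document.Body, text);
            // Write the modified main part back to the package explicitly.
            doc.MainDocumentPart.Document.Save();
        }
    }

    static void Main(string[] args)
    {
        // Defaults to the file created earlier in this tip.
        string path = args.Length > 0 ? args[0] : "D:\\test11.docx";
        AppendParagraphToFile(path, "Another paragraph added later");
    }
}
```

Documents saved by Word usually end the body with a <w:sectPr> element, which is why the new paragraph is inserted before it instead of always being appended.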

Points of Interest

OpenXML is really an amazing thing: it works fluently with spreadsheets, charts, presentations, and word-processing documents. The Open XML file formats are useful for developers because they use an open standard and are based on well-known technologies: ZIP and XML.

Thanks

OpenXML cannot be covered in a single sitting, so I will continue with a different assignment on OpenXML in the next version of this article. Till then, enjoy this stuff. Suggestions and queries are always welcome.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Technical Lead
India
Hi there, I am Prasad: author, blogger, contributor, and passionate about Microsoft .NET technologies. I like to write articles and blogs on different .NET aspects and to help developers resolve their issues and grow with Microsoft technologies.


Certifications: Microsoft Certified professional (MCP), Microsoft Certified technology specialist (MCTS), Agile-Scrum Master.


Awards: Microsoft Re-connect MVP (GSC Member), Most valuable member at dotnetspider, Most popular curator, Most active curator, featured curator at Microsoft Curah, Editor at dotnetspider.


Microsoft MVP 2014 [ASP.NET/IIS]
-After all Knowledge is an endless entity
