Aspose.Words for .NET is a class library that enables your applications to perform a great range of document processing tasks. Aspose.Words supports DOC, DOCX, RTF, HTML, OpenDocument, PDF, XPS, EPUB and other formats. With Aspose.Words you can generate, modify, convert, render and print documents without utilizing Microsoft Word. (1) It is a powerful tool that does not depend on Word being installed.
We chose to not use the DOM approach with Aspose.Words for many reasons. We needed to take HTML output by TextControl (TxText) and insert it into a table cell or into the body of the document. I posted on Aspose's forum and was told it wasn't possible. I came up with this solution, which is an undocumented way to import HTML using the Aspose.Words OO approach.
Original Forum Post which spawned this article.
Using the code
The great thing about Aspose is the ability to move nodes from one document to another. You import the node into the new document and then append it to an existing node. The trick is to append the imported node to the correct parent node in the existing document.
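The import-then-append step described above can be sketched as follows. This is a minimal illustration, not the article's own code: the class and method names (NodeMoveSketch, CopyFirstParagraph) are hypothetical, and both documents are assumed to have at least one section with a body.

```csharp
using Aspose.Words;

public static class NodeMoveSketch
{
    // Hypothetical helper: copies the first paragraph of `source`
    // to the end of the first body of `target`.
    public static Node CopyFirstParagraph(Document source, Document target)
    {
        Paragraph sourcePara = source.FirstSection.Body.FirstParagraph;

        // ImportNode creates a copy that belongs to `target`.
        // The copy has no parent yet, so it is not visible in the document.
        Node imported = target.ImportNode(sourcePara, true, ImportFormatMode.KeepSourceFormatting);

        // A Paragraph is a valid child of Body, so Body is a correct parent here.
        target.FirstSection.Body.AppendChild(imported);
        return imported;
    }
}
```

Appending the original node directly, without ImportNode, would fail because a node can only be attached to a parent that belongs to the same document.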
Map of what node can be a child/parent of another node (2):
The following method inserts HTML inside an Aspose body node. The Aspose Document class doesn't allow you to import HTML directly into a document after the document has been created; the only way to load HTML is to pass load options to the Document constructor. We needed to insert HTML at different locations throughout the document after it was created. To do this, you have to create a brand-new document and load it using the load options. Once it has been created, you can extract all child nodes from it and import them into the existing document wherever you choose. We ran several performance tests and found that performance wasn't affected by creating an additional document just to import HTML.
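private void addHTMLIntoNode(string html, Aspose.Words.Body pNode)
{
    List<Node> list = new List<Node>();
    using (System.IO.MemoryStream mStream = new System.IO.MemoryStream(System.Text.Encoding.UTF8.GetBytes(html)))
    {
        LoadOptions loadOptions = new LoadOptions();
        loadOptions.LoadFormat = LoadFormat.Html;
        loadOptions.Encoding = System.Text.Encoding.UTF8; // matches the UTF-8 bytes above; ASCII would turn non-ASCII characters into '?'
        Document doc = new Document(mStream, loadOptions);
        NodeCollection paragraphs = doc.GetChildNodes(NodeType.Paragraph, true);
        NodeCollection tables = doc.GetChildNodes(NodeType.Table, true);
        // Paragraphs inside table cells are imported together with their table, so exclude them.
        var parents = from t in paragraphs.ToArray() where t.ParentNode is Aspose.Words.Tables.Cell select t;
        var paraclean = paragraphs.ToArray().Except(parents);
        // Assumption: the original listing omits how the list is filled; paragraphs are added first, then non-nested tables.
        list.AddRange(paraclean);
        list.AddRange(tables.ToArray().Where(t => !(t.ParentNode is Aspose.Words.Tables.Cell)));
    }
    foreach (Node n in list)
    {
        Node newNode = _doc.ImportNode(n, isImportChildren: true, importFormatMode: ImportFormatMode.KeepSourceFormatting);
        pNode.AppendChild(newNode); // assumed: attach the imported copy to the target body
    }
}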
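Since the goal was to insert HTML into either a table cell or the body, the same idea can be sketched with a more general target. The helper below is hypothetical (HtmlImportSketch and AppendHtml are not Aspose.Words names). It assumes the HTML loads into a single section and that the target is a Body or a Cell, both of which accept paragraphs and tables as children. It walks the direct children of the temporary body, which keeps paragraphs and tables in their original order.

```csharp
using System.IO;
using System.Text;
using Aspose.Words;

public static class HtmlImportSketch
{
    // Hypothetical helper: loads `html` into a temporary document and appends
    // copies of its block-level nodes to `target`, which must belong to `destination`.
    public static void AppendHtml(Document destination, CompositeNode target, string html)
    {
        LoadOptions loadOptions = new LoadOptions();
        loadOptions.LoadFormat = LoadFormat.Html;
        loadOptions.Encoding = Encoding.UTF8; // matches the bytes produced below

        using (MemoryStream stream = new MemoryStream(Encoding.UTF8.GetBytes(html)))
        {
            Document temp = new Document(stream, loadOptions);

            // Only the direct children of the body are visited (isDeep: false),
            // so nested paragraphs and tables travel with their parents.
            Node[] blocks = temp.FirstSection.Body.GetChildNodes(NodeType.Any, false).ToArray();
            foreach (Node block in blocks)
            {
                Node imported = destination.ImportNode(block, true, ImportFormatMode.KeepSourceFormatting);
                target.AppendChild(imported);
            }
        }
    }
}
```

Called as `HtmlImportSketch.AppendHtml(doc, doc.FirstSection.Body, "<p>Hello</p>")`, this appends to the body; passing a Cell from the same document instead appends inside that cell. When the target is a cell and the HTML ends with a table, adding a trailing empty paragraph may be needed, since Word expects a cell to end with a paragraph.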
Helpful Extension Methods:
Using the Aspose.Words object model approach was very difficult and felt like it was all over the place. I came up with extension methods that essentially wrap their code and allow for cleaner, more readable programming. See the full solution for the extension methods.
Adding a table, row and cells with text without the extension methods is not very readable or maintainable.
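// The AppendChild calls are an assumption: the original listing only shows node creation.
var table = new Aspose.Words.Tables.Table(_doc);
var row = new Aspose.Words.Tables.Row(_doc);
table.AppendChild(row);
// First cell: cell -> paragraph -> formatted run.
var cellOne = new Aspose.Words.Tables.Cell(_doc);
var paraOne = new Paragraph(_doc);
Run runOne = new Run(_doc, "Cell One");
runOne.Font.Name = "Microsoft Sans Serif";
runOne.Font.Size = 9;
runOne.Font.Bold = true;
paraOne.AppendChild(runOne);
cellOne.AppendChild(paraOne);
row.AppendChild(cellOne);
// Second cell: the same steps repeated.
var cellTwo = new Aspose.Words.Tables.Cell(_doc);
var paraTwo = new Paragraph(_doc);
Run runTwo = new Run(_doc, "Cell Two");
runTwo.Font.Name = "Microsoft Sans Serif";
runTwo.Font.Size = 9;
runTwo.Font.Bold = true;
paraTwo.AppendChild(runTwo);
cellTwo.AppendChild(paraTwo);
row.AppendChild(cellTwo);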
Adding a table, row and cells with text WITH the extension methods allows for centralized, readable and highly maintainable code.
var row = fSection.NewTable().NewRow();
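The real extension methods live in the full solution and may differ from what follows. As a rough sketch of how such wrappers could look, the block below assumes that fSection is an Aspose.Words Section with a body. NewTable and NewRow mirror the names used above; NewCell, its parameters, and its defaults are hypothetical and only illustrate how the repeated font setup could be centralized.

```csharp
using Aspose.Words;
using Aspose.Words.Tables;

public static class TableExtensionsSketch
{
    // Creates a table and attaches it to the end of the section's body.
    public static Table NewTable(this Section section)
    {
        Table table = new Table(section.Document);
        section.Body.AppendChild(table);
        return table;
    }

    // Creates a row and attaches it to the table.
    public static Row NewRow(this Table table)
    {
        Row row = new Row(table.Document);
        table.AppendChild(row);
        return row;
    }

    // Hypothetical: adds a cell holding one formatted run of text.
    // Returns the row (not the cell) so that calls can be chained.
    public static Row NewCell(this Row row, string text,
        string fontName = "Microsoft Sans Serif", double fontSize = 9, bool bold = true)
    {
        DocumentBase doc = row.Document;
        Cell cell = new Cell(doc);
        Paragraph para = new Paragraph(doc);
        Run run = new Run(doc, text);
        run.Font.Name = fontName;
        run.Font.Size = fontSize;
        run.Font.Bold = bold;

        // Build bottom-up: run -> paragraph -> cell -> row.
        para.AppendChild(run);
        cell.AppendChild(para);
        row.AppendChild(cell);
        return row;
    }
}
```

With these in place, the two-cell example could shrink to `fSection.NewTable().NewRow().NewCell("Cell One").NewCell("Cell Two");`, and a change to the default font would be made in one place.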
(2) Image taken from the Aspose.Words documentation.