This is a review of Aspose for .NET, based on our use of Aspose.Words. We changed our process to eliminate the dependency on Microsoft Word imposed by the native Interop libraries. With the Aspose library we were able to run the whole process in memory and handle documents more efficiently. The objective is twofold:
- Avoid installing Microsoft Word on a server for automation purposes.
- Get better performance from a library called directly from our code.
Aspose also offers a complete Microsoft Office library, which is very useful for adding flexibility and functionality to our web application. Your mileage will vary depending on how you process documents and integrate the library, but it is functionally complete and the documentation is an integral part of the product.
Using the code
1 - Setting up the license
Aspose components require you to load a license into the library to unlock their full functionality.
You can request a temporary license for your integration phase on their website: create a quote for the component(s) you are interested in, and the option becomes available in the final stage of the quote.
Dim oLicense As New Aspose.Words.License()
oLicense.SetLicense("Aspose.Words.lic") ' the license file name here is an assumption
Adding the license as an embedded resource in your library or program works as expected; otherwise it should reside in the same folder as the application.
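As a minimal sketch of the embedded resource approach, the license can be read from the assembly and passed to SetLicense as a stream. The resource name "MyApp.Aspose.Words.lic" and the module name are hypothetical; use the namespace and file name of your own project.

```vbnet
Imports System.IO
Imports System.Reflection

Module LicenseSetup
    ' Loads the Aspose.Words license from a resource embedded in this assembly.
    ' "MyApp.Aspose.Words.lic" is a hypothetical resource name.
    Sub LoadLicense()
        Dim oLicense As New Aspose.Words.License()
        Using oStream As Stream = Assembly.GetExecutingAssembly().GetManifestResourceStream("MyApp.Aspose.Words.lic")
            If oStream Is Nothing Then
                Throw New FileNotFoundException("Embedded license resource not found.")
            End If
            oLicense.SetLicense(oStream)
        End Using
    End Sub
End Module
```

Calling this once at application start-up, before any document is opened, should be enough.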
2 - Working with your document
This is where your design might diverge: we used hidden bookmarks to control dynamic (in and out) sections in the document. If you use other methods, you’ll need to delve deeper into the functionality of the library.
Two main classes are used in most processes:
Aspose.Words.Document and Aspose.Words.DocumentBuilder
They are declared and instantiated as follows:
Dim oWordFile As System.IO.MemoryStream
Dim oWordDocument As Aspose.Words.Document
Dim oWordDocumentBuilder As Aspose.Words.DocumentBuilder
oWordFile = New System.IO.MemoryStream(oDocument.Content, True)
oWordDocument = New Aspose.Words.Document(oWordFile)
oWordDocumentBuilder = New Aspose.Words.DocumentBuilder(oWordDocument)
oWordFile = Nothing
With this set up, you’re ready to write your own code to work with the document.
The Aspose.Words object layer manages a document much like an XML document, whether it is in a legacy format (RTF, DOC (1997/2003), etc.) or a newer one (DOCX, DOCM, etc.).
You’ll need to traverse a tree of object nodes in order to read or modify existing data.
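To give an idea of what traversing the node tree looks like, here is a small sketch that walks every paragraph in the document and prints its text. The module and method names are made up for illustration; GetChildNodes with True as the second argument searches the whole tree rather than only direct children.

```vbnet
Imports System
Imports Aspose.Words

Module NodeWalk
    ' Prints the text of every paragraph found anywhere in the document tree.
    Sub ListParagraphs(oWordDocument As Document)
        For Each oParagraph As Paragraph In oWordDocument.GetChildNodes(NodeType.Paragraph, True)
            Console.WriteLine(oParagraph.GetText().Trim())
        Next
    End Sub
End Module
```

The same pattern works with other node types (tables, runs, bookmark boundaries) by changing the NodeType value.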
3 - A couple of examples and notes on bookmarks management
Dim sValue As String = oWordDocument.Range.Bookmarks(sBookmarkName).Text
sValue = "Test"
oWordDocument.Range.Bookmarks(sBookmarkName).Text = sValue
That covers reading and writing existing bookmarks.
Creating the bookmark itself is fairly simple: you’ll need to add two child nodes to the document’s structure:
Dim oBookmarkStart As New Aspose.Words.BookmarkStart(oWordDocument, sNewName)
Dim oBookmarkEnd As New Aspose.Words.BookmarkEnd(oWordDocument, sNewName)
Where you place these boundary nodes depends on how you manage the document. If you are building the document, you simply add them to the document's content as you go; if you are using a pre-existing template, a more complex solution is required to manage bookmarks.
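For the simpler case of building the document, a sketch using the DocumentBuilder could look like the following. It is one possible approach rather than the only one: the builder creates the start and end nodes for you. The method name and the placeholder text are hypothetical.

```vbnet
Imports Aspose.Words

Module BookmarkCreation
    ' Appends a new bookmark wrapping some placeholder text at the end of the document.
    Sub AddBookmark(oWordDocumentBuilder As DocumentBuilder, sNewName As String)
        oWordDocumentBuilder.MoveToDocumentEnd()
        oWordDocumentBuilder.StartBookmark(sNewName)
        oWordDocumentBuilder.Write("Placeholder text") ' hypothetical content
        oWordDocumentBuilder.EndBookmark(sNewName)
    End Sub
End Module
```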
The one trap to look out for is cloning. Any component in the document is a node and can be cloned along with its own tree, but cloned bookmarks retain their original names. You MUST change their names or you will run into exceptions.
A simple loop through the cloned structure lets you do just that, adjusting the bookmark names by picking out nodes of the two types mentioned above.
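A rough sketch of such a loop is shown below. The function name and the suffix-based renaming scheme are assumptions made for illustration, and it relies on the Name property of both boundary node types being writable; check this against the version of the library you are using.

```vbnet
Imports Aspose.Words

Module BookmarkCloning
    ' Deep-clones a node and appends a suffix to every bookmark name in the copy,
    ' so the clone does not collide with the bookmarks of the original.
    Function CloneWithRenamedBookmarks(oSource As CompositeNode, sSuffix As String) As CompositeNode
        Dim oClone As CompositeNode = DirectCast(oSource.Clone(True), CompositeNode)
        For Each oStart As BookmarkStart In oClone.GetChildNodes(NodeType.BookmarkStart, True)
            oStart.Name &= sSuffix
        Next
        For Each oEnd As BookmarkEnd In oClone.GetChildNodes(NodeType.BookmarkEnd, True)
            oEnd.Name &= sSuffix
        Next
        Return oClone
    End Function
End Module
```

The start and end nodes of a bookmark must end up with the same new name, which is why both loops apply the same suffix.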
4 - Saving your modifications
Though these examples are basic, they cover I/O using bookmarks. All you need now is to save the document using the appropriate save options class for your target format (for example, PdfSaveOptions).
Be advised that the other save methods on offer (format detection by file name extension) are error prone in our experience with the tool.
I’ll illustrate the PDF save feature here.
Dim oPDFOptions As New Aspose.Words.Saving.PdfSaveOptions()
oPDFOptions.EmbedFullFonts = False
oPDFOptions.ExportCustomPropertiesAsMetadata = False
oPDFOptions.FontEmbeddingMode = Aspose.Words.Saving.PdfFontEmbeddingMode.EmbedNone
oWordDocument.FieldOptions.IsBidiTextSupportedOnUpdate = False
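With the options configured, the save itself can stay in memory as well. The following is a sketch only: the function name is hypothetical, and returning a byte array is just one way to hand the result back to a web application.

```vbnet
Imports System.IO
Imports Aspose.Words
Imports Aspose.Words.Saving

Module PdfExport
    ' Renders the document to PDF in memory and returns the resulting bytes.
    Function SaveAsPdf(oWordDocument As Document, oPDFOptions As PdfSaveOptions) As Byte()
        Using oOutput As New MemoryStream()
            oWordDocument.Save(oOutput, oPDFOptions)
            Return oOutput.ToArray()
        End Using
    End Function
End Module
```

Passing the options object explicitly means the output format does not depend on any file name extension.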
Points of interest
From a performance point of view, not having to instantiate the document from a file on the hard drive is already a major improvement, and not having to rely on Microsoft Office adds even more. Beyond that, the processing itself shows a 6:1 gain over the native .NET Interop libraries.
Another performance note concerning the component: clean-up is costly and should therefore be kept out of the critical paths of your process.
The object abstraction layer also greatly simplifies the code required to manipulate the document. It does not expose the quirks present in the document's final structure; that complexity is managed by the library itself.
2013-08-06 : First version.