Posted 23 Jun 2017

Convert HTML to Word Document using CKEditor and MariGold.OpenXHTML

Kannan Ar, 29 Sep 2017
Implement an online HTML to Word converter using CKEditor and MariGold.OpenXHTML

Introduction

MariGold.OpenXHTML is an open source library, hosted on GitHub, that converts HTML documents into Word documents. Internally, it uses the Open XML SDK to create the Word documents. CKEditor is a popular free editor for formatting HTML in web sites. By integrating the two, we can develop an online HTML to Word converter. We will create an ASP.NET MVC project to demonstrate this.

Using the Code

This tutorial uses Visual Studio 2015 Community edition. The first part explains how to integrate CKEditor into an MVC project, and the second part covers converting the HTML output of CKEditor into a Word document.

Set Up CKEditor

Download your preferred package from the CKEditor web site. This tutorial uses the full package, which contains all the plugins to experiment with. Open Visual Studio and create a new MVC project with the default templates. We can re-use the Home controller and its Index.cshtml view for our demo.

Extract the downloaded CKEditor package and copy the entire ckeditor folder into the Scripts folder.

Remove all the HTML contents from Index.cshtml and add the following code:

@using (Html.BeginForm("Index", "Home", FormMethod.Post))
{
    @Html.TextArea("content", new { @id = "editor1" })
    <input type="submit" value="Submit" />
}

We also need to reference ckeditor.js and add a script element at the bottom of the same page to initialize CKEditor. The argument passed to CKEDITOR.replace is the id of the text area declared above.

<script type="text/javascript" src="~/Scripts/ckeditor/ckeditor.js"></script>
<script>
    CKEDITOR.replace('editor1');
</script>

CKEditor is now fully configured, and if you run the application, the editor will load on the home page. The next step is to install MariGold.OpenXHTML and implement an Index post action method on the Home controller to receive the submitted HTML content.

Set Up MariGold.OpenXHTML

This library is available as a NuGet package. To install it, enter the following command in the Package Manager Console.

Install-Package MariGold.OpenXHTML

This will also install the following dependencies:

  • DocumentFormat.OpenXml - The Open XML SDK library used to create Open XML Word documents
  • MariGold.HtmlParser - Parses the input text and extracts the HTML elements

The final step is to integrate all of these to create Word documents on the fly. Add a new Index method, as shown below, to the Home controller to receive the HTML posted from CKEditor. Don't forget to include the necessary namespaces.

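using System.IO;
using System.Web.Mvc;
using MariGold.OpenXHTML;

public class HomeController : Controller
{
    // The existing GET Index action from the default template stays as it is.

    [HttpPost]
    [ValidateInput(false)] // allow raw HTML markup in the posted form field
    public FileResult Index(string content)
    {
        using (MemoryStream mem = new MemoryStream())
        {
            // Build the Word document in memory from the posted HTML.
            WordDocument doc = new WordDocument(mem);
            doc.Process(new HtmlParser(content));
            doc.Save(); // flush the document into the stream

            // Use the MIME type registered for .docx files.
            return File(
                mem.ToArray(),
                "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
                "sample.docx");
        }
    }
}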

Most of the work is done in the WordDocument class. This class contains a few properties and methods that control how HTML is converted into an Open XML Word document. Refer to the GitHub project home page for more details.

Here, we use a MemoryStream to create the Word document in memory. The Process method is responsible for parsing the HTML and converting it into a Word document. It requires an IParser implementation to parse the HTML text, which makes it possible to replace the default HTML parser completely with a custom implementation. Refer to the GitHub project home page for how to implement this.

The Save method is required to flush all the modifications into the MemoryStream. The last line of code copies the content of the MemoryStream into a byte array and returns it as a FileContentResult. Because a download file name is supplied, the browser is prompted to download the output as a file.
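A .docx file is a ZIP package, so one quick way to confirm that the bytes taken from the MemoryStream form a complete document is to open them as a ZIP archive and look for the main document part. The following is only a minimal sketch that uses the standard System.IO.Compression classes; DocxCheck, LooksLikeDocx and BuildSamplePackage are hypothetical names that are not part of MariGold.OpenXHTML, and word/document.xml is assumed to be the location of the main part, which is the conventional path rather than a guaranteed one. The sample package is a stand-in so that the snippet runs without the library.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;

// Hypothetical helper, not part of MariGold.OpenXHTML.
public static class DocxCheck
{
    // Returns true when the bytes form a ZIP archive that contains an entry
    // at word/document.xml (assumed conventional path of the main part).
    public static bool LooksLikeDocx(byte[] bytes)
    {
        if (bytes == null || bytes.Length == 0)
        {
            return false;
        }

        try
        {
            using (var stream = new MemoryStream(bytes))
            using (var zip = new ZipArchive(stream, ZipArchiveMode.Read))
            {
                return zip.Entries.Any(e => e.FullName == "word/document.xml");
            }
        }
        catch (InvalidDataException)
        {
            // The bytes are not a readable ZIP archive.
            return false;
        }
    }

    // Builds a tiny stand-in package for the demo. It is not a valid Word
    // document; it only has the entry that LooksLikeDocx searches for.
    public static byte[] BuildSamplePackage()
    {
        using (var stream = new MemoryStream())
        {
            using (var zip = new ZipArchive(stream, ZipArchiveMode.Create, leaveOpen: true))
            {
                ZipArchiveEntry entry = zip.CreateEntry("word/document.xml");
                using (var writer = new StreamWriter(entry.Open()))
                {
                    writer.Write("<w:document/>");
                }
            }

            // The archive is disposed here, so its directory has been written.
            return stream.ToArray();
        }
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(DocxCheck.LooksLikeDocx(DocxCheck.BuildSamplePackage())); // True
        Console.WriteLine(DocxCheck.LooksLikeDocx(new byte[] { 1, 2, 3 }));         // False
    }
}
```

In the controller, such a check could be applied to the result of mem.ToArray() during development. On the .NET Framework, the project may need a reference to the System.IO.Compression assembly.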

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Kannan Ar
Software Developer (Senior), self employed
India

Last Updated 30 Sep 2017
Article Copyright 2017 by Kannan Ar