Click here to Skip to main content
Click here to Skip to main content
Technical Blog

Google Docs to clean html, good for WordPress posts, emails

, 25 May 2014 CPOL
Rate this:
Please Sign up or sign in to vote.
Google docs is a great platform to write documents, especially when you compare it with the WordPress editor.  It would be good to have a clean way to export a Google doc to a wordpress post or generate nice looking emails. If you copy and paste a Google doc into a WordPress post, it loses many ...M

Google docs is a great platform to write documents, especially when you compare it with the WordPress editor.  It would be good to have a clean way to export a Google doc to a wordpress post or generate nice looking emails. If you copy and paste a Google doc into a WordPress post, it loses many formatting and produces a bloated html with lots of inline style, CSS classes that do not go well with WordPress. So, here’s a solution that will generate a clean HTML from a Google Doc and email it to you so that you can copy and paste it a WordPress post or send to others via email.

For example, here’s a Google Doc:

GoogleDocScreenshot

Once you run the script, it will produce a nice clean email for you:

GoogleDocsEmail

Here’s how to do it:

  1. Open your Google Doc and go to Tools menu, select Script Editor. You should see a new window open with a nice code editor.
  2. Copy and paste the code from here: GoogleDocs2Html
  3. Then from the “Select Editor” menu, choose ConvertGoogleDocToCleanHtml
  4. Click the play button to run the script.
  5. You will get an email containing the HTML output of the Google Doc with inline images.
  6. You can easily forward that email to anyone or copy and paste in a WordPress post.

Here’s how the code works:

First it will loop through the elements (paragraph, images, lists) in the body:

function ConvertGoogleDocToCleanHtml() {
  var body = DocumentApp.getActiveDocument().getBody();
  var numChildren = body.getNumChildren();
  var output = [];
  var images = [];
  var listCounters = {};

  // Walk through all the child elements of the body.
  for (var i = 0; i < numChildren; i++) {
    var child = body.getChild(i);
    output.push(processItem(child, listCounters, images));
  }

  var html = output.join('\r');
  emailHtml(html, images);
  //createDocumentForHtml(html, images);
}

The processItem function takes care of generating proper html output from a Doc Element. The code for this function is long as it handles Paragraph, Text block, Image, Lists. Best to read through the code to see how it works. When the proper html is generated and the images are discovered, the emailHtml function generates a nice html email, with inline images and sends to your Gmail account:

function emailHtml(html, images) {
  var inlineImages = {};
  for (var j=0; j<images.length; j++) {
    inlineImages[[images[j].name]] = images[j].blob;
  }

  var name = DocumentApp.getActiveDocument().getName()+".html";

  MailApp.sendEmail({
     to: Session.getActiveUser().getEmail(),
     subject: name,
     htmlBody: html,
     inlineImages: inlineImages
   });
}

Enjoy!

Remember images in the email are inline images. If you copy and paste into WordPress, the images won’t get automatically uploaded to WordPress. You will have to manually download and upload each image to WordPress. It’s a pain. But that’s the problem with WordPress editor.

Special thanks to this GitHub project, that gave me many ideas: https://github.com/mangini/gdocs2md

 

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

Omar Al Zabir
Architect BT, UK (ex British Telecom)
United Kingdom United Kingdom

Comments and Discussions

 
QuestionNeeds to protect HTML links PinmemberMember 1094622314-Jul-14 14:26 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Terms of Use | Mobile
Web02 | 2.8.1411019.1 | Last Updated 25 May 2014
Article Copyright 2014 by Omar Al Zabir
Everything else Copyright © CodeProject, 1999-2014
Layout: fixed | fluid