Posted 10 Nov 2018
Licenced CPOL

Simple Word Document Viewer

Simple Word Document File Viewer

Introduction

This article describes how to build a simple viewer for Microsoft Word documents in the .docx format.

It is useful when you need to display a Word document inside your own project.

The viewer is very simple in its current state and needs a lot more development, so this article describes only the concept.

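The viewer depends on two open source libraries:

  • DocX, for reading the .docx file into .NET objects
  • RTFlib, a string builder for RTF, for producing the RTF that is displayed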

The viewer language is Visual Basic .NET.

Background

I was working on a project whose main data lives in a Word document. I found that the only practical way to do data entry was to show the document on a form, select parts of it, and copy them for saving in the database.

I searched the internet for a Word document viewer and did not find one. All I found were libraries that read the .docx format and return the data as .NET objects; I chose DocX for this purpose.

Then I thought that if I could read the file, I could display it myself, so I searched for an RTF library and chose the String builder for RTF.

By combining these two libraries, I could build this viewer.

Using the Code

The viewer solution consists of two projects:

  • WordDocViewer, a Windows form application
  • WordFile, a class library project

The Windows Forms project is the host. It is responsible for showing the RTF result on an MDI child form using a RichTextBox control.

The RTF result is built in the class library project: the DocX library reads the document, and RTFlib then generates the RTF.
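
To make the RTF step concrete, here is a minimal sketch that builds a small RTF string by hand from a list of paragraph texts. It does not use RTFlib or reproduce its API; the module name RtfSketch and the functions EscapeRtf and BuildRtf are hypothetical, and the sketch assumes plain ASCII text only.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text

Module RtfSketch
    ' Escapes the three characters that have special meaning in RTF.
    ' The backslash is replaced first so later escapes are not doubled.
    Private Function EscapeRtf(text As String) As String
        Return text.Replace("\", "\\").Replace("{", "\{").Replace("}", "\}")
    End Function

    ' Builds a minimal RTF document with one \par per paragraph.
    ' Assumption: ASCII input; other characters would need \u escapes.
    Public Function BuildRtf(paragraphs As IEnumerable(Of String)) As String
        Dim rtf As New StringBuilder("{\rtf1\ansi ")
        For Each text As String In paragraphs
            rtf.Append(EscapeRtf(text)).Append("\par ")
        Next
        rtf.Append("}")
        Return rtf.ToString()
    End Function
End Module
```

On a form, the returned string could be assigned to the Rtf property of a RichTextBox control to display it.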

The class library is very simple. It has two classes:

  • Document represents the Word document. It loads the Word document file and parses its pages.
  • Page exists because the DocX library has no page class. I created one to keep the paragraphs of each page together.

I parse the pages by searching for the line feed character in each paragraph. When I find one, I split the paragraph into two parts and treat the second part as a new paragraph that starts a new page.

This is the Load function:

Public Function Load(File As String) As Boolean
    ' Exceptions raised while loading or parsing propagate to the caller.
    Me.Doc = DocX.Load(File)

    ' Start with the first page; further pages are added as breaks are found.
    Dim Page As Page = New Page With {._Index = Me.Pages.Count + 1}
    Dim Pos As Integer = 0 ' IndexOf returns an Integer
    Dim Text As String = String.Empty

    Me.Pages.Add(Page)
    For Each Paragraph As Novacode.Paragraph In Me.Doc.Paragraphs
        If Paragraph.Text.Contains(vbLf) Then
            ' A line feed marks a page break: split at the first one.
            Text = Paragraph.Text
            Pos = Text.IndexOf(vbLf)

            ' Keep the text before the line feed on the current page.
            Paragraph.ReplaceText(Text.Substring(Pos + 1), String.Empty)
            Page.Paragraphs.Add(Paragraph)

            ' Move the remaining text to a new paragraph on a new page.
            Page = New Page With {._Index = Pages.Count + 1}
            Page.Paragraphs.Add(Paragraph.InsertParagraphAfterSelf(Text.Substring(Pos + 1)))
            Me.Pages.Add(Page)
        Else
            Page.Paragraphs.Add(Paragraph)
        End If
    Next

    Return True
End Function
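
The page-splitting idea can also be tried on plain strings, without DocX. The sketch below is an illustration under stated assumptions rather than the viewer's code: SplitIntoPages and the PageSplitSketch module are hypothetical names, paragraphs are represented by their text only, and unlike Load it starts a new page at every line feed instead of only the first one in a paragraph.

```vbnet
Imports System
Imports System.Collections.Generic

Module PageSplitSketch
    ' Groups paragraph texts into pages. A line feed inside a paragraph
    ' is treated as a page boundary, following the article's approach.
    Public Function SplitIntoPages(paragraphs As IEnumerable(Of String)) As List(Of List(Of String))
        Dim pages As New List(Of List(Of String))
        Dim current As New List(Of String)
        pages.Add(current)

        For Each text As String In paragraphs
            Dim parts As String() = text.Split(ControlChars.Lf)

            ' The first part still belongs to the current page.
            current.Add(parts(0))

            ' Every further part opens a new page and becomes its first paragraph.
            For i As Integer = 1 To parts.Length - 1
                current = New List(Of String)
                pages.Add(current)
                current.Add(parts(i))
            Next
        Next

        Return pages
    End Function
End Module
```

For example, passing the three texts "Intro", "End of page one" & vbLf & "Start of page two" and "More text" yields two pages with two paragraphs each.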

To show images in the viewer, RTFlib's InsertImage function needs a Drawing.Image parameter, so I created the GetImage function in the Document class to supply one.

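Public Function GetImage(Picture As Novacode.Picture) As Drawing.Image
    Dim Image As Drawing.Image = Nothing

    ' Find the document image that backs this picture.
    Dim DocImage As Novacode.Image = Me.Doc.Images.Find(Function(T) T.Id = Picture.Id)
    If DocImage IsNot Nothing Then
        Using Stream As IO.Stream = DocImage.GetStream(IO.FileMode.Open, IO.FileAccess.Read)
            ' Decode straight from the stream. Reading it into a buffer first
            ' would leave the position at the end with nothing left to decode.
            Using Source As Drawing.Image = Drawing.Image.FromStream(Stream)
                ' Copy into a new Bitmap so the result does not depend on the
                ' stream, which must otherwise stay open while the image is used.
                Image = New Drawing.Bitmap(Source)
            End Using
        End Using
    End If

    Return Image
End Function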

Here is a captured image of the viewer:

[Image: Word Viewer]

Points of Interest

The viewer is very simple and easy to understand, and it should also be easy to convert to C#.

History

  • 8th November, 2018: First release

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

hussam.it
Software Developer (Senior)
Syrian Arab Republic
Developer for 20 years, always looking forward to new technology.
Independent developer working on my own products.
Main development language is VB.NET, but I can develop in any other one.

Article Copyright 2018 by hussam.it