License: The Code Project Open License (CPOL)
Word automation (Part 1)
By padmanabhan N
C#, Windows, ASP.NET, Dev
This article covers two Word automation tasks: converting a document's Table of Contents to a TreeView, and exporting the tables in a Word document to Excel.
Reading plain text from a Word document is straightforward. Extracting the Table of Contents and converting tables to Excel was much harder, and I struggled before finding a solution, so I decided to share it.
The application has two tabs, one for each task.
When automating a document of more than 400 pages, there is no built-in way to select only the part you want to process. The Table of Contents helps here, because it lists the headings of everything in the document. This article finds the Table of Contents and stores it in a TreeView as parent and child nodes.
The second issue concerned tables. When I automated Word documents containing many tables, the alignment of the output was bad, so at first I deleted the tables. Later, I had a requirement to export those tables as well, so I started automating the export of tables to Excel.
The uploaded document is opened and its Table of Contents is located. Each entry is then classified as a parent or a child.
if (doc.TablesOfContents.Count != 0)
{
    // Office collections are 1-based; only the first Table of Contents is used.
    // Page numbers are switched off so each entry holds only the heading text.
    doc.TablesOfContents[1].IncludePageNumbers = false;
    Table conTable = doc.TablesOfContents[1].Range.ConvertToTable
        (ref nullobj, ref nullobj, ref nullobj, ref nullobj,
         ref nullobj, ref nullobj, ref nullobj, ref nullobj,
         ref nullobj, ref nullobj, ref nullobj, ref nullobj,
         ref nullobj, ref nullobj, ref nullobj, ref nullobj);
    string insertRange1 = string.Empty;
    int tblrowa = conTable.Rows.Count;
    int tblcola = conTable.Columns.Count;
    // Each pass overwrites insertRange1, so the text of the last cell
    // is what gets split below.
    for (int tblrow2 = 1; tblrow2 <= tblrowa; tblrow2++)
    {
        for (int tblcol2 = 1; tblcol2 <= tblcola; tblcol2++)
        {
            insertRange1 = conTable.Cell(tblrow2, tblcol2).Range.Text;
        }
    }
    // One Table of Contents entry per paragraph mark.
    string[] Content = insertRange1.Split(new char[] { '\r' });
    List<string> cellValdata = new List<string>();
    TreeNode Parenttree = new TreeNode();
    TreeNode treechild = new TreeNode();
    // Assign the parent and child nodes.
    for (int cou = 0; cou <= Content.Length - 1; cou++)
    {
        CopyWithProgress(Content.Length);
        // Skip empty entries and the end-of-cell marker.
        if (Content[cou] != "" && Content[cou] != "\a")
        {
            // An empty separator array splits on white space.
            string[] FurtherSplit = Content[cou].Split(new char[] { });
            if (FurtherSplit[0].Length <= 2)
            {
                // A short first token such as "1" or "12" marks a parent.
                Parenttree = new TreeNode(Content[cou]);
                tree.Nodes.Add(Parenttree);
            }
            else if (Content[cou].Contains("."))
            {
                // An entry containing a dot, such as "1.1 Scope",
                // becomes a child of the current parent.
                treechild = new TreeNode(Content[cou]);
                Parenttree.Nodes.Add(treechild);
            }
            else
            {
                // Anything else is treated as a parent.
                Parenttree = new TreeNode(Content[cou]);
                tree.Nodes.Add(Parenttree);
            }
        }
        ProgressBar.PerformStep();
    }
}
Note that the code assumes there is only one Table of Contents per document.
When a parent is checked, all of its child nodes are checked. When any child node is unchecked, its parent node is unchecked.
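The parent/child rule used above can be tried without Word or a TreeView. The following is a minimal sketch, assuming the entries have already been split into strings; `TocEntry`, `TocClassifier`, `IsParent` and `Build` are hypothetical names introduced here for illustration and are not part of the article's project. One deliberate difference: a child entry that appears before any parent is promoted to a top-level entry instead of being attached to a node that is never added to the tree.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical plain-data stand-in for a TreeNode with one level of children.
public sealed class TocEntry
{
    public string Title;
    public List<string> Children = new List<string>();
    public TocEntry(string title) { Title = title; }
}

public static class TocClassifier
{
    // Same test as the listing: a first token of at most two characters,
    // or an entry without any dot, is treated as a parent.
    public static bool IsParent(string entry)
    {
        string[] tokens = entry.Split(new char[0]); // empty array = split on white space
        return tokens[0].Length <= 2 || !entry.Contains(".");
    }

    public static List<TocEntry> Build(IEnumerable<string> lines)
    {
        List<TocEntry> roots = new List<TocEntry>();
        TocEntry current = null;
        foreach (string entry in lines)
        {
            // Skip empty entries and the end-of-cell marker.
            if (entry == "" || entry == "\a") continue;

            // Assumption of this sketch: a child seen before any parent
            // becomes a top-level entry.
            if (current == null || IsParent(entry))
            {
                current = new TocEntry(entry);
                roots.Add(current);
            }
            else
            {
                current.Children.Add(entry);
            }
        }
        return roots;
    }
}
```

Because the dot test looks at the whole entry, a heading such as "Appendix v1.2" is classified as a child. That is a property of the rule itself, not of this sketch.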
// Setting Checked from code raises AfterCheck again. These two flags
// stop the handler from propagating a second time while that happens.
Boolean bChild = true;
Boolean bParent = true;

private void tree_AfterCheck(object sender, TreeViewEventArgs e)
{
    if (bChild)
    {
        CheckAllChildren(e.Node, e.Node.Checked);
    }
    if (bParent)
    {
        CheckMyParent(e.Node, e.Node.Checked);
    }
}

// Gives every descendant of tn the same checked state.
void CheckAllChildren(TreeNode tn, Boolean bCheck)
{
    bParent = false;
    foreach (TreeNode ctn in tn.Nodes)
    {
        bChild = false;
        ctn.Checked = bCheck;
        bChild = true;
        CheckAllChildren(ctn, bCheck);
    }
    bParent = true;
}

// Gives every ancestor of tn the same checked state.
void CheckMyParent(TreeNode tn, Boolean bCheck)
{
    if (tn == null) return;
    if (tn.Parent == null) return;
    bChild = false;
    bParent = false;
    tn.Parent.Checked = bCheck;
    CheckMyParent(tn.Parent, bCheck);
    bParent = true;
    bChild = true;
}
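The propagation rule in the handlers above can also be expressed on a plain data structure, which makes it easy to test without a form. This is a minimal sketch under that assumption; `CheckNode` and `CheckPropagation` are hypothetical names introduced here, not WinForms types. Since no events fire in this model, the two guard flags are not needed.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a TreeNode: a checked flag, a parent and children.
public sealed class CheckNode
{
    public bool Checked;
    public CheckNode Parent;
    public List<CheckNode> Children = new List<CheckNode>();

    public CheckNode Add(CheckNode child)
    {
        child.Parent = this;
        Children.Add(child);
        return child;
    }
}

public static class CheckPropagation
{
    // Applies the same rule as the listing: the node, all of its
    // descendants and all of its ancestors take the new state.
    public static void SetChecked(CheckNode node, bool value)
    {
        node.Checked = value;
        SetDescendants(node, value);
        for (CheckNode p = node.Parent; p != null; p = p.Parent)
        {
            p.Checked = value;
        }
    }

    private static void SetDescendants(CheckNode node, bool value)
    {
        foreach (CheckNode child in node.Children)
        {
            child.Checked = value;
            SetDescendants(child, value);
        }
    }
}
```

As in the original handlers, checking a single child also checks its parent while leaving the siblings untouched. In a real TreeView, another common way to avoid re-entrancy is to ignore events whose `e.Action` is `TreeViewAction.Unknown`, since that value indicates the change was made from code.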
Expand all expands every node in the TreeView:
private void btnExpandAll_Click(object sender, EventArgs e)
{
    this.tree.ExpandAll();
}
Collapse all collapses every node in the TreeView:
private void btnCollapseAll_Click(object sender, EventArgs e)
{
    this.tree.CollapseAll();
}
Check all checks every parent node and child node in the TreeView:
private void btnCheckAll_Click(object sender, EventArgs e)
{
    for (int node = 0; node < tree.Nodes.Count; node++)
    {
        tree.Nodes[node].Checked = true;
        if (bChild)
        {
            CheckAllChildren(tree.Nodes[node], tree.Nodes[node].Checked);
        }
        if (bParent)
        {
            CheckMyParent(tree.Nodes[node], tree.Nodes[node].Checked);
        }
    }
}
Uncheck all unchecks every parent node and child node in the TreeView:
private void btnUncheckAll_Click(object sender, EventArgs e)
{
    for (int node = 0; node < tree.Nodes.Count; node++)
    {
        tree.Nodes[node].Checked = false;
        if (bChild)
        {
            CheckAllChildren(tree.Nodes[node], tree.Nodes[node].Checked);
        }
        if (bParent)
        {
            CheckMyParent(tree.Nodes[node], tree.Nodes[node].Checked);
        }
    }
}
The following code reads every row of every table in the document and writes it to an Excel worksheet.
if (doc.Tables.Count != 0)
{
    // Identify each table and read its values.
    int rowtbl = 0;
    for (int tables = 1; tables <= doc.Tables.Count; tables++)
    {
        // Together with the increment after each row, this extra step
        // leaves one blank worksheet row between tables.
        rowtbl = rowtbl + 1;
        Table tbl = doc.Tables[tables];
        CopyWithProgress(100);
        foreach (Microsoft.Office.Interop.Word.Row row in tbl.Rows)
        {
            CopyWithProgress(doc.Tables.Count);
            List<string> cellValues = new List<string>();
            foreach (Microsoft.Office.Interop.Word.Cell cell in row.Cells)
            {
                string cellContents = cell.Range.Text;
                // Cells containing "=" are skipped. The last two characters
                // are Word's end-of-cell marker ("\r\a") and are removed.
                if (!cellContents.Contains("="))
                    cellValues.Add(cellContents.Remove(cellContents.Length - 2));
            }
            // Write the row into columns A, B, C, ... A single letter
            // covers at most 26 columns.
            for (int celval = 0; celval < cellValues.Count; celval++)
            {
                string address = ((char)('A' + celval)).ToString() + rowtbl.ToString();
                m_objRange = m_objSheet.get_Range(address, m_objOpt);
                m_objRange.Value2 = cellValues[celval].Trim();
            }
            rowtbl = rowtbl + 1;
            ProgressBar.PerformStep();
        }
    }
    // Save the output Excel file.
    string outputPath = CurrentPath + "\\Temp.xlsx";
    m_objBook.SaveAs(outputPath, m_objOpt, m_objOpt,
        m_objOpt, m_objOpt, m_objOpt,
        Microsoft.Office.Interop.Excel.XlSaveAsAccessMode.xlNoChange,
        m_objOpt, m_objOpt, m_objOpt, m_objOpt, m_objOpt);
    System.Runtime.InteropServices.Marshal.ReleaseComObject(m_objBooks);
    System.Runtime.InteropServices.Marshal.ReleaseComObject(m_objExcel);
    //m_objBook.Close(false, TMPpath, false);
    // Close the Word document and quit Word.
    doc.Close(ref m_objOpt, ref m_objOpt, ref m_objOpt);
    a.Quit(ref m_objOpt, ref m_objOpt, ref m_objOpt);
    File.Delete(file.ToString());
    // Open the generated workbook.
    Process.Start(outputPath);
}
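The export code builds each cell address from a single column letter, so it only reaches column Z. Two small helpers would lift that limit and isolate the cell clean-up. This is a sketch under stated assumptions, not part of the original project: `ExcelHelpers`, `ColumnName` and `CleanCellText` are hypothetical names, and `ColumnName` takes a 1-based column index.

```csharp
using System;
using System.Text;

public static class ExcelHelpers
{
    // Converts a 1-based column index to Excel letters:
    // 1 gives "A", 26 gives "Z", 27 gives "AA".
    public static string ColumnName(int index)
    {
        if (index < 1) throw new ArgumentOutOfRangeException("index");
        StringBuilder sb = new StringBuilder();
        while (index > 0)
        {
            index--; // shift to 0-based so that 26 maps to 'Z'
            sb.Insert(0, (char)('A' + index % 26));
            index /= 26;
        }
        return sb.ToString();
    }

    // Removes Word's end-of-cell marker ("\r\a") and surrounding white space.
    // Unlike Remove(Length - 2), this does not throw on very short strings.
    public static string CleanCellText(string raw)
    {
        return raw.TrimEnd('\r', '\a').Trim();
    }
}
```

With these helpers, the address for the cell at 0-based position `celval` in worksheet row `rowtbl` could be built as `ExcelHelpers.ColumnName(celval + 1) + rowtbl`.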
The uploaded document must contain a Table of Contents and tables; otherwise, an error message is shown. Tables with merged columns cause an error, so this automation does not produce results for them. If a document has more than one Table of Contents, only the first is used.
This is my third article on CodeProject. The project has been tested with a number of documents and corrected accordingly. If you find further errors, please let me know so that they can be corrected.
Last Updated: 24 Jun 2009. Editor: Deeksha Shenoy
Copyright 2009 by padmanabhan N. Everything else Copyright © CodeProject, 1999-2009