How to Parse the Bookmarks File of Mozilla into a Datatable

Sep 14, 2007

An article describing how to parse a Mozilla Firefox bookmarks file into a DataTable

Screenshot - ImportMozilla1.jpg

Introduction

This article presents a procedure for importing the HTML file that holds the Mozilla Firefox bookmarks (favourites). The procedure parses the file and loads it into a DataTable, which can then be used to populate a TreeView. With this class, you can import Mozilla Firefox bookmarks into your own application.

Using the Code

Essentially, the bookmarks file is an HTML file that uses particular tags to identify URLs and folders. A <DT> tag followed by <H3> marks the start of a new folder. The class ClsMozilla exposes a static method, ReadBookmark, which takes the physical path of the bookmarks file as its parameter and returns a DataTable. Inside it, a StreamReader named StreamBookMarks loops over every line of the file. Whenever a relevant tag is found, the corresponding folder or link is added to the DataTable.
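The tag test described above can be sketched as a small line classifier. This is only an illustration, not the code of ClsMozilla: the names BookmarkLineKind and BookmarkLineSketch are hypothetical, and treating </DL> as the end of a folder is an assumption about the file layout rather than something stated in the article.

```csharp
using System;
using System.IO;

// Hypothetical classification of a single line of the bookmarks file.
public enum BookmarkLineKind { FolderStart, Link, FolderEnd, Other }

public static class BookmarkLineSketch
{
    // <DT> followed by <H3> starts a folder (as described in the article).
    // <DT> followed by <A> is taken to be a link, and </DL> is assumed
    // to close the current folder.
    public static BookmarkLineKind Classify(string line)
    {
        string t = line.TrimStart();
        if (t.StartsWith("<DT><H3", StringComparison.OrdinalIgnoreCase))
            return BookmarkLineKind.FolderStart;
        if (t.StartsWith("<DT><A", StringComparison.OrdinalIgnoreCase))
            return BookmarkLineKind.Link;
        if (t.StartsWith("</DL>", StringComparison.OrdinalIgnoreCase))
            return BookmarkLineKind.FolderEnd;
        return BookmarkLineKind.Other;
    }

    public static void Main()
    {
        // A tiny in-memory sample stands in for the real file; with a file on
        // disk, a StreamReader would be read line by line in the same way.
        string sample =
            "<DL><p>\n" +
            "    <DT><H3>News</H3>\n" +
            "    <DL><p>\n" +
            "        <DT><A HREF=\"http://example.com/\">Example</A>\n" +
            "    </DL><p>\n" +
            "</DL><p>";

        using (var reader = new StringReader(sample))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
                Console.WriteLine(Classify(line) + ": " + line.Trim());
        }
    }
}
```

Matching on the start of the trimmed line keeps the sketch simple, but it relies on each entry sitting on its own line.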

int Last_parent_id = 0;
// id of the folder the current item belongs to; 0 means the item has no parent folder

int Before_parent_id = 0;
// when a folder is closed, this stores the previous parent id

These two variables associate each link or folder with its parent folder. This is essential for building the tree of links, because each node needs to know its parent. A DataTable suits this task well: my first attempt used a multidimensional array, which proved much harder to manage. The DataTable has the following columns.

DataTable dt = new DataTable();
dt.Columns.Add(new DataColumn("id", typeof(Int32)));
dt.Columns.Add(new DataColumn("parent_id", typeof(Int32)));
dt.Columns.Add(new DataColumn("Hierarchy", typeof(Int32)));
dt.Columns.Add(new DataColumn("Title", typeof(string)));
dt.Columns.Add(new DataColumn("Link", typeof(string)));
dt.Columns.Add(new DataColumn("Des", typeof(string)));
dt.Columns.Add(new DataColumn("Type", typeof(string)));
DataRow dr;
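To illustrate how the id and parent_id columns tie entries to their folders, here is a minimal sketch. It is not the article's implementation: it tracks open folders with a Stack<int> instead of the Last_parent_id and Before_parent_id pair, it uses a reduced set of columns, the input is a hypothetical list of already-parsed events, and the Type values "Folder" and "Url" are assumed spellings.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Text;

public static class ParentTrackingSketch
{
    public static DataTable BuildSample()
    {
        DataTable dt = new DataTable();
        dt.Columns.Add(new DataColumn("id", typeof(Int32)));
        dt.Columns.Add(new DataColumn("parent_id", typeof(Int32)));
        dt.Columns.Add(new DataColumn("Title", typeof(string)));
        dt.Columns.Add(new DataColumn("Type", typeof(string)));

        // Stack of open folder ids; 0 stands for "no parent folder".
        var openFolders = new Stack<int>();
        openFolders.Push(0);
        int nextId = 1;

        // Hypothetical pre-parsed events: { kind, title }.
        string[][] events =
        {
            new[] { "Folder", "News" },
            new[] { "Url", "Example" },
            new[] { "Close", "" },
            new[] { "Url", "Top-level link" }
        };

        foreach (string[] e in events)
        {
            if (e[0] == "Close")
            {
                // Leaving a folder: fall back to the enclosing one.
                if (openFolders.Count > 1) openFolders.Pop();
                continue;
            }

            DataRow dr = dt.NewRow();
            dr["id"] = nextId;
            dr["parent_id"] = openFolders.Peek();
            dr["Title"] = e[1];
            dr["Type"] = e[0];
            dt.Rows.Add(dr);

            // A new folder becomes the parent of everything that follows it.
            if (e[0] == "Folder") openFolders.Push(nextId);
            nextId++;
        }
        return dt;
    }

    // Walks the table recursively through parent_id, indenting by depth.
    public static void AppendTree(DataTable dt, int parentId, int depth, StringBuilder sb)
    {
        foreach (DataRow row in dt.Select("parent_id = " + parentId, "id ASC"))
        {
            sb.Append(new string(' ', depth * 2)).Append(row["Title"]).Append('\n');
            if ((string)row["Type"] == "Folder")
                AppendTree(dt, (int)row["id"], depth + 1, sb);
        }
    }

    public static void Main()
    {
        var sb = new StringBuilder();
        AppendTree(BuildSample(), 0, 0, sb);
        Console.Write(sb.ToString());
    }
}
```

The same recursive walk over parent_id could add nodes to a TreeView instead of appending text.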

The Type column records the nature of each entry, i.e. whether it is a URL or a folder. A simple Regex removes the HTML markup from the text.

public static string StripHTMLTags(string StrText)
{
    // Remove the HTML tags
    return Regex.Replace(StrText, @"<(.|\n)*?>", string.Empty);
}
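As a usage sketch, the snippet below applies StripHTMLTags to a single link line to obtain its title. The ExtractHref helper and the sample line are hypothetical and only show one possible way to read the address for the Link column; the article does not show how the original class does this.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LinkLineSketch
{
    // Same regex as in the article: drops everything between < and >.
    public static string StripHTMLTags(string StrText)
    {
        return Regex.Replace(StrText, @"<(.|\n)*?>", string.Empty);
    }

    // Hypothetical helper: returns the value of the HREF attribute,
    // or an empty string when the line has none.
    public static string ExtractHref(string line)
    {
        Match m = Regex.Match(line, "HREF=\"([^\"]*)\"", RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        string line = "<DT><A HREF=\"http://example.com/\" ADD_DATE=\"1189728000\">Example Site</A>";

        Console.WriteLine(StripHTMLTags(line).Trim()); // the link title
        Console.WriteLine(ExtractHref(line));          // the link address
    }
}
```

Note that stripping tags this way leaves HTML entities such as &amp; untouched, so titles containing them would need a separate decoding step.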

Points of Interest

The class is very simple, as is the structure of the Firefox bookmarks file. Internet Explorer favourites are simpler still, because each favourite is stored as a separate file. The Opera bookmarks file is similar to the Firefox one, and I plan to publish a future article covering the other browsers.

History

  • 14 September, 2007 -- Original version posted