Introduction
This article introduces a program that extracts message headers from newsgroups and exports them to a text file or a Microsoft Excel file.
Background
For a long time, I looked for a tool that could extract information from a newsgroup server and help analyze which topics people focus on most, who is interested in what content, where the posters come from, who the most active people in a newsgroup are, and so on. Unfortunately, I could not find a suitable one on the Internet, and the usable ones were not free. So I decided to write a program myself.
NNTP wrapper class
Several Code Project articles already describe the NNTP commands; the one I relied on most is the article written by TY Lee. I extended its classes slightly to make them better suited to driving a user interface. The classes follow the NNTP definition and use a network stream to communicate with the newsgroup server. The NewsgroupClient class wraps the methods that send commands to the server and retrieve the corresponding data. For example, the ListGroup() method lists the newsgroups on the server, the SelectGroup() method selects a group as the active one and retrieves its article range, and the DownloadHeaders() method gets article headers from the currently active group. The connections run on the same thread as the UI, so Application.DoEvents() is called at several points in the code to keep the application responsive while data is being transferred. Each time an article header is retrieved, an event is fired so that the information can be displayed immediately.
User interface
The user interface of this program is divided into two parts: the newsgroups tree on the left side of the form, and the article list on the right. I use SpringSys OrchidGrid to build the main parts of the UI because it can work in tree mode and supports exporting data to Excel.
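As an aside on the NNTP wrapper described above: when a group is selected, the server answers the GROUP command with a single status line, which RFC 3977 specifies as "211 number low high group" on success. The sketch below shows one way such a line could be turned into an article range. The GroupInfo and NntpResponses names are hypothetical and are not taken from the article's NewsgroupClient, whose internals are not shown here.

```csharp
using System;
using System.Globalization;

// Hypothetical container for the values in a GROUP reply
// (not the article's CurrentGroup type).
public struct GroupInfo
{
    public long Count;   // estimated number of articles
    public long LowId;   // lowest article number
    public long HighId;  // highest article number
    public string Name;  // newsgroup name
}

public static class NntpResponses
{
    // Parses a successful GROUP reply such as
    // "211 1234 3000234 3002322 misc.test".
    // Returns false for any other status code or a malformed line.
    public static bool TryParseGroupResponse(string line, out GroupInfo info)
    {
        info = default(GroupInfo);
        if (string.IsNullOrWhiteSpace(line)) return false;

        string[] parts = line.Trim().Split(
            new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length < 5 || parts[0] != "211") return false;

        long count, low, high;
        // NumberStyles.None accepts digits only (no sign, no whitespace).
        if (!long.TryParse(parts[1], NumberStyles.None, CultureInfo.InvariantCulture, out count)) return false;
        if (!long.TryParse(parts[2], NumberStyles.None, CultureInfo.InvariantCulture, out low)) return false;
        if (!long.TryParse(parts[3], NumberStyles.None, CultureInfo.InvariantCulture, out high)) return false;

        info.Count = count;
        info.LowId = low;
        info.HighId = high;
        info.Name = parts[4];
        return true;
    }
}
```

A caller would pass the first line read from the network stream after sending "GROUP name" and, on success, could request headers for the range LowId to HighId.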
The tree view on the left has two levels: the top level holds the server nodes, and the second level displays the newsgroups on each server. After a newsgroup server is added, a NewsgroupClient object is created and stored in the server node, and all the newsgroups on the server are listed by calling the ListGroup() method. Each NewsgroupClient corresponds to one newsgroup server and is responsible for all later network communication with it. The following code retrieves the newsgroups from a server and adds them to the server node:

    private bool _updating = false;

    private void UpdateGroups(GridTreeNode node)
    {
        // get the NewsgroupClient object from the tree node
        NewsgroupClient ngClient = node.Tag as NewsgroupClient;
        if (ngClient == null) return;

        _updating = true;
        this.StartProgress();

        // connect to the server if necessary
        if (!ngClient.Connected)
        {
            string server = node.Data as string;
            this.lblMsg.Text = string.Format("Connecting to server {0} ......", server);
            if (!ngClient.Connect(server))
            {
                _updating = false;
                this.StopProgress();
                MessageBox.Show("Error connecting to server " + server);
                return;
            }
        }

        // reset the children of the server node
        node.ClearChildren();
        int index = node.Row.Index + 1;

        // retrieve the newsgroups from the server
        this.lblMsg.Text = "Retrieving groups from server......";
        string[] newsgroups = ngClient.ListGroup();

        // add the newsgroups as child nodes (redraw is suspended while inserting)
        ogServer.Redraw = false;
        for (int i = 0; i < newsgroups.Length; i++)
        {
            Row row = this.ogServer.Rows.Insert(index + i, newsgroups[i]);
            row.TreeImage = this.imageList1.Images[1];
        }
        ogServer.Redraw = true;

        this.lblMsg.Text = "Completed!";
        this.StopProgress();

        // adjust the column width
        this.ogServer.AutoSizeColWidth();
        _updating = false;
    }

The data of the tree is persisted to a text file when the application is closed. The next time the program runs, the data is restored from that text file. This is done by the PersistGroups() and LoadGroups() methods.
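The article does not show PersistGroups() and LoadGroups() or the layout of the text file they use. Purely as an illustration of the idea, the sketch below assumes a hypothetical layout in which each server name sits on its own line and each of its newsgroups follows on a line that starts with a tab. The GroupStore class and its method names are invented for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; the file layout is an assumption, not the article's format.
public static class GroupStore
{
    // Produces one line per server, followed by its groups prefixed with a tab.
    public static List<string> ToLines(
        IEnumerable<KeyValuePair<string, List<string>>> servers)
    {
        var lines = new List<string>();
        foreach (var server in servers)
        {
            lines.Add(server.Key);
            foreach (string group in server.Value)
                lines.Add("\t" + group);
        }
        return lines;
    }

    // Rebuilds the server/group structure from lines written by ToLines.
    // Group lines that appear before any server line are ignored.
    public static List<KeyValuePair<string, List<string>>> FromLines(
        IEnumerable<string> lines)
    {
        var servers = new List<KeyValuePair<string, List<string>>>();
        List<string> current = null;
        foreach (string line in lines)
        {
            if (string.IsNullOrWhiteSpace(line)) continue;
            if (line[0] == '\t')
            {
                if (current != null) current.Add(line.Substring(1));
            }
            else
            {
                current = new List<string>();
                servers.Add(new KeyValuePair<string, List<string>>(line.Trim(), current));
            }
        }
        return servers;
    }
}
```

With this layout, saving and loading reduce to File.WriteAllLines(path, GroupStore.ToLines(servers)) and GroupStore.FromLines(File.ReadAllLines(path)); a real implementation would also need to walk the grid rows to collect and restore the nodes.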
By checking nodes in the tree, we can specify which groups are going to be explored before downloading the article headers. This uses the check box node feature of the grid. The following line makes sure that checking a server node also checks all the group nodes that belong to that server:

    this.ogServer.Tree.CheckAction = TreeCheckAction.Children;

The following code walks through each row of the tree grid and downloads headers only for the nodes that are checked:

    private bool _downloading = false;

    private void btnDownload_Click(object sender, EventArgs e)
    {
        if (_downloading || _updating) return;
        try
        {
            _downloading = true;
            this.StartProgress();
            foreach (Row row in ogServer.Rows)
            {
                if (row.IsNode) continue;           // skip rows that are not newsgroup rows
                if (row.UserData != null) continue; // skip groups that have already been visited
                if (row.TreeChecked == CheckState.Checked)
                    DownLoadHeaders(row);           // download headers for this newsgroup
            }
        }
        catch
        {
            // unexpected errors are ignored here; per-group failures are shown by DownLoadHeaders
        }
        finally
        {
            this.StopProgress();
            _downloading = false;
            if (_currentClient != null) _currentClient._forceStop = false;
        }
    }

    NewsgroupClient _currentClient;

    private void DownLoadHeaders(Row row)
    {
        GridTreeNode node = row.Node;

        // get the NewsgroupClient from the node
        NewsgroupClient ngClient = node.Tag as NewsgroupClient;
        if (ngClient == null) return;

        // connect if necessary
        _currentClient = ngClient;
        _currentClient._forceStop = false;
        if (!ngClient.Connected)
        {
            string server = node.Data as string;
            this.lblMsg.Text = string.Format("Connecting to server {0} ......", server);
            if (!ngClient.Connect(server))
            {
                this.StopProgress();
                MessageBox.Show("Error connecting to server " + server);
                return;
            }
        }

        // get the group name
        string group = row[0] as string;
        if (group == null) return;
        this.lblMsg.Text = string.Format("Downloading from group {0} ......", group);

        try
        {
            // select the group
            ngClient.SelectGroup(group);

            // download the article headers
            // note: headers are passed back to the UI through the OnDownloadHeader event
            ArrayList headers = ngClient.DownloadHeaders(ngClient.CurrentGroup.LowID, ngClient.CurrentGroup.HighID);
            if (headers == null)
            {
                this.lblMsg.Text = "Download message header failed";
            }
            else
            {
                this.lblMsg.Text = "Success!";
                row.Style = _visitedStyle;
                row.UserData = "Visited";
            }
        }
        catch
        {
            this.lblMsg.Text = string.Format("Error while downloading from group {0}", group);
        }
    }

The goal is not only to display the article headers in a friendly user interface but, more importantly, to export the data to a text file or an Excel file for later processing. Fortunately, OrchidGrid has built-in support for exporting data through its ExportToDelimitedFile() and ExportToExcel() methods, so no extra code is needed for this functionality.
You can also write your own exporting code if you like. In this application, I commented out some code that exports only the e-mail addresses of the article authors to a text file.
During network operations, the progress bar and the message label stay active to indicate progress. You can also stop or cancel an operation at any moment.
If you need other header data in addition to “Subject”, “From”, and “Date”, you can modify the ArticleHeader class and adjust the code for your project.
Try a sample server
Let’s try a newsgroup server as an example: “msnews.microsoft.com”. It hosts many newsgroups, and some of them contain thousands of articles. Enter the server address “msnews.microsoft.com” and press the Enter key; the server and its newsgroups are listed in the tree on the left. Check the newsgroups you are interested in and click the “Download Message Headers” button to get all the headers in the selected newsgroups. Then you can export the headers to a text or Excel file. Please see the screenshot of this application at the top of this page.
I hope you like this tool and find it useful.
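A note on the custom export mentioned earlier: the commented-out code that writes only the authors' e-mail addresses is not reproduced in this article, so the following is only a sketch of how such an export might start. It assumes the "From" value of each header is available as a plain string, and it uses a deliberately simplified address pattern rather than the full message-format grammar. The FromHeader class and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper; not part of the article's ArticleHeader class.
public static class FromHeader
{
    // Simplified pattern: local part, '@', domain with at least one dot.
    // It will miss unusual but valid addresses.
    private static readonly Regex AddressPattern = new Regex(
        @"[A-Za-z0-9._%+\-]+@[A-Za-z0-9.\-]+\.[A-Za-z]{2,}");

    // Returns the first address found in a From value such as
    // "\"Jane Doe\" <jane@example.com>" or "jane@example.com (Jane Doe)",
    // or null when no address is present.
    public static string ExtractAddress(string from)
    {
        if (string.IsNullOrEmpty(from)) return null;
        Match match = AddressPattern.Match(from);
        return match.Success ? match.Value : null;
    }

    // Collects each address once, ignoring case, in first-seen order.
    public static List<string> DistinctAddresses(IEnumerable<string> fromValues)
    {
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        var result = new List<string>();
        foreach (string from in fromValues)
        {
            string address = ExtractAddress(from);
            if (address != null && seen.Add(address))
                result.Add(address);
        }
        return result;
    }
}
```

The resulting list could then be written with File.WriteAllLines(path, addresses), one address per line.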