E-mail Address Scanner in C#


May 26, 2006


Scan a website for email addresses and add them to an XML-based file (MSN contact list file)


Introduction

This simple program scans a webpage (within its own basic browser) for email addresses, which you can then view, remove, or add to in a list. This list can then be exported to an MSN Messenger Contacts List file (*.ctt) to add to your MSN contacts.

This article provides source code for you, and explains key points within the code. It does not go over the whole project as a tutorial would.

Background

This was originally developed as a plugin for the HTMLEditor program, but was then converted to a standalone application for this site.

Key Points in Code

Step 1) Typing in the URL

The first thing the program must do is let the user specify a website to get the addresses from. For this there is a textbox (textBox1) which accepts a URL. On pressing Enter, the PreviewKeyDown event handler loads the URL into the web browser and copies the <BODY> HTML of the page into the rich text box.
The handler is as follows:

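private void textBox1_PreviewKeyDown ( object sender, PreviewKeyDownEventArgs e )
{
    if (e.KeyCode == Keys.Return)
    {
        Status.Text = "Loading file...";
        webBrowser1.Navigate ( textBox1.Text );
        // Navigate is asynchronous, so the new document may not be available yet.
        // The guard avoids a NullReferenceException; handling webBrowser1.DocumentCompleted
        // is the more reliable place to copy the HTML.
        if (webBrowser1.Document != null && webBrowser1.Document.Body != null)
        {
            richTextBox1.Text = webBrowser1.Document.Body.InnerHtml;
            Status.Text = "File loaded!";
        }
    }
}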

Next comes the core of the application: the Click event handler of the button labelled "Get email addresses from site / file".

private void btnGet_Click ( object sender, EventArgs e )
{
    int Total = richTextBox1.Text.Length;
    // Characters per 1% of the text; at least 1, so short pages cannot cause a division by zero
    int Percent = Math.Max ( 1, Total / 100 );
    Console.WriteLine ( Total );
    Console.WriteLine ( "1% = " + Percent );

To keep the user updated, we work out how many characters make up 1% of the text, so our progress bar can report accurately.
    int At;
    int Start = 0;
    int End = 0;
    ProgressBar.Visible = true;
    for (int c = 0; c < richTextBox1.Text.Length; c++)
    {
        richTextBox1.Select ( c, 1 );
        if (richTextBox1.SelectedText == "@")
        {
            At = richTextBox1.SelectionStart;
            // Fall back to the start of the text if no delimiter is found to the left
            Start = 0;
            for (int b = At; b >= 0; b--)
            {
                richTextBox1.Select ( b, 1 );
                if (richTextBox1.SelectedText == " " || richTextBox1.SelectedText == "<" || richTextBox1.SelectedText == "," || richTextBox1.SelectedText == ">")
                {
                    Start = richTextBox1.SelectionStart + 1;
                    break;
                }
            }

So, we look for the "@" symbol, since every email address must contain one. Then we look for the nearest delimiter (a space, a comma, or a "<" or ">" bracket) on either side of it: b decrements until it finds one to the left, whereas a increments until it finds one to the right.

            // Fall back to the end of the text if no delimiter is found to the right
            End = richTextBox1.Text.Length;
            for (int a = At; a < richTextBox1.Text.Length; a++)
            {
                richTextBox1.Select ( a, 1 );
                if (richTextBox1.SelectedText == " " || richTextBox1.SelectedText == "<" || richTextBox1.SelectedText == "," || richTextBox1.SelectedText == ">")
                {
                    End = richTextBox1.SelectionStart;
                    break;
                }
            }
            richTextBox1.Select ( Start, End - Start );
            lb_Emails.Items.Add ( richTextBox1.SelectedText );
        }

Here, we select the text between the two delimiters (typically the HTML tags surrounding the address) and add it to our ListBox control.

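        // Clamp the value so that rounding can never push it past the bar's maximum
        ProgressBar.Value = Math.Min ( ProgressBar.Maximum, c / Percent );
        Status.Text = "Parsing file..." + ProgressBar.Value + "%";
        Console.WriteLine ( ProgressBar.Value );
    }
    Status.Text = "Success! " + lb_Emails.Items.Count + " items parsed!";
}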
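
An alternative to precomputing a 1% step is to compute the percentage directly from the current position and the total length. The following is only a minimal sketch: the names ScanProgress and PercentDone are hypothetical and not part of the original project. It needs no special case for texts shorter than 100 characters and never returns more than 100.

```csharp
using System;

public static class ScanProgress
{
    // Whole-number percentage of `total` covered by `position`, always within 0..100.
    // (Hypothetical helper, not part of the original project.)
    public static int PercentDone ( int position, int total )
    {
        if (total <= 0)
        {
            return 100; // nothing to scan, so report completion
        }
        // long arithmetic avoids overflow when position * 100 exceeds int.MaxValue
        long percent = (long) position * 100 / total;
        return (int) Math.Max ( 0, Math.Min ( 100, percent ) );
    }
}
```

Inside the loop, this could be used as ProgressBar.Value = ScanProgress.PercentDone ( c, Total ); assuming the progress bar keeps its default range of 0 to 100.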

So, although the nested loops and ifs may look complicated, the code speaks for itself once you break it down into plain English: look for an @, find the delimiters on either side of it, and add the text between them to the list.
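
The same idea can be sketched on a plain string, without selecting characters in the rich text box one at a time. This is an illustrative sketch rather than the project's code: the class EmailScanner and its Extract method are hypothetical names. It uses the same four delimiters as the handler, and it additionally skips a bare "@" with nothing on one side, which the handler does not do.

```csharp
using System;
using System.Collections.Generic;

public static class EmailScanner
{
    // Characters treated as the boundaries of an address (space, '<', ',' and '>')
    private static readonly char[] Delimiters = { ' ', '<', ',', '>' };

    // Returns every delimiter-bounded piece of text that contains an '@'.
    // (Hypothetical helper, not part of the original project.)
    public static List<string> Extract ( string text )
    {
        List<string> found = new List<string> ( );
        int at = text.IndexOf ( '@' );
        while (at >= 0)
        {
            // Nearest delimiter to the left of '@' (-1 if none, so start becomes 0)
            int start = text.LastIndexOfAny ( Delimiters, at ) + 1;
            // Nearest delimiter to the right of '@' (end of text if none)
            int end = text.IndexOfAny ( Delimiters, at );
            if (end < 0)
            {
                end = text.Length;
            }
            // Skip a bare '@' that has nothing before or after it
            if (start < at && end > at + 1)
            {
                found.Add ( text.Substring ( start, end - start ) );
            }
            at = text.IndexOf ( '@', at + 1 );
        }
        return found;
    }
}
```

Like the handler, this treats only those four characters as boundaries, so text such as href="mailto:..." is returned with its quotes and attribute name attached, and duplicates are not removed. That is why the manual pruning step that follows is still useful.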

The next stage is for the user to manually prune or add to the list. The event handlers for these actions are self-explanatory, so they are not covered here.

The final stage is to click "Done" and have your new MSN Contact List file made.

*.CTT files are XML-based, simple documents following this format:

<?xml version="1.0"?>
<messenger>
  <service name=".NET Messenger Service">
    <contactlist>
      <contact>email@address.com</contact>
      <contact>email@address2.com</contact>
    </contactlist>
  </service>
</messenger>

So, first we write the XML declaration and the opening <messenger>, <service name> and <contactlist> lines, then loop over the list writing <contact> + email + </contact> for each entry, and finally close those tags and save the result to a CTT file.

This is all done in the button's event handler:

private void btnDone_Click ( object sender, EventArgs e )
{
    // Set up a SaveFileDialog that offers to create MSN contact list files
    // (no spaces around the | in the filter, or the pattern will not match)
    SaveFileDialog save = new SaveFileDialog ( );
    save.Filter = "Messenger Contacts (*.ctt)|*.ctt";
    save.InitialDirectory = "C:\\";
    save.FileName = "ContactList";

    if (save.ShowDialog ( ) == DialogResult.OK)
    {
        // "..." is the placeholder entry the user clicks to add an address, so we leave it out of our list
        lb_Emails.Items.Remove ( "..." );

        // Basic loop to re-create the XML structure of *.CTT files, as discussed before.
        // The using block closes the file even if a write fails.
        using (StreamWriter sw = new StreamWriter ( save.FileName ))
        {
            sw.WriteLine ( "<?xml version=\"1.0\"?>" );
            sw.WriteLine ( "<messenger>" );
            sw.WriteLine ( " <service name=\".NET Messenger Service\">" );
            sw.WriteLine ( " <contactlist>" );
            foreach (object ob in lb_Emails.Items)
            {
                // Escape characters such as & and < so the file stays well-formed XML
                sw.WriteLine ( " <contact>" + System.Security.SecurityElement.Escape ( ob.ToString ( ) ) + "</contact>" );
            }
            sw.WriteLine ( " </contactlist>" );
            sw.WriteLine ( " </service>" );
            sw.WriteLine ( "</messenger>" );
        }

        Status.Text = "File saved successfully!";
    }
}
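
As an alternative to concatenating the tags by hand, the same structure can be produced with XmlWriter, which handles escaping and closing tags itself. This is a hedged sketch, not the project's method: the class CttWriter and its Write method are hypothetical names. Note that the declaration XmlWriter emits also carries an encoding attribute, which the format shown above does not have; whether Messenger accepts that has not been verified here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class CttWriter
{
    // Writes the contact list structure to `output`.
    // (Hypothetical helper, not part of the original project.)
    public static void Write ( TextWriter output, IEnumerable<string> emails )
    {
        XmlWriterSettings settings = new XmlWriterSettings ( );
        settings.Indent = true;

        using (XmlWriter xml = XmlWriter.Create ( output, settings ))
        {
            xml.WriteStartDocument ( );
            xml.WriteStartElement ( "messenger" );
            xml.WriteStartElement ( "service" );
            xml.WriteAttributeString ( "name", ".NET Messenger Service" );
            xml.WriteStartElement ( "contactlist" );
            foreach (string email in emails)
            {
                // WriteElementString escapes characters such as & and <
                xml.WriteElementString ( "contact", email );
            }
            xml.WriteEndElement ( ); // contactlist
            xml.WriteEndElement ( ); // service
            xml.WriteEndElement ( ); // messenger
            xml.WriteEndDocument ( );
        }
    }
}
```

In the event handler, the output argument could be a StreamWriter opened on save.FileName, and the emails could be collected from lb_Emails.Items by calling ToString ( ) on each entry.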


Feel free to experiment with this code and add new features to it :) Have fun!

History



26/05/06: Submitted to CodeProject

Feedback

I am always willing to accept feedback, positive or otherwise. Feel free to contact me via the following:

MSN: jamespraveen@aol.com
E-Mail: james@magclan.cwhnetworks.com
Forum: http://www.just-code-it.net

Or of course post comments on this site.