E-mail Address Scanner in C#






May 26, 2006
Scan a website for email addresses and add them to an XML-based file (MSN contact list file)
Introduction
This simple program scans a webpage (within its own basic browser) for email addresses, which you can then view, remove, or add to in a list. This list can then be exported to an MSN Messenger Contacts List file (*.ctt) to add to your MSN contacts.
This article provides source code for you, and explains key points within the code. It does not go over the whole project as a tutorial would.
Background
This was originally developed as a plugin for the HTMLEditor program, but was then converted to a standalone application for this site.
Key Points in Code
Step 1) Typing in the URL
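The first thing the program must do is let the user specify a website to get the addresses from. A textbox (textBox1) accepts a URL, and pressing Enter raises its PreviewKeyDown event, which loads the URL into the web browser. Because Navigate returns before the page has finished loading, the <BODY> HTML of the page is copied into the rich text box only once the browser raises its DocumentCompleted event.
These events are as follows:
private void textBox1_PreviewKeyDown ( object sender, PreviewKeyDownEventArgs e )
{
    if (e.KeyCode == Keys.Return)
    {
        Status.Text = "Loading file...";
        // Navigate is asynchronous, so the document is not available yet at this point
        webBrowser1.Navigate ( textBox1.Text );
    }
}

// Assumption: this handler is attached to webBrowser1.DocumentCompleted
private void webBrowser1_DocumentCompleted ( object sender, WebBrowserDocumentCompletedEventArgs e )
{
    richTextBox1.Text = webBrowser1.Document.Body.InnerHtml;
    Status.Text = "File loaded!";
}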
Next comes the core of the application, the Click event handler of the button labelled "Get email addresses from site / file":
private void btnGet_Click ( object sender, EventArgs e )
{
    int Total = richTextBox1.Text.Length;
    // At least 1, so dividing by Percent is safe for texts shorter than 100 characters
    int Percent = Math.Max ( 1, Total / 100 );
    Console.WriteLine ( Total );
    Console.WriteLine ( "1% = " + Percent );
To keep the user updated, we work out how many characters make up 1% of the file, so our progress bar can report accurately.
    int At;
    int Start = 0;
    int End = 0;
    ProgressBar.Visible = true;
    for (int c = 0; c < richTextBox1.Text.Length; c++)
    {
        richTextBox1.Select ( c, 1 );
        if (richTextBox1.SelectedText == "@")
        {
            At = richTextBox1.SelectionStart;
            // Fall back to the start of the text if no delimiter is found to the left
            Start = 0;
            for (int b = At; b >= 0; b--)
            {
                richTextBox1.Select ( b, 1 );
                if (richTextBox1.SelectedText == " " || richTextBox1.SelectedText == "<" || richTextBox1.SelectedText == "," || richTextBox1.SelectedText == ">")
                {
                    Start = richTextBox1.SelectionStart + 1;
                    break;
                }
            }
So, we look for the "@" symbol, since any email address must contain one, and then look for a delimiter (a space, a comma, or a "<" or ">" bracket) on either side of it: b decrements until it finds one to the left, whereas a increments until it finds one to the right.
            // Fall back to the end of the text if no delimiter is found to the right
            End = richTextBox1.Text.Length;
            for (int a = At; a < richTextBox1.Text.Length; a++)
            {
                richTextBox1.Select ( a, 1 );
                if (richTextBox1.SelectedText == " " || richTextBox1.SelectedText == "<" || richTextBox1.SelectedText == "," || richTextBox1.SelectedText == ">")
                {
                    End = richTextBox1.SelectionStart;
                    break;
                }
            }
            richTextBox1.Select ( Start, End - Start );
            lb_Emails.Items.Add ( richTextBox1.SelectedText );
        }
Here, we select the text between the delimiters surrounding the address (typically the < > of HTML tags) and add it to our ListBox control.
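        // Scale before dividing so the value stays within 0-99 and never divides by zero
        int Done = (int)( (long)c * 100 / Total );
        ProgressBar.Value = Done;
        Status.Text = "Parsing file..." + Done + "%";
        Console.WriteLine ( Done );
    }
    Status.Text = "Success! " + lb_Emails.Items.Count + " items parsed!";
}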
So, although the nested loops and ifs may look complicated, in plain English the code speaks for itself: look for an @, find the delimiters around it, and add the text between them to the list.
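The same scan can also be expressed over a plain string, which avoids moving the RichTextBox selection for every character. The following is a minimal sketch rather than the article's code: the class name EmailScanner and the method ExtractCandidates are hypothetical, and it assumes the same four delimiters (space, comma, < and >). Unlike the handler, it resumes scanning after each candidate, so a run of text containing several @ characters is reported once.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original project.
public static class EmailScanner
{
    // Characters treated as the edges of an address (same set as the article's checks).
    private static readonly char[] Delimiters = { ' ', '<', ',', '>' };

    public static List<string> ExtractCandidates(string text)
    {
        var found = new List<string>();
        int at = text.IndexOf('@');
        while (at >= 0)
        {
            // Nearest delimiter to the left of '@'; -1 means "start of text".
            int left = at > 0 ? text.LastIndexOfAny(Delimiters, at - 1) : -1;
            // Nearest delimiter to the right of '@'; -1 means "end of text".
            int right = text.IndexOfAny(Delimiters, at + 1);

            int start = left + 1;
            int end = right < 0 ? text.Length : right;

            // Keep only candidates with at least one character on each side of '@'.
            if (start < at && at < end - 1)
            {
                found.Add(text.Substring(start, end - start));
            }

            // Resume the search after this candidate.
            at = text.IndexOf('@', end);
        }
        return found;
    }
}
```

For instance, calling EmailScanner.ExtractCandidates(richTextBox1.Text) would return the candidates, which could then be added to lb_Emails.Items in one pass. Like the original loop, this only finds text around an @ and does not check that the result is a valid address.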
The next stage for the user is to manually prune / add to their list, but the event handlers for these events are so self-explanatory they do not deserve a mention here (no offense).
The final stage is to click "Done" and have your new MSN Contact List file made.
*.CTT files are XML-based, simple documents following this format:
<?xml version="1.0"?>
<messenger>
  <service name=".NET Messenger Service">
    <contactlist>
      <contact>email@address.com</contact>
      <contact>email@address2.com</contact>
    </contactlist>
  </service>
</messenger>
So, first we need to add the XML version, <messenger>, <service name> and <contactlist> lines, then create a loop adding each <contact> + email + </contact>, and then close those tags, and save it to a CTT file.
This is all done in the button's event handler:
private void btnDone_Click ( object sender, EventArgs e )
{
    // Simply initialise a SaveFileDialog with the option of creating MSN Contact files
    SaveFileDialog save = new SaveFileDialog ( );
    save.Filter = "Messenger Contacts (*.ctt)|*.ctt";
    save.InitialDirectory = "C:\\";
    save.FileName = "ContactList";
    if (save.ShowDialog ( ) == DialogResult.OK)
    {
        // The "..." entry lets the user add an address, so we discount it from our list
        lb_Emails.Items.Remove ( "..." );
        // Basic loop to re-create the XML structure of *.CTT files, as discussed before.
        // The using block closes the writer (and the file) even if a write fails.
        using (StreamWriter sw = new StreamWriter ( save.FileName ))
        {
            sw.WriteLine ( "<?xml version=\"1.0\"?>" );
            sw.WriteLine ( "<messenger>" );
            sw.WriteLine ( " <service name=\".NET Messenger Service\">" );
            sw.WriteLine ( " <contactlist>" );
            foreach (object ob in lb_Emails.Items)
            {
                // Escape characters such as & and < so the file stays well-formed XML
                sw.WriteLine ( " <contact>" + System.Security.SecurityElement.Escape ( ob.ToString ( ) ) + "</contact>" );
            }
            sw.WriteLine ( " </contactlist>" );
            sw.WriteLine ( " </service>" );
            sw.WriteLine ( "</messenger>" );
        }
        Status.Text = "File saved successfully!";
    }
}
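As an alternative to writing the tags by hand, the same structure could be produced with XmlWriter, which handles escaping and closing tags itself. This is a sketch under assumptions, not the article's method: the class CttWriter and the method BuildCtt are hypothetical names, and the element names are taken from the sample file shown earlier. Note that when writing to a StringBuilder, the XML declaration names the utf-16 encoding, so it is not identical to the hand-written first line.

```csharp
using System.Collections.Generic;
using System.Text;
using System.Xml;

// Hypothetical helper, not part of the original project.
public static class CttWriter
{
    public static string BuildCtt(IEnumerable<string> emails)
    {
        var settings = new XmlWriterSettings();
        settings.Indent = true; // one element per line, like the sample file

        var sb = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("messenger");
            writer.WriteStartElement("service");
            writer.WriteAttributeString("name", ".NET Messenger Service");
            writer.WriteStartElement("contactlist");

            foreach (string email in emails)
            {
                // WriteElementString escapes &, < and > in the address.
                writer.WriteElementString("contact", email);
            }

            writer.WriteEndElement(); // contactlist
            writer.WriteEndElement(); // service
            writer.WriteEndElement(); // messenger
            writer.WriteEndDocument();
        }
        return sb.ToString();
    }
}
```

XmlWriter.Create also has an overload that takes a file name, so the same calls could write straight to the path chosen in the save dialog.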
Feel free to experiment with this code / add new features to it :) Have fun
History
26/05/06: Submitted to CodeProject
Feedback
I am always willing to accept feedback, positive or otherwise. Feel free to contact me via the following:
MSN: jamespraveen@aol.com
E-Mail: james@magclan.cwhnetworks.com
Forum: http://www.just-code-it.net
Or of course post comments on this site.