
NoSpamEmailHyperlink: 6. Customization

Paul Riley, 12 Oct 2003
Customizing the NoSpamEmailHyperlink to cause maximum damage to the spam harvesters.

A control derived from NoSpamEmailHyperlink renders differently but produces the same result

Introduction

This is the last in a series of six articles, following the design, development and practical use of a fully functional ASP.NET custom control.

The full list of articles is as follows:

These articles are not intended to be a comprehensive look at custom control development (there are 700+ page books that barely cover it), but they do cover a significant number of fundamentals, some of which are poorly documented elsewhere.

The intent is to do so in the context of a single fully reusable and customizable control (as opposed to many contrived examples), recognizing that few readers will want most of the series but many will want a few parts of it.

This article looks at techniques for customizing the NoSpamEmailHyperlink to make it unique for any given page without downloading and editing the code directly, which would make it difficult to incorporate any future improvements to the base control.

It assumes at least a basic knowledge of C# and class inheritance.

Customizing the NoSpamEmailHyperlink

With a few simple overrides, it is possible to completely change the nature of the NoSpamEmailHyperlink so that each site implementing the control uses it in a slightly different way, making it nearly impossible for email harvesters to detect and decode the email addresses. It is even possible to use numerous derived controls on the same page (as seen in the screenshot above), but this is not recommended.

The aim of this control is to force harvesting software to handle JavaScript as effectively as any browser and thus push up the price of such software, in turn pulling down the profit margins of the email spammers. If all they have to do is detect the link array, or the code key and either use it or move on, this would not cause them too many problems.

If, on the other hand, the control acts slightly differently on some sites and very differently on others, the email harvesters are going to have to work a lot harder.

The NoSpamEmailHyperlink offers six properties and three methods which can be overridden to create a completely different control with the same principles.

Custom Coding Key

Replacing the encoding string is easy as long as you follow a couple of simple rules. Create a new control, derived from NoSpamEmailHyperlink and override the protected .CodeKey property.

public class NoSpamEmailHyperlinkEx : NoSpamEmailHyperlink
{
    protected override string CodeKey
    {
        get
        {
            return "QpWlEmR5ToY4UkInO0PiAjS19DbFuGhHvJ2KyLgZc3XtCxVf6BNrdMeszawq78";
        }
    }
}

It is essential that you never include the same character twice. Doing so will confuse the decoding algorithm: if it finds the duplicated character, it cannot possibly know which character it was translated from.

It is also essential that you only include alphanumeric characters, unless you are also overriding the Encode/Decode functionality to handle it. Any other characters may translate a valid email address to an invalid one (for example, if the first character becomes a hyphen).

It is not essential to include every alphanumeric character in the key. Missing out one or two characters can actually make it more difficult to decode. For example, if you miss the "a" and "A" characters out of the key string, all other characters will be substituted except for the As. Once you realize that a string is encoded using substitution, the last thing you expect is for some characters not to be substituted at all. And yet, the decoding algorithm will handle the missing characters correctly.
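
The two rules above are easy to check mechanically before a new key goes live. The following is only a minimal sketch: CodeKeyChecker and FindProblem are hypothetical names, not part of the NoSpamEmailHyperlink, and "alphanumeric" is assumed to mean ASCII letters and digits.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the control itself
public static class CodeKeyChecker
{
    // Returns null when the key follows both rules, otherwise
    // a short description of the first problem found
    public static string FindProblem(string key)
    {
        HashSet<char> seen = new HashSet<char>();

        foreach (char ch in key)
        {
            // Rule: ASCII letters and digits only (assumed definition)
            bool isAlphanumeric =
                (ch >= 'A' && ch <= 'Z') ||
                (ch >= 'a' && ch <= 'z') ||
                (ch >= '0' && ch <= '9');
            if (!isAlphanumeric)
                return "Non-alphanumeric character: '" + ch + "'";

            // Rule: no repeats (HashSet.Add returns false for a repeat)
            if (!seen.Add(ch))
                return "Duplicate character: '" + ch + "'";
        }

        return null;
    }
}
```

A call such as CodeKeyChecker.FindProblem(myKey) could run in a unit test or a debug-only check, so that a mistyped key is caught before it ever scrambles an address. A key that deliberately leaves characters out still passes, since omissions are allowed.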

Custom Variable Names

Should the NoSpamEmailHyperlink become excessively popular, there are a number of ways in which a harvester could identify the encoded hyperlinks and discount them, or even decode them without using JavaScript.

Because the NoSpamEmailHyperlink uses GetType().Name to build the array names, function names and global-level variable names, any control derived from it will automatically use different names to avoid clashes.

However, a harvester could easily look for arrays with names ending _LinkArray and discount any links with IDs found in those arrays. Without too much more effort, it could find the _SeedArray and the ky variable and attempt to decode them.

But if we change the names of those variables on just a few pages, the process of detecting them becomes a lot more difficult.

public class NoSpamEmailHyperlinkEx : NoSpamEmailHyperlink
{
    protected override string CallScriptName {
        get {
            return GetType().Name + "Elephant";
        }
    }

    protected override string FuncScriptName {
        get {
            return GetType().Name + "SilverFish";
        }
    }

    protected override string SeedArrayName {
        get {
            return GetType().Name + "TexasHoldem";
        }
    }

    protected override string LinkArrayName {
        get {
            return GetType().Name + "Leichtenstein";
        }
    }

    protected override string CodeKeyName {
        get {
            return "ck";
        }
    }
}

As you can see, it is not entirely necessary for these strings to be related in any way to their function. For example, the above code changes the name of the seed array definition so that it resembles the following:

var NoSpamEmailHyperlinkExTexasHoldem =  new Array("23");

You may know what this means, and the calling script will adjust itself to find the new array name, but the harvester will no longer find the array simply by hunting for _SeedArray.

Note that all of the above properties, except for CodeKeyName, are used in the JavaScript at a global level. It is always advisable to use GetType().Name somewhere in the definition, so that the names do not clash if further controls derive from yours and fail to override these properties.
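
To illustrate why GetType().Name matters, here is a small stand-alone sketch. BaseLinkSketch and the classes derived from it are hypothetical stand-ins that mimic only the naming behaviour described above; they are not the real control, and the properties are public here (rather than protected) purely so they can be inspected.

```csharp
using System;

// Stand-in for the base control: mimics the naming behaviour only
public class BaseLinkSketch
{
    public virtual string SeedArrayName
    {
        get { return GetType().Name + "_SeedArray"; }
    }
}

public class LinkEx : BaseLinkSketch
{
    public override string SeedArrayName
    {
        // GetType() returns the run-time type, so a class deriving
        // from LinkEx gets its own prefix without overriding anything
        get { return GetType().Name + "TexasHoldem"; }
    }
}

// Derives from LinkEx and overrides nothing
public class LinkExTwo : LinkEx
{
}

// Hard-coded name: every derived class shares it
public class LinkFixed : BaseLinkSketch
{
    public override string SeedArrayName
    {
        get { return "TexasHoldem"; }
    }
}

// Derives from LinkFixed and overrides nothing
public class LinkFixedTwo : LinkFixed
{
}
```

In this sketch, LinkEx and LinkExTwo report LinkExTexasHoldem and LinkExTwoTexasHoldem respectively, so both could sit on one page. LinkFixed and LinkFixedTwo both report TexasHoldem, and their global arrays would collide.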

Custom Coding Algorithm

For the more adventurous tinkerer, it is also possible to override the .Encode() and .GetFuncScript() methods to provide a completely new algorithm for encoding and decoding the email address.

The new algorithm may be as simple or as complex as you like. Just because your favorite algorithm is simple, do not assume that this is a bad thing. As long as it is different, it is more confusing for the harvesters.

Maybe you want to make a simple change, such as accelerating the rate of change in the base number (initially the seed). Simply copy the code as described in NoSpamEmailHyperlink: 3. Email Encoding and Decoding into your derived control and amend it however you please.

For example:

public class NoSpamEmailHyperlinkEx : NoSpamEmailHyperlink
{
    protected override string GetFuncScript()
    {
#if DEBUG
        // Formatted script text in debug version
        JavaScriptBuilder jsb = new JavaScriptBuilder(true);
#else
        // Compress script text in release version
        JavaScriptBuilder jsb = new JavaScriptBuilder();
#endif

        jsb.AddLine("function ", FuncScriptName, "(link, seed)");
        jsb.OpenBlock(); // function()
        jsb.AddCommentLine("This is the decoding key for all ",
            LinkArrayName, " objects");
        jsb.AddLine("var ", CodeKeyName, " = \"", CodeKey, "\";");
        jsb.AddLine();
        jsb.AddCommentLine("Store the innerText so that it doesn't get");
        jsb.AddCommentLine("distorted when updating the href later");
        jsb.AddLine("var storeText = link.innerText;");
        jsb.AddLine();
        jsb.AddCommentLine("Initialize variables");
        jsb.AddLine("var baseNum = parseInt(seed);");
        jsb.AddLine("var atSym = link.href.indexOf(\"@\");");
        jsb.AddLine("if (atSym == -1) atSym = 0;");
        jsb.AddLine("var dotidx = link.href.indexOf(\".\", atSym);");
        jsb.AddLine("if (dotidx == -1) dotidx = link.href.length;");
        jsb.AddLine("var scramble = link.href.substring(7, dotidx);");
        jsb.AddLine("var unscramble = \"\";");
        jsb.AddLine("var su = true;");
        jsb.AddLine();
        jsb.AddCommentLine("Go through the scrambled section of the address");
        jsb.AddLine("for (var i = 0; i < scramble.length; i++)");
        jsb.OpenBlock(); // for (i = 0; i < scramble.length; i++)
        jsb.AddCommentLine("Find each character in the scramble key string");
        jsb.AddLine("var ch = scramble.substring(i,i + 1);");
        jsb.AddLine("var idx = ", CodeKeyName, ".indexOf(ch);");
        jsb.AddLine();
        jsb.AddCommentLine("If it isn't there then add the character");
        jsb.AddCommentLine("directly to the unscrambled email address");
        jsb.AddLine("if (idx < 0)");
        jsb.OpenBlock(); // if (idx < 0)
        jsb.AddLine("unscramble = unscramble + ch;");
        jsb.AddLine("continue;");
        jsb.CloseBlock(); // if (idx < 0)
        jsb.AddLine();
        jsb.AddCommentLine("Decode the character");
        jsb.AddLine("idx -= (su ? -baseNum : baseNum);");
        jsb.AddLine("baseNum -= (su ? -Math.pow(i, 2) : Math.pow(i, 2));");
        jsb.AddLine("while (idx < 0) idx += ", CodeKeyName, ".length;");
        jsb.AddLine("idx %= ", CodeKeyName, ".length;");
        jsb.AddLine();
        jsb.AddCommentLine("... and add it to the unscrambled email address");
        jsb.AddLine("unscramble = unscramble + ",
            CodeKeyName, ".substring(idx,idx + 1);");
        jsb.AddLine("su = !su;");
        jsb.CloseBlock(); // for (i = 0; i < scramble.length; i++)
        jsb.AddLine();
        jsb.AddCommentLine("Adjust the href property of the link");
        jsb.AddLine("var emAdd = unscramble + ",
            "link.href.substring(dotidx, link.href.length + 1);");
        jsb.AddLine("link.href = \"mailto:\" + emAdd;");
        jsb.AddLine();
        jsb.AddCommentLine("If the scrambled email address is also in the text");
        jsb.AddCommentLine("of the hyperlink, replace it");
        jsb.AddLine("var findEm = storeText.indexOf(scramble);");
        jsb.AddLine("while (findEm > -1)");
        jsb.OpenBlock(); // while (findEm > -1)
        jsb.AddLine("storeText = storeText.substring(0, findEm) + emAdd ",
            "+ storeText.substring(findEm + emAdd.length, storeText.length);");
        jsb.AddLine("findEm = storeText.indexOf(scramble);");
        jsb.CloseBlock(); // while (findEm > -1)
        jsb.AddLine();
        jsb.AddLine("link.innerText = storeText;");
        jsb.CloseBlock(); // function()

        return jsb.ToString();
    }

    protected override string Encode (string Unencoded)
    {
        // Convert string to char[]
        char[] scramble = Unencoded.ToCharArray();

        // Initialize variables
        int baseNum = ScrambleSeed;
        bool subtract = true;

        // Find the @ symbol and the following .
        // If there is no @, search for the . from the start;
        // if there is no ., scramble the whole string
        int atSymbol = Array.IndexOf(scramble, '@');
        if (atSymbol == -1) atSymbol = 0;
        int stopAt = Array.IndexOf(scramble, '.', atSymbol);
        if (stopAt == -1) stopAt = scramble.Length;

        // Go through the section of the address to be scrambled
        for (int i=0; i < stopAt; i++)
        {
            // Find each character in the scramble key string
            char ch = scramble[i];
            int idx = CodeKey.IndexOf(ch);

            // If it isn't there then ignore the character
            if (idx < 0) continue;

            // Encode the character
            idx += (subtract ? -baseNum : baseNum);
            baseNum -= (subtract
                ? -(int)Math.Pow(i, 2) : (int)Math.Pow(i, 2));
            while (idx < 0) idx += CodeKey.Length;
            idx %= CodeKey.Length;
            scramble[i] = CodeKey[idx];
            subtract = !subtract;
        }

        // Return the encoded string
        return new string(scramble);
    }
}

Only the lines that update baseNum (those using Math.pow in the script and Math.Pow in Encode) have been changed, but this is a massive change to the coding algorithm and an extra JavaScript command for the harvesters to understand.
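
When changing the algorithm, it is easy to get the encoder and the decoder out of step. One way to guard against that is to port the decoding loop to C# and test the round trip. The sketch below is an assumption-laden illustration: RoundTripSketch and Transform are hypothetical names, the key is the example key from earlier in this article, and the decoder is a hand-written C# mirror of the script logic rather than the script itself.

```csharp
using System;

// Hypothetical test helper, not part of the control
public static class RoundTripSketch
{
    const string Key =
        "QpWlEmR5ToY4UkInO0PiAjS19DbFuGhHvJ2KyLgZc3XtCxVf6BNrdMeszawq78";

    // Encoding and decoding share one loop; they differ only in
    // the sign of the shift applied to each key index
    static string Transform(string text, int seed, bool decode)
    {
        char[] chars = text.ToCharArray();
        int baseNum = seed;
        bool subtract = true;

        // Only the part before the first . after the @ is scrambled
        int at = Array.IndexOf(chars, '@');
        if (at == -1) at = 0;
        int stopAt = Array.IndexOf(chars, '.', at);
        if (stopAt == -1) stopAt = chars.Length;

        for (int i = 0; i < stopAt; i++)
        {
            int idx = Key.IndexOf(chars[i]);
            if (idx < 0) continue; // not in the key: leave untouched

            int shift = subtract ? -baseNum : baseNum;
            idx += decode ? -shift : shift;
            baseNum -= subtract ? -(i * i) : (i * i);

            // Bring the index back into the range 0..Key.Length-1
            idx %= Key.Length;
            if (idx < 0) idx += Key.Length;

            chars[i] = Key[idx];
            subtract = !subtract;
        }

        return new string(chars);
    }

    public static string Encode(string email, int seed)
    {
        return Transform(email, seed, false);
    }

    public static string Decode(string encoded, int seed)
    {
        return Transform(encoded, seed, true);
    }
}
```

A unit test can then assert that Decode(Encode(address, seed), seed) returns the original address for a handful of addresses and seeds. This does not prove the generated JavaScript is correct, but it catches most arithmetic slips before they reach a browser.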

Conclusion

The variations on this theme are limited only by your imagination. You could use multiple keys, perhaps one upper-case and one lower-case key. Perhaps you want to substitute underscores and hyphens, prefixing with a random letter to keep the address valid.

You could simulate the World War II "one time pad" system, by "adding" the first letter of the email address to the first letter of the key, the second letter of the email address to the second letter of the key, and so on.
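
As a rough sketch of that idea, the code below "adds" each character of the address to the character at the same position in a pad by adding their positions in a fixed alphabet. PadSketch, its alphabet and its method names are all hypothetical. Unlike the control's own algorithm, this version transforms every alphanumeric character, including those after the dot. Because the pad has to be shipped in the page for the script to decode with, this is obfuscation in the spirit of a one-time pad, not real secrecy.

```csharp
using System;
using System.Text;

// Hypothetical illustration, not part of the control
public static class PadSketch
{
    const string Alphabet =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    // sign = 1 adds the pad (encode), sign = -1 subtracts it (decode)
    static string Shift(string text, string pad, int sign)
    {
        if (pad.Length < text.Length)
            throw new ArgumentException(
                "Pad must be at least as long as the text");

        int n = Alphabet.Length;
        StringBuilder sb = new StringBuilder(text.Length);

        for (int i = 0; i < text.Length; i++)
        {
            int t = Alphabet.IndexOf(text[i]);
            int p = Alphabet.IndexOf(pad[i]);

            // Characters outside the alphabet (such as @ and .)
            // are copied across unchanged
            if (t < 0 || p < 0)
            {
                sb.Append(text[i]);
                continue;
            }

            // Add or subtract the pad position, wrapping around
            sb.Append(Alphabet[((t + sign * p) % n + n) % n]);
        }

        return sb.ToString();
    }

    public static string Encode(string email, string pad)
    {
        return Shift(email, pad, 1);
    }

    public static string Decode(string encoded, string pad)
    {
        return Shift(encoded, pad, -1);
    }
}
```

With the pad "bb", for instance, "ab" encodes to "bc": each letter moves forward by the position of "b" in the alphabet, which is 1.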

You do not have to limit yourself to substitution algorithms. You could reverse the characters in both the user and domain segments of the email address (e.g. pdriley@santt.com becomes yelirdp@ttnas.com) or use a more complex transposition algorithm.
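
The reversal example can be sketched in a few lines. ReverseSketch is a hypothetical name, and "domain segment" is taken here to mean the text between the @ and the first dot after it, which matches the example address above.

```csharp
using System;

// Hypothetical illustration, not part of the control
public static class ReverseSketch
{
    static string Reverse(string s)
    {
        char[] chars = s.ToCharArray();
        Array.Reverse(chars);
        return new string(chars);
    }

    // Reverses the user segment and the domain segment separately;
    // everything from the first dot after the @ is left alone
    public static string Transform(string email)
    {
        int at = email.IndexOf('@');
        if (at < 0) return email; // not an email address: leave as-is

        int dot = email.IndexOf('.', at);
        if (dot < 0) dot = email.Length;

        return Reverse(email.Substring(0, at))
            + "@"
            + Reverse(email.Substring(at + 1, dot - at - 1))
            + email.Substring(dot);
    }
}
```

Because reversing twice restores the original, the same function serves as both encoder and decoder, and the decoding script would only need to perform the same two reversals.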

It really makes no difference what approach you take: the more people who add their own personal touch to the NoSpamEmailHyperlink, the more painful it becomes for the email harvesters.

Let your imagination run wild.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

About the Author

Paul Riley
Web Developer
United Kingdom
Paul lives in a backwater village in the middle of England. Since writing his first Hello World on an Oric 1 in 1980, Paul has become a programming addict, got married and lost most of his hair (these events may or may not be related in any number of ways).
 
Since writing the above, Paul got divorced and moved to London. His hair never grew back.
 
Paul's ambition in life is to be the scary old guy whose house kids dare not approach except at Halloween.

Last Updated 13 Oct 2003
Article Copyright 2003 by Paul Riley