Posted 1 Jan 2013

A Plan for Spam

How to abate the CodeProject spam crisis.

We are currently under heavy pressure from a narrow group of "TV and Media" spammers who cynically challenge our ability to resist this kind of crime. Members of CodeProject are making a remarkable effort to exterminate these unwanted parasites, but the measures taken so far do not seem quite satisfactory. This short article grew out of a discussion between Chris Maunder and myself about what we can do:[^],[^].

Several hours later, a fresh idea came to my mind, a variant of the ideas we had already discussed. I would ask interested members to think about it, discuss it, criticize it or support it. Generally, we need some brainstorming to help Chris and others arm the site with improved protection against spam, in a way that does not threaten legitimate members and does not add too much overhead to using and maintaining the site.

I'm coming back to the idea of Bayesian filtering. I used it successfully on my e-mail a while ago but eventually replaced it with my own approach (this is not the place to discuss it, because it cannot be applied to the site). I think Bayesian filtering did not come to dominate e-mail services for natural reasons, such as the overhead for the human operator/user and the method's unavoidable false negatives/positives. However, I'm starting to think that this idea, with a special twist (which can be discussed further), can be applied to the protection of CodeProject.

This short article is named after the article "A Plan for Spam" by Paul Graham:[^].

See also another article:[^].

I think the idea will be clear enough after reading these articles.

As to the implementation, please look at this open-source product:[^].

And this is a CodeProject article: A Naive Bayesian Spam Filter for C#[^].

That was just to demonstrate that the implementation won't be a big problem.

Still, the problem is: how do we decide on the cancellation of a spammer's account? Don't we face the same problems, false negatives/positives and an excessive amount of administrator intervention? Remember that I pointed out the main problem with the workload put on a human administrator: the required chores are not automated, or not optimized to meet the goals.

Now, here is the main idea:

Let's invert the situation socially. Instead of making the decision on cancellation of an offender's account, let's make the potential offender apply for the "legalization" of a potentially spamming post. Hold on! Don't reject this idea at the very beginning, before I explain how it may look in practice. I'm going to demonstrate that this can be done gently enough.

First of all, let's remember the starting point. At the starting point, the filter is empty (or all available filters are empty), so, without the intervention of members who care about exterminating spam, nothing is ever filtered out. The filters start to populate as members spot spam and report it as such. There should be a special reporting action for spam, which feeds the spam content into a filter. The filter then starts populating and gradually acquires the ability to detect spam content automatically. Yes, with some false positives/negatives. For the details of this process, please come back to the articles by Paul Graham.
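To make the "empty filter that learns from reports" idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: it uses a textbook multinomial naive Bayes with add-one smoothing rather than the exact scoring described in Paul Graham's article, and the class name SpamFilter, its methods and the idea of feeding accepted posts back as "ham" are all hypothetical, not anything CodeProject actually runs.

```python
import math
import re
from collections import Counter


class SpamFilter:
    """Tiny naive Bayes text filter; starts empty and learns from reports.

    Hypothetical sketch: names and the ham-feedback step are assumptions.
    """

    def __init__(self):
        self.spam_tokens = Counter()  # token -> count in reported spam
        self.ham_tokens = Counter()   # token -> count in accepted posts
        self.spam_posts = 0
        self.ham_posts = 0

    @staticmethod
    def tokenize(text):
        # Lowercase words, digits and a few symbols, roughly in Graham's spirit.
        return re.findall(r"[a-z0-9$'-]+", text.lower())

    def report_spam(self, text):
        """Called when a member uses the special 'report as spam' action."""
        self.spam_tokens.update(self.tokenize(text))
        self.spam_posts += 1

    def report_ham(self, text):
        """Called when a post is confirmed as legitimate (an assumption)."""
        self.ham_tokens.update(self.tokenize(text))
        self.ham_posts += 1

    def spam_probability(self, text):
        """Return P(spam | text); 0.0 while either class has no examples."""
        if self.spam_posts == 0 or self.ham_posts == 0:
            return 0.0  # an untrained filter filters nothing out
        vocab = len(set(self.spam_tokens) | set(self.ham_tokens))
        spam_total = sum(self.spam_tokens.values())
        ham_total = sum(self.ham_tokens.values())
        total_posts = self.spam_posts + self.ham_posts
        # Work in log space; add-one smoothing avoids zero probabilities.
        log_spam = math.log(self.spam_posts / total_posts)
        log_ham = math.log(self.ham_posts / total_posts)
        for token in self.tokenize(text):
            log_spam += math.log((self.spam_tokens[token] + 1) / (spam_total + vocab))
            log_ham += math.log((self.ham_tokens[token] + 1) / (ham_total + vocab))
        # Turn the log-odds into a probability; clamp so exp() cannot overflow.
        diff = max(min(log_ham - log_spam, 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))
```

In this sketch every spam report immediately shifts the token counts, so the filter improves with each report; how legitimate posts would be fed back is left open by the article and is assumed here.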

As a first step, a post flagged by the filter is not placed on the CodeProject content page (Questions & Answers, or something else). Instead, the potential offender gets a message on a page. Something like this:

CodeProject informs:
Sorry, we cannot place your post immediately. It contains some content detected by our filters as potential spam. The detection was based on previous spam reports by CodeProject members. If you believe this is not spam, you will need to post your explanation here [URL]

The content goes to the database. At the request of the potential spammer, a page with the legalization form is generated, and the resulting report goes to the database, where the status of the prospective post is stored. Again, this should not happen often; legitimate members posting their messages will almost never get this message. I know this from my experience with Bayesian filtering for e-mail.
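The hold-and-appeal flow described above could be modelled roughly as follows. This is a sketch under assumptions, not a description of the site's code: the status names, the Post record, the SPAM_THRESHOLD value and the functions submit and appeal are all invented for illustration, and the spam score is assumed to come from whatever filter is in use.

```python
from dataclasses import dataclass
from enum import Enum


class PostStatus(Enum):
    PUBLISHED = "published"
    HELD = "held"          # flagged by the filter, not shown on the site
    APPEALED = "appealed"  # author submitted the legalization form


@dataclass
class Post:
    author: str
    text: str
    status: PostStatus = PostStatus.PUBLISHED
    appeal_note: str = ""


SPAM_THRESHOLD = 0.9  # hypothetical cut-off; would need tuning

HOLD_NOTICE = (
    "Sorry, we cannot place your post immediately. "
    "If you believe this is not spam, please post your explanation."
)


def submit(post, spam_score):
    """Publish the post, or hold it and return the notice for the author."""
    if spam_score >= SPAM_THRESHOLD:
        post.status = PostStatus.HELD
        return HOLD_NOTICE
    post.status = PostStatus.PUBLISHED
    return None


def appeal(post, note):
    """Record the author's explanation; only held posts can be appealed."""
    if post.status is not PostStatus.HELD:
        raise ValueError("only held posts can be appealed")
    post.appeal_note = note
    post.status = PostStatus.APPEALED
```

Held posts that are never appealed simply stay in the HELD state, which matches the expectation that real spammers will rarely bother to fill in the form.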

Now, at the administrator's request, all the filtered messages are generated on a single page. Usually, one glance at the messages is enough to judge whether they are spam. Importantly, it is quite unlikely that real spammers will plead for legalization of their content. So I think the most typical action will be "Yes to all" (much like in the movie "Bruce Almighty", 2003; no, this is not spam, I have no interest in promoting this commercial product and cited it only to illustrate the protection method; I plead for legalization of this post :) ). Of course, this "Yes to all" applies only to the posts awaiting approval/legalization. It will be equally easy to have a single button, "Remove all offending posts and member accounts", for all checked items.
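The two bulk actions on the review page might look like this in outline. Everything here is hypothetical (the PendingPost record, its checked flag and both function names); the sketch only computes which posts and accounts each button would affect and leaves the actual database work out.

```python
from dataclasses import dataclass


@dataclass
class PendingPost:
    post_id: int
    author: str
    text: str
    checked: bool = False  # ticked by the reviewer on the single review page


def approve_all(pending):
    """'Yes to all': ids of every post awaiting approval/legalization."""
    return [p.post_id for p in pending]


def remove_checked(pending):
    """Return (post ids to delete, accounts to close) for the checked items."""
    checked = [p for p in pending if p.checked]
    post_ids = [p.post_id for p in checked]
    accounts = sorted({p.author for p in checked})  # each account only once
    return post_ids, accounts
```

Both actions take one click each, which is the point of the proposal: the reviewer's effort does not grow with the number of individual decisions.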

If you picture it clearly, you will see that this procedure will be much easier than what we have now.

Who gets access to this approval/legalization and member extermination procedure is a matter for discussion, and this aspect is less important. I would suggest that the right to the final extermination of an offender's account be left to the administration, while the right to legalize a post and the right to remove an offender's post (from this page; it is already available from the question page in the Questions & Answers forum) could be granted to members with a certain level of reputation.
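One possible reading of this split of rights, as a small sketch: the action names, the MIN_REPUTATION value and the is_allowed function are assumptions made up for illustration, and the real threshold would be a policy decision.

```python
from enum import Enum


class Action(Enum):
    LEGALIZE_POST = 1
    REMOVE_POST = 2
    CLOSE_ACCOUNT = 3


MIN_REPUTATION = 10_000  # hypothetical threshold for trusted members


def is_allowed(action, reputation, is_admin):
    """Admins may do everything; trusted members everything but closing accounts."""
    if is_admin:
        return True
    if action is Action.CLOSE_ACCOUNT:
        return False  # final account removal stays with the administration
    return reputation >= MIN_REPUTATION
```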

Please discuss this idea and share your ideas. Maybe we can come up with some variant of my approach or something completely different.

Thank you for your attention to this rather unpleasant matter and for the effort already invested in sustaining the site.



This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Sergey Alexandrovich Kryukov
United States


Comments and Discussions

Question: Nice article (Rakshith Kumar, member, 30-Oct-13 0:59)
Answer: Re: Nice article (Sergey Alexandrovich Kryukov, mvp, 30-Oct-13 4:10)
General: My vote of 5 (Michael Haephrati, mvp, 8-Mar-13 3:27)
General: Re: My vote of 5 (Sergey Alexandrovich Kryukov, mvp, 8-Mar-13 6:39)
Question: It's not a bad idea, but it can be taken further. (Pete O'Hanlon, protector, 8-Jan-13 7:03)
I've used Naive Bayes with moderate success in the past, and it definitely has a place, but as you point out, the big issue is how to decide on the automatic deletion of an account (poor Michael Martin had his account closed three times in the same day by the filter that Chris put in place last year).

Yes to a moderation queue, but this still leaves the issue that spammers create new accounts and post new messages. As an idea, to keep the noise down:

When a message is identified as spam, it goes into the moderation queue - but it is still visible, in place, for the account that created it. Other accounts don't see it in that place, they only see it in the moderation queue.
Sufficient votes - remove the message from the moderation queue. If the votes are for it not being spam, then all users get to see it in place. If the votes are for it being spam, then only the OP will be able to see it, don't remove it.

This isn't a new idea - FogBugz already implements a feature just like this - but it is surprisingly effective.
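The visibility rules in this comment can be sketched as below. This is an interpretation, not FogBugz's or CodeProject's implementation: the FlaggedPost record, the VOTES_NEEDED value and the rule that a tie resolves in favor of "not spam" are assumptions added for the example.

```python
from dataclasses import dataclass

VOTES_NEEDED = 3  # hypothetical number of votes that settles a verdict


@dataclass
class FlaggedPost:
    author: str
    spam_votes: int = 0
    ham_votes: int = 0

    def verdict(self):
        """'ham', 'spam' or 'pending'; ham is checked first (an assumption)."""
        if self.ham_votes >= VOTES_NEEDED:
            return "ham"
        if self.spam_votes >= VOTES_NEEDED:
            return "spam"
        return "pending"


def visible_in_place(post, viewer):
    """The author always sees the post in place; others only once it is cleared."""
    if viewer == post.author:
        return True
    return post.verdict() == "ham"


def in_moderation_queue(post):
    """Sufficient votes either way remove the post from the queue."""
    return post.verdict() == "pending"
```

Because the author keeps seeing the post in place even after a spam verdict, the spammer gets no signal that the post was rejected.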


Answer: Re: It's not a bad idea, but it can be taken further. (Sergey Alexandrovich Kryukov, mvp, 8-Mar-13 6:41)
Question: Already tried - but maybe not well enough (Chris Maunder, admin, 7-Jan-13 16:07)
General: Re: Already tried - but maybe not well enough (SoMad, member, 9-Jan-13 12:47)
General: Re: Already tried - but maybe not well enough (Sergey Alexandrovich Kryukov, mvp, 9-Jan-13 14:04)
General: Re: Already tried - but maybe not well enough (SoMad, member, 9-Jan-13 14:14)
General: Re: Already tried - but maybe not well enough (Sergey Alexandrovich Kryukov, mvp, 9-Jan-13 14:15)
General: Re: Already tried - but maybe not well enough (Chris Maunder, admin, 9-Jan-13 15:32)
General: Re: Already tried - but maybe not well enough (SoMad, member, 9-Jan-13 15:34)
General: Re: Already tried - but maybe not well enough (Sergey Alexandrovich Kryukov, mvp, 9-Jan-13 16:02)
Answer: Re: Already tried - but maybe not well enough (Dan Neely, member, 14-Feb-14 5:30)
General: Re: Already tried - but maybe not well enough (Sergey Alexandrovich Kryukov, mvp, 14-Feb-14 5:44)
General: A good start. (SoMad, member, 5-Jan-13 18:08)
General: Re: A good start. (Sergey Alexandrovich Kryukov, mvp, 8-Mar-13 6:42)
General: See my suggestion (Indivara, subeditor, 2-Jan-13 19:14)


Last Updated 2 Jan 2013
Article Copyright 2013 by Sergey Alexandrovich Kryukov
Everything else Copyright © CodeProject, 1999-2017