#1 Posted by insanejedi (655 posts) -

I can't help but notice that Giant Bomb has been littered with the everyday sort of scam spam in the comments and forums, such as...

my neighbor's sister-in-law makes $999999 every hour on the computer. She has been out of a job for 9999 months but last month her paycheck was $99999just working on the computer for a few hours. go to this site........ w­w­w.givemeavirus.c­o­m/

I don't know how many resources you guys want to put into filtering the spam to make Matt Rorie's job easier, but it's clear that these posts follow a very similar pattern.

Im making over $30h a month working part time. I kept hearing other people tell me how much money they can make online so I decided to look into it. Well, it was all true and has totally changed my life. This is what I do----> (website)

my friend's step-aunt makes $86/hr on the internet. She has been out of a job for nine months but last month her paycheck was $18678 just working on the internet for a few hours. browse around this web-site........ (website)

I was wondering if there was some way to program a spam filter that would automatically detect posts like this using a heuristic method of detection.

For example every one of these tends to start with...

My (relationship to other person: Aunt, Neighbor, Friend, etc.)

So you could do a flagging system, for example, that detects this particular format with a thesaurus or dictionary of nouns that relate to a person's relationship. Now, to avoid false positives (type 1 errors) for other posters, this is obviously going to require more detection.
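A rough sketch of that first check in Python. The word list and the `starts_with_relationship` name are made up for illustration, and a real filter would need a far bigger dictionary:

```python
import re

# Hypothetical, deliberately small dictionary of relationship nouns.
RELATION_WORDS = {
    "aunt", "uncle", "neighbor", "neighbour", "friend", "sister",
    "brother", "mother", "father", "mom", "dad", "cousin", "roommate",
}

def starts_with_relationship(post):
    """Return True if the post opens with "my" followed closely by a relationship noun."""
    words = re.findall(r"[a-z'-]+", post.lower())
    if not words or words[0] != "my":
        return False
    # Check the next few words, splitting compounds such as
    # "step-aunt" or possessives such as "neighbor's".
    for word in words[1:4]:
        parts = re.split(r"[-']", word)
        if any(part in RELATION_WORDS for part in parts):
            return True
    return False
```

On its own this would also flag an innocent "My friend recommended this game", which is why it can only be one signal among several.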

So another condition is whether we detect $x tied to a specific period of time.

Almost every one of them goes "makes/making (over) $x (time)"

and we can tie that to a further condition: whether the string contains one of the following words shortly after.

"internet, home, office, computer"
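Something like the following could cover the "$x per time" condition plus the keyword that follows it. This is only a sketch: the regex, the 60-character `WINDOW`, and the function name are my own guesses at reasonable values, not anything tested against real spam volumes.

```python
import re

# "makes/making (over) $x (time)", e.g. "makes $86/hr" or "making over $30h a month".
PAY_CLAIM = re.compile(
    r"\b(?:makes|making|made)\s+(?:over\s+)?"  # verb, optional "over"
    r"\$\s?\d[\d,.]*[a-z]?\s*"                  # "$86", "$18,678", "$30h"
    r"(?:/|per\s+|an?\s+|every\s+)\s*"          # "/", "per", "a", "an", "every"
    r"(?:hr|hour|day|week|month|year)s?\b",     # time unit
    re.IGNORECASE,
)

# The four words listed above.
WORK_WORDS = re.compile(r"\b(?:internet|home|office|computer)\b", re.IGNORECASE)

WINDOW = 60  # hypothetical: how many characters count as "shortly after"

def has_pay_claim(post):
    """Return True if a pay claim is followed shortly by one of the work words."""
    match = PAY_CLAIM.search(post)
    if match is None:
        return False
    following = post[match.end():match.end() + WINDOW]
    return WORK_WORDS.search(following) is not None
```

Note that the second sample quoted earlier says "online" rather than any of the four words, so this check would miss it unless the word list is widened.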

Finally, the nail in the coffin could be the website, which is ALWAYS put at the end of the string.

"www.workathome.com, etc."

In fact you don't have to make a blacklist as almost all of these spam comments put a website at the very end of the string, never in the middle.
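A minimal sketch of the "website at the very end" check, with no blacklist involved. It assumes the spammers break up the address with invisible characters such as soft hyphens, so those are stripped first; the pattern and names are illustrative only.

```python
import re

# Invisible characters sometimes used to break up a URL (assumed list).
INVISIBLE = dict.fromkeys(map(ord, "\u00ad\u200b\u200c\u200d\u2060"))

# Something domain-like, optionally with a scheme and a path,
# followed only by whitespace or dots until the end of the post.
URL_AT_END = re.compile(
    r"(?:https?://)?(?:[a-z0-9-]+\.)+[a-z]{2,}(?:/\S*)?[\s.]*$",
    re.IGNORECASE,
)

def ends_with_url(post):
    """Return True if the post finishes with something that looks like a URL."""
    cleaned = post.translate(INVISIBLE)  # drop the invisible characters
    return URL_AT_END.search(cleaned) is not None
```

A legitimate post that links something mid-sentence would pass, while one that simply ends on a link would be flagged, so this is still only one condition among the others.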

So if all these conditions are met, it's highly likely we have spam, with little risk of type 1 errors, where legitimate users would be blocked because they typed something similar.
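To show the "all conditions must be met" idea, here is a small sketch that only flags a post when every check fires. The four checks are deliberately crude stand-ins for the conditions described above, and every name in it is hypothetical:

```python
import re

# Crude stand-ins for the four conditions; each takes the post text and returns a bool.
CHECKS = {
    "relationship_opener": lambda p: re.match(
        r"\s*my\s+\w*(?:friend|aunt|neighbor)", p, re.I) is not None,
    "pay_claim": lambda p: re.search(
        r"\bmak(?:es|ing)\s+(?:over\s+)?\$\d", p, re.I) is not None,
    "work_keyword": lambda p: re.search(
        r"\b(?:internet|home|office|computer)\b", p, re.I) is not None,
    "url_at_end": lambda p: re.search(
        r"(?:www\.|https?://)\S+\s*$", p, re.I) is not None,
}

def evaluate(post):
    """Run every check and report which ones fired."""
    return {name: check(post) for name, check in CHECKS.items()}

def is_likely_spam(post):
    """Flag the post only if ALL conditions are met, to keep false positives low."""
    return all(evaluate(post).values())
```

Requiring every condition keeps legitimate posts safe at the cost of missing spam that breaks the pattern, and the per-check report from `evaluate` could be shown to a moderator instead of deleting automatically.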

I do realize that this is a losing battle in some cases, as they are going to get craftier about sending out their bullshit, but any attempt to automate this would certainly help the admins and make Giant Bomb a better user experience.

There is a chance that they might give up and go to websites more easily exploited in this fashion as well, rather than dedicate resources to subvert the spam system.

#2 Posted by BisonHero (6193 posts) -

I support your suggestion.

Alternatively, maybe just don't allow new users to post anything that looks like a URL for the first 72 hours? I think I've seen spammers get desperate and actually spell out dot-com or something, but that might be enough of a deterrent, since phonetically typed-out URLs don't achieve the goal they're going for.

Though maybe the spammers would just make a bunch of accounts and sit on them for a few days until they could post URLs.
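For what it's worth, the 72-hour idea could look roughly like this. It is only a sketch: the URL pattern, the short list of top-level domains, and the `may_post` name are assumptions, and it does nothing about accounts that are left to age before posting.

```python
import re
from datetime import datetime, timedelta, timezone

NEW_ACCOUNT_WINDOW = timedelta(hours=72)

# Anything that looks like a URL: a scheme, a "www." prefix,
# or a bare name ending in a common top-level domain (assumed list).
URL_LIKE = re.compile(
    r"(?:https?://|www\.)\S+|\b[a-z0-9-]+\.(?:com|net|org|info|biz)\b",
    re.IGNORECASE,
)

def may_post(body, account_created, now=None):
    """Reject URL-like content from accounts younger than 72 hours."""
    if now is None:
        now = datetime.now(timezone.utc)
    is_new = now - account_created < NEW_ACCOUNT_WINDOW
    has_url = URL_LIKE.search(body) is not None
    return not (is_new and has_url)
```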

#3 Posted by Rorie (2700 posts) -

We have some new moderation stuff rolling out that will allow us to at least delete this stuff more quickly after it's posted. We'll see how that works before we do anything too programming-heavy to filter stuff out. But thanks for the suggestion!

Staff
#4 Edited by Gruebacca (499 posts) -

The ones that I'm more interested in getting rid of are forum threads containing football livestream links. In my perfect world, if that kind of stuff pops up in the opening post, then the code would just delete the thread immediately after posting.

#5 Edited by jSlack (98 posts) -

Yes, it would definitely be cool to have better spam filters in place. Perhaps in the future, when we have some more time, we can roll out some better spam protection. You guys are right that they tend to be pretty predictable.

Staff