#1 Edited by Demoskinos (14851 posts) -

So I was watching the WWE '13 Quick Look and I noticed the prompt they got, something like "hey, if any of your text contains words we deem inappropriate, we're going to censor them." That got me thinking: is there some kind of algorithm behind these filters, or is it literally some dude's job to sit down and think of every dirty and awful word he can for like 8 hours every day? You'd think it would have to be manual, right? There are too many weird, slangy ways to get curse words through filters. What say ye?

#2 Posted by mwjeffcott (40 posts) -
#3 Edited by Ravenlight (8040 posts) -

I wonder if you can license a third-party language plugin that has a database of offensive words. It's like Speedtree for swears.

#4 Posted by Scrawnto (2450 posts) -

@Ravenlight said:

I wonder if you can license a third-party language plugin that has a database of offensive words. It's like Speedtree for swears.

I bet you can. How great would it be to see procedurally generated compound-swears?

#5 Posted by TyCobb (1972 posts) -

There's no algorithm because words are always being added. You could do algorithms for specific words. Make something up that checks variations of "fuck" like "fuk" and "fuuuuuuck".
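A rough sketch of that kind of per-word check in Python, just to show the idea. The pattern and the `contains_variant` name are made up for illustration and are not taken from any real filter: each letter is allowed to repeat and the "c" is optional, so the three spellings above all match.

```python
import re

# Hypothetical pattern for a single word: every letter may repeat and the
# "c" is optional. Word boundaries on both sides keep it from firing
# inside longer, unrelated words.
VARIANT = re.compile(r"\bf+u+c*k+\b", re.IGNORECASE)

def contains_variant(text: str) -> bool:
    """Return True if text contains a stretched or shortened spelling."""
    return VARIANT.search(text) is not None
```

The word boundaries mean suffixed forms (an "-ing" ending, say) slip through, and swapping letters for symbols or digits would too, so every new trick needs its own pattern.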

But it's really simple to just keep a dictionary and continually update it, and it's also super easy to run some sort of database query to see if any existing content has the new words you added. You then make the business decision of either removing the content or flagging it for someone to deal with.

You can find dictionary text files out there for all sorts of purposes. Shouldn't be hard to find one for profanity/vulgar language.
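Here is a minimal sketch of the dictionary approach, assuming content lives in a SQLite table called `posts` with `id` and `body` columns. The table layout, the function names, and the placeholder word list are all assumptions for the example; a real list would be loaded from one of those dictionary files.

```python
import re
import sqlite3

# Hypothetical word list with mild placeholders standing in for real entries.
BANNED = {"darn", "heck"}

def find_banned(text, banned=BANNED):
    """Return the banned words that appear as whole words in text."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return words & set(banned)

def flag_existing_posts(conn, new_words):
    """Return sorted ids of stored posts that contain any newly added word."""
    new_words = set(new_words)
    flagged = set()
    # Assumed schema: posts(id INTEGER PRIMARY KEY, body TEXT)
    for post_id, body in conn.execute("SELECT id, body FROM posts"):
        if find_banned(body, new_words):
            flagged.add(post_id)
    return sorted(flagged)
```

Matching whole words avoids flagging innocent words that merely contain a banned one, at the cost of missing creative spellings. Whether a flagged post gets removed or queued for a person to review is the business decision mentioned above.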

#6 Posted by onimonkii (2443 posts) -
#7 Posted by BillyTheKid (486 posts) -

I am sure that there is an algorithmic way to find most of the common swears. Then I am sure that they go on a case-by-case basis and deal with each report of cuss words on an account, with a person deciding whether it is worth acting on or not.

#8 Edited by Kaiserhawk (62 posts) -

Some companies do stuff like that. I remember Relic Entertainment's list of banned words for Dawn of War II (I think) was leaked. It was hilarious in its length and content.

EDIT: It was actually Space Marine. And here is the list

For obvious reasons, NSFW... although that entirely depends on where you work.

http://pastebin.com/xVaCNsje