So I was watching the WWE '13 Quick Look and I noticed the prompt they got, something like "hey, if any of your text contains words we deem inappropriate, we're going to censor them." That got me thinking: is there some kind of algorithm behind these filters, or is it literally some dude's job to sit down and think of every dirty and awful word he can for like 8 hours every day? You'd think it would have to be manual, right? There are too many weird slangy ways to get curse words through filters. What say ye?
It's most likely a manual job. There was an article a while ago about a certain position at Google where someone had to find all the horrible things on the internet so nobody else would have to. Link: http://gizmodo.com/5936572/the-worst-job-at-google-a-year-of-watching-beastiality-child-pornography-and-other-terrible-internet-things
There's no algorithm because words are always being added. You could do algorithms for specific words. Make something up that checks variations of "fuck" like "fuk" and "fuuuuuuck".
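To make the "check variations of one word" idea concrete, here's a rough sketch in Python. It's only an illustration, not how any real game does it: the function names are made up, and "heck" is a mild stand-in for whatever word you actually want to catch. It builds a regex where every letter can be stretched out, and letters you mark as optional can be dropped.

```python
import re

def build_variant_pattern(word, optional=""):
    """Compile a regex that matches `word` with stretched letters.

    Every letter may repeat any number of times. Letters listed in
    `optional` may also be left out entirely (hypothetical helper).
    """
    parts = []
    for ch in word:
        quantifier = "*" if ch in optional else "+"
        parts.append(re.escape(ch) + quantifier)
    # \b keeps the match to whole words, so "check" is not caught.
    return re.compile(r"\b" + "".join(parts) + r"\b", re.IGNORECASE)

# "heck" is a stand-in word; the "c" is allowed to go missing.
PATTERN = build_variant_pattern("heck", optional="c")

def contains_variant(text):
    """Return True if the text contains the word or a stretched variant."""
    return PATTERN.search(text) is not None
```

So contains_variant("what the hek") and contains_variant("HEEEEEECK no") both come back True, while "check the heckler" gets through. It still won't catch things like swapped-in digits or symbols, which is why you'd end up writing one of these per word and patching it forever.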
But it's really simple to just keep a dictionary and continually update it, and it's also super easy to run some sort of database query to see if any existing content has the new words you added. You then make the business decision of either removing the content or flagging it for someone to deal with.
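Here's roughly what that dictionary-plus-query approach could look like, as a minimal sketch using Python's built-in sqlite3 module. The posts table, its columns, the sample rows and the placeholder words are all assumptions made up for the example. It does a plain case-insensitive substring check, which is the simplest thing that works.

```python
import sqlite3

# In-memory database with a made-up "posts" table for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts ("
    "id INTEGER PRIMARY KEY, body TEXT, flagged INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO posts (body) VALUES (?)",
    [
        ("Nice match last night",),
        ("That ref was a total Dingus",),
        ("gg everyone",),
    ],
)

# The "dictionary": placeholder words standing in for a real list.
banned_words = {"doofus"}

def flag_existing_posts(conn, words):
    """Flag unflagged posts containing any banned word; return their ids."""
    flagged_ids = []
    for word in sorted(words):
        rows = conn.execute(
            "SELECT id FROM posts "
            "WHERE flagged = 0 AND instr(lower(body), ?) > 0",
            (word.lower(),),
        ).fetchall()
        ids = [row[0] for row in rows]
        conn.executemany(
            "UPDATE posts SET flagged = 1 WHERE id = ?",
            [(post_id,) for post_id in ids],
        )
        flagged_ids.extend(ids)
    return flagged_ids

# Someone adds a new word to the list, then re-scans existing content.
banned_words.add("dingus")
newly_flagged = flag_existing_posts(conn, banned_words)
```

This one flags the content for a human to look at instead of deleting it. Keep in mind that a substring match also hits innocent words that happen to contain a banned one, so some kind of manual review is probably still needed.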
You can find dictionary text files out there for all sorts of purposes. Shouldn't be hard to find one for profanity/vulgar language.
Here is the full list of filtered words for the soon-to-be-closed MMO City of Heroes. It's definitely manual.
Some companies do stuff like that. I remember Relic Entertainment's list of banned words for Dawn of War II (I think) was leaked. It was hilarious in its length and content.
EDIT: It was actually Space Marine. And here is the list
For obvious reasons, NSFW... although that entirely depends on where you work.