Comment Spam

Bluerghh, comment spam. I seem to be getting a fair bit of spam recently, and I’m never sure if it’s some clever bot or a human.
Some comments are obvious spam: someone whose mother called them ‘lean mean griller’ was obviously destined to have a website selling grillers. This is spam, simples. It’s a bit dumb, though: why would my site have any authority on that at all? Someone’s just wasting their own time as well as mine.

Often, though, the content seems to be generated by a machine that picks keywords out of the post. I was reading about Markov chains, which sound like the way to go if you were trying to generate human-sounding content. Perhaps if I read a bit more I can reverse engineer them as a sort of spam detector.
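
For anyone wondering what a Markov chain text generator looks like, here is a minimal Python sketch. It is only a guess at the general technique, not a claim about how any real spam bot works: the function names `build_chain` and `generate` and the sample text are made up for illustration. It records which words follow which in some source text, then walks those links at random.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words seen directly after it."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start, length, rng=None):
    """Walk the chain from `start`, picking each next word at random."""
    rng = rng or random.Random()
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: nothing ever followed this word
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


# Made-up sample text standing in for scraped comments or post keywords.
sample = "great post about spam great site about grillers great post about bots"
chain = build_chain(sample)
print(generate(chain, "great", 8, random.Random(0)))
```

Every pair of neighbouring words in the output appeared somewhere in the source text, which is why the result can sound almost human while meaning nothing.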

Just as often, though, I wonder if the commenter is actually a human who is simply writing in their second language. So they phrase things a bit oddly, which is fine and dandy; I’m sure I would too if I was trying to comment on someone’s blog in something other than my first language.

So I try to be positive towards them. I know some people just delete all comments, but I’m happy to talk to anyone, just not bots or people whose parents named them after SEO terms, products or dating sites (1).

Anyway, I’ve written a really simple WordPress plugin to show the percentage of comments that are Pending, Approved, Spam or Trash. Its output will update as more comments come in. I cleared a load of trashed ones out, so the results seem a bit off at the moment.
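
The plugin code itself isn't shown here, but the sum behind it is simple enough to sketch. The Python below is only an illustration of the percentage calculation: the function name `status_percentages` and the counts are invented, and a real plugin would get its counts from WordPress.

```python
def status_percentages(counts):
    """Turn per-status comment counts into percentages of the total."""
    total = sum(counts.values())
    if total == 0:  # no comments yet, so avoid dividing by zero
        return {status: 0.0 for status in counts}
    return {status: round(100 * n / total, 1) for status, n in counts.items()}


# Invented numbers purely for illustration.
counts = {"Pending": 2, "Approved": 30, "Spam": 60, "Trash": 8}
for status, pct in status_percentages(counts).items():
    print(f"{status}: {pct}%")
```

Emptying the trash shrinks the total, which pushes up the share of every other status. That would explain results looking a bit off after a clear-out.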


Incidentally, it will be quite fun to see how much spam I get on this post :D. How smart are the bots?

PS: if you want a name for your child, here are some neat ones. For boys: Ichabod, Octavius, Titus, Elastic, Borin. For girls: Cornelia, Precious.