On the value of informal votes on the internet: my bias is bigger than yours

2 minute read

Have you ever seen someone working on project (or company) A ask people to vote for A in an internet-driven vote? It always makes me feel awkward when someone does this. I rarely vote, even if I *do* like the project. Of course, I ask for votes even less frequently (trying to be consistent here ;) ). To me, it is like cheating. I'm not sure where this feeling comes from; very likely it's how I was raised. I want to win in a fair race :)

Which leads to a second problem. Should we trust votes on the internet?

They come in diverse forms:

  • vote for a technology in an award contest
  • up vote a link on DZone or Digg
  • up vote an answer on StackOverflow (awesome site BTW, I wish all support forums were like that)

In an unbiased environment, the vote resolves itself organically within a given community, and so comes closer to the result you would get by polling the full targeted corpus. But if one of the contenders reaches out to his community and asks it to vote for his product, the vote no longer reflects the natural interest in the product but rather the power the contender has over his community (like a bot master sending a command for all the bot slaves to vote). Let's be a bit less cynical and call this marketing power. All in all, this is not the result we are interested in.

Even when no competition is at stake, bias is right around the corner. How many times have you had to argue about the merits of a vote result gathered from a biased community, only to get the marketing guy answering "I know, but that's the best we can do, and we will treat these results essentially at face value"? I'm pushing here but you get the idea.

A better voting system would:

  • properly define the corpus (community) targeted
  • ignore votes from outside the corpus (e.g. Grandma voting for Hibernate OGM because her grandson worked on the project)
  • inform the whole corpus of the vote at stake
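
To make the first two criteria concrete, here is a minimal sketch of a tally that only counts votes from members of the corpus. The function name, the voter names, and the one-vote-per-voter rule are my own assumptions for illustration, not a description of any existing voting site.

```python
from collections import Counter


def tally_in_corpus(votes, corpus):
    """Count votes, ignoring voters outside the corpus.

    votes  -- iterable of (voter_id, choice) pairs, in arrival order
    corpus -- set of voter ids that belong to the targeted community

    Assumption: each voter counts once; later votes by the same voter are dropped.
    """
    seen = set()
    counts = Counter()
    for voter, choice in votes:
        if voter not in corpus or voter in seen:
            continue  # outsider or repeat vote: ignored
        seen.add(voter)
        counts[choice] += 1
    return counts


# Hypothetical data: three community members and one outsider.
corpus = {"alice", "bob", "carol"}
votes = [
    ("alice", "Hibernate OGM"),
    ("grandma", "Hibernate OGM"),  # not in the corpus, so not counted
    ("bob", "Other project"),
    ("alice", "Hibernate OGM"),    # alice already voted, so not counted
]
result = tally_in_corpus(votes, corpus)
```

The third criterion (informing the whole corpus) is a communication problem rather than a counting one, so it is not modeled here.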

StackOverflow's voting system comes quite close to these criteria, if you consider that most people monitor questions under a given set of tags and that a tag is quite close to a corpus (community). Also, the sheer number of questions makes it less likely that someone will (ab)use the power of their crowd, at least on average. I'm sure, though, that questions of the style "what's the best product for..." are biased.

I'm not sure how to properly define a system that would ensure such organic voting (let's call it the green vote), but that would be a nice little project. A few ideas:

  • people register to the voting system
  • they list the communities they belong to (Java, Hibernate, Programmers etc)
  • they are not authorized to vote for x weeks after registering (to avoid people signing up just to vote on a "hot question")
  • questions on a given subject targeting a community are run through a random subset of the valid community members
  • people who do not vote on enough questions are excluded from the community (is that a necessary rule?)
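
The ideas above could be sketched roughly as follows. This is only a toy model under stated assumptions: the `Member` class, the function names, the sample members, and the four-week waiting period (standing in for "x weeks") are all made up for illustration, and the exclusion rule for inactive voters is left out.

```python
import random
from dataclasses import dataclass, field
from datetime import date, timedelta

# Stands in for the "x weeks" above; 4 is an arbitrary choice.
WAITING_PERIOD = timedelta(weeks=4)


@dataclass
class Member:
    name: str
    registered_on: date
    communities: set = field(default_factory=set)


def eligible_voters(members, community, today):
    """Members of the community whose waiting period has elapsed."""
    return [
        m for m in members
        if community in m.communities
        and today - m.registered_on >= WAITING_PERIOD
    ]


def pick_panel(members, community, today, size, rng=random):
    """Draw a random subset of eligible members who will receive the question."""
    pool = eligible_voters(members, community, today)
    return rng.sample(pool, min(size, len(pool)))


# Hypothetical members and dates.
today = date(2024, 6, 1)
members = [
    Member("ann", date(2024, 1, 10), {"Java", "Hibernate"}),
    Member("ben", date(2024, 5, 25), {"Hibernate"}),  # registered too recently
    Member("cho", date(2024, 2, 3), {"Java"}),        # not in the Hibernate community
    Member("dan", date(2023, 11, 30), {"Hibernate", "Programmers"}),
]
panel = pick_panel(members, "Hibernate", today, size=2, rng=random.Random(42))
```

Because the panel is drawn at random by the system, a contender cannot choose who gets asked, which is the whole point: rallying your own crowd only helps if they happen to be sampled.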

Such a system would have a few interesting properties:

  • it would be mainly unbiased
  • results would be comparable over time (i.e. one could see how a community's view of a subject evolves)


Note that a binary vote (voting for one candidate/solution only) is less than optimal in itself, but that's another topic.