Shading to indicate reliability of Wikipedia entries

One of the bugbears of community-supplied information is gauging its credibility. Unlike a traditional outlet (say, a newspaper) that verifies information before it prints or airs it, a community site can accept submissions from members that are not very accurate.

A researcher at the WikiLab at the University of California, Santa Cruz has come up with a system for giving readers of such community sites a better indication of the veracity of the information. The system uses an algorithm that evaluates how long an author's other entries survived in Wikipedia without being re-edited and how well vetted those entries were by trusted readers. From this it generates a reliability measure for that author. It then assigns that reliability, or lack of reliability, to the entries he or she made on other pages.
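To make the idea concrete, here is a minimal sketch of how an author score might be derived from edit survival and vetting. It is not de Alfaro's actual algorithm: the `Edit` record, the 30-day `horizon_days` cutoff, the `vetted_bonus` value and the simple averaging are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Edit:
    author: str
    survived_days: float  # how long the edit lasted before being re-edited
    vetted: bool          # whether a trusted reader reviewed it and left it in place


def author_reliability(edits, horizon_days=30.0, vetted_bonus=0.2):
    """Return a hypothetical reliability score in [0, 1] for each author.

    Each edit earns credit proportional to how long it survived (capped at
    horizon_days), plus a small bonus if a trusted reader vetted it. An
    author's score is the average over all of that author's edits.
    """
    scores = {}
    for edit in edits:
        survival = min(edit.survived_days / horizon_days, 1.0)
        bonus = vetted_bonus if edit.vetted else 0.0
        scores.setdefault(edit.author, []).append(min(survival + bonus, 1.0))
    return {author: sum(vals) / len(vals) for author, vals in scores.items()}


history = [
    Edit("alice", survived_days=30, vetted=True),
    Edit("alice", survived_days=15, vetted=False),
    Edit("bob", survived_days=3, vetted=False),
]
reliability = author_reliability(history)
```

In this toy history, "alice" ends up with a much higher score than "bob", and that score would then be attached to whatever text each of them contributes elsewhere.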

Entries judged to be more questionable are shown on an orange-highlighted background; more reliable entries are shown on a white background. The darker the hue, the more dubious the content.
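The shading itself could be sketched as a simple interpolation between white and orange. The function name `shade_for_trust`, the 0-to-1 trust scale and the particular orange used here are assumptions for illustration, not details taken from the demo.

```python
def shade_for_trust(trust):
    """Map a trust score in [0, 1] to a CSS hex background color.

    A trust of 1.0 gives plain white; a trust of 0.0 gives full orange.
    Values in between are blended linearly, so less trusted text gets a
    deeper orange background.
    """
    trust = max(0.0, min(1.0, trust))  # clamp out-of-range scores
    white = (255, 255, 255)
    orange = (255, 165, 0)  # assumed "most dubious" color
    rgb = tuple(round(o + (w - o) * trust) for w, o in zip(white, orange))
    return "#{:02x}{:02x}{:02x}".format(*rgb)
```

For example, `shade_for_trust(1.0)` yields `#ffffff` and `shade_for_trust(0.0)` yields `#ffa500`.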

So far, computer scientist Luca de Alfaro has compiled only several hundred Wikipedia pages into a demo.

Another interesting feature is that unknown authors can gain or lose reliability relatively quickly once well-known Wikipedia users review their pages and either find the entries fine (raising the author’s trust score) or problematic (lowering the author’s score).

“The idea is very simple,” de Alfaro said. “If your contribution lasts, you gain reputation. If your contribution is reverted [to the previous version], your reputation falls.” De Alfaro will speak about his new program on August 4, 2007, at the Wikimania conference in Taipei, Taiwan.
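The gain-or-lose rule de Alfaro describes can be sketched as a small update step. The formula below is an assumption, not his published method: it nudges a score toward 1 when an edit survives and toward 0 when it is reverted, and scales the nudge by the reviewer's own reputation so that judgments from well-known users move the score faster.

```python
def update_reputation(reputation, survived, reviewer_reputation=1.0, step=0.1):
    """Return the author's new reputation after one edit is judged.

    reputation and reviewer_reputation are assumed to lie in [0, 1].
    step is a hypothetical learning rate controlling how fast scores move.
    """
    target = 1.0 if survived else 0.0
    weight = step * max(0.0, min(1.0, reviewer_reputation))
    return reputation + weight * (target - reputation)


newcomer = 0.5
after_kept = update_reputation(newcomer, survived=True)       # edit lasted
after_reverted = update_reputation(newcomer, survived=False)  # edit reverted
```

Under these assumptions a reviewer with no reputation leaves the score unchanged, while a fully trusted reviewer moves it by the full step.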

The researchers have tested the reliability scores against how long contemporary edits remain on Wikipedia. In sample runs, over 80% of flagged edits turned out to be wrong and were changed. [Other Wikipedia users corrected 60-70% of these edits relatively quickly, but presumably the value of the program lies in helping readers with the 30-40% of edits that are not quickly corrected and would otherwise dupe readers.]

It takes roughly a week for de Alfaro to evaluate Wikipedia’s seven-year edit history with his algorithm. While he is working from a distributed copy of the Wikipedia site, he says real-time adjustments could be made relatively easily.

They do not publish author rating scores, because they fear doing so would spark competitiveness (which is the opposite of the site’s culture) and would discourage infrequent but knowledgeable authors whose low score is a function of their low number of entries.

It may be easier to assign reliability scores on Wikipedia given the community editing function, but one could imagine using similar reliability measures (ones that just looked at how reliable other users found this user’s posts, or how many were discovered to be fraudulent or incorrect) to shade some of the community maps described in an earlier post, like the location of potholes or trail obstructions or the extent of flooding in England. [The darkness of the shade could show the reliability of the item.]

Trust has been a key feature of other community sites like Slashdot (which uses community ratings by moderators and metamoderators to gauge the value of a post and decide what gets aired) or eBay, which has rating scores on transactions.

Press release available here.


