Humans solve about 60 million CAPTCHAs a day. A new project called reCAPTCHA aims to put this human computing power to good use: digitizing books. Especially cool is the reCAPTCHA Mailhide service. Type your email into a textbox and it generates HTML code that uses the reCAPTCHA servers to protect your email address. The end result looks like this:
mcmi...@cs.cmu.edu
How's it work? reCAPTCHA helps digitize books by taking words that OCR software cannot read and sending them to the Web as CAPTCHAs for humans to decipher. Each challenge presents a pair of distorted English words. One is a word that OCR could not read correctly; the answer to the other is already known. The user is asked to transcribe both. If they get the known word right, the system assumes their answer for the unknown word is correct too. reCAPTCHA was developed by Luis von Ahn, a professor at Carnegie Mellon who was one of the inventors of the original CAPTCHA idea.
(Disclaimer: I know Luis well and will be working with him this summer.)
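For the curious, here is a rough sketch of the two-word check described above. It is not reCAPTCHA's actual code: the names (Challenge, check_response, the word IDs) are made up for illustration, and the case-insensitive comparison is my own assumption.

```python
from dataclasses import dataclass
from typing import Dict


def normalize(text: str) -> str:
    # Assumption: answers are compared ignoring case and surrounding spaces.
    return text.strip().lower()


@dataclass
class Challenge:
    control_id: str      # hypothetical ID of the word whose answer is known
    control_answer: str  # the known answer for that word
    unknown_id: str      # hypothetical ID of the word OCR could not read


def check_response(
    challenge: Challenge,
    control_guess: str,
    unknown_guess: str,
    transcriptions: Dict[str, str],
) -> bool:
    """Return True if the user passes; record their guess for the unknown word."""
    if normalize(control_guess) != normalize(challenge.control_answer):
        # Known word is wrong, so the unknown-word guess is not trusted.
        return False
    # Known word is right, so the other answer is assumed to be correct too.
    transcriptions[challenge.unknown_id] = normalize(unknown_guess)
    return True


# Usage: one challenge, one passing attempt and one failing attempt.
transcriptions: Dict[str, str] = {}
challenge = Challenge(control_id="w1", control_answer="morning", unknown_id="w2")
passed = check_response(challenge, "Morning", "overlooks", transcriptions)
failed = check_response(challenge, "evening", "whatever", transcriptions)
```

After these two calls, only the passing attempt has contributed a transcription for the unknown word; the failing attempt leaves the stored answers untouched.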