You’ve seen CAPTCHA images around: you know, those images with numbers or letters in them that you have to type in correctly before you’re allowed to make a post or submit a form. Well, reCAPTCHA is a free project (reCAPTCHA website) by the School of Computer Science at Carnegie Mellon University, and it aims to stop spam and preserve literature at the same time.
It gets regular web users like you and me to help digitize the text of books. ReCAPTCHA takes scanned words that optical character recognition (OCR) software has been unable to read and presents them as CAPTCHA words for humans to decipher.
ReCAPTCHA relies on a critical assumption, though. Since the new word is one that OCR software has been unable to read, you’ll also be presented with another word that has already been identified. If you correctly recognise the identified word, reCAPTCHA assumes that you are also correct about the new word.
What I’m unsure about is this: what about users who correctly recognise the identified word but read the other word wrongly?
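To make the assumption (and my worry about it) concrete, here’s a rough Python sketch of how I imagine the check working. This is not reCAPTCHA’s actual code or API: the names `Challenge`, `GuessLog`, `check_answer`, and the word ID `scan-001` are all made up for illustration, and the only logic in it is the one described above, namely that the already-identified word decides whether you pass and your reading of the other word is simply taken on trust.

```python
from dataclasses import dataclass, field


@dataclass
class Challenge:
    """A pair of words shown to the user (hypothetical structure)."""
    control_word: str      # the word that has already been identified
    unknown_word_id: str   # made-up ID for the scanned word OCR could not read


@dataclass
class GuessLog:
    """Stores every transcription accepted for each unread word."""
    guesses: dict[str, list[str]] = field(default_factory=dict)

    def record(self, word_id: str, guess: str) -> None:
        self.guesses.setdefault(word_id, []).append(guess)


def check_answer(challenge: Challenge, control_answer: str,
                 unknown_answer: str, log: GuessLog) -> bool:
    """Pass or fail the user based only on the already-identified word."""
    if control_answer.strip().lower() != challenge.control_word.lower():
        # Wrong on the known word: reject and discard the other guess.
        return False
    # Right on the known word: assume the other guess is right too.
    log.record(challenge.unknown_word_id, unknown_answer.strip())
    return True


log = GuessLog()
challenge = Challenge(control_word="morning", unknown_word_id="scan-001")

check_answer(challenge, "morning", "upon", log)    # passes, "upon" is recorded
check_answer(challenge, "mourning", "upon", log)   # fails, nothing is recorded
check_answer(challenge, "morning", "upom", log)    # passes, the misreading is recorded too
```

The last call is exactly the case I’m wondering about: under this simple model, a user who gets the known word right but misreads the other one still passes, and the wrong reading is stored alongside the right one.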