Wearing long shorts since 1983.
25 May
On my travels, I came across Ben Maurer’s blog detailing his new project, reCAPTCHA, which helps fight blog/website spam and also helps digitize old books that OCR can’t accurately read.
As Ben says:
You might notice that reCAPTCHA has two words. Why? reCAPTCHA is more than a CAPTCHA, it also helps to digitize old books. One of the words in reCAPTCHA is a word that the computer knows what it is, much like a normal CAPTCHA. However, the other word is a word that the computer can’t read. When you solve a reCAPTCHA, we not only check that you are a human, but use the result on the other word to help read the book!
Luis von Ahn and myself estimated that about 60 million CAPTCHAs are solved every day. Assuming that each CAPTCHA takes 10 seconds to solve, this is over 160,000 human hours per day (that’s about 19 years). Harnessing even a fraction of this time for reading books will greatly help efforts in digitalizing books.
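A quick sanity check of those numbers, using only the two figures from the quote (60 million CAPTCHAs a day, 10 seconds each). The variable names are mine, and I am assuming a 365-day year:

```python
# Figures taken from the quote above.
CAPTCHAS_PER_DAY = 60_000_000
SECONDS_PER_CAPTCHA = 10

total_seconds = CAPTCHAS_PER_DAY * SECONDS_PER_CAPTCHA
human_hours = total_seconds / 3600        # seconds -> hours
human_years = human_hours / (24 * 365)    # hours -> years (365-day year assumed)

print(round(human_hours), round(human_years, 1))  # prints: 166667 19.0
```

So "over 160,000 human hours" and "about 19 years" both hold up.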
I think it’s a great idea. If Akismet weren’t so good, I would implement this on the comments system here.
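For the curious, here is a rough sketch of how the two-word check Ben describes might work. This is not reCAPTCHA's actual code or API: the function name `check_recaptcha`, the sample words, and the idea of tallying readings in a vote counter are all my own assumptions, since the quote only says the answer for the unknown word is used to help read the book.

```python
from collections import Counter

def check_recaptcha(control_answer, control_guess, unknown_guess, votes):
    """Hypothetical check: pass the user if the known (control) word matches,
    and only then record their reading of the word the computer can't read."""
    passed = control_guess.strip().lower() == control_answer.strip().lower()
    if passed:
        # Assumption: readings of the unknown word are tallied as votes.
        votes[unknown_guess.strip().lower()] += 1
    return passed

votes = Counter()
check_recaptcha("morning", "Morning", "overlooks", votes)   # passes, vote counted
check_recaptcha("morning", "mourning", "overlocks", votes)  # fails, vote ignored
check_recaptcha("morning", "morning ", "overlooks", votes)  # passes, vote counted

best_guess, count = votes.most_common(1)[0]
print(best_guess, count)  # prints: overlooks 2
```

The point of the control word is that anyone who gets it right is probably reading the other word correctly too, so their answer is worth keeping.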
Edit: Did you know that CAPTCHA is an acronym for “Completely Automated Public Turing test to tell Computers and Humans Apart”? I didn’t!