Friday, 22 June 2012

reCAPTCHA - what is it?

reCAPTCHA

Before we look at reCAPTCHA, let us first understand "CAPTCHA".

CAPTCHA

CAPTCHA is a term that was introduced in 2000 by computer scientists of Carnegie Mellon University.

The name is a play on the English word 'capture'.

CAPTCHA, in simple terms, is a method or test used by websites to ensure that visitors to that website are 'real' human beings and not computer programs (also called 'bots').  

Technically, CAPTCHA is an acronym that stands for "Completely Automated Public Turing test to tell Computers and Humans Apart".

Why do websites need to ensure visitors are 'real' human beings?

It is possible to write computer programs that run against websites and, without stopping, keep registering fake users or posting unwanted comments.  Such unwanted comments are called "spam".  Computer programs that perform tasks automatically (following the algorithm written for them) are called "bots".

Therefore, if these computer programs are allowed to run without any check, many websites will suffer from overloaded infrastructure (servers may get jammed), and genuine users will be inconvenienced.

How do websites ensure visitors are 'real' human beings?

By requiring visitors to view jumbled / mangled / imperfectly drawn characters and then key them in using their keyboards.

Humans can usually decipher the jumbled / mangled characters, whereas computer programs find them difficult to recognize.
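To make the idea concrete, here is a minimal sketch in Python of the two steps a website performs: creating a random challenge and checking what the visitor typed. The function names, the 6-character length and the choice of characters are assumptions made for illustration, and the step of drawing the characters as a distorted image is left out.

```python
import hmac
import secrets
import string

# Leave out characters that are easy to confuse (0/O, 1/I); an illustrative choice.
ALPHABET = "".join(
    c for c in string.ascii_uppercase + string.digits if c not in "0O1I"
)


def new_challenge(length=6):
    """Return a random string that the website would draw as a distorted image."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def check_answer(expected, typed):
    """Compare the visitor's input with the stored answer, ignoring case and outer spaces."""
    normalized = typed.strip().upper()
    return hmac.compare_digest(expected.encode(), normalized.encode())


challenge = new_challenge()
print(check_answer(challenge, challenge.lower()))  # a correct answer passes
print(check_answer(challenge, "wrong"))            # an incorrect answer fails
```

In this sketch the website keeps the expected answer on its own side and shows the visitor only the image, so a program cannot simply read the answer from the page.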

Here is an example of where CAPTCHA is used:

If you need to reset your account password, for example, the Google website may ask you to view a CAPTCHA and then key it in from your keyboard.  Only if you key it in correctly (which indicates you are a real human being) will you be allowed to proceed further.




Many websites use CAPTCHA right at the registration stage to make sure only real human beings register and create profiles.

Types of CAPTCHA

Early CAPTCHA

Yahoo first used these types of CAPTCHA.  However, they had to be discontinued once computer programs that could actually read them and key them in were written ;-)



Modern CAPTCHA

Modern CAPTCHAs not only have mangled characters but also add angled lines, making it difficult for non-humans to read them.



Another modern variation of CAPTCHA, which Yahoo currently uses, crowds the characters together so much that a computer program cannot separate them and read them:



Image CAPTCHA

Researchers are now also recommending alternatives to the 'text' based CAPTCHAs above.

Images combined with text are a good alternative CAPTCHA:



Now that we know what a CAPTCHA is, let us move on to reCAPTCHA.

reCAPTCHA

reCAPTCHA is a service that uses CAPTCHA.

Popular uses of reCAPTCHA:

(1) Filtering visitors to avoid spam.  This is called spam filtering.  A reCAPTCHA program can be run on the website for this purpose.

(2) reCAPTCHA has also been very useful in digitizing (converting to an electronic format for use in computers and gadgets) old textbooks, newspapers and other historic materials.  Such conversion is first attempted with a technology called OCR (Optical Character Recognition).  The old book is scanned, and a computer algorithm converts the scanned images into words.  During this conversion, OCR often cannot recognize some characters / words.

reCAPTCHA takes the words that computers cannot read and presents them to website visitors so that human beings can decipher them and key them in. Thus, each word that cannot be read correctly by the computer algorithm is captured as an image and shown as a CAPTCHA to a human being.
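Since any single person may mistype a word, a natural way to use the human answers is to show the same word image to several people and keep the answer most of them agree on. The sketch below illustrates that idea in Python. It is not a description of how the reCAPTCHA service itself is implemented: the function name and both thresholds are assumptions, and answers are lowercased for simplicity.

```python
from collections import Counter


def accept_transcription(answers, min_votes=3, min_share=0.6):
    """Return the agreed word, or None if the typed answers do not agree strongly enough."""
    # Ignore empty answers and differences in case or surrounding spaces.
    cleaned = [a.strip().lower() for a in answers if a.strip()]
    if len(cleaned) < min_votes:
        return None  # too few people have seen this word so far
    word, count = Counter(cleaned).most_common(1)[0]
    return word if count / len(cleaned) >= min_share else None


print(accept_transcription(["morning", "Morning", "morning ", "mourning"]))  # morning
print(accept_transcription(["upon", "up0n"]))  # None
```

In the first call three of four people agree, so the word is accepted. In the second call only two people have answered, so the word would be shown to more visitors before a decision is made.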

Below is an example of how a reCAPTCHA service / program has been deployed on a website:



How could you, if you were an actor, use reCAPTCHA?

1.  If you own a blog, would you want to make sure the comments on your blog posts are by real human beings? Yes.

You can do it by making sure that, before each comment is accepted on your blog, the person attempting to post it fills in a CAPTCHA.

2.  Do you have a literary background and valuable literary texts from your ancestors? Would you want to convert them to an electronic form so that your descendants can keep and read them? You can convert old texts and literary works to an electronic form (digitizing) using reCAPTCHA.

3.  Do you have old, bound movie scripts that you want to preserve in an electronic form?  You can do it using reCAPTCHA.
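For the first use above (protecting blog comments), the logic on the blog's side can be as small as the Python sketch below. The names submit_comment and demo_verify and the sample answer are hypothetical. On a real blog, the verify step would ask whichever CAPTCHA service the blog uses instead of comparing against a fixed string.

```python
def submit_comment(comments, text, typed_answer, verify):
    """Store the comment only if verify(typed_answer) reports a human; return True if stored."""
    if not text.strip():
        return False  # ignore empty comments
    if not verify(typed_answer):
        return False  # CAPTCHA failed: reject without storing
    comments.append(text.strip())
    return True


# A stand-in verifier for demonstration only (hypothetical fixed answer).
def demo_verify(answer):
    return answer == "X7KQ2M"


comments = []
submit_comment(comments, "Great post!", "X7KQ2M", demo_verify)
submit_comment(comments, "Buy cheap pills", "", demo_verify)
print(comments)  # ['Great post!']
```

The comment is saved only after the check passes, so a bot that skips the CAPTCHA never gets its text onto the blog.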

With this, we come to the end of this introductory capsule on reCAPTCHA.

Thanks for taking the time.

Questions?

You can post your comments / questions below in the comments box.

Alternatively, you could also send your questions / doubts to phantomdelight@gmail.com