What are CAPTCHAs and why are they getting tougher to solve?
CAPTCHAs (short for "Completely Automated Public Turing Test to tell Computers and Humans Apart") are generated using a combination of techniques, and AI plays a significant role in both generating and solving them. Read more about them in this article written by AI Club Research Member Abhav Bhanot.
The Story
Alan Turing pioneered machine learning during the 1940s and 1950s. Turing introduced the "Turing test" in his 1950 paper called "Computing Machinery and Intelligence" while at the University of Manchester.
In his paper, Turing proposed a twist on what is called "The Imitation Game." The original game involves no artificial intelligence. It has three human players: a man, a woman, and a judge of either sex, who is kept apart from the other two and communicates with them only through typed text. The judge tries to determine which is which, while the man tries to mislead the judge and the woman tries to help. Turing modifies the game so that it is played by an AI, a human, and a human questioner. The questioner is then responsible for deciding which is the AI and which is the human.
The influence of the Turing Test on CAPTCHAs can be observed in how it underscored the necessity of distinguishing between humans and computers in online interactions. CAPTCHAs were subsequently introduced as a practical solution to this challenge, primarily focusing on evaluating whether a user possesses human-like visual and pattern recognition abilities.
How are they generated?
Here's how they are typically generated and why they are becoming more challenging for humans:
Random Generation: Many CAPTCHAs are generated by using random characters, numbers, or a combination of both. This randomness ensures that the CAPTCHA is unique and not easily predictable.
Distortion: CAPTCHAs often distort the characters or numbers to make it difficult for automated bots to recognize them. This distortion can include warping, stretching, or twisting the characters.
Background Noise: To further confuse automated systems, CAPTCHAs may include background noise or patterns that make it harder for optical character recognition (OCR) software to extract the characters.
Variability: CAPTCHAs can vary in complexity. Some may be simple and use standard fonts, while others can use more complex fonts or even images of text.
Puzzle-Driven: Certain CAPTCHAs require users to engage in solving puzzles or performing particular actions, like identifying and selecting images containing specific objects (e.g., "pick out all the pictures featuring fire hydrants").
Implementations in Python
Now, to understand the generation of CAPTCHAs, we'll construct a basic text-based CAPTCHA system in Python. We chose Python due to its extensive use in the fields of machine learning and artificial intelligence.
The example below generates a random CAPTCHA string consisting of letters (both uppercase and lowercase) and digits. The user is asked to enter the CAPTCHA, and we check whether the input matches the generated string. Real CAPTCHAs can be far more complex, but this basic example shows the inner workings behind their generation.
Code:
import random
import string

# Function to generate a random CAPTCHA string
def generate_captcha_string(length=6):
    characters = string.ascii_letters + string.digits
    captcha_string = ''.join(random.choice(characters) for _ in range(length))
    return captcha_string

# Generate a random CAPTCHA string
captcha_text = generate_captcha_string()
print("Generated CAPTCHA:", captcha_text)

# Simulate user input
user_input = input("Enter the CAPTCHA: ")

# Check if the user's input matches the generated CAPTCHA
if user_input == captcha_text:
    print("CAPTCHA is correct!")
else:
    print("CAPTCHA is incorrect.")
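One caveat about the example above: Python's documentation advises against using the random module for security purposes, because its output can be predicted. A minimal sketch of the same idea using the standard-library secrets module might look like the following. The function names, the ALPHABET constant, and the decision to drop look-alike characters such as 0 and O are assumptions made for illustration, not part of the original example.

```python
import hmac
import secrets
import string

# Assumption: characters that are easy to confuse once rendered are left out.
AMBIGUOUS = set("0O1lI")
ALPHABET = "".join(
    c for c in string.ascii_letters + string.digits if c not in AMBIGUOUS
)

def generate_secure_captcha(length=6):
    """Build a CAPTCHA string from a cryptographically strong random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def captcha_matches(expected, provided):
    """Compare the stored CAPTCHA with the user's answer.

    Surrounding whitespace in the answer is ignored. compare_digest is used
    so the comparison time does not depend on where the strings differ.
    """
    return hmac.compare_digest(
        expected.encode("utf-8"), provided.strip().encode("utf-8")
    )
```

In a real website the expected string would stay on the server and only an image of it would be sent to the browser, so printing it next to the input prompt is suitable for a demonstration only.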
Algorithm
To make CAPTCHAs more secure and complex, you can apply techniques like distortion, noise, and background images. Additionally, you can use libraries like Pillow (PIL) for image manipulation to create image-based CAPTCHAs, or use external CAPTCHA libraries to generate and validate CAPTCHAs more efficiently.
The basic algorithm behind CAPTCHA-generating code stays the same: pick characters at random from a fixed set, show the result to the user, and compare it with what the user types:
# Python program to automatically generate CAPTCHA and verify user
import random

# Returns True if the two given strings are the same
def checkCaptcha(captcha, user_captcha):
    return captcha == user_captcha

# Generates a CAPTCHA of given length
def generateCaptcha(n):
    # Characters to be included
    chrs = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
    # Pick n characters from the set above, each with equal probability,
    # and add them to the captcha.
    captcha = ""
    for _ in range(n):
        captcha += random.choice(chrs)
    return captcha

# Driver code
# Generate a random CAPTCHA
captcha = generateCaptcha(9)
print(captcha)

# Ask user to enter a CAPTCHA
print("Enter above CAPTCHA:")
usr_captcha = input()

# Notify user about matching status
if checkCaptcha(captcha, usr_captcha):
    print("CAPTCHA Matched")
else:
    print("CAPTCHA Not Matched")
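The distortion and background-noise techniques described earlier can be sketched with Pillow, the image library mentioned above. This is a rough illustration under stated assumptions rather than a production design: the function name render_captcha, its parameters, and all sizes and noise counts are hypothetical choices, and Pillow's small built-in default font is used so that no font file is required.

```python
import random
from PIL import Image, ImageDraw, ImageFont  # third-party: pip install Pillow

def render_captcha(text, cell=40, height=60, noise_lines=5, noise_dots=150, rng=None):
    """Render text as a grayscale image with rotated characters and noise."""
    rng = rng or random.Random()
    font = ImageFont.load_default()
    width = cell * len(text)
    image = Image.new("L", (width, height), 255)  # white 8-bit grayscale canvas

    # Distortion: draw each character on its own tile, rotate the tile by a
    # random angle, and paste it at a random vertical offset.
    for i, ch in enumerate(text):
        tile = Image.new("L", (cell, cell), 255)
        ImageDraw.Draw(tile).text((cell // 3, cell // 3), ch, font=font, fill=0)
        tile = tile.rotate(rng.uniform(-30, 30), fillcolor=255)
        image.paste(tile, (i * cell, rng.randint(0, height - cell)))

    # Background noise: random lines and dots layered over the text.
    draw = ImageDraw.Draw(image)
    for _ in range(noise_lines):
        start = (rng.randint(0, width - 1), rng.randint(0, height - 1))
        end = (rng.randint(0, width - 1), rng.randint(0, height - 1))
        draw.line([start, end], fill=rng.randint(0, 128), width=1)
    for _ in range(noise_dots):
        draw.point((rng.randint(0, width - 1), rng.randint(0, height - 1)), fill=0)
    return image
```

Calling render_captcha("aB3xZ9").save("captcha.png") would write the image to disk. Passing a seeded random.Random as rng makes the output repeatable, which is convenient for testing.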
AI's Role in CAPTCHAs
AI plays a dual role in CAPTCHAs: it is used both to generate them and to solve them.
Why CAPTCHAs Are Getting Tougher
CAPTCHAs are evolving to become more challenging for humans due to the ongoing arms race between developers trying to protect websites from bots and the AI technologies used by malicious actors. Some reasons for this trend include:
The End Result of the AI Boom
The ongoing advancement of AI has significant implications for CAPTCHAs and internet security in general. The end result could include:
In summary, CAPTCHAs are evolving to stay ahead of AI advancements, but the battle between AI technologies and security measures will likely continue. The end result will likely involve more sophisticated and diverse methods of online verification.