The Future of Sketching: A Touchless Drawing Experience with Gesture Recognition

Abstract:

Advancements in Artificial Intelligence (AI), especially in computer vision, are reshaping research and creative processes. As the demand for seamless human-machine interaction increases, this paper introduces a Touchless User Interface (TUI): a system that interprets hand gestures, a rich and expressive form of communication, to control a computer without any physical contact with a keyboard, mouse, or screen. The application can sketch shapes such as circles, rectangles, and lines, or switch to freehand drawing and erasing. The system employs a convolutional neural network to segment and detect the human hand in real time against complex backgrounds.

Objective:

  • Develop a virtual canvas for sketching.
  • Utilize the human finger as a color marker in sketches.
  • Perform the necessary morphological operations.
  • Establish an interactive interface between the user and the system.


Existing System:

  1. The current system is restricted to finger inputs, with no support for additional tools like highlighters or paints.
  2. Isolating and identifying an object, such as a finger, from an RGB image without a depth sensor is challenging.
  3. Due to the absence of depth detection, tracking the vertical movements of the pen is not possible.

Proposed System:

In this project, a live video stream captured with OpenCV serves as the input. The system interprets hand gestures, identified through MediaPipe, to dictate the subsequent actions in the application. The output, comprising the structures drawn by the user, is displayed in real time.

Let’s start with the code section:

We will build a touchless sketching application using hand gesture recognition. Our input is a live video stream; we use the MediaPipe library for hand tracking and OpenCV for rendering and handling the video feed. The application recognizes different hand gestures to switch between various drawing tools. The entire code is in Python.

Let’s break down the code step by step:

Step 1: Import Necessary Libraries

We start by importing the necessary libraries:

import mediapipe as mp
import cv2
import numpy as np
import time        

  • mediapipe: provides the hand-tracking tools we need.
  • cv2: OpenCV (Open Source Computer Vision Library), used here to capture and render the video feed.
  • numpy: adds support for large, multi-dimensional arrays and matrices; we use it for the drawing mask.
  • time: used to create the dwell delay when selecting tools.
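If these libraries are not installed yet, they can be fetched from PyPI with pip (standard package names; no specific versions are pinned here):

pip install mediapipe opencv-python numpy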

Step 2: Initialize Variables and Constants

ml = 150                       # left margin (x-offset) of the toolbar
max_x, max_y = 250 + ml, 50    # bottom-right corner of the toolbar region
curr_tool = "select tool"      # currently active tool
time_init = True               # whether the selection timer needs (re)starting
rad = 40                       # radius of the shrinking selection indicator
var_inits = False              # whether a shape's starting point is recorded
thick = 4                      # stroke thickness
prevx, prevy = 0, 0            # previous fingertip coordinates

The comments above summarize each parameter; they will come into play in the main loop below.

Step 3: Define Helper Functions

  • getTool(x): Based on the fingertip’s x-coordinate inside the toolbar, this function determines which tool is selected: line, rectangle, freehand drawing, circle, or eraser.
  • index_raised(yi, y9): This function determines whether the drawing finger is raised by comparing the y-coordinates of two finger landmarks. Both helpers are shown below.
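Lightly reformatted from the complete code at the end of the article:

def getTool(x):
    # toolbar cells are 50 px wide, starting at the left margin ml
    if x < 50 + ml:
        return "line"
    elif x < 100 + ml:
        return "rectangle"
    elif x < 150 + ml:
        return "draw"
    elif x < 200 + ml:
        return "circle"
    else:
        return "erase"

def index_raised(yi, y9):
    # the finger counts as raised when its tip (yi) sits at least 40 px
    # above the reference knuckle (y9); image y grows downward
    return (y9 - yi) > 40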

Step 4: Prepare Hand Tracking Model

We initialize the hand tracking model: detection and tracking confidence thresholds of 0.6, and at most one hand tracked at a time.

hands = mp.solutions.hands
hand_landmark = hands.Hands(min_detection_confidence=0.6, min_tracking_confidence=0.6, max_num_hands=1)
draw = mp.solutions.drawing_utils        

Step 5: Load Drawing Tools Image

We load an image that serves as the drawing-tool toolbar (a 250 × 50 px strip with one 50 px cell per tool):

tools = cv2.imread("tools.png")
tools = tools.astype('uint8')        
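If tools.png is not available, a minimal placeholder can be generated instead. This is a sketch adapted from the commented-out block in the complete code below, with the size adjusted to match the 50 × 250 region the toolbar is blended into:

# minimal placeholder toolbar (50 x 250 px): five 50 px cells for
# line, rectangle, draw, circle, and erase, separated by red lines
tools = np.zeros((max_y, max_x - ml, 3), dtype="uint8")
cv2.rectangle(tools, (0, 0), (max_x - ml - 1, max_y - 1), (0, 0, 255), 2)
for k in range(1, 5):
    cv2.line(tools, (k * 50, 0), (k * 50, max_y), (0, 0, 255), 2)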

Step 6: Create a Mask

We create a white mask of the same size as our output window (480 × 640). Shapes are drawn onto this mask in black (0), and the eraser paints white (255) back over them.

mask = np.ones((480, 640)) * 255
mask = mask.astype('uint8')        
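Two lines from the main loop (shown in full later) illustrate the convention: drawing writes black (0) into the mask, while erasing writes white (255):

cv2.line(mask, (prevx, prevy), (x, y), 0, thick)   # draw: black stroke
cv2.circle(mask, (x, y), 30, 255, -1)              # erase: filled white circle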

Step 7: Capture Video Feed and Process

This is the main loop of the program: we read frames from the webcam, process them to detect the hand, and, based on the gesture, draw shapes or apply the selected tool.

cap = cv2.VideoCapture(0)
while True:
 # ... (all the hand detection and drawing logic)        

This loop reads a frame from the webcam, flips it for a more natural interaction, processes it to detect hand landmarks, and uses these landmarks to draw with the selected tool.
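Condensed from the complete code below, the skeleton of the loop looks like this:

cap = cv2.VideoCapture(0)
while True:
    _, frm = cap.read()
    frm = cv2.flip(frm, 1)                       # mirror for natural interaction
    rgb = cv2.cvtColor(frm, cv2.COLOR_BGR2RGB)   # MediaPipe expects RGB input
    op = hand_landmark.process(rgb)

    if op.multi_hand_landmarks:
        for i in op.multi_hand_landmarks:
            draw.draw_landmarks(frm, i, hands.HAND_CONNECTIONS)
            # index fingertip (landmark 8), scaled to the 640 x 480 frame
            x, y = int(i.landmark[8].x * 640), int(i.landmark[8].y * 480)
            # ... tool selection and drawing logic (see the complete code)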

Step 8: Drawing Tools

In this loop, different shapes are drawn based on the curr_tool variable. For example, if curr_tool is set to "draw", we draw freehand lines on the screen following the position of our index finger.
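For reference, here is the "draw" branch from the complete code: while the drawing gesture is held, each new fingertip position is connected to the previous one on the mask; otherwise the previous position is simply updated without drawing:

if curr_tool == "draw":
    xi, yi = int(i.landmark[12].x * 640), int(i.landmark[12].y * 480)
    y9 = int(i.landmark[9].y * 480)

    if index_raised(yi, y9):
        # pen down: connect the previous fingertip position to the current one
        cv2.line(mask, (prevx, prevy), (x, y), 0, thick)
        prevx, prevy = x, y
    else:
        # pen up: track the fingertip without drawing
        prevx, prevy = x, y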

Step 9: Display the Output

We display the processed frame, which includes our drawings:

cv2.imshow("paint app", frm)        

Step 10: Exit Condition

The loop continues until the ‘Esc’ key is pressed:

if cv2.waitKey(1) == 27:
 cv2.destroyAllWindows()
 cap.release()
 break        

And that’s it! We’ve built a real-time touchless drawing application using Python, MediaPipe, and OpenCV. The application recognizes hand gestures to switch between various drawing tools, a capability that is useful in fields such as interactive presentations and virtual reality.

Complete Code:

import mediapipe as mp
import cv2
import numpy as np
import time

# constants
ml = 150
max_x, max_y = 250+ml, 50
curr_tool = "select tool"
time_init = True
rad = 40
var_inits = False
thick = 4
prevx, prevy = 0,0
        

#get tools function

def getTool(x):
 if x < 50 + ml:
  return "line"

 elif x<100 + ml:
  return "rectangle"

 elif x < 150 + ml:
  return"draw"

 elif x<200 + ml:
  return "circle"

 else:
  return "erase"

def index_raised(yi, y9):
 if (y9 - yi) > 40:
  return True

 return False



hands = mp.solutions.hands
hand_landmark = hands.Hands(min_detection_confidence=0.6, min_tracking_confidence=0.6, max_num_hands=1)
draw = mp.solutions.drawing_utils


# drawing tools
tools = cv2.imread("tools.png")
tools = tools.astype('uint8')

mask = np.ones((480, 640))*255
mask = mask.astype('uint8')
# Alternative: generate a simple toolbar instead of loading tools.png
'''
tools = np.zeros((max_y+5, max_x+5, 3), dtype="uint8")
cv2.rectangle(tools, (0,0), (max_x, max_y), (0,0,255), 2)
cv2.line(tools, (50,0), (50,50), (0,0,255), 2)
cv2.line(tools, (100,0), (100,50), (0,0,255), 2)
cv2.line(tools, (150,0), (150,50), (0,0,255), 2)
cv2.line(tools, (200,0), (200,50), (0,0,255), 2)
'''

cap = cv2.VideoCapture(0)
while True:
 _, frm = cap.read()
 frm = cv2.flip(frm, 1)

 rgb = cv2.cvtColor(frm, cv2.COLOR_BGR2RGB)

 op = hand_landmark.process(rgb)

 if op.multi_hand_landmarks:
  for i in op.multi_hand_landmarks:
   draw.draw_landmarks(frm, i, hands.HAND_CONNECTIONS)
   x, y = int(i.landmark[8].x*640), int(i.landmark[8].y*480)

   # fingertip inside the toolbar region: run the dwell-to-select timer
   if x < max_x and y < max_y and x > ml:
    if time_init:
     ctime = time.time()
     time_init = False
    ptime = time.time()

    # shrinking yellow circle gives visual feedback while the selection dwells
    cv2.circle(frm, (x, y), rad, (0,255,255), 2)
    rad -= 1

    # after hovering for 0.8 s, select the tool under the fingertip
    if (ptime - ctime) > 0.8:
     curr_tool = getTool(x)
     print("Current tool set to:", curr_tool)
     time_init = True
     rad = 40

   else:
    time_init = True
    rad = 40

   if curr_tool == "draw":
    xi, yi = int(i.landmark[12].x*640), int(i.landmark[12].y*480)
    y9  = int(i.landmark[9].y*480)

    if index_raised(yi, y9):
     cv2.line(mask, (prevx, prevy), (x, y), 0, thick)
     prevx, prevy = x, y

    else:
     prevx = x
     prevy = y



   elif curr_tool == "line":
    xi, yi = int(i.landmark[12].x*640), int(i.landmark[12].y*480)
    y9  = int(i.landmark[9].y*480)

    if index_raised(yi, y9):
     if not(var_inits):
      xii, yii = x, y
      var_inits = True

     cv2.line(frm, (xii, yii), (x, y), (50,152,255), thick)

    else:
     if var_inits:
      cv2.line(mask, (xii, yii), (x, y), 0, thick)
      var_inits = False

   elif curr_tool == "rectangle":
    xi, yi = int(i.landmark[12].x*640), int(i.landmark[12].y*480)
    y9  = int(i.landmark[9].y*480)

    if index_raised(yi, y9):
     if not(var_inits):
      xii, yii = x, y
      var_inits = True

     cv2.rectangle(frm, (xii, yii), (x, y), (0,255,255), thick)

    else:
     if var_inits:
      cv2.rectangle(mask, (xii, yii), (x, y), 0, thick)
      var_inits = False

   elif curr_tool == "circle":
    xi, yi = int(i.landmark[12].x*640), int(i.landmark[12].y*480)
    y9  = int(i.landmark[9].y*480)

    if index_raised(yi, y9):
     if not(var_inits):
      xii, yii = x, y
      var_inits = True

     cv2.circle(frm, (xii, yii), int(((xii-x)**2 + (yii-y)**2)**0.5), (255,255,0), thick)

    else:
     if var_inits:
      cv2.circle(mask, (xii, yii), int(((xii-x)**2 + (yii-y)**2)**0.5), 0, thick)
      var_inits = False

   elif curr_tool == "erase":
    xi, yi = int(i.landmark[12].x*640), int(i.landmark[12].y*480)
    y9  = int(i.landmark[9].y*480)

    if index_raised(yi, y9):
     cv2.circle(frm, (x, y), 30, (0,0,0), -1)
     cv2.circle(mask, (x, y), 30, 255, -1)



 # burn the mask into the frame: where the mask is 0 (drawn strokes),
 # the green and red channels are zeroed, so strokes appear in blue
 op = cv2.bitwise_and(frm, frm, mask=mask)
 frm[:, :, 1] = op[:, :, 1]
 frm[:, :, 2] = op[:, :, 2]

 # overlay the semi-transparent toolbar in the top region of the frame
 frm[:max_y, ml:max_x] = cv2.addWeighted(tools, 0.7, frm[:max_y, ml:max_x], 0.3, 0)

 cv2.putText(frm, curr_tool, (270+ml,30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0,0,255), 2)
 cv2.imshow("paint app", frm)

 if cv2.waitKey(1) == 27:
  cv2.destroyAllWindows()
  cap.release()
  break        
Download the code from the GitHub page: click here
Follow my LinkedIn page: https://www.dhirubhai.net/in/ajitharunai/

Conclusion:

The research introduces an effective hand-identification system for in-air sketching, with an experimental accuracy of 98.48%. This system has the potential to revolutionize traditional writing and teaching methodologies. Designed as a real-time application for spatial sketching on a two-dimensional surface, the technology offers substantial benefits, particularly for individuals with disabilities, seniors, or anyone who struggles with conventional input devices such as keyboards.


Future Scope:

This system holds the promise of broad utility, including the control of IoT devices. As a tool for smart wearables, it could enable more intuitive interactions with digital environments. Augmented reality technologies could further enrich the text and visual information it produces. Importantly, future iterations should focus on securing the system, ensuring that air-writing responds only to authorized gestures. Additionally, emerging object detection techniques such as YOLOv3 may further improve fingertip recognition accuracy and processing speed.

