Part 1 - Applied Computer Vision: Developmental Training for Autistic Children

Tianyi Pan

AI Generalist & LLM Whisperer | Multicultural Biz & Tech Professional | CMA

Published: July 31, 2021

tl;dr > New hobby project! My first one with a team, and they are total strangers. We are trying to build something that can be used to help autistic children train and develop their visual cognition. But we're not quite there yet. Help us?

When I was in junior high, our school was structured so that the regular students shared the same building with a wing dedicated to autistic children. The entrances to the autistic wing were often locked for obvious reasons, but from time to time some of our classes were held in rooms located there, so we would visit that part of the building. We also occasionally (though very rarely) got a chance to interact directly with the autistic children, for example on joint school outings.

Because I was attending a special "media emphasis" class, during one of these trips we were tasked with filming a short documentary on the topic of autism. We researched the topic for the narration, which was recorded by my classmate Joonatan, and to this day I remember his opening line for our video, word for word:

Autism is a neurobiological developmental disorder of the central nervous system

These children, who were socially and emotionally extremely introverted even on a Finnish scale (and you know Finns can be VERY introverted), fascinated me and stayed with me long after I moved on to high school, and beyond.

A few years back, I learned that one of our neighbors here in China has an autistic child. A diner I used to frequent in downtown Shanghai has an owner who also volunteers at events organized by and for autistic children. Even a Chinese anime series that I watched, set in the city of Chongqing, features a mysterious autistic child at a local children's welfare center...

These people are all around us, yet for the masses, they remain invisible, always a bit out of reach.

Background

Those who follow my writing here might have noticed it's been a year and a half since my previous post, and I never actually got to finish that project, either. Work and family have absolutely taken all my time and energy and I couldn't focus on other things, even on the weekends.

But I do have a bit more time around the summer to work on my own things. About a month ago I went to a randomish Mixlab meetup where people from various disciplines get together and try to collaborate on stuff. This time the topic was chatbots and a local deep learning framework called PaddlePaddle (Baidu's open source project; the company was also co-sponsoring the event, along with the open source chatbot platform project Wechaty).

I have to admit I didn't have much interest in chatbots per se. But the event, which turned out to be a kick-off to a competition, was strict about format: the participating projects had to use the chatbot platform and build their applications on top of that user interface. So, at least for me, I am just going to treat it as a user input/output mechanism.

Developmental Training for Autistic Children

At the event, a young woman was describing her UX research at university and how she wanted to turn it into a tool or game that autistic children could play with to train and develop some basic visual recognition skills. The topic intrigued me, so I decided to join the group.

We soon settled on an idea for the user to interact through the chatbot feature, as required by the contest. The chatbot would randomly select one of several basic shapes (circle, square, triangle) and ask the user to take a picture of a real-life object that resembles this shape and submit it back. The app would rate the picture based on the shape that it could identify in it and give feedback to the user.
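One round of that interaction could look roughly like the sketch below. It is only an illustration: the function names, the message wording and the 0.5 score threshold are all hypothetical, and the actual chatbot platform integration is left out entirely.

```python
import random

# The three basic shapes mentioned above.
SHAPES = ("circle", "square", "triangle")


def pick_target_shape(rng=random):
    """Randomly choose the shape the user should look for (hypothetical helper)."""
    return rng.choice(SHAPES)


def build_prompt(shape):
    """Message the bot would send to start a round (wording is made up)."""
    return f"Can you find something shaped like a {shape}? Take a photo and send it to me!"


def build_feedback(target, predicted, score, threshold=0.5):
    """Turn a recognizer result into feedback. The threshold is an assumption."""
    if predicted == target and score >= threshold:
        return f"Great job! That looks like a {target}."
    return f"Nice try! I was looking for a {target}. Want to try again?"


if __name__ == "__main__":
    target = pick_target_shape()
    print(build_prompt(target))
    # In a real round, `predicted` and `score` would come from the image recognizer.
    print(build_feedback(target, predicted=target, score=0.8))
```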

This was the first time I had done a hobby project in a team, let alone with total strangers. Of course, after these years working as a machine learning engineer at an AI startup, I've gotten used to working with other people. Heck, we might even publish our source code on GitHub after the demo is done!

Because of my previous doodling with computer vision (and because it later led to my current job), I was pretty confident we could build something akin to a shape recognizer. My theory was we could make it work using OpenCV alone, without even the need for any deep learning. The basic workflow would be:

1. Take a photo
2. Run some processing (for example, detecting the edges or contours)
3. Compare the image with the shape templates, which have been processed the same way so they have the same post edge/contour detection look
4. Predict the template that most resembles the user input image

Template and Test Data Collection

To get things started, our team members helped collect a bunch of photos from everyday life for each of the basic shape categories. We even collected some more obscure ones that couldn't be readily attributed to any single shape, because we were curious how the machine would recognize them.

Here we have already ingested the photos in grayscale, run a normalizing pipeline to crop them in the middle to the same aspect ratio, resized them to a uniform resolution, and made them into a collage for easy previewing. We would also keep this same view when applying further processing to the images, to build an intuition for how the same actions perform on various kinds of data.
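A normalizing pipeline of this kind could be sketched as below. To keep the example dependency-free it uses NumPy only, with a simple nearest-neighbor resize; the function names and the 128 x 128 target size are assumptions for illustration, and an image library's resize would normally give smoother results.

```python
import numpy as np


def to_grayscale(rgb):
    """Convert an H x W x 3 RGB array to 8-bit grayscale (BT.601 luma weights)."""
    weights = np.array([0.299, 0.587, 0.114])
    gray = rgb[..., :3].astype(np.float64) @ weights
    return np.clip(gray.round(), 0, 255).astype(np.uint8)


def center_crop(img, aspect=1.0):
    """Crop the middle of the image to the given width/height aspect ratio."""
    h, w = img.shape[:2]
    new_w = min(w, int(round(h * aspect)))
    new_h = min(h, int(round(new_w / aspect)))
    top, left = (h - new_h) // 2, (w - new_w) // 2
    return img[top:top + new_h, left:left + new_w]


def resize_nearest(img, size):
    """Resize to (height, width) by nearest-neighbor sampling."""
    h, w = img.shape[:2]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return img[rows[:, None], cols]


def normalize(rgb, size=(128, 128)):
    """Grayscale, center-crop to the target aspect ratio, then resize."""
    gray = to_grayscale(rgb)
    cropped = center_crop(gray, aspect=size[1] / size[0])
    return resize_nearest(cropped, size)


if __name__ == "__main__":
    # Random noise stands in for a 300 x 400 landscape photo.
    photo = np.random.default_rng(0).integers(0, 256, (300, 400, 3), dtype=np.uint8)
    print(normalize(photo).shape)
```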

Similarly, we would process the template images, as mentioned earlier.

Structural Similarity

The most naive way of comparing two images would be to sum up the differences pixel by pixel. However, if the second image is shifted to the side by just a couple of pixels, the pixel-to-pixel comparison fails spectacularly: many of the pixels along the edges now differ, so the summed difference is large, although to the human eye the two images still look almost exactly the same.
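The effect is easy to reproduce on a toy image. The sketch below draws a one-pixel-wide square outline, shifts it two pixels to the right, and compares the two with a summed absolute difference; the image size and the helper names are made up for the example.

```python
import numpy as np


def pixel_difference(a, b):
    """Sum of absolute per-pixel differences between two same-sized images."""
    return int(np.abs(a.astype(np.int64) - b.astype(np.int64)).sum())


def square_outline(size=64, margin=16):
    """White one-pixel-wide square outline on a black background."""
    img = np.zeros((size, size), np.uint8)
    img[margin, margin:size - margin] = 255                # top edge
    img[size - margin - 1, margin:size - margin] = 255     # bottom edge
    img[margin:size - margin, margin] = 255                # left edge
    img[margin:size - margin, size - margin - 1] = 255     # right edge
    return img


original = square_outline()
shifted = np.roll(original, 2, axis=1)  # same square, moved 2 pixels to the right

outline_pixels = int(np.count_nonzero(original))
mismatched_pixels = int(np.count_nonzero(original != shifted))

print("difference with itself:", pixel_difference(original, original))
print("difference after shift:", pixel_difference(original, shifted))
print("outline pixels:", outline_pixels, "mismatched pixels:", mismatched_pixels)
```

Both vertical sides of the outline miss each other completely after the shift, which is why the mismatch count ends up in the same range as the number of outline pixels even though the shape itself has not changed.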

In my plan, the primary method of comparing the templates to a user image would be an algorithm called structural similarity (SSIM), lifted from the scikit-image library. Rather than comparing individual pixel values, structural similarity compares local patterns of brightness, contrast and structure within small windows of the two images.
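For reference, a comparison along these lines can be written with `structural_similarity` from `skimage.metrics`. The toy templates and the `rank_templates` helper below are assumptions for illustration; only the scikit-image call itself is the library's API.

```python
import numpy as np
from skimage.metrics import structural_similarity


def make_toy_templates(size=64):
    """Two filled shapes drawn with NumPy (stand-ins for real templates)."""
    square = np.zeros((size, size), np.uint8)
    square[size // 4:3 * size // 4, size // 4:3 * size // 4] = 255

    yy, xx = np.mgrid[:size, :size]
    circle = np.zeros((size, size), np.uint8)
    circle[(yy - size // 2) ** 2 + (xx - size // 2) ** 2 <= (size // 3) ** 2] = 255

    return {"square": square, "circle": circle}


def ssim_score(a, b):
    """SSIM between two 8-bit grayscale images; 1.0 means identical."""
    return float(structural_similarity(a, b, data_range=255))


def rank_templates(photo, templates):
    """Return (name, score) pairs sorted from most to least similar."""
    scores = {name: ssim_score(photo, tmpl) for name, tmpl in templates.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    templates = make_toy_templates()
    for name, score in rank_templates(templates["square"], templates):
        print(f"{name}: {score:.3f}")
```

Note that SSIM still compares the two images window by window at the same positions, so it tolerates small local changes better than a raw pixel sum but is not invariant to larger shifts, rotations or scale changes.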

However, as you can see from the above image, the recognition results from this structural comparison didn't immediately convince us. There still seems to be quite a lot of noise. We initially suspected this was because we only had one orientation for the templates where angles matter (such as the square or triangle), but even after accounting for rotation, the accuracy did not improve noticeably. We suspect size also matters, although making our comparison algorithm robust to scale seems much harder with just OpenCV.
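One simple way to account for template orientation is to compare the photo against several rotated copies of each template and keep the best match. The sketch below does this for 90-degree steps only, using `np.rot90` and a plain mean absolute difference as the distance; both choices are assumptions for illustration, and arbitrary angles would need finer rotation steps with an image library.

```python
import numpy as np


def distance(a, b):
    """Mean absolute pixel difference (a stand-in for any comparison metric)."""
    return float(np.mean(np.abs(a.astype(np.float64) - b.astype(np.float64))))


def best_rotation_distance(photo, template):
    """Smallest distance over the four 90-degree rotations of the template."""
    return min(distance(photo, np.rot90(template, k)) for k in range(4))


def triangle(size=64):
    """Filled upward-pointing triangle (toy template)."""
    img = np.zeros((size, size), np.uint8)
    for row in range(size // 4, 3 * size // 4):
        half = (row - size // 4) // 2
        img[row, size // 2 - half:size // 2 + half + 1] = 255
    return img


if __name__ == "__main__":
    template = triangle()
    photo = np.rot90(template, 1)  # the same triangle, lying on its side
    print("single orientation:", distance(photo, template))
    print("best of four rotations:", best_rotation_distance(photo, template))
```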

Next Steps

This was a fairly short and straightforward playtest sprint, but so far we have already generated some ideas for the next steps.

Contrary to what I initially theorized, we could probably use deep learning after all, at least to generate vector embeddings for both the template and input images. Convolutional models pretrained on large image datasets extract features that are far more tolerant of translation, and to some degree of rotation and scale, than raw pixels are. We could therefore run the similarity comparison in the abstracted embedding space instead of the absolute pixel space. Although we have converted the raw pixels into edge or contour maps, those maps are still too anchored to actual pixel coordinates.
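The comparison step in embedding space could be as simple as a cosine similarity lookup. In the sketch below the embedding vectors are hard-coded toy values purely for illustration; in practice they would come from the feature layer of a pretrained convolutional network, which is not shown here.

```python
import numpy as np


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def nearest_template(photo_vec, template_vecs):
    """Return the template whose embedding is most similar to the photo's."""
    scores = {name: cosine_similarity(photo_vec, vec)
              for name, vec in template_vecs.items()}
    return max(scores, key=scores.get), scores


# Toy 4-dimensional embeddings (made-up numbers, not real model output).
template_vecs = {
    "circle": np.array([0.9, 0.1, 0.0, 0.2]),
    "square": np.array([0.1, 0.8, 0.3, 0.0]),
    "triangle": np.array([0.0, 0.2, 0.9, 0.1]),
}
photo_vec = np.array([0.8, 0.2, 0.1, 0.3])

best, scores = nearest_template(photo_vec, template_vecs)
print(best, scores)
```

Because cosine similarity depends only on the direction of the vectors, it ignores overall magnitude differences between embeddings, which is one reason it is a common choice for this kind of lookup.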

Do you, dear reader, have any further ideas or suggestions that you would want to add to our list of things to try? Feel free to leave a comment and participate in the discussion!

