Part 2 (Chatbots!) - Applied Computer Vision: Developmental Training for Autistic Children
We interrupt this programming with a special word from our charity project team: please go to our GitHub project page and give the project a star to support us in the contest. Thank you, and now back to the regular program.
In the previous part of the article, we discussed the background and some of the core algorithms of the chatbot-based shape-recognizing application that I am building with a team for an online contest. After posting that part, I received many great suggestions on how to improve the computer vision algorithms, for which I am super grateful.
This time around, however, I am going to focus on an entirely different aspect of the project: putting it all together as an MVP, especially the chatbot and interaction part. The algorithm improvements will have to wait for a later stage.
As mentioned in the previous post, the competition we are taking part in is co-sponsored by Wechaty, which (quoting their GitHub page) is "a RPA (Robotic Process Automation) SDK for Chatbot Makers". I don't presume to fully understand what that even means, and to be honest, I was never particularly crazy about chatbots in general, but we did find a nice way of using Wechaty as the input/output interface for our app. Moreover, getting Wechaty to work at all was a tremendous challenge, so I want to share our experience in case someone else follows in our footsteps and runs into the same very thick walls.
Wechaty Operating Principle
First of all, as the name suggests, Wechaty is mostly designed to operate with the Chinese chat app WeChat, although it also supports other chat apps. WeChat is a proprietary chat platform by Tencent, and it has very limited API support for third parties. For this reason, the whole implementation of Wechaty is quite convoluted, as mentioned above, and it took me a very long time to figure it out (and I am not sure I have got it entirely even now). But here is my attempt:
The leftmost (green) part is what every user of the chat experiences, namely the chatroom. It is where all the messaging back and forth between different users happens. On WeChat, apart from your mobile phone, you can also access the chatrooms via the desktop or tablet apps. However, these "side-logins" are tied to the main phone app: you cannot log in to them independently, only through a single-use QR code scanned with the phone app.
Here is where the Puppet service (yellow) part comes in. It is an online service masquerading as an iPad device so that it can log in to the same WeChat account as your phone. After that, it has access to all of your chatrooms and private messages. These online Puppet services come from fairly random providers, so BE VERY MINDFUL OF YOUR PRIVACY when using them. I, for one, use a second WeChat account altogether, separate from my main account, that I dedicate to these kinds of little hacky projects. So essentially, the chatbot app runs on top of someone's real WeChat account instead of being spawned as a standalone bot that joins the chatroom.
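For the curious, pointing your own Wechaty program at one of these Puppet services boils down to a service token issued by the provider. Here is a minimal sketch in Python, going from the official python-wechaty docs rather than our own code (we ended up on the C# side, as explained below); the token and endpoint values are placeholders:

```python
import asyncio
import os

from wechaty import Wechaty

# Credentials issued by the Puppet service provider; values are placeholders.
os.environ['WECHATY_PUPPET_SERVICE_TOKEN'] = 'your-puppet-token-here'
# Optionally, point at a specific provider endpoint:
# os.environ['WECHATY_PUPPET_SERVICE_ENDPOINT'] = 'your.provider.example:9009'

async def main():
    bot = Wechaty()
    # Starting the bot connects to the Puppet service; on first login you
    # confirm the "iPad" session with a QR code scanned from your phone.
    await bot.start()

asyncio.run(main())
```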
The iPad Puppet service connects to the main Wechaty program (blue), which you can run on your local machine or deploy somewhere in the cloud. Through the Puppet, the program gets access to all the messages and attachments sent to the chatrooms, and naturally it can also send messages and attachments back. The main chat logic of the app lives in this Wechaty part: specific wake words, if-this-then-thats, basically all the rules of interaction (sketched below).
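To give a flavor of what those rules look like, here is a rough Python sketch of a wake-word handler. Our actual chat logic ended up in C# (more on that in a moment), so treat this as an illustration: the wake word, the replies, and the saved file name are all made up, and the handler names follow the python-wechaty docs:

```python
import asyncio

from wechaty import Wechaty, Message
from wechaty_puppet import MessageType

class ShapeBot(Wechaty):

    async def on_message(self, msg: Message) -> None:
        text = (msg.text() or '').strip().lower()
        # Wake word: '#shape' is illustrative, not the one we actually use.
        if text == '#shape':
            await msg.say('Send me a photo of your drawing and I will check it!')
        elif msg.type() == MessageType.MESSAGE_TYPE_IMAGE:
            # Grab the attached image so it can be passed on for recognition.
            file_box = await msg.to_file_box()
            await file_box.to_file('incoming.png')

asyncio.run(ShapeBot().start())
```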
Normally, one would have all elements of the main program running directly within the Wechaty code, including, for our specific project, the computer vision parts. However, our computer vision part was already implemented in Python, and the Python version of the Wechaty codebase is very unstable (several members of our team, and of other teams too, were not able to get it running properly). In the end, we decided to implement the Wechaty chatbot logic in C# (thanks to a last-minute backend developer addition to our team) and keep a separate Python codebase dedicated to the computer vision (purple) part. The CV part was turned into a mini service using the Flask framework, which the Wechaty part can call via a simple HTTP API to transfer data back and forth.
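To illustrate the split, here is a minimal sketch of what such a Flask mini service can look like. The endpoint name, the field names, and the placeholder verdict are made up for illustration; in the real service, the part-1 recognition code runs where the comment indicates:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route('/recognize', methods=['POST'])
def recognize():
    # The chatbot side POSTs the drawing it grabbed from the chatroom.
    image = request.files['image']
    image.save('/tmp/drawing.png')
    # In the real service, this is where the part-1 CV pipeline would run.
    verdict = {'shape': 'circle', 'match': True}  # placeholder result
    return jsonify(verdict)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
```

On the other side, the C# Wechaty bot simply POSTs the image bytes to this endpoint and relays the JSON verdict back into the chatroom as a chat message.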
It's definitely not the most elegant implementation, but it gets the job done, which is all we needed for the first MVP demo and for submitting the project to the contest.
Aaand... Action!
What we see here is an early prototype with every piece put together, interacting live within the chatroom:
Admittedly, due to deficiencies in my computer vision algorithms (see part 1), the verdict is not always judged correctly, but for the sake of the demo and the contest, it already works well enough.
Next Steps
We are now gearing up to put everything together, write some documentation, and film an introductory video to submit our work to the contest. As mentioned in the opening, it would mean the world to us, and to me personally, to get your support in the form of a GitHub star to get our charity project for autistic children off to a great start in the contest. Thank you!
And I will be back to give an update on those CV algorithms for sure :)