AI, Machine Learning, AR and Voice Assistant (OK Google)
Have you ever wondered how emails get classified as spam? Hmmm… we kind of take it for granted these days but never stop to wonder how… so read on.
Artificial Intelligence (AI) refers to a computer program able to "think" for itself without being given explicit instructions for every situation, whilst Machine Learning (ML) is one process by which a computer can learn its trade.
The old-school way of hand-coding rules (think of it like adding If-Then-Else rules) doesn't work for most tasks now. Instead, we write down an algorithm that can look at a lot of data and learn from that data.
Can a machine approximate human intuition and outsmart the best human brain? That is the burning question, but first, some basics.
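To make the contrast concrete, here is a minimal sketch in Python. The rule, the six training messages and the function names are all made up for illustration, and the learning part is a simple word-counting (naive Bayes style) classifier, not what any real mail provider runs.

```python
from collections import Counter
import math


def rule_based_is_spam(text):
    # Old-school approach: a hand-written If-Then-Else rule.
    if "free money" in text.lower():
        return True
    else:
        return False


def train(examples):
    # Learn from data: count how often each word shows up in spam
    # and in normal mail.
    counts = {True: Counter(), False: Counter()}
    totals = {True: 0, False: 0}
    for text, is_spam in examples:
        counts[is_spam].update(text.lower().split())
        totals[is_spam] += 1
    return counts, totals


def learned_is_spam(model, text):
    counts, totals = model
    vocab = set(counts[True]) | set(counts[False])
    scores = {}
    for label in (True, False):
        n_words = sum(counts[label].values())
        # Start from how common the label is, then add evidence per word.
        score = math.log(totals[label] / sum(totals.values()))
        for word in text.lower().split():
            # Add-one smoothing so an unseen word does not zero out the score.
            score += math.log((counts[label][word] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return scores[True] > scores[False]


# Made-up training data, purely for illustration.
EXAMPLES = [
    ("win a free prize now", True),
    ("claim your free prize today", True),
    ("cheap pills win big", True),
    ("meeting moved to monday", False),
    ("lunch on monday with the team", False),
    ("please review the project report", False),
]
model = train(EXAMPLES)

print(rule_based_is_spam("claim a free prize"))      # the fixed rule misses it
print(learned_is_spam(model, "claim a free prize"))  # the learned model flags it
```

The hand-written rule only catches the exact phrase somebody thought of in advance, while the learned version picks up on words it has seen in spam before. Feed it more examples and it adapts without anyone rewriting rules.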
A typical AI example: you want to build a machine which recognises objects in an image. So you show millions of photos to this machine and correct it when it reads out the wrong objects from the images. The initial output is a bit random as it's not trained yet. The trick is to adjust the internal parameters of the structure after each mistake so that, next time, it reads the objects out correctly. This adjusting technique is the algorithm. If we keep repeating the same set of photos over a period, the machine will probably get most of them right and, more importantly, when you then introduce a new photo the system should be able to recognise what is in that too.
Let's geek out a little with real-world examples.
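The "show, correct, adjust, repeat" loop described above can be sketched in a few lines of Python. This is a toy perceptron, not a real image recogniser: each "photo" is boiled down to two invented numbers (call them brightness and roundness), and the data, names and learning rate are all assumptions for illustration.

```python
def predict(weights, bias, features):
    # Weighted sum of the inputs; answer 1 ("object present") if positive.
    total = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if total > 0 else 0


def train(examples, epochs=20, learning_rate=0.1):
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):  # keep repeating the same set of "photos"
        for features, label in examples:
            # error is 0 when the guess was right, +1 or -1 when it was wrong
            error = label - predict(weights, bias, features)
            # Adjust the internal parameters in the direction that fixes the mistake.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias


# Made-up training set: (brightness, roundness) -> 1 if the object is in the photo.
TRAINING = [
    ((0.9, 0.8), 1),
    ((0.8, 0.9), 1),
    ((0.7, 0.7), 1),
    ((0.2, 0.1), 0),
    ((0.1, 0.3), 0),
    ((0.3, 0.2), 0),
]
weights, bias = train(TRAINING)

# A "new photo" the machine has never seen before.
print(predict(weights, bias, (0.85, 0.75)))
```

Real systems have millions of parameters instead of three and use a smoother adjustment rule, but the shape of the loop is the same: guess, compare with the right answer, nudge the parameters, repeat.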
Google (DeepMind division) has AlphaGo - the AI algorithm that beat a top human professional at the ancient Chinese board game of Go using neural networks and deep learning. The creators of AlphaGo say the approach is capable of learning many more things without alteration, making it a step towards general-purpose AI. There are several articles about this online if you want to learn more.
DeepMind's game-playing systems can learn straight from raw input; its Atari-playing agent saw nothing but the raw pixels on the screen!!! That is revolutionary. This works on the same principles as my typical example (it is shown millions of game plays, continues to play on its own, tunes itself and then reigns supreme).
Now, as the input is just raw pixels (i.e. images), this becomes general purpose, in that the same AI algorithm can be applied elsewhere. (It could walk into a bar and walk out with all the girls or guys #darn)
Facebook (SLAM, i.e. simultaneous localisation and mapping) - They have one of the best implementations of a convolutional neural network for a small device. This extracts objects from images (i.e. image/object recognition). In 2017, they brought out Mask R-CNN, which detects people and objects, and their pose, with a clear, distinct outline. This set the idea mills rolling with great examples of where it can be used (welcome, AR cameras). They took these big computational algorithms and optimised them to run on your mobile phone at 30fps. AR stickers were introduced into Instagram, if you didn't realise. The phone camera is used to track an object in real time in relation to the room, alongside the geometry of the room, in order to pin a virtual object to that room.
In 2017, Apple (ARKit) and Google (ARCore) independently brought AR to their mobile platforms to take advantage of the phone's hardware. AR deserves a separate post to do it justice.
Back to spam classification: Google deploys neural networks and machine learning in Gmail and says this catches 99.9% of Gmail spam. Pretty cool, eh!
Some terminology that you come across in this realm of things:
Neural Network - A biologically inspired network of artificial neurons configured to perform specific tasks.
Convolutional Neural Network - A category of neural networks that has proven very effective in areas such as image recognition and classification.
General AI - Also referred to as human-level AI or strong AI, this is a type of AI which can understand and reason about its environment.
Narrow AI - Also referred to as weak AI, this is a type of AI which performs one specific task, such as image classification or speech recognition. This is what threatens to replace many human jobs.
Deep Learning - A subset of machine learning that uses neural networks with many layers to solve problems which seem to require "thought". For example, Netflix deciding what you want to watch next.
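To make the Convolutional Neural Network entry a little more concrete, here is a rough sketch of the "convolution" step itself in plain Python: a small grid of numbers (the kernel) slides over the image and produces a high value wherever the pattern it looks for appears. The tiny image and the edge-detecting kernel are made up for illustration; in a real network the kernel values are learned rather than written by hand.

```python
def convolve2d(image, kernel):
    # Slide the kernel over every position where it fits fully inside the image.
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    # Multiply each pixel under the kernel by the kernel value.
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# A 3x4 "image": dark on the left (0), bright on the right (1).
IMAGE = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked kernel that responds to a dark-to-bright vertical edge.
VERTICAL_EDGE = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(IMAGE, VERTICAL_EDGE)
print(feature_map)  # [[0, 2, 0], [0, 2, 0]]
```

The output lights up only in the middle column, exactly where the image changes from dark to bright. Stack many such learned kernels in layers and you get the building blocks of image recognition.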
So, where does the voice assistant fit in all this?
For example, Google's Assistant speaks using text-to-speech (TTS). Traditionally this relied on a large database of high-quality recordings, collected from a single voice actor over many hours. These recordings are split into tiny chunks that can then be combined, or concatenated, to form complete utterances as needed. But come 2016, DeepMind brought in WaveNet, a new deep neural network for generating raw audio waveforms that is capable of producing better and more realistic-sounding speech than existing techniques. It was built using a convolutional neural network, which was trained on a large dataset of speech samples. This enables it to have multiple voices and multiple personalities, or to get the assistant to differentiate between German and Swiss German. With AI and ML, it only gets better with time.
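The older, concatenative approach described above can be sketched very roughly like this. The chunk names and the handful of sample values are invented for illustration; a real system stores many hours of audio and chooses chunks far more cleverly.

```python
# Hypothetical database: each tiny recorded chunk maps to a few audio samples.
UNIT_DATABASE = {
    "he": [0.1, 0.3, 0.2],
    "llo": [0.2, 0.0, -0.1, -0.2],
    "wor": [0.4, 0.1],
    "ld": [-0.3, -0.1, 0.0],
}


def synthesize(units, database):
    # Look up each chunk and glue the recordings together end to end.
    waveform = []
    for unit in units:
        if unit not in database:
            # Nothing was ever recorded for this chunk, so it cannot be spoken.
            raise KeyError(f"no recording for unit {unit!r}")
        waveform.extend(database[unit])
    return waveform


hello = synthesize(["he", "llo"], UNIT_DATABASE)
print(len(hello))  # 7 samples: 3 from "he" plus 4 from "llo"
```

The limitation is easy to see: the assistant can only say what can be stitched together from that one actor's recordings. A network like WaveNet generates the waveform itself, sample by sample, which is what makes new voices and accents feasible.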
How can the assistant get better? Yes, we need AI on the go. Where Alexa (Amazon) falls short, Google has the edge by bringing its assistant to different form factors (phones, speakers, cars, third-party integrations etc.).
CES 2018 saw the proliferation of Google Assistant everywhere. It's interesting to see Amazon competing with Google (traditionally you hear Apple vs Google or M$ vs Google or Samsung vs Apple)! The current battle, from a business monetisation perspective, is over who will control the voice input from consumers.
AI: should we be worried? I am in two minds. The fear of the unknown drives my hysteria… but I am cautiously optimistic.