India’s AI Awakening
Parul Pandey
Community & Open Source | Co-author of Machine Learning for High-Risk Applications | Kaggle Grandmaster (Notebooks)
A take on India’s advantages, the Government’s initiatives and the roadblocks on the way to becoming an AI-powered nation.
According to Andrew Ng, ‘AI is the new electricity’. Just as electricity revolutionised the world years ago, AI has the ability to transform it in the years to come. Today, every industry is trying to align itself with Artificial Intelligence, and billions of dollars are being invested in AI research around the world. From cybersecurity to personal security, from healthcare to financial trading, from e-commerce to chatbots, AI is almost everywhere.
Machines today are remarkably capable and are getting smarter every day. Take, for instance, Siri, Cortana, Alexa or Google Assistant. The more we interact with these virtual assistants, the better they become at following instructions. The more we shop on Amazon, the better its recommendations for our next purchase. It is almost as if these tools can read our minds. How many times have you been about to type a query into Google Search, only for it to suggest exactly what you had in mind? Is it magic? No, it’s AI.
“Any sufficiently advanced technology is indistinguishable from magic.” ~ Arthur C. Clarke
So why have AI and Big Data started to make so much more sense today, even though their origins date back to the 1950s and 1960s? Because the two basic elements on which AI thrives, data and computing power, are readily available today. For AI to work well, both are required on a gigantic scale. Fortunately, advances in technology have paved the way for the information revolution. We have access to enormous amounts of data and far greater computational power than was available even five years ago. Hence the buzz around AI today.
Data is to AI as Food is to Humans
Data is the sword of the 21st century, those who wield it the samurai. ~ Jonathan Rosenberg
The true potential of AI can be realised only with huge amounts of data. With more data available, algorithms can be trained to greater accuracy, and with sufficient quality data, modern AI techniques such as deep learning can outperform traditional machine learning algorithms. Data fuels AI. Businesses today are trying to harness this data to gain an edge over their competitors and stay in the race.
Source: deeplearning.ai
Andrew Ng often mentions that in deep learning, more data + larger models = better performance.
Astronomy and genomics were the first fields to experience an explosion of data, in the 2000s. This gave rise to the term ‘Big Data’. The concept has since spread to other areas and is today most closely associated with IT. With a mobile phone in every hand, a computer in every bag and sophisticated IT infrastructure in every office, data started growing at an enormous pace. The volume soon outstripped the storage capabilities of existing computers, giving rise to technologies like MapReduce and Hadoop. It was during this time that Internet companies came into being. They could not only collect these vast amounts of data but were also investing heavily in processing techniques that could make sense of it. It was the adoption of these new techniques and data-related capabilities that enabled the Internet companies to rule the world.
The success of FAMGA (Facebook, Apple, Microsoft, Google and Amazon) highlights the dominance of Internet companies in the world today. Data shows that search, social and mobile are highly profitable businesses.
The domination of technology companies in terms of market cap is actually a fairly recent phenomenon. Just 11 years ago, Microsoft was the only tech company in the top 5.
Why Data is the Key
Google processes more than 24 petabytes of data per day, a volume that is thousands of times the quantity of all printed material in the U.S. Library of Congress. Facebook, a company that did not exist a decade ago, gets more than 10 million new photos uploaded every hour. Facebook members click a ‘like’ button or leave a comment nearly three billion times per day, creating a digital trail that the company can mine to learn about users’ preferences. Meanwhile, the 800 million monthly users of Google’s YouTube service upload over an hour of video every second. The number of messages on Twitter grows at around 200 percent a year and by 2018 had exceeded 500 million tweets a day [The Essential Guide to Work, Life and Learning in the Age of Insight]. We have access to a lot more information today. The point, however, is to make effective use of it.
In the field of Big Data, the more the merrier. This is evident in Natural Language Processing (NLP), the ability of computers to understand human language as it is spoken. Even traditional NLP algorithms perform better when supplied with more data. Michele Banko and Eric Brill highlight in their research the fact that, for a prototypical natural language classification task, the performance of learners can benefit significantly from much larger training sets.
Machine translation is another astounding and challenging field of AI. “Candide” was an experimental machine translation system under development at the IBM T.J. Watson Research Center in the early 1990s. Its data comprised French and English transcripts of official documents, so the translations were of the highest quality. The corpus consisted of about 3 million word pairs, which was a lot of data at that time. IBM tasted success initially, but progress stagnated despite a lot of resources being pumped in, and the project was eventually terminated. In 2006, however, Google made its foray into machine translation. Unlike IBM, it did not rely on carefully translated texts but took in a far larger and messier body of data, ranging from corporate websites to government reports. ‘We found that there’s no data like more data, and scaled up the size of our data by one order of magnitude, and then another, and then one more - resulting in a training corpus of one trillion words from public Web pages’ [All Our N-gram are Belong to You]. Google’s translation project achieved what IBM’s Candide could not. The reason was not a superior machine learning algorithm but a whole lot more data: while IBM worked with millions of words, Google’s dataset ran to billions.
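To get a feel for why "there’s no data like more data", here is a minimal toy sketch in Python. It is not how Google or IBM built their systems. It simply counts how many word pairs (bigrams) in a held-out test text were already seen in training, as the training text grows. The vocabulary, the corpus sizes and the helper names (`bigrams`, `coverage`, `coverage_curve`) are all assumptions made up for this illustration, and the "text" is random synthetic data rather than real language.

```python
import random


def bigrams(tokens):
    """Return the list of adjacent word pairs in a token sequence."""
    return list(zip(tokens, tokens[1:]))


def coverage(train_tokens, test_tokens):
    """Fraction of test bigrams that also appear in the training tokens."""
    seen = set(bigrams(train_tokens))
    test = bigrams(test_tokens)
    if not test:
        return 0.0
    return sum(1 for pair in test if pair in seen) / len(test)


def coverage_curve(corpus, test_tokens, sizes):
    """Coverage when training on the first n tokens, for each n in sizes."""
    return [(n, coverage(corpus[:n], test_tokens)) for n in sizes]


# Synthetic "text": random draws from a toy vocabulary (an assumption, not real data).
rng = random.Random(0)
vocab = [f"w{i}" for i in range(50)]
corpus = [rng.choice(vocab) for _ in range(20000)]
test_tokens = [rng.choice(vocab) for _ in range(2000)]

curve = coverage_curve(corpus, test_tokens, [100, 1000, 10000, 20000])
for n, c in curve:
    print(f"{n:>6} training tokens -> {c:.2%} of test bigrams seen")
```

Because each larger training set contains the smaller one, coverage can only stay the same or go up as data is added. Real language is far messier than this toy vocabulary, but the same intuition applies: a model cannot learn from word combinations it has never seen.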
India’s Advantage in the AI Race:
1) Data Goldmine
Ravi is a young, ambitious man in his early twenties. He works as a cab driver with an app-based company. When he is not driving or waiting for his clients, he can be found spending time on his smartphone, either streaming videos on YouTube or watching live cricket matches.
This is not just the story of Ravi but of a majority of Indians today. Gone are the days of buffering. Today, whenever we want to listen to a new song, we stream it. We watch movies online. Netflix and Amazon Prime have entered our households. This wasn’t the case in India a few years back. Back then, we would use our data judiciously and try hard to save the last byte before the month ended. But then Jio entered the telecom sector and proved to be a game changer. Average data consumption started soaring while prices fell by more than half. Data costs have been falling steadily, more so for prepaid than for postpaid plans. As per a report, Indian telcos carry the most data in the world, and the data carried by India’s top telcos saw a record fivefold increase in the past year. Earlier we were known as the Download Nation, but today we are slowly inching towards becoming a Streaming Nation. As per a report, an Indian subscriber consumes on average about 7.4 GB of data per month on mobile devices over mobile networks alone, placing India ahead of the UK, South Korea and France.
What do all these numbers signify? Simply put, we Indians are consuming and generating a whopping amount of data in our daily lives through smartphones alone. This digital consumption and generation is at an all-time high in India. In fact, many people in India have experienced the internet for the first time on their phones. With affordable and trendy smartphones, we carry our world in our pockets. Surfing the Web, sending an e-mail, conducting a credit card transaction and, yes, making a phone call: all of this creates a data trail. Since data is the raw material for AI and all machine learning algorithms, India holds an advantage. This huge amount of data, combined with the right skills, can pave the way for an AI boom in India.
2) Startup India Initiative
The current Indian Government is pushing for industrial reforms in both the private and public sectors. Startup India is a flagship initiative of the Government of India, intended to build a strong ecosystem for nurturing innovation and startups in the country that will drive sustainable economic growth and generate large-scale employment opportunities. Through this initiative, the Government aims to empower startups to grow through innovation and design. (…) At present, over 170 startups are focussing purely on AI and have raised over $36 million. There is a great opportunity for India to excel in AI, as it did in IT years ago.
3) Late Adopter Advantage
As Nate Silver explained at the recent IBM THINK Forum, “The hype often comes before the progress takes place. What that means is first you have the hype over new technology and people realize that we have great hardware and software but I need more people on staff who know how to use this stuff, I need to start experimenting”. In fact, being a late adopter also has immense advantages. You get to learn from the trials and errors of the early adopters, and benefit from the technology improvements driven by feedback from the early majority. By not falling for the hype, you get to leapfrog to the progress. (…)
4) India: A Mess of Complexity
India needs AI and the AI revolution needs India; it is more like a symbiotic relationship. India has a rich diversity of languages, scripts, dress, accents and culture, which presents a rich and complex set of challenges that can make AI more resilient. Current AI techniques pre-trained in the West are not accustomed to handling such complexity and will have to mature to deal with the diversity that India poses. This will benefit both the AI community and India in the long run. In India, more and more people want to interact in their mother tongue, which makes the country a great learning ground for these technologies.
However, the right ingredients don’t always guarantee a great dish. Though India is generating enormous amounts of data every second, which has no doubt caught the attention of leading AI firms, more needs to be done to harness its true potential. We do not want to be seen merely as a data provider for Facebook and Google. What we actually need is home-grown AI solutions for India’s major socio-economic problems.
“India might end up as a big consumer of the new tech-economy featuring AI- and IoT-related technologies. But will it be a big producer in this economy?” ~ Kartik Hosanagar
This article was originally published on my Medium blog: India’s AI Awakening
In the concluding part, we will delve into the steps being taken by the Government to make India an AI powerhouse. You can read it here: India’s AI Awakening: The Conclusion