Microsoft's AI Supercomputer: Massive Stupidity vs. General Intelligence

Microsoft has built one of the top five publicly disclosed supercomputers in the world, making new infrastructure available in Azure to train extremely large artificial intelligence models, the company announced at its Build developers conference.

Built in collaboration with and exclusively for OpenAI, the supercomputer hosted in Azure was designed specifically to train that company’s AI models. It represents a key milestone in a partnership announced last year to jointly create new supercomputing technologies in Azure. https://azure.microsoft.com/en-us/overview/ai-platform/

It’s also a first step toward making the next generation of very large AI models and the infrastructure needed to train them available as a platform for other organizations and developers to build upon.

A new class of multitasking AI models

Machine learning experts have historically built separate, smaller AI models that use many labeled examples to learn a single task such as translating between languages, recognizing objects, reading text to identify key points in an email or recognizing speech well enough to deliver today’s weather report when asked.

A new class of models developed by the AI research community has proven that some of those tasks can be performed better by a single massive model — one that learns from examining billions of pages of publicly available text, for example. This type of model can so deeply absorb the nuances of language, grammar, knowledge, concepts and context that it can excel at multiple tasks: summarizing a lengthy speech, moderating content in live gaming chats, finding relevant passages across thousands of legal files or even generating code from scouring GitHub.

As part of a companywide AI at Scale initiative (The Next Generation AI, From Language to Models to Scale), Microsoft has developed its own family of large AI models, the Microsoft Turing models, which it has used to improve many different language understanding tasks across Bing, Office, Dynamics and other productivity products. Earlier this year, it also released to researchers the largest publicly available AI language model in the world, the Microsoft Turing model for natural language generation.

Turing Natural Language Generation (T-NLG) is a 17 billion parameter language model by Microsoft that outperforms the state of the art on many downstream NLP tasks. We present a demo of the model, including its freeform generation, question answering, and summarization capabilities, to academics for feedback and research purposes. 

Massive deep learning language models (LM), such as BERT and GPT-2, with billions of parameters learned from essentially all the text published on the internet, have improved the state of the art on nearly every downstream natural language processing (NLP) task, including question answering, conversational agents, and document understanding among others.

In this new approach, models learn by reading text and performing various prediction tasks (e.g., masking different words and predicting them based on the remaining text). This approach is also being used to analyze and interpret images and video.
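
As a rough illustration of this self-supervised prediction task, here is a toy sketch, not how large models are actually implemented: the tiny corpus, the context statistics and the predict_masked helper are all hypothetical. The point is only that the "label" is created by hiding a word and asking the model to recover it.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for "billions of pages of publicly available text".
corpus = [
    "the model reads text and predicts missing words",
    "the model learns language by predicting masked words",
    "large models learn grammar and context from raw text",
]

# Simple context statistics: (previous word, next word) -> counts of the middle word.
context_counts = defaultdict(Counter)
for sentence in corpus:
    tokens = sentence.split()
    for i in range(1, len(tokens) - 1):
        context_counts[(tokens[i - 1], tokens[i + 1])][tokens[i]] += 1

def predict_masked(tokens, mask_index):
    """Guess the masked token from the words immediately around it."""
    context = (tokens[mask_index - 1], tokens[mask_index + 1])
    candidates = context_counts.get(context)
    if not candidates:
        return "<unknown>"
    return candidates.most_common(1)[0][0]

# Self-supervised "training example": hide a word, then try to recover it.
tokens = "the model learns language by predicting masked words".split()
mask_index = 3  # hide "language"
print("masked:", " ".join("[MASK]" if i == mask_index else t for i, t in enumerate(tokens)))
print("prediction:", predict_masked(tokens, mask_index))
```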

Training massive AI models requires advanced supercomputing infrastructure, or clusters of state-of-the-art hardware connected by high-bandwidth networks. It also needs tools to train the models across these interconnected computers.

The supercomputer developed for OpenAI is a single system with more than 285,000 CPU cores, 10,000 GPUs and 400 gigabits per second of network connectivity for each GPU server. Compared with other machines listed on the TOP500 supercomputers in the world, it ranks in the top five, Microsoft says. Hosted in Azure, the supercomputer also benefits from all the capabilities of a robust modern cloud infrastructure, including rapid deployment, sustainable datacenters and access to Azure services.

https://blogs.microsoft.com/ai/openai-azure-supercomputer/

How is self-learning AI made?

Recently, progress has been made with ML agents that learn to perform tasks from sensory input by merging reinforcement learning (RL) algorithms with deep neural networks.

One shouldn’t take too seriously the stimulus-response architecture of deep learning neural networks, which exploits a mix of rote learning, reinforcement (reward/punishment) learning and Markov Decision Processes (MDPs).

An MDP is a Markov Reward Process with decisions.

It is described by a tuple

<S, A, P, R, π, v(s), q(s,a)>

where S is the set of possible states (a state may be a Go/chess board configuration), with the Markov property that the next state depends only on the current state; A is a finite set of actions the agent can take in state s; P is the state transition probability matrix; R is the reward the agent expects to receive in state s; the policy π maps each state s to the action a the agent must take; and the state-value function v(s) and the action-value function q(s,a) are the key value functions.
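
To make the tuple concrete, here is a minimal sketch of a two-state MDP (all states, transition probabilities and rewards are hypothetical illustrations, not taken from the article) that computes v(s) by value iteration and derives a greedy policy π from q(s, a):

```python
# Minimal, hypothetical MDP sketch: two states, two actions, value iteration for v(s).
# All numbers are illustrative assumptions.

states = ["s0", "s1"]
actions = ["a0", "a1"]

# P[s][a] -> list of (next_state, probability); R[s][a] -> expected immediate reward.
P = {
    "s0": {"a0": [("s0", 0.7), ("s1", 0.3)], "a1": [("s1", 1.0)]},
    "s1": {"a0": [("s0", 1.0)],              "a1": [("s1", 0.6), ("s0", 0.4)]},
}
R = {
    "s0": {"a0": 1.0, "a1": 0.0},
    "s1": {"a0": 2.0, "a1": 5.0},
}
gamma = 0.9  # discount factor

# Value iteration: v(s) = max_a [ R(s,a) + gamma * sum_s' P(s'|s,a) * v(s') ]
v = {s: 0.0 for s in states}
for _ in range(200):
    v = {
        s: max(
            R[s][a] + gamma * sum(p * v[s2] for s2, p in P[s][a])
            for a in actions
        )
        for s in states
    }

# Greedy policy pi(s) derived from q(s,a) = R(s,a) + gamma * sum_s' P(s'|s,a) * v(s')
q = {
    s: {a: R[s][a] + gamma * sum(p * v[s2] for s2, p in P[s][a]) for a in actions}
    for s in states
}
pi = {s: max(q[s], key=q[s].get) for s in states}
print("v:", v)
print("pi:", pi)
```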

The most visible achievements in deep self-learning have come from deep reinforcement learning applications, such as Google DeepMind's AlphaGo or DeepMind's AI agents teaching themselves to walk, run and overcome obstacles.

Deep Reinforcement Learning can be seen as building an algorithm (or an AI agent) that learns directly from interaction with an environment (the real world, a computer game, a simulation or a board game like Go or chess).

Like a human mind, such a quasi-AI Agent is supposed to learn from the effects of its Actions, rather than from being explicitly taught.

In Deep Reinforcement Learning such an Agent is represented by a neural network. The neural network interacts directly with the environment: it observes the current State of the Environment and decides which Action to take (e.g. move left, right, etc.) on the basis of the current State and past experience. For the Action taken, the AI Agent receives a Reward, whose amount indicates the quality of that Action with regard to solving the given problem (e.g. learning how to walk). The objective of the Agent is to learn to take Actions, in any given circumstances, that maximize the accumulated Reward over time.
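
A minimal sketch of this observe-act-reward loop, assuming a toy "corridor" environment and a tabular Q-learning agent in place of a neural network (the environment, hyperparameters and the step helper are all hypothetical):

```python
import random

# Toy, self-contained sketch of the observe -> act -> reward loop described above.
N_STATES = 5          # corridor cells 0..4; reaching cell 4 gives the reward
ACTIONS = [-1, +1]    # move left or right
EPSILON, ALPHA, GAMMA = 0.1, 0.5, 0.9

# Tabular stand-in for the neural network that estimates the value of (State, Action).
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Environment dynamics: new state, reward, and whether the episode ended."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit past experience, sometimes explore.
        if random.random() < EPSILON:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        next_state, reward, done = step(state, action)
        # Q-learning update: nudge the estimate toward reward + discounted future value.
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
        state = next_state

print("Learned preference in cell 0 (should favour +1, i.e. moving right):")
print({a: round(Q[(0, a)], 3) for a in ACTIONS})
```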

Deep self-learning is impossible within the primitive stimulus-response paradigm as the dominant conceptual framework, while neuroscience is undergoing a dramatic shift in perspective from input/output to output/input architectures.

Watching a paradigm shift in neuroscience

Self Learning AI-Agents Part I: Markov Decision Processes

It is like the MERLIN architecture, which learns new policies by playing back experiences from a memory system.

We develop a model, the Memory, RL, and Inference Network (MERLIN), in which memory formation is guided by a process of predictive modeling. MERLIN facilitates the solution of tasks in 3D virtual reality environments for which partial observability is severe and memories must be maintained over long durations. Our model demonstrates a single learning agent architecture that can solve canonical behavioural tasks in psychology and neurobiology without strong simplifying assumptions about the dimensionality of sensory input or the duration of experiences.

Unsupervised Predictive Memory in a Goal-Directed Agent

There are misunderstandings that need to be recognized to make further advances in self-learning and general intelligence.

The first is that the brain has a mental, top-down architecture where it generates an internal mental model to perform predictions, fed by sensory bottom-up data inputs.

Top-down inputs are internal models that influence prediction from bottom-up inputs (from the real world).
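
A schematic sketch of this idea, assuming a predictive-coding-style update in which the top-down internal model is corrected only by the bottom-up prediction error (the signal, noise level and learning rate are hypothetical illustrations):

```python
import random

# Schematic, hypothetical sketch of a top-down/bottom-up loop: an internal model
# predicts the sensory input, and only the prediction error flows upward.
LEARNING_RATE = 0.1
true_signal = 5.0          # stand-in for the real-world quantity being sensed
internal_model = 0.0       # top-down prediction maintained by the agent

for step in range(50):
    sensory_input = true_signal + random.gauss(0.0, 0.5)   # noisy bottom-up data
    prediction = internal_model                             # top-down expectation
    prediction_error = sensory_input - prediction           # what bottom-up data adds
    internal_model += LEARNING_RATE * prediction_error      # update the internal model

print(f"internal model after 50 steps: {internal_model:.2f} (true signal: {true_signal})")
```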

The second is that intelligence is related to consciousness. And consciousness is the mental model of ourselves (i.e. the model of the self) complemented with the internal contextual model.

In short, top-down mental models that override any bottom-up real perception are the mechanism for avoiding the well-known poverty of the Deep Reinforcement Learning of Self Learning AI-Agents.

AI as Pansophic Technology/Algorithms/Networks/Platforms/Machines/Systems/Applications

https://www.dhirubhai.net/pulse/how-build-real-ai-rai-from-deep-learning-big-data-smart-abdoullaev/

How can AI with General Intelligence be made?

AI attracts all the mass media attention, especially after a number of sensational techno-political statements, such as that of the Russian President: "Artificial intelligence is the future, ... for all humankind... Whoever becomes the leader in this sphere will become the ruler of the world".

Creating AI, and defining its nature, approach, applications, techniques, algorithms and models, looks like the hardest task ever set before human minds.

We could outline some big points and guidelines, keeping in mind that True AI involves world data, human intelligence/knowledge, machine learning, the internet/web, cloud computing, the internet of things, and all exponential technologies.

For want of a unified paradigm, people get lost in the number of approaches, from cybernetic models and sub-symbolic and embodied models to symbolic and statistical learning models.

AI today is a combination of big data, machine learning, cloud computing and the internet of things. Because of this, we are still at the beginning of the quest to build machines that can reason, learn, and act intelligently.

Most advances are in machine learning, neural networks, and robots, such as social bots and chatbots, as well as military AI.

What matters today is computer vision and face recognition; machine learning; robots; voice assistants; and weaponized AI and the robotization of the armed forces.

A military AI arms race is a competition to have the military forces equipped with the best AI, incorporating it into uninhabited aerial, naval, and undersea vehicles.

For example, China pursues a strategic policy of 'military-civil fusion' on AI for global technological supremacy.

AI and the next revolution in military defense

Neural networks and deep learning are also driving the AI industry’s progress.

To create AI, one needs to answer three fundamental questions:

What is Intelligence?

What is AI?

How are Natural Intelligence and AI related?

It is critical to see the difference between Subjective AI and Objective AI:

Weak/Narrow AI (ANI), mimicking human brains,

Strong/General AI (AGI), simulating human minds,

Artificial Super Intelligence (ASI), modeling and simulating the world itself.

To see the differences between real AI and the fake AI of machine learning and deep learning, one has to adopt a hierarchical model of intelligence.

It is all a matter of grade, ranking or level of intelligence, like primary, secondary and post-secondary or tertiary education, graduate, post-graduate and polymath. Or it is like deep learning utilizing a hierarchy of artificial neural network levels to perform machine learning.

The equivalent classes of AI are graded as follows:

Machine Learning is about computational statistics and statistical learning theory, promoted as computer programs that can automatically adapt to new data without human interference: the algorithm adjusts its parameters automatically to capture new patterns. Its programming code creates a model that identifies the data and builds predictions around the data it identifies. The model uses parameters built into the algorithm to form patterns for its decision-making process. When new or additional data becomes available, the algorithm automatically adjusts the parameters to check for a pattern change, if any, while the model itself shouldn’t change. Machine learning is used in different sectors for various reasons.
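
A minimal sketch of that idea, assuming a simple linear model y ≈ w·x + b trained by stochastic gradient descent on synthetic data (the data and the update helper are hypothetical): the model's form stays the same, while its parameters adjust automatically as new data arrives.

```python
# Hypothetical sketch: the model's form (y ≈ w*x + b) stays fixed, while its
# parameters w and b adjust automatically as new data arrives. Data is synthetic.

def update(w, b, x, y, lr=0.01):
    """One stochastic-gradient step on squared error for a single (x, y) example."""
    error = (w * x + b) - y
    return w - lr * error * x, b - lr * error

w, b = 0.0, 0.0

# Initial batch of data following roughly y = 2x + 1.
initial_data = [(x, 2.0 * x + 1.0) for x in range(10)]
for _ in range(200):
    for x, y in initial_data:
        w, b = update(w, b, x, y)
print(f"after initial data:    w={w:.2f}, b={b:.2f}")

# New data arrives with a slightly different pattern (y = 2.5x + 1); the parameters
# adapt, but the model itself (a straight line) does not change.
new_data = [(x, 2.5 * x + 1.0) for x in range(10)]
for _ in range(200):
    for x, y in new_data:
        w, b = update(w, b, x, y)
print(f"after additional data: w={w:.2f}, b={b:.2f}")
```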

Deep Learning, also known as deep neural learning or deep neural networks, consists of self-adaptive algorithms that improve their analysis and pattern recognition with experience or with new data. It imitates the workings of the human brain in processing data and creating patterns for decision making. Deep learning learns from vast amounts of unstructured data that would normally take humans ages to understand and process. Such big data is drawn from sources like social media, internet search engines, e-commerce or e-government platforms, online cinemas, and fintech and cloud computing applications.
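
A minimal sketch of the same point one level deeper, assuming a tiny two-layer network trained with backpropagation on the XOR pattern (the architecture, learning rate and data are illustrative assumptions), which a single linear model cannot learn:

```python
import numpy as np

# Hypothetical sketch of a "deep" model: a two-layer network trained by gradient
# descent to learn XOR, a pattern no single linear model can capture.
rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Parameters of a 2 -> 8 -> 1 network with sigmoid activations.
W1, b1 = rng.normal(0, 1, (2, 8)), np.zeros((1, 8))
W2, b2 = rng.normal(0, 1, (8, 1)), np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(10000):
    # Forward pass: layer by layer, the network builds its own internal features.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass: propagate the prediction error to every parameter.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= 0.5 * (h.T @ d_out)
    b2 -= 0.5 * d_out.sum(axis=0, keepdims=True)
    W1 -= 0.5 * (X.T @ d_h)
    b1 -= 0.5 * d_h.sum(axis=0, keepdims=True)

print(np.round(out, 2).ravel())  # should approach [0, 1, 1, 0]
```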

Weak AI / narrow AI (ANI) – non-sentient automated intelligence, focused on a narrow task in specific domains.

Strong AI / artificial general intelligence (AGI) – real AI that can be applied to any world problem.

Superintelligence – global AI far surpassing all human intelligence due to recursive self-improvement.

The AI world has the following evolutionary taxonomy, evolutionary systematics or Darwinian-like hierarchical classification, as a Ladder of AI Being:

Machine Learning and Deep Learning

Artificial Narrow Intelligence (ANI), involving Machine Learning models and Deep Neural Networks,

Artificial General Intelligence (AGI), Strong AI, Human Intelligence

Global AI, Encyclopedic Intelligence, the Global Brain, I-Internet

Hybrid Super Intelligence (HSI), integrating Digital Superintelligence with Human Intelligence.

Kiryl Persianov's answer to What are the types of artificial intelligence?

Kiryl Persianov's answer to What are differences between artificial Intelligence, Machine Learning and Deep Learning?

The AI value chain — who will make money with AI? The companies noted are representative of larger players in each category but in no way is this list intended to be comprehensive or predictive. © Best Practice AI Ltd

There are (1) AI chip and hardware makers who are looking to power all the AI applications that will be woven into the fabric of organisations big and small globally;

(2) the cloud platform and infrastructure providers who will host the AI applications;

(3) the AI algorithmic and cognitive services building block makers who provide the vision recognition, speech and deep machine learning predictive models to power AI applications;

(4) enterprise solution providers whose software is used in customer, HR, and asset management and planning applications;

(5) industry vertical solution providers who are looking to use AI to power companies across sectors from healthcare to finance;

(6) corporate takers of AI who are looking to increase revenues, drive efficiencies and deepen their insights; and finally

(7) nation states who are looking to embed AI into their national strategies and become AI enabled countries.

The Secrets of Successful AI Startups. Who’s Making Money in AI Part II?

https://www.quora.com/Why-are-there-so-many-AI-startups-Are-they-legit-Are-their-products-good/answer/Kiryl-Persianov

