AI & ML Jargon

At the beginning of 2022 I joined Cerebras and was introduced to the new field of Artificial Intelligence (AI). The purpose of this article is to capture the basic terminology used in the field and to ignite curiosity to explore each topic further. It is written for someone who is new to the field and wants to learn the fundamentals in a very simple way. The Cerebras website is very content-rich and generously shares knowledge of the field around their cutting-edge product line (software, hardware, and solutions for a variety of real-life applications).

AI describes machines that learn and solve problems the way humans do: understanding speech and performing tasks (Alexa, Siri, etc.), recognizing a user's behavior patterns and suggesting relevant choices, or, as in autonomous vehicles, self-driving safely on a busy street. Mimicking human-like behavior (reasoning, planning, learning, problem solving, decision making) while using all the information available to the machine requires a wide range of field knowledge, including math, statistics, computer science, psychology, linguistics, and more. Natural language processing (NLP) sits at the intersection of linguistics and computer science; it allows AI to understand, analyze, and process natural language data. NLP enables a machine to quickly make sense of unstructured text from web data, emails, media, texts, social media, instant messages, etc., and enables commercial enterprises and government agencies to identify trends, sentiment, correlations, and key ideas.
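As a toy illustration of how NLP turns unstructured text into a usable signal such as sentiment: the word lists and scores below are made up for this sketch (real NLP models learn such associations from data rather than using hand-written lists).

```python
# Toy sentiment scorer: counts positive vs. negative keywords in raw text.
# The word lists are illustrative only; models like BERT learn these
# associations from large corpora instead.
POSITIVE = {"great", "love", "excellent", "fast"}
NEGATIVE = {"slow", "broken", "hate", "poor"}

def sentiment(text: str) -> str:
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("I love this fast product"))   # positive
print(sentiment("support was slow and poor"))  # negative
```

Run over millions of messages, even a crude scorer like this hints at how enterprises extract trend and sentiment signals from unstructured text.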

As part of AI, the machine needs to make decisions based on all the data it acquires in order to accomplish its intended goals. It uses algorithms and mathematical models to fit a model to the data. The fitted model is then evaluated on validation or test data sets to predict behavior and to check that it has not overfit the training data. This is how machine learning (ML) algorithms build a model from training/sample data to make predictions or decisions without being explicitly programmed to do so. As businesses and technology collect more data, Big Data is generated, which in turn drives more advanced learning algorithms. For example, Bidirectional Encoder Representations from Transformers (BERT), the deep learning NLP model, has gone beyond simple text analysis to more specialized variants such as BioBERT, FinBERT, SciBERT, ClinicalBERT, GilBERT, DNABERT, PatentBERT, mBERT, etc., which support data/text analysis in the biomedical, finance, science, clinical, geological, genomic, patent, and multilingual fields respectively.
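The fit-then-validate workflow described above can be sketched in a few lines; this is a minimal example with synthetic data and a simple linear model, not any particular library's API.

```python
import random

# Synthetic data: y = 2x + 1 plus a little noise (illustrative only).
random.seed(0)
data = [(x, 2.0 * x + 1.0 + random.gauss(0, 0.1)) for x in range(20)]
train, test = data[:15], data[15:]   # hold out a test split

# Closed-form least-squares fit of y = a*x + b on the training split.
n = len(train)
sx = sum(x for x, _ in train); sy = sum(y for _, y in train)
sxx = sum(x * x for x, _ in train); sxy = sum(x * y for x, y in train)
a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
b = (sy - a * sx) / n

def mse(split):
    return sum((y - (a * x + b)) ** 2 for x, y in split) / len(split)

# Comparable train and test error suggests the model has not overfit.
print(f"train MSE={mse(train):.4f}  test MSE={mse(test):.4f}")
```

If test error were much larger than training error, the model would have memorized the training data rather than learned the underlying pattern, which is exactly what the validation step is there to catch.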

More and more enterprises and individuals are using the Cloud to store data and draw compute power from off-site decentralized facilities, along with SaaS (Software as a Service: using online software applications by subscription rather than owning, installing, and running them locally). Platform as a Service (PaaS), Infrastructure as a Service (IaaS), and Function as a Service (FaaS) are other Cloud service models, through which one can rent a full application platform, servers/storage, and serverless functions respectively.

A container packages an application together with everything it needs to run; all it requires from the outside is a host to run on. This is a way of partitioning a machine or server into separate user-space environments. Each environment runs only one application and is isolated from the other applications/partitioned sections on the machine, yet all containers share the computer hardware through the machine's kernel. A Virtual Machine (VM), by contrast, is a software-based computer that resides on another computer's operating system. The Cloud uses VMs for testing, backup, and running SaaS applications.

Machines also perform Data Mining: analytics that extract patterns and knowledge from large data sets. Neural network models solve larger, more complex AI problems, making more advanced and granular decisions through interconnected nodes more efficiently than linear algorithms can.
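A single node of a neural network is simple: a weighted sum of its inputs passed through a nonlinear activation. The weights and inputs below are arbitrary illustrative values; a real network has many such nodes whose weights are learned from data.

```python
import math

# One node ("neuron") of a neural network: weighted sum + nonlinearity.
# Networks stack many of these into layers; training adjusts the weights.
def neuron(inputs, weights, bias):
    z = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))   # sigmoid activation, output in (0, 1)

out = neuron([0.5, -1.0, 0.25], weights=[0.8, 0.2, -0.5], bias=0.1)
print(f"node output: {out:.3f}")
```

The nonlinearity is what lets networks of these nodes model patterns a purely linear algorithm cannot.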

Machine learning through many deep hidden layers of a neural network is called Deep Learning (DL). DL trains these networks on vast amounts of data to perform complicated tasks that are difficult to describe explicitly. Generative Pre-trained Transformer 3 (GPT-3), a language prediction model released by OpenAI (an AI research lab) as recently as May 2020, uses deep learning to produce human-like text with an autoregressive language model.
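"Autoregressive" means the model predicts each next token from the tokens generated so far. A toy bigram version of that idea is sketched below (made-up corpus; real models like GPT-3 use billions of learned parameters, not a lookup table of word counts).

```python
from collections import defaultdict

# Toy autoregressive text generator: predict each next word from the
# previous word, using counts from a tiny illustrative corpus.
corpus = "the cat sat on the mat the cat sat".split()
next_word = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    next_word[prev].append(nxt)

def generate(start, length):
    words = [start]
    for _ in range(length):
        candidates = next_word.get(words[-1])
        if not candidates:
            break
        # Greedy choice: the most frequent successor of the last word.
        words.append(max(set(candidates), key=candidates.count))
    return " ".join(words)

print(generate("the", 4))  # the cat sat on the
```

Each generated word is fed back in as context for the next prediction, which is the autoregressive loop in miniature.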

Most of us are familiar with the Central Processing Unit (CPU) and the Graphics Processing Unit (GPU), which handle computing through serial and parallel processing respectively. GPUs have dramatically helped many industries, such as automotive, robotics, and healthcare & life sciences, by accelerating applications. A Deep Learning Processor (DLP), or deep learning accelerator, is designed specifically for deep learning algorithms. Nvidia GPUs process matrix multiplication using Tensor Cores. Tensor Processing Units (TPUs) are Google's custom-developed ASICs to accelerate ML workloads; they are custom-built to run Google's TensorFlow framework. TPUs, like the Neural Processing Units (NPUs) in Huawei cellphones, are types of DLPs. High data-level parallelism and large on-chip buffers/memory make DLPs more efficient at running DL algorithms than FPGAs (Field-Programmable Gate Arrays), CPUs, and GPUs. AI accelerators are computer hardware systems designed to accelerate AI and ML applications.
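Matrix multiplication, the workload Tensor Cores and TPUs accelerate, is just many independent multiply-accumulate operations, which is why it parallelizes so well. Here is the naive serial version for reference (illustrative only; accelerators compute thousands of these inner products at once in hardware).

```python
# Naive matrix multiply: C[i][j] = sum over k of A[i][k] * B[k][j].
# Every output element C[i][j] is independent of the others, so a GPU or
# TPU can compute them all in parallel; a CPU loops through them serially.
def matmul(A, B):
    rows, inner, cols = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(inner))
             for j in range(cols)]
            for i in range(rows)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))  # [[19, 22], [43, 50]]
```

Deep learning training and inference are dominated by exactly this operation at enormous scale, which is what justifies building dedicated silicon for it.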

NLP models had grown to 17 billion parameters by 2020, and the continuous growth of network size demands ever more memory and time for training. GPT-3 requires close to 2.8 TB of memory to store its weights and states; an individual GPU or compute unit is not enough to handle GPT-3-scale models efficiently. A cluster of as many as 1,000 GPUs is required just to bring distributed training time down to a few days; a single compute unit would need years of training time and enormous memory. Weight streaming allows cluster size to grow independently of model size. The Wafer-Scale Engine is a very innovative solution with many cores, large on-chip SRAM, and high aggregate on-wafer network bandwidth; it executes DL compute on BERT models far more efficiently than standard GPUs. The Cerebras CS-2 is one such system, with massive computation, storage, and AI processing capabilities. One can build a cluster of such systems to handle even more complex training runs and bigger models at very high speed for any AI application.
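The 2.8 TB figure is consistent with a simple back-of-the-envelope estimate. The 16 bytes-per-parameter breakdown below is a common mixed-precision training assumption, not a published Cerebras number.

```python
# Rough training-memory estimate for GPT-3 (175 billion parameters).
# Assumed bytes per parameter under mixed-precision Adam-style training:
#   2 (fp16 weight) + 2 (fp16 gradient) + 4 (fp32 master weight)
#   + 4 + 4 (fp32 optimizer momentum and variance) = 16 bytes.
params = 175e9
bytes_per_param = 16
total_tb = params * bytes_per_param / 1e12
print(f"~{total_tb:.1f} TB of weights and optimizer state")  # ~2.8 TB
```

Since a single GPU carries on the order of tens of gigabytes of memory, the arithmetic alone shows why GPT-3-scale training must be distributed across many devices or handled by approaches like weight streaming.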
