Types of LLMs (Large Language Models)

Large Language Models (LLMs) have revolutionized natural language processing, powering a wide range of AI applications. One broad way to categorize them is by access: open-source LLMs, like BERT and GPT-J, offer transparency and customization, allowing developers to fine-tune them for specific tasks, while closed-source LLMs, such as GPT-3 and Claude, are proprietary systems that offer powerful capabilities but limited access to their inner workings. Another useful way to categorize them, used below, is by purpose and training data. Understanding these distinctions is crucial for organizations and developers when selecting the right LLM for a project.

General-purpose LLMs: These models are trained on large amounts of general web text and aim to be useful for a wide range of tasks. Examples include:

• GPT-3 — Developed by OpenAI, GPT-3 has 175 billion parameters and can generate text, answer questions, and perform many other tasks.

• BERT — Developed by Google, BERT is based on the Transformer encoder architecture and is widely used for downstream NLP tasks.
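
As a quick illustration of how a general-purpose model is used in practice, here is a minimal sketch that queries BERT through the Hugging Face transformers library (assumed installed, along with a backend such as PyTorch). BERT is a masked language model, so the fill-mask pipeline is the natural entry point; GPT-3, being closed-source, is reached through a hosted API instead.

```python
# Minimal sketch: querying BERT through the Hugging Face transformers library.
# Assumes `transformers` and a backend such as PyTorch are installed.
from transformers import pipeline

# BERT is a masked language model, so we ask it to fill in a blanked-out token.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

for prediction in fill_mask("The capital of France is [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 3))
```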


Domain-specific LLMs: These models are trained on text from a specific domain, such as science, medicine, or law, and tend to perform better than general-purpose models on tasks within that domain. Examples are:

• BioBERT — Trained on biomedical literature and used for biomedical NLP tasks.

• SciBERT — Trained on scientific text and effective for scientific information extraction and question answering.
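
A sketch of loading a domain-specific encoder as a feature extractor or as the starting point for fine-tuning; the checkpoint name below is the SciBERT identifier commonly published on the Hugging Face Hub and is an assumption here, so verify it before use.

```python
# Sketch: loading a domain-specific encoder (SciBERT) as a feature extractor.
# The Hub identifier "allenai/scibert_scivocab_uncased" is assumed; verify it exists.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("allenai/scibert_scivocab_uncased")
model = AutoModel.from_pretrained("allenai/scibert_scivocab_uncased")

# Embed a scientific sentence; the token vectors can feed a task-specific head
# (e.g. for information extraction) during fine-tuning.
inputs = tokenizer("The enzyme catalyzes the hydrolysis of ATP.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, tokens, hidden_size)
```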


Multilingual LLMs: These models are trained on text from many different languages, so a single model can handle inputs in multiple languages. Examples are:

• XLM — Developed by Facebook, XLM supports 100 languages and is designed for cross-lingual tasks.

• Multilingual BERT — Supports 104 languages and can perform tasks like sentiment analysis and named entity recognition.
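
A sketch showing one multilingual checkpoint handling inputs in different languages; "bert-base-multilingual-cased" is the standard multilingual BERT checkpoint on the Hugging Face Hub.

```python
# Sketch: one multilingual BERT checkpoint serving prompts in different languages.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-multilingual-cased")

# The same model completes masked sentences in English and in French.
print(fill_mask("Paris is the capital of [MASK].")[0]["token_str"])
print(fill_mask("Paris est la capitale de la [MASK].")[0]["token_str"])
```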


Few-shot LLMs: These models are designed to perform well with only a small number of labeled examples, whether supplied directly in the prompt or used for fine-tuning. Examples are:

• GPT-3 — Can perform many tasks when shown only a few examples in the prompt, with no parameter updates.

• T5 — Developed by Google, T5 can achieve strong performance when fine-tuned on as few as 10 training examples.
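
To make the few-shot idea concrete, here is a sketch of few-shot prompting in the style GPT-3 popularized: a handful of labeled examples are placed directly in the prompt rather than used for gradient-based fine-tuning. The sentiment-classification framing and the example texts are illustrative only; sending the finished prompt to a hosted or local model is left out.

```python
# Sketch: building a few-shot prompt (in-context learning, as popularized by GPT-3).
# The examples and labels below are illustrative; the prompt would be sent to an LLM.
EXAMPLES = [
    ("The movie was fantastic.", "positive"),
    ("I wasted two hours of my life.", "negative"),
    ("A solid, if unremarkable, sequel.", "neutral"),
]

def build_few_shot_prompt(query: str) -> str:
    """Assemble an instruction, a few labeled examples, and the new query."""
    lines = ["Classify the sentiment of each review."]
    for text, label in EXAMPLES:
        lines.append(f"Review: {text}\nSentiment: {label}")
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

print(build_few_shot_prompt("The soundtrack alone is worth the ticket."))
```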


Task-specific LLMs: These models are tailored to a specific NLP task, such as summarization, question answering, or translation. Examples are:

• BART — Facebook’s sequence-to-sequence model for text generation tasks like summarization and question generation.

• ALBERT — Google’s lighter, parameter-efficient variant of BERT, used for question answering and sentence classification.
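
A sketch of a task-specific model in use: summarization with a BART checkpoint fine-tuned for that task. The checkpoint name "facebook/bart-large-cnn" is the widely used summarization model on the Hugging Face Hub and is an assumption here, not something from the article.

```python
# Sketch: task-specific summarization with a BART checkpoint fine-tuned on news data.
# The Hub identifier "facebook/bart-large-cnn" is assumed; verify before use.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

article = (
    "Large Language Models are trained on massive text corpora and can be adapted to "
    "many tasks. Some are general-purpose, while others are specialized for a domain, "
    "a set of languages, or a single task such as summarization."
)
print(summarizer(article, max_length=40, min_length=10)[0]["summary_text"])
```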

