The Intergalactic Guide to LLM Parameter Sizes

Which AI Brain is Right for Your Mission? This is the absurdly over-complicated field guide that nobody asked for but everyone desperately needs

Recently I found myself on ollama.com looking at the options to run DeepSeek-R1 locally, when I encountered the digital equivalent of decision paralysis. The drop-down menu presented me with SEVEN different sizes of essentially the same model, ranging from a modest 1.5B all the way up to a completely ridiculous 671B parameters. To be sure, this isn't just a DeepSeek problem – you'll find the same array of options with Google's new Gemma 3, Microsoft's Phi, Meta's big zoo of Llamas, and virtually every other model family on the market. When it comes to parameter sizes, selecting the right one felt less like choosing an AI model and more like being asked to specify the exact molecular weight of my next meal.
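For the curious, here's roughly what clicking through that dropdown boils down to in code. This is a minimal sketch, assuming Ollama is running locally on its default port and that the DeepSeek-R1 size tags listed below still exist in the Ollama library (the exact tags and request fields may vary between Ollama versions, so check ollama.com):

```python
import requests

# Size tags shown for illustration – verify the current list on ollama.com
SIZES = ["1.5b", "7b", "8b", "14b", "32b", "70b", "671b"]

OLLAMA = "http://localhost:11434"  # Ollama's default local API endpoint

def pull(tag: str) -> None:
    """Download one variant of the model (blocks until the pull finishes)."""
    resp = requests.post(
        f"{OLLAMA}/api/pull",
        json={"model": f"deepseek-r1:{tag}", "stream": False},
        timeout=None,
    )
    resp.raise_for_status()

def ask(tag: str, prompt: str) -> str:
    """Send a single prompt to the chosen variant and return its reply."""
    resp = requests.post(
        f"{OLLAMA}/api/generate",
        json={"model": f"deepseek-r1:{tag}", "prompt": prompt, "stream": False},
        timeout=600,
    )
    resp.raise_for_status()
    return resp.json()["response"]

if __name__ == "__main__":
    pull("7b")  # a sane default, as the rest of this guide will argue
    print(ask("7b", "Explain what a model parameter is, in one sentence."))
```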

The storage requirements alone told a story of madness – everything from a reasonable 1GB to a staggering 404GB. That's not a download, it's a commitment... a relationship. Who has that kind of disk space to casually dedicate to a single model? And more importantly, who needs that many options of the same fundamental architecture?
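The back-of-the-envelope math behind those numbers is less mysterious than it looks: parameter count times bits per parameter, divided by eight. Most of these downloads are roughly 4-bit quantized, so call it half a byte per parameter plus a little overhead. A quick sketch, with the bits-per-parameter figure as an assumption rather than an official spec:

```python
def approx_disk_gb(params_billions: float, bits_per_param: float = 4.8) -> float:
    """Rough download size for a quantized model: params * bits / 8, in GB."""
    return params_billions * 1e9 * bits_per_param / 8 / 1e9

for size in (1.5, 7, 14, 70, 671):
    print(f"{size:>6}B parameters -> ~{approx_disk_gb(size):.0f} GB on disk")

# Roughly 1 GB for 1.5B and ~400 GB for 671B – close to the sizes in that dropdown
```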

After looking at my screen for longer than I'd like to admit, I realized what the world desperately needs: a straightforward, no-nonsense guide to this parameter size circus. In fact, I'm probably not the right person to write it, and I certainly don't claim to be the absolute authority on the topic, but somebody has to, so let's give it a go and I'll appreciate your comments. So here's my attempt to decode this numerical madness without requiring a degree in computer science or making you want to abandon technology altogether. Just don't take it too seriously.

Tiny Models (1B-3B parameters): The Pocket Calculators

Size on disk: ~1-2GB

Hardware needs: Your grandma's laptop could run this

Power consumption: A hamster on a wheel could generate enough electricity

Actual usefulness: More than you'd expect, less than the marketing suggests

These models are like that pocket knife with only three tools—surprisingly handy despite obvious limitations. They're good for:

  • Figuring out if text is happy or sad (groundbreaking, I know)
  • Summarizing news articles about as well as an eager intern
  • Running on devices that would choke on anything larger
  • Tasks where being 70% right is good enough

Just don't ask them to understand jokes, follow complex instructions, or remember what they said three messages ago. They've got the memory of a goldfish and the creativity of a tax form.

Small Models (4B-8B parameters): The Swiss Army Knives

Size on disk: ~3-5GB

Hardware needs: Any laptop made after the invention of TikTok

Power consumption: Roughly equivalent to a desk fan

Actual usefulness: The sweet spot for most people who don't work at an AI lab

The 7B size has become the unofficial "we made it just big enough to be useful" standard. These models can:

  • Generate content that doesn't immediately sound like it was written by a robot
  • Help with coding in a way that won't make senior developers cry
  • Follow basic instructions without immediately hallucinating
  • Remember context for more than two sentences

This is the realm where most people should start. It's like buying the mid-tier iPhone instead of selling your kidney for the Pro Max.
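If you want to find out whether a 7B model is good enough for your use case, it costs you nothing but a few gigabytes to try. Here's a minimal multi-turn sketch against Ollama's local chat API, assuming the server is running and a 7B-class model has already been pulled (the model name below is just an example):

```python
import requests

MODEL = "deepseek-r1:7b"  # example tag – any 7B-class model you've pulled will do

def chat(messages: list[dict]) -> str:
    """Send the full conversation so far and return the model's reply."""
    resp = requests.post(
        "http://localhost:11434/api/chat",
        json={"model": MODEL, "messages": messages, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["message"]["content"]

history = [{"role": "user",
            "content": "Write a Python function that reverses a string, with a docstring."}]
reply = chat(history)
print(reply)

# Keep the history around – this is the size class where models actually
# remember what was said more than two sentences ago.
history += [{"role": "assistant", "content": reply},
            {"role": "user", "content": "Now add a unit test for it."}]
print(chat(history))
```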

Medium Models (10B-20B parameters): The Desktop Computers

Size on disk: ~8-15GB

Hardware needs: Something with a GPU that doesn't catch fire

Power consumption: Your electricity bill will notice, but not scream

Actual usefulness: When you need to impress people but can't afford the big guns

These models occupy the awkward teenage phase of AI—not small enough to run easily everywhere, not large enough to blow minds. They offer:

  • Noticeably better reasoning than their smaller siblings
  • The ability to follow complex, multi-step instructions without getting lost
  • Less "creative" fabrication of facts
  • Content generation that doesn't start repeating itself after two paragraphs

The performance jump from 7B to 13B is often more noticeable than from 13B to 30B, making this a surprisingly practical choice if you can handle the hardware requirements.

Large Models (30B-70B parameters): The Workstations

Size on disk: ~20-40GB

Hardware needs: Gaming PC or better, preferably with an M3/M4 chip from Apple, or an NVIDIA card that cost more than your first car

Power consumption: Comparable to a small heater

Actual usefulness: The point where people start saying "wow" instead of "hmm"

Now we're talking serious horsepower—these models actually deliver on many of the promises made in AI marketing materials:

  • Complex reasoning that connects dots humans might miss
  • Creative content that doesn't sound like it was assembled from pre-existing templates
  • Code generation that includes comments explaining WHY, not just WHAT
  • The ability to maintain context over longer conversations

This is where the basic models of most commercial services like ChatGPT and Claude operate. There's a reason these aren't running on your phone.
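If you're wondering whether one of these will actually fit on your machine, the same arithmetic as the disk-size estimate applies, only against GPU memory (or unified memory on Apple Silicon) instead of disk. A rough sketch, where the ~20% overhead for the KV cache and runtime is a ballpark assumption rather than a measured figure:

```python
def fits(params_billions: float, memory_gb: float,
         bits_per_param: float = 4.8, overhead: float = 1.2) -> bool:
    """Very rough check: quantized weight size plus ~20% for KV cache and runtime."""
    needed_gb = params_billions * bits_per_param / 8 * overhead
    return needed_gb <= memory_gb

for size in (7, 13, 34, 70):
    print(f"{size:>3}B on a 24 GB card: {'fits' if fits(size, 24) else 'does not fit'}")

# 7B and 13B fit comfortably, 34B just misses, and 70B wants a bigger card
# (or a more aggressive quantization than assumed here)
```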

Enormous Models (100B-200B parameters): The Server Racks

Size on disk: ~60-150GB

Hardware needs: Multiple high-end GPUs in a dedicated setup

Power consumption: Hope you've got solar panels

Actual usefulness: Overkill for 95% of use cases, but that remaining 5% is impressive

These behemoths are the AI equivalent of bringing a tank to a bicycle race—complete overkill for most situations, but undeniably powerful:

  • Reasoning capabilities that can tackle graduate-level problems
  • Nuanced understanding of complex, ambiguous instructions
  • Significantly fewer hallucinations on specialized knowledge
  • Content generation with consistent style, voice, and factual accuracy

Unless you're doing cutting-edge research or running a commercial service, you probably don't need this. It's like buying a commercial espresso machine for your home when you drink coffee twice a month. Run these on GPU-as-a-service infrastructure like replicate.com or together.ai, or use the proprietary models via API.
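If you do want to poke at the big ones, renting beats buying. Many hosted providers expose an OpenAI-compatible endpoint, so a sketch like the one below only needs a base URL and API key swapped in. The together.ai URL and the model identifier here are examples and may have changed, so check your provider's documentation:

```python
import os
from openai import OpenAI  # pip install openai

# Example values – substitute your provider's base URL, key, and model id
client = OpenAI(
    base_url="https://api.together.xyz/v1",
    api_key=os.environ["TOGETHER_API_KEY"],
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",  # illustrative model id; check the catalogue
    messages=[{"role": "user",
               "content": "Summarize why I should not run you on my laptop."}],
)
print(response.choices[0].message.content)
```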

Apocalypse-Inducing Models (500B+ parameters): The Supercomputers

Size on disk: 300GB+ (hope you've got fiber internet)

Hardware needs: Data center infrastructure, cooling systems, possibly a nuclear power plant

Power consumption: Comparable to a small town (I'm kidding)

Actual usefulness: Bragging rights and/or ending humanity

That 671B parameter model requiring 404GB of disk space is pure madness. These monsters are:

  • Primarily used to generate impressive benchmark numbers for research papers
  • So expensive to run that you question your life choices with every prompt
  • Capable of generating content so convincing that you wonder if it's becoming sentient
  • Not something you'll be running locally unless your last name is Musk or Bezos

Seeing these on a dropdown menu is like finding a "detect dark matter" setting on your microwave. Sure, it's technically impressive, but do you really need it to heat up your leftovers?

The Actual Useful Advice Section

If you've made it this far, here's the TL;DR that should have been at the top:

  1. Just starting or testing? Go with a 7B model. It's the Toyota Corolla of AI—reliable, efficient, and won't break the bank.
  2. Serious hobby or small business use? A 13B model offers a nice upgrade without requiring a second mortgage for hardware.
  3. Professional application with dedicated hardware? The 30-70B range is your sweet spot—significant capabilities without the eye-watering costs of the largest models.
  4. Running an AI company or research lab? You're not reading this article for advice.

Remember, a well-tuned smaller model will often outperform a generic larger one. That 7B model specifically fine-tuned for coding will write better Python than a general-purpose 70B model that's trying to be all things to all people.

In the end, the best parameter size is the one that runs on your hardware, solves your problem, and doesn't require you to take out a loan for your electricity bill.

Author's note: This article was written with the assistance of an AI that refused to specify its parameter count, simply saying it was "adequate for the task at hand."
