Number System - Quantization of LLMs, Part-1
Quantization of LLMs


Large Language Models (LLMs) have significantly advanced in recent years, becoming increasingly user-friendly and versatile in various applications. Nevertheless, as the intelligence and complexity of LLMs have expanded, so too has the number of parameters, including weights and activations, which determine their ability to learn from and analyze data.

Hence, the larger an LLM, the more memory it requires.

This necessitates running LLMs on high-spec hardware with enough GPUs, which limits deployment options and the ease of adopting LLM-based solutions. Fortunately, machine learning researchers are working on a range of solutions to tackle the challenge of growing model sizes, with quantization being a prominent one.

Why do we need Quantization?

Before we delve into the concept of quantization, let us first understand why we need it in the first place.

Quantization aims to address the following challenges:

Challenge-1: Most contemporary deep neural networks consist of millions or even billions of parameters, which makes them expensive to store.

Consider the following examples:

Ex 1. The smallest LLaMA-2 model consists of 7 billion parameters. Assuming each parameter is 32 bits (4 bytes), we would require 7 billion × 4 bytes = 28GB of storage space just to store these parameters on the disk.

Memory requirement of smallest LLaMA-2

Ex 2. The smallest LLaMA-3 model consists of 8 billion parameters. Assuming each parameter is 32 bits (4 bytes), we would require 8 billion × 4 bytes = 32GB of storage space just to store these parameters on the disk.

Memory Requirement of smallest LLaMA-3

Ex 3. The parameter count of the state-of-the-art GPT-4 has not been officially disclosed. It is believed to exceed 1 trillion, and rumors claim that it has 1.76 trillion parameters. Assuming 1.76 trillion parameters at 32 bits each, we would require 7.04TB of storage space just to store these parameters on the disk.

Memory Requirement of GPT-4
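The arithmetic behind these three examples can be sketched in a few lines of Python. This is a minimal illustration, assuming decimal units (1GB = 10^9 bytes) as the figures above do; the helper name `param_storage_bytes` is hypothetical, and the 8-bit column is included only to hint at what a lower-precision representation would save.

```python
def param_storage_bytes(num_params: int, bits_per_param: int) -> int:
    """Bytes needed to store num_params values of bits_per_param bits each."""
    return num_params * bits_per_param // 8

GB = 10**9  # decimal gigabyte, matching the figures in the text

# Parameter counts as quoted above (the GPT-4 figure is a rumor).
models = {
    "LLaMA-2 7B": 7 * 10**9,
    "LLaMA-3 8B": 8 * 10**9,
    "GPT-4 (rumored 1.76T)": 1_760 * 10**9,
}

for name, n in models.items():
    at_32 = param_storage_bytes(n, 32) / GB
    at_8 = param_storage_bytes(n, 8) / GB
    print(f"{name}: {at_32:,.0f} GB at 32-bit, {at_8:,.0f} GB at 8-bit")
```

Note that this counts only the stored parameters; memory used while the model runs is not included.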

Challenge-2: Consequently, larger models cannot easily be loaded on a standard PC or a smartphone. When a CPU is used for inference, the model must be loaded into RAM; when a GPU is used, it must be loaded into the GPU's memory.

Challenge-3: Like humans, computers are generally slower at floating-point operations than at integer operations. Consider the calculation of 4 × 8 and compare it to 1.17 × 2.389. Which one can be computed more quickly?

Answer: 4 × 8

How to tackle these challenges?

Quantization addresses these challenges. Quantizing large language models (LLMs) is a key method for reducing their size and memory usage while largely preserving their quality.

So what exactly is Quantization?

Quantization

Quantization, in an abstract sense, is the process of constraining an input from a continuous or otherwise large set of values to a discrete set.

Mapping of Continuous Signals to Discrete Signals
Palettization (loose form of image compression)
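As a toy illustration of constraining a continuous input to a discrete set, the sketch below snaps a real number to the nearest of a few evenly spaced levels. It is not how any particular LLM quantization scheme works; the function name `quantize` and its parameters are assumptions made for this example.

```python
def quantize(x: float, lo: float, hi: float, levels: int) -> float:
    """Snap x to the nearest of `levels` evenly spaced values in [lo, hi]."""
    step = (hi - lo) / (levels - 1)   # distance between neighbouring levels
    x = min(max(x, lo), hi)           # clamp values outside the range
    index = round((x - lo) / step)    # index of the nearest level
    return lo + index * step

# With 5 levels on [0, 1] the only possible outputs are 0, 0.25, 0.5, 0.75, 1.
for value in (0.3, 0.62, 0.9, 2.0):
    print(value, "->", quantize(value, 0.0, 1.0, 5))
```

Every input, however finely grained, ends up as one of only five values, which is the essence of the definition above.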

But, how does this relate to LLMs?

To see how this abstract notion of quantization relates to LLMs, let us first understand the following fundamental concepts of number systems.

Numeric Data Types

Let's examine how numbers are represented in hardware at the CPU or GPU level. Computers use a fixed number of bits to represent each piece of data, such as a number, a character, or a pixel color.

How is Numeric Data Represented in Modern Computing Systems?

Human beings utilize the decimal (base 10) and duodecimal (base 12) number systems to perform counting and measurements, likely due to our possession of 10 fingers and two prominent toes.

Conversely, computers rely on the binary (base 2) number system, as they are built from binary digital components, known as transistors, which operate in two distinct states: on and off. If current passes through the transistor, the computer reads “1”; if current is absent, it reads “0”.

Decimal Number system (Base 10)

The decimal number system has ten symbols: 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9, called digits. It uses positional notation. That is, the least-significant digit (right-most digit) is of the order of 10^0 (units or ones), the second right-most digit is of the order of 10^1 (tens), the third right-most digit is of the order of 10^2 (hundreds), and so on, where ^ denotes exponent.

For example, 735 = 7×10^2 + 3×10^1 + 5×10^0.

Decimal Number System (base 10)

Binary Number System (Base 2)

The Binary Number System is a numerical system that utilizes two symbols, "0" and "1", to represent different numbers. It uses positional notation. The term "binary" is derived from the word "bi," meaning two. As a result, this numerical system is referred to as the Binary Number System. A binary digit is called a bit.

Binary Number System (base 2)

What is the decimal equivalent of binary number 10110?

Binary to Decimal Conversion
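Positional notation works the same way in any base, so the conversion asked about above can be sketched in Python as follows. The helper names `expansion` and `to_decimal` are hypothetical; Python's built-in `int(text, base)` does the same job and is used as a cross-check.

```python
def expansion(digits: str, base: int) -> str:
    """Write out the positional expansion, e.g. 1x2^4 + 0x2^3 + ..."""
    n = len(digits)
    return " + ".join(f"{d}x{base}^{n - 1 - i}" for i, d in enumerate(digits))

def to_decimal(digits: str, base: int) -> int:
    """Evaluate a digit string in the given base, left to right."""
    value = 0
    for ch in digits:
        value = value * base + int(ch, base)  # shift one place, add the digit
    return value

print(expansion("10110", 2))           # the weighted terms
print(to_decimal("10110", 2))          # 16 + 0 + 4 + 2 + 0 = 22
print(int("10110", 2))                 # built-in cross-check
```

So the binary number 10110 is 22 in decimal.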

There are various types of number systems; the four major ones are:

  • Binary Number System (Number system with Base 2)
  • Octal Number System (Number system with Base 8)
  • Decimal Number System (Number system with Base 10)
  • Hexadecimal Number System (Number system with Base 16)

As discussed, computers use a fixed number of bits to represent various types of data, such as numbers, characters, or pixel colors. A sequence of n bits (also called an n-bit string or n-bit storage location) can represent at most 2^n distinct entities.

For example, a 3-bit memory location can hold one of these eight binary patterns: 000, 001, 010, 011, 100, 101, 110, or 111.

Hence, it can represent at most 8 distinct entities.

You could use them to represent numbers 0 to 7, numbers 8881 to 8888, characters 'A' to 'H', or up to 8 kinds of fruits like apple, orange, banana; or up to 8 kinds of animals like lion, tiger, etc.
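A short sketch makes the point concrete: the same eight 3-bit patterns can stand for entirely different things depending on the mapping we choose. The helper name `bit_patterns` and the two mappings are illustrative assumptions.

```python
from itertools import product

def bit_patterns(n: int) -> list[str]:
    """All 2^n bit strings of length n, in counting order."""
    return ["".join(bits) for bits in product("01", repeat=n)]

patterns = bit_patterns(3)
print(len(patterns), patterns)

# The same patterns, interpreted two different ways.
as_numbers = dict(zip(patterns, range(8881, 8889)))   # numbers 8881 to 8888
as_letters = dict(zip(patterns, "ABCDEFGH"))          # characters 'A' to 'H'
print(as_numbers["101"], as_letters["101"])
```

The bits themselves carry no meaning; the interpretation is fixed by convention.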

Typically, numbers are represented in groups of 8 bits (byte), 16 bits (short), 32 bits (int), or 64 bits (long).

1. Integer representation in CPU (or GPU)

Integers are whole numbers or fixed-point numbers with the radix point fixed after the least-significant bit. Computers use a fixed number of bits to represent an integer. The commonly-used bit-lengths for integers are 8-bit, 16-bit, 32-bit or 64-bit.

In addition to bit-lengths, there exist two distinct representation schemes for integers.

  1. Unsigned Integers: can represent zero and positive integers.
  2. Signed Integers: can represent zero, positive and negative integers.

Three representation schemes have been proposed for signed integers:

  1. Sign-Magnitude representation
  2. 1's Complement representation
  3. 2's Complement representation

As a programmer, it is your responsibility to determine the bit-length and representation scheme for the integers based on the specific requirements of your application. In the case of needing a counter to track a small quantity ranging from 0 to 200, you could opt for the 8-bit unsigned integer scheme since it does not involve negative numbers.

Let us try to understand these representations in detail.

1.1 Unsigned Integer (n-bit)

Unsigned integers have the ability to represent zero and positive integers, excluding negative integers. The interpretation of an unsigned integer's value is based on "the magnitude of its underlying binary pattern".

Range of Unsigned Integers (n-bit): 0 to 2^n - 1

Example 1: Suppose that n=8 and the binary pattern is 01000001, the value of this unsigned integer is 65.

Binary to Decimal Conversion for Unsigned Representation

Example 2: Suppose that n=16 and the binary pattern is 0000000000000000, the value of this unsigned integer is 0.

Binary to Decimal Conversion for Unsigned Representation
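The two unsigned examples, and the range an n-bit unsigned integer can cover, can be checked with a small sketch. The helper names `unsigned_value` and `unsigned_range` are hypothetical.

```python
def unsigned_value(bits: str) -> int:
    """Interpret a bit string as an unsigned integer (plain magnitude)."""
    return int(bits, 2)

def unsigned_range(n: int) -> tuple[int, int]:
    """Smallest and largest value an n-bit unsigned integer can hold."""
    return 0, 2**n - 1

print(unsigned_value("01000001"))   # Example 1
print(unsigned_value("0" * 16))     # Example 2

for n in (8, 16, 32, 64):
    low, high = unsigned_range(n)
    print(f"{n:>2}-bit unsigned: {low} to {high:,}")
```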

1.2 Signed Integers

Signed integers can represent zero, positive integers, as well as negative integers. Three representation schemes are available for signed integers:

  1. Sign-Magnitude representation
  2. 1's Complement representation
  3. 2's Complement representation

In each of the aforementioned three schemes, the sign bit, also known as the most-significant bit (msb), is utilized to indicate the sign of the integer. A value of 0 represents a positive integer, while a value of 1 represents a negative integer. Nevertheless, the interpretation of the integer's magnitude varies across the different schemes.

1. Sign-Magnitude Representation

In sign-magnitude representation:

  • The sign bit, denoted as the most-significant bit (msb), has a value of 0 for positive integers and 1 for negative integers
  • The magnitude (absolute value) of the integer is represented by the remaining n-1 bits. This value is viewed as "the magnitude of the (n-1)-bit binary pattern"

Example 1: Suppose that n=8 and the binary representation is 01000001.

Sign bit is 0 ⇒ positive.

Absolute value of the remaining 7 bits is 1000001 = 65. Hence, the integer is +65.

Binary to Decimal Conversion for Sign-Magnitude Representation

Example 2: Suppose that n=8 and the binary representation is 00000000.

Sign bit is 0 ⇒ positive.

Absolute value is 0000000 = 0. Hence, the integer is +0. Note the + sign here.

Example 3: Suppose that n=8 and the binary representation is 10000000.

Sign bit is 1 ⇒ negative.

Absolute value is 0000000 = 0. Hence, the integer is -0. Note the - sign here.

From Example 2 and Example 3 we can infer that, in sign-magnitude representation, the binary patterns 00000000 and 10000000 both represent zero.

Two Binary Representations of 0 in Sign-Magnitude representation

Range of sign-magnitude n-bit integers: -(2^(n-1) - 1) to +(2^(n-1) - 1). For n=8, this is -127 to +127.

Range of signed-magnitude representation for n-bit
Sign-Magnitude Representation for n=8

The drawbacks of sign-magnitude representation are:

  1. There are two ways to represent the number zero: '00000000' and '10000000'. This can potentially cause inefficiency and confusion.
  2. Positive and negative integers need to be processed separately.
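The sign-magnitude rules above can be sketched as a small decoder. The function name `sign_magnitude_value` is hypothetical. Python integers have no negative zero, so both zero patterns decode to the same value 0, which makes the double representation easy to see.

```python
def sign_magnitude_value(bits: str) -> int:
    """Decode an n-bit sign-magnitude pattern given as a string of 0s and 1s."""
    sign = -1 if bits[0] == "1" else 1   # msb is the sign bit
    magnitude = int(bits[1:], 2)         # remaining n-1 bits are the magnitude
    return sign * magnitude

print(sign_magnitude_value("01000001"))  # Example 1
print(sign_magnitude_value("11000001"))  # same magnitude, sign bit set
# Two different bit patterns, one value: the "two zeros" drawback.
print(sign_magnitude_value("00000000") == sign_magnitude_value("10000000"))
```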

2. 1's Complement Representation

In 1's complement representation:

  • The sign bit, denoted as the most-significant bit (msb), has a value of 0 for positive integers and 1 for negative integers
  • The remaining n-1 bits represent the magnitude of the integer, as follows:

  1. for positive integers, the absolute value of the integer is equal to "the magnitude of the (n-1)-bit binary pattern"
  2. for negative integers, the absolute value of the integer is equal to "the magnitude of the complement (inverse) of the (n-1)-bit binary pattern" (hence called 1's complement)

Example 1: Suppose that n=8 and the binary representation is 01000001.

  • Sign bit is 0 ⇒ positive
  • Absolute value is 1000001 = 65. Hence, the integer is +65.

Binary to Decimal Conversion for 1's-Complement Representation

Example 2: Suppose that n=8 and the binary representation is 10000001.

  • Sign bit is 1 ⇒ negative
  • Absolute value is the complement of 0000001, i.e., 1111110 = 126. Hence, the integer is -126.

Binary to Decimal Conversion for 1's-Complement Representation (for -ve sign bit)

Example 3: Suppose that n=8 and the binary representation is 00000000.

  • Sign bit is 0 ⇒ positive
  • Absolute value of 0000000 = 0. Hence, the integer is +0.

Example 4: Suppose that n=8 and the binary representation is 11111111.

  • Sign bit is 1 ⇒ negative
  • Absolute value is the complement of 1111111, i.e., 0000000 = 0. Hence, the integer is -0.

The following figure illustrates the visual working of Example-3 and Example-4.

Two Binary Representations of 0 in 1's Complement Representation

Range of 1's complement representation for n-bit integers: -(2^(n-1) - 1) to +(2^(n-1) - 1).

Range of 1's complement representation for n-bit

Let us visualize the range of 1's complement representation for n=8, which runs from -127 to +127.

Range of 1's Complement Representation for n=8

Once more, the disadvantages are:

  1. There are two ways to represent the number zero: '00000000' and '11111111'. This can potentially cause inefficiency and confusion.
  2. Positive and negative integers need to be processed separately.
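A decoder for 1's complement follows directly from the two rules above: read the lower bits as they are for a positive number, and invert them first for a negative one. The function name `ones_complement_value` is hypothetical.

```python
def ones_complement_value(bits: str) -> int:
    """Decode an n-bit 1's complement pattern given as a string of 0s and 1s."""
    lower = bits[1:]                      # the n-1 bits after the sign bit
    if bits[0] == "0":
        return int(lower, 2)              # positive: magnitude as written
    inverted = "".join("1" if b == "0" else "0" for b in lower)
    return -int(inverted, 2)              # negative: magnitude of the inverse

print(ones_complement_value("01000001"))  # Example 1
print(ones_complement_value("10000001"))  # Example 2
# Both zero patterns decode to the same value.
print(ones_complement_value("00000000") == ones_complement_value("11111111"))
```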

3. 2's Complement Representation

In 2's complement representation:

  • The sign bit, denoted as the most-significant bit (msb), has a value of 0 for positive integers and 1 for negative integers
  • The remaining n-1 bits represent the magnitude of the integer, as follows:

  1. for positive integers, the absolute value of the integer is equal to "the magnitude of the (n-1)-bit binary pattern"
  2. for negative integers, the absolute value of the integer is equal to "the magnitude of the complement of the (n-1)-bit binary pattern, plus one" (hence called 2's complement)

Example 1: Suppose that n=8 and the binary representation is 01000001.

  • Sign bit is 0 ⇒ positive
  • Absolute value is 1000001 = 65. Hence, the integer is +65.

Binary to Decimal Conversion for 2's-Complement Representation (for +ve sign bit)

Example 2: Suppose that n=8 and the binary representation is 10000001.

  • Sign bit is 1 ⇒ negative
  • Absolute value is the complement of 0000001 plus 1, i.e., 1111110 + 1 = 1111111 = 127. Hence, the integer is -127.

Binary to Decimal Conversion for 2's-Complement Representation (for -ve sign bit)

Example 3: Suppose that n=8 and the binary representation is 00000000.

  • Sign bit is 0 ⇒ positive
  • Absolute value of 0000000 is 0. Hence, the integer is +0.

Binary to Decimal Conversion for 2's-Complement Representation (for +0)

Example 4: Suppose that n=8 and the binary representation is 11111111.

  • Sign bit is 1 ⇒ negative
  • Absolute value is the complement of 1111111 plus 1, i.e., 0000000 + 1 = 1. Hence, the integer is -1.

Binary to Decimal Conversion for 2's-Complement Representation

Let us visualize the range of 2's complement representation for n=8, which runs from -128 to +127.

Range of 2's Complement Representation (for n=8)

Range of 2's complement representation for n-bit integers: -2^(n-1) to +(2^(n-1) - 1).

Range of 2's Complement Representation (for n-bit)
Table for Range of 2's Complement Representation (n-bit, where n=8, 16, 32, 64)
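The decoding rule and the ranges for common bit-lengths can be sketched as follows. The decoder mirrors the "complement plus one" rule described above rather than any particular hardware implementation; the names `twos_complement_value` and `twos_complement_range` are hypothetical.

```python
def twos_complement_value(bits: str) -> int:
    """Decode an n-bit 2's complement pattern given as a string of 0s and 1s."""
    lower = bits[1:]                      # the n-1 bits after the sign bit
    if bits[0] == "0":
        return int(lower, 2)              # positive: magnitude as written
    inverted = "".join("1" if b == "0" else "0" for b in lower)
    return -(int(inverted, 2) + 1)        # negative: inverse plus one

def twos_complement_range(n: int) -> tuple[int, int]:
    """Smallest and largest value an n-bit 2's complement integer can hold."""
    return -(2 ** (n - 1)), 2 ** (n - 1) - 1

for pattern in ("01000001", "10000001", "00000000", "11111111"):
    print(pattern, "->", twos_complement_value(pattern))

for n in (8, 16, 32, 64):
    low, high = twos_complement_range(n)
    print(f"{n:>2}-bit: {low:,} to {high:,}")
```

Unlike the two previous schemes, the pattern 10000000 does not decode to a second zero here; it decodes to -128, which is why the negative side of the range reaches one further than the positive side.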

Modern Computing Systems Use 2's Complement Representation for Signed Integers

Computers use 2's complement to represent signed integers. This is because:

  1. In 2's complement, there is a single representation for the number zero, as opposed to the two representations found in sign-magnitude and 1's complement.
  2. Positive and negative integers can be treated uniformly in addition and subtraction. Subtracting a number is the same as adding its 2's complement, so subtraction can reuse the logic of addition.
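The second point can be illustrated with a minimal 8-bit sketch in which subtraction is carried out purely as addition. Python integers are unbounded, so the code masks results to 8 bits to imitate a fixed-width register; the names `negate`, `add8`, and `to_signed` are hypothetical.

```python
BITS = 8
MASK = (1 << BITS) - 1            # 0b11111111, keeps only the low 8 bits

def negate(x: int) -> int:
    """2's complement negation: invert all bits, add one, keep 8 bits."""
    return (~x + 1) & MASK

def add8(a: int, b: int) -> int:
    """8-bit addition; any carry out of the top bit is discarded."""
    return (a + b) & MASK

def to_signed(x: int) -> int:
    """Read an 8-bit pattern as a 2's complement signed value."""
    return x - (1 << BITS) if x & (1 << (BITS - 1)) else x

# 65 - 5 computed as 65 + (-5)
print(to_signed(add8(65, negate(5))))
# 5 - 65 computed as 5 + (-65)
print(to_signed(add8(5, negate(65))))
# Negating zero gives zero again: there is only one zero pattern.
print(format(negate(0), "08b"))
```

The same `add8` routine handles both positive and negative operands, which is the property that lets hardware use a single adder for addition and subtraction.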


To be continued... in Part-2



