Friday, May 16, 2025
News PouroverAI
Visit PourOver.AI
No Result
View All Result
  • Home
  • AI Tech
  • Business
  • Blockchain
  • Data Science & ML
  • Cloud & Programming
  • Automation
  • Front-Tech
  • Marketing
  • Home
  • AI Tech
  • Business
  • Blockchain
  • Data Science & ML
  • Cloud & Programming
  • Automation
  • Front-Tech
  • Marketing
News PouroverAI
No Result
View All Result

Quantizing the AI Colossi. Streamlining Giants Part 2: Neural… | by Nate Cibik | Apr, 2024

April 15, 2024
in Data Science & ML
Reading Time: 5 mins read
0 0
A A
0
Share on FacebookShare on Twitter



To ground our investigation into quantization, it is important to reflect on exactly what we mean by “quantizing” numbers. So far we’ve discussed that through quantization, we take a set of high-precision values and map them to a lower precision in such a way that best preserves their relationships, but we have not zoomed into the mechanics of this operation. Unsurprisingly, we find there are nuances and design choices to be made concerning how we remap values into the quantized space, which vary depending on use case. In this section, we will seek to understand the knobs and levers which guide the quantization process, so that we can better understand the research and equip ourselves to bring educated decision making into our deployments.

Bit Width

Throughout our discussion on quantization, we will refer to the bit widths of the quantized values, which represents the number of bits available to express the value. A bit can only store a binary value of 0 or 1, but sets of bits can have their combinations interpreted as incremental integers. For instance, having 2 bits allows for 4 total combinations ({0, 0}, {0, 1}, {1, 0}, {1, 1}) which can represent integers in the range [0, 3]. As we add N bits, we get 2 to the power of N possible combinations, so an 8-bit integer can represent 256 numbers. While unsigned integers will count from zero to the maximum value, signed integers will place zero at the center of the range by interpreting the first bit as the +/- sign. Therefore, an unsigned 8-bit integer has a range of [0, 255], and a signed 8-bit integer spans from [-128, 127].

This fundamental knowledge of how bits represent information will help us to contextualize the numeric spaces that the floating point values get mapped to in the techniques we study, as when we hear that a network layer is quantized to 4 bits, we understand that the destination space has 2 to the power of 4 (16) discrete values. In quantization, these values do not necessarily represent integer values for the quantized weights, and often refer to the indices of the quantization levels – the “buckets” into which the values of the input distribution are mapped. Each index corresponds to a codeword that represents a specific quantized value within the predefined numeric space. Together, these codewords form a codebook, and the values obtained from the codebook can be either floating point or integer values, depending on the type of arithmetic to be performed. The thresholds that define the buckets depend on the chosen quantization function, as we will see. Note that codeword and codebook are general terms, and that in most cases the codeword will be the same as the value returned from the codebook.

Floating-Point, Fixed-Point, and Integer-Only Quantization

Now that we understand bit widths, we should take a moment to touch on the distinctions between floating-point, fixed-point, and integer-only quantization, so that we are clear on their meaning. While representing integers with binary bits is straightforward, operating on numbers with fractional components is a bit more complex. Both floating-point and fixed-point data types have been designed to do this, and selecting between them depends on both on the deployment hardware and desired accuracy-efficiency tradeoff, as not all hardware supports floating-point operations, and fixed-point arithmetic can offer more power efficiency at the cost of reduced numeric range and precision.

Floating-point numbers allocate their bits to represent three pieces of information: the sign, the exponent, and the mantissa, which enables efficient bitwise operations on their representative values. The number of bits in the exponent define the magnitude of the numeric range, and the number of mantissa bits define the level of precision. As one example, the IEEE 754 standard for a 32-bit floating point (FP32) gives the first bit to the sign, 8 bits to the exponent, and the remaining 23 bits to the mantissa. Floating-point values are “floating” because they store an exponent for each individual number, allowing the position of the radix point to “float,” akin to how scientific notation moves the decimal in base 10, but different in that computers operate in base 2 (binary). This flexibility enables precise representation of a wide range of values, especially near zero, which underscores the importance of normalization in various applications.

In contrast, “fixed” point precision does not use a dynamic scaling factor, and instead allocates bits into sign, integer, and fractional (often still referred to as mantissa) components. While this means higher efficiency and power-saving operations, the dynamic range and precision will suffer. To understand this, imagine that you want to represent a number which is as close to zero as possible. In order to do so, you would carry the decimal place out as far as you could. Floating-points are free to use increasingly negative exponents to push the decimal further to the left and provide extra resolution in this situation, but the fixed-point value is stuck with the precision offered by a fixed number of fractional bits.

Integers can be considered an extreme case of fixed-point where no bits are given to the fractional component. In fact, fixed-point bits can be operated on directly as if they were an integer, and the result can be rescaled with software to achieve the correct fixed-point result. Since integer arithmetic is more power-efficient on hardware, neural network quantization research favors integer-only quantization, converting the original float values into integers, rather than the fixed-point floats, because their calculations will ultimately be equivalent, but the integer-only math can be performed more efficiently with less power. This is particularly important for deployment on battery-powered devices, which also often contain hardware that only supports integer arithmetic.

Uniform Quantization

To quantize a set of numbers, we must first define a quantization function Q(r), where r is the real number (weight or activation) to be quantized. The most common quantization function is shown below:

Typical quantization function. Image by author.

In this formula, Z represents an integer zero-point, and S is the scaling factor. In symmetrical quantization, Z is simply set to zero, and cancels out of the equation, while for asymmetrical quantization, Z is used to offset the zero point, allowing for focusing more of the quantization range on either the positive or negative side of the input distribution. This asymmetry can be extremely useful in certain cases, for example when quantizing post-ReLU activation signals, which contain only positive numbers. The Int(·) function assigns a scaled continuous value to an integer, typically through rounding, but in some cases following more complex procedures, as we will encounter later.

Choosing the correct scaling factor (S) is non-trivial, and requires careful consideration of the distribution of values to be quantized. Because the quantized output space has a finite range of values (or quantization levels) to map the inputs to, a clipping range [α, β] must be established that provides a good fit for the incoming value distribution. The chosen clipping range must strike a balance between not over-clamping extreme input values and not oversaturating the quantization levels by allocating too many bits to the long tails. For now, we consider uniform quantization, where the bucketing thresholds, or quantization steps, are evenly spaced. The calculation of the scaling factor is as follows:

Formula for calculating the quantization function’s scaling factor (S) based on the clipping range ([α, β]) and desired bit-width (b). Image by author.

The shapes of trained parameter distributions can vary widely between networks and are influenced by a number of factors. The activation signals generated by those weights are even more dynamic and unpredictable, making any assumptions about the correct clipping ranges difficult. This is why we must calibrate the clipping range based on our model and data. For best accuracy, practitioners may choose to calibrate the clipping range for activations online during inference, known as dynamic quantization. As one might expect, this comes with extra computational overhead, and is therefore by far less popular than static quantization, where the clipping range is calibrated ahead of time, and fixed during inference.

Dequantization

Here we establish the reverse uniform quantization operation which decodes the quantized values back into the original numeric space, albeit imperfectly, since the rounding operation is non-reversible. We can decode our approximate values using the following formula:

Dequantization operation. Image by author.

Non-Uniform Quantization

The astute reader will probably have noticed that enacting uniformly-spaced bucketing thresholds on an input distribution that is any shape other than uniform will lead to some…



Source link

Tags: AprCibikColossigiantsNateNeuralâPartQuantizingStreamlining
Previous Post

Dow Jones Futures: Another Ugly Market Reversal; Nvidia Skids, Tesla Tumbles On ‘Dark Day’

Next Post

Schiff Predicts Bitcoin Slump to $20K

Related Posts

AI Compared: Which Assistant Is the Best?
Data Science & ML

AI Compared: Which Assistant Is the Best?

June 10, 2024
5 Machine Learning Models Explained in 5 Minutes
Data Science & ML

5 Machine Learning Models Explained in 5 Minutes

June 7, 2024
Cohere Picks Enterprise AI Needs Over ‘Abstract Concepts Like AGI’
Data Science & ML

Cohere Picks Enterprise AI Needs Over ‘Abstract Concepts Like AGI’

June 7, 2024
How to Learn Data Analytics – Dataquest
Data Science & ML

How to Learn Data Analytics – Dataquest

June 6, 2024
Adobe Terms Of Service Update Privacy Concerns
Data Science & ML

Adobe Terms Of Service Update Privacy Concerns

June 6, 2024
Build RAG applications using Jina Embeddings v2 on Amazon SageMaker JumpStart
Data Science & ML

Build RAG applications using Jina Embeddings v2 on Amazon SageMaker JumpStart

June 6, 2024
Next Post
Schiff Predicts Bitcoin Slump to $20K

Schiff Predicts Bitcoin Slump to $20K

Observable systems with wide events

Observable systems with wide events

New Snowflake Features Released in March 2024

New Snowflake Features Released in March 2024

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

  • Trending
  • Comments
  • Latest
Is C.AI Down? Here Is What To Do Now

Is C.AI Down? Here Is What To Do Now

January 10, 2024
23 Plagiarism Facts and Statistics to Analyze Latest Trends

23 Plagiarism Facts and Statistics to Analyze Latest Trends

June 4, 2024
Porfo: Revolutionizing the Crypto Wallet Landscape

Porfo: Revolutionizing the Crypto Wallet Landscape

October 9, 2023
A Complete Guide to BERT with Code | by Bradney Smith | May, 2024

A Complete Guide to BERT with Code | by Bradney Smith | May, 2024

May 19, 2024
How To Build A Quiz App With JavaScript for Beginners

How To Build A Quiz App With JavaScript for Beginners

February 22, 2024
Saginaw HMI Enclosures and Suspension Arm Systems from AutomationDirect – Library.Automationdirect.com

Saginaw HMI Enclosures and Suspension Arm Systems from AutomationDirect – Library.Automationdirect.com

December 6, 2023
Can You Guess What Percentage Of Their Wealth The Rich Keep In Cash?

Can You Guess What Percentage Of Their Wealth The Rich Keep In Cash?

June 10, 2024
AI Compared: Which Assistant Is the Best?

AI Compared: Which Assistant Is the Best?

June 10, 2024
How insurance companies can use synthetic data to fight bias

How insurance companies can use synthetic data to fight bias

June 10, 2024
5 SLA metrics you should be monitoring

5 SLA metrics you should be monitoring

June 10, 2024
From Low-Level to High-Level Tasks: Scaling Fine-Tuning with the ANDROIDCONTROL Dataset

From Low-Level to High-Level Tasks: Scaling Fine-Tuning with the ANDROIDCONTROL Dataset

June 10, 2024
UGRO Capital: Targeting to hit milestone of Rs 20,000 cr loan book in 8-10 quarters: Shachindra Nath

UGRO Capital: Targeting to hit milestone of Rs 20,000 cr loan book in 8-10 quarters: Shachindra Nath

June 10, 2024
Facebook Twitter LinkedIn Pinterest RSS
News PouroverAI

The latest news and updates about the AI Technology and Latest Tech Updates around the world... PouroverAI keeps you in the loop.

CATEGORIES

  • AI Technology
  • Automation
  • Blockchain
  • Business
  • Cloud & Programming
  • Data Science & ML
  • Digital Marketing
  • Front-Tech
  • Uncategorized

SITEMAP

  • Disclaimer
  • Privacy Policy
  • DMCA
  • Cookie Privacy Policy
  • Terms and Conditions
  • Contact us

Copyright © 2023 PouroverAI News.
PouroverAI News

No Result
View All Result
  • Home
  • AI Tech
  • Business
  • Blockchain
  • Data Science & ML
  • Cloud & Programming
  • Automation
  • Front-Tech
  • Marketing

Copyright © 2023 PouroverAI News.
PouroverAI News

Welcome Back!

Login to your account below

Forgotten Password? Sign Up

Create New Account!

Fill the forms bellow to register

All fields are required. Log In

Retrieve your password

Please enter your username or email address to reset your password.

Log In