Saturday, May 24, 2025
News PouroverAI
Visit PourOver.AI
No Result
View All Result
  • Home
  • AI Tech
  • Business
  • Blockchain
  • Data Science & ML
  • Cloud & Programming
  • Automation
  • Front-Tech
  • Marketing
  • Home
  • AI Tech
  • Business
  • Blockchain
  • Data Science & ML
  • Cloud & Programming
  • Automation
  • Front-Tech
  • Marketing
News PouroverAI
No Result
View All Result

Meet LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models

October 25, 2023
in AI Technology
Reading Time: 4 mins read
0 0
A A
0
Share on FacebookShare on Twitter


The introduction of Pre-trained Language Models (PLMs) has signified a transformative shift in the field of Natural Language Processing. They have demonstrated exceptional proficiency in performing a wide range of language tasks, including Natural Language Understanding (NLU) and Natural Language Generation (NLG). These models typically incorporate millions or even billions of parameters, leading to substantial computational and memory requirements. However, the considerable computational and memory needs of these models present significant challenges, as acknowledged by the research community.

In this paper, the authors introduce a novel quantization framework known as LoRA-Fine-Tuning-aware Quantization (LoftQ). This framework is specifically tailored for pre-trained models that necessitate quantization and LoRA fine-tuning. The framework actively combines low-rank approximation, working in conjunction with quantization to jointly approximate the original high-precision pre-trained weights.

The above image demonstrates QLoRA performance with different bits. Left: QLoRA initialization of LLAMA-2-13b on WikiText-2. Right: Apply QLoRA to LLAMA-2-13b on WikiText-2 language modelling task. Smaller perplexity indicates better performance.

Quantization Methods. We apply two quantization methods to demonstrate LoftQ is compatible with different quantization functions:

• Uniform quantization is a classic quantization method. It uniformly divides a continuous interval into 2N categories and stores a local maximum absolute value for dequantization.

• NF4 and its 2-bit variant NF2 are quantization methods used in QLoRA. They assume that the high-precision values are drawn from a Gaussian distribution and map these values to discrete slots that have equal probability.

We perform 2-bit and 4-bit quantization on all models, achieving compression ratios of 25-30% and 15-20% at the 4-bit and 2-bit levels, respectively. All the experiments are conducted on NVIDIA A100 GPUs.

The evaluation of their quantization framework is carried out through extensive experiments on various downstream tasks, including NLU, question answering, summarization, and NLG. The results of these experiments demonstrate that LoftQ consistently surpasses QLoRA across all precision levels. For example, with 4-bit quantization, they attain a 1.1 and 0.8 improvement in Rouge-1 for XSum and CNN/DailyMail, respectively. As the field of NLP continues to advance, it is expected that further innovations and optimizations will help bridge the gap between the immense potential of PLMs and their practical deployment, benefiting a wide range of applications and users.

Check out the Paper. All Credit For This Research Goes To the Researchers on This Project. Also, don’t forget to join our 31k+ ML SubReddit, 40k+ Facebook Community, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more.

If you like our work, you will love our newsletter..

We are also on WhatsApp. Join our AI Channel on Whatsapp..

Janhavi Lande, is an Engineering Physics graduate from IIT Guwahati, class of 2023. She is an upcoming data scientist and has been working in the world of ml/ai research for the past two years. She is most fascinated by this ever changing world and its constant demand of humans to keep up with it. In her pastime she enjoys traveling, reading and writing poems.

▶️ Now Watch AI Research Updates On Our Youtube Channel [Watch Now]



Source link

Tags: languageLargeLoftQLoRAFineTuningAwareMeetmodelsQuantization
Previous Post

Murrplastik Two-Piece Flex Split Conduit from AutomationDirect

Next Post

Dealing with Intellectual Property Concerns

Related Posts

How insurance companies can use synthetic data to fight bias
AI Technology

How insurance companies can use synthetic data to fight bias

June 10, 2024
From Low-Level to High-Level Tasks: Scaling Fine-Tuning with the ANDROIDCONTROL Dataset
AI Technology

From Low-Level to High-Level Tasks: Scaling Fine-Tuning with the ANDROIDCONTROL Dataset

June 10, 2024
How Game Theory Can Make AI More Reliable
AI Technology

How Game Theory Can Make AI More Reliable

June 9, 2024
Decoding Decoder-Only Transformers: Insights from Google DeepMind’s Paper
AI Technology

Decoding Decoder-Only Transformers: Insights from Google DeepMind’s Paper

June 9, 2024
Buffer of Thoughts (BoT): A Novel Thought-Augmented Reasoning AI Approach for Enhancing Accuracy, Efficiency, and Robustness of LLMs
AI Technology

Buffer of Thoughts (BoT): A Novel Thought-Augmented Reasoning AI Approach for Enhancing Accuracy, Efficiency, and Robustness of LLMs

June 9, 2024
Deciphering Doubt: Navigating Uncertainty in LLM Responses
AI Technology

Deciphering Doubt: Navigating Uncertainty in LLM Responses

June 9, 2024
Next Post
Dealing with Intellectual Property Concerns

Dealing with Intellectual Property Concerns

What is Page Speed? How to Load Website Faster

What is Page Speed? How to Load Website Faster

NVIDIA’s Marketing Strategy Case Study

NVIDIA's Marketing Strategy Case Study

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

  • Trending
  • Comments
  • Latest
Is C.AI Down? Here Is What To Do Now

Is C.AI Down? Here Is What To Do Now

January 10, 2024
23 Plagiarism Facts and Statistics to Analyze Latest Trends

23 Plagiarism Facts and Statistics to Analyze Latest Trends

June 4, 2024
A faster, better way to prevent an AI chatbot from giving toxic responses | MIT News

A faster, better way to prevent an AI chatbot from giving toxic responses | MIT News

April 10, 2024
Porfo: Revolutionizing the Crypto Wallet Landscape

Porfo: Revolutionizing the Crypto Wallet Landscape

October 9, 2023
Implementing User Authentication in React Apps with Appwrite — SitePoint

Implementing User Authentication in React Apps with Appwrite — SitePoint

January 30, 2024
NousResearch Released Nous-Hermes-2-Mixtral-8x7B: An Open-Source LLM with SFT and DPO Versions

NousResearch Released Nous-Hermes-2-Mixtral-8x7B: An Open-Source LLM with SFT and DPO Versions

January 25, 2024
Can You Guess What Percentage Of Their Wealth The Rich Keep In Cash?

Can You Guess What Percentage Of Their Wealth The Rich Keep In Cash?

June 10, 2024
AI Compared: Which Assistant Is the Best?

AI Compared: Which Assistant Is the Best?

June 10, 2024
How insurance companies can use synthetic data to fight bias

How insurance companies can use synthetic data to fight bias

June 10, 2024
5 SLA metrics you should be monitoring

5 SLA metrics you should be monitoring

June 10, 2024
From Low-Level to High-Level Tasks: Scaling Fine-Tuning with the ANDROIDCONTROL Dataset

From Low-Level to High-Level Tasks: Scaling Fine-Tuning with the ANDROIDCONTROL Dataset

June 10, 2024
UGRO Capital: Targeting to hit milestone of Rs 20,000 cr loan book in 8-10 quarters: Shachindra Nath

UGRO Capital: Targeting to hit milestone of Rs 20,000 cr loan book in 8-10 quarters: Shachindra Nath

June 10, 2024
Facebook Twitter LinkedIn Pinterest RSS
News PouroverAI

The latest news and updates about the AI Technology and Latest Tech Updates around the world... PouroverAI keeps you in the loop.

CATEGORIES

  • AI Technology
  • Automation
  • Blockchain
  • Business
  • Cloud & Programming
  • Data Science & ML
  • Digital Marketing
  • Front-Tech
  • Uncategorized

SITEMAP

  • Disclaimer
  • Privacy Policy
  • DMCA
  • Cookie Privacy Policy
  • Terms and Conditions
  • Contact us

Copyright © 2023 PouroverAI News.
PouroverAI News

No Result
View All Result
  • Home
  • AI Tech
  • Business
  • Blockchain
  • Data Science & ML
  • Cloud & Programming
  • Automation
  • Front-Tech
  • Marketing

Copyright © 2023 PouroverAI News.
PouroverAI News

Welcome Back!

Login to your account below

Forgotten Password? Sign Up

Create New Account!

Fill the forms bellow to register

All fields are required. Log In

Retrieve your password

Please enter your username or email address to reset your password.

Log In