A Deep Dive into Fine-Tuning. Stepping out of the “comfort zone” —… | by Aris Tsakpinis | Jun, 2024

June 3, 2024
in AI Technology


Stepping out of the “comfort zone” — part 3/3 of a deep-dive into domain adaptation approaches for LLMs

Towards Data Science
Photo by StableDiffusionXL on Amazon Web Services

Are you exploring how to adapt large language models (LLMs) to your specific domain or use case? This three-part blog post series explains the motivation for domain adaptation and dives deep into the various options for doing so. It also provides a detailed guide for mastering the entire domain adaptation journey, covering the most popular tradeoffs.

Part 1: Introduction into domain adaptation — motivation, options, tradeoffs
Part 2: A deep dive into in-context learning
Part 3: A deep dive into fine-tuning — You’re here!

Note: All images, unless otherwise noted, are by the author.

In the previous part of this blog post series, we explored the concept of in-context learning as a powerful approach to overcome the “comfort zone” limitations of large language models (LLMs). We discussed how these techniques can be used to transform tasks and move them back into the models’ areas of expertise, leading to improved performance and alignment with the key design principles of Helpfulness, Honesty, and Harmlessness. In this third part, we will shift our focus to the second domain adaptation approach: fine-tuning. We will dive into the details of fine-tuning, exploring how it can be leveraged to expand the models’ “comfort zones” and thus improve performance by adapting them to specific domains and tasks. We will discuss the trade-offs between prompt engineering and fine-tuning, and provide guidance on when to choose one approach over the other based on factors such as data velocity, task ambiguity, and other considerations.

Most state-of-the-art LLMs are powered by the transformer architecture, a family of deep neural network architectures that has disrupted the field of NLP since being proposed by Vaswani et al. in 2017, breaking all common benchmarks across the domain. The core differentiator of this architecture family is a concept called “attention”, which excels at capturing the semantic meaning of words, or larger pieces of natural language, based on the context in which they are used.
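To make the attention idea concrete, here is a minimal sketch of scaled dot-product attention on plain Python lists (a toy stand-in for the optimized tensor implementations real transformers use): each token’s output is a weighted mix of value vectors, with weights derived from query-key similarity.

```python
import math

# Toy sketch of scaled dot-product attention on plain Python lists.
# queries, keys, values: lists of equal-length float vectors.
def attention(queries, keys, values):
    d = len(queries[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        # Softmax turns raw scores into attention weights summing to 1.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Output is the attention-weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out
```

This is how context enters the picture: the same token can receive a different output vector depending on which other tokens it attends to.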

The transformer architecture consists of two fundamentally different building blocks. On one side, the “encoder” block focuses on translating the semantics of natural language into so-called contextualized embeddings, which are mathematical representations in the vector space. This makes encoder models particularly useful in use cases utilizing these vector representations for downstream deterministic or probabilistic tasks like classification problems, NER, or semantic search. On the other side, the decoder block is trained on next-token prediction and is hence capable of generating text when used recursively. Decoder models can be used for all tasks relying on the generation of text. These building blocks can be used independently of each other, but also in combination. Most of the models referred to in the field of generative AI today are decoder-only models, which is why this blog post will focus on this type of model.
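The recursive use of a decoder can be sketched with a toy greedy decoding loop. Here `predict_next` is a hypothetical stand-in for a real transformer forward pass; the point is only the control flow: each predicted token is appended to the sequence, which is then fed back in.

```python
# Toy illustration of recursive next-token decoding with a decoder-only model.
# `predict_next` is a hypothetical stand-in for a trained model's prediction.
def predict_next(tokens):
    continuations = {
        ("Berlin", "is"): "the",
        ("is", "the"): "capital",
        ("the", "capital"): "of",
        ("capital", "of"): "Germany",
        ("of", "Germany"): ".",
    }
    return continuations.get(tuple(tokens[-2:]), "<eos>")

def generate(prompt_tokens, max_new_tokens=10):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        nxt = predict_next(tokens)  # feed the growing sequence back in
        if nxt == "<eos>":          # stop when the model signals end-of-text
            break
        tokens.append(nxt)
    return tokens
```

Real decoders emit a probability distribution over the whole vocabulary at each step; greedy decoding, as sketched here, simply takes the most likely token.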

Figure 1: The transformer architecture (adapted from Vaswani et al, 2017)

Fine-tuning leverages transfer learning to efficiently inject niche expertise into a foundation model like LLaMA2. The process involves updating the model’s weights through training on domain-specific data, while keeping the overall network architecture unchanged. Unlike full pre-training which requires massive datasets and compute, fine-tuning is highly sample and compute efficient. On a high level, the end-to-end process can be broken down into the following phases:

Figure 2: E2E fine-tuning pipeline

Data collection and selection: The set of proprietary data to be ingested into the model needs to be carefully selected. On top of that, for specific fine-tuning purposes, data might not be available yet and has to be purposely collected. Depending on the data available and the task to be achieved through fine-tuning, data of different quantitative or qualitative characteristics might be selected (e.g. labeled, unlabeled, or preference data — see below). Besides the data quality aspect, dimensions like data source, confidentiality and IP, licensing, copyright, PII, and more need to be considered.

While LLM pre-training usually leverages a mix of web scrapes and curated corpora, the nature of fine-tuning as a domain adaptation approach implies that the datasets used are mostly curated corpora of labeled or unlabeled data specific to an organizational, knowledge, or task-specific domain.

Figure 3: Pre-training vs. fine-tuning: data composition and selection criteria

While this data can be sourced differently (document repositories, human-created content, etc.), this underlines that for fine-tuning, it is important to carefully select the data with respect to quality, but as mentioned above, also consider topics like confidentiality and IP, licensing, copyright, PII, and others.

Figure 4: Data requirements per fine-tuning approach

In addition, an important dimension is the categorization of the training dataset into unlabeled and labeled (including preference) data. Domain adaptation fine-tuning requires unlabeled textual data (as opposed to other fine-tuning approaches, see Figure 4). In other words, we can simply use any full-text documents in natural language that we consider to be of relevant content and sufficient quality. These could be user manuals, internal documentation, or even legal contracts, depending on the actual use case.

On the other hand, labeled datasets like an instruction-context-response dataset can be used for supervised fine-tuning approaches. Lately, reinforcement learning approaches for aligning models to actual user feedback have shown great results, leveraging human- or machine-created preference data, e.g., binary human feedback (thumbs up/down) or multi-response ranking.
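A labeled instruction-context-response record is typically rendered into a single training string before supervised fine-tuning. The following sketch uses a hypothetical template in the style of common SFT formats; the exact template wording is an assumption, not one prescribed by the article.

```python
# Hypothetical SFT prompt template (one of many common styles).
TEMPLATE = (
    "### Instruction:\n{instruction}\n\n"
    "### Context:\n{context}\n\n"
    "### Response:\n{response}"
)

def render(record):
    """Render an instruction-context-response dict into one training string."""
    return TEMPLATE.format(**record)
```

During training, the loss is then typically computed on (at least) the response portion of this string, tying alignment to the target label.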

As opposed to unlabeled data, labeled datasets are more difficult and expensive to collect, especially at scale and with sufficient domain expertise. Open-source data hubs like HuggingFace Datasets can be good sources for labeled datasets, especially in areas where broad human consensus exists (e.g., a toxicity dataset for red-teaming) and an open-source dataset is a sufficient proxy for the preferences of the model’s real users.

Recently, synthetic data collection has gained increasing attention in the space of fine-tuning. This is the practice of using powerful LLMs to synthetically create labeled datasets, be it for SFT or preference alignment. Even though this approach has already shown promising results, it is currently still subject to further research and has yet to prove itself useful at scale in practice.

Data pre-processing: The selected data needs to be pre-processed to make it “well digestible” for the downstream training algorithm. Popular pre-processing steps are the following:
  • Quality-related pre-processing, e.g. formatting, deduplication, PII filtering
  • Fine-tuning-approach-related pre-processing, e.g. rendering into prompt templates for supervised fine-tuning
  • NLP-related pre-processing, e.g. tokenisation, embedding, chunking (according to the context window)

Model training: Training of the deep neural network according to the selected fine-tuning approach. Popular fine-tuning approaches we will discuss in detail further below are:
  • Continued pre-training, aka domain-adaptation fine-tuning: training on full-text data, with alignment tied to a next-token-prediction task
  • Supervised fine-tuning: fine-tuning leveraging labeled data, with alignment tied towards the target label
  • Preference-alignment approaches: fine-tuning leveraging preference data, aligning to a desired behaviour defined by the actual users of a model / system
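Two of the pre-processing steps above, deduplication and chunking to the context window, can be sketched in a few lines. This toy version uses exact-match deduplication and whitespace tokens as a stand-in for a real subword tokenizer (both simplifying assumptions).

```python
def preprocess(documents, context_window=512):
    # Exact-match deduplication: keep the first occurrence of each document,
    # comparing on whitespace-normalized text.
    seen, unique = set(), []
    for doc in documents:
        key = " ".join(doc.split())
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    # Chunking: split each document into pieces that fit the context window.
    # Whitespace tokens stand in for a real subword tokenizer here.
    chunks = []
    for doc in unique:
        tokens = doc.split()
        for i in range(0, len(tokens), context_window):
            chunks.append(" ".join(tokens[i:i + context_window]))
    return chunks
```

Production pipelines typically go further, e.g. fuzzy/near-duplicate detection and PII filtering, but the shape of the pipeline is the same: filter first, then cut to model-sized pieces.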

Subsequently, we will dive deeper into the single phases, starting with an introduction to the training approach and different fine-tuning approaches before we move over to the dataset and data processing requirements.

In this section we will explore the approach for training decoder transformer models. This applies to pre-training as well as fine-tuning.

As opposed to traditional ML training approaches like unsupervised learning with unlabeled data or supervised learning with labeled data, the training of transformer models utilizes a hybrid approach referred to as self-supervised learning. Although the algorithm is fed with unlabeled textual data, it intrinsically supervises itself by masking specific input tokens. Given the input sequence of tokens “Berlin is the capital of Germany.”, this natively leads to a supervised sample with y being the masked token and X being the rest.
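This self-supervision can be made explicit: a single unlabeled sentence natively yields a whole set of supervised (X, y) pairs for next-token prediction, one per position. A minimal sketch:

```python
# Sketch: derive supervised (X, y) next-token-prediction pairs
# from a single unlabeled token sequence.
def next_token_pairs(tokens):
    # For each position i, the prefix tokens[:i] is the input X
    # and the token at position i is the target y.
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]
```

No human labeling is needed; the "labels" come for free from the text itself, which is what makes self-supervised training scale to web-sized corpora.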
