Amazon SageMaker JumpStart offers a selection of built-in algorithms, pre-trained models, and solution templates to assist data scientists and machine learning practitioners in quickly training and deploying ML models. These algorithms and models can be used for both supervised and unsupervised learning across various input data types such as images, text, and tabular data.
This article focuses on utilizing text classification and fill-mask models available on Hugging Face within SageMaker JumpStart for text classification on a custom dataset. It also demonstrates real-time and batch inference for these models. The supervised learning algorithm supports transfer learning for all pre-trained models on Hugging Face, allowing for fine-tuning even with limited text data. This functionality is available in the SageMaker JumpStart UI in Amazon SageMaker Studio and can also be accessed through the SageMaker Python SDK.
Text classification with Hugging Face in SageMaker works by attaching a classification layer to the pre-trained Hugging Face model, sized according to the number of class labels in the training data. This layer can then be fine-tuned on custom training data, which makes training feasible with smaller datasets. The article also explains how to run inference on the pre-trained model and how to fine-tune it on a custom dataset.
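To make the idea of a classification layer sized by the number of class labels concrete, here is a minimal pure-Python sketch. It is not the JumpStart implementation: the pre-trained model is replaced by a hard-coded embedding vector, and the names `count_classes`, `init_head`, and `classify`, as well as the hidden size of 8, are assumptions chosen only for illustration.

```python
import math
import random


def count_classes(labels):
    """Number of distinct class labels found in the training data."""
    return len(set(labels))


def init_head(hidden_size, num_classes, seed=0):
    """Randomly initialised linear layer: one weight row and one bias per class."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 0.02) for _ in range(hidden_size)]
               for _ in range(num_classes)]
    biases = [0.0] * num_classes
    return weights, biases


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(embedding, head):
    """Apply the linear layer to an embedding and return class probabilities."""
    weights, biases = head
    logits = [sum(w * x for w, x in zip(row, embedding)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


# Hypothetical training labels; three distinct classes give a three-way head.
train_labels = [0, 1, 2, 1, 0]
hidden_size = 8  # assumed size of the pre-trained model's output vector
head = init_head(hidden_size, count_classes(train_labels))

# Stand-in for the pre-trained model's output for one input text.
embedding = [0.1 * i for i in range(hidden_size)]
probs = classify(embedding, head)
```

Fine-tuning would adjust `weights` and `biases` by gradient descent on the labelled examples. That step is omitted here because it depends on the training framework.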
To fine-tune a pre-trained model, the training data should be in CSV format with class labels and corresponding text data. The fine-tuned model can be deployed for inference or trained further. The article also covers how to adjust hyperparameters for training and how to use the default training dataset provided for fine-tuning. It further explains how to launch a training job for fine-tuning the model on a custom dataset using SageMaker.
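The CSV requirement can be sketched with Python's standard `csv` module. The layout below (no header row, an integer class label in the first column, the text in the second) is an assumption about the expected format, and the helper names and example sentences are made up for illustration. Check the algorithm's documentation for the exact file name and layout before uploading data.

```python
import csv
import io


def write_training_csv(rows, handle):
    """Write (label, text) pairs as headerless CSV rows: label first, text second."""
    writer = csv.writer(handle)
    for label, text in rows:
        if not isinstance(label, int) or label < 0:
            raise ValueError(f"label must be a non-negative integer, got {label!r}")
        writer.writerow([label, text])


def read_training_csv(handle):
    """Read the rows back as (int label, text) pairs."""
    return [(int(label), text) for label, text in csv.reader(handle)]


# Made-up examples; commas and quotes in the text are escaped by the csv module.
examples = [
    (0, "The plot was thin, and the acting worse."),
    (1, 'A "must-see" for the whole family.'),
]

buffer = io.StringIO()  # in-memory file so the sketch has no side effects
write_training_csv(examples, buffer)
buffer.seek(0)
round_trip = read_training_csv(buffer)
```

To write a real file instead of an in-memory buffer, open it with `open(path, "w", newline="", encoding="utf-8")` so that the `csv` module controls the line endings.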
For users interested in fine-tuning any Hugging Face fill-mask or text classification model, the article outlines how to download the model from the Hugging Face hub and perform the fine-tuning process. It also introduces SageMaker automatic model tuning (AMT), which finds the best model version by running multiple training jobs over specified hyperparameter ranges.
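The tuning idea of running several training jobs over hyperparameter ranges and keeping the best result can be illustrated with a toy random search. This is not the SageMaker API: `toy_training_job` is a made-up scoring function standing in for a real training run, and the hyperparameter names, ranges, and job count are assumptions chosen for the example.

```python
import math
import random


def sample(ranges, rng):
    """Draw one configuration: log-uniform learning rate, uniform integer batch size."""
    low, high = ranges["learning_rate"]
    learning_rate = math.exp(rng.uniform(math.log(low), math.log(high)))
    batch_size = rng.randint(*ranges["batch_size"])
    return {"learning_rate": learning_rate, "batch_size": batch_size}


def toy_training_job(config):
    """Made-up validation score that peaks near a learning rate of 5e-5.

    A stand-in for a real training job; it trains nothing.
    """
    distance = abs(math.log10(config["learning_rate"]) - math.log10(5e-5))
    return max(0.0, 0.9 - 0.2 * distance)


def tune(ranges, max_jobs, seed=0):
    """Run max_jobs trials and return the (score, config) pair with the best score."""
    rng = random.Random(seed)
    trials = []
    for _ in range(max_jobs):
        config = sample(ranges, rng)
        trials.append((toy_training_job(config), config))
    return max(trials, key=lambda trial: trial[0])


# Hypothetical search space.
ranges = {"learning_rate": (1e-6, 1e-3), "batch_size": (8, 64)}
best_score, best_config = tune(ranges, max_jobs=10)
```

A managed tuning service can use a smarter search strategy than uniform random sampling, but the contract is the same: a search space, a budget of jobs, and an objective metric used to rank the results.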