Fine-tune Gemma models in Keras using LoRA: Experiment 2

数据架构

Last modified 2024-05-01 12:59:25


Category: Large Language Models. Tags: keras, deep learning, artificial intelligence

First published 2024-05-01 12:55:58

Article link: https://blog.youkuaiyun.com/u011868279/article/details/138368739


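This article describes how to use LoRA in Keras for lightweight fine-tuning of the Gemma 2B language model on downstream tasks, focusing on the Databricks Dolly 15k dataset, in order to reduce the number of trainable parameters and improve training efficiency.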


Fine-tune Gemma models in Keras using LoRA | Kaggle

1.Overview

Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models.

Large Language Models (LLMs) like Gemma have been shown to be effective at a variety of NLP tasks. An LLM is first pre-trained on a large corpus of text in a self-supervised fashion. Pre-training helps LLMs learn general-purpose knowledge, such as statistical relationships between words. An LLM can then be fine-tuned with domain-specific data to perform downstream tasks (such as sentiment analysis).

LLMs are extremely large (parameter counts on the order of billions). Full fine-tuning, which updates every parameter in the model, is not required for most applications because typical fine-tuning datasets are much smaller than the pre-training datasets.

Low Rank Adaptation (LoRA) is a fine-tuning technique which greatly reduces the number of trainable parameters for downstream tasks by freezing the weights of the model and inserting a smaller number of new weights into the model. This makes training with LoRA much faster and more memory-efficient, and produces smaller model weights (a few hundred MBs), all while maintaining the quality of the model outputs.
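To make the "freeze the weights, insert a few new ones" idea concrete, here is a minimal sketch in plain Python. It is not how KerasNLP implements LoRA internally. The function names, the `alpha / rank` scaling (taken from the usual LoRA formulation) and the 2048 x 2048 layer size are illustrative assumptions. It shows the forward pass `y = W x + (alpha / r) * B (A x)` and compares trainable parameter counts for a single dense layer.

```python
# Illustrative LoRA sketch in pure Python; every name here is hypothetical.

def matvec(matrix, vector):
    """Multiply a matrix (stored as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha=1.0):
    """Compute y = W x + (alpha / r) * B (A x).

    W (d_out x d_in) is the frozen pre-trained weight matrix.
    A (r x d_in) and B (d_out x r) are the small trainable matrices.
    """
    rank = len(A)
    base = matvec(W, x)                 # frozen path, never updated
    update = matvec(B, matvec(A, x))    # low-rank trainable path
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


def param_counts(d_in, d_out, rank):
    """Return (full fine-tuning params, LoRA trainable params) for one dense layer."""
    return d_in * d_out, rank * (d_in + d_out)


W = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]   # d_out = 2, d_in = 3
A = [[1.0, 1.0, 1.0]]                     # rank r = 1
B_zero = [[0.0], [0.0]]                   # B starts at zero, so the model is unchanged at first
B_trained = [[1.0], [0.0]]                # pretend training moved B away from zero
x = [1.0, 2.0, 3.0]

y_initial = lora_forward(W, A, B_zero, x)      # same as the frozen model's output
y_adapted = lora_forward(W, A, B_trained, x)   # frozen output plus the low-rank update

# Arbitrary illustrative layer size, not a claim about Gemma's architecture.
full, lora = param_counts(d_in=2048, d_out=2048, rank=4)
print(y_initial, y_adapted, full, lora)
```

For this toy layer the LoRA path trains 16,384 values instead of 4,194,304, which is where the memory and speed savings described above come from.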

This tutorial walks you through using KerasNLP to perform LoRA fine-tuning on a Gemma 2B model using the Databricks Dolly 15k dataset. This dataset contains 15,000 high-quality human-generated prompt / response pairs specifically designed for fine-tuning LLMs.
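As a rough illustration of what "prompt / response pairs" look like once prepared for training, the sketch below turns JSON-lines records into single training strings. It assumes each record has `instruction`, `context` and `response` fields, as in the public Dolly 15k release. The template text, the choice to skip records that carry a context, and the two sample records are all assumptions made for this example, not data from the dataset.

```python
import json

# Hypothetical prompt template; the exact wording is an assumption.
TEMPLATE = "Instruction:\n{instruction}\n\nResponse:\n{response}"


def format_records(jsonl_lines, max_examples=None):
    """Convert JSON-lines records into training strings.

    Records with a non-empty "context" field are skipped to keep the
    example simple (an illustrative choice, not a requirement).
    """
    examples = []
    for line in jsonl_lines:
        record = json.loads(line)
        if record.get("context"):
            continue
        examples.append(
            TEMPLATE.format(
                instruction=record["instruction"],
                response=record["response"],
            )
        )
        if max_examples is not None and len(examples) >= max_examples:
            break
    return examples


# Made-up sample records, used only to exercise the function.
sample_lines = [
    json.dumps({"instruction": "Name a primary color.", "context": "", "response": "Red."}),
    json.dumps({"instruction": "Summarize the text.", "context": "Some passage.", "response": "A summary."}),
]

formatted = format_records(sample_lines)
print(formatted[0])
```

Capping the number of examples with `max_examples` is one way to shorten a trial run; using more of the dataset would generally be expected to give better fine-tuning results.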

2.Setup

Get access to Gemma

To complete this tutorial, you will first need to complete the setup instructions at Gemma setup. The Gemma setup instructions show you how to do the following:

Gemma models are hosted by Kaggle. To use Gemma, request access on Kaggle: