Overview
Gemma is a family of lightweight, state-of-the-art open models built from the research and technology used to create the Google Gemini models. Gemma can be further finetuned to suit specific needs. But large language models such as Gemma can be very large, and some of them may not fit on a single accelerator for finetuning. In this case there are two general approaches for finetuning them:
- Parameter Efficient Fine-Tuning (PEFT), which shrinks the number of trainable parameters at the cost of some fidelity. LoRA falls in this category, and the Finetune Gemma models in Keras using LoRA tutorial demonstrates how to finetune the Gemma 2B model gemma_2b_en with LoRA using KerasNLP on a single GPU.
- Full-parameter finetuning with model parallelism. Model parallelism distributes a single model's weights across multiple devices and enables horizontal scaling. You can find out more about distributed training in this Keras guide.
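As a back-of-envelope illustration of why PEFT methods like LoRA shrink the training footprint, compare the trainable parameters of one dense layer under full finetuning versus a low-rank adapter. The layer size and rank below are illustrative assumptions, not exact Gemma figures:

```python
# Rough comparison of trainable parameters: full finetuning vs. LoRA
# on a single weight matrix. Sizes here are illustrative only.

def lora_params(d_in, d_out, rank):
    """Parameters LoRA trains for one d_in x d_out weight matrix:
    two low-rank factors, A (d_in x rank) and B (rank x d_out)."""
    return d_in * rank + rank * d_out

full = 4096 * 4096                        # full finetuning of one layer
lora = lora_params(4096, 4096, rank=4)    # same layer with rank-4 LoRA
print(full, lora, full // lora)           # prints: 16777216 32768 512
```

At rank 4, the adapter trains roughly 1/512th of the layer's weights, which is what lets a large model fit on a single accelerator for tuning.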
This tutorial walks you through using Keras with a JAX backend to finetune the Gemma 7B model with LoRA and model-parallelism distributed training on Google's Tensor Processing Units (TPUs). Note that LoRA can be turned off in this tutorial for slower but more accurate full-parameter tuning.
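The backend selection, sharding setup, and LoRA switch described above might look roughly like the following configuration sketch. It assumes the KerasNLP preset name gemma_7b_en and the Keras 3 distribution API; exact constructor signatures vary across Keras versions, and running it requires TPU/GPU hardware plus access to the Gemma weights:

```python
import os

# Keras reads the backend before it is first imported; "jax" enables
# the JAX-based distribution API used below.
os.environ["KERAS_BACKEND"] = "jax"

import keras
import keras_nlp

# Build a mesh over all available accelerators (e.g. 8 TPU v3 cores)
# with a "model" axis for sharding weights across devices.
devices = keras.distribution.list_devices()
device_mesh = keras.distribution.DeviceMesh(
    shape=(1, len(devices)), axis_names=("batch", "model"), devices=devices)

# A LayoutMap tells Keras how to place weights on the mesh; entries
# mapping weight-name patterns to layouts would be added here.
layout_map = keras.distribution.LayoutMap(device_mesh)
keras.distribution.set_distribution(
    keras.distribution.ModelParallel(layout_map=layout_map,
                                     batch_dim_name="batch"))

model = keras_nlp.models.GemmaCausalLM.from_preset("gemma_7b_en")
# LoRA on for faster tuning; omit this line for full-parameter tuning.
model.backbone.enable_lora(rank=4)
```

The key design point is that the distribution is set globally before the model is instantiated, so the 7B weights are sharded across the mesh as they are loaded rather than materialized on one device first.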
Using accelerators
Technically you can use either TPU or GPU for this tutorial.
Notes on TPU environments
Google has 3 products that provide TPUs:
- Colab provides TPU v2, which is not sufficient for this tutorial.
- Kaggle offers TPU v3 accelerators for free, and they work for this tutorial.
- Cloud TPU offers TPU v3 and newer generations. One way to set it up is:
  - Create a new TPU VM
  - Set up SSH port forwarding for your intended Jupyter server port
  - Install Jupyter and start it on the TPU VM, then connect to Colab through "Connect to a local runtime"
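The Cloud TPU steps above can be sketched with gcloud as follows. The VM name, zone, accelerator type, and runtime version are placeholder assumptions; substitute values available in your project:

```shell
# 1. Create a new TPU VM (hypothetical name/zone; check availability).
gcloud compute tpus tpu-vm create my-tpu-vm \
    --zone=us-central1-a \
    --accelerator-type=v3-8 \
    --version=tpu-vm-base

# 2. SSH in, forwarding Jupyter's default port 8888 to your machine.
gcloud compute tpus tpu-vm ssh my-tpu-vm \
    --zone=us-central1-a -- -L 8888:localhost:8888

# 3. On the TPU VM: install and start Jupyter, then paste the printed
#    http://localhost:8888/?token=... URL into Colab's
#    "Connect to a local runtime".
pip install jupyter
jupyter notebook --no-browser --port=8888
```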
Notes on multi-GPU setup
Although this tutorial foc