Keras Gemma distributed finetuning and inference: a hands-on project

This article describes how to perform parameter-efficient (LoRA) finetuning of the Gemma 7B model on Google TPUs using the Keras and JAX libraries, and shows how to combine it with model parallelism so the workload fits single-device or multi-GPU environments. The tutorial walks through setting up the Gemma model, training with LoRA on TPUs, and finetuning on the IMDB dataset to change the model's output style.

Overview

Gemma is a family of lightweight, state-of-the-art open models built from research and technology used to create Google Gemini models. Gemma can be further finetuned to suit specific needs. But large language models such as Gemma can be very large, and some of them may not fit on a single accelerator for finetuning. In this case there are two general approaches for finetuning them:

  1. Parameter Efficient Fine-Tuning (PEFT), which seeks to shrink the effective model size by sacrificing some fidelity. LoRA falls in this category, and the Finetune Gemma models in Keras using LoRA tutorial demonstrates how to finetune the Gemma 2B model gemma_2b_en with LoRA using KerasNLP on a single GPU (a minimal sketch follows this list).
  2. Full parameter finetuning with model parallelism. Model parallelism distributes a single model's weights across multiple devices and enables horizontal scaling. You can find out more about distributed training in this Keras guide.
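
For approach 1, the key step is simply enabling LoRA on the loaded model. The following is a minimal sketch, assuming KerasNLP's GemmaCausalLM API and the gemma_2b_en preset; it is not a substitute for the full LoRA tutorial:

```python
import keras_nlp

# Approach 1: load Gemma 2B and enable rank-4 LoRA, so only the small low-rank
# adapter matrices are trainable while the base weights stay frozen.
gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma_2b_en")
gemma_lm.backbone.enable_lora(rank=4)
gemma_lm.summary()  # trainable parameter count drops to a few million
```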

This tutorial walks you through using Keras with a JAX backend to finetune the Gemma 7B model with LoRA and model-parallelism distributed training on Google's Tensor Processing Unit (TPU). Note that LoRA can be turned off in this tutorial for slower but more accurate full-parameter tuning.
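
The core of that setup is Keras 3's keras.distribution API: build a device mesh, map weight-path regexes to sharding layouts, activate the distribution, and only then construct the model. The sketch below is a minimal outline under those assumptions (JAX backend, KerasNLP's GemmaCausalLM, an 8-core TPU); the layout-map entries mirror the variable paths used by the Gemma backbone and may need adjusting for your Keras/KerasNLP versions.

```python
import os
os.environ["KERAS_BACKEND"] = "jax"  # must be set before importing keras

import keras
import keras_nlp

# A 1 x 8 mesh: the "batch" axis has size 1 (data replicated), and weights are
# sharded across the 8 TPU cores along the "model" axis.
device_mesh = keras.distribution.DeviceMesh(
    (1, 8), ["batch", "model"], devices=keras.distribution.list_devices()
)

# Map weight-path regexes to layouts; anything unmatched stays fully replicated.
model_dim = "model"
layout_map = keras.distribution.LayoutMap(device_mesh)
layout_map["token_embedding/embeddings"] = (model_dim, None)
layout_map["decoder_block.*attention.*(query|key|value).kernel"] = (model_dim, None, None)
layout_map["decoder_block.*attention_output.kernel"] = (model_dim, None, None)
layout_map["decoder_block.*ffw_gating.kernel"] = (None, model_dim)
layout_map["decoder_block.*ffw_linear.kernel"] = (model_dim, None)

# Activate model parallelism globally, then build the model under it.
# (Older Keras 3 releases took the mesh positionally:
#  ModelParallel(device_mesh, layout_map, batch_dim_name="batch").)
keras.distribution.set_distribution(
    keras.distribution.ModelParallel(layout_map=layout_map, batch_dim_name="batch")
)

gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma_7b_en")
gemma_lm.backbone.enable_lora(rank=4)  # comment out for full-parameter tuning
```

With enable_lora, only the small adapter matrices are trained; skipping that call gives the slower but more accurate full-parameter run mentioned above.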

Using accelerators

Technically you can use either TPU or GPU for this tutorial.

Notes on TPU environments

Google has 3 products that provide TPUs:

  • Colab provides TPU v2, which is not sufficient for this tutorial.
  • Kaggle offers TPU v3 for free, which works for this tutorial.
  • Cloud TPU offers TPU v3 and newer generations. One way to set it up is:
    1. Create a new TPU VM
    2. Set up SSH port forwarding for your intended Jupyter server port
    3. Install Jupyter and start it on the TPU VM, then connect to Colab through "Connect to a local runtime" (you can then verify that all TPU cores are visible, as in the check below)
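
Whichever TPU environment you pick, it is worth confirming that the runtime actually sees all cores before building the model. A quick check with JAX (assuming the JAX backend used in this tutorial):

```python
import jax

# A Kaggle TPU v3-8 or a single-host Cloud TPU VM should report 8 devices.
print(jax.devices())       # e.g. [TpuDevice(id=0, ...), ..., TpuDevice(id=7, ...)]
print(jax.device_count())  # expected: 8
```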

Notes on multi-GPU setup

Although this tutorial focuses on the TPU use case, you can adapt it to a machine with multiple GPUs.
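
On a multi-GPU machine the main change is how the device mesh is built: size the "model" axis to the number of visible GPUs instead of 8 TPU cores. A hedged sketch, assuming the same keras.distribution API as above:

```python
import keras

# Shard the model across the local GPUs; the layout map and the rest of the
# tutorial stay the same.
gpus = keras.distribution.list_devices("gpu")
device_mesh = keras.distribution.DeviceMesh(
    (1, len(gpus)), ["batch", "model"], devices=gpus
)
```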
