Unsloth: Input IDs of length 10201 > the model‘s max sequence length of 8192.

问题:运行Llama3-Chinese-8B-Instruct微调后的模型,出现Unsloth: Input IDs of length 10201 > the model's max sequence length of 8192.问题。

解决办法:

将max_seq_length = 2048 修改为 max_seq_length = 20402

# 原语句
max_seq_length = 2048
# 修改为下列语句
max_seq_length = 20402

[WARNING|logging.py:329] 2025-03-04 18:56:07,620 >> Unsloth: Input IDs of length 8200 > the model's max sequence length of 8192. We shall truncate it ourselves. It's imperative if you correct this issue first. swanlab: Error happened while training swanlab: 🌟 Run `swanlab watch /root/autodl-tmp/ai/LLaMA-Factory/swanlog` to view SwanLab Experiment Dashboard locally swanlab: 🏠 View project at https://swanlab.cn/@chrisfang/llamafactory-test swanlab: 🚀 View run at https://swanlab.cn/@chrisfang/llamafactory-test/runs/up1xn9h4tc0ynh9sfnogq File "/root/miniconda3/bin/llamafactory-cli", line 8, in <module> sys.exit(main()) ^^^^^^ File "/root/autodl-tmp/ai/LLaMA-Factory/src/llamafactory/cli.py", line 112, in main run_exp() File "/root/autodl-tmp/ai/LLaMA-Factory/src/llamafactory/train/tuner.py", line 93, in run_exp _training_function(config={"args": args, "callbacks": callbacks}) File "/root/autodl-tmp/ai/LLaMA-Factory/src/llamafactory/train/tuner.py", line 67, in _training_function run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks) File "/root/autodl-tmp/ai/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 102, in run_sft train_result = trainer.train(resume_from_checkpoint=training_args.resume_from_checkpoint) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/lib/python3.12/site-packages/transformers/trainer.py", line 2241, in train return inner_training_loop( ^^^^^^^^^^^^^^^^^^^^ File "<string>", line 329, in _fast_inner_training_loop File "<string>", line 31, in _unsloth_training_step File "/root/miniconda3/lib/python3.12/site-packages/unsloth/models/_utils.py", line 1077, in _unsloth_pre_compute_loss return self._old_compute_loss(model, inputs, *args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/root/miniconda3/lib/python3.12/site-packages/transformers/trainer.py", line 3759, in compute_loss outputs = model(**inputs) ^^^^^^^^^^^^^^^ 但是我的数据集最大长度也才6000多
03-08
import tensorflow as tf import tensorflow_hub as hub from tensorflow.keras import layers import bert import numpy as np from transformers import BertTokenizer, BertModel # 设置BERT模型的路径和参数 bert_path = "E:\\AAA\\523\\BERT-pytorch-master\\bert1.ckpt" max_seq_length = 128 train_batch_size = 32 learning_rate = 2e-5 num_train_epochs = 3 # 加载BERT模型 def create_model(): input_word_ids = tf.keras.layers.Input(shape=(max_seq_length,), dtype=tf.int32, name="input_word_ids") input_mask = tf.keras.layers.Input(shape=(max_seq_length,), dtype=tf.int32, name="input_mask") segment_ids = tf.keras.layers.Input(shape=(max_seq_length,), dtype=tf.int32, name="segment_ids") bert_layer = hub.KerasLayer(bert_path, trainable=True) pooled_output, sequence_output = bert_layer([input_word_ids, input_mask, segment_ids]) output = layers.Dense(1, activation='sigmoid')(pooled_output) model = tf.keras.models.Model(inputs=[input_word_ids, input_mask, segment_ids], outputs=output) return model # 准备数据 def create_input_data(sentences, labels): tokenizer = bert.tokenization.FullTokenizer(vocab_file=bert_path + "trainer/vocab.small", do_lower_case=True) # tokenizer = BertTokenizer.from_pretrained('bert-base-uncased') input_ids = [] input_masks = [] segment_ids = [] for sentence in sentences: tokens = tokenizer.tokenize(sentence) tokens = ["[CLS]"] + tokens + ["[SEP]"] input_id = tokenizer.convert_tokens_to_ids(tokens) input_mask = [1] * len(input_id) segment_id = [0] * len(input_id) padding_length = max_seq_length - len(input_id) input_id += [0] * padding_length input_mask += [0] * padding_length segment_id += [0] * padding_length input_ids.append(input_id) input_masks.append(input_mask) segment_ids.append(segment_id) return np.array(input_ids), np.array(input_masks), np.array(segment_ids), np.array(labels) # 加载训练数据 train_sentences = ["Example sentence 1", "Example sentence 2", ...] train_labels = [0, 1, ...] train_input_ids, train_input_masks, train_segment_ids, train_labels = create_input_data(train_sentences, train_labels) # 构建模型 model = create_model() model.compile(optimizer=tf.keras.optimizers.Adam(lr=learning_rate), loss='binary_crossentropy', metrics=['accuracy']) # 开始微调 model.fit([train_input_ids, train_input_masks, train_segment_ids], train_labels, batch_size=train_batch_size, epochs=num_train_epochs)这段代码有什么问题吗?
05-24
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值