DeepSeek-R1模型现已在Amazon Bedrock Marketplace、Amazon SageMaker JumpStart上线,并且在Amazon Bedrock上推出了Serverless版本,该模型以其深入且详尽的思考方式而广受欢迎。根据DeepSeek公布的结果,这种思考方式使DeepSeek-R1模型在AIME-2024和MATH-500等极具挑战性的数学基准测试中表现优异。
在训练过程中,研究人员展示了DeepSeek-R1-Zero如何自然而然地学会用更多的思考时间来解决问题,从而提升性能。然而在推理阶段所需要的思考tokens数量,以及在回答原始问题之前生成这些tokens所耗费的时间和成本通常被忽视。
本文将介绍如何使用Amazon Bedrock的提示词优化功能,来优化DeepSeek-R1等推理模型。
长推理链和最大tokens限制带来的挑战
在DeepSeek-R1上尝试提问如下这一简单问题。
这是一个数学问题:Nate的小狗一天能挖六个洞。Nate度假时,它挖了14天。Nate回家后,开始每天填9个洞,但小狗每天晚上都会继续挖6个新洞。要填完所有的洞,需要几周时间?请写出解题步骤。
在Amazon Bedrock Chat/Text Playground上,您可以选择新的DeepSeek-R1模型,如下图所示。
您可能会发现,有时基于问题本身,推理模型并无法在总体最大tokens预算内完成思考。
增加输出tokens预算可以让模型进行更长时间的思考,例如将最大tokens数从2048增加到4096,可以看到模型在打印最终答案之前,先推理了一段时间。
本文末尾的附录提供了完整答案,您可以跳过推理步骤查看最终答案。
从2048个tokens的预算案例中可以看到,思考过程并未结束,这不仅花费了相应的时间和成本,而且也没有得到最终答案。思考过程需要如此多的tokens,不免会让人产生如下疑问。
是否有可能在减少思考所用的tokens数的同时,仍然可以得到正确答案?
是否可以限制思维tokens的最大数量或思维预算?
从更高层次上讲,诸如DeepSeek这类需要大量思考的模型是否应该用于实时应用?
本文将展示如何使用Amazon Bedrock的提示词优化功能,来优化DeepSeek-R1等推理模型,从而在不牺牲准确性的情况下获得更简洁的思考过程。
优化DeepSeek-R1模型的提示词
要开始提示词优化,请在Amazon Bedrock的model playground中选择DeepSeek-R1模型,输入您的提示词,然后选择魔法棒图标,或者使用Amazon Bedrock的optimize_prompt() API。您也可以在控制台上使用提示词优化功能,根据需要添加变量,将模型设置为DeepSeek-R1并设置模型参数,然后点击“优化”。
为了展示Amazon Bedrock的提示词优化功能如何助力推理模型,首先需要一个具有挑战性的数据集。Humanity’s Last Exam(HLE)是一项涵盖数十个学科领域中极具挑战性问题的基准测试,旨在成为全面检验学术能力的“终极”封闭式基准测试。HLE是多模态的,既有纯文本的问题,又有附带参考图片的问题,还包括用于自动答案验证的多选题和精确匹配题。
这些问题要求答题者具备各个领域的深厚专业知识,表述清晰明确,难以通过简单的网络搜索或数据库检索找到答案。因此,一些先进模型(包括思考模型)在该基准测试上的表现都不尽人意(详见本文结果表)。
如下是该数据集中的一个示例问题。
In an alternate universe where the mass of the electron was 1% heavier and the charges of the
electron and proton were both 1% smaller, but all other fundamental constants stayed the same,
approximately how would the speed of sound in diamond change?
Answer Choices:
A. Decrease by 2%
B. Decrease by 1.5%
C. Decrease by 1%
D. Decrease by 0.5%
E. Stay approximately the same
F. Increase by 0.5%
G. Increase by 1%
H. Increase by 1.5%
I. Increase by 2%
左右滑动查看完整示意
要回答这一问题,需要对物理学有深刻的理解,而这是目前大多数大语言模型(LLMs)无法做到的。在Amazon Bedrock上对推理模型进行提示词优化,旨在减少思考tokens的数量,同时又不牺牲准确性。使用提示词优化功能后,优化后的提示词如下。
## Question
<extracted_question_1>In an alternate universe where the mass of the electron was 1% heavier
and the charges of the electron and proton were both 1% smaller, but all other fundamental constants
stayed the same, approximately how would the speed of sound in diamond change?
Answer Choices:
A. Decrease by 2%
B. Decrease by 1.5%
C. Decrease by 1%
D. Decrease by 0.5%
E. Stay approximately the same
F. Increase by 0.5%
G. Increase by 1%
H. Increase by 1.5%
I. Increase by 2%</extracted_question_1>
## Instruction
Read the question above carefully and provide the most accurate answer possible.
If multiple choice options are provided within the question, respond with the entire text of the
correct answer option, not just the letter or number. Do not include any additional explanations or
preamble in your response.
Remember, your goal is to answer as precisely and accurately as possible!
左右滑动查看完整示意
下图展示了在这个特定案例中,思考tokens的数量减少了35%(从5000个减少到了3300个),仍然得出了正确答案。同时在本例和其他使用原始提示词的示例中,部分推理在得出最终答案之前出现了总结或重复的情况。
从本例可见,优化后的提示词表述清晰、条理分明,能够根据不同的问题类型和回答方式提供针对性的指导,从而简化了推理过程,使其更加直观明了,同时也便于直接获取最终答案。
使用优化后的提示词能够获得正确答案,而非之前长时间思考后得到的却可能是错误答案,这是因为思考并不能确保最终答案的正确性。在本例中,思考tokens的数量从5000个减少至1555个,且答案直接得出,无需再经过冗长的解释。以下图表展示了一个具体示例。
上述两个示例展示了如何利用提示词优化减少DeepSeek-R1等模型的输出tokens数量的同时,改善输出结果。
本演示还将提示词优化应用于来自HLE的400个问题,结果如下表。
观察结果可见,在HLE数据集的这一特定子集上,整体准确率显著提升至11%。同时思考和输出tokens的数量均有所减少,从而缩短了处理时间以及降低了成本,并且思考完成的总体比率也提高到了90%。
通过实验可以发现,尽管进行提示词优化时并未直接提及减少思考tokens数量,但提示词优化后给出的更明确更详尽的任务指令,可能降低了DeepSeek-R1等模型在自我澄清或深入理解问题方面所需的额外工作量。对推理模型进行提示词优化,可以确保思考和整体流程的质量(这种质量是自适应的且依赖于具体问题)基本不受影响,从而能够得出更加优质的答案。
总结
本文深入探讨了如何利用Amazon Bedrock的提示词优化功能,有效提升诸如DeepSeek-R1等需要思维密集型模型的性能。通过针对HLE数据集的实验,证明了经过优化后的提示词不仅能够显著减少思考tokens的数量,还能将整体准确率从8.75%提升至11%。这一优化在保障答案质量的同时,还使推理过程变得更加高效,进而加快了响应速度并降低了成本。
效率和效果的双重提升,充分表明在需要平衡准确率与计算资源的生产环境中,提示词优化是部署思维密集型模型的一项宝贵工具。随着AI领域的持续进步,更加复杂的思维模型不断涌现,提示词优化等技术在实践应用中的价值将愈发凸显。
要开始使用Amazon Bedrock的提示词优化功能,请参阅《提示词优化》用户指南以及《通过Amazon Bedrock的提示词优化提升生成式AI应用的性能》博客。
《提示词优化》用户指南:
https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-management-optimize.html
《通过Amazon Bedrock的提示词优化提升生成式AI应用的性能》:
https://aws.amazon.com/blogs/machine-learning/improve-the-performance-of-your-generative-ai-applications-with-prompt-optimization-on-amazon-bedrock/
附录
以下是前文所述的关于Nate的狗挖洞问题的完整回答。
Okay, let's see. So the problem is about Nate's dog digging holes and Nate trying to fill them.
I need to figure out how many weeks it takes Nate to fill all the holes once he starts. Let me break
it down step by step.\n\nFirst, the dog digs 6 holes a day. He does this for 14 days while Nate is on
vacation. So when Nate comes back, there are a certain number of holes already dug. Let me calculate
that first. If the dog digs 6 holes each day for 14 days, that's 6 multiplied by 14. Let me do that
math: 6*14=84. So there are 84 holes when Nate gets home.\n\nNow, once Nate starts filling the holes,
he fills 9 holes per day. But here's the catch: the dog keeps digging 6 new holes every night. So
each day, Nate fills 9 holes, but the dog adds 6 more overnight. I need to figure out the net change
per day. That would be the number of holes filled minus the new holes dug. So 9 filled - 6 dug = 3
holes net reduction per day. Wait, but does this happen every day? Let me make sure.\n\nSo Nate works
during the day, filling 9 holes. Then, during the night, the dog digs 6 new holes. So each full day
(day and night) results in a net decrease of 3 holes. So starting from 84 holes, each day the total
number decreases by 3. But I need to check if the process is that Nate fills 9, then the dog digs 6,
so the next day starts with 84 -9 +6 = 84 -3 =81, and so on. So yes, each day the net is -3.\n\nBut
wait, maybe the timing is important. Let's think: when Nate starts filling, does he fill 9 holes on
the first day, and then the dog digs 6 that night? So each day after Nate fills, the dog adds 6. So
the sequence is: Day 1: Nate fills 9, holes become 84-9=75. Then night: dog digs 6, holes become
75+6=81. Day 2: Nate fills 9, holes 81-9=72. Night: +6=78. So each full day cycle (day and night)
reduces the total by 3. So yes, net 3 per day.\n\nBut wait, maybe on the last day, if Nate fills the
remaining holes and the dog doesn't get to dig that night because all holes are filled. Hmm, that's a
crucial point. For example, if on a certain day, the number of holes left is less than or equal to 9,
Nate can fill them all in that day, and the dog doesn't add any more because there are no holes left
to dig. So we need to check if on the day when Nate fills the remaining holes, the total would reach
zero, and the dog doesn't dig that night. So we can't just do total holes divided by 3, because the
last day might not have the dog digging.\n\nSo let's model this properly. Let's denote the number of
holes each day.\n\nStarting with 84 holes.\n\nEach day:\n\n1. Nate fills 9 holes. Subtract 9.\n\n2.
If the result is zero or negative, then done.\n\n3. Otherwise, the dog digs 6 holes. Add 6.\n\nSo we
need to iterate this until the number of holes is zero or less after Nate fills them.\n\nLet me try
to calculate day by day.\n\nStart: 84\n\nDay 1:\n\nAfter Nate fills: 84 -9 =75\n\nSince 75 >0, dog
digs 6: 75 +6=81\n\nDay 2:\n\nAfter Nate fills:81-9=72\n\nDog digs 6:72+6=78\n\nDay3:\n\n78-
9=69\n\n+6=75\n\nDay4:75-9=66+6=72\n\nDay5:72-9=63+6=69\n\nDay6:69-9=60+6=66\n\nDay7:66-
9=57+6=63\n\nDay8:63-9=54+6=60\n\nDay9:60-9=51+6=57\n\nDay10:57-9=48+6=54\n\nDay11:54-
9=45+6=51\n\nDay12:51-9=42+6=48\n\nDay13:48-9=39+6=45\n\nDay14:45-9=36+6=42\n\nDay15:42-
9=33+6=39\n\nDay16:39-9=30+6=36\n\nDay17:36-9=27+6=33\n\nDay18:33-9=24+6=30\n\nDay19:30-
9=21+6=27\n\nDay20:27-9=18+6=24\n\nDay21:24-9=15+6=21\n\nDay22:21-9=12+6=18\n\nDay23:18-
9=9+6=15\n\nDay24:15-9=6+6=12\n\nDay25:12-9=3+6=9\n\nDay26:9-9=0\n\nAt this point, after Nate fills
on day26, the holes are 0. Since there are no holes left, the dog can't dig any more that night. So
the process stops here.\n\nSo total days:26 days.\n\nBut the question asks for how many weeks. Since
a week is 7 days, we divide 26 by7.\n\n26 divided by7 is 3 with a remainder of5. So 3 weeks and5
days. But the question says \"how many weeks does it take him to fill in all the holes\". Depending
on whether they want it rounded up to full weeks or expressed as a fraction. Let me check the problem
statement again.\n\n\"How many weeks does it take him to fill in all the holes?\"\n\nTypically, such
problems expect the answer in whole weeks, rounded up if there's a remainder. Because you can't have
a fraction of a week in practical terms. So 26 days is 3 weeks and5 days, which is more than3 weeks
but less than4. So they might expect 4 weeks. But let me verify.\n\nAlternatively, maybe they want
the exact number in weeks as a fraction. 26 days is 26/7 weeks \u22483.714 weeks. But the problem
might expect an integer number of weeks, rounded up. Let's see.\n\nBut let's check the problem again.
It says \"how many weeks does it take him to fill in all the holes?\" So if it takes 26 days, which
is 3 weeks and5 days, then the answer would be 4 weeks if we're counting full weeks needed. Because
after 3 weeks (21 days), he still has 5 days left. So total weeks is4.\n\nBut maybe the answer is
26/7 weeks, which is approximately3.714, but since the problem might expect an exact answer in weeks,
possibly as a fraction. However, the problem might be designed to have a whole number. Wait, let's
check the math again.\n\nWait, when I did the day-by-day count, I got 26 days. Let me recount to make
sure I didn't make a mistake.\n\nStarting from day1 to day26. Let's check day26:\n\nStart with
day1:84\n\nAfter day1:81\n\n...\n\nLet me check day25:\n\nAfter day24:12\n\nDay25:12-9=3, then dog
digs6:3+6=9\n\nDay26:9-9=0. So day26 is the day when Nate fills the last 9 holes. So yes, 26
days.\n\n26 days divided by7 days/week=3.714... weeks. So the answer is 26/7 weeks, which is3 and5/7
weeks. But the problem might want it expressed as a mixed number or an exact number. However, the
question says \"how many weeks does it take him\", so likely they expect the answer in weeks,
possibly as a fraction. But let's see if there's a better way to model this without counting each
day.\n\nAlternative approach:\n\nTotal holes after vacation:84\n\nEach day, Nate fills9, dog digs6,
net -3 per day, except possibly the last day.\n\nBut if on the last day, Nate fills the remaining
holes without the dog digging, then we need to adjust.\n\nLet\u2019s model it as:\n\nLet H be the
number of holes. Start with H=84.\n\nEach day, H decreases by9, then if H>0, increases by6.\n\nWe
need to find the number of days until H<=0.\n\nThis can be modeled as:\n\nEach full cycle (day and
night) reduces H by3, except the last day where if H<=9, then H becomes0 and no addition.\n\nSo
the number of full cycles (days where H>9 after filling) plus the final day.\n\nLet\u2019s
denote:\n\nLet\u2019s say after n days, the holes would be:\n\nEach day, H = H -9 +6 = H -3, but only
if after subtracting9, H >=1 (since if H-9 <=0, then the dog doesn\u2019t dig). Wait, no. Wait,
the dog digs6 only if after Nate fills9, there are still holes left. Wait, no. Wait, the problem says
the dog keeps digging6 new holes every night. So regardless of whether there are holes left, the dog
digs6 every night. Wait, but if all holes are filled, can the dog dig new ones? The problem says
\"fill in all the holes\", so once all holes are filled, the dog can't dig new ones because there are
no existing holes? Wait, the problem might be implying that the dog digs new holes regardless, but
Nate is filling the existing ones. Wait, the problem says \"the dog keeps digging6 new holes every
night\". So perhaps every night, regardless of the current state, the dog digs6 new holes. But that
would mean that even if Nate fills all holes on a day, the dog digs6 new ones that night, making it
impossible to ever finish. But that can't be, because the problem asks how long to fill all holes,
which implies that eventually Nate can catch up.\n\nWait, this is a critical point. Let me re-examine
the problem statement:\n\n\"When Nate gets home, he starts filling in 9 holes a day, but the dog
keeps digging 6 new holes every night. How many weeks does it take him to fill in all the holes?
\"\n\nSo the key is: every night, the dog digs6 new holes. So even if Nate fills all the holes during
the day, the dog will dig6 new ones that night. But that would mean the process never ends. But the
problem must assume that once all holes are filled, the dog stops. But the wording is ambiguous.
However, logically, if all holes are filled, there's nothing for the dog to dig. So perhaps the dog
only digs6 holes if there's ground to dig, but the problem states \"keeps digging6 new holes every
night\", implying that regardless of the current state, the dog digs6 each night. This is a problem
because if Nate ever fills all holes, the next night the dog digs6, so Nate would have to fill those6
the next day, but then the dog digs6 again, leading to an infinite loop. But the problem must have a
solution, so likely the dog stops digging once all holes are filled. Therefore, the correct
interpretation is that each night, the dog digs6 new holes, but if there are no holes (i.e., all
filled), the dog can't dig. Therefore, the last day when Nate fills the remaining holes, the dog
doesn't dig that night.\n\nTherefore, the correct model is:\n\nEach day:\n\n1. Nate fills9
holes.\n\n2. If holes remain after filling, the dog digs6 that night.\n\nSo the process is:\n\nStart
with H=84.\n\nWhile H>0:\n\n H = H -9\n\n if H >0:\n\n H = H +6\n\n days +=1\n\nSo let's model
this with equations.\n\nLet\u2019s find the number of days required.\n\nThis is similar to the
classic problem where each day you have a net change, but with a different condition on the last
day.\n\nLet\u2019s denote that each \"cycle\" (day and night) reduces H by3, except possibly the last
day where H <=9, so after filling, H becomes0, and no addition.\n\nSo the total number of days is
the number of full cycles (each reducing H by3) plus the final day.\n\nBut how to calculate this?
\n\nLet\u2019s think of it as:\n\nLet\u2019s suppose that after n days, the number of holes is H_n
=84 -3n.\n\nBut this is only true if every day the net change is -3. However, this would be the case
until H_n -9 <=0. Wait, no. Wait, the H after filling is H -9. If H -9 <=0, then the dog
doesn't dig, so the total reduction is H (to 0). So we need to find the day when H -9 <=0.\n\nSo
let\u2019s solve for when H -9 <=0:\n\nH =84 -3(n-1) -9 <=0\n\nWait, this is getting
complicated. Let me use another approach.\n\nLet\u2019s consider that each day before the final day,
the net change is -3. On the final day, the net change is -9 (since the dog doesn\u2019t add
6).\n\nLet\u2019s let x be the number of full days (with net -3), and 1 final day (net -9).\n\nTotal
reduction: 3x +9 =84\n\nSo 3x =84 -9=75\n\nx=75/3=25\n\nSo x=25 days of net -3, and 1 final day of
-9.\n\nTotal days=25+1=26 days.\n\nYes, this matches the day-by-day count. Therefore, 26
days.\n\nConvert to weeks:26/7=3.714... weeks. But the problem asks \"how many weeks does it take
him\". If they want a whole number of weeks, rounded up, it's4 weeks. If they accept fractions,
it's26/7\u22483.714. But the problem might expect the answer in weeks as a mixed number, like3 and5/7
weeks. However, typically, such problems expect the answer in whole weeks, rounded up. Let me check
the problem statement again.\n\nThe problem says \"how many weeks does it take him to fill in all the
holes\". The answer should be in weeks, and since it's more than3 weeks but less than4, but you can't
have a fraction of a week in practical terms when scheduling, so they might expect4 weeks. However,
mathematically, it's26/7 weeks. Let me see if the problem expects an exact answer or a whole
number.\n\nLooking back at the problem statement, it's a math problem, so likely expects the exact
value, which is26/7 weeks, which can be written as3 5/7 weeks. However, sometimes problems like this
expect the answer in whole weeks, rounding up. Let's check both possibilities.\n\nIf the answer is26
days, which is3 weeks and5 days, then depending on the context:\n\n- If partial weeks count as a full
week, then4 weeks.\n\n- If exact, then3 5/7 weeks.\n\nBut the problem doesn't specify, so as a math
problem, the precise answer is26/7 weeks, which is3 and5/7 weeks. However, the question might want it
in weeks as a whole number, rounded up. Let's see.\n\nBut let's check the problem again. The problem
says \"how many weeks does it take him to fill in all the holes?\" The answer choices (if any) aren't
given, but since the user hasn't provided options, we need to infer. In math problems like this,
unless specified otherwise, the answer is expected in the most precise form, which would be26/7 weeks
or3 5/7 weeks. However, sometimes problems expect whole numbers, so maybe the answer is4 weeks. But
let's verify with the day count.\n\n26 days is exactly3 weeks and5 days. If the question asks for the
number of weeks, and if part of a week counts as a full week, then it's4. But if they want the exact
number, it's3 5/7. However, the problem might have an integer answer. Wait, let's check the
calculations again.\n\nWait, when I did the day-by-day count, I got26 days. Let me verify that again
quickly.\n\nStart with84.\n\nEach day:\n\n1. Subtract9.\n\n2. If result >0, add6.\n\nRepeat until
H<=0.\n\nLet\u2019s model this with a formula.\n\nLet\u2019s denote that each day, except the
last, the net change is -3. The last day is -9.\n\nSo total holes =84 =3x +9\n\nSo3x=75 \u2192x=25
days of net -3, plus1 day of -9. Total26 days.\n\nYes, that's correct. So26 days.\n\n26 days divided
by7 days/week=3.714... weeks.\n\nBut the problem might want the answer as a mixed number, so3 and5/7
weeks. However, in some contexts, especially in school problems, they might expect the answer in
whole weeks, rounded up, so4 weeks. But since the problem is from Nate's perspective, he would count
the exact days, so converting26 days to weeks gives3 weeks and5 days. But the question asks for
weeks, so unless specified, it's better to present it as26/7 weeks, which is3 5/7 weeks. However,
let's check if there's a different approach that gives a whole number.\n\nAlternatively, maybe I made
a mistake in the day count. Let me recount the days more carefully.\n\nStarting with84
holes.\n\nDay1:\n\nFill9:84-9=75. Since75>0, dog digs6:75+6=81.\n\nDay2:\n\n81-9=72>0
\u2192+6=78.\n\nDay3:78-9=69>0 \u2192+6=75.\n\nDay4:75-9=66>0 \u2192+6=72.\n\nDay5:72-9=63>0
\u2192+6=69.\n\nDay6:69-9=60>0 \u2192+6=66.\n\nDay7:66-9=57>0 \u2192+6=63.\n\nDay8:63-9=54>0
\u2192+6=60.\n\nDay9:60-9=51>0 \u2192+6=57.\n\nDay10:57-9=48>0 \u2192+6=54.\n\nDay11:54-
9=45>0 \u2192+6=51.\n\nDay12:51-9=42>0 \u2192+6=48.\n\nDay13:48-9=39>0
\u2192+6=45.\n\nDay14:45-9=36>0 \u2192+6=42.\n\nDay15:42-9=33>0 \u2192+6=39.\n\nDay16:39-
9=30>0 \u2192+6=36.\n\nDay17:36-9="
左右滑动查看完整示意
本篇作者
Shreyas Subramanian
首席数据科学家,通过使用生成式AI和深度学习,来帮助客户利用亚马逊云科技服务解决业务挑战。Shreyas在大规模优化和机器学习领域拥有丰富经验,并擅长使用机器学习和强化学习加速优化任务。
Zhengyuan Shen
Amazon Bedrock应用科学家,专注于自然语言理解和结构化数据理解等复杂任务的基础模型和机器学习建模,热衷于利用创新的机器学习解决方案来优化产品或服务,通过科学与工程的无缝结合为客户带来便捷。
Xuan Qi
Amazon Bedrock应用科学家,她运用其物理学背景,解决人工智能与机器学习领域的复杂挑战,热衷于将科学概念转化为实际应用,推动技术的实质性进步。她的工作聚焦于创建更加直观更加高效的AI系统,使其能够更好地理解世界并与世界互动。
Shuai Wang
Amazon Bedrock高级应用科学家兼经理,专注于自然语言处理、机器学习、大语言建模以及其他相关人工智能领域。
星标不迷路,开发更极速!
关注后记得星标「亚马逊云开发者」
听说,点完下面4个按钮
就不会碰到bug了!
点击阅读原文查看博客!获得更详细内容!