Llama/Qwen/DeepSeek Open-Source Showdown: CLiB Open-Source LLM Leaderboard, Education Domain (03.05)

For the overall capability ranking of open-source models, see: Llama/Qwen/DeepSeek Open-Source Showdown: CLiB Open-Source LLM Leaderboard (03.04).

[WeChat Official Account] 大模型评测EasyLLM. Discussion is welcome!

The education-domain leaderboard follows.

Output prices are in CNY per million output tokens (元/M tok).

Rank	Model	Organization	Output price (CNY/M tokens)	Education score
1	DeepSeek-R1	DeepSeek	16	94.32
2	DeepSeek-R1-Distill-Qwen-32B	DeepSeek	1.3	88.79
3	qwq-32b-preview	Alibaba	7	87.41
4	qwen2.5-32b-instruct	Alibaba	7	86.77
5	qwen2.5-72b-instruct	Alibaba	12	85.45
6	qwen2.5-14b-instruct	Alibaba	6	85.03
7	DeepSeek-R1-Distill-Qwen-14B	DeepSeek	0.7	83.73
8	deepseek-chat-v3	DeepSeek	8	82.92
9	glm-4-9b-chat	Zhipu AI	0.6	81.27
10	qwen2.5-7b-instruct	Alibaba	2	80.49
11	Yi-1.5-34B-Chat	01.AI	1.3	79.54
12	DeepSeek-R1-Distill-Llama-70B	DeepSeek	4.1	79.4
13	internlm2_5-20b-chat	Shanghai AI Laboratory	1	78.72
14	internlm2_5-7b-chat	Shanghai AI Laboratory	0.4	72.9
15	qwen2.5-math-72b-instruct	Alibaba	12	71.67
16	Llama-3.3-70B-Instruct	Meta	4.1	70.21
17	Hermes-3-Llama-3.1-405B	NousResearch	5.8	70.13
18	Yi-1.5-9B-Chat	01.AI	0.4	69.83
19	Meta-Llama-3.1-405B-Instruct	Meta	21	69.09
20	Llama-3.3-70B-Instruct-fp8	Meta	2.2	68.28
21	qwen2.5-3b-instruct	Alibaba	0	67.65
22	Llama-3.1-Nemotron-70B-Instruct-fp8	NVIDIA	2.2	67.15
23	phi-4	Microsoft	1	66.73
24	gemma-2-27b-it	Google	1.3	63.71
25	qwen2.5-1.5b-instruct	Alibaba	0	62.83
26	DeepSeek-R1-Distill-Qwen-7B	DeepSeek	0.4	62.63
27	gemma-2-9b-it	Google	0.6	62.27
28	DeepSeek-R1-Distill-Llama-8B	DeepSeek	0.4	58.35
29	Mistral-Nemo-Instruct-2407	Mistral	0.6	56.63
30	Llama-3.1-8B-Instruct	Meta	0.4	53.31
31	Meta-Llama-3.1-8B-Instruct-fp8	Meta	0.4	51.21
32	Mistral-7B-Instruct-v0.3	Mistral	0.4	47.19
33	DeepSeek-R1-Distill-Qwen-1.5B	DeepSeek	0.1	46.63
34	qwen2.5-0.5b-instruct	Alibaba	0	46.39
35	Llama-3.2-3B-Instruct	Meta	0.2	43.3
36	Llama-3.2-1B-Instruct	Meta	0.2	39.07
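
Since the price column is quoted per million output tokens, it can help to convert it into the cost of a concrete workload. The snippet below is a minimal sketch, not part of the CLiB tooling: the helper names `parse_rows` and `output_cost` and the 250,000-token workload are illustrative assumptions, and the four rows are copied from the table above.

```python
# Sketch only: helper names and the 250k-token workload are assumptions
# made for illustration; the rows are copied from the leaderboard above.
# Columns: model <TAB> output price (CNY per million tokens) <TAB> education score
ROWS = """\
DeepSeek-R1\t16\t94.32
DeepSeek-R1-Distill-Qwen-32B\t1.3\t88.79
qwq-32b-preview\t7\t87.41
glm-4-9b-chat\t0.6\t81.27
"""


def parse_rows(text):
    """Parse 'model<TAB>price<TAB>score' lines into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        model, price, score = line.split("\t")
        rows.append({"model": model, "price": float(price), "score": float(score)})
    return rows


def output_cost(price_per_m_tokens, n_tokens):
    """Cost in CNY of generating n_tokens output tokens."""
    return price_per_m_tokens * n_tokens / 1_000_000


rows = parse_rows(ROWS)
for row in rows:
    cost = output_cost(row["price"], 250_000)
    print(f'{row["model"]}: {cost:.3f} CNY for 250k output tokens, score {row["score"]}')
```

The same two helpers work for any other row of the table. Note that reasoning models may emit many more output tokens per answer than instruction-tuned models, so a per-token price alone does not determine the cost per question.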

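The education domain currently covers four dimensions: the gaokao (China's national college entrance exam), high-school subjects, middle-school subjects, and primary-school subjects.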

Full evaluation results: https://github.com/jeinlee1991/chinese-llm-benchmark

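Previous articles

Education | Hands-on tests of 110 LLMs across 3 school stages (primary through high school) and 9 subjects

Healthcare | Hands-on tests of 110 LLMs across 12 categories and 18 subjects

Llama/Qwen/DeepSeek Open-Source Showdown: CLiB Open-Source LLM Leaderboard (03.04)

How good are the free LLM APIs, really? CLiB LLM Leaderboard

On-device LLMs under 5B parameters (03.13): CLiB LLM Leaderboard

DeepSeek | Where exactly does it excel? This evaluation makes it clear at a glance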

About 大模型评测EasyLLM

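Most comprehensive: the world's most comprehensive evaluation platform for LLM products, already covering 203 models
Most up to date: monthly evaluations of each model's capabilities, published as leaderboards
Most convenient: no registration or VPN required; models from China and abroad can be evaluated with one click
Transparent results: the methods, question sets, process, and scores of every evaluation are visible and traceable
Error log: a collection of LLM wrong answers at the million-entry scale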

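大模型评测EasyLLM currently covers 203 LLMs. These include commercial models such as chatgpt, gpt-4o, o3-mini, Google gemini, Claude3.5, Zhipu GLM-Zero, ERNIE Bot (文心一言), qwen-max, Baichuan, iFLYTEK Spark, SenseTime senseChat, and minimax, as well as open-source models such as DeepSeek-R1, deepseek-v3, qwen2.5, llama3.3, phi-4, glm4, and InternLM2.5. It provides not only capability-score leaderboards but also the raw outputs of every model.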

Full evaluation question sets and results: https://github.com/jeinlee1991/chinese-llm-benchmark
