CCQ: Convolutional Code for Extreme Low-bit Quantization in LLMs

Summary of the Paper's Main Content

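To address the high inference cost and large memory footprint of deploying large language models (LLMs), this paper proposes an extreme low-bit quantization method called Convolutional Code Quantization (CCQ). The method aims to compress LLMs to 2.0–2.75 bits while minimizing accuracy loss, and it targets two bottlenecks of existing approaches: the poor accuracy of scalar quantization and the low inference efficiency of vector quantization.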
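To get a feel for what 2.0–2.75 bits per weight means in practice, the short calculation below estimates weight storage for a 671B-parameter model (the DeepSeek-V3 size cited in this summary). It is only a back-of-the-envelope sketch: the function name `weight_memory_gb` is hypothetical, the 16-bit baseline is an assumption (the summary does not state the original precision), and the estimate ignores scale factors and any layers left unquantized.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Storage for the weights alone, in gigabytes (1 GB = 1e9 bytes).

    Ignores scale factors, activations, and any unquantized layers.
    """
    return num_params * bits_per_weight / 8 / 1e9


# 16 bits is an assumed baseline; 2.75 and 2.0 are the range quoted above.
for bits in (16, 4, 2.75, 2.0):
    print(f"{bits:>5} bits/weight: {weight_memory_gb(671e9, bits):8.1f} GB")
```

Under these assumptions, 2-bit storage comes to roughly 168 GB, one eighth of the assumed 16-bit baseline.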

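  • Background: LLMs keep growing (for example, DeepSeek-V3 has 671B parameters and ERNIE 4.5 has 300B/424B parameters), which leads to high inference latency and heavy cross-device communication overhead. Among existing quantization methods, 8-bit and 4-bit quantization alleviate the problem, but below 3 bits accuracy, scalability, and efficiency all degrade severely. Scalar quantization loses accuracy sharply at 2 bits because its numeric space is so limited, while vector quantization suffers from irregular memory access caused by codebook lookups, which makes inference inefficient.
  • Core method: CCQ combines convolutional code theory with hybrid encoding and code cluster algorithms to build a lookup-free encoding space. It establishes a linear mapping between the codebook and the weight vectors, and it replaces costly index operations with bit-shift decoding to improve inference performance. Specifically:
    • The codebook is built from convolutional codes, and bit-shift decoding enables efficient inference;
    • Hybrid encoding flexibly combines different convolutional code configurations to meet storage requirements and raise the compression ratio;
    • Code clusters further compress the encoding space through uniform quantization, achieving 2-bit quantization;
    • The computation and storage of the scale factors are optimized, reducing additional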
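The shift-and-mask idea behind the bit-shift decoding mentioned in the list above can be sketched in a few lines. This is a generic sliding-window decoder written for illustration, not the paper's kernel: the function names, the parameters `window_bits` and `step_bits`, and the affine code-to-value map are all assumptions introduced here. Each code is a `window_bits`-wide window over a packed bitstream, and consecutive windows are offset by `step_bits`, so neighbouring codes overlap and only `step_bits` new bits are stored per weight.

```python
def window_codes(packed: int, num_weights: int,
                 window_bits: int, step_bits: int) -> list[int]:
    """Read overlapping codes from a packed bitstream with shift and mask."""
    mask = (1 << window_bits) - 1
    # Window i starts i * step_bits bits above the least significant bit.
    return [(packed >> (i * step_bits)) & mask for i in range(num_weights)]


def decode_bitshift(packed: int, num_weights: int, window_bits: int,
                    step_bits: int, scale: float) -> list[float]:
    """Map each code to a value arithmetically, with no table lookup.

    The affine map below (centre the code range on zero, then multiply by
    a scale) is a hypothetical stand-in for a linear code-to-weight mapping.
    """
    center = ((1 << window_bits) - 1) / 2.0
    codes = window_codes(packed, num_weights, window_bits, step_bits)
    return [(code - center) * scale for code in codes]


# Three weights stored in 8 bits: 4-bit windows advancing 2 bits at a time.
print(window_codes(0b11010110, 3, 4, 2))          # [6, 5, 13]
print(decode_bitshift(0b11010110, 3, 4, 2, 1.0))  # [-1.5, -2.5, 5.5]
```

In this toy example the stream costs `window_bits + (num_weights - 1) * step_bits` bits in total, which approaches `step_bits` per weight for long streams, while each code still ranges over `2 ** window_bits` values. Because decoding is a shift, a mask, and one multiply, it avoids the irregular memory access that a codebook lookup would introduce.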