
Author: ClickHouse Team
First published on the WeChat official account ClickHouseInc

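It's time for another release!
Release summary
ClickHouse 24.10 contains 25 new features 🎁, 15 performance optimizations 🛷, and 60 bug fixes 🐛.
In this release, clickhouse-local becomes more useful with new copy and calculator modes. Refreshable materialized views are now production-ready, remote files can be cached, and table cloning has been simplified.
New contributors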
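As always, we warmly welcome all the new contributors in 24.10! ClickHouse's popularity owes a great deal to the community's contributions, and it is always encouraging to see that community grow.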
Here are the names of the new contributors:
Alsu Giliazova, Baitur, Baitur Ulukbekov, Damian Kula, DamianMaslanka5, Daniil Gentili, David Tsukernik, Dergousov Maxim, Diana Carroll, Faizan Patel, Hung Duong, Jiří Kozlovský, Kaushik Iska, Konstantin Vedernikov, Lionel Palacin, Mariia Khristenko, Miki Matsumoto, Oleg Galizin, Patrick Druley, SayeedKhan21, Sharath K S, Shichao, Shichao Jin, Vladimir Cherkasov, Vladimir Valerianov, Z.H., aleksey, alsu, johnnyfish, kurikuQwQ, kylhuk, lwz9103, scanhex12, sharathks118, singhksandeep25, sum12, tuanpach
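Slides from the release video can be downloaded at https://presentations.clickhouse.com/release_24.10/
clickhouse-local simplifies file conversion
Contributed by Denis Hananein
ClickHouse comes in several forms. One of them is clickhouse-local, which lets you process local and remote files with SQL without installing a database server.
[Figure: the different forms of ClickHouse]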

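clickhouse-local now has a new flag, --copy, which is a shortcut for SELECT * FROM table. This makes converting between data formats a one-line command.
We'll download a CSV file containing one million people records from the datablist/sample-csv-files GitHub repository (https://github.com/datablist/sample-csv-files?tab=readme-ov-file), and then run the following command to produce a Parquet version of that file: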
clickhouse -t --copy < people-1000000.csv > people.parquet
We can inspect the contents of the new file with the following query:
clickhouse -f PrettyCompact \
"SELECT \"Job Title\", count()
FROM 'people.parquet'
GROUP BY ALL
ORDER BY count() DESC
LIMIT 10"
    ┌─Job Title───────────────────────────────────┬─count()─┐
 1. │ Dealer                                      │    1684 │
 2. │ IT consultant                               │    1678 │
 3. │ Designer, ceramics/pottery                  │    1676 │
 4. │ Pathologist                                 │    1673 │
 5. │ Pharmacist, community                       │    1672 │
 6. │ Biochemist, clinical                        │    1669 │
 7. │ Chief Strategy Officer                      │    1663 │
 8. │ Armed forces training and education officer │    1661 │
 9. │ Archaeologist                               │    1657 │
10. │ Education officer, environmental            │    1657 │
    └─────────────────────────────────────────────┴─────────┘
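For readers who want to cross-check a result like this without ClickHouse, the same aggregation (count rows per "Job Title", order by count descending, keep the top 10) can be sketched in plain Python directly against the CSV. This is only an illustrative sketch: the helper name top_values and the tiny inline sample are assumptions made for the example, not part of ClickHouse or of the real people-1000000.csv data.

```python
import csv
import io
from collections import Counter


def top_values(csv_text: str, column: str, limit: int = 10) -> list[tuple[str, int]]:
    """Count how often each value of `column` occurs and return the `limit` most frequent."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row[column] for row in reader)
    # most_common() sorts by count in descending order,
    # roughly what ORDER BY count() DESC LIMIT n does in the query above.
    return counts.most_common(limit)


# Hypothetical sample rows, only to keep the sketch self-contained.
sample = """Index,Job Title
1,Dealer
2,Pathologist
3,Dealer
4,"Designer, ceramics/pottery"
5,Dealer
6,Pathologist
"""

for title, n in top_values(sample, "Job Title"):
    print(f"{title}\t{n}")
```

For the real file, the text could be read with open("people-1000000.csv", newline="", encoding="utf-8") and passed to the same function. How rows with equal counts are ordered may differ from ClickHouse.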
To convert the data from CSV to JSON Lines instead, we can run:
clickhouse -t --copy < people-1000000.csv > people.jsonl
Next, let's look at the contents of the new file:
head -n3 people.jsonl
{"Index":"1","User Id":"9E39Bfc4fdcc44e","First Name":"Diamond","Last Name":"Dudley","Sex":"Male","Email":"teresa26@example.org","Phone":"922.460.8218x66252","Date of birth":"1970-01-01","Job Title":"Photographer"}
{"Index":"2","User Id":"32C079F2Bad7e6F","First Name":"Ethan","Last Name":"Hanson","Sex":"Female","Email":"ufrank@example.com","Phone":"(458)005-8931x2478","Date of birth":"1985-03-08","Job Title":"Actuary"}
{"Index":"3","User Id":"a1F7faeBf5f7A3a","First Name":"Grace","Last Name":"Huerta","Sex":"Female","Email":"tiffany51@example.com","Phone":"(720)205-4521x7811","Date of birth":"1970-01-01","Job Title":"Visual merchandiser"}
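To make the shape of this conversion concrete, here is a minimal standard-library Python sketch of a CSV to JSON Lines transformation: one JSON object per input row, keyed by the header names. It is not how clickhouse-local implements --copy. The function name csv_to_jsonl and the sample rows are assumptions for illustration, and the sketch keeps every field as a string with no type inference, so its output will not necessarily match ClickHouse byte for byte.

```python
import csv
import io
import json


def csv_to_jsonl(csv_text: str) -> str:
    """Turn CSV text with a header row into JSON Lines: one object per data row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Compact separators give output without spaces after ',' and ':'.
    lines = [
        json.dumps(row, separators=(",", ":"), ensure_ascii=False)
        for row in reader
    ]
    return "\n".join(lines) + "\n" if lines else ""


# Hypothetical three-column sample, a cut-down imitation of the people file.
sample = """Index,First Name,Job Title
1,Diamond,Photographer
2,Ethan,Actuary
"""

print(csv_to_jsonl(sample), end="")
```

Because each output line is a complete JSON document, the result can be inspected with line-oriented tools such as head, as done above.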
