Analyze OnTime datasets with Databend on AWS EC2 and S3

1.基准环境

ec2 size:c5a.4xlarge

ec2 region:<您的 s3 存储区域>

本地磁盘300G,本地磁盘仅用于ontime save和databend complie。

操作系统:ubuntu 20 x64

准备安装包:

$sudo apt-get install unzip make mysql-client-core-8.0

2.部署Databend

2.1 编译Databend

$git clone https://github.com/datafuselabs/databend.git

$cd databend

$make setup

$export PATH=$PATH:~/.cargo/bin

$make build-native

最后,databend 相关的二进制文件位于

./target/release/{databend-meta, databend-query}

2.2 启动Databend

# Please replace the s3 env config with your own. 
export STORAGE_TYPE=s3
export S3_STORAGE_BUCKET=<your-s3-bucket>
export S3_STORAGE_REGION=<your-s3-region>
export S3_STORAGE_ENDPOINT_URL=<your-bucket>.s3.amazonaws.com
export S3_STORAGE_ACCESS_KEY_ID=<your-s3-key-id>
export S3_STORAGE_SECRET_ACCESS_KEY=<your-s3-access-key>

echo "Starting standalone DatabendQuery(release)"
./scripts/ci/deploy/databend-query-standalone.sh release

2.3 测试 Databend

mysql -h 127.0.0.1 -P3307 -uroot

检查连接是否正常。

3. 加载 Ontime

3.1 创建 Ontime 表

wget --no-check-certificate https://repo.databend.rs/ontime/create_table.sql
cat create_table.sql |mysql -h 127.0.0.1 -P3307 -uroot

3.2 加载数据

wget --no-check-certificate https://repo.databend.rs/t_ontime/t_ontime.csv.zip

unzip t_ontime.csv.zip

ls *.csv|xargs -I{} echo  curl -H \"insert_sql:insert into ontime format CSV\" -H \"csv_header:0\" -H \"field_delimitor:'\t'\"  -F  \"upload=@{}\"  -XPUT http://localhost:8001/v1/streaming_load |bash

4. 查询

执行查询

mysql -h 127.0.0.1 -P3307 -uroot 
mysql>set parallel_read_threads=4;
mysql>select count(*) from ontime;
mysql>select Year, count(*) from ontime group by Year;

基准查询要点

NumberQuery
Q1SELECT DayOfWeek, count() AS c FROM ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY DayOfWeek ORDER BY c DESC;
Q2SELECT DayOfWeek, count() *AS c FROM ontime WHERE DepDelay>10 AND Year >= 2000 AND Year <= 2008 GROUP BY DayOfWeek ORDER BY c DESC;
Q3SELECT Origin, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year >= 2000 AND Year <= 2008 GROUP BY Origin ORDER BY c DESC LIMIT 10;
Q4SELECT IATACODE_Reporting_Airline AS Carrier, count() FROM ontime WHERE DepDelay>10 AND Year = 2007 GROUP BY Carrier ORDER BY count() DESC;
Q5SELECT IATA_CODE_Reporting_Airline AS Carrier, avg(cast(DepDelay>10 as Int8))*1000 AS c3 FROM ontime WHERE Year=2007 GROUP BY Carrier ORDER BY c3 DESC;
Q6SELECT IATA_CODE_Reporting_Airline AS Carrier, avg(cast(DepDelay>10 as Int8))*1000 AS c3 FROM ontime WHERE Year>=2000 AND Year <=2008 GROUP BY Carrier ORDER BY c3 DESC;
Q7SELECT IATA_CODE_Reporting_Airline AS Carrier, avg(DepDelay) * 1000 AS c3 FROM ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY Carrier;
Q8SELECT Year, avg(DepDelay) FROM ontime GROUP BY Year;
Q9SELECT Year, count(*) as c1 FROM ontime group by Year;
Q10SELECT avg(cnt) FROM (SELECT Year,Month,count(*) AS cnt FROM ontime WHERE DepDel15=1 GROUP BY Year,Month) a;
Q11SELECT avg(c1) FROM (select Year,Month,count(*) as c1 from ontime group by Year,Month) a;
Q12SELECT OriginCityName, DestCityName, count(*) AS c FROM ontime GROUP BY OriginCityName, DestCityName ORDER BY c DESC LIMIT 10;
Q13SELECT OriginCityName, count(*) AS c FROM ontime GROUP BY OriginCityName ORDER BY c DESC LIMIT 10;
Q14SELECT count(*) FROM ontime;

如果想了解更多,可以关注公众号: Databend

基于数据驱动的 Koopman 算子的递归神经网络模型线性化,用于纳米定位系统的预测控制研究(Matlab代码实现)内容概要:本文围绕“基于数据驱动的Koopman算子的递归神经网络模型线性化”展开,旨在研究纳米定位系统的预测控制问题,并提供完整的Matlab代码实现。文章结合数据驱动方法与Koopman算子理论,利用递归神经网络(RNN)对非线性系统进行建模与线性化处理,从而提升纳米级定位系统的精度与动态响应性能。该方法通过提取系统隐含动态特征,构建近似线性模型,便于后续模型预测控制(MPC)的设计与优化,适用于高精度自动化控制场景。文中还展示了相关实验验证与仿真结果,证明了该方法的有效性和先进性。; 适合人群:具备一定控制理论基础和Matlab编程能力,从事精密控制、智能制造、自动化或相关领域研究的研究生、科研人员及工程技术人员。; 使用场景及目标:①应用于纳米级精密定位系统(如原子力显微镜、半导体制造设备)中的高性能控制设计;②为非线性系统建模与线性化提供一种结合深度学习与现代控制理论的新思路;③帮助读者掌握Koopman算子、RNN建模与模型预测控制的综合应用。; 阅读建议:建议读者结合提供的Matlab代码逐段理解算法实现流程,重点关注数据预处理、RNN结构设计、Koopman观测矩阵构建及MPC控制器集成等关键环节,并可通过更换实际系统数据进行迁移验证,深化对方法泛化能力的理解。
评论
成就一亿技术人!
拼手气红包6.0元
还能输入1000个字符
 
红包 添加红包
表情包 插入表情
 条评论被折叠 查看
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值