【Reading】2014-01, 02

These notes survey the architectures of Hadoop and HBase, including progress on HDFS high availability, how HBase's master/slave design differs from HDFS's, the principles behind HBase's scalability, and a performance comparison of MapReduce against parallel database management systems for large-scale data analysis.

http://blog.cloudera.com/blog/2012/01/an-update-on-apache-hadoop-1-0/

This Cloudera Blog post describes the branching strategy that Apache projects normally follow, and reviews how Hadoop's failure to adhere to it strictly led to a confusing version history.


http://blog.cloudera.com/blog/2009/07/file-appends-in-hdfs/

This 2009 article covers the early history of the HDFS append feature. The author is Tom White, of "Hadoop: The Definitive Guide" fame.


http://hadoop.apache.org/docs/current2/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithNFS.html

http://blog.cloudera.com/blog/2012/03/high-availability-for-the-hadoop-distributed-file-system-hdfs/

Hadoop 2.0 already supports HDFS HA; see the first link. The second article, from 2012, describes the HA development details and progress in more depth. At the time, automatic failover had not yet been implemented; it is available now.


http://blog.cloudera.com/blog/2013/04/how-scaling-really-works-in-apache-hbase/

This article explains HBase's master/slave division of labor, which differs from HDFS's master/slave design:

  • The HDFS master (NameNode) manages and serves two kinds of metadata: the namespace and block locations (the block-to-DataNode mapping). Consequently, HDFS file create/delete and read/write operations must all consult the master first.
  • The HBase master manages only table metadata; the region-to-RegionServer mapping lives in the system table META, which clients read directly. Consequently, table create/delete operations go through the master, while put/get operations bypass it entirely: the client contacts the relevant region server directly.
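To make the contrast concrete, here is a toy sketch in Python (not the real HBase client API; the META contents and server names are made up) of the client-side lookup that lets put/get skip the master:

```python
# Toy sketch (not HBase's actual client API) of why HBase reads/writes skip
# the master: clients resolve region locations from the META table and talk
# to region servers directly; the master only handles DDL such as table
# create/delete.

META = {  # hypothetical region map: (table, region start key) -> region server
    ("users", ""): "rs1.example.com",   # rows with key < "m"
    ("users", "m"): "rs2.example.com",  # rows with key >= "m"
}

def locate_region(table, row_key):
    """Pick the region whose start key is the largest one <= row_key."""
    starts = [k for k in META if k[0] == table and k[1] <= row_key]
    return META[max(starts, key=lambda k: k[1])]

# A get for row "alice" goes straight to rs1; a get for "zoe" goes to rs2.
# No call to the master appears anywhere on the read/write path.
```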

http://share.youkuaiyun.com/slides/1227

The highlight of this slide deck is a concrete example showing what a fact table and a dimension table are, two basic concepts in data warehousing.
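As a minimal illustration (hypothetical data, not taken from the slides): a fact table holds measures plus foreign keys, dimension tables hold the descriptive attributes those keys point at, and a typical query aggregates a measure by a dimension attribute:

```python
# Star-schema sketch with made-up data: fact rows reference dimensions by key.

dim_product = {1: {"name": "laptop", "category": "electronics"},
               2: {"name": "desk",   "category": "furniture"}}
dim_date    = {20140101: {"year": 2014, "month": 1}}

fact_sales = [  # one row per sale: dimension keys plus a measure
    {"product_id": 1, "date_id": 20140101, "amount": 999.0},
    {"product_id": 2, "date_id": 20140101, "amount": 250.0},
]

def sales_by_category(facts):
    """Aggregate a fact-table measure by a dimension attribute."""
    totals = {}
    for row in facts:
        cat = dim_product[row["product_id"]]["category"]
        totals[cat] = totals.get(cat, 0.0) + row["amount"]
    return totals
```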


<A Comparison of Approaches to Large-Scale Data Analysis>

<The Performance of MapReduce: An In-depth Study>
These two papers compare the performance of Hadoop MR against parallel DBMSs; on several analytical benchmark tasks, Hadoop turned out to be 3-6x slower than the parallel DBMSs. They discuss how the following MR design features cost job runtime performance:

  • storage independence: streaming IO. MR was designed as a standalone processing system, not bound to any particular storage system. In Hadoop, MR can run on HDFS but also on other storage such as the local filesystem or a database. As a result, MR can only read its input via streaming IO, meaning the task process (tasktracker) reads input from another process (the datanode) rather than reading the file directly from disk (direct IO). With direct IO, the executing process could use DMA to read data asynchronously into its own memory space.
  • fault tolerance: intermediate files. If a reduce task fails, the MR system does not need to re-run the whole job; it simply reschedules the failed reduce task on another node. This works because each map output is written to an intermediate file, which the reducer fetches to its local node. This is a pull model, in contrast to the push model of parallel DBMSs, which push intermediate results directly to the next operator.
  • elastic scalability: runtime scheduling. One direct reason MR jobs start slowly is that a job is split into many tasks, and which node runs each task is decided dynamically: the master waits for slaves to ask for work rather than assigning it up front. Slaves send heartbeats to the master only at fixed intervals, and if several slaves request work at once, the master can only hand out tasks one at a time. The benefit of this design is elasticity: nodes can be added or removed at any time without disrupting a running job.
  • schema-on-read: the files MR processes can be in any format with any schema, in contrast to a DBMS's schema-on-write. The flip side is that MR must parse the files at runtime and convert their contents into the corresponding key/value pairs, which can add considerable runtime overhead.
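The schema-on-read point can be sketched as follows. The record format and field names here are hypothetical; the point is only that parsing and type conversion happen at read time, inside the map phase, rather than once at load time as in a schema-on-write DBMS:

```python
# Schema-on-read sketch (made-up record format): the "schema" is applied
# while reading, by parsing each raw text line into typed fields. A
# schema-on-write DBMS pays this cost once at load time instead.

def parse_record(line):
    """Turn one raw text line into a (key, value) pair for the mapper."""
    user_id, page, dwell_ms = line.rstrip("\n").split(",")  # runtime parsing
    return user_id, (page, int(dwell_ms))                   # runtime casting

def map_visits(lines):
    """A toy map phase: emit one key/value pair per input line."""
    return [parse_record(line) for line in lines]

raw = ["u1,/home,120", "u2,/search,340"]
```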





[root@master cleaned]# cd
[root@master ~]# hdfs dfs -ls /user/hive/warehouse/result_*/
Found 1 items
-rw-r--r--   1 root supergroup        447 2025-12-29 09:27 /user/hive/warehouse/result_financing_survival_corr/000000_0
Found 1 items
-rw-r--r--   1 root supergroup         23 2025-12-29 11:30 /user/hive/warehouse/result_founder_invest_analysis/000000_0
Found 1 items
-rw-r--r--   1 root supergroup        939 2025-12-29 11:28 /user/hive/warehouse/result_longest_survivor_per_industry/000000_0
Found 1 items
-rw-r--r--   1 root supergroup        187 2025-12-29 09:21 /user/hive/warehouse/result_policy_death_region/000000_0
Found 1 items
-rw-r--r--   1 root supergroup        559 2025-12-29 09:16 /user/hive/warehouse/result_top_funded_company_rank/000000_0
Found 1 items
-rw-r--r--   1 root supergroup       1388 2025-12-29 11:32 /user/hive/warehouse/result_top_investor_avg_survival/000000_0
[root@master ~]# hdfs dfs -cat /user/hive/warehouse/result_longest_survivor_per_industry/000000_0 | head -3
10652016-07-011
12122012-04-011
12472016-01-011
[root@master ~]# hdfs dfs -cat /user/hive/warehouse/result_top_investor_avg_survival/000000_0 | head -3
北京2238365.002.238E9
广东1095365.001.095E9
上海1034365.001.034E9
[root@master ~]# mysql -u root -p181750Qy.
-e "
> CREATE DATABASE IF NOT EXISTS bigdata_analysis DEFAULT CHARSET=utf8mb4;
>
> USE bigdata_analysis;
>
> -- Table 1: longest survivor per industry (3 columns)
> CREATE TABLE IF NOT EXISTS result_longest_survivor_per_industry (
>   industry VARCHAR(50),
>   company_name VARCHAR(100),
>   max_live_days INT
> ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
>
> -- Table 2: founder investment analysis (3 columns)
> CREATE TABLE IF NOT EXISTS result_founder_invest_analysis (
>   investor VARCHAR(100),
>   founder_company_count INT,
>   avg_survival_days DECIMAL(10,2)
> ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
>
> -- Table 3: top investor performance (4 columns)
> CREATE TABLE IF NOT EXISTS result_top_investor_avg_survival (
>   top_investor VARCHAR(100),
>   invested_company_count INT,
>   avg_survival_days DECIMAL(10,2),
>   total_investment DOUBLE
> ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
> "
[root@master ~]# mkdir -p /usr/local/datax/job
[root@master ~]# vim /usr/local/datax/job/job_longest_survivor.json
[root@master ~]# python /usr/local/datax/bin/datax.py /usr/local/datax/job/job_longest_survivor.json

DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
2025-12-29 11:33:50.834 [main] INFO  MessageSource - JVM TimeZone: GMT+08:00, Locale: zh_CN
2025-12-29 11:33:50.835 [main] INFO  MessageSource - use Locale: zh_CN timeZone: sun.util.calendar.ZoneInfo[id="GMT+08:00",offset=28800000,dstSavings=0,useDaylight=false,transitions=0,lastRule=null]
2025-12-29 11:33:50.849 [main] INFO  VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
2025-12-29 11:33:50.854 [main] INFO  Engine - the machine info =>
    osInfo:  Oracle Corporation 1.8 25.121-b13
    jvmInfo: Linux amd64 3.10.0-1160.119.1.el7.x86_64
    cpu num: 4
    totalPhysicalMemory:            -0.00G
    freePhysicalMemory:             -0.00G
    maxFileDescriptorCount:         -1
    currentOpenFileDescriptorCount: -1
    GC Names [PS MarkSweep, PS Scavenge]

    MEMORY_NAME            | allocation_size | init_size
    PS Eden Space          | 256.00MB        | 256.00MB
    Code Cache             | 240.00MB        | 2.44MB
    Compressed Class Space | 1,024.00MB      | 0.00MB
    PS Survivor Space      | 42.50MB         | 42.50MB
    PS Old Gen             | 683.00MB        | 683.00MB
    Metaspace              | -0.00MB         | 0.00MB

2025-12-29 11:33:50.867 [main] INFO  Engine -
{
  "content":[
    {
      "reader":{
        "name":"hdfsreader",
        "parameter":{
          "column":[
            {"index":0,"type":"string"},
            {"index":1,"type":"string"},
            {"index":2,"type":"long"}
          ],
          "defaultFS":"hdfs://master:9000",
          "encoding":"UTF-8",
          "fieldDelimiter":"\u0001",
          "fileType":"text",
          "path":"/user/hive/warehouse/result_longest_survivor_per_industry/*"
        }
      },
      "writer":{
        "name":"mysqlwriter",
        "parameter":{
          "column":["industry","company_name","max_live_days"],
          "connection":[
            {
              "jdbcUrl":"jdbc:mysql://localhost:3306/bigdata_analysis?useSSL=false&serverTimezone=UTC",
              "table":["result_longest_survivor_per_industry"]
            }
          ],
          "password":"*********",
          "username":"root"
        }
      }
    }
  ],
  "setting":{"speed":{"channel":1}}
}

2025-12-29 11:33:50.882 [main] WARN  Engine - prioriy set to 0, because NumberFormatException, the value is: null
2025-12-29 11:33:50.883 [main] INFO  PerfTrace - PerfTrace traceId=job_-1, isEnable=false, priority=0
2025-12-29 11:33:50.883 [main] INFO  JobContainer - DataX jobContainer starts job.
2025-12-29 11:33:50.884 [main] INFO  JobContainer - Set jobId = 0
2025-12-29 11:33:50.898 [job-0] INFO  HdfsReader$Job - init() begin...
2025-12-29 11:33:51.365 [job-0] INFO  HdfsReader$Job - hadoopConfig details:{"finalParameters":[]}
2025-12-29 11:33:51.365 [job-0] INFO  HdfsReader$Job - init() ok and end...
2025-12-29 11:33:51.716 [job-0] INFO  OriginalConfPretreatmentUtil - table:[result_longest_survivor_per_industry] all columns:[ industry,company_name,max_live_days ].
2025-12-29 11:33:51.721 [job-0] INFO  OriginalConfPretreatmentUtil - Write data [ INSERT INTO %s (industry,company_name,max_live_days) VALUES(?,?,?) ], which jdbcUrl like:[jdbc:mysql://localhost:3306/bigdata_analysis?useSSL=false&serverTimezone=UTC&yearIsDateType=false&zeroDateTimeBehavior=CONVERT_TO_NULL&rewriteBatchedStatements=true&tinyInt1isBit=false]
2025-12-29 11:33:51.722 [job-0] INFO  JobContainer - jobContainer starts to do prepare ...
2025-12-29 11:33:51.722 [job-0] INFO  JobContainer - DataX Reader.Job [hdfsreader] do prepare work .
2025-12-29 11:33:51.722 [job-0] INFO  HdfsReader$Job - prepare(), start to getAllFiles...
2025-12-29 11:33:51.723 [job-0] INFO  HdfsReader$Job - get HDFS all files in path = [/user/hive/warehouse/result_longest_survivor_per_industry/*]
Dec 29, 2025 11:33:51 AM org.apache.hadoop.util.NativeCodeLoader <clinit>
WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2025-12-29 11:33:52.414 [job-0] INFO  HdfsReader$Job - [hdfs://master:9000/user/hive/warehouse/result_longest_survivor_per_industry/000000_0] is a [text] type file; adding it to the source files list
2025-12-29 11:33:52.416 [job-0] INFO  HdfsReader$Job - Number of files to read: [1]; list: [hdfs://master:9000/user/hive/warehouse/result_longest_survivor_per_industry/000000_0]
2025-12-29 11:33:52.416 [job-0] INFO  JobContainer - DataX Writer.Job [mysqlwriter] do prepare work .
2025-12-29 11:33:52.417 [job-0] INFO  JobContainer - jobContainer starts to do split ...
2025-12-29 11:33:52.417 [job-0] INFO  JobContainer - Job set Channel-Number to 1 channels.
2025-12-29 11:33:52.417 [job-0] INFO  HdfsReader$Job - split() begin...
2025-12-29 11:33:52.417 [job-0] INFO  JobContainer - DataX Reader.Job [hdfsreader] splits to [1] tasks.
2025-12-29 11:33:52.417 [job-0] INFO  JobContainer - DataX Writer.Job [mysqlwriter] splits to [1] tasks.
2025-12-29 11:33:52.425 [job-0] INFO  JobContainer - jobContainer starts to do schedule ...
2025-12-29 11:33:52.431 [job-0] INFO  JobContainer - Scheduler starts [1] taskGroups.
2025-12-29 11:33:52.434 [job-0] INFO  JobContainer - Running by standalone Mode.
2025-12-29 11:33:52.462 [taskGroup-0] INFO  TaskGroupContainer - taskGroupId=[0] start [1] channels for [1] tasks.
2025-12-29 11:33:52.466 [taskGroup-0] INFO  Channel - Channel set byte_speed_limit to -1, No bps activated.
2025-12-29 11:33:52.466 [taskGroup-0] INFO  Channel - Channel set record_speed_limit to -1, No tps activated.
2025-12-29 11:33:52.476 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started
2025-12-29 11:33:52.499 [0-0-0-reader] INFO  HdfsReader$Job - hadoopConfig details:{"finalParameters":["mapreduce.job.end-notification.max.retry.interval","mapreduce.job.end-notification.max.attempts"]}
2025-12-29 11:33:52.500 [0-0-0-reader] INFO  Reader$Task - read start
2025-12-29 11:33:52.500 [0-0-0-reader] INFO  Reader$Task - reading file : [hdfs://master:9000/user/hive/warehouse/result_longest_survivor_per_industry/000000_0]
2025-12-29 11:33:52.520 [0-0-0-reader] INFO  UnstructuredStorageReaderUtil - CsvReader using default values [{"captureRawRecord":true,"columnCount":0,"comment":"#","currentRecord":-1,"delimiter":"\u0001","escapeMode":1,"headerCount":0,"rawRecord":"","recordDelimiter":"\u0000","safetySwitch":false,"skipEmptyRecords":true,"textQualifier":"\"","trimWhitespace":true,"useComments":false,"useTextQualifier":true,"values":[]}]; csvReaderConfig is [null]
2025-12-29 11:33:52.529 [0-0-0-reader] INFO  Reader$Task - end read source files...
2025-12-29 11:33:52.576 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[104]ms
2025-12-29 11:33:52.577 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] completed it's tasks.
2025-12-29 11:34:02.479 [job-0] INFO  StandAloneJobContainerCommunicator - Total 43 records, 540 bytes | Speed 54B/s, 4 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2025-12-29 11:34:02.479 [job-0] INFO  AbstractScheduler - Scheduler accomplished all tasks.
2025-12-29 11:34:02.480 [job-0] INFO  JobContainer - DataX Writer.Job [mysqlwriter] do post work.
2025-12-29 11:34:02.480 [job-0] INFO  JobContainer - DataX Reader.Job [hdfsreader] do post work.
2025-12-29 11:34:02.480 [job-0] INFO  JobContainer - DataX jobId [0] completed successfully.
2025-12-29 11:34:02.481 [job-0] INFO  HookInvoker - No hook invoked, because base dir not exists or is a file: /usr/local/datax/hook
2025-12-29 11:34:02.482 [job-0] INFO  JobContainer -
    [total cpu info] =>
      averageCpu | maxDeltaCpu | minDeltaCpu
      -1.00%     | -1.00%      | -1.00%

    [total gc info] =>
      NAME         | totalGCCount | maxDeltaGCCount | minDeltaGCCount | totalGCTime | maxDeltaGCTime | minDeltaGCTime
      PS MarkSweep | 1            | 1               | 1               | 0.019s      | 0.019s         | 0.019s
      PS Scavenge  | 1            | 1               | 1               | 0.016s      | 0.016s         | 0.016s

2025-12-29 11:34:02.482 [job-0] INFO  JobContainer - PerfTrace not enable!
2025-12-29 11:34:02.482 [job-0] INFO  StandAloneJobContainerCommunicator - Total 43 records, 540 bytes | Speed 54B/s, 4 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2025-12-29 11:34:02.485 [job-0] INFO  JobContainer -
Job start time           : 2025-12-29 11:33:50
Job end time             : 2025-12-29 11:34:02
Total elapsed time       : 11s
Average throughput       : 54B/s
Record write speed       : 4rec/s
Total records read       : 43
Total read/write failures: 0

[root@master ~]# python /usr/local/datax/bin/datax.py /usr/local/datax/job/job_founder_invest.json

DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.

2025-12-29 11:34:03.125 [main] ERROR Engine - According to DataX's analysis, the most likely cause of this task's failure is:
com.alibaba.datax.common.exception.DataXException: Code:[Framework-03], Description:[DataX engine configuration error; this is usually caused by a bad DataX installation, please contact your ops team.]. - Failed to read job configuration: /usr/local/datax/job/job_founder_invest.json - java.io.FileNotFoundException: File '/usr/local/datax/job/job_founder_invest.json' does not exist
	at org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:299)
	at org.apache.commons.io.FileUtils.readFileToString(FileUtils.java:1711)
	at org.apache.commons.io.FileUtils.readFileToString(FileUtils.java:1748)
	at com.alibaba.datax.core.util.ConfigParser.getJobContent(ConfigParser.java:106)
	at com.alibaba.datax.core.util.ConfigParser.parseJobConfig(ConfigParser.java:74)
	at com.alibaba.datax.core.util.ConfigParser.parse(ConfigParser.java:26)
	at com.alibaba.datax.core.Engine.entry(Engine.java:138)
	at com.alibaba.datax.core.Engine.main(Engine.java:208)
	at com.alibaba.datax.common.exception.DataXException.asDataXException(DataXException.java:41)
	at com.alibaba.datax.core.util.ConfigParser.getJobContent(ConfigParser.java:108)
	at com.alibaba.datax.core.util.ConfigParser.parseJobConfig(ConfigParser.java:74)
	at com.alibaba.datax.core.util.ConfigParser.parse(ConfigParser.java:26)
	at com.alibaba.datax.core.Engine.entry(Engine.java:138)
	at com.alibaba.datax.core.Engine.main(Engine.java:208)
Caused by: java.io.FileNotFoundException: File '/usr/local/datax/job/job_founder_invest.json' does not exist
	at org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:299)
	at org.apache.commons.io.FileUtils.readFileToString(FileUtils.java:1711)
	at org.apache.commons.io.FileUtils.readFileToString(FileUtils.java:1748)
	at com.alibaba.datax.core.util.ConfigParser.getJobContent(ConfigParser.java:106)
	... 4 more

[root@master ~]# python /usr/local/datax/bin/datax.py /usr/local/datax/job/job_top_investor.json

DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.

2025-12-29 11:34:03.504 [main] ERROR Engine - According to DataX's analysis, the most likely cause of this task's failure is:
com.alibaba.datax.common.exception.DataXException: Code:[Framework-03], Description:[DataX engine configuration error; this is usually caused by a bad DataX installation, please contact your ops team.]. - Failed to read job configuration: /usr/local/datax/job/job_top_investor.json - java.io.FileNotFoundException: File '/usr/local/datax/job/job_top_investor.json' does not exist
	at org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:299)
	at org.apache.commons.io.FileUtils.readFileToString(FileUtils.java:1711)
	at org.apache.commons.io.FileUtils.readFileToString(FileUtils.java:1748)
	at com.alibaba.datax.core.util.ConfigParser.getJobContent(ConfigParser.java:106)
	at com.alibaba.datax.core.util.ConfigParser.parseJobConfig(ConfigParser.java:74)
	at com.alibaba.datax.core.util.ConfigParser.parse(ConfigParser.java:26)
	at com.alibaba.datax.core.Engine.entry(Engine.java:138)
	at com.alibaba.datax.core.Engine.main(Engine.java:208)
	at com.alibaba.datax.common.exception.DataXException.asDataXException(DataXException.java:41)
	at com.alibaba.datax.core.util.ConfigParser.getJobContent(ConfigParser.java:108)
	at com.alibaba.datax.core.util.ConfigParser.parseJobConfig(ConfigParser.java:74)
	at com.alibaba.datax.core.util.ConfigParser.parse(ConfigParser.java:26)
	at com.alibaba.datax.core.Engine.entry(Engine.java:138)
	at com.alibaba.datax.core.Engine.main(Engine.java:208)
Caused by: java.io.FileNotFoundException: File '/usr/local/datax/job/job_top_investor.json' does not exist
	at org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:299)
	at org.apache.commons.io.FileUtils.readFileToString(FileUtils.java:1711)
	at org.apache.commons.io.FileUtils.readFileToString(FileUtils.java:1748)
	at com.alibaba.datax.core.util.ConfigParser.getJobContent(ConfigParser.java:106)
	... 4 more

[root@master ~]# mysql -u root -p181750Qy.
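The two failed DataX runs above are plain missing-file errors: only job_longest_survivor.json had actually been written. A small guard script (my own sketch, not part of DataX; the directory and job names follow the transcript) can verify that each job JSON exists and parses before handing it to datax.py:

```python
import json
import os

def runnable_jobs(job_dir, names):
    """Return (good, missing): job configs under job_dir that exist and parse
    as JSON, and the names that are absent, instead of letting datax.py fail
    on them one invocation at a time."""
    good, missing = [], []
    for name in names:
        path = os.path.join(job_dir, name)
        if not os.path.isfile(path):
            missing.append(name)
            continue
        with open(path, encoding="utf-8") as f:
            json.load(f)  # raises ValueError if the config is malformed JSON
        good.append(path)
    return good, missing

# Example (paths from the transcript):
# good, missing = runnable_jobs("/usr/local/datax/job",
#     ["job_longest_survivor.json", "job_founder_invest.json", "job_top_investor.json"])
```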
-e "SELECT * FROM bigdata_analysis.result_longest_survivor_per_industry;"
+--------------+--------------------+---------------+
| industry     | company_name       | max_live_days |
+--------------+--------------------+---------------+
| 金融         | 信诚贷             |           999 |
| 金融         | 比特人             |           999 |
| 电子商务     | 折子网             |           999 |
| 游戏         | 千奇网络           |           999 |
| 汽车交通     | 淘车乐             |           999 |
| 新工业       | 必因科技           |           999 |
| 广告营销     | 纸指天下           |           999 |
| 企业服务     | Smart2On           |           999 |
| 企业服务     | 九千年             |           999 |
| 文娱传媒     | 电音网             |           998 |
| 文娱传媒     | 爱豆影视           |           998 |
| 医疗健康     | 爱健康             |           998 |
| 社交网络     | 法天下             |           998 |
| 本地生活     | 一点通             |           997 |
| 本地生活     | 威力恩             |           997 |
| 旅游         | 去海钓网           |           997 |
| 教育         | 零距校园网         |           997 |
| 本地生活     | 帮个忙             |           997 |
| 硬件         | 中博宏大           |           997 |
| 工具软件     | 51CV.me            |           997 |
| 工具软件     | 通知盒amybox       |           997 |
| 房产服务     | E居多得            |           996 |
| 物流         | 运多多             |           981 |
| 体育运动     | 速播体育           |           949 |
| 农业         | 华夏康家           |           390 |
| 1065         | 2016-07-01         |             1 |
| 1212         | 2012-04-01         |             1 |
| 1247         | 2016-01-01         |             1 |
| 1278         | 2015-12-01         |             1 |
| 1400         | 2015-08-01         |             4 |
| 1521         | 2014-11-01         |             2 |
| 1522         | 2014-05-01         |             1 |
| 1825         | 2013-01-01         |             1 |
| 1887         | 2011-11-01         |             1 |
| 2039         | 2012-06-01         |             1 |
| 2070         | 2012-05-01         |             2 |
| 2131         | 2010-03-01         |             1 |
| 2222         | 2006-12-01         |             1 |
| 2556         | 2006-01-01         |             1 |
| 418          | 2016-06-01         |             1 |
| 725          | 2013-08-01         |             1 |
| 734          | 2014-08-01         |             1 |
| 740          | 2014-09-01         |             1 |
| 791          | 2017-04-01         |             1 |
| 822          | 2017-03-01         |             1 |
| 854          | 2014-04-01         |             1 |
| 942          | 2013-01-01         |             1 |
| 944          | 2014-01-01         |             1 |
| 950          | 2015-01-01         |             1 |
| 企业服务     | 龙腾天下           |           789 |
| 体育运动     | 骑程网             |            73 |
| 农业         | 许鲜网             |             4 |
| 医疗健康     | 鼎诚智能科技       |           201 |
| 工具软件     | 鸸鹋爱卖萌         |           374 |
| 广告营销     | 魅蓝互动           |           172 |
| 房产服务     | 鼎家网络           |           124 |
| 教育         | 齐贤教育           |           376 |
| 文娱传媒     | 鸡蛋娱乐           |           421 |
| 新工业       | 鼎喜手机           |            72 |
| 旅游         | 麻游旅行           |           255 |
| 本地生活     | 黄帽子劳务         |           595 |
| 汽车交通     | 麦麦车             |           311 |
| 游戏         | 龙辰文化           |           334 |
| 物流         | 马上快递app        |            58 |
| 电子商务     | 齐表网             |           925 |
| 硬件         | 黑豆吉他           |           178 |
| 社交网络     | 麦浪               |           500 |
| 金融         | 龙矿科技           |           510 |
+--------------+--------------------+---------------+
[root@master ~]# mysql -u root -p181750Qy. -e "SELECT * FROM bigdata_analysis.result_founder_invest_analysis;"
[root@master ~]# mysql -u root -p181750Qy. -e "SELECT * FROM bigdata_analysis.result_top_investor_avg_survival;"
[root@master ~]# hdfs dfs -ls /input/cleaned_data/
Found 1 items
-rw-r--r--   1 root supergroup    2455476 2025-12-29 11:15 /input/cleaned_data/companies.csv
[root@master ~]#
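Note that the earlier `hdfs dfs -cat` output looks like fused fields (e.g. `10652016-07-011`) only because Hive's default text-format field delimiter is the invisible \x01 (Ctrl-A), the same character the DataX job names as fieldDelimiter "\u0001". A quick sketch of splitting such a line explicitly:

```python
# Hive's default text output delimits fields with \x01 (Ctrl-A), which
# terminals render as nothing, so "cat" shows fields run together.

def split_hive_line(line, delimiter="\x01"):
    """Split one Hive text-format output line into its fields."""
    return line.rstrip("\n").split(delimiter)

# What the terminal showed as "10652016-07-011" is actually three fields:
sample = "1065\x012016-07-01\x011"
```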
12-30
wsl: Detected a localhost proxy configuration, but it is not mirrored into WSL. WSL in NAT mode does not support localhost proxies.
(base) os01@DESKTOP-F3AFPI2:~$ sudo apt update && sudo apt upgrade -y
sudo apt install -y build-essential git wget curl vim tmux htop
-bash: syntax error near unexpected token `;&'
[sudo] password for os01:
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
build-essential is already the newest version (12.10ubuntu1).
wget is already the newest version (1.21.4-1ubuntu4.1).
wget set to manually installed.
curl is already the newest version (8.5.0-2ubuntu10.6).
curl set to manually installed.
tmux is already the newest version (3.4-1ubuntu0.1).
tmux set to manually installed.
The following packages were automatically installed and are no longer required:
  libdrm-nouveau2 libdrm-radeon1 libgl1-amber-dri libglapi-mesa libxcb-dri2-0
Use 'sudo apt autoremove' to remove them.
The following additional packages will be installed:
  git-man libnl-genl-3-200 vim-common vim-runtime vim-tiny xxd
Suggested packages:
  git-daemon-run | git-daemon-sysvinit git-doc git-email git-gui gitk gitweb git-cvs git-mediawiki git-svn
  lm-sensors strace ctags vim-doc vim-scripts indent
The following NEW packages will be installed:
  htop libnl-genl-3-200
The following packages will be upgraded:
  git git-man vim vim-common vim-runtime vim-tiny xxd
7 upgraded, 2 newly installed, 0 to remove and 168 not upgraded.
Need to get 15.2 MB/15.4 MB of archives.
After this operation, 496 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu noble-updates/main amd64 vim amd64 2:9.1.0016-1ubuntu7.9 [1881 kB]
Get:2 http://archive.ubuntu.com/ubuntu noble-updates/main amd64 vim-common all 2:9.1.0016-1ubuntu7.9 [386 kB]
Get:3 http://archive.ubuntu.com/ubuntu noble-updates/main amd64 vim-tiny amd64 2:9.1.0016-1ubuntu7.9 [803 kB]
Get:4 http://archive.ubuntu.com/ubuntu noble-updates/main amd64 vim-runtime all 2:9.1.0016-1ubuntu7.9 [7281 kB]
Get:5 http://archive.ubuntu.com/ubuntu noble-updates/main amd64 xxd amd64 2:9.1.0016-1ubuntu7.9 [63.8 kB]
Get:6 http://archive.ubuntu.com/ubuntu noble-updates/main amd64 git-man all 1:2.43.0-1ubuntu7.3 [1100 kB]
Get:7 http://archive.ubuntu.com/ubuntu noble-updates/main amd64 git amd64 1:2.43.0-1ubuntu7.3 [3680 kB]
Get:8 http://archive.ubuntu.com/ubuntu noble-updates/main amd64 libnl-genl-3-200 amd64 3.7.0-0.3build1.1 [12.2 kB]
Fetched 15.2 MB in 7s (2121 kB/s)
(Reading database ... 55810 files and directories currently installed.)
Preparing to unpack .../0-vim_2%3a9.1.0016-1ubuntu7.9_amd64.deb ...
Unpacking vim (2:9.1.0016-1ubuntu7.9) over (2:9.1.0016-1ubuntu7.5) ...
Preparing to unpack .../1-vim-common_2%3a9.1.0016-1ubuntu7.9_all.deb ...
Unpacking vim-common (2:9.1.0016-1ubuntu7.9) over (2:9.1.0016-1ubuntu7.5) ...
Preparing to unpack .../2-vim-tiny_2%3a9.1.0016-1ubuntu7.9_amd64.deb ...
Unpacking vim-tiny (2:9.1.0016-1ubuntu7.9) over (2:9.1.0016-1ubuntu7.5) ...
Preparing to unpack .../3-vim-runtime_2%3a9.1.0016-1ubuntu7.9_all.deb ...
Unpacking vim-runtime (2:9.1.0016-1ubuntu7.9) over (2:9.1.0016-1ubuntu7.5) ...
Preparing to unpack .../4-xxd_2%3a9.1.0016-1ubuntu7.9_amd64.deb ...
Unpacking xxd (2:9.1.0016-1ubuntu7.9) over (2:9.1.0016-1ubuntu7.5) ...
Preparing to unpack .../5-git-man_1%3a2.43.0-1ubuntu7.3_all.deb ...
Unpacking git-man (1:2.43.0-1ubuntu7.3) over (1:2.43.0-1ubuntu7.1) ...
Preparing to unpack .../6-git_1%3a2.43.0-1ubuntu7.3_amd64.deb ...
Unpacking git (1:2.43.0-1ubuntu7.3) over (1:2.43.0-1ubuntu7.1) ...
Selecting previously unselected package libnl-genl-3-200:amd64.
Preparing to unpack .../7-libnl-genl-3-200_3.7.0-0.3build1.1_amd64.deb ...
Unpacking libnl-genl-3-200:amd64 (3.7.0-0.3build1.1) ...
Selecting previously unselected package htop.
Preparing to unpack .../8-htop_3.3.0-4build1_amd64.deb ...
Unpacking htop (3.3.0-4build1) ...
Setting up xxd (2:9.1.0016-1ubuntu7.9) ...
Setting up vim-common (2:9.1.0016-1ubuntu7.9) ...
Setting up libnl-genl-3-200:amd64 (3.7.0-0.3build1.1) ...
Setting up git-man (1:2.43.0-1ubuntu7.3) ...
Setting up vim-runtime (2:9.1.0016-1ubuntu7.9) ...
Setting up vim (2:9.1.0016-1ubuntu7.9) ...
Setting up htop (3.3.0-4build1) ...
Setting up vim-tiny (2:9.1.0016-1ubuntu7.9) ...
Setting up git (1:2.43.0-1ubuntu7.3) ...
Processing triggers for libc-bin (2.39-0ubuntu8.6) ...
Processing triggers for man-db (2.12.0-4build2) ...
Processing triggers for hicolor-icon-theme (0.17-2) ...
(base) os01@DESKTOP-F3AFPI2:~$ # Install the NVIDIA container toolkit
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt update && sudo apt install -y nvidia-docker2
# Verify CUDA availability
nvidia-smi   # should show the same GPU info as the Windows host
Warning: apt-key is deprecated. Manage keyring files in trusted.gpg.d instead (see apt-key(8)).
gpg: no valid OpenPGP data found.
# Unsupported distribution!
# Check https://nvidia.github.io/nvidia-docker
-bash: syntax error near unexpected token `;&'
Sat Oct 11 15:45:43 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.82.10              Driver Version: 581.29         CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 5090        On  |   00000000:01:00.0  On |                  N/A |
|  0%   43C    P0             73W /  600W |    1244MiB /  32607MiB |      2%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
(base) os01@DESKTOP-F3AFPI2:~$ git clone https://gitcode.com/GitHub_Trending/op/Open-Sora-Plan.git
cd Open-Sora-Plan
git checkout mindspeed_mmdit   # switch to the latest NPU-support branch
fatal: destination path 'Open-Sora-Plan' already exists and is not an empty directory.
Already on 'mindspeed_mmdit'
Your branch is up to date with 'origin/mindspeed_mmdit'.
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$ # Use conda to manage the environment (recommended)
wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/Miniconda3-py38_23.11.0-2-Linux-x86_64.sh
bash Miniconda3-py38_23.11.0-2-Linux-x86_64.sh -b -p $HOME/miniconda3
source $HOME/miniconda3/bin/activate
conda env create -f environment.yml
conda activate open-sora
# Verify the Python version and CUDA support
python -c "import torch; print('CUDA可用:', torch.cuda.is_available())"   # should print True
--2025-10-11 15:46:07--  https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/Miniconda3-py38_23.11.0-2-Linux-x86_64.sh
Resolving mirrors.tuna.tsinghua.edu.cn (mirrors.tuna.tsinghua.edu.cn)... 101.6.15.130, 2402:f000:1:400::2
Connecting to mirrors.tuna.tsinghua.edu.cn (mirrors.tuna.tsinghua.edu.cn)|101.6.15.130|:443... connected.
HTTP request sent, awaiting response...
200 OK
Length: 131844786 (126M) [application/octet-stream]
Saving to: 'Miniconda3-py38_23.11.0-2-Linux-x86_64.sh.1'

Miniconda3-py38_23.11.0-2-Lin 100%[=================================================>] 125.74M  39.6MB/s    in 3.3s

2025-10-11 15:46:11 (37.8 MB/s) - 'Miniconda3-py38_23.11.0-2-Linux-x86_64.sh.1' saved [131844786/131844786]

ERROR: File or directory already exists: '/home/os01/miniconda3'
If you want to update an existing installation, use the -u option.
EnvironmentFileNotFound: '/home/os01/Open-Sora-Plan/environment.yml' file not found
EnvironmentNameNotFound: Could not find conda environment: open-sora
You can list all discoverable environments with `conda info --envs`.
CUDA可用: True
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$ # Configure a China pip mirror
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
# Manually install pinned dependency versions
pip install torch==2.1.0+cu118 torchvision==0.16.0+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
pip install xformers==0.0.22.post7 accelerate==0.34.0 deepspeed==0.12.6
Writing to /home/os01/.config/pip/pip.conf
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple, https://download.pytorch.org/whl/cu118
Requirement already satisfied: torch==2.1.0+cu118 in /home/os01/miniconda3/lib/python3.8/site-packages (2.1.0+cu118)
Requirement already satisfied: torchvision==0.16.0+cu118 in /home/os01/miniconda3/lib/python3.8/site-packages (0.16.0+cu118)
Requirement already satisfied: filelock in /home/os01/miniconda3/lib/python3.8/site-packages (from torch==2.1.0+cu118) (3.16.1)
Requirement already satisfied: typing-extensions in /home/os01/miniconda3/lib/python3.8/site-packages (from torch==2.1.0+cu118) (4.13.2)
Requirement already satisfied: sympy in /home/os01/miniconda3/lib/python3.8/site-packages (from torch==2.1.0+cu118) (1.13.3)
Requirement already satisfied: networkx in /home/os01/miniconda3/lib/python3.8/site-packages (from torch==2.1.0+cu118) (3.1)
Requirement already satisfied: jinja2 in /home/os01/miniconda3/lib/python3.8/site-packages (from torch==2.1.0+cu118) (3.1.6)
Requirement already satisfied: fsspec in /home/os01/miniconda3/lib/python3.8/site-packages (from torch==2.1.0+cu118) (2025.3.0)
Requirement already satisfied: triton==2.1.0 in /home/os01/miniconda3/lib/python3.8/site-packages (from torch==2.1.0+cu118) (2.1.0)
Requirement already satisfied: numpy in /home/os01/miniconda3/lib/python3.8/site-packages (from torchvision==0.16.0+cu118) (1.24.4)
Requirement already satisfied: requests in /home/os01/miniconda3/lib/python3.8/site-packages (from torchvision==0.16.0+cu118) (2.31.0)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in /home/os01/miniconda3/lib/python3.8/site-packages (from torchvision==0.16.0+cu118) (10.4.0)
Requirement already satisfied: MarkupSafe>=2.0 in /home/os01/miniconda3/lib/python3.8/site-packages (from jinja2->torch==2.1.0+cu118) (2.1.5)
Requirement already satisfied: charset-normalizer<4,>=2 in /home/os01/miniconda3/lib/python3.8/site-packages (from requests->torchvision==0.16.0+cu118) (2.0.4)
Requirement already satisfied: idna<4,>=2.5 in /home/os01/miniconda3/lib/python3.8/site-packages (from requests->torchvision==0.16.0+cu118) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in /home/os01/miniconda3/lib/python3.8/site-packages (from requests->torchvision==0.16.0+cu118) (1.26.18)
Requirement already satisfied: certifi>=2017.4.17 in /home/os01/miniconda3/lib/python3.8/site-packages (from requests->torchvision==0.16.0+cu118) (2023.11.17)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in /home/os01/miniconda3/lib/python3.8/site-packages (from sympy->torch==2.1.0+cu118) (1.3.0)
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting xformers==0.0.22.post7
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/56/5f/20481c8ccfbd2ac0f936908c9b0ff3e31380d8d186d7dabf34a941b3127f/xformers-0.0.22.post7-cp38-cp38-manylinux2014_x86_64.whl (211.8 MB)
Collecting accelerate==0.34.0
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/02/0e/626f2dd4325f4545fbaaf9c590390d2d4ab8e7551579346fe1e319bd93af/accelerate-0.34.0-py3-none-any.whl (324 kB)
Collecting deepspeed==0.12.6
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/f1/ff/0fba0fec90e7de1c7148b0527e8ac9cdf2280d274ed135bcb2187f7497a7/deepspeed-0.12.6.tar.gz (1.2 MB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [16 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-3za_tc1r/deepspeed_40a9c88a17284f49b38ee0bcb816fee1/setup.py", line 100, in <module>
          cuda_major_ver, cuda_minor_ver = installed_cuda_version()
        File "/tmp/pip-install-3za_tc1r/deepspeed_40a9c88a17284f49b38ee0bcb816fee1/op_builder/builder.py", line 52, in installed_cuda_version
          output = subprocess.check_output([cuda_home + "/bin/nvcc", "-V"], universal_newlines=True)
        File "/home/os01/miniconda3/lib/python3.8/subprocess.py", line 415, in check_output
          return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
        File "/home/os01/miniconda3/lib/python3.8/subprocess.py", line 493, in run
          with Popen(*popenargs, **kwargs) as process:
        File "/home/os01/miniconda3/lib/python3.8/subprocess.py", line 858, in __init__
          self._execute_child(args, executable, preexec_fn, close_fds,
        File "/home/os01/miniconda3/lib/python3.8/subprocess.py", line 1720, in _execute_child
          raise child_exception_type(errno_num, err_msg, err_filename)
      FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/cuda/bin/nvcc'
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$ ls /usr/local/cuda*/bin/nvcc  # check the usual paths
ls: cannot access '/usr/local/cuda*/bin/nvcc': No such file or directory
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$ # check the installed driver
nvidia-smi  # confirm the highest supported CUDA version
# download the matching installer from the official site
wget https://developer.download.nvidia.com/compute/cuda/12.2.2/local_installers/cuda_12.2.2_535.104.05_linux.run
sudo sh cuda_12.2.2_535.104.05_linux.run
Sat Oct 11 15:47:57 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.82.10              Driver Version: 581.29         CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 5090        On  |   00000000:01:00.0  On |                  N/A |
|  0%   42C    P5             33W /  600W |    1290MiB /  32607MiB |      1%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
--2025-10-11 15:47:57--  https://developer.download.nvidia.com/compute/cuda/12.2.2/local_installers/cuda_12.2.2_535.104.05_linux.run
Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 23.200.143.149, 23.200.143.133
Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|23.200.143.149|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://developer.download.nvidia.cn/compute/cuda/12.2.2/local_installers/cuda_12.2.2_535.104.05_linux.run [following]
--2025-10-11 15:47:57--  https://developer.download.nvidia.cn/compute/cuda/12.2.2/local_installers/cuda_12.2.2_535.104.05_linux.run
Resolving developer.download.nvidia.cn (developer.download.nvidia.cn)... 39.173.184.184, 39.173.184.185, 39.173.184.186, ...
Connecting to developer.download.nvidia.cn (developer.download.nvidia.cn)|39.173.184.184|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4344134690 (4.0G) [application/octet-stream]
Saving to: ‘cuda_12.2.2_535.104.05_linux.run’

cuda_12.2.2_535.104.05_linux. 100%[=================================================>]   4.04G  43.0MB/s    in 97s

2025-10-11 15:49:35 (42.5 MB/s) - ‘cuda_12.2.2_535.104.05_linux.run’ saved [4344134690/4344134690]

sh: 1: dkms: not found
Installation failed. See log at /var/log/cuda-installer.log for details.
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$ nano ~/.bashrc
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$ source ~/.bashrc
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$ echo $CUDA_HOME  # should print the correct path
nvcc --version  # should print the version info
/usr/local/cuda-11.8
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Jan__6_16:45:21_PST_2023
Cuda compilation tools, release 12.0, V12.0.140
Build cuda_12.0.r12.0/compiler.32267302_0
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$ CUDA_HOME=/usr/local/cuda-11.8 pip install deepspeed==0.12.6
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting deepspeed==0.12.6
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/f1/ff/0fba0fec90e7de1c7148b0527e8ac9cdf2280d274ed135bcb2187f7497a7/deepspeed-0.12.6.tar.gz (1.2 MB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [16 lines of output]
    (same FileNotFoundError traceback as above, now failing on '/usr/local/cuda-11.8/bin/nvcc')
    [end of output]
error: metadata-generation-failed
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$ sudo ln -s /usr/local/cuda-11.8 /usr/local/cuda
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$ CUDA_HOME=/usr/local/cuda-11.8 pip install deepspeed==0.12.6
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting deepspeed==0.12.6
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/f1/ff/0fba0fec90e7de1c7148b0527e8ac9cdf2280d274ed135bcb2187f7497a7/deepspeed-0.12.6.tar.gz (1.2 MB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> (same FileNotFoundError traceback: '/usr/local/cuda-11.8/bin/nvcc' not found)
error: metadata-generation-failed
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$ sudo chmod -R 755 /usr/local/cuda-11.8/bin
chmod: cannot access '/usr/local/cuda-11.8/bin': No such file or directory
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$ python -c "import deepspeed; print(deepspeed.__version__)"  # should print 0.12.6
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'deepspeed'
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$ sudo apt install build-essential  # make sure the build toolchain is complete
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
build-essential is already the newest version (12.10ubuntu1).
The following packages were automatically installed and are no longer required:
  libdrm-nouveau2 libdrm-radeon1 libgl1-amber-dri libglapi-mesa libxcb-dri2-0
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 168 not upgraded.
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$ source ~/.bashrc
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$ echo $CUDA_HOME  # should print the correct path
nvcc --version  # should print the version info
/usr/local/cuda-11.8
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Jan__6_16:45:21_PST_2023
Cuda compilation tools, release 12.0, V12.0.140
Build cuda_12.0.r12.0/compiler.32267302_0
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$ sudo ln -s /usr/local/cuda-11.8 /usr/local/cuda
ln: failed to create symbolic link '/usr/local/cuda': File exists
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$ CUDA_HOME=/usr/local/cuda-11.8 pip install deepspeed==0.12.6
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting deepspeed==0.12.6
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/f1/ff/0fba0fec90e7de1c7148b0527e8ac9cdf2280d274ed135bcb2187f7497a7/deepspeed-0.12.6.tar.gz (1.2 MB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> (same FileNotFoundError traceback: '/usr/local/cuda-11.8/bin/nvcc' not found)
error: metadata-generation-failed
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$ ls -ld /usr/local/cuda
lrwxrwxrwx 1 root root 20 Oct 11 15:54 /usr/local/cuda -> /usr/local/cuda-11.8
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$ # remove the old link
sudo rm /usr/local/cuda
# create the new link (substitute your actual path)
sudo ln -s /usr/local/cuda-11.8 /usr/local/cuda
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$ ls -l /usr/local | grep cuda
lrwxrwxrwx 1 root root   20 Oct 11 15:58 cuda -> /usr/local/cuda-11.8
drwxr-xr-x 6 root root 4096 Oct 11 15:51 cuda-12.2
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$ sudo ldconfig  # refresh the shared-library cache
source ~/.bashrc  # reload environment variables
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$ CUDA_HOME=/usr/local/cuda-11.8 pip install deepspeed==0.12.6
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting deepspeed==0.12.6
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/f1/ff/0fba0fec90e7de1c7148b0527e8ac9cdf2280d274ed135bcb2187f7497a7/deepspeed-0.12.6.tar.gz (1.2 MB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> (same FileNotFoundError traceback: '/usr/local/cuda-11.8/bin/nvcc' not found)
error: metadata-generation-failed
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$ sudo chmod -R 755 /usr/local/cuda-11.8/bin
chmod: cannot access '/usr/local/cuda-11.8/bin': No such file or directory
(base) os01@DESKTOP-F3AFPI2:~/Open-Sora-Plan$
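The `ls -l /usr/local | grep cuda` output above already contains the root cause: the `cuda` symlink points at `/usr/local/cuda-11.8`, but only `cuda-12.2` actually exists on disk (the 12.2 installer partially succeeded; 11.8 was never installed), so the link is dangling and every `CUDA_HOME=/usr/local/cuda-11.8` attempt fails identically. A small sketch that distinguishes the three cases (`check_cuda_home` is a made-up helper name, not a standard tool):

```shell
# Sketch: classify a CUDA root as usable, a dangling symlink, or absent.
check_cuda_home() {
    p="$1"
    if [ -x "$p/bin/nvcc" ]; then
        echo "ok $(readlink -f "$p")"   # usable toolkit; print resolved path
    elif [ -L "$p" ] && [ ! -e "$p" ]; then
        echo "dangling"                 # symlink to a directory that is gone
    else
        echo "missing"                  # no toolkit here at all
    fi
}
```

On this machine `check_cuda_home /usr/local/cuda` would print `dangling`; the fix is to point both the symlink and `CUDA_HOME` at the directory that actually exists (here `/usr/local/cuda-12.2`), or to genuinely install the 11.8 toolkit first.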
10-12
Linux raspberrypi 6.12.34+rpt-rpi-v8 #1 SMP PREEMPT Debian 1:6.12.34-1+rpt1~bookworm (2025-06-26) aarch64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Sat Aug 2 22:18:37 2025 from 192.168.163.80
w@raspberrypi:~ $ pip install dronekit
error: externally-managed-environment

× This environment is externally managed
╰─> To install Python packages system-wide, try apt install
    python3-xyz, where xyz is the package you are trying to
    install.

    If you wish to install a non-Debian-packaged Python package,
    create a virtual environment using python3 -m venv path/to/venv.
    Then use path/to/venv/bin/python and path/to/venv/bin/pip. Make
    sure you have python3-full installed.

    For more information visit http://rptl.io/venv

note: If you believe this is a mistake, please contact your Python installation or OS distribution provider. You can override this, at the risk of breaking your Python installation or OS, by passing --break-system-packages.
hint: See PEP 668 for the detailed specification.
w@raspberrypi:~ $ conda create -n dronepy27 python=2.7
-bash: conda: command not found
w@raspberrypi:~ $ sudo apt-get install python-pip python-dev
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Package python-pip is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source
However the following packages replace it:
  python3-pip

Package python-dev is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source
However the following packages replace it:
  python-dev-is-python3

E: Package 'python-pip' has no installation candidate
E: Package 'python-dev' has no installation candidate
w@raspberrypi:~ $ ^C
w@raspberrypi:~ $ sudo apt-get install python2.7
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Note, selecting 'libpython2.7:armhf' for regex 'python2.7'
The following packages were automatically installed and are no longer required:
  libbasicusageenvironment1 libgroupsock8 liblivemedia77 python3-v4l2
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 4 not upgraded.
w@raspberrypi:~ $ sudo pip install dronekit
error: externally-managed-environment
  (same PEP 668 message as above)
w@raspberrypi:~ $ sudo apt-get install libxml2-dev libxslt1-dev zlib1g-dev python-py
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
E: Unable to locate package python-py
w@raspberrypi:~ $ cd
w@raspberrypi:~ $ sudo apt update
Hit:1 https://packages.microsoft.com/repos/code stable InRelease
Hit:2 http://archive.raspberrypi.com/debian bookworm InRelease
Hit:3 http://deb.debian.org/debian bookworm InRelease
Hit:4 http://deb.debian.org/debian-security bookworm-security InRelease
Get:5 http://deb.debian.org/debian bookworm-updates InRelease [55.4 kB]
Fetched 55.4 kB in 13s (4,374 B/s)
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
4 packages can be upgraded. Run 'apt list --upgradable' to see them.
w@raspberrypi:~ $ sudo apt install python3-pip python3-dev
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
python3-pip is already the newest version (23.0.1+dfsg-1+rpt1).
python3-dev is already the newest version (3.11.2-1+b1).
The following packages were automatically installed and are no longer required:
  libbasicusageenvironment1 libgroupsock8 liblivemedia77 python3-v4l2
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 4 not upgraded.
w@raspberrypi:~ $ sudo apt install python-is-python3
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
python-is-python3 is already the newest version (3.11.2-1+deb12u1).
The following packages were automatically installed and are no longer required:
  libbasicusageenvironment1 libgroupsock8 liblivemedia77 python3-v4l2
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 4 not upgraded.
w@raspberrypi:~ $ sudo pip install dronekit
error: externally-managed-environment
  (same PEP 668 message as above)
w@raspberrypi:~ $ pip --version
pip 23.0.1 from /usr/lib/python3/dist-packages/pip (python 3.11)
w@raspberrypi:~ $ sudo apt install -y \
> build-essential \
> python3-dev \
libxml2-dev \
libxslt1-dev \
libgeos-dev \
libproj-dev \
libssl-dev \
zlib1g-dev \
libffi-dev
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
E: Unable to locate package
w@raspberrypi:~ $ sudo apt update
sudo apt install -y \
build-essential \
python3-dev \
libxml2-dev \
libxslt1-dev \
libgeos-dev \
libproj-dev \
libssl-dev \
zlib1g-dev \
libffi-dev
Hit:1 https://packages.microsoft.com/repos/code stable InRelease
Hit:2 http://deb.debian.org/debian bookworm InRelease
Hit:3 http://deb.debian.org/debian-security bookworm-security InRelease
Hit:4 http://deb.debian.org/debian bookworm-updates InRelease
Hit:5 http://archive.raspberrypi.com/debian bookworm InRelease
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
4 packages can be upgraded. Run 'apt list --upgradable' to see them.
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
build-essential is already the newest version (12.9).
python3-dev is already the newest version (3.11.2-1+b1).
libxml2-dev is already the newest version (2.9.14+dfsg-1.3~deb12u2).
libxslt1-dev is already the newest version (1.1.35-1+deb12u1).
libssl-dev is already the newest version (3.0.17-1~deb12u1).
zlib1g-dev is already the newest version (1:1.2.13.dfsg-1+rpt1).
The following packages were automatically installed and are no longer required:
  libbasicusageenvironment1 libgroupsock8 liblivemedia77 python3-v4l2
Use 'sudo apt autoremove' to remove them.
The following additional packages will be installed: libcurl4-gnutls-dev libdeflate-dev libgeos-c1v5 libgeos3.11.1 libjbig-dev libjpeg-dev libjpeg62-turbo-dev liblerc-dev liblzma-dev libproj25 libsqlite3-dev libtiff-dev libtiffxx6 libwebp-dev libzstd-dev proj-data Suggested packages: libcurl4-doc libgnutls28-dev libidn-dev libkrb5-dev libldap2-dev librtmp-dev libssh2-1-dev libgdal-doc liblzma-doc proj-bin sqlite3-doc The following NEW packages will be installed: libcurl4-gnutls-dev libdeflate-dev libffi-dev libgeos-c1v5 libgeos-dev libgeos3.11.1 libjbig-dev libjpeg-dev libjpeg62-turbo-dev liblerc-dev liblzma-dev libproj-dev libproj25 libsqlite3-dev libtiff-dev libtiffxx6 libwebp-dev libzstd-dev proj-data 0 upgraded, 19 newly installed, 0 to remove and 4 not upgraded. Need to get 11.9 MB of archives. After this operation, 43.5 MB of additional disk space will be used. Get:1 http://deb.debian.org/debian bookworm/main arm64 libcurl4-gnutls-dev arm64 7.88.1-10+deb12u12 [472 kB] Get:2 http://deb.debian.org/debian bookworm/main arm64 libdeflate-dev arm64 1.14-1 [50.0 kB] Get:3 http://deb.debian.org/debian bookworm/main arm64 libffi-dev arm64 3.4.4-1 [56.0 kB] Get:4 http://deb.debian.org/debian bookworm/main arm64 libgeos3.11.1 arm64 3.11.1-1 [683 kB] Get:5 http://deb.debian.org/debian bookworm/main arm64 libgeos-c1v5 arm64 3.11.1-1 [75.6 kB] Get:6 http://deb.debian.org/debian bookworm/main arm64 libgeos-dev arm64 3.11.1-1 [52.4 kB] Get:7 http://deb.debian.org/debian bookworm/main arm64 libjbig-dev arm64 2.1-6.1 [29.9 kB] Get:8 http://deb.debian.org/debian bookworm/main arm64 libjpeg62-turbo-dev arm64 1:2.1.5-2 [292 kB] Get:9 http://deb.debian.org/debian bookworm/main arm64 libjpeg-dev arm64 1:2.1.5-2 [71.8 kB] Get:10 http://deb.debian.org/debian bookworm/main arm64 liblerc-dev arm64 4.0.0+ds-2 [146 kB] Get:11 http://deb.debian.org/debian bookworm/main arm64 liblzma-dev arm64 5.4.1-1 [255 kB] Get:12 http://deb.debian.org/debian bookworm/main arm64 proj-data all 9.1.1-1 
[6,212 kB] Get:13 http://deb.debian.org/debian bookworm/main arm64 libproj25 arm64 9.1.1-1+b1 [1,102 kB] Get:14 http://deb.debian.org/debian bookworm/main arm64 libsqlite3-dev arm64 3.40.1-2+deb12u1 [979 kB] Get:15 http://deb.debian.org/debian bookworm/main arm64 libzstd-dev arm64 1.5.4+dfsg2-5 [317 kB] Get:16 http://deb.debian.org/debian bookworm/main arm64 libwebp-dev arm64 1.2.4-0.2+deb12u1 [377 kB] Get:17 http://deb.debian.org/debian bookworm/main arm64 libtiffxx6 arm64 4.5.0-6+deb12u2 [144 kB] Get:18 http://deb.debian.org/debian bookworm/main arm64 libtiff-dev arm64 4.5.0-6+deb12u2 [441 kB] Get:19 http://deb.debian.org/debian bookworm/main arm64 libproj-dev arm64 9.1.1-1+b1 [128 kB] Fetched 11.9 MB in 11min 56s (16.6 kB/s) Selecting previously unselected package libcurl4-gnutls-dev:arm64. (Reading database ... 157355 files and directories currently installed.) Preparing to unpack .../00-libcurl4-gnutls-dev_7.88.1-10+deb12u12_arm64.deb ... Unpacking libcurl4-gnutls-dev:arm64 (7.88.1-10+deb12u12) ... Selecting previously unselected package libdeflate-dev:arm64. Preparing to unpack .../01-libdeflate-dev_1.14-1_arm64.deb ... Unpacking libdeflate-dev:arm64 (1.14-1) ... Selecting previously unselected package libffi-dev:arm64. Preparing to unpack .../02-libffi-dev_3.4.4-1_arm64.deb ... Unpacking libffi-dev:arm64 (3.4.4-1) ... Selecting previously unselected package libgeos3.11.1:arm64. Preparing to unpack .../03-libgeos3.11.1_3.11.1-1_arm64.deb ... Unpacking libgeos3.11.1:arm64 (3.11.1-1) ... Selecting previously unselected package libgeos-c1v5:arm64. Preparing to unpack .../04-libgeos-c1v5_3.11.1-1_arm64.deb ... Unpacking libgeos-c1v5:arm64 (3.11.1-1) ... Selecting previously unselected package libgeos-dev. Preparing to unpack .../05-libgeos-dev_3.11.1-1_arm64.deb ... Unpacking libgeos-dev (3.11.1-1) ... Selecting previously unselected package libjbig-dev:arm64. Preparing to unpack .../06-libjbig-dev_2.1-6.1_arm64.deb ... Unpacking libjbig-dev:arm64 (2.1-6.1) ... 
Selecting previously unselected package libjpeg62-turbo-dev:arm64. Preparing to unpack .../07-libjpeg62-turbo-dev_1%3a2.1.5-2_arm64.deb ... Unpacking libjpeg62-turbo-dev:arm64 (1:2.1.5-2) ... Selecting previously unselected package libjpeg-dev:arm64. Preparing to unpack .../08-libjpeg-dev_1%3a2.1.5-2_arm64.deb ... Unpacking libjpeg-dev:arm64 (1:2.1.5-2) ... Selecting previously unselected package liblerc-dev:arm64. Preparing to unpack .../09-liblerc-dev_4.0.0+ds-2_arm64.deb ... Unpacking liblerc-dev:arm64 (4.0.0+ds-2) ... Selecting previously unselected package liblzma-dev:arm64. Preparing to unpack .../10-liblzma-dev_5.4.1-1_arm64.deb ... Unpacking liblzma-dev:arm64 (5.4.1-1) ... Selecting previously unselected package proj-data. Preparing to unpack .../11-proj-data_9.1.1-1_all.deb ... Unpacking proj-data (9.1.1-1) ... Selecting previously unselected package libproj25:arm64. Preparing to unpack .../12-libproj25_9.1.1-1+b1_arm64.deb ... Unpacking libproj25:arm64 (9.1.1-1+b1) ... Selecting previously unselected package libsqlite3-dev:arm64. Preparing to unpack .../13-libsqlite3-dev_3.40.1-2+deb12u1_arm64.deb ... Unpacking libsqlite3-dev:arm64 (3.40.1-2+deb12u1) ... Selecting previously unselected package libzstd-dev:arm64. Preparing to unpack .../14-libzstd-dev_1.5.4+dfsg2-5_arm64.deb ... Unpacking libzstd-dev:arm64 (1.5.4+dfsg2-5) ... Selecting previously unselected package libwebp-dev:arm64. Preparing to unpack .../15-libwebp-dev_1.2.4-0.2+deb12u1_arm64.deb ... Unpacking libwebp-dev:arm64 (1.2.4-0.2+deb12u1) ... Selecting previously unselected package libtiffxx6:arm64. Preparing to unpack .../16-libtiffxx6_4.5.0-6+deb12u2_arm64.deb ... Unpacking libtiffxx6:arm64 (4.5.0-6+deb12u2) ... Selecting previously unselected package libtiff-dev:arm64. Preparing to unpack .../17-libtiff-dev_4.5.0-6+deb12u2_arm64.deb ... Unpacking libtiff-dev:arm64 (4.5.0-6+deb12u2) ... Selecting previously unselected package libproj-dev:arm64. 
Preparing to unpack .../18-libproj-dev_9.1.1-1+b1_arm64.deb ... Unpacking libproj-dev:arm64 (9.1.1-1+b1) ... Setting up libzstd-dev:arm64 (1.5.4+dfsg2-5) ... Setting up proj-data (9.1.1-1) ... Setting up libgeos3.11.1:arm64 (3.11.1-1) ... Setting up libproj25:arm64 (9.1.1-1+b1) ... Setting up libjbig-dev:arm64 (2.1-6.1) ... Setting up libcurl4-gnutls-dev:arm64 (7.88.1-10+deb12u12) ... Setting up libffi-dev:arm64 (3.4.4-1) ... Setting up libwebp-dev:arm64 (1.2.4-0.2+deb12u1) ... Setting up libsqlite3-dev:arm64 (3.40.1-2+deb12u1) ... Setting up libjpeg62-turbo-dev:arm64 (1:2.1.5-2) ... Setting up libgeos-c1v5:arm64 (3.11.1-1) ... Setting up liblerc-dev:arm64 (4.0.0+ds-2) ... Setting up liblzma-dev:arm64 (5.4.1-1) ... Setting up libtiffxx6:arm64 (4.5.0-6+deb12u2) ... Setting up libdeflate-dev:arm64 (1.14-1) ... Setting up libjpeg-dev:arm64 (1:2.1.5-2) ... Setting up libtiff-dev:arm64 (4.5.0-6+deb12u2) ... Setting up libgeos-dev (3.11.1-1) ... Setting up libproj-dev:arm64 (9.1.1-1+b1) ... Processing triggers for man-db (2.11.2-2) ... Processing triggers for libc-bin (2.36-9+rpt2+deb12u12) ... w@raspberrypi:~ $ sudo apt install python3-dronekit Reading package lists... Done Building dependency tree... Done Reading state information... Done E: Unable to locate package python3-dronekit w@raspberrypi:~ $ sudo pip3 install --no-cache-dir --upgrade setuptools wheel error: externally-managed-environment × This environment is externally managed ╰─> To install Python packages system-wide, try apt install python3-xyz, where xyz is the package you are trying to install. If you wish to install a non-Debian-packaged Python package, create a virtual environment using python3 -m venv path/to/venv. Then use path/to/venv/bin/python and path/to/venv/bin/pip. Make sure you have python3-full installed. For more information visit http://rptl.io/venv note: If you believe this is a mistake, please contact your Python installation or OS distribution provider. 
You can override this, at the risk of breaking your Python installation or OS, by passing --break-system-packages.
hint: See PEP 668 for the detailed specification.

w@raspberrypi:~ $ sudo pip3 install --no-cache-dir dronekit
error: externally-managed-environment

w@raspberrypi:~ $ sudo apt install -y build-essential python3-dev python3-venv
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
build-essential is already the newest version (12.9).
python3-dev is already the newest version (3.11.2-1+b1).
python3-venv is already the newest version (3.11.2-1+b1).
The following packages were automatically installed and are no longer required:
  libbasicusageenvironment1 libgroupsock8 liblivemedia77 python3-v4l2
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 4 not upgraded.
w@raspberrypi:~ $ python3 -m venv --system-site-packages ~/dronekit_safe_env
w@raspberrypi:~ $ source ~/dronekit_safe_env/bin/activate
(dronekit_safe_env) w@raspberrypi:~ $ pip install --no-cache-dir --upgrade pip setuptools wheel cython
Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple
Requirement already satisfied: pip in ./dronekit_safe_env/lib/python3.11/site-packages (23.0.1)
Collecting pip
  Downloading pip-25.2-py3-none-any.whl (1.8 MB)
Requirement already satisfied: setuptools in ./dronekit_safe_env/lib/python3.11/site-packages (66.1.1)
Collecting setuptools
  Downloading setuptools-80.9.0-py3-none-any.whl (1.2 MB)
Requirement already satisfied: wheel in /usr/lib/python3/dist-packages (0.38.4)
Collecting wheel
  Downloading https://www.piwheels.org/simple/wheel/wheel-0.45.1-py3-none-any.whl (72 kB)
Collecting cython
  Downloading cython-3.1.2-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (3.1 MB)
Installing collected packages: wheel, setuptools, pip, cython
  Attempting uninstall: wheel
    Found existing installation: wheel 0.38.4
    Not uninstalling wheel at /usr/lib/python3/dist-packages, outside environment /home/w/dronekit_safe_env
    Can't uninstall 'wheel'. No files were found to uninstall.
  Attempting uninstall: setuptools
    Found existing installation: setuptools 66.1.1
    Uninstalling setuptools-66.1.1:
      Successfully uninstalled setuptools-66.1.1
  Attempting uninstall: pip
    Found existing installation: pip 23.0.1
    Uninstalling pip-23.0.1:
      Successfully uninstalled pip-23.0.1
Successfully installed cython-3.1.2 pip-25.2 setuptools-80.9.0 wheel-0.45.1
(dronekit_safe_env) w@raspberrypi:~ $ git clone https://github.com/mavlink/pymavlink.git
Cloning into 'pymavlink'...
Username for 'https://github.com': cd pymavlink
Password for 'https://cd%20pymavlink@github.com':
remote: Invalid username or token. Password authentication is not supported for Git operations.
fatal: Authentication failed for 'https://github.com/mavlink/pymavlink.git/'
(dronekit_safe_env) w@raspberrypi:~ $ git clone https://github.com/mavlink/pymavlink.git
cd pymavlink
python setup.py bdist_wheel
pip install dist/pymavlink-*.whl --no-cache-dir
cd ..
Cloning into 'pymavlink'...
Username for 'https://github.com':
Password for 'https://github.com':
remote: Repository not found.
fatal: Authentication failed for 'https://github.com/mavlink/pymavlink.git/'
-bash: cd: pymavlink: No such file or directory
python: can't open file '/home/w/setup.py': [Errno 2] No such file or directory
WARNING: Requirement 'dist/pymavlink-*.whl' looks like a filename, but the file does not exist
ERROR: Invalid wheel filename (wrong number of parts): 'pymavlink-*'
(dronekit_safe_env) w@raspberrypi:/home $ git clone https://github.com/mavlink/pymavlink.git
fatal: could not create work tree dir 'pymavlink': Permission denied
(dronekit_safe_env) w@raspberrypi:/home $ cd ~
(dronekit_safe_env) w@raspberrypi:~ $ git clone https://github.com/mavlink/pymavlink.git
Cloning into 'pymavlink'...
Username for 'https://github.com': cd pymavlink
Password for 'https://cd%20pymavlink@github.com':
(dronekit_safe_env) w@raspberrypi:~ $ git clone https://github.com/mavlink/pymavlink.git
Cloning into 'pymavlink'...
Username for 'https://github.com': ^C
(dronekit_safe_env) w@raspberrypi:~ $ git clone https://github.com/mavlink/pymavlink.git
Cloning into 'pymavlink'...
Username for 'https://github.com': w
Password for 'https://w@github.com':
remote: Invalid username or token. Password authentication is not supported for Git operations.
fatal: Authentication failed for 'https://github.com/mavlink/pymavlink.git/'
(dronekit_safe_env) w@raspberrypi:~ $ rm -rf pymavlink
(dronekit_safe_env) w@raspberrypi:~ $ sudo apt install -y \
    build-essential \
    python3-dev \
    libxml2-dev \
    libxslt1-dev \
    libgeos-dev \
    libproj-dev \
    libssl-dev \
    zlib1g-dev \
    libffi-dev
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
build-essential is already the newest version (12.9).
python3-dev is already the newest version (3.11.2-1+b1).
libxml2-dev is already the newest version (2.9.14+dfsg-1.3~deb12u2).
libxslt1-dev is already the newest version (1.1.35-1+deb12u1).
libgeos-dev is already the newest version (3.11.1-1).
libproj-dev is already the newest version (9.1.1-1+b1).
libssl-dev is already the newest version (3.0.17-1~deb12u1).
zlib1g-dev is already the newest version (1:1.2.13.dfsg-1+rpt1).
libffi-dev is already the newest version (3.4.4-1).
The following packages were automatically installed and are no longer required:
  libbasicusageenvironment1 libgroupsock8 liblivemedia77 python3-v4l2
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 4 not upgraded.
(dronekit_safe_env) w@raspberrypi:~ $ sudo pip3 install --no-cache-dir --upgrade setuptools wheel
error: externally-managed-environment
hint: See PEP 668 for the detailed specification.
(dronekit_safe_env) w@raspberrypi:~ $ sudo apt install python3-dronekit
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
E: Unable to locate package python3-dronekit
(dronekit_safe_env) w@raspberrypi:~ $ sudo pip3 install --no-cache-dir Cython numpy
error: externally-managed-environment
(dronekit_safe_env) w@raspberrypi:~ $ sudo pip3 install --no-cache-dir --upgrade setuptools wheel
error: externally-managed-environment
(dronekit_safe_env) w@raspberrypi:~ $ source ~/dronekit_safe_env/bin/activate
(dronekit_safe_env) w@raspberrypi:~ $ pip install --no-cache-dir --upgrade setuptools wheel
Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple
Requirement already satisfied: setuptools in ./dronekit_safe_env/lib/python3.11/site-packages (80.9.0)
Requirement already satisfied: wheel in ./dronekit_safe_env/lib/python3.11/site-packages (0.45.1)
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProtocolError('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))': /simple/wheel/
WARNING: Error parsing dependencies of send2trash: Expected matching RIGHT_PARENTHESIS for LEFT_PARENTHESIS, after version specifier
    sys-platform (=="darwin") ; extra == 'objc'
                 ~^
(dronekit_safe_env) w@raspberrypi:~ $ pip install --no-cache-dir --upgrade setuptools wheel
Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple
Requirement already satisfied: setuptools in ./dronekit_safe_env/lib/python3.11/site-packages (80.9.0)
Requirement already satisfied: wheel in ./dronekit_safe_env/lib/python3.11/site-packages (0.45.1)
(dronekit_safe_env) w@raspberrypi:~ $ python3 -m venv ~/dronekit_safe_env
(dronekit_safe_env) w@raspberrypi:~ $ source ~/dronekit_safe_env/bin/activate
(dronekit_safe_env) w@raspberrypi:~ $ pip install --upgrade pip setuptools wheel
Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple
Requirement already satisfied: pip in ./dronekit_safe_env/lib/python3.11/site-packages (25.2)
Requirement already satisfied: setuptools in ./dronekit_safe_env/lib/python3.11/site-packages (80.9.0)
Requirement already satisfied: wheel in ./dronekit_safe_env/lib/python3.11/site-packages (0.45.1)
(dronekit_safe_env) w@raspberrypi:~ $ pip list | grep -E "setuptools|wheel"
setuptools   80.9.0
wheel        0.45.1
(dronekit_safe_env) w@raspberrypi:~ $ sudo pip3 install --no-cache-dir Cython numpy
error: externally-managed-environment
(dronekit_safe_env) w@raspberrypi:~ $
08-03
Quickstart

Note
The data files used in the quickstart guide are updated from time to time, which means that the adjusted close changes and with it the close (and the other components). That means that the actual output may be different from what was put in the documentation at the time of writing.

Using the platform

Let's run through a series of examples (from an almost empty one to a fully fledged strategy), but not without first roughly explaining two basic concepts when working with backtrader.

Lines
Data Feeds, Indicators and Strategies have lines. A line is a succession of points that, when joined together, form this line. When talking about the markets, a Data Feed usually has the following set of points per day: Open, High, Low, Close, Volume, OpenInterest. The series of "Open"s along time is a Line, and therefore a Data Feed usually has 6 lines. If we also consider "DateTime" (which is the actual reference for a single point), we could count 7 lines.

Index 0 Approach
When accessing the values in a line, the current value is accessed with index 0, and the "last" output value is accessed with -1. This is in line with Python conventions for iterables (a line can be iterated over and is therefore an iterable), where index -1 is used to access the "last" item of the iterable/array. In our case it is the last output value that gets accessed. Index 0, being right after -1, is used to access the current moment in the line.

With that in mind, imagine a Strategy featuring a Simple Moving Average created during initialization:

    self.sma = SimpleMovingAverage(.....)

The easiest and simplest way to access the current value of this moving average:

    av = self.sma[0]

There is no need to know how many bars/minutes/days/months have been processed, because "0" uniquely identifies the current instant.
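The 0/-1 indexing convention described above can be mimicked with a few lines of plain Python. This sketch (the class name and layout are illustrative, not backtrader internals) keeps appended values and maps index 0 to the newest one:

```python
class Line:
    """Minimal sketch of backtrader-style line indexing (illustrative only).

    Index 0 is the value at the current bar, -1 the previous bar, -2 the
    one before that: indices count backwards from "now".
    """

    def __init__(self):
        self._values = []

    def append(self, value):
        self._values.append(value)

    def __getitem__(self, ago):
        # ago == 0 -> current value; ago == -1 -> previous value, etc.
        return self._values[len(self._values) - 1 + ago]


line = Line()
for close in (27.85, 25.39, 24.05):   # closes from the sample output below
    line.append(close)

print(line[0])   # current close
print(line[-1])  # previous close
print(line[-2])  # the close before that
```

Note the inversion relative to normal Python lists: here the index says "how many bars ago", so growth of the underlying buffer never changes what 0 means.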
Following pythonic tradition, the "last" output value is accessed using -1:

    previous_value = self.sma[-1]

Of course, earlier output values can be accessed with -2, -3, ...

From 0 to 100: the samples

Basic Setup
Let's get running.

    from __future__ import (absolute_import, division, print_function,
                            unicode_literals)

    import backtrader as bt

    if __name__ == '__main__':
        cerebro = bt.Cerebro()

        print('Starting Portfolio Value: %.2f' % cerebro.broker.getvalue())

        cerebro.run()

        print('Final Portfolio Value: %.2f' % cerebro.broker.getvalue())

After the execution the output is:

    Starting Portfolio Value: 10000.00
    Final Portfolio Value: 10000.00

In this example:

- backtrader was imported
- The Cerebro engine was instantiated
- The resulting cerebro instance was told to run (loop over data)
- The resulting outcome was printed out

Although it doesn't seem like much, let's point out something explicitly shown:

- The Cerebro engine has created a broker instance in the background
- The instance already has some cash to start with

This behind-the-scenes broker instantiation is a constant trait in the platform to simplify the life of the user. If no broker is set by the user, a default one is put in place, and 10K monetary units is a usual value with some brokers to begin with.

Setting the Cash
In the world of finance, for sure only "losers" start with 10k. Let's change the cash and run the example again.

    from __future__ import (absolute_import, division, print_function,
                            unicode_literals)

    import backtrader as bt

    if __name__ == '__main__':
        cerebro = bt.Cerebro()

        cerebro.broker.setcash(100000.0)

        print('Starting Portfolio Value: %.2f' % cerebro.broker.getvalue())

        cerebro.run()

        print('Final Portfolio Value: %.2f' % cerebro.broker.getvalue())

After the execution the output is:

    Starting Portfolio Value: 100000.00
    Final Portfolio Value: 100000.00

Mission accomplished. Let's move to tempestuous waters.
Adding a Data Feed
Having cash is fun, but the purpose behind all this is to let an automated strategy multiply the cash without moving a finger, by operating on an asset which we see as a Data Feed. Ergo ... No Data Feed -> No Fun. Let's add one to the ever-growing example.

    from __future__ import (absolute_import, division, print_function,
                            unicode_literals)

    import datetime  # For datetime objects
    import os.path  # To manage paths
    import sys  # To find out the script name (in argv[0])

    # Import the backtrader platform
    import backtrader as bt

    if __name__ == '__main__':
        # Create a cerebro entity
        cerebro = bt.Cerebro()

        # Datas are in a subfolder of the samples. Need to find where the script is
        # because it could have been called from anywhere
        modpath = os.path.dirname(os.path.abspath(sys.argv[0]))
        datapath = os.path.join(modpath, '../../datas/orcl-1995-2014.txt')

        # Create a Data Feed
        data = bt.feeds.YahooFinanceCSVData(
            dataname=datapath,
            # Do not pass values before this date
            fromdate=datetime.datetime(2000, 1, 1),
            # Do not pass values after this date
            todate=datetime.datetime(2000, 12, 31),
            reverse=False)

        # Add the Data Feed to Cerebro
        cerebro.adddata(data)

        # Set our desired cash start
        cerebro.broker.setcash(100000.0)

        # Print out the starting conditions
        print('Starting Portfolio Value: %.2f' % cerebro.broker.getvalue())

        # Run over everything
        cerebro.run()

        # Print out the final result
        print('Final Portfolio Value: %.2f' % cerebro.broker.getvalue())

After the execution the output is:

    Starting Portfolio Value: 100000.00
    Final Portfolio Value: 100000.00

The amount of boilerplate has grown slightly, because we added:

- Finding out where our example script is, to be able to locate the sample Data Feed file
- Having datetime objects to filter which data from the Data Feed we will be operating on

Aside from that, the Data Feed is created and added to cerebro. The output has not changed, and it would be a miracle if it had.
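What the fromdate/todate pair does can be sketched with the standard library alone: read rows in the Yahoo CSV layout and keep only those whose date falls in the window. This is an illustrative sketch, not backtrader's actual feed code, and the inline sample rows are made up for the demonstration:

```python
import csv
import datetime
import io

# Inline sample in Yahoo CSV layout (Date,Open,High,Low,Close,Adj Close,Volume).
SAMPLE = """Date,Open,High,Low,Close,Adj Close,Volume
1999-12-31,30.00,30.50,29.50,30.25,30.25,1000
2000-01-03,28.00,28.50,27.00,27.85,27.85,1500
2000-01-04,26.00,26.50,25.00,25.39,25.39,1200
2001-01-02,24.00,24.50,23.50,24.05,24.05,1100
"""


def load_closes(fileobj, fromdate, todate):
    """Return (date, close) pairs whose date lies inside [fromdate, todate]."""
    rows = []
    for rec in csv.DictReader(fileobj):
        dt = datetime.datetime.strptime(rec['Date'], '%Y-%m-%d').date()
        if fromdate <= dt <= todate:
            rows.append((dt, float(rec['Close'])))
    return rows


bars = load_closes(io.StringIO(SAMPLE),
                   datetime.date(2000, 1, 1), datetime.date(2000, 12, 31))
print(bars)  # only the year-2000 rows survive the filter
```

The 1999 and 2001 rows are dropped, which is exactly why the example above only ever prints closes between 2000-01-03 and 2000-12-29.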
Note
Yahoo Online sends the CSV data in date-descending order, which is not the standard convention. The reverse=False parameter indicates that the CSV data in the file has already been reversed and is in the standard, expected date-ascending order.

Our First Strategy
The cash is in the broker and the Data Feed is there. It seems like risky business is just around the corner. Let's put a Strategy into the equation and print the "Close" price of each day (bar).

DataSeries (the underlying class in Data Feeds) objects have aliases to access the well-known OHLC (Open High Low Close) daily values. This should ease the creation of our printing logic.

    from __future__ import (absolute_import, division, print_function,
                            unicode_literals)

    import datetime  # For datetime objects
    import os.path  # To manage paths
    import sys  # To find out the script name (in argv[0])

    # Import the backtrader platform
    import backtrader as bt


    # Create a Strategy
    class TestStrategy(bt.Strategy):

        def log(self, txt, dt=None):
            ''' Logging function for this strategy'''
            dt = dt or self.datas[0].datetime.date(0)
            print('%s, %s' % (dt.isoformat(), txt))

        def __init__(self):
            # Keep a reference to the "close" line in the data[0] dataseries
            self.dataclose = self.datas[0].close

        def next(self):
            # Simply log the closing price of the series from the reference
            self.log('Close, %.2f' % self.dataclose[0])


    if __name__ == '__main__':
        # Create a cerebro entity
        cerebro = bt.Cerebro()

        # Add a strategy
        cerebro.addstrategy(TestStrategy)

        # Datas are in a subfolder of the samples. Need to find where the script is
        # because it could have been called from anywhere
        modpath = os.path.dirname(os.path.abspath(sys.argv[0]))
        datapath = os.path.join(modpath, '../../datas/orcl-1995-2014.txt')

        # Create a Data Feed
        data = bt.feeds.YahooFinanceCSVData(
            dataname=datapath,
            # Do not pass values before this date
            fromdate=datetime.datetime(2000, 1, 1),
            # Do not pass values after this date
            todate=datetime.datetime(2000, 12, 31),
            reverse=False)

        # Add the Data Feed to Cerebro
        cerebro.adddata(data)

        # Set our desired cash start
        cerebro.broker.setcash(100000.0)

        # Print out the starting conditions
        print('Starting Portfolio Value: %.2f' % cerebro.broker.getvalue())

        # Run over everything
        cerebro.run()

        # Print out the final result
        print('Final Portfolio Value: %.2f' % cerebro.broker.getvalue())

After the execution the output is:

    Starting Portfolio Value: 100000.00
    2000-01-03T00:00:00, Close, 27.85
    2000-01-04T00:00:00, Close, 25.39
    2000-01-05T00:00:00, Close, 24.05
    ...
    ...
    ...
    2000-12-26T00:00:00, Close, 29.17
    2000-12-27T00:00:00, Close, 28.94
    2000-12-28T00:00:00, Close, 29.29
    2000-12-29T00:00:00, Close, 27.41
    Final Portfolio Value: 100000.00

Someone said the stock market was risky business, but it doesn't seem so. Let's explain some of the magic:

- Upon __init__ being called, the strategy already has a list of datas that are present in the platform. This is a standard Python list, and datas can be accessed in the order they were inserted.
- The first data in the list, self.datas[0], is the default data for trading operations and the one that keeps all strategy elements synchronized (it's the system clock).
- self.dataclose = self.datas[0].close keeps a reference to the close line. Only one level of indirection is later needed to access the close values.
- The strategy's next method will be called on each bar of the system clock (self.datas[0]). This is true until other things come into play, like indicators, which need some bars to start producing an output.
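The "next() is called once per bar" contract can be pictured with a stripped-down driver loop. This is an illustrative sketch, not backtrader internals; in particular, the real next() takes no arguments and reads self.datas[0], while here the close is passed in explicitly to keep the sketch self-contained:

```python
class MiniStrategy:
    """Toy strategy: records every close it is shown, one per next() call."""

    def __init__(self):
        self.seen = []

    def next(self, close):
        # The real framework would expose the bar via self.datas[0] instead
        self.seen.append(close)


def run(strategy, closes):
    # The engine's core loop: advance the "system clock" one bar at a time
    # and hand control to the strategy at each step.
    for close in closes:
        strategy.next(close)


strat = MiniStrategy()
run(strat, [27.85, 25.39, 24.05])
print(len(strat.seen))  # next() was called once per bar
```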
More on that later.

Adding some Logic to the Strategy
Let's try a crazy idea we had by looking at some charts: if the price has been falling 3 sessions in a row ... BUY BUY BUY!!!

    from __future__ import (absolute_import, division, print_function,
                            unicode_literals)

    import datetime  # For datetime objects
    import os.path  # To manage paths
    import sys  # To find out the script name (in argv[0])

    # Import the backtrader platform
    import backtrader as bt


    # Create a Strategy
    class TestStrategy(bt.Strategy):

        def log(self, txt, dt=None):
            ''' Logging function for this strategy'''
            dt = dt or self.datas[0].datetime.date(0)
            print('%s, %s' % (dt.isoformat(), txt))

        def __init__(self):
            # Keep a reference to the "close" line in the data[0] dataseries
            self.dataclose = self.datas[0].close

        def next(self):
            # Simply log the closing price of the series from the reference
            self.log('Close, %.2f' % self.dataclose[0])

            if self.dataclose[0] < self.dataclose[-1]:
                # current close less than previous close

                if self.dataclose[-1] < self.dataclose[-2]:
                    # previous close less than the previous close

                    # BUY, BUY, BUY!!! (with all possible default parameters)
                    self.log('BUY CREATE, %.2f' % self.dataclose[0])
                    self.buy()


    if __name__ == '__main__':
        # Create a cerebro entity
        cerebro = bt.Cerebro()

        # Add a strategy
        cerebro.addstrategy(TestStrategy)

        # Datas are in a subfolder of the samples. Need to find where the script is
        # because it could have been called from anywhere
        modpath = os.path.dirname(os.path.abspath(sys.argv[0]))
        datapath = os.path.join(modpath, '../../datas/orcl-1995-2014.txt')

        # Create a Data Feed
        data = bt.feeds.YahooFinanceCSVData(
            dataname=datapath,
            # Do not pass values before this date
            fromdate=datetime.datetime(2000, 1, 1),
            # Do not pass values after this date
            todate=datetime.datetime(2000, 12, 31),
            reverse=False)

        # Add the Data Feed to Cerebro
        cerebro.adddata(data)

        # Set our desired cash start
        cerebro.broker.setcash(100000.0)

        # Print out the starting conditions
        print('Starting Portfolio Value: %.2f' % cerebro.broker.getvalue())

        # Run over everything
        cerebro.run()

        # Print out the final result
        print('Final Portfolio Value: %.2f' % cerebro.broker.getvalue())

After the execution the output is:

    Starting Portfolio Value: 100000.00
    2000-01-03, Close, 27.85
    2000-01-04, Close, 25.39
    2000-01-05, Close, 24.05
    2000-01-05, BUY CREATE, 24.05
    2000-01-06, Close, 22.63
    2000-01-06, BUY CREATE, 22.63
    2000-01-07, Close, 24.37
    ...
    ...
    ...
    2000-12-20, BUY CREATE, 26.88
    2000-12-21, Close, 27.82
    2000-12-22, Close, 30.06
    2000-12-26, Close, 29.17
    2000-12-27, Close, 28.94
    2000-12-27, BUY CREATE, 28.94
    2000-12-28, Close, 29.29
    2000-12-29, Close, 27.41
    Final Portfolio Value: 99725.08

Several "BUY" creation orders were issued, and our portfolio value was decremented. A couple of important things are clearly missing: the order was created, but it is unknown whether it was executed, when, and at what price. The next example will build upon that by listening to notifications of order status.

The curious reader may ask how many shares are being bought, what asset is being bought, and how orders are being executed.
Where possible (and in this case it is) the platform fills in the gaps:

- self.datas[0] (the main data, aka system clock) is the target asset if no other one is specified
- The stake is provided behind the scenes by a position sizer which uses a fixed stake, the default being "1". It will be modified in a later example
- The order is executed "At Market". The broker (shown in previous examples) executes this using the opening price of the next bar, because that's the first tick after the currently examined bar
- The order is executed so far without any commission (more on that later)

Do not only buy ... but SELL
After knowing how to enter the market (long), an "exit concept" is needed, as well as a way to know whether the strategy is in the market:

- Luckily, a Strategy object offers access to a position attribute for the default data feed
- Methods buy and sell return the created (not yet executed) order
- Changes in orders' status will be notified to the strategy via a notify method

The "exit concept" will be an easy one: exit after 5 bars (on the 6th bar) have elapsed, for good or for worse. Please notice that there is no "time" or "timeframe" implied, just a number of bars. The bars can represent 1 minute, 1 hour, 1 day, 1 week or any other time period. Although we know the data source is a daily one, the strategy makes no assumption about that.

Additionally, and to simplify: only allow a Buy order if not yet in the market.

Note
The next method gets no "bar index" passed, so it may seem obscure how to tell when 5 bars have elapsed, but this has been modeled in a pythonic way: call len on an object and it will tell you the length of its lines. Just write down (save in a variable) at which length an operation took place, and see if the current length is 5 bars away.
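The entry/exit bookkeeping just described can be sketched in plain Python without backtrader: buy after three falling closes when flat, record the bar count at which the buy happened, and sell once 5 more bars have elapsed. Names and the toy price series are illustrative; also, for simplicity the buy is recorded at the signal bar, whereas the real broker fills at the next bar's open:

```python
def run_toy_strategy(closes):
    """Toy backtest of the rules above; returns a list of (bar, action) events.

    Entry: the close fell on each of the last two sessions (3 closes in a
    falling row) and we are flat.  Exit: 5 bars after the recorded buy bar.
    """
    events = []
    in_market = False
    bar_executed = None

    for i in range(len(closes)):
        if not in_market:
            # Need today plus the two previous sessions to check the pattern
            if i >= 2 and closes[i] < closes[i - 1] < closes[i - 2]:
                events.append((i, 'BUY'))
                in_market = True
                bar_executed = i
        else:
            # len(self) in backtrader grows by one per bar; here i plays
            # that role, so "5 bars elapsed" is i >= bar_executed + 5
            if i >= bar_executed + 5:
                events.append((i, 'SELL'))
                in_market = False

    return events


closes = [27.85, 25.39, 24.05, 22.63, 24.37, 27.29, 26.49, 24.90, 24.77, 25.18]
print(run_toy_strategy(closes))
```

The "write down at which length the operation took place" trick from the note is exactly the bar_executed variable here.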
    from __future__ import (absolute_import, division, print_function,
                            unicode_literals)

    import datetime  # For datetime objects
    import os.path  # To manage paths
    import sys  # To find out the script name (in argv[0])

    # Import the backtrader platform
    import backtrader as bt


    # Create a Strategy
    class TestStrategy(bt.Strategy):

        def log(self, txt, dt=None):
            ''' Logging function for this strategy'''
            dt = dt or self.datas[0].datetime.date(0)
            print('%s, %s' % (dt.isoformat(), txt))

        def __init__(self):
            # Keep a reference to the "close" line in the data[0] dataseries
            self.dataclose = self.datas[0].close

            # To keep track of pending orders
            self.order = None

        def notify_order(self, order):
            if order.status in [order.Submitted, order.Accepted]:
                # Buy/Sell order submitted/accepted to/by broker - Nothing to do
                return

            # Check if an order has been completed
            # Attention: broker could reject order if not enough cash
            if order.status in [order.Completed]:
                if order.isbuy():
                    self.log('BUY EXECUTED, %.2f' % order.executed.price)
                elif order.issell():
                    self.log('SELL EXECUTED, %.2f' % order.executed.price)

                self.bar_executed = len(self)

            elif order.status in [order.Canceled, order.Margin, order.Rejected]:
                self.log('Order Canceled/Margin/Rejected')

            # Write down: no pending order
            self.order = None

        def next(self):
            # Simply log the closing price of the series from the reference
            self.log('Close, %.2f' % self.dataclose[0])

            # Check if an order is pending ... if yes, we cannot send a 2nd one
            if self.order:
                return

            # Check if we are in the market
            if not self.position:

                # Not yet ... we MIGHT BUY if ...
                if self.dataclose[0] < self.dataclose[-1]:
                    # current close less than previous close

                    if self.dataclose[-1] < self.dataclose[-2]:
                        # previous close less than the previous close

                        # BUY, BUY, BUY!!! (with default parameters)
                        self.log('BUY CREATE, %.2f' % self.dataclose[0])

                        # Keep track of the created order to avoid a 2nd order
                        self.order = self.buy()

            else:

                # Already in the market ... we might sell
                if len(self) >= (self.bar_executed + 5):
                    # SELL, SELL, SELL!!! (with all possible default parameters)
                    self.log('SELL CREATE, %.2f' % self.dataclose[0])

                    # Keep track of the created order to avoid a 2nd order
                    self.order = self.sell()


    if __name__ == '__main__':
        # Create a cerebro entity
        cerebro = bt.Cerebro()

        # Add a strategy
        cerebro.addstrategy(TestStrategy)

        # Datas are in a subfolder of the samples. Need to find where the script is
        # because it could have been called from anywhere
        modpath = os.path.dirname(os.path.abspath(sys.argv[0]))
        datapath = os.path.join(modpath, '../../datas/orcl-1995-2014.txt')

        # Create a Data Feed
        data = bt.feeds.YahooFinanceCSVData(
            dataname=datapath,
            # Do not pass values before this date
            fromdate=datetime.datetime(2000, 1, 1),
            # Do not pass values after this date
            todate=datetime.datetime(2000, 12, 31),
            reverse=False)

        # Add the Data Feed to Cerebro
        cerebro.adddata(data)

        # Set our desired cash start
        cerebro.broker.setcash(100000.0)

        # Print out the starting conditions
        print('Starting Portfolio Value: %.2f' % cerebro.broker.getvalue())

        # Run over everything
        cerebro.run()

        # Print out the final result
        print('Final Portfolio Value: %.2f' % cerebro.broker.getvalue())

After the execution the output is:

    Starting Portfolio Value: 100000.00
    2000-01-03T00:00:00, Close, 27.85
    2000-01-04T00:00:00, Close, 25.39
    2000-01-05T00:00:00, Close, 24.05
    2000-01-05T00:00:00, BUY CREATE, 24.05
    2000-01-06T00:00:00, BUY EXECUTED, 23.61
    2000-01-06T00:00:00, Close, 22.63
    2000-01-07T00:00:00, Close, 24.37
    2000-01-10T00:00:00, Close, 27.29
    2000-01-11T00:00:00, Close, 26.49
    2000-01-12T00:00:00, Close, 24.90
    2000-01-13T00:00:00, Close, 24.77
    2000-01-13T00:00:00, SELL CREATE, 24.77
    2000-01-14T00:00:00, SELL EXECUTED, 25.70
    2000-01-14T00:00:00, Close, 25.18
    ...
    ...
    ...
    2000-12-15T00:00:00, SELL CREATE, 26.93
    2000-12-18T00:00:00, SELL EXECUTED, 28.29
    2000-12-18T00:00:00, Close, 30.18
    2000-12-19T00:00:00, Close, 28.88
    2000-12-20T00:00:00, Close, 26.88
    2000-12-20T00:00:00, BUY CREATE, 26.88
    2000-12-21T00:00:00, BUY EXECUTED, 26.23
    2000-12-21T00:00:00, Close, 27.82
    2000-12-22T00:00:00, Close, 30.06
    2000-12-26T00:00:00, Close, 29.17
    2000-12-27T00:00:00, Close, 28.94
    2000-12-28T00:00:00, Close, 29.29
    2000-12-29T00:00:00, Close, 27.41
    2000-12-29T00:00:00, SELL CREATE, 27.41
    Final Portfolio Value: 100018.53

Blistering Barnacles!!! The system made money ... something must be wrong.

The broker says: Show me the money! And the money is called "commission". Let's add a reasonable 0.1% commission rate per operation (both for buying and selling ... yes, the broker is greedy ...). A single line will suffice:

    # 0.1% ... divide by 100 to remove the %
    cerebro.broker.setcommission(commission=0.001)

Being experienced with the platform, we want to see the profit or loss after a buy/sell cycle, with and without commission.
from __future__ import (absolute_import, division, print_function,
                        unicode_literals)

import datetime  # For datetime objects
import os.path  # To manage paths
import sys  # To find out the script name (in argv[0])

# Import the backtrader platform
import backtrader as bt


# Create a Strategy
class TestStrategy(bt.Strategy):

    def log(self, txt, dt=None):
        '''Logging function for this strategy'''
        dt = dt or self.datas[0].datetime.date(0)
        print('%s, %s' % (dt.isoformat(), txt))

    def __init__(self):
        # Keep a reference to the "close" line in the data[0] dataseries
        self.dataclose = self.datas[0].close

        # To keep track of pending orders and buy price/commission
        self.order = None
        self.buyprice = None
        self.buycomm = None

    def notify_order(self, order):
        if order.status in [order.Submitted, order.Accepted]:
            # Buy/Sell order submitted/accepted to/by broker - Nothing to do
            return

        # Check if an order has been completed
        # Attention: broker could reject order if not enough cash
        if order.status in [order.Completed]:
            if order.isbuy():
                self.log(
                    'BUY EXECUTED, Price: %.2f, Cost: %.2f, Comm %.2f' %
                    (order.executed.price,
                     order.executed.value,
                     order.executed.comm))

                self.buyprice = order.executed.price
                self.buycomm = order.executed.comm
            else:  # Sell
                self.log('SELL EXECUTED, Price: %.2f, Cost: %.2f, Comm %.2f' %
                         (order.executed.price,
                          order.executed.value,
                          order.executed.comm))

            self.bar_executed = len(self)

        elif order.status in [order.Canceled, order.Margin, order.Rejected]:
            self.log('Order Canceled/Margin/Rejected')

        self.order = None

    def notify_trade(self, trade):
        if not trade.isclosed:
            return

        self.log('OPERATION PROFIT, GROSS %.2f, NET %.2f' %
                 (trade.pnl, trade.pnlcomm))

    def next(self):
        # Simply log the closing price of the series from the reference
        self.log('Close, %.2f' % self.dataclose[0])

        # Check if an order is pending ... if yes, we cannot send a 2nd one
        if self.order:
            return

        # Check if we are in the market
        if not self.position:

            # Not yet ... we MIGHT BUY if ...
            if self.dataclose[0] < self.dataclose[-1]:
                # current close less than previous close

                if self.dataclose[-1] < self.dataclose[-2]:
                    # previous close less than the previous close

                    # BUY, BUY, BUY!!! (with default parameters)
                    self.log('BUY CREATE, %.2f' % self.dataclose[0])

                    # Keep track of the created order to avoid a 2nd order
                    self.order = self.buy()

        else:

            # Already in the market ... we might sell
            if len(self) >= (self.bar_executed + 5):
                # SELL, SELL, SELL!!! (with all possible default parameters)
                self.log('SELL CREATE, %.2f' % self.dataclose[0])

                # Keep track of the created order to avoid a 2nd order
                self.order = self.sell()


if __name__ == '__main__':
    # Create a cerebro entity
    cerebro = bt.Cerebro()

    # Add a strategy
    cerebro.addstrategy(TestStrategy)

    # Datas are in a subfolder of the samples. Need to find where the script is
    # because it could have been called from anywhere
    modpath = os.path.dirname(os.path.abspath(sys.argv[0]))
    datapath = os.path.join(modpath, '../../datas/orcl-1995-2014.txt')

    # Create a Data Feed
    data = bt.feeds.YahooFinanceCSVData(
        dataname=datapath,
        # Do not pass values before this date
        fromdate=datetime.datetime(2000, 1, 1),
        # Do not pass values after this date
        todate=datetime.datetime(2000, 12, 31),
        reverse=False)

    # Add the Data Feed to Cerebro
    cerebro.adddata(data)

    # Set our desired cash start
    cerebro.broker.setcash(100000.0)

    # Set the commission - 0.1% ... divide by 100 to remove the %
    cerebro.broker.setcommission(commission=0.001)

    # Print out the starting conditions
    print('Starting Portfolio Value: %.2f' % cerebro.broker.getvalue())

    # Run over everything
    cerebro.run()

    # Print out the final result
    print('Final Portfolio Value: %.2f' % cerebro.broker.getvalue())

After the execution the output is:

Starting Portfolio Value: 100000.00
2000-01-03T00:00:00, Close, 27.85
2000-01-04T00:00:00, Close, 25.39
2000-01-05T00:00:00, Close, 24.05
2000-01-05T00:00:00, BUY CREATE, 24.05
2000-01-06T00:00:00, BUY EXECUTED, Price: 23.61, Cost: 23.61, Commission 0.02
2000-01-06T00:00:00, Close, 22.63
2000-01-07T00:00:00, Close, 24.37
2000-01-10T00:00:00, Close, 27.29
2000-01-11T00:00:00, Close, 26.49
2000-01-12T00:00:00, Close, 24.90
2000-01-13T00:00:00, Close, 24.77
2000-01-13T00:00:00, SELL CREATE, 24.77
2000-01-14T00:00:00, SELL EXECUTED, Price: 25.70, Cost: 25.70, Commission 0.03
2000-01-14T00:00:00, OPERATION PROFIT, GROSS 2.09, NET 2.04
2000-01-14T00:00:00, Close, 25.18
...
...
...
2000-12-15T00:00:00, SELL CREATE, 26.93
2000-12-18T00:00:00, SELL EXECUTED, Price: 28.29, Cost: 28.29, Commission 0.03
2000-12-18T00:00:00, OPERATION PROFIT, GROSS -0.06, NET -0.12
2000-12-18T00:00:00, Close, 30.18
2000-12-19T00:00:00, Close, 28.88
2000-12-20T00:00:00, Close, 26.88
2000-12-20T00:00:00, BUY CREATE, 26.88
2000-12-21T00:00:00, BUY EXECUTED, Price: 26.23, Cost: 26.23, Commission 0.03
2000-12-21T00:00:00, Close, 27.82
2000-12-22T00:00:00, Close, 30.06
2000-12-26T00:00:00, Close, 29.17
2000-12-27T00:00:00, Close, 28.94
2000-12-28T00:00:00, Close, 29.29
2000-12-29T00:00:00, Close, 27.41
2000-12-29T00:00:00, SELL CREATE, 27.41
Final Portfolio Value: 100016.98

God Save the Queen!!! The system still made money.
Before moving on, let’s notice something by filtering the “OPERATION PROFIT” lines:

2000-01-14T00:00:00, OPERATION PROFIT, GROSS 2.09, NET 2.04
2000-02-07T00:00:00, OPERATION PROFIT, GROSS 3.68, NET 3.63
2000-02-28T00:00:00, OPERATION PROFIT, GROSS 4.48, NET 4.42
2000-03-13T00:00:00, OPERATION PROFIT, GROSS 3.48, NET 3.41
2000-03-22T00:00:00, OPERATION PROFIT, GROSS -0.41, NET -0.49
2000-04-07T00:00:00, OPERATION PROFIT, GROSS 2.45, NET 2.37
2000-04-20T00:00:00, OPERATION PROFIT, GROSS -1.95, NET -2.02
2000-05-02T00:00:00, OPERATION PROFIT, GROSS 5.46, NET 5.39
2000-05-11T00:00:00, OPERATION PROFIT, GROSS -3.74, NET -3.81
2000-05-30T00:00:00, OPERATION PROFIT, GROSS -1.46, NET -1.53
2000-07-05T00:00:00, OPERATION PROFIT, GROSS -1.62, NET -1.69
2000-07-14T00:00:00, OPERATION PROFIT, GROSS 2.08, NET 2.01
2000-07-28T00:00:00, OPERATION PROFIT, GROSS 0.14, NET 0.07
2000-08-08T00:00:00, OPERATION PROFIT, GROSS 4.36, NET 4.29
2000-08-21T00:00:00, OPERATION PROFIT, GROSS 1.03, NET 0.95
2000-09-15T00:00:00, OPERATION PROFIT, GROSS -4.26, NET -4.34
2000-09-27T00:00:00, OPERATION PROFIT, GROSS 1.29, NET 1.22
2000-10-13T00:00:00, OPERATION PROFIT, GROSS -2.98, NET -3.04
2000-10-26T00:00:00, OPERATION PROFIT, GROSS 3.01, NET 2.95
2000-11-06T00:00:00, OPERATION PROFIT, GROSS -3.59, NET -3.65
2000-11-16T00:00:00, OPERATION PROFIT, GROSS 1.28, NET 1.23
2000-12-01T00:00:00, OPERATION PROFIT, GROSS 2.59, NET 2.54
2000-12-18T00:00:00, OPERATION PROFIT, GROSS -0.06, NET -0.12

Adding up the “NET” profits, the final figure is: 15.83

But the system said the following at the end:

2000-12-29T00:00:00, SELL CREATE, 27.41
Final Portfolio Value: 100016.98

And obviously 15.83 is not 16.98. There is no error whatsoever. The “NET” profit of 15.83 is already cash in the bag.

Unfortunately (or fortunately, to better understand the platform) there is an open position on the last day of the Data Feed. Even if a SELL operation has been sent … IT HAS NOT YET BEEN EXECUTED.
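The 15.83 figure can be reproduced with a trivial cross-check in plain Python (not backtrader code), summing the NET column of the filtered lines above:

```python
# NET values transcribed from the filtered "OPERATION PROFIT" lines
nets = [2.04, 3.63, 4.42, 3.41, -0.49, 2.37, -2.02, 5.39, -3.81, -1.53,
        -1.69, 2.01, 0.07, 4.29, 0.95, -4.34, 1.22, -3.04, 2.95, -3.65,
        1.23, 2.54, -0.12]

# Round to cents to absorb floating point noise from the sum
total = round(sum(nets), 2)
print(total)  # 15.83
```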
The “Final Portfolio Value” calculated by the broker takes into account the “Close” price on 2000-12-29. The actual execution price would have been set on the next trading day, which happened to be 2001-01-02. Extending the Data Feed to take this day into account, the output is:

2001-01-02T00:00:00, SELL EXECUTED, Price: 27.87, Cost: 27.87, Commission 0.03
2001-01-02T00:00:00, OPERATION PROFIT, GROSS 1.64, NET 1.59
2001-01-02T00:00:00, Close, 24.87
2001-01-02T00:00:00, BUY CREATE, 24.87
Final Portfolio Value: 100017.41

Now adding the previous NET profit to the completed operation’s net profit: 15.83 + 1.59 = 17.42. Which (discarding rounding errors in the “print” statements) is the extra Portfolio above the initial 100000 monetary units the strategy started with.

Customizing the Strategy: Parameters

It would be a bit impractical to hardcode some of the values in the strategy with no chance of changing them easily. Parameters come in handy to help.

Definition of parameters is easy and looks like:

    params = (('myparam', 27), ('exitbars', 5),)

Being this a standard Python tuple with some tuples inside it, the following may look more appealing to some:

    params = (
        ('myparam', 27),
        ('exitbars', 5),
    )

With either formatting, parametrization of the strategy is allowed when adding the strategy to the Cerebro engine:

    # Add a strategy
    cerebro.addstrategy(TestStrategy, myparam=20, exitbars=7)

Note: The setsizing method below is deprecated. This content is kept here for anyone looking at old samples of the sources. The sources have been updated to use:

    cerebro.addsizer(bt.sizers.FixedSize, stake=10)

Please read the section about sizers.

Using the parameters in the strategy is easy, as they are stored in a “params” attribute.
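To make the tuple-of-tuples definition less magical, here is a small illustration of how such a definition can be exposed as attribute-style access with per-instance overrides. The `AutoParams` helper below is hypothetical and only mimics the observable behaviour of the real params machinery; it is not backtrader's implementation:

```python
class AutoParams(object):
    """Turn a params tuple-of-tuples into attributes, allowing overrides."""

    def __init__(self, defaults, **overrides):
        for name, value in defaults:
            # Use the override if given, else the declared default
            setattr(self, name, overrides.get(name, value))


params = (
    ('myparam', 27),
    ('exitbars', 5),
)

# Defaults apply ...
p = AutoParams(params)
print(p.myparam, p.exitbars)  # 27 5

# ... unless overridden, as addstrategy(TestStrategy, exitbars=7) would do
p2 = AutoParams(params, exitbars=7)
print(p2.exitbars)  # 7
```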
If, for example, we want a fixed stake, we can pass the stake parameter to the position sizer like this during __init__:

    # Set the sizer stake from the params
    self.sizer.setsizing(self.params.stake)

We could have also called buy and sell with a stake parameter and self.params.stake as the value.

The logic to exit gets modified:

    # Already in the market ... we might sell
    if len(self) >= (self.bar_executed + self.params.exitbars):

With all this in mind the example evolves to look like:

from __future__ import (absolute_import, division, print_function,
                        unicode_literals)

import datetime  # For datetime objects
import os.path  # To manage paths
import sys  # To find out the script name (in argv[0])

# Import the backtrader platform
import backtrader as bt


# Create a Strategy
class TestStrategy(bt.Strategy):
    params = (
        ('exitbars', 5),
    )

    def log(self, txt, dt=None):
        '''Logging function for this strategy'''
        dt = dt or self.datas[0].datetime.date(0)
        print('%s, %s' % (dt.isoformat(), txt))

    def __init__(self):
        # Keep a reference to the "close" line in the data[0] dataseries
        self.dataclose = self.datas[0].close

        # To keep track of pending orders and buy price/commission
        self.order = None
        self.buyprice = None
        self.buycomm = None

    def notify_order(self, order):
        if order.status in [order.Submitted, order.Accepted]:
            # Buy/Sell order submitted/accepted to/by broker - Nothing to do
            return

        # Check if an order has been completed
        # Attention: broker could reject order if not enough cash
        if order.status in [order.Completed]:
            if order.isbuy():
                self.log(
                    'BUY EXECUTED, Price: %.2f, Cost: %.2f, Comm %.2f' %
                    (order.executed.price,
                     order.executed.value,
                     order.executed.comm))

                self.buyprice = order.executed.price
                self.buycomm = order.executed.comm
            else:  # Sell
                self.log('SELL EXECUTED, Price: %.2f, Cost: %.2f, Comm %.2f' %
                         (order.executed.price,
                          order.executed.value,
                          order.executed.comm))

            self.bar_executed = len(self)

        elif order.status in [order.Canceled, order.Margin, order.Rejected]:
            self.log('Order Canceled/Margin/Rejected')

        self.order = None

    def notify_trade(self, trade):
        if not trade.isclosed:
            return

        self.log('OPERATION PROFIT, GROSS %.2f, NET %.2f' %
                 (trade.pnl, trade.pnlcomm))

    def next(self):
        # Simply log the closing price of the series from the reference
        self.log('Close, %.2f' % self.dataclose[0])

        # Check if an order is pending ... if yes, we cannot send a 2nd one
        if self.order:
            return

        # Check if we are in the market
        if not self.position:

            # Not yet ... we MIGHT BUY if ...
            if self.dataclose[0] < self.dataclose[-1]:
                # current close less than previous close

                if self.dataclose[-1] < self.dataclose[-2]:
                    # previous close less than the previous close

                    # BUY, BUY, BUY!!! (with default parameters)
                    self.log('BUY CREATE, %.2f' % self.dataclose[0])

                    # Keep track of the created order to avoid a 2nd order
                    self.order = self.buy()

        else:

            # Already in the market ... we might sell
            if len(self) >= (self.bar_executed + self.params.exitbars):
                # SELL, SELL, SELL!!! (with all possible default parameters)
                self.log('SELL CREATE, %.2f' % self.dataclose[0])

                # Keep track of the created order to avoid a 2nd order
                self.order = self.sell()


if __name__ == '__main__':
    # Create a cerebro entity
    cerebro = bt.Cerebro()

    # Add a strategy
    cerebro.addstrategy(TestStrategy)

    # Datas are in a subfolder of the samples. Need to find where the script is
    # because it could have been called from anywhere
    modpath = os.path.dirname(os.path.abspath(sys.argv[0]))
    datapath = os.path.join(modpath, '../../datas/orcl-1995-2014.txt')

    # Create a Data Feed
    data = bt.feeds.YahooFinanceCSVData(
        dataname=datapath,
        # Do not pass values before this date
        fromdate=datetime.datetime(2000, 1, 1),
        # Do not pass values after this date
        todate=datetime.datetime(2000, 12, 31),
        reverse=False)

    # Add the Data Feed to Cerebro
    cerebro.adddata(data)

    # Set our desired cash start
    cerebro.broker.setcash(100000.0)

    # Add a FixedSize sizer according to the stake
    cerebro.addsizer(bt.sizers.FixedSize, stake=10)

    # Set the commission - 0.1% ... divide by 100 to remove the %
    cerebro.broker.setcommission(commission=0.001)

    # Print out the starting conditions
    print('Starting Portfolio Value: %.2f' % cerebro.broker.getvalue())

    # Run over everything
    cerebro.run()

    # Print out the final result
    print('Final Portfolio Value: %.2f' % cerebro.broker.getvalue())

After the execution the output is:

Starting Portfolio Value: 100000.00
2000-01-03T00:00:00, Close, 27.85
2000-01-04T00:00:00, Close, 25.39
2000-01-05T00:00:00, Close, 24.05
2000-01-05T00:00:00, BUY CREATE, 24.05
2000-01-06T00:00:00, BUY EXECUTED, Size 10, Price: 23.61, Cost: 236.10, Commission 0.24
2000-01-06T00:00:00, Close, 22.63
...
...
...
2000-12-20T00:00:00, BUY CREATE, 26.88
2000-12-21T00:00:00, BUY EXECUTED, Size 10, Price: 26.23, Cost: 262.30, Commission 0.26
2000-12-21T00:00:00, Close, 27.82
2000-12-22T00:00:00, Close, 30.06
2000-12-26T00:00:00, Close, 29.17
2000-12-27T00:00:00, Close, 28.94
2000-12-28T00:00:00, Close, 29.29
2000-12-29T00:00:00, Close, 27.41
2000-12-29T00:00:00, SELL CREATE, 27.41
Final Portfolio Value: 100169.80

In order to see the difference, the print outputs have also been extended to show the execution size.
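The cost and commission figures in the log follow directly from the stake. A quick check (plain arithmetic, not backtrader code), using the first execution of the run:

```python
size, price, commrate = 10, 23.61, 0.001

cost = size * price       # cash value of the position
comm = cost * commrate    # 0.1% of the traded value

line = 'Cost: %.2f, Commission %.2f' % (cost, comm)
print(line)  # Cost: 236.10, Commission 0.24
```

This matches the `BUY EXECUTED, Size 10, Price: 23.61, Cost: 236.10, Commission 0.24` line above.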
Having multiplied the stake by 10, the obvious has happened: the profit and loss have been multiplied by 10. Instead of 16.98, the surplus is now 169.80.

Adding an indicator

Having heard of indicators, the next thing anyone would add to the strategy is one of them. For sure they must be much better than a simple “3 lower closes” strategy.

Inspired by one of the examples from PyAlgoTrade, this is a strategy using a Simple Moving Average:

  * Buy “AtMarket” if the close is greater than the Average
  * If in the market, sell if the close is smaller than the Average
  * Only 1 active operation is allowed in the market

Most of the existing code can be kept in place. Let’s add the average during __init__ and keep a reference to it:

    self.sma = bt.indicators.MovingAverageSimple(self.datas[0],
                                                 period=self.params.maperiod)

And of course the logic to enter and exit the market will rely on the Average values. Look in the code for the logic.

Note: The starting cash will be 1000 monetary units to be in line with the PyAlgoTrade example, and no commission will be applied.

from __future__ import (absolute_import, division, print_function,
                        unicode_literals)

import datetime  # For datetime objects
import os.path  # To manage paths
import sys  # To find out the script name (in argv[0])

# Import the backtrader platform
import backtrader as bt


# Create a Strategy
class TestStrategy(bt.Strategy):
    params = (
        ('maperiod', 15),
    )

    def log(self, txt, dt=None):
        '''Logging function for this strategy'''
        dt = dt or self.datas[0].datetime.date(0)
        print('%s, %s' % (dt.isoformat(), txt))

    def __init__(self):
        # Keep a reference to the "close" line in the data[0] dataseries
        self.dataclose = self.datas[0].close

        # To keep track of pending orders and buy price/commission
        self.order = None
        self.buyprice = None
        self.buycomm = None

        # Add a MovingAverageSimple indicator
        self.sma = bt.indicators.SimpleMovingAverage(
            self.datas[0], period=self.params.maperiod)

    def notify_order(self, order):
        if order.status in [order.Submitted, order.Accepted]:
            # Buy/Sell order submitted/accepted to/by broker - Nothing to do
            return

        # Check if an order has been completed
        # Attention: broker could reject order if not enough cash
        if order.status in [order.Completed]:
            if order.isbuy():
                self.log(
                    'BUY EXECUTED, Price: %.2f, Cost: %.2f, Comm %.2f' %
                    (order.executed.price,
                     order.executed.value,
                     order.executed.comm))

                self.buyprice = order.executed.price
                self.buycomm = order.executed.comm
            else:  # Sell
                self.log('SELL EXECUTED, Price: %.2f, Cost: %.2f, Comm %.2f' %
                         (order.executed.price,
                          order.executed.value,
                          order.executed.comm))

            self.bar_executed = len(self)

        elif order.status in [order.Canceled, order.Margin, order.Rejected]:
            self.log('Order Canceled/Margin/Rejected')

        self.order = None

    def notify_trade(self, trade):
        if not trade.isclosed:
            return

        self.log('OPERATION PROFIT, GROSS %.2f, NET %.2f' %
                 (trade.pnl, trade.pnlcomm))

    def next(self):
        # Simply log the closing price of the series from the reference
        self.log('Close, %.2f' % self.dataclose[0])

        # Check if an order is pending ... if yes, we cannot send a 2nd one
        if self.order:
            return

        # Check if we are in the market
        if not self.position:

            # Not yet ... we MIGHT BUY if ...
            if self.dataclose[0] > self.sma[0]:

                # BUY, BUY, BUY!!! (with all possible default parameters)
                self.log('BUY CREATE, %.2f' % self.dataclose[0])

                # Keep track of the created order to avoid a 2nd order
                self.order = self.buy()

        else:

            if self.dataclose[0] < self.sma[0]:
                # SELL, SELL, SELL!!! (with all possible default parameters)
                self.log('SELL CREATE, %.2f' % self.dataclose[0])

                # Keep track of the created order to avoid a 2nd order
                self.order = self.sell()


if __name__ == '__main__':
    # Create a cerebro entity
    cerebro = bt.Cerebro()

    # Add a strategy
    cerebro.addstrategy(TestStrategy)

    # Datas are in a subfolder of the samples. Need to find where the script is
    # because it could have been called from anywhere
    modpath = os.path.dirname(os.path.abspath(sys.argv[0]))
    datapath = os.path.join(modpath, '../../datas/orcl-1995-2014.txt')

    # Create a Data Feed
    data = bt.feeds.YahooFinanceCSVData(
        dataname=datapath,
        # Do not pass values before this date
        fromdate=datetime.datetime(2000, 1, 1),
        # Do not pass values after this date
        todate=datetime.datetime(2000, 12, 31),
        reverse=False)

    # Add the Data Feed to Cerebro
    cerebro.adddata(data)

    # Set our desired cash start
    cerebro.broker.setcash(1000.0)

    # Add a FixedSize sizer according to the stake
    cerebro.addsizer(bt.sizers.FixedSize, stake=10)

    # Set the commission
    cerebro.broker.setcommission(commission=0.0)

    # Print out the starting conditions
    print('Starting Portfolio Value: %.2f' % cerebro.broker.getvalue())

    # Run over everything
    cerebro.run()

    # Print out the final result
    print('Final Portfolio Value: %.2f' % cerebro.broker.getvalue())

Now, before skipping to the next section, LOOK CAREFULLY at the first date which is shown in the log:

It’s no longer 2000-01-03, the first trading day in the year 2K. It’s 2000-01-24 … Who has stolen my cheese?

The missing days are not missing. The platform has adapted to the new circumstances:

  * An indicator (SimpleMovingAverage) has been added to the Strategy.
  * This indicator needs X bars to produce an output: in the example, 15.
  * 2000-01-24 is the day on which the 15th bar occurs.

The backtrader platform assumes that the Strategy has the indicator in place for a good reason: to use it in the decision making process. And it makes no sense to try to make decisions if the indicator is not yet ready and producing values.

  * next will be 1st called when all indicators have already reached the minimum needed period to produce a value.

In the example there is a single indicator, but the strategy could have any number of them.
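The warm-up behaviour can be sketched in plain Python. The snippet below is an illustration of the minimum-period idea only (not backtrader internals): a moving average with period N yields nothing until N bars have been seen, so a strategy gated on it would first act on bar N.

```python
from collections import deque


def sma_stream(values, period):
    """Yield None while warming up, then the simple moving average."""
    window = deque(maxlen=period)
    for v in values:
        window.append(v)
        yield sum(window) / period if len(window) == period else None


closes = list(range(1, 21))            # 20 dummy bars: 1, 2, ..., 20
out = list(sma_stream(closes, 15))

# Find the first bar with a usable value (0-based index)
first_ready = next(i for i, v in enumerate(out) if v is not None)
print(first_ready + 1)  # 15 -> the 15th bar is the first with a value
```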
After the execution the output is:

Starting Portfolio Value: 1000.00
2000-01-24T00:00:00, Close, 25.55
2000-01-25T00:00:00, Close, 26.61
2000-01-25T00:00:00, BUY CREATE, 26.61
2000-01-26T00:00:00, BUY EXECUTED, Size 10, Price: 26.76, Cost: 267.60, Commission 0.00
2000-01-26T00:00:00, Close, 25.96
2000-01-27T00:00:00, Close, 24.43
2000-01-27T00:00:00, SELL CREATE, 24.43
2000-01-28T00:00:00, SELL EXECUTED, Size 10, Price: 24.28, Cost: 242.80, Commission 0.00
2000-01-28T00:00:00, OPERATION PROFIT, GROSS -24.80, NET -24.80
2000-01-28T00:00:00, Close, 22.34
2000-01-31T00:00:00, Close, 23.55
2000-02-01T00:00:00, Close, 25.46
2000-02-02T00:00:00, Close, 25.61
2000-02-02T00:00:00, BUY CREATE, 25.61
2000-02-03T00:00:00, BUY EXECUTED, Size 10, Price: 26.11, Cost: 261.10, Commission 0.00
...
...
...
2000-12-20T00:00:00, SELL CREATE, 26.88
2000-12-21T00:00:00, SELL EXECUTED, Size 10, Price: 26.23, Cost: 262.30, Commission 0.00
2000-12-21T00:00:00, OPERATION PROFIT, GROSS -20.60, NET -20.60
2000-12-21T00:00:00, Close, 27.82
2000-12-21T00:00:00, BUY CREATE, 27.82
2000-12-22T00:00:00, BUY EXECUTED, Size 10, Price: 28.65, Cost: 286.50, Commission 0.00
2000-12-22T00:00:00, Close, 30.06
2000-12-26T00:00:00, Close, 29.17
2000-12-27T00:00:00, Close, 28.94
2000-12-28T00:00:00, Close, 29.29
2000-12-29T00:00:00, Close, 27.41
2000-12-29T00:00:00, SELL CREATE, 27.41
Final Portfolio Value: 973.90

In the name of the King!!! A winning system turned into a losing one … and that with no commission. It may well be that simply adding an indicator is not the universal panacea.

Note: The same logic and data with PyAlgoTrade yields a slightly different result. Looking at the entire printout reveals that some operations are not exactly the same. The culprit is again the usual suspect: rounding. PyAlgoTrade does not round the datafeed values when applying the divided “adjusted close” to the data feed values.
The Yahoo Data Feed provided by backtrader rounds the values down to 2 decimals after applying the adjusted close. Upon printing, the values seem the same, but it’s obvious that sometimes the 5th decimal place plays a role. Rounding down to 2 decimals seems more realistic, because Market Exchanges only allow a certain number of decimals per asset (usually 2 decimals for stocks).

Note: The Yahoo Data Feed (starting with version 1.8.11.99) allows specifying whether rounding has to happen and to how many decimals.

Visual Inspection: Plotting

A printout or log of the actual whereabouts of the system at each bar-instant is good, but humans tend to be visual, and therefore it seems right to offer a view of the same whereabouts as a chart.

Note: To plot you need to have matplotlib installed.

Once again, defaults for plotting are there to assist the platform user. Plotting is, incredibly, a 1-line operation:

    cerebro.plot()

The location is, of course, after cerebro.run() has been called.

In order to display the automatic plotting capabilities and a couple of easy customizations, the following will be done:

  * A 2nd MovingAverage (Exponential) will be added. The defaults will plot it (just like the 1st) with the data.
  * A 3rd MovingAverage (Weighted) will be added. Customized to plot in its own plot (even if not sensible).
  * A Stochastic (Slow) will be added. No change to the defaults.
  * A MACD will be added. No change to the defaults.
  * A RSI will be added. No change to the defaults.
  * A MovingAverage (Simple) will be applied to the RSI. No change to the defaults (it will be plotted with the RSI).
  * An AverageTrueRange will be added. Changed defaults to avoid it being plotted.
The entire set of additions to the init method of the Strategy: # Indicators for the plotting show bt.indicators.ExponentialMovingAverage(self.datas[0], period=25) bt.indicators.WeightedMovingAverage(self.datas[0], period=25).subplot = True bt.indicators.StochasticSlow(self.datas[0]) bt.indicators.MACDHisto(self.datas[0]) rsi = bt.indicators.RSI(self.datas[0]) bt.indicators.SmoothedMovingAverage(rsi, period=10) bt.indicators.ATR(self.datas[0]).plot = False Note Even if indicators are not explicitly added to a member variable of the strategy (like self.sma = MovingAverageSimple…), they will autoregister with the strategy and will influence the minimum period for next and will be part of the plotting. In the example only RSI is added to a temporary variable rsi with the only intention to create a MovingAverageSmoothed on it. The example now: from __future__ import (absolute_import, division, print_function, unicode_literals) import datetime # For datetime objects import os.path # To manage paths import sys # To find out the script name (in argv[0]) # Import the backtrader platform import backtrader as bt # Create a Stratey class TestStrategy(bt.Strategy): params = ( ('maperiod', 15), ) def log(self, txt, dt=None): ''' Logging function fot this strategy''' dt = dt or self.datas[0].datetime.date(0) print('%s, %s' % (dt.isoformat(), txt)) def __init__(self): # Keep a reference to the "close" line in the data[0] dataseries self.dataclose = self.datas[0].close # To keep track of pending orders and buy price/commission self.order = None self.buyprice = None self.buycomm = None # Add a MovingAverageSimple indicator self.sma = bt.indicators.SimpleMovingAverage( self.datas[0], period=self.params.maperiod) # Indicators for the plotting show bt.indicators.ExponentialMovingAverage(self.datas[0], period=25) bt.indicators.WeightedMovingAverage(self.datas[0], period=25, subplot=True) bt.indicators.StochasticSlow(self.datas[0]) bt.indicators.MACDHisto(self.datas[0]) rsi = 
bt.indicators.RSI(self.datas[0]) bt.indicators.SmoothedMovingAverage(rsi, period=10) bt.indicators.ATR(self.datas[0], plot=False) def notify_order(self, order): if order.status in [order.Submitted, order.Accepted]: # Buy/Sell order submitted/accepted to/by broker - Nothing to do return # Check if an order has been completed # Attention: broker could reject order if not enough cash if order.status in [order.Completed]: if order.isbuy(): self.log( 'BUY EXECUTED, Price: %.2f, Cost: %.2f, Comm %.2f' % (order.executed.price, order.executed.value, order.executed.comm)) self.buyprice = order.executed.price self.buycomm = order.executed.comm else: # Sell self.log('SELL EXECUTED, Price: %.2f, Cost: %.2f, Comm %.2f' % (order.executed.price, order.executed.value, order.executed.comm)) self.bar_executed = len(self) elif order.status in [order.Canceled, order.Margin, order.Rejected]: self.log('Order Canceled/Margin/Rejected') # Write down: no pending order self.order = None def notify_trade(self, trade): if not trade.isclosed: return self.log('OPERATION PROFIT, GROSS %.2f, NET %.2f' % (trade.pnl, trade.pnlcomm)) def next(self): # Simply log the closing price of the series from the reference self.log('Close, %.2f' % self.dataclose[0]) # Check if an order is pending ... if yes, we cannot send a 2nd one if self.order: return # Check if we are in the market if not self.position: # Not yet ... we MIGHT BUY if ... if self.dataclose[0] > self.sma[0]: # BUY, BUY, BUY!!! (with all possible default parameters) self.log('BUY CREATE, %.2f' % self.dataclose[0]) # Keep track of the created order to avoid a 2nd order self.order = self.buy() else: if self.dataclose[0] < self.sma[0]: # SELL, SELL, SELL!!! 
(with all possible default parameters) self.log('SELL CREATE, %.2f' % self.dataclose[0]) # Keep track of the created order to avoid a 2nd order self.order = self.sell() if __name__ == '__main__': # Create a cerebro entity cerebro = bt.Cerebro() # Add a strategy cerebro.addstrategy(TestStrategy) # Datas are in a subfolder of the samples. Need to find where the script is # because it could have been called from anywhere modpath = os.path.dirname(os.path.abspath(sys.argv[0])) datapath = os.path.join(modpath, '../../datas/orcl-1995-2014.txt') # Create a Data Feed data = bt.feeds.YahooFinanceCSVData( dataname=datapath, # Do not pass values before this date fromdate=datetime.datetime(2000, 1, 1), # Do not pass values before this date todate=datetime.datetime(2000, 12, 31), # Do not pass values after this date reverse=False) # Add the Data Feed to Cerebro cerebro.adddata(data) # Set our desired cash start cerebro.broker.setcash(1000.0) # Add a FixedSize sizer according to the stake cerebro.addsizer(bt.sizers.FixedSize, stake=10) # Set the commission cerebro.broker.setcommission(commission=0.0) # Print out the starting conditions print('Starting Portfolio Value: %.2f' % cerebro.broker.getvalue()) # Run over everything cerebro.run() # Print out the final result print('Final Portfolio Value: %.2f' % cerebro.broker.getvalue()) # Plot the result cerebro.plot() After the execution the output is: Starting Portfolio Value: 1000.00 2000-02-18T00:00:00, Close, 27.61 2000-02-22T00:00:00, Close, 27.97 2000-02-22T00:00:00, BUY CREATE, 27.97 2000-02-23T00:00:00, BUY EXECUTED, Size 10, Price: 28.38, Cost: 283.80, Commission 0.00 2000-02-23T00:00:00, Close, 29.73 ... ... ... 
2000-12-21T00:00:00, BUY CREATE, 27.82
2000-12-22T00:00:00, BUY EXECUTED, Size 10, Price: 28.65, Cost: 286.50, Commission 0.00
2000-12-22T00:00:00, Close, 30.06
2000-12-26T00:00:00, Close, 29.17
2000-12-27T00:00:00, Close, 28.94
2000-12-28T00:00:00, Close, 29.29
2000-12-29T00:00:00, Close, 27.41
2000-12-29T00:00:00, SELL CREATE, 27.41
Final Portfolio Value: 981.00

The final result has changed even if the logic hasn’t. This is true, but the logic has not been applied to the same number of bars.

Note: As explained before, the platform will first call next when all indicators are ready to produce a value. In this plotting example (very clear in the chart) the MACD is the last indicator to be fully ready (all 3 lines producing an output). The 1st BUY order is no longer scheduled during Jan 2000 but close to the end of Feb 2000.

The chart: [image]

Let’s Optimize

Many trading books say each market and each traded stock (or commodity or …) have different rhythms: there is no such thing as one size fits all.

Before the plotting sample, when the strategy started using an indicator, the period default value was 15 bars. It’s a strategy parameter, and this can be used in an optimization to change the value of the parameter and see which one better fits the market.

Note: There is plenty of literature about optimization and the associated pros and cons. But the advice will always point in the same direction: do not overoptimize. If a trading idea is not sound, optimizing may end up producing a positive result which is only valid for the backtested dataset.

The sample is modified to optimize the period of the Simple Moving Average.
For the sake of clarity any output with regards to Buy/Sell orders has been removed The example now: from __future__ import (absolute_import, division, print_function, unicode_literals) import datetime # For datetime objects import os.path # To manage paths import sys # To find out the script name (in argv[0]) # Import the backtrader platform import backtrader as bt # Create a Stratey class TestStrategy(bt.Strategy): params = ( ('maperiod', 15), ('printlog', False), ) def log(self, txt, dt=None, doprint=False): ''' Logging function fot this strategy''' if self.params.printlog or doprint: dt = dt or self.datas[0].datetime.date(0) print('%s, %s' % (dt.isoformat(), txt)) def __init__(self): # Keep a reference to the "close" line in the data[0] dataseries self.dataclose = self.datas[0].close # To keep track of pending orders and buy price/commission self.order = None self.buyprice = None self.buycomm = None # Add a MovingAverageSimple indicator self.sma = bt.indicators.SimpleMovingAverage( self.datas[0], period=self.params.maperiod) def notify_order(self, order): if order.status in [order.Submitted, order.Accepted]: # Buy/Sell order submitted/accepted to/by broker - Nothing to do return # Check if an order has been completed # Attention: broker could reject order if not enough cash if order.status in [order.Completed]: if order.isbuy(): self.log( 'BUY EXECUTED, Price: %.2f, Cost: %.2f, Comm %.2f' % (order.executed.price, order.executed.value, order.executed.comm)) self.buyprice = order.executed.price self.buycomm = order.executed.comm else: # Sell self.log('SELL EXECUTED, Price: %.2f, Cost: %.2f, Comm %.2f' % (order.executed.price, order.executed.value, order.executed.comm)) self.bar_executed = len(self) elif order.status in [order.Canceled, order.Margin, order.Rejected]: self.log('Order Canceled/Margin/Rejected') # Write down: no pending order self.order = None def notify_trade(self, trade): if not trade.isclosed: return self.log('OPERATION PROFIT, GROSS %.2f, NET 
%.2f' % (trade.pnl, trade.pnlcomm)) def next(self): # Simply log the closing price of the series from the reference self.log('Close, %.2f' % self.dataclose[0]) # Check if an order is pending ... if yes, we cannot send a 2nd one if self.order: return # Check if we are in the market if not self.position: # Not yet ... we MIGHT BUY if ... if self.dataclose[0] > self.sma[0]: # BUY, BUY, BUY!!! (with all possible default parameters) self.log('BUY CREATE, %.2f' % self.dataclose[0]) # Keep track of the created order to avoid a 2nd order self.order = self.buy() else: if self.dataclose[0] < self.sma[0]: # SELL, SELL, SELL!!! (with all possible default parameters) self.log('SELL CREATE, %.2f' % self.dataclose[0]) # Keep track of the created order to avoid a 2nd order self.order = self.sell() def stop(self): self.log('(MA Period %2d) Ending Value %.2f' % (self.params.maperiod, self.broker.getvalue()), doprint=True) if __name__ == '__main__': # Create a cerebro entity cerebro = bt.Cerebro() # Add a strategy strats = cerebro.optstrategy( TestStrategy, maperiod=range(10, 31)) # Datas are in a subfolder of the samples. 
# Need to find where the script is because it could have been called from anywhere
modpath = os.path.dirname(os.path.abspath(sys.argv[0]))
datapath = os.path.join(modpath, '../../datas/orcl-1995-2014.txt')

# Create a Data Feed
data = bt.feeds.YahooFinanceCSVData(
    dataname=datapath,
    # Do not pass values before this date
    fromdate=datetime.datetime(2000, 1, 1),
    # Do not pass values after this date
    todate=datetime.datetime(2000, 12, 31),
    reverse=False)

# Add the Data Feed to Cerebro
cerebro.adddata(data)

# Set our desired cash start
cerebro.broker.setcash(1000.0)

# Add a FixedSize sizer according to the stake
cerebro.addsizer(bt.sizers.FixedSize, stake=10)

# Set the commission
cerebro.broker.setcommission(commission=0.0)

# Run over everything
cerebro.run(maxcpus=1)

Instead of calling addstrategy to add a strategy class to Cerebro, the call is made to optstrategy. And instead of passing a value, a range of values is passed.

One of the “Strategy” hooks is added, the stop method, which will be called when the data has been exhausted and backtesting is over. It’s used to print the final net value of the portfolio in the broker (it was done in Cerebro previously).

The system will execute the strategy for each value of the range.
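Conceptually, optstrategy expands the parameter iterables into one backtest run per combination, roughly a cartesian product over the ranges. The sketch below illustrates that expansion only; `expand_opt_params` is a hypothetical helper, not backtrader internals:

```python
import itertools


def expand_opt_params(**param_ranges):
    """Yield one parameter dict per combination of the given iterables."""
    names = sorted(param_ranges)
    for combo in itertools.product(*(param_ranges[n] for n in names)):
        yield dict(zip(names, combo))


# Mirrors optstrategy(TestStrategy, maperiod=range(10, 31))
runs = list(expand_opt_params(maperiod=range(10, 31)))
print(len(runs))   # 21 runs, one per period
print(runs[0])     # {'maperiod': 10}
```

With several optimized parameters, the number of runs is the product of the range sizes, which is why optimizations grow expensive quickly.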
The following will be output:

```
2000-12-29, (MA Period 10) Ending Value 880.30
2000-12-29, (MA Period 11) Ending Value 880.00
2000-12-29, (MA Period 12) Ending Value 830.30
2000-12-29, (MA Period 13) Ending Value 893.90
2000-12-29, (MA Period 14) Ending Value 896.90
2000-12-29, (MA Period 15) Ending Value 973.90
2000-12-29, (MA Period 16) Ending Value 959.40
2000-12-29, (MA Period 17) Ending Value 949.80
2000-12-29, (MA Period 18) Ending Value 1011.90
2000-12-29, (MA Period 19) Ending Value 1041.90
2000-12-29, (MA Period 20) Ending Value 1078.00
2000-12-29, (MA Period 21) Ending Value 1058.80
2000-12-29, (MA Period 22) Ending Value 1061.50
2000-12-29, (MA Period 23) Ending Value 1023.00
2000-12-29, (MA Period 24) Ending Value 1020.10
2000-12-29, (MA Period 25) Ending Value 1013.30
2000-12-29, (MA Period 26) Ending Value 998.30
2000-12-29, (MA Period 27) Ending Value 982.20
2000-12-29, (MA Period 28) Ending Value 975.70
2000-12-29, (MA Period 29) Ending Value 983.30
2000-12-29, (MA Period 30) Ending Value 979.80
```

Results:

- For periods below 18 the strategy (commissionless) loses money.
- For periods between 18 and 26 (both included) the strategy makes money.
- Above 26 money is lost again.

The winning period for this strategy and the given data set is 20 bars, which wins 78.00 units over 1000 $/€ (7.8%).

Note: the extra indicators from the plotting example have been removed, and the start of operations is only influenced by the Simple Moving Average being optimized; hence the slightly different result for period 15.

Conclusion

The incremental samples have shown how to go from a barebones script to a fully working trading system which even plots the results and can be optimized.
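Picking the winner out of a log like the one above can also be done mechanically instead of by eye. A stdlib-only sketch that parses the printed lines (a representative subset here) and selects the period with the maximum ending value:

```python
import re

# A few lines in the exact format printed by the stop() hook
log_lines = """\
2000-12-29, (MA Period 18) Ending Value 1011.90
2000-12-29, (MA Period 19) Ending Value 1041.90
2000-12-29, (MA Period 20) Ending Value 1078.00
2000-12-29, (MA Period 21) Ending Value 1058.80""".splitlines()

# Extract (period, ending value) pairs from each log line
pattern = re.compile(r'MA Period\s+(\d+)\) Ending Value\s+([\d.]+)')
results = {int(m.group(1)): float(m.group(2))
           for m in map(pattern.search, log_lines)}

# The winning period maximizes the ending portfolio value
best_period = max(results, key=results.get)
print(best_period, results[best_period])  # 20 1078.0
```

Run over all 21 lines of the real output, the same code recovers the 20-bar winner quoted in the results above.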
A lot more can be done to try to improve the chances of winning:

- Self-defined indicators: creating an indicator is easy (and even plotting them is easy)
- Sizers: money management is, for many, the key to success
- Order types (limit, stop, stoplimit)
- Some others

To make sure all of the above can be fully utilized, the documentation provides insight into these (and other) topics. Look in the table of contents and keep on reading … and developing.

Best of luck