1. Problem
Spark Streaming keeps receiving data but never analyzes it; the driver log only shows batches being queued:
2020-07-16 10:10:06 INFO JobScheduler:54 - Added jobs for time 1594894206000 ms
2020-07-16 10:10:08 INFO JobScheduler:54 - Added jobs for time 1594894208000 ms
2020-07-16 10:10:10 INFO JobScheduler:54 - Added jobs for time 1594894210000 ms
2020-07-16 10:10:12 INFO JobScheduler:54 - Added jobs for time 1594894212000 ms
... (the same "Added jobs" message repeats every 2 seconds, but no batch output ever appears)
2. Cause
The original command was:
/spark/spark/bin/spark-submit --driver-class-path /spark/spark/jars/*:/spark/spark/jars/flume/* /spark/spark/spark.py spark 44444
Launched this way, the application runs in local mode with a single thread. With a receiver-based input DStream (socket, Kafka, Flume, etc.), that one thread is occupied running the receiver, leaving no thread free to process the received data.
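The rule behind this can be sketched as a small Python check (a hypothetical helper for illustration, not part of Spark): with master `local` and one receiver, zero threads remain for batch processing.

```python
# Hypothetical helper illustrating the local[n] rule: a receiver-based
# Spark Streaming app needs more local threads than receivers, otherwise
# no thread is left to process the received batches.
import os
import re

def processing_threads_left(master: str, num_receivers: int = 1) -> int:
    """Return how many threads remain for batch processing under a local master URL."""
    if master == "local":
        total = 1
    else:
        m = re.fullmatch(r"local\[(\d+|\*)\]", master)
        if not m:
            raise ValueError(f"not a local master URL: {master}")
        # local[*] uses all visible cores
        total = (os.cpu_count() or 1) if m.group(1) == "*" else int(m.group(1))
    return max(total - num_receivers, 0)

print(processing_threads_left("local", 1))     # 0 -> the receiver starves the job
print(processing_threads_left("local[4]", 1))  # 3 -> batches can be processed
```

This is exactly the situation in the log above: the single thread runs the receiver, jobs keep getting queued, and nothing ever executes them.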
3. Solution
3.1 Not enough Spark threads configured
Add --master local[4] to the command, which runs Spark locally with 4 threads, so threads remain to analyze the data in addition to running the receiver (any local[n] with n greater than the number of receivers works):
/spark/spark/bin/spark-submit --master local[4] --driver-class-path /spark/spark/jars/*:/spark/spark/jars/flume/* /spark/spark/spark.py spark 44444
3.2 Insufficient hardware
If your CPU has only one core, the job cannot run this way at all; if you are on a virtual machine, allocate more virtual cores to it.
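Before raising local[n], it is worth checking how many cores the machine (or VM) actually exposes; a minimal sketch:

```python
# Check how many CPU cores the OS exposes; a receiver-based Spark Streaming
# job needs at least 2 (one for the receiver, one for processing batches).
import os

cores = os.cpu_count() or 1
print(f"visible cores: {cores}")
if cores < 2:
    print("warning: only 1 core visible; a receiver will starve the job")
```

On a VM, if this prints 1, increase the virtual CPU count in the hypervisor settings before retrying.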