1. Please make sure your setuptools is the latest version,
otherwise coca may not work correctly.
2. Change the mongo host in the master's "weibo.yaml" file to the master's real IP, otherwise the workers' jobs cannot store the crawled data in the
master's mongo database.
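Before starting the workers, it can help to check that the mongo address you wrote into "weibo.yaml" is reachable from a worker machine. The sketch below is only a plain TCP check and is not part of cola. The function name can_connect is hypothetical, and port 27017 in the usage note is MongoDB's default, which may differ from your setup.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Refused, unreachable, timed out, or the name could not be resolved.
        return False
    sock.close()
    return True
```

Run it on a worker machine with the master's real IP, for example can_connect("<master ip>", 27017). A False result suggests the host in "weibo.yaml" is wrong or a firewall is blocking the port.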
Start master:
coca master -s [ip:port]
The ip and port are optional, because the master can be assigned an ip and a port automatically.
Start one or more workers:
coca worker -s -m [ip:port]
Please make sure the parameters are given in this order.
Here the ip and port are required, and they must be the master's. You can copy the ip and port shown in the master's cmd window.
-m means master.
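If you start workers from a script, a small helper can keep the parameter order fixed and catch a missing port early. This is a minimal sketch, not part of cola. The name worker_command is hypothetical, and it uses only the flags shown above.

```python
import re


def worker_command(master_addr):
    """Build the argument list for `coca worker -s -m ip:port`."""
    # Accept only the ip:port form, as shown in the master's cmd window.
    if not re.match(r"^[^:\s]+:\d+$", master_addr):
        raise ValueError("expected ip:port, got %r" % master_addr)
    # Keep the order from the text above: -s first, then -m and the address.
    return ["coca", "worker", "-s", "-m", master_addr]
```

The returned list can be passed to subprocess.call. The address you pass in must be the one the master actually reported.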
Then run the application (weibo as an example):
coca job -u /path/to/cola/app/weibo -r
The path '/path/to/cola/app/weibo' is required on Windows. Pay attention to any spaces in the path.
If the path contains spaces, wrap it in double quotes, e.g. coca job -u "D:My doc/cola/app/weibo" -r
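When the job is launched from Python instead of typed by hand, passing the arguments as a list avoids manual quoting. The sketch below only prints the command line that would result on Windows. It reuses the example path above unchanged, and the variable names are illustrative.

```python
import subprocess

# Path copied from the example above; it contains a space.
job_path = "D:My doc/cola/app/weibo"
args = ["coca", "job", "-u", job_path, "-r"]

# list2cmdline applies the Windows quoting rules that subprocess uses
# on Windows when it is given a list of arguments.
command_line = subprocess.list2cmdline(args)
print(command_line)  # coca job -u "D:My doc/cola/app/weibo" -r
```

Only the argument containing a space is wrapped in double quotes. The same args list can be handed to subprocess.call(args) directly.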
To find where the master or worker data folders are stored, run:
>python -c "import tempfile; print(tempfile.gettempdir())"
The path printed is where your master or worker data folders are stored.
If a worker refuses to run, try clearing these data folders, since they may have been corrupted by a previous failed run.
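To see what is in the temp directory before deleting anything, the folders can be listed from Python. This is a minimal sketch. The helper name list_subfolders is hypothetical, and since the text does not say what the master or worker folders are called, the sketch only lists candidates and deletes nothing.

```python
import os
import tempfile


def list_subfolders(root=None):
    """Return the sorted names of the directories directly under root."""
    if root is None:
        # Same location that tempfile.gettempdir() prints above.
        root = tempfile.gettempdir()
    return sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )
```

Call list_subfolders() with the cluster stopped, identify the master or worker data folders by hand, and remove only those.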
Kill master to stop the whole cluster:
coca master -k
List all jobs:
coca job -m [ip:port] -l