Turn off the firewall: systemctl stop firewalld.service (systemctl disable firewalld.service to keep it off across reboots)
Create the virtual environment (afterwards activate it with source stock_data_get/bin/activate so the pip install below goes into it):
/usr/local/python3.6.5/bin/virtualenv --python=/usr/local/python3.6.5/bin/python3.6 --no-site-packages stock_data_get
pip install apache-airflow
export AIRFLOW_HOME=~/airflow
# airflow initdb (do not run this yet; configure airflow.cfg first)
Running plain airflow once generates the airflow directory under the home directory
Configuration file
~/airflow/airflow.cfg
sql_alchemy_conn = mysql://username:password@ip:port/dbname (e.g. mysql://mysql:123456@192.168.3.38:3306/airflow; the mysql:// dialect needs a client library such as mysqlclient installed; a quick connection check is sketched after the initdb step below)
# Logging level
logging_level = ERROR
[email]
email_backend = airflow.utils.email.send_email_smtp
[smtp]
# If you want airflow to send emails on retries, failure, and you want to use
# the airflow.utils.email.send_email_smtp function, you have to configure an
# smtp server here
smtp_host = smtp.163.com
smtp_starttls = True
smtp_ssl = False
# Uncomment and set the user/pass settings if you want to use SMTP AUTH
smtp_user = user@163.com
smtp_password = password # the authorization code generated in the 163 mailbox settings, not the login password
smtp_port = 25
smtp_mail_from = user@163.com
# after how much time new DAGs should be picked up from the filesystem
min_file_process_interval = 60 (if scheduler CPU usage is too high, raising this interval fixes it)
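Before relying on Airflow to send mail, the [smtp] values above can be checked directly with Python's standard smtplib (a minimal sketch; user@163.com and password are the placeholders from the config):

import smtplib

server = smtplib.SMTP("smtp.163.com", 25)  # smtp_host / smtp_port
server.starttls()                          # matches smtp_starttls = True
server.login("user@163.com", "password")   # the 163 authorization code, not the login password
server.quit()
print("SMTP login OK")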
Initialize the database
airflow initdb
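If initdb cannot connect, the sql_alchemy_conn string can be sanity-checked outside Airflow with SQLAlchemy (a sketch assuming the example credentials above and an installed mysqlclient driver; SQLAlchemy 1.x style):

from sqlalchemy import create_engine

engine = create_engine("mysql://mysql:123456@192.168.3.38:3306/airflow")
conn = engine.connect()                    # raises an error if the server or credentials are wrong
print(conn.execute("SELECT 1").scalar())   # prints 1 when the connection works
conn.close()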
Start the web server
airflow webserver -p 8080 [gives a web UI for managing DAGs] (background: nohup airflow webserver -p 8080 >/dev/null 2>&1 &)
Then visit http://192.168.3.38:8080/admin/
(When killing the webserver you may also need to kill its gunicorn workers; netstat -ntpl shows which process holds the port.)
Register DAGs
Put DAG files under ~/airflow/dags, then run the file once to confirm it parses without errors:
python ~/airflow/dags/tutorial.py
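A minimal sketch of what tutorial.py might contain, written against the Airflow 1.x API (matching the initdb-era CLI used in these notes); the dag_id tutorial and the email_task id line up with the commands below, and the recipient address is a placeholder:

from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from airflow.operators.email_operator import EmailOperator

default_args = {
    "owner": "airflow",
    "start_date": datetime(2017, 5, 1),
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

dag = DAG("tutorial", default_args=default_args, schedule_interval="@daily")

# {{ ds }} renders to the run's execution date (the previous period's date; see the note further down).
print_date = BashOperator(task_id="print_date", bash_command="echo {{ ds }}", dag=dag)

# Sends mail through the [smtp] settings configured in airflow.cfg above.
email_task = EmailOperator(
    task_id="email_task",
    to="user@163.com",  # placeholder recipient
    subject="Airflow run {{ ds }}",
    html_content="Run for {{ ds }} finished.",
    dag=dag,
)

print_date >> email_task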
Start the scheduler: airflow scheduler [once the scheduler is running, DAGs in the DAG directory are triggered on their configured schedules] (background: nohup airflow scheduler >/dev/null 2>&1 &)
airflow list_dags
airflow list_tasks tutorial
airflow test dag_name email_task 2017-05-10 (runs a single task for a given execution date, without recording state in the database)
Every DAG run is handed a concrete time (a datetime object, the execution_date). For periodic tasks, Airflow passes the time of the previous period (this is the key point): a daily task's run today receives yesterday's date, and a weekly task receives the date from one week before. For example, the daily run that actually fires on 2017-05-11 carries execution_date 2017-05-10.
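To see this from inside a task, a PythonOperator can read the execution date out of the task context (an Airflow 1.x sketch; dag is the object defined in the tutorial.py sketch above):

from airflow.operators.python_operator import PythonOperator

def show_execution_date(**context):
    # For the daily run that actually fires on 2017-05-11, this prints 2017-05-10.
    print("execution_date =", context["execution_date"])

show_date = PythonOperator(
    task_id="show_execution_date",
    python_callable=show_execution_date,
    provide_context=True,  # Airflow 1.x: pass execution_date and friends as kwargs
    dag=dag,               # the DAG from the tutorial.py sketch
)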
Other references
https://www.cnblogs.com/testzcy/p/8427141.html
https://stackoverflow.com/questions/48194000/apache-airflow-1-9-default-timezone-set-to-non-utc