Build Your First Web App with Python and Flask

Original post · last modified 2025-07-10 21:23:26

License: CC 4.0 BY-SA

Tags: #python #flask

First published 2025-07-03 15:41:44

1. Build your first web app with Python and Flask
Tutorial: https://tutorial.helloflask.com/template/
Starting the development server with python app.py may be the better option.

2. Deploy the Flask app with Gunicorn
1) Install Gunicorn
pip install gunicorn
2) Run the Flask app
Assuming the Flask app lives in app.py and the application instance is named app, start the Gunicorn server with:

gunicorn -w 4 -b 0.0.0.0:8000 app:app
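-w 4: start 4 worker processes.
-b 0.0.0.0:8000: bind to port 8000 on all network interfaces.
app:app: the location of the Flask application instance, in the form module_name:instance_name.

To run in the background, add -D (note the capital D), for example: gunicorn -D -w 4 -b 0.0.0.0:9999 app:app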
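
The same options can also be kept in a configuration file instead of on the command line. The sketch below assumes a file named gunicorn.conf.py placed next to app.py; the file name and the values are illustrative and simply mirror the first command above. bind, workers, and daemon are the Gunicorn settings that correspond to -b, -w, and -D.

```python
# gunicorn.conf.py (assumed file name; Gunicorn config files are plain Python)

# Same as -b 0.0.0.0:8000: listen on port 8000 on all network interfaces
bind = "0.0.0.0:8000"

# Same as -w 4: number of worker processes
workers = 4

# Same as -D: detach from the terminal and run in the background
daemon = True
```

With this file in place, the server could be started with gunicorn -c gunicorn.conf.py app:app.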

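3. A Flask service that reads several remote files at 13:00 every day, parses them, and extracts some of their contents, plus how to show part of 'http://example.com/file1.json' on a web page
1) Install the required libraries
pip install Flask APScheduler requests
2) Write the Flask app and the scheduled job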

from flask import Flask
from apscheduler.schedulers.background import BackgroundScheduler
import requests
import json
import csv
from io import StringIO

app = Flask(__name__)

def fetch_and_parse_remote_files():
    # URLs of the remote files
    urls = [
        'http://example.com/file1.json',
        'http://example.com/file2.csv',
        # add more URLs here
    ]
    
    for url in urls:
        try:
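            response = requests.get(url, timeout=30)
            response.raise_for_status()  # raise an error for non-2xx responses
            # Default to '' so the checks below work when the header is missing
            content_type = response.headers.get('Content-Type', '')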
            
            if 'application/json' in content_type:
                data = response.json()
                parse_json(data)
            elif 'text/csv' in content_type:
                data = StringIO(response.text)
                parse_csv(data)
            else:
                print(f"Unsupported content type: {content_type}")

        except requests.exceptions.RequestException as e:
            print(f"Error fetching file from {url}: {str(e)}")

def parse_json(data):
    # Example: extract a value from each JSON item
    for item in data:
        print(item.get('key_of_interest'))

def parse_csv(data):
    # Example: extract a column from each CSV row
    reader = csv.DictReader(data)
    for row in reader:
        print(row['column_of_interest'])

# Initialize and configure the scheduler: run the job every day at 13:00
scheduler = BackgroundScheduler()
scheduler.add_job(func=fetch_and_parse_remote_files, trigger='cron', hour=13, minute=0)
scheduler.start()

# Make sure the scheduler is running when requests arrive.
# before_first_request was removed in Flask 2.3, so before_request is used here instead.
@app.before_request
def init_scheduler():
    if not scheduler.running:
        scheduler.start()

@app.route('/')
def index():
    return "Flask Service Running"

if __name__ == '__main__':
    app.run()

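The first version of the app needs to be modified so that the parsed data can be shown on a web page; see the steps below.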

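Step 1: Modify the Flask application logic
Create a global variable or cache to hold the parsed results:

A dictionary or list can be used to store the parsed data.
Create a new route to display the parsed data:

Add a new route, /show_data, that returns the contents of the JSON data.
Code example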
from flask import Flask, jsonify
from apscheduler.schedulers.background import BackgroundScheduler
import requests
import json
import csv
from io import StringIO

app = Flask(__name__)

# Holds the parsed data
parsed_data = {}

def fetch_and_parse_remote_files():
    urls = [
        'http://example.com/file1.json',
        'http://example.com/file2.csv',
        # add more URLs here
    ]
    
    for url in urls:
        try:
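            response = requests.get(url, timeout=30)
            response.raise_for_status()
            # Default to '' so the checks below work when the header is missing
            content_type = response.headers.get('Content-Type', '')
            
            if 'application/json' in content_type:
                data = response.json()
                # Store the data for this specific URL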
                if url == 'http://example.com/file1.json':
                    parsed_data['file1'] = parse_json(data)
            elif 'text/csv' in content_type:
                data = StringIO(response.text)
                parse_csv(data)
            else:
                print(f"Unsupported content type: {content_type}")

        except requests.exceptions.RequestException as e:
            print(f"Error fetching file from {url}: {str(e)}")

def parse_json(data):
    extracted_data = []
    for item in data:
        extracted_data.append(item.get('key_of_interest', 'Key not found'))
    return extracted_data

def parse_csv(data):
    reader = csv.DictReader(data)
    for row in reader:
        print(row['column_of_interest'])

scheduler = BackgroundScheduler()
scheduler.add_job(func=fetch_and_parse_remote_files, trigger='cron', hour=13, minute=0)
scheduler.start()

# before_first_request was removed in Flask 2.3, so before_request is used here instead
@app.before_request
def init_scheduler():
    if not scheduler.running:
        scheduler.start()

@app.route('/')
def index():
    return "Flask Service Running"

@app.route('/show_data')
def show_data():
    # Return the stored JSON data
    return jsonify(parsed_data.get('file1', 'No data found'))

if __name__ == '__main__':
    app.run()
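Notes
Global data store:

The parsed_data dictionary holds the data parsed by the scheduled job, so that web requests can read and return it.
New route (/show_data):

The /show_data route returns the data extracted from 'http://example.com/file1.json'.
The jsonify function converts the data into a JSON response that the browser can display.
Scheduler:

The job refreshes parsed_data at 13:00 every day so that the page shows the latest data.
With this approach, the site serves the information parsed from the remote files each day and lets users view it in a browser. Adjust the data structure and key names to fit your needs.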
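
Because BackgroundScheduler runs the job in a background thread while request handlers read parsed_data, one optional refinement is to guard the dictionary with a lock and record when it was last refreshed. The following is a minimal standard-library sketch of that idea and is not part of the original app; the names DataCache, update, read, and updated_at are hypothetical.

```python
import threading
from datetime import datetime


class DataCache:
    """A small thread-safe store for parsed results (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}
        self._updated_at = None

    def update(self, key, value):
        # Called from the scheduled job: store the new value under the lock
        with self._lock:
            self._data[key] = value
            self._updated_at = datetime.now()

    def read(self, key, default='No data found'):
        # Called from a route: return the stored value or a fallback
        with self._lock:
            return self._data.get(key, default)

    def updated_at(self):
        # Time of the most recent update, or None if nothing was stored yet
        with self._lock:
            return self._updated_at


cache = DataCache()
cache.update('file1', ['a', 'b'])
print(cache.read('file1'))
print(cache.read('file2'))
```

In the app, fetch_and_parse_remote_files would call cache.update('file1', ...) and show_data would return jsonify(cache.read('file1')).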

3) Important: a version in which the Flask app correctly reads the remote files and parses each one according to its format

# -*- coding: utf-8 -*-
from flask import Flask, render_template
from markupsafe import escape
from flask import url_for,jsonify
import pickle
from io import BytesIO
import base64

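# Module name: the Python module (file) that contains the Flask application instance, without the .py suffix. For app.py, the module name is app.
# Instance name: the name of the variable that holds the Flask application in that module. With app = Flask(__name__), the instance name is app.
app = Flask(__name__)
# Holds the parsed data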
parsed_data = {}

import requests
import json
import csv
from io import StringIO

def fetch_and_parse_remote_files(urls):
    for url in urls:
        try:
            response = requests.get(url, timeout=30)
            response.raise_for_status()  # raise an error for non-2xx responses
            # Default to '' so the checks below work when the header is missing
            content_type = response.headers.get('Content-Type', '')
            print('url=', url, 'content_type=', content_type)

            if 'application/json' in content_type:
                # The file holds one JSON object per line, so parse it line by line
                lines = response.text.strip().split("\n")
                parsed_data[url] = parse_line_json(lines)
            elif 'text/plain' in content_type:
                # The file is a pickle byte stream: take the raw bytes from
                # response.content (no base64 decoding) and wrap them in BytesIO
                data_stream = BytesIO(response.content)
                # Deserialize with pickle; only do this for trusted sources,
                # because unpickling can execute arbitrary code
                loaded_data = pickle.load(data_stream)
                parsed_data[url] = loaded_data
                print("len(loaded_data)=", len(loaded_data))
            # Other formats can be handled here
        except Exception as e:
            print(f"Error fetching file from {url}: {str(e)}")

def parse_line_json(data):
    # Each line is a JSON object; merge all of them into one dictionary
    print("data[:5]=", data[:5])
    outdata = {}
    for line in data:
        if line.strip():  # skip blank lines
            outdata.update(json.loads(line))
    print("parsed len(outdata) = ", len(outdata))
    return outdata

def parse_csv(data):
    # Example: read the CSV rows into a list of dictionaries
    reader = csv.DictReader(data)
    return list(reader)

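from apscheduler.schedulers.background import BackgroundScheduler

from datetime import datetime

base_url = "xxxxxx"

def build_urls():
    # Build the list at call time so the dated file name follows the current day
    date_str = datetime.now().strftime("%Y%m%d")
    outdata_url = f"{base_url}{date_str}.json"
    return [
        #'http://example.com/file1.json',
        #'http://example.com/file2.csv',
        outdata_url,
        'XXvXX',
        # add more remote URLs here
    ]

# List of URLs to fetch
urls_to_fetch = build_urls()

def refresh_remote_files():
    # Update the list in place (the routes read urls_to_fetch), then fetch
    urls_to_fetch[:] = build_urls()
    fetch_and_parse_remote_files(urls_to_fetch)

# Load the data once at startup
if not parsed_data:
    refresh_remote_files()

# Initialize and configure the scheduler: refresh every day at 13:00
scheduler = BackgroundScheduler()
scheduler.add_job(func=refresh_remote_files, trigger="cron", hour=13, minute=0)
scheduler.start()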

# Make sure the scheduler is running when requests arrive
# (before_first_request was removed in Flask 2.3, so before_request is used)
@app.before_request
def init_scheduler():
    if not scheduler.running:
        scheduler.start()

name = 'Grey Li'
movies = [
    {'title': 'My Neighbor Totoro', 'year': '1988'},
    {'title': 'Dead Poets Society', 'year': '1989'},
    {'title': 'A Perfect World', 'year': '1993'},
    {'title': 'Leon', 'year': '1994'},
    {'title': 'Mahjong', 'year': '1996'},
    {'title': 'Swallowtail Butterfly', 'year': '1996'},
    {'title': 'King of Comedy', 'year': '1999'},
    {'title': 'Devils on the Doorstep', 'year': '1999'},
    {'title': 'WALL-E', 'year': '2008'},
    {'title': 'The Pork of Music', 'year': '2012'},
]

@app.route('/')
def index():
    return render_template('index.html', name=name, movies=movies)

@app.route('/test/day=<day>')
def hello(day):
    return f'Welcome to My Watchlist! in: {escape(day)}'

@app.route('/test_url')
def test_url_for():
    print(url_for('hello', day='2025-07-07'))
    print(url_for('test_url_for'))
    return 'Test url' + '      \n' + url_for('test_url_for') + '    \n' + url_for('hello', day='2025-07-07')

@app.route('/get_remote_file/query=<query>')
def get_remote_file(query):
    t = urls_to_fetch[0]
    t_data = parsed_data.get(t, 'No data found')
    print('len(t_data)=', len(t_data))
    if len(t_data) < 30:
        print(" t_data=   ", t_data)
    print('len(parsed_data)=', len(parsed_data), 't=', t)
    if isinstance(t_data, str):
        res = t_data[:22]
    else:
        res = t_data.get(query, "No intent res")
    out = {"query": query, 'intent_res': res, "testdateQ": 'test'}
    # Look up the query in the intermediate result files
    mid_res = {}
    for url in urls_to_fetch[1:]:
        mid_file_res = parsed_data.get(url, 'No data found')
        mid_k = url.split('/v5/')[-1]
        if isinstance(mid_file_res, str):
            # The file was not loaded, so there is nothing to look up
            mid_res[mid_k] = "No mid res"
        else:
            mid_res[mid_k] = mid_file_res.get(query, "No mid res")
    out.update(mid_res)
    return jsonify(out)

if __name__ == '__main__':
    # To make the Flask server listen on a specific address (for example, to accept external requests), set the host argument of app.run().
    # host='0.0.0.0': listen on all available network interfaces, so other machines can reach the app through this machine's IP address.
    # port=9999: run the app on port 9999.
    #app.run(host='0.0.0.0', port=9999, debug=True)
    app.run(host='0.0.0.0', port=9999)
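
The two parsing paths in the listing above can be tried without a network connection. The sketch below feeds in-memory sample data (invented for illustration) through the same steps: merging line-delimited JSON objects into one dictionary, and loading a pickle byte stream through BytesIO. The function names merge_json_lines and load_pickle_bytes are hypothetical and do not appear in the original app.

```python
import json
import pickle
from io import BytesIO


def merge_json_lines(text):
    """Merge text with one JSON object per line into a single dict."""
    merged = {}
    for line in text.strip().split("\n"):
        if line.strip():  # skip blank lines
            merged.update(json.loads(line))
    return merged


def load_pickle_bytes(raw):
    """Load a pickled object from raw bytes (use only with trusted data)."""
    return pickle.load(BytesIO(raw))


# Sample data made up for this sketch
sample_text = '{"q1": "intent_a"}\n{"q2": "intent_b"}\n'
sample_bytes = pickle.dumps({"q1": [1, 2, 3]})

json_result = merge_json_lines(sample_text)
pickle_result = load_pickle_bytes(sample_bytes)

# Look up a query the same way the /get_remote_file route does
print(json_result.get("q1", "No intent res"))
print(pickle_result.get("q9", "No mid res"))
```

Keeping the parsing functions free of network calls like this makes it easier to check them separately from the download step.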