Python: Deploying Python Code with FastAPI (Linux Edition)

Gratitute_林腾

Last modified 2025-05-03 09:00:12


First published 2025-03-18 09:36:40

Article link: https://blog.youkuaiyun.com/m0_74462339/article/details/146296787


Preface

This article extends an earlier post, "Python: Deploying Python Code with FastAPI" (优快云 blog).

Steps

Check the Python version

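Check the Python version on the cloud server (if the python command is not available, use python3 --version instead):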

python --version

Python 3.8 or later is recommended, because recent releases of transformers and FastAPI may not support older interpreters.
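
The same check can also be done from inside Python, which is handy at the top of a deployment script. The snippet below is a minimal sketch: the helper name version_ok and the constant MIN_VERSION are illustrative and not part of the original setup, and the threshold simply mirrors the 3.8 recommendation above.

```python
import sys

# Minimum version suggested in this article; adjust if your dependencies need more.
MIN_VERSION = (3, 8)


def version_ok(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    status = "OK" if version_ok() else "too old"
    print(sys.version.split()[0], status)
```

Comparing tuples avoids the pitfalls of comparing version strings, where "3.10" would sort before "3.8".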

Install the `python3-venv` package that matches your Python version

sudo apt update
sudo apt install python3.8-venv

Create a Python virtual environment

mkdir ~/fastapi_server && cd ~/fastapi_server
python3.8 -m venv venv
source venv/bin/activate  # activate the virtual environment

The terminal prompt is now prefixed with (venv), which indicates that the environment is active.
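
If the prompt prefix is hidden (for example in a script or a customized shell), the interpreter itself can confirm whether it is running inside the environment. This is a small sketch that assumes an environment created with python -m venv as above; the function name in_virtualenv is illustrative.

```python
import sys


def in_virtualenv():
    """Return True when running inside a venv-created environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

Run it with the python from the activated shell; sys.executable should then point into ~/fastapi_server/venv/bin.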

Install FastAPI and its dependencies

pip install --upgrade pip  # upgrade pip
pip install fastapi uvicorn torch transformers elasticsearch

If the installation fails because of insufficient disk space, you can rerun it with /root as the temporary directory:

TMPDIR=/root pip install fastapi uvicorn torch transformers elasticsearch --no-cache-dir

TMPDIR=/root only makes pip use /root as its temporary directory during installation; it does not change where torch is finally installed.

--no-cache-dir: disables pip's cache, so downloaded .whl files are not stored in the cache directory (usually ~/.cache/pip). This helps avoid running out of space.
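
Before choosing a temporary directory, it can help to see which filesystem actually has room. The sketch below uses the standard library's shutil.disk_usage; the helper names free_gib and pick_tmpdir, the candidate list, and the 5 GiB threshold are all assumptions made for illustration, not measured requirements of torch.

```python
import shutil


def free_gib(path):
    """Return the free space, in GiB, on the filesystem containing `path`."""
    return shutil.disk_usage(path).free / 2**30


def pick_tmpdir(candidates, needed_gib):
    """Return the first candidate with at least `needed_gib` free, else None."""
    for path in candidates:
        try:
            if free_gib(path) >= needed_gib:
                return path
        except OSError:
            continue  # directory is missing or not accessible
    return None


if __name__ == "__main__":
    # 5 GiB is a placeholder threshold, not a documented requirement.
    print(pick_tmpdir(["/tmp", "/root"], needed_gib=5))
```

The printed directory can then be passed as TMPDIR in the pip command above.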

Create the FastAPI server

In the fastapi_server directory, create app.py:

touch app.py

Then paste in the following code:

from fastapi import FastAPI
from elasticsearch import Elasticsearch
from transformers import BertTokenizer, BertModel
import torch

app = FastAPI()

# Use the GPU if available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the tokenizer and model from a local directory
model_path = "/home/your_user/model/chinese-roberta-wwm-ext-large"
tokenizer = BertTokenizer.from_pretrained(model_path)
model = BertModel.from_pretrained(model_path).to(device)

# Connect to Elasticsearch
es = Elasticsearch(["http://your_ip:9200"])

# Compute the text embedding (hidden state of the first, [CLS], token)
def get_embedding(text):
    encoded_input = tokenizer.encode_plus(
        text, add_special_tokens=True, max_length=300, padding='max_length', truncation=True, return_tensors='pt'
    )
    encoded_input = {key: value.to(device) for key, value in encoded_input.items()}

    with torch.no_grad():
        outputs = model(encoded_input['input_ids'], attention_mask=encoded_input['attention_mask'])

    return outputs.last_hidden_state[:, 0, :].tolist()[0]

# Define the API endpoint
@app.get("/search")
def search_similar(query_text: str, index_name: str = "math_index", top_k: int = 3):
    embedding = get_embedding(query_text)

    query = {
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    "source": "cosineSimilarity(params.queryVector, 'ask_vector') + 1.0",
                    "params": {"queryVector": embedding}
                }
            }
        },
        "size": top_k
    }

    res = es.search(index=index_name, body=query)
    return res["hits"]["hits"]
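
To make the scoring in the query above easier to follow, here is a plain-Python sketch of what the script computes for each document: the cosine similarity between the query vector and the stored ask_vector, shifted by 1.0 so the score falls in the range 0 to 2 (Elasticsearch does not allow negative scores from script_score). The function names and the toy two-dimensional vectors are illustrative only; the real ranking runs inside Elasticsearch.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def script_score(query_vector, doc_vector):
    """Mirror of the script: cosineSimilarity(...) + 1.0, always >= 0."""
    return cosine_similarity(query_vector, doc_vector) + 1.0


def top_k(query_vector, docs, k=3):
    """Rank a dict of {doc_id: vector} by score, highest first."""
    scored = [(script_score(query_vector, vec), doc_id) for doc_id, vec in docs.items()]
    scored.sort(reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:k]]


if __name__ == "__main__":
    docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [-1.0, 0.0]}
    print(top_k([1.0, 0.0], docs, k=2))
```

Because the query wraps match_all, every document in the index is scored this way, so response time grows with index size.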

Run the FastAPI server

Test startup

uvicorn app:app --host 0.0.0.0 --port 8000 --reload

If the uvicorn command is not found, the dependency is not installed; install it with pip:
python3 -m pip install uvicorn

Check that the API is running

Open the following address in your local browser:

http://your_server_ip:8000/docs
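
Besides the interactive docs page, the /search endpoint can be called directly. The sketch below only builds the request URL from the parameters defined in app.py (query_text, index_name, top_k); the helper name build_search_url and the sample query are illustrative, and your_server_ip remains a placeholder.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query_text, index_name="math_index", top_k=3):
    """Build the GET URL for the /search endpoint defined in app.py."""
    params = urlencode(
        {"query_text": query_text, "index_name": index_name, "top_k": top_k}
    )
    return f"{base_url.rstrip('/')}/search?{params}"


if __name__ == "__main__":
    # Replace your_server_ip with the real address before sending the request.
    print(build_search_url("http://your_server_ip:8000", "Pythagorean theorem"))
```

The resulting URL can be opened in a browser or fetched with urllib.request.urlopen; urlencode takes care of escaping spaces and non-ASCII characters in the query text.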

Keep the API running

By default uvicorn runs in the foreground, so the API stops when the terminal is closed. You can use systemd to run it in the background.

Create fastapi.service

sudo nano /etc/systemd/system/fastapi.service

Write the following:

[Unit]
Description=FastAPI Service
After=network.target

[Service]
User=your_user
WorkingDirectory=/home/your_user/fastapi_server
ExecStart=/home/your_user/fastapi_server/venv/bin/uvicorn app:app --host 0.0.0.0 --port 8000
Restart=always

[Install]
WantedBy=multi-user.target

Enable and start the service:

sudo systemctl daemon-reload
sudo systemctl enable fastapi
sudo systemctl start fastapi

Check the status:

sudo systemctl status fastapi

If the output shows Active: active (running), FastAPI is running in the background and will start automatically after the server reboots.
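
As an extra check that does not depend on systemctl output, you can test whether something is accepting connections on the service port. This is a minimal sketch using only the standard library; the function name port_is_open is illustrative, and port 8000 matches the ExecStart line above.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False  # refused, timed out, or host unreachable


if __name__ == "__main__":
    print("port 8000 open:", port_is_open("127.0.0.1", 8000))
```

A result of True only shows that some process is listening on the port; opening /docs confirms that it is actually the FastAPI app.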