Introduction
Structured Query Language (SQL) is the standard language for managing and querying databases. With LangChain, you can turn natural language into SQL queries, which enables you to:
- Generate SQL queries from natural-language questions
- Build chatbots that answer questions from the data in a database
- Build custom dashboards around users' analytical needs
LangChain provides a set of tools for interacting with SQL databases:
- Generating SQL queries from natural language
- Building chains that create and execute queries against a SQL database
- Using agents for flexible, multi-step SQL interactions
Install dependencies

```python
# !pip install langchain langchain-experimental openai
```
Initialize LangChain and load the API keys

```python
import os
from dotenv import load_dotenv
from langchain.llms import OpenAI

load_dotenv()
api_key = os.getenv("OPEN_AI_KEY")
api_organization = os.getenv("OPEN_AI_ORG")

llm = OpenAI(
    api_key=api_key,
    openai_organization=api_organization,
    temperature=0,
    verbose=True,
)
```
Case 1: Converting natural language to SQL (Text-to-SQL)

```python
from langchain.prompts.prompt import PromptTemplate

PROMPT_SUFFIX = """Only use the following tables:
{table_info}

Question: {input}"""

_DEFAULT_TEMPLATE = """Given an input question, first create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer.
Use the following format:

Question: Question here
SQLQuery: SQL Query to run
SQLResult: Result of the SQLQuery
Answer: Final answer here"""

# Only the variables that actually appear in the template belong here;
# the template above does not reference top_k.
PROMPT = PromptTemplate(
    input_variables=["input", "table_info", "dialect"],
    template=_DEFAULT_TEMPLATE + PROMPT_SUFFIX,
)
```
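To see what the assembled prompt looks like after substitution, here is a plain-Python sketch of what `PromptTemplate` does with those variables (no LangChain needed; the `table_info` value below is a made-up example, not real Chinook schema output):

```python
# Plain-Python sketch of the substitution PromptTemplate performs:
# {dialect}, {table_info}, and {input} are filled in before the prompt
# reaches the LLM.
template = (
    "Given an input question, first create a syntactically correct "
    "{dialect} query to run, then look at the results of the query "
    "and return the answer.\n"
    "Use the following format:\n\n"
    "Question: Question here\n"
    "SQLQuery: SQL Query to run\n"
    "SQLResult: Result of the SQLQuery\n"
    "Answer: Final answer here\n"
    "Only use the following tables:\n{table_info}\n\n"
    "Question: {input}"
)

prompt_text = template.format(
    dialect="sqlite",
    table_info="CREATE TABLE Employee (EmployeeId INTEGER, LastName TEXT)",
    input="How many employees are there?",
)
print(prompt_text)
```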
Connect to the database and generate a query

```python
from langchain.chains import create_sql_query_chain
from langchain.utilities import SQLDatabase

db = SQLDatabase.from_uri("sqlite:///./docs/sql/Chinook.db")
chain = create_sql_query_chain(llm=llm, db=db)
response = chain.invoke({"question": "How many employees are there"})
print(response)
```
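Note that `create_sql_query_chain` only generates the SQL string; it does not execute it. To run the statement you hand it to the database yourself. A minimal sketch with the stdlib `sqlite3` module, using a tiny in-memory stand-in for Chinook (the table contents and the "generated" SQL here are assumptions for illustration):

```python
import sqlite3

# In-memory stand-in for the Chinook Employee table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (EmployeeId INTEGER PRIMARY KEY, LastName TEXT)")
conn.executemany(
    "INSERT INTO Employee (LastName) VALUES (?)",
    [("Adams",), ("Edwards",), ("Peacock",)],
)

# Pretend this string came back from the chain, then execute it by hand.
generated_sql = "SELECT COUNT(*) FROM Employee;"
count = conn.execute(generated_sql).fetchone()[0]
print(count)  # → 3
```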
Case 2: Optimizing SQL query execution
SQLDatabaseChain can improve query quality and performance by:
- Adding sample rows from the tables to the prompt
- Customizing the table info
- Using the Query Checker to auto-correct faulty queries
- Customizing the LLM prompt
- Returning the intermediate steps of the chain
- Limiting the number of rows a query returns
```python
from langchain_experimental.sql import SQLDatabaseChain

db_chain = SQLDatabaseChain.from_llm(
    llm=llm,
    db=db,
    verbose=True,
)
db_chain.run("How many employees are there?")
```
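Most of the options listed above are keyword arguments on `SQLDatabase` and `SQLDatabaseChain`. The configuration sketch below shows one way to wire them together; parameter availability can vary across `langchain` / `langchain_experimental` releases, so check your installed version before relying on it:

```python
from langchain.utilities import SQLDatabase
from langchain_experimental.sql import SQLDatabaseChain

# Configuration sketch: parameter names follow the SQLDatabase /
# SQLDatabaseChain APIs, but availability may vary by release.
db = SQLDatabase.from_uri(
    "sqlite:///./docs/sql/Chinook.db",
    sample_rows_in_table_info=2,     # include sample rows in the prompt
)
db_chain = SQLDatabaseChain.from_llm(
    llm=llm,
    db=db,
    use_query_checker=True,          # auto-correct faulty SQL before running it
    top_k=5,                         # limit the number of rows returned
    return_intermediate_steps=True,  # expose the generated SQL and raw results
    verbose=True,
)
result = db_chain("How many employees are there?")
print(result["intermediate_steps"])
```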
Case 3: Using a SQL Agent

```python
from langchain.agents import create_sql_agent
from langchain.agents.agent_toolkits import SQLDatabaseToolkit
from langchain.agents.agent_types import AgentType

toolkit = SQLDatabaseToolkit(db=db, llm=llm)
agent_executor = create_sql_agent(
    llm=llm,
    toolkit=toolkit,
    verbose=True,
    agent_type=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
)
```
Example agent tasks
Query total sales per country and find the top-spending country:

```python
agent_executor.run("List the total sales per country. Which country's customers spent the most?")
```

Describe a table's schema:

```python
agent_executor.run("Describe the playlisttrack table")
```
Extending the SQL toolkit
Improving query quality with few-shot examples

```python
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.schema import Document
from langchain.vectorstores import FAISS

embeddings = OpenAIEmbeddings()

few_shots = {
    "List all artists.": "SELECT * FROM artists;",
    "Find all albums for the artist 'AC/DC'.": "SELECT * FROM albums WHERE ArtistId = (SELECT ArtistId FROM artists WHERE Name = 'AC/DC');",
    "List all tracks in the 'Rock' genre.": "SELECT * FROM tracks WHERE GenreId = (SELECT GenreId FROM genres WHERE Name = 'Rock');",
}

few_shot_docs = [
    Document(page_content=question, metadata={"sql_query": few_shots[question]})
    for question in few_shots
]
vector_db = FAISS.from_documents(few_shot_docs, embeddings)
retriever = vector_db.as_retriever()
```
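To make the retrieval step concrete without calling the embedding API, here is a toy stand-in that ranks the stored questions by word overlap. This is an assumption for illustration only: the real retriever ranks by OpenAI embedding similarity, not word overlap.

```python
import re

# Toy stand-in for the FAISS retriever above: score each stored question
# by word overlap with the incoming question and return the SQL attached
# to the best match.
few_shots = {
    "List all artists.": "SELECT * FROM artists;",
    "Find all albums for the artist 'AC/DC'.":
        "SELECT * FROM albums WHERE ArtistId = "
        "(SELECT ArtistId FROM artists WHERE Name = 'AC/DC');",
    "List all tracks in the 'Rock' genre.":
        "SELECT * FROM tracks WHERE GenreId = "
        "(SELECT GenreId FROM genres WHERE Name = 'Rock');",
}

def words(text: str) -> set:
    return set(re.findall(r"\w+", text.lower()))

def most_similar_sql(question: str) -> str:
    best = max(few_shots, key=lambda q: len(words(q) & words(question)))
    return few_shots[best]

print(most_similar_sql("List all artists"))  # → SELECT * FROM artists;
```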
Create a retriever tool

```python
from langchain.agents.agent_toolkits import create_retriever_tool

retriever_tool = create_retriever_tool(
    retriever,
    name="sql_get_similar_examples",
    description="Retrieve similar SQL examples.",
)
```
Run queries with an agent that can use the tool
The agent has to be created with the retriever tool attached before it can use it; `create_sql_agent` accepts custom tools via `extra_tools`:

```python
agent = create_sql_agent(
    llm=llm,
    toolkit=toolkit,
    agent_type=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    extra_tools=[retriever_tool],
    verbose=True,
)
agent.run("How many employees do we have?")
```
Multiple tools working together (Multiple Tools)
Combine database queries with a calculator tool to broaden what the agent can answer.

```python
from langchain.agents import initialize_agent, Tool
from langchain.chains import LLMMathChain
from pydantic import BaseModel, Field

class CalculatorInput(BaseModel):
    question: str = Field()

db_chain = SQLDatabaseChain.from_llm(llm=llm, db=db, verbose=True)
calculator = LLMMathChain.from_llm(llm=llm, verbose=True)

tools = [
    Tool(name="Database", func=db_chain.run, description="Query the database"),
    Tool(
        name="Calculator",
        func=calculator.run,
        description="Perform arithmetic calculations",
        args_schema=CalculatorInput,
    ),
]

agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)
agent.run("How many employees do we have?")
```
If you found this post helpful, please like, bookmark, and follow me; tips are welcome too!
Stay tuned for more posts on artificial intelligence, natural language processing, and computer vision.
Thanks for your support!