Building an Autonomous Agent with an LLM to Control Your IoT Devices

Original article, last modified 2023-09-05 12:07:13


Licensed under CC 4.0 BY-SA

Article tags: #Artificial Intelligence

First published 2023-09-05 11:54:58

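This article discusses recent progress in AI, in particular the potential of large models for chain-of-thought reasoning and autonomous agents. It introduces the langchain and promptulate frameworks and shows how to use them to control IoT devices: by building a tool module and an agent that makes its own decisions, it illustrates how large models can be applied to the Internet of Things and what may become possible.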

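AI has had a remarkable year. Major vendors have released their own large models one after another, and the experience is striking: the models handle writing and programming with ease, hold a vast amount of knowledge, and can produce highly specialized content. Still, I think the most noteworthy thing about large models may not be generation but the ability known as chain of thought (CoT), an emergent ability that only large models exhibit. With CoT and some prompting techniques, it becomes possible to build autonomous agents whose decisions are made by the LLM itself.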

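langchain is one of the most noteworthy frameworks for LLM development. It offers an approach for combining external tools with an LLM, and it uses embeddings and vector databases to work around the limited context window. langchain has its own problems, though: the code is rather verbose, which hurts the programming experience considerably. Things only clicked for me when I read the promptulate framework by zeedland. promptulate is not yet mature, but for an open-source developer that is not a big problem, so I kept filing issues and PRs with the promptulate AI community. The community is very friendly and has accepted and merged several of my PRs, including the proposal to support ERNIE Bot (文心一言) and proposals for various tools. For an IoT engineer, however, the more pressing goal is to bring LLM capabilities into one's own field, so I proposed building an autonomous agent that controls IoT devices. To my delight, the community accepted the proposal. Below I describe how I implemented it. (The proposal uses the ERNIE Bot model, which has now been officially released.)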

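Under promptulate's design, controlling an IoT device with an LLM starts with creating the corresponding tool module. MQTT is one of the most widely used protocols for IoT devices, so I used the paho library to quickly build an api_wrapper that publishes control messages to devices. What gets sent to an IoT device is mainly either a switch (on/off) value or a numeric value. Numeric values are harder to handle, so for this proposal I left numeric control out for now and built a tool for switch values only.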

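The switch tool has three parts. The first is the communication module, the api_wrapper. It is very simple: it just uses the paho library to publish a message over MQTT.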

import paho.mqtt.client as mqtt


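class IotSwitchAPIWrapper:
    """Thin wrapper that publishes a control command over MQTT."""

    def run(self, client: mqtt.Client, topic: str, command: str) -> str:
        # Publish the command to the device topic; the publish result is not checked here.
        client.publish(topic, command)
        return "ok"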
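To make the wrapper's contract concrete without a broker, here is a minimal sketch that swaps the real paho client for a stand-in. `RecordingClient` is a hypothetical class invented for this illustration (it is not part of paho or promptulate), and the wrapper is repeated only so the snippet runs on its own.

```python
from typing import List, Tuple


class RecordingClient:
    """Hypothetical stand-in for paho's mqtt.Client; it only records publishes."""

    def __init__(self) -> None:
        self.published: List[Tuple[str, str]] = []

    def publish(self, topic: str, payload: str) -> None:
        # A real client would send the payload to the broker here.
        self.published.append((topic, payload))


class IotSwitchAPIWrapper:
    """Same shape as the wrapper in the article, repeated to keep this sketch self-contained."""

    def run(self, client, topic: str, command: str) -> str:
        client.publish(topic, command)
        return "ok"


fake_client = RecordingClient()
result = IotSwitchAPIWrapper().run(fake_client, "/123", "open light")
print(result, fake_client.published)
```

The wrapper returns "ok" as soon as `publish` has been called, so the return value says the message was handed to the client, not that a device acted on it.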

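The second part is the mapping table and the description: the description tells the LLM clearly what the tool does, and the table maps natural language to operating commands. Natural language can phrase the same request in many ways, so traditional conditionals and algorithms struggle to identify the user's intent reliably. With a large language model that judgment becomes very simple, which is where the third part comes in: building a suitable prompt for it.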

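The third part is the prompt itself. It uses role-playing and formatted-output techniques so that the large language model determines the user's intent and returns the matching number. promptulate's string-template component makes this part very easy to write.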

The core code for the second and third parts follows.

import re
from typing import Optional, Dict, List

from pydantic import Field
import paho.mqtt.client as mqtt

from promptulate.llms import ErnieBot, BaseLLM
from promptulate.tools import BaseTool
from promptulate.tools.iot_swith_mqtt.api_wrapper import IotSwitchAPIWrapper
from promptulate.tools.iot_swith_mqtt.prompt import prompt_template
from promptulate.utils import StringTemplate, get_logger

logger = get_logger()


class IotSwitchTool(BaseTool):
    """A tool that switches IoT devices over MQTT based on the user's intent."""

    name: str = "Iot_Switch_Mqtt"
    description: str = (
        "An IoT tool used for switching operations on various devices"
        "This tool uses the mqtt protocol for transmission"
        "The input content is the intention or command to open the specified electrical appliance"
        "If the operation of the device is successful, an OK will be returned, otherwise a failure will be returned"
    )
    llm_prompt_template: StringTemplate = Field(default=prompt_template)
    llm: BaseLLM = Field(
        default=ErnieBot(temperature=0.3)
    )
    client: mqtt.Client
    rule_table: List[Dict]
    api_wrapper: IotSwitchAPIWrapper = Field(default_factory=IotSwitchAPIWrapper)

    def _run(self, question: str, *args, **kwargs) -> str:
        if len(self.rule_table) == 0:
            raise Exception("rule_table is empty")
        # Build the numbered intent list shown to the LLM, one "N.content" line per rule.
        key = ""
        for index, rule in enumerate(self.rule_table, start=1):
            key += f"{index}.{rule['content']}\n"
        prompt = self.llm_prompt_template.format(question=question, rule_key=key)
        logger.debug(prompt)
        llm_output = self.llm(prompt)
        return self._process_llm_result(llm_output)

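    def _process_llm_result(self, llm_output: str) -> str:
        # The prompt asks the LLM to reply with a 1-based rule number only.
        answer = re.findall(r"\d+", llm_output)
        if len(answer) == 0:
            return "failure information :" + llm_output
        index = int(answer[0]) - 1
        # Guard against a number that does not correspond to any rule.
        if not 0 <= index < len(self.rule_table):
            return "failure information :" + llm_output
        rule = self.rule_table[index]
        self.api_wrapper.run(self.client, rule["topic"], rule["ask"])
        return "ok"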

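# promptulate/tools/iot_swith_mqtt/prompt.py (module path taken from the import above)
from promptulate.utils.string_template import StringTemplate

# English gloss of the prompt: "You are now a smart speaker. The user says
# "{question}". Decide whether it matches one of the intents below: {rule_key}
# If it matches, answer with the number only, e.g. 1, with no reasoning or
# extra explanation. If none match, say that no matching appliance was found
# and why, and do not output any digits."
PROMPT_TEMPLATE = '''
现在你是一个智能音箱，用户将向你输入“{question}”，
请判断用户是否是以下意图
{rule_key}
如果符合你只需要回答数字标号，如1，请不要输出你的判断和额外的解释。
如果都不符合，你需要输出无法找到对应电器和对应的原因，请不要输出任何数字。
'''
prompt_template = StringTemplate(PROMPT_TEMPLATE)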
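The mapping step can be tried in isolation, without promptulate or a real model. The sketch below is an illustration under stated assumptions: `build_rule_key`, `resolve_rule`, and `fake_llm` are hypothetical helper names, the rule table is borrowed from the agent script later in this article, and the bounds check is an added safeguard.

```python
import re
from typing import Dict, List, Optional

# Rule table borrowed from the agent script later in the article.
RULE_TABLE: List[Dict[str, str]] = [
    {"content": "开空调", "topic": "/123", "ask": "open conditioner"},
    {"content": "开加热器", "topic": "/123", "ask": "open heater"},
    {"content": "开灯", "topic": "/123", "ask": "open light"},
]


def build_rule_key(rule_table: List[Dict[str, str]]) -> str:
    """Render the rules as a numbered list, one 'N.content' line per rule."""
    return "".join(
        f"{i}.{rule['content']}\n" for i, rule in enumerate(rule_table, start=1)
    )


def resolve_rule(
    llm_output: str, rule_table: List[Dict[str, str]]
) -> Optional[Dict[str, str]]:
    """Map the first number in the LLM reply to a rule, or None if there is no valid one."""
    numbers = re.findall(r"\d+", llm_output)
    if not numbers:
        return None
    index = int(numbers[0]) - 1  # the list shown to the LLM is 1-based
    if not 0 <= index < len(rule_table):
        return None
    return rule_table[index]


def fake_llm(prompt: str) -> str:
    """Hypothetical stub that pretends the model chose intent 1."""
    return "1"


rule_key = build_rule_key(RULE_TABLE)
chosen = resolve_rule(fake_llm(rule_key), RULE_TABLE)
print(rule_key)
print(chosen)
```

Because the parser takes the first run of digits it finds, the instruction in the prompt not to output any digits when nothing matches is what keeps a refusal from being mistaken for a rule number.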

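At this point we have a complete IoT switch-control tool that can operate the right appliance based on the user's intent. That is still far from enough, though. What we want is for the LLM to make decisions on its own based on the environment, for example choosing a suitable smart device for me given current conditions. It starts to feel a little like science fiction.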

So how do we build the agent? This takes a prompting technique called ReAct.

paper: https://arxiv.org/pdf/2210.03629.pdf

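ReAct is short for Reasoning and Acting. The basic idea is to give the model a prompt that breaks the Question down into several steps:

Thought: reason about what the next step should be.
Action: carry out an action, deciding which tool to use. The special action Finish[answer] completes the current task with the answer.
Observation: the result returned by the external tool, which is fed back to the LLM (ChatGPT, for example) as new prompt input.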
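The Thought / Action / Observation cycle can be sketched in a few lines. This is a toy illustration, not promptulate's implementation: `scripted_llm` is a hypothetical stub that returns canned replies, the `weather` tool and its fixed "33 C" reply are made up, and the `name[argument]` action syntax follows the style used in the ReAct paper.

```python
import re
from typing import Callable, Dict


def scripted_llm(prompt: str) -> str:
    """Hypothetical stub standing in for a real model; replies depend on the prompt so far."""
    if "Observation:" not in prompt:
        return "Thought: I should check the temperature.\nAction: weather[Xinhui]"
    return "Thought: It is hot, I can answer now.\nAction: Finish[turn on the air conditioner]"


# Hypothetical tool registry; the weather tool returns a made-up fixed reading.
TOOLS: Dict[str, Callable[[str], str]] = {
    "weather": lambda city: f"{city}: 33 C",
}

ACTION_PATTERN = re.compile(r"Action:\s*(\w+)\[(.*)\]")


def react(
    question: str,
    llm: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> str:
    """Run the Thought/Action/Observation loop until Finish[...] or the step limit."""
    prompt = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(prompt)  # the model emits a Thought and an Action
        prompt += reply + "\n"
        match = ACTION_PATTERN.search(reply)
        if match is None:
            return "failed: no action found"
        name, argument = match.group(1), match.group(2)
        if name == "Finish":
            return argument  # Finish[answer] ends the task
        tool = tools.get(name)
        observation = tool(argument) if tool else f"unknown tool {name}"
        prompt += f"Observation: {observation}\n"  # feed the result back to the model
    return "failed: step limit reached"


answer = react("Which appliance should be on in Xinhui?", scripted_llm, TOOLS)
print(answer)
```

The step limit matters in practice: a model that never emits Finish would otherwise loop forever, so the sketch gives up after `max_steps` rounds.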

This looks fairly complicated, but fortunately promptulate already wraps all of it, so a few dozen lines of code are enough to run the whole process.

import paho.mqtt.client as mqtt
from promptulate.agents import ToolAgent
from promptulate.llms import ErnieBot
from promptulate.tools import (
    Calculator,
    DuckDuckGoTool,
    SleepTool,
)
from promptulate.tools.iot_swith_mqtt import IotSwitchTool
from promptulate.utils.logger import enable_log

enable_log()


def main():
    # MQTT broker address and port
    broker_address = "XXX"
    broker_port = 1883
    # username and password
    username = "XXX"
    password = "XXXXX"
    # Written for paho-mqtt 1.x; paho-mqtt 2.x adds a callback_api_version argument to Client().
    client = mqtt.Client()
    client.username_pw_set(username, password)
    client.connect(broker_address, broker_port)
    tools = [
        DuckDuckGoTool(),
        Calculator(),
        SleepTool(),
        IotSwitchTool(
            client=client,
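            rule_table=[
                # content: intent shown to the LLM; topic / ask: MQTT topic and payload to publish.
                {"content": "开空调", "topic": "/123", "ask": "open conditioner"},  # turn on the air conditioner
                {"content": "开加热器", "topic": "/123", "ask": "open heater"},  # turn on the heater
                {"content": "开灯", "topic": "/123", "ask": "open light"},  # turn on the light
            ],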
        ),
    ]
    llm = ErnieBot()
    agent = ToolAgent(tools, llm=llm)
    # "Look up the current temperature in Xinhui and turn on the appropriate appliance for it."
    prompt = """查询现在新会的气温，并根据现在的气温开相应的电气"""
    agent.run(prompt)


if __name__ == "__main__":
    main()

Agent execution result

Answer the following questions as best you can. You have access to the following tools:

ddg-search: A wrapper around DuckDuckgo Web Search. Useful for when you need to answer questions about current events. Input should be a search query.

math-calculator: Useful for when you need to answer questions about math.You input is a naturelanguage of math expression.

sleep: Make agent sleep for a specified number of seconds.Input is a number. eg: Sleep for 5s and enter 5

Iot_Switch_Mqtt: An IoT tool used for switching operations on various devicesThis tool uses the mqtt protocol for transmissionThe input content is the intention or command to open the specified electrical applianceIf the operation of the device is successful, an OK will be returned, otherwise a failure will be returned


Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, must be one of [ddg-search, math-calculator, sleep, Iot_Switch_Mqtt]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: 查询现在新会的气温，并根据现在的气温开相应的电气
Thought:
I need to query the current temperature in Xinhui and switch the appropriate electrical appliance based on the temperature.

Action: ddg-search
Action Input: "现在新会的气温"

Observation: 中国气象局提供权威的天气预报 ... 新会; 国内. 国内; 国外 ... 7天天气预报（2023/08/30 20:00发布） 星期三 08/30 多云 无持续风向 微风 33℃ 26℃ 多云 无持续风向 ... 31 ~ 26℃ 湿度：89% 气压：99.94千帕 紫外线：无 日出日落: 06:08 ~ 18:52 新会天气预报：新会今天白天天气中雨,晚间天气小雨,最高温度31 ℃ ,最低温度26 ℃ 。 空气：36 优 新会天气更新于 2023-08-25 19:09 新会天气预报一周 新会天气预报15天 >> 08月25日 星期五 中雨 08月26日 星期六 小雨 08月27日 星期日 大雨 08月28日 星期一 中雨 08月29日 星期二 阴 08月30日 星期三 多云 08月31日 星期四 多云 优 优 良 良 良 良 优 新会24小时天气预报 19点 阴 27℃ 优 20点 阴 27℃ 优 21点 小雨 27℃ 优 22点 小雨 27℃ 优 23点 小雨 27℃ 优 提供【新会天气预报】,一周,15天查询,40天查询,气象云图,空气质量,天气实况,温度,降水,风力,气压,紫外线等及时准确发布,是中央气象台唯一权威天气发布平台,出行的好伴侣 新会明天天气 新会后天天气 新会天气预报7天(一周) 新会天气预报10天 新会天气预报15天 新会天气预报30天 新会天气预报40天 新会天气预报60天; 新会市气象局04日(星期二) 7时预计，新会今日天气雷阵雨转多云，最低气温26℃，最高气温33℃，无持续风向3级，天气 ... 新会区 天气 30天预报 降水 空气质量 灾害预警 生活指数 历史数据 2023-08-19 21:26 28° 阴 AQI 优 今晚中雨。 明天大雨，温度和今天差不多（32°），空气不错。 2级 西北风 83% 相对湿度 弱 紫外线 30° 体感温度 26km 能见度 0.0mm 降水量 1002hPa 大气压 24小时预报 10pm 0am 2am 4am 6am 8am 10am 12am 2pm 4pm
Thought: Based on the temperature, I need to switch the appropriate electrical appliance.

Action: Iot_Switch_Mqtt
Action Input: "把客厅的空调打开"


Observation: ok
Thought: Final Answer: The final answer to the original input question is not provided.

[DEBUG] 2023-09-05 11:40:18 [pne hook] hooks <('Agent', 'on_agent_result')> mounted_obj <ToolAgent> call hook: <[BaseHookSchema(hook_name='on_agent_result', callback=<function StdOutHook.handle_agent_result at 0x00000180F0B14CA0>, component_type='Agent')]>
[Agent Result]  The final answer to the original input question is not provided.
Agent End.

Process finished with exit code 0

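As the log shows, the air conditioner was switched on successfully, even though the agent's Final Answer reads "not provided". This reflects the strong logical reasoning ability of large models and opens up a new possibility for AIoT. Current LLM technology still has many shortcomings, unstable reasoning among them, and these stand in the way of full deployment, but I believe fully autonomous agents remain a direction worth watching.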

The complete code used in this article is in this GitHub repository: https://github.com/Undertone0809/promptulate

You are welcome to join the 创万联 community and the promptulate AI community to discuss the future of AIoT together.