AI-ChatBot-0. Knowledge Framework

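This article discusses the principles of building dialogue systems, including the use of open-domain and closed-domain knowledge, a consistent language persona, model evaluation, and improving response diversity. It also lists related academic papers and gives examples of problems met in practice, such as the different interaction styles of voice assistants and installing and debugging PyAudio under a specific Python environment.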


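A dialogue system generally involves two kinds of knowledge: open-domain (broad) knowledge, as in casual chit-chat, and closed-domain (specialized) knowledge, as in IT question answering.

Context

Linguistic context: what is this sentence about? (This involves embedding the language, for example with word vectors.)

Physical context: where was this sentence said? (This involves the physical environment, such as the location and the current time.)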
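The linguistic context is often captured by mapping words to vectors and comparing them. A minimal sketch, assuming a tiny hand-written table (the values in `TOY_VECTORS` are invented for illustration and are not trained embeddings) and a hypothetical helper `embed_sentence` that simply averages word vectors:

```python
import math

# Hypothetical 3-dimensional word vectors; real systems use trained
# embeddings with far more dimensions.
TOY_VECTORS = {
    "hello": [0.9, 0.1, 0.0],
    "hi": [0.8, 0.2, 0.0],
    "python": [0.0, 0.1, 0.9],
    "code": [0.1, 0.0, 0.8],
}

def embed_sentence(sentence):
    """Average the vectors of known words; unknown words are skipped."""
    vectors = [TOY_VECTORS[w] for w in sentence.lower().split() if w in TOY_VECTORS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)

greeting = embed_sentence("hello")
print(cosine(greeting, embed_sentence("hi")))           # close to 1: similar meaning
print(cosine(greeting, embed_sentence("python code")))  # close to 0: unrelated topic
```

With these toy values, the greeting is much closer to "hi" than to "python code", which is the kind of signal a dialogue model can use to decide what a sentence is about.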

 

Related papers

Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models (Serban et al., 2015)

https://arxiv.org/abs/1507.04808

Attention with Intention for a Neural Network Conversation Model (Yao et al., 2015)

https://arxiv.org/abs/1510.08565

 

A consistent language persona

Related paper: A Persona-Based Neural Conversation Model (Li et al., 2016)

https://arxiv.org/abs/1603.06155

 

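Model evaluation

Some tricky cases:

1. Judging whether the model is right or wrong itself requires human interpretation:

For example, you tell Amazon's Alexa "I want to go to sleep" and Alexa adjusts the lights for you, but you cannot say that this action is definitely correct.

Related paper

How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation (Liu et al., 2016)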

https://arxiv.org/abs/1603.08023
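The paper above reports that word-overlap metrics correlate weakly with human judgments of dialogue responses. A minimal sketch of why that can happen, using a simplified unigram-precision score (the function `unigram_precision` and the example sentences are assumptions made for illustration; this is not BLEU itself):

```python
from collections import Counter

def unigram_precision(candidate, reference):
    """Fraction of candidate words that also appear in the reference (clipped counts)."""
    cand = candidate.lower().split()
    ref_counts = Counter(reference.lower().split())
    if not cand:
        return 0.0
    overlap = sum(min(count, ref_counts[word]) for word, count in Counter(cand).items())
    return overlap / len(cand)

reference = "i am going to bed now"
print(unigram_precision("i am going to bed now", reference))        # 1.0
print(unigram_precision("good night see you tomorrow", reference))  # 0.0
```

The second reply is a perfectly reasonable thing to say, yet it scores zero because it shares no words with the reference. A human judge would rate it far higher than the metric does.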

 

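Diversity

Have you eaten?

-Mm-hm-

How's the weather today?

-Mm-hm-

Where are we going tomorrow?

-Mm-hm-

Is something wrong with you?

-Mm-hm-

This happens because a learned model may discover that answering "mm-hm" is judged correct about 99% of the time, so it neglects diversity.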
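One way to put a number on this problem is a distinct-n style score, in the spirit of the measure reported in the diversity paper cited in this section. The sketch below is an assumption-laden simplification: the helper `distinct_n` is hypothetical, tokenization is plain whitespace splitting, and the ratio is taken over all n-grams in the responses.

```python
def distinct_n(responses, n=1):
    """Ratio of unique n-grams to total n-grams across all responses."""
    ngrams = []
    for response in responses:
        tokens = response.lower().split()
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)

dull = ["mm-hm", "mm-hm", "mm-hm", "mm-hm"]
varied = ["yes just ate", "sunny and warm", "maybe the park", "i am fine"]
print(distinct_n(dull))    # 0.25
print(distinct_n(varied))  # 1.0
```

A model that always answers "mm-hm" gets a low score, while varied replies score close to 1.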

Related paper

A Diversity-Promoting Objective Function for Neural Conversation Models (Li et al., 2015)

https://arxiv.org/abs/1510.03055
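The paper above proposes a Maximum Mutual Information (MMI) objective that penalizes replies that are likely regardless of the input. A minimal sketch of the scoring idea only, log p(T|S) minus a weighted log p(T): the probabilities, the weight `lam`, and the helper `mmi_score` are all invented for illustration, and the paper's actual decoding procedure has more details than shown here.

```python
import math

def mmi_score(p_response_given_source, p_response, lam=0.5):
    """Anti-language-model style score: log p(T|S) - lam * log p(T)."""
    return math.log(p_response_given_source) - lam * math.log(p_response)

# Hypothetical (p(T|S), p(T)) pairs for two candidate replies to
# "Where are we going tomorrow?". The numbers are made up.
candidates = {
    "mm-hm": (0.30, 0.20),
    "how about the park": (0.20, 0.001),
}

by_likelihood = max(candidates, key=lambda r: candidates[r][0])
by_mmi = max(candidates, key=lambda r: mmi_score(*candidates[r]))
print(by_likelihood)  # mm-hm
print(by_mmi)         # how about the park
```

Plain likelihood prefers the generic reply, while the penalized score prefers the reply that is specific to the question.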

 

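Voice assistants

Siri: reactive interaction (it waits for the user to ask)

Google Now: proactive interaction (it offers information without being asked)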

 

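PyAudio installation failure

Installing PyAudio on Python 3.7 fails with an error saying that portaudio.h cannot be found.

This is because the newest prebuilt wheel listed at https://pypi.org/project/PyAudio/#files is PyAudio-0.2.11-cp36-cp36m-win_amd64.whl (52.6 kB), which targets Python 3.6.

So I downgraded from Python 3.7 to 3.6.6, and the installation succeeded.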
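The mismatch is visible in the wheel filename itself: the `cp36` tag means the prebuilt binary targets CPython 3.6, so on 3.7 pip most likely fell back to building from source, which needs the PortAudio headers. A minimal sketch of that check, assuming CPython and using two hypothetical helpers (`wheel_python_tag`, `interpreter_tag`) that are not part of pip:

```python
import sys

def wheel_python_tag(wheel_filename):
    """Return the Python tag (e.g. 'cp36') from a wheel filename."""
    # Wheel names end with: ...-pythontag-abitag-platformtag.whl
    parts = wheel_filename[:-len(".whl")].split("-")
    return parts[-3]

def interpreter_tag(version_info=None):
    """Return the CPython tag for a version tuple, e.g. (3, 7) -> 'cp37'."""
    if version_info is None:
        version_info = sys.version_info
    return "cp{}{}".format(version_info[0], version_info[1])

wheel = "PyAudio-0.2.11-cp36-cp36m-win_amd64.whl"
print(wheel_python_tag(wheel))                             # cp36
print(interpreter_tag((3, 7)))                             # cp37
print(wheel_python_tag(wheel) == interpreter_tag((3, 7)))  # False
```

Calling `interpreter_tag()` with no argument reports the tag of the interpreter running the script, which can be compared against the wheels listed on PyPI before installing.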

Once PyAudio is installed, you can test the installation from the console:

python -m speech_recognition

Make sure the default microphone is on and unmuted. If the installation is working, you should see something like the following:

C:\Users\Administrator>python -m speech_recognition
A moment of silence, please...
Set minimum energy threshold to 89.9493545097573
Say something!
Got it! Now to recognize it...
You said good morning
Say something!
Got it! Now to recognize it...
You said license for me that yo
Say something!
Got it! Now to recognize it...
You said are you okay
Say something!
Got it! Now to recognize it...
You said shanks
Say something!

Speak into the microphone and watch how SpeechRecognition transcribes your speech.

Example test:

import os
import time
import webbrowser
from time import ctime
from urllib.parse import quote

import speech_recognition as sr
from gtts import gTTS


# Speak the assistant's reply aloud
def speak(audioString):
    print(audioString)
    tts = gTTS(text=audioString, lang='en')
    path = os.path.join(os.path.abspath('.'), "hello.mp3")
    tts.save(path)
    print(path)
    # On Windows, running the file path opens it in the default audio player
    os.system(path)


# Record what you say
def recordAudio():
    # Capture speech from the microphone
    r = sr.Recognizer()
    with sr.Microphone() as source:
        audio = r.listen(source)

    # Transcribe the audio with the Google Web Speech API
    data = ""
    try:
        data = r.recognize_google(audio)
        print("You said:" + data)
    except sr.UnknownValueError:
        print("Google Speech Recognition could not understand audio")
    except sr.RequestError as e:
        print("Could not request results from Google Speech Recognition service; {0}".format(e))

    return data


# Built-in conversation skills (rules)
def jarvis():
    while True:

        data = recordAudio()

        if "how are you" in data:
            speak("I am fine")

        if "time" in data:
            speak(ctime())

        if "where is" in data:
            # Take everything after "where is" so multi-word places work
            location = data.split("where is", 1)[1].strip()
            speak("Hold on Tony, I will show you where " + location + " is.")
            # Open the place in the default browser
            webbrowser.open("https://www.google.com/maps/place/" + quote(location))

        if "bye" in data:
            speak("bye bye")
            break


# Initialize
time.sleep(2)
speak("Hi Tony, what can I do for you?")

# Run
jarvis()

In testing, the speech recognition was not very accurate, possibly because my spoken English is poor.
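Since recognition quality varies, it can help to check the rule logic separately from the microphone. A minimal sketch, assuming a hypothetical pure function `respond` that takes already-transcribed text and returns the reply plus an exit flag. Unlike the loop in the script, it stops at the first matching rule and lowercases its input.

```python
from time import ctime

def respond(text):
    """Map recognized text to (reply, should_exit); reply is None if no rule matches."""
    text = text.lower()
    if "how are you" in text:
        return "I am fine", False
    if "where is" in text:
        # Everything after "where is" is treated as the place name
        location = text.split("where is", 1)[1].strip()
        return "Hold on Tony, I will show you where " + location + " is.", False
    if "time" in text:
        return ctime(), False
    if "bye" in text:
        return "bye bye", True
    return None, False

print(respond("Where is New York"))
print(respond("bye"))
```

Typing test sentences into `respond` makes it clear whether a wrong answer comes from the rules or from the transcription.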

 

 

 
