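Chatbots generally cover two kinds of knowledge: open-domain (broad, general knowledge) and closed-domain (specialized knowledge). Casual chit-chat is an example of the first, and question answering on specialized IT topics is an example of the second.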
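Context
Linguistic context: what is this sentence saying? (This involves embedding the language, e.g. word vectors.)
Physical context: where was this sentence said? (This involves the physical environment, e.g. the location and the current time.)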
Related papers
Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models (Serban et al., 2015)
https://arxiv.org/abs/1507.04808
Attention with Intention for a Neural Network Conversation Model (Yao et al., 2015)
https://arxiv.org/abs/1510.08565
Consistent persona
Related paper
A Persona-Based Neural Conversation Model (Li et al., 2016)
https://arxiv.org/abs/1603.06155
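Model evaluation
Some tricky cases:
1. Judging whether the model's output is right or wrong itself requires human interpretation:
For example, you tell Amazon's Alexa "I want to go to sleep" and Alexa adjusts the lights for you. You cannot say for certain that this action is the correct one.
Related paper
How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation (Liu et al., 2016)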
https://arxiv.org/abs/1603.08023
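To make the evaluation problem concrete, here is a minimal sketch of a word-overlap score, a much simplified stand-in for the BLEU-style metrics that the paper above examines. The function name and all three sentences are made up for illustration. It shows how a reasonable reply that is worded differently from the reference can score lower than a reply that says the opposite.

```python
def unigram_precision(candidate: str, reference: str) -> float:
    """Fraction of candidate tokens that also appear in the reference."""
    cand_tokens = candidate.lower().split()
    ref_tokens = set(reference.lower().split())
    if not cand_tokens:
        return 0.0
    matches = sum(1 for tok in cand_tokens if tok in ref_tokens)
    return matches / len(cand_tokens)


# Hypothetical reference reply and two hypothetical system replies
reference = "ok i will dim the lights"
good_reply = "sure turning the lamp down now"   # acceptable, different wording
bad_reply = "i will not dim the lights"         # opposite meaning, similar wording

good_score = unigram_precision(good_reply, reference)
bad_score = unigram_precision(bad_reply, reference)
print("good reply:", good_score)
print("bad reply:", bad_score)
```

The overlap score ranks the wrong reply above the acceptable one, which is why a human judgment is still needed.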
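Diversity
Have you eaten?
-Mm-hm-
How's the weather today?
-Mm-hm-
Where shall we go tomorrow?
-Mm-hm-
Are you feeling all right?
-Mm-hm-
A learned model may discover that answering "Mm-hm" counts as correct 99% of the time, so it settles on that one safe reply and ignores diversity.
Related paper
A Diversity-Promoting Objective Function for Neural Conversation Models (Li et al., 2015)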
https://arxiv.org/abs/1510.03055
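The paper above proposes scoring replies by Maximum Mutual Information instead of plain likelihood. One variant subtracts a penalty for replies that are likely regardless of the message: score = log p(T|S) - lambda * log p(T). Below is a toy sketch of that reranking idea. The candidate replies and all probabilities are invented numbers, not model output, and the function name is hypothetical.

```python
import math


def mmi_antilm_score(p_reply_given_msg: float, p_reply: float, lam: float = 0.5) -> float:
    """log p(T|S) - lam * log p(T): penalize replies that are generic."""
    return math.log(p_reply_given_msg) - lam * math.log(p_reply)


# Hypothetical candidates for the message "How's the weather today?"
# Each value is (p(T|S), p(T)); the numbers are made up for illustration.
candidates = {
    "mm-hm": (0.30, 0.20),                # bland: likely after any message
    "it is sunny today": (0.20, 0.001),   # specific: rare in general
}

by_likelihood = max(candidates, key=lambda r: candidates[r][0])
by_mmi = max(candidates, key=lambda r: mmi_antilm_score(*candidates[r]))

print("highest likelihood:", by_likelihood)
print("highest MMI score:", by_mmi)
```

With these numbers plain likelihood picks the bland acknowledgement, while the penalized score picks the specific reply.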
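Voice assistants
Siri: passive interaction (it responds when asked)
Google Now: proactive interaction (it offers information unprompted)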
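PyAudio installation failure
Installing PyAudio on Python 3.7 fails with an error saying that portaudio.h cannot be found.
This is because the newest wheel listed at https://pypi.org/project/PyAudio/#files is PyAudio-0.2.11-cp36-cp36m-win_amd64.whl (52.6 kB), which targets Python 3.6.
Downgrading from Python 3.7 to Python 3.6.6 solved the problem.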
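As a rough illustration of why the version matters: when no prebuilt wheel matches the interpreter, pip typically falls back to compiling from source, and that is the step that needs portaudio.h. The sketch below compares the Python tag in a wheel filename with the running interpreter. Both helper names are hypothetical, and it assumes CPython and a wheel filename without a build tag.

```python
import sys


def wheel_python_tag(filename: str) -> str:
    """Return the Python tag of a wheel filename (third field from the end)."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    return stem.split("-")[-3]


def interpreter_tag(version_info=sys.version_info) -> str:
    """Return a CPython tag such as 'cp36' for the given (major, minor, ...)."""
    return "cp{}{}".format(version_info[0], version_info[1])


wheel = "PyAudio-0.2.11-cp36-cp36m-win_amd64.whl"
print("wheel targets:", wheel_python_tag(wheel))
print("matches Python 3.6:", wheel_python_tag(wheel) == interpreter_tag((3, 6)))
print("matches Python 3.7:", wheel_python_tag(wheel) == interpreter_tag((3, 7)))
print("matches this interpreter:", wheel_python_tag(wheel) == interpreter_tag())
```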
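Once PyAudio is installed, you can test the installation from the console:
python -m speech_recognition
Make sure the default microphone is switched on and unmuted. If the installation is working, you should see something like the following: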
C:\Users\Administrator>python -m speech_recognition
A moment of silence, please...
Set minimum energy threshold to 89.9493545097573
Say something!
Got it! Now to recognize it...
You said good morning
Say something!
Got it! Now to recognize it...
You said license for me that yo
Say something!
Got it! Now to recognize it...
You said are you okay
Say something!
Got it! Now to recognize it...
You said shanks
Say something!
Speak into the microphone and watch how SpeechRecognition transcribes what you say.
Example test:
import os
import time
import webbrowser
from time import ctime

import speech_recognition as sr
from gtts import gTTS


# Speak the assistant's reply aloud
def speak(audioString):
    print(audioString)
    tts = gTTS(text=audioString, lang='en')
    path = os.path.join(os.path.abspath('.'), "hello.mp3")
    tts.save(path)
    print(path)
    # On Windows, running the file path opens it in the default player
    os.system(path)


# Record what you say
def recordAudio():
    # Capture your speech from the microphone
    r = sr.Recognizer()
    with sr.Microphone() as source:
        audio = r.listen(source)

    # Transcribe the audio with the Google API
    data = ""
    try:
        data = r.recognize_google(audio)
        print("You said: " + data)
    except sr.UnknownValueError:
        print("Google Speech Recognition could not understand audio")
    except sr.RequestError as e:
        print("Could not request results from Google Speech Recognition service; {0}".format(e))

    return data


# Built-in conversation skills (rules)
def jarvis():
    while True:
        data = recordAudio()

        if "how are you" in data:
            speak("I am fine")

        if "time" in data:
            speak(ctime())

        if "where is" in data:
            # Take everything after "where is" as the place name; data stays
            # a string so the "bye" check below still works
            location = data.split("where is", 1)[1].strip()
            speak("Hold on Tony, I will show you where " + location + " is.")
            # webbrowser opens the default browser on Windows as well as macOS
            webbrowser.open("https://www.google.com/maps/place/" + location)

        if "bye" in data:
            speak("bye bye")
            break


# Initialization
time.sleep(2)
speak("Hi Tony, what can I do for you?")
# Run
jarvis()
In testing, the speech recognition was not very accurate; perhaps my spoken English is just too poor.
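Since recognition errors make the rules hard to exercise by voice, one option is to separate the rule matching from the audio input and output so it can be tested with typed text. The sketch below is one possible way to do that, not part of the original script. The function name replies_for is hypothetical, and the rules mirror the ones in jarvis() above.

```python
from time import ctime
from typing import List


def replies_for(text: str) -> List[str]:
    """Return the replies the rules would speak for one transcribed utterance."""
    lowered = text.lower()
    replies = []
    if "how are you" in lowered:
        replies.append("I am fine")
    if "time" in lowered:
        replies.append(ctime())
    if "where is" in lowered:
        # Everything after "where is" is treated as the place name
        start = lowered.index("where is") + len("where is")
        location = text[start:].strip()
        replies.append("Hold on Tony, I will show you where " + location + " is.")
    if "bye" in lowered:
        replies.append("bye bye")
    return replies


# Try the rules with typed text instead of the microphone
for utterance in ["how are you", "where is Paris", "ok bye"]:
    print(utterance, "->", replies_for(utterance))
```

The voice loop could then call recordAudio(), pass the result to replies_for(), and speak each reply, so a misheard sentence and a wrong rule can be told apart.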