[Speech Synthesis, Part 1] Using Speech Synthesis in a Web Page

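This article shows how to implement speech synthesis in a web page. It covers basic operations such as creating the voice object, controlling volume and speaking rate, and selecting a voice, and it provides a complete JavaScript example.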

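Preface:

There is nothing mysterious about speech synthesis, and it is less complicated than you might expect. In this article we will look at how to make a web page read text aloud, that is, how to do speech synthesis on the web. I will keep the code as short and the explanation as plain as I can.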

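Requirements:

First we need the installer for Microsoft Speech SDK 5.1 (the operating system must be Windows 2000 or later), which gives the machine text-to-speech capability. You can find the installer here: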

http://www.microsoft.com/downloads/details.aspx?FamilyId=5E86EC97-40A7-453F-B0EE-6583171B4530&displaylang=en

Installation notes:

· If you want to download sample code, documentation, SAPI, and the U.S. English Speech engines for development purposes, download the Speech SDK 5.1 file (SpeechSDK51.exe).

· If you want to use the Japanese and Simplified Chinese engines for development purposes, download the Speech SDK 5.1 Language Pack file (SpeechSDK51LangPack.exe) in addition to the Speech SDK 5.1 file.

· If you want to redistribute the Speech API and/or the Speech engines to integrate and ship as a part of your product, download the Speech 5.1 SDK Redistributables file (SpeechSDK51MSM.exe).

· If you want to get only the Mike and Mary voices redistributable for Windows XP, download Mike and Mary redistributables (Sp5TTIntXP.exe).

· If you only want the documentation, download the Documentation file (sapi.chm).

You can actually skip the list above: just download and install SpeechSDK51.exe and SpeechSDK51LangPack.exe.

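Let's get started:

With the environment ready, let's begin.

First we need an object that can "speak"; for now we will call it the "reader". It takes different forms in different kinds of speech synthesis programs, but that is a topic for later (no, I'm not being coy; this is the first installment, so let's get it talking first).

Create the "reader" object in the script of the web application's HTML page. Note that ActiveXObject is available only in Internet Explorer: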

// Create the Sapi SpVoice object
var VoiceObj = new ActiveXObject("Sapi.SpVoice");

The code above creates the "reader" object; it belongs in the page's JavaScript.

The following code shows how the "reader" works:

VoiceObj.Speak("hello world");
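Speak also accepts an optional second argument of flags. The comments in the complete example further down name two of them, SVSFlagsAsync (1) and SVSFPurgeBeforeSpeak (2). A minimal sketch of how they might be wrapped follows; the helper names speakAsync and stopSpeaking are made up for illustration, and the voice is passed in as a parameter so the functions can be tried with any object that has a Speak method.

```javascript
// Flag values from the SpeechVoiceSpeakFlags enumeration.
var SVSFlagsAsync = 1;        // return immediately instead of blocking
var SVSFPurgeBeforeSpeak = 2; // discard anything queued or being spoken

// Hypothetical helper: queue text without blocking the page.
function speakAsync(voice, text) {
    return voice.Speak(text, SVSFlagsAsync);
}

// Hypothetical helper: stop current speech by speaking an empty
// string with the purge flag set.
function stopSpeaking(voice) {
    return voice.Speak("", SVSFPurgeBeforeSpeak);
}
```

With the object created above, this would be called as speakAsync(VoiceObj, "hello world") and later stopSpeaking(VoiceObj).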

The following code shows how to dispose of the "reader":

// Clean up voice object
delete VoiceObj;

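If you have read this far, thank you for your patience, and congratulations: if you are a quick study, you may already be writing your own speech synthesis code, because we now know how to create the object, how to call its methods, and how to delete it.

Of course that is far from enough, so let's do a bit better: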
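The three steps (create, speak, release) can be put together as a rough sketch. The helper names createVoice and speakOnce are hypothetical. The sketch assumes it is acceptable to fall back to doing nothing when the SAPI control cannot be created, for example in a browser other than Internet Explorer. In standard JavaScript, delete does nothing to a variable declared with var, so the sketch releases the object by assigning null instead.

```javascript
// Hypothetical helper: returns a SAPI voice object, or null when the
// control cannot be created (no ActiveX support, SAPI not installed,
// or the browser blocked it).
function createVoice() {
    try {
        return new ActiveXObject("Sapi.SpVoice");
    } catch (e) {
        return null;
    }
}

// Hypothetical helper: speaks the text if a voice is available and
// reports whether anything was spoken.
function speakOnce(voice, text) {
    if (!voice) {
        return false;
    }
    voice.Speak(text);
    return true;
}

var voice = createVoice();
speakOnce(voice, "hello world");
voice = null; // drop the reference so the object can be released
```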

Controlling the voice properties

Volume (0 to 100):

VoiceObj.Volume = 80;

Speaking rate (-10 to 10):

VoiceObj.Rate = 0;
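SAPI accepts a volume from 0 to 100 and a rate from -10 to 10, and those limits can be enforced in one place before assigning. A small sketch with hypothetical helper names (clamp, setVolume, setRate) follows. It assumes that out-of-range values should be pulled to the nearest limit rather than rejected, which is a design choice of this example and not something SAPI prescribes.

```javascript
// Restrict value to the closed interval [min, max].
function clamp(value, min, max) {
    return Math.min(max, Math.max(min, value));
}

// Hypothetical helper: set the volume, kept within 0..100.
function setVolume(voice, volume) {
    voice.Volume = clamp(volume, 0, 100);
    return voice.Volume;
}

// Hypothetical helper: set the speaking rate, kept within -10..10.
function setRate(voice, rate) {
    voice.Rate = clamp(rate, -10, 10);
    return voice.Rate;
}
```

For example, setVolume(VoiceObj, 80) and setRate(VoiceObj, 0) have the same effect as the two assignments above.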

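Selecting the reader's voice (the Voice property takes a voice token rather than a name string, so look the voice up by name first):

VoiceObj.Voice = VoiceObj.GetVoices("Name=Microsoft Anna").Item(0);

Selecting the audio output device (also a token, taken here from the list of available outputs):

VoiceObj.AudioOutput = VoiceObj.GetAudioOutputs().Item(0);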
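Both Voice and AudioOutput expect a token taken from the collections that GetVoices() and GetAudioOutputs() return. One way to pick a token by name is to walk the collection and compare descriptions. The helper name findTokenByDescription is hypothetical, and the sketch assumes a plain substring match on the description is good enough.

```javascript
// Hypothetical helper: return the first token whose description
// contains `name`, or null when none matches. `tokens` is a SAPI
// token collection, which exposes Count and Item(i).
function findTokenByDescription(tokens, name) {
    for (var i = 0; i < tokens.Count; i++) {
        var token = tokens.Item(i);
        if (token.GetDescription().indexOf(name) !== -1) {
            return token;
        }
    }
    return null;
}

// Usage with a real voice object (Internet Explorer only):
//   var token = findTokenByDescription(VoiceObj.GetVoices(), "Anna");
//   if (token) { VoiceObj.Voice = token; }
```

Checking the result for null before assigning avoids an error when the requested voice or device is not installed.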

That covers what we need to know. Let's finish this lesson on speech synthesis with a complete example.

Complete example:

<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" content="text/html; charset=UTF-8">
<TITLE>TTS Demo</TITLE>
<SCRIPT LANGUAGE="JavaScript">

// Create the Sapi SpVoice object
var VoiceObj = new ActiveXObject("Sapi.SpVoice");

// ChangeVoice() function:
//   This function sets the newly selected voice choice from the Voice
//   Select box on the Voice object.
function ChangeVoice() {
    var i = parseInt(idsVoices.value);
    VoiceObj.Voice = VoiceObj.GetVoices().Item(i);
}

// ChangeAudioOutput() function:
//   This function sets the newly selected audio output choice from the
//   AudioOutput Select box on the Voice object.
function ChangeAudioOutput() {
    var i = parseInt(idsAudioOutputs.value);
    VoiceObj.AudioOutput = VoiceObj.GetAudioOutputs().Item(i);
}

// IncRate() function:
//   This function increases the speaking rate by 1 up to a maximum
//   of 10.
function IncRate() {
    if (VoiceObj.Rate < 10)
    {
        VoiceObj.Rate = VoiceObj.Rate + 1;
    }
}

// DecRate() function:
//   This function decreases the speaking rate by -1 down to a minimum
//   of -10.
function DecRate() {
    if (VoiceObj.Rate > -10)
    {
        VoiceObj.Rate = VoiceObj.Rate - 1;
    }
}

// IncVol() function:
//   This function increases the speaking volume by 10 up to a maximum
//   of 100.
function IncVol() {
    if (VoiceObj.Volume < 100)
    {
        VoiceObj.Volume = VoiceObj.Volume + 10;
    }
}

// DecVol() function:
//   This function decreases the speaking volume by -10 down to a minimum
//   of 0.
function DecVol() {
    if (VoiceObj.Volume > 9)
    {
        VoiceObj.Volume = VoiceObj.Volume - 10;
    }
}

// SpeakText() function:
//   This function gets the text from the textbox and sends it to the
//   Voice object's Speak() function. The value "1" for the second
//   parameter corresponds to the SVSFlagsAsync value in the
//   SpeechVoiceSpeakFlags enumerated type.
function SpeakText() {
    if (idbSpeakText.value == "SpeakText")
    {
        // Speak the string in the edit box
        try
        {
            VoiceObj.Speak(idTextBox.value, 1);
        }
        catch (exception)
        {
            alert("Speak error");
        }
    }
    else if (idbSpeakText.value == "Stop")
    {
        // Speak empty string to Stop current speaking. The value "2" for
        // the second parameter corresponds to the SVSFPurgeBeforeSpeak
        // value in the SpeechVoiceSpeakFlags enumerated type.
        VoiceObj.Speak("", 2);
    }
}

</SCRIPT>

<SCRIPT FOR="window" EVENT="OnQuit()" LANGUAGE="JavaScript">
// Clean up voice object
delete VoiceObj;
</SCRIPT>

</HEAD>
<BODY>
<H1 align=center>Simple TTS (DHTML)</H1>
<H1 align=center><FONT size=3> </FONT>
<IMG alt="" border=2 hspace=0 id=idImage src="mouthclo.bmp"> </H1>
<H1 align=center>
<TEXTAREA ID=idTextBox COLS=50 ROWS=10 WRAP=VIRTUAL>Enter text you wish spoken here</TEXTAREA>
</H1>
<P align=center><STRONG><STRONG>
Rate <STRONG>
<INPUT id=idbIncRate name=button1 type=button onclick=IncRate() value="+"></STRONG>
<INPUT id=idbDecRate name=button2 type=button onclick=DecRate() value="-" style="LEFT:237px;TOP:292px"> </STRONG>
Volume <STRONG><STRONG>
<INPUT id=idbIncVol name=button3 onclick=IncVol() style="LEFT:67px;TOP:318px" type=button value="+">
<INPUT id=idbDecVol name=button4 onclick=DecVol() type=button value="-" style="LEFT:134px;TOP:377px">
</STRONG></STRONG></STRONG></P>
<P align=center><STRONG><BUTTON id=idbSpeakText onclick=SpeakText();
style="HEIGHT:24px;LEFT:363px;TOP:332px;WIDTH:178px">SpeakText</BUTTON></STRONG></P>
<P align=center><STRONG>Voice
<STRONG>Audio Output </STRONG></STRONG></P>
<P align=center>
<SELECT id=idsVoices name=Voices onchange=ChangeVoice() style="FONT-FAMILY:serif;HEIGHT:21px;WIDTH:179px"> </SELECT>
<SELECT id=idsAudioOutputs name=AudioOutputs onchange=ChangeAudioOutput() style="HEIGHT:22px;WIDTH:179px"> </SELECT>

<SCRIPT LANGUAGE="JavaScript">
// Code in the BODY of the webpage is used to initialize controls and
// to handle SAPI events

/***** Initializer code *****/
InitializeControls();

function InitializeControls()
{
    // Initialize the Voices and AudioOutput Select boxes
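The listing breaks off at this point. As a rough idea of what initializing the two select boxes might involve, here is a sketch under stated assumptions: the helper names describeTokens and fillSelect are hypothetical, and each option's value is assumed to be the token's index, since ChangeVoice() and ChangeAudioOutput() above read the index back with parseInt. It is not a reconstruction of the missing code.

```javascript
// Hypothetical helper: turn a SAPI token collection into plain
// { value, text } entries, one per voice or audio output.
function describeTokens(tokens) {
    var entries = [];
    for (var i = 0; i < tokens.Count; i++) {
        entries.push({ value: i, text: tokens.Item(i).GetDescription() });
    }
    return entries;
}

// Hypothetical helper: append one OPTION per entry to a SELECT element.
// `doc` is the page's document object.
function fillSelect(select, entries, doc) {
    for (var i = 0; i < entries.length; i++) {
        var option = doc.createElement("OPTION");
        option.value = entries[i].value;
        option.text = entries[i].text;
        select.options.add(option);
    }
}

// Usage inside the page (Internet Explorer only):
//   fillSelect(idsVoices, describeTokens(VoiceObj.GetVoices()), document);
//   fillSelect(idsAudioOutputs, describeTokens(VoiceObj.GetAudioOutputs()), document);
```

Keeping the token walk separate from the DOM work means the first helper can be exercised without a browser.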
