Using Festival Text-to-Speech on Linux

Text-to-speech is easy to use on Linux with the Festival software. The application can read the contents of a file aloud using TTS.

Post in English


Festival text-to-speech did not work out of the box on Ubuntu after I installed it. Here are the steps I took to fix it, along with some changes that make it useful in everyday work.

Getting Festival to Work

Festival is a free and very popular text-to-speech engine.

Here’s how to get it:

sudo apt-get install festival

Here’s how to test it:

echo "hello world"|festival --tts

You may see this error:

Linux: can't open /dev/dsp

If so, add the following lines to the .festivalrc file in your home directory (create it if it does not exist):

(Parameter.set 'Audio_Command "aplay -q -c 1 -t raw -f s16 -r $SR $FILE")
(Parameter.set 'Audio_Method 'Audio_Command)
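If you would rather script that change, here is a minimal sketch that appends the two lines only when they are missing, so it can be run more than once. The helper name add_audio_config is made up for illustration and is not part of Festival; the default path assumes the usual ~/.festivalrc location.

```shell
#!/bin/sh
# Append the aplay audio settings to a Festival init file, skipping any line
# that is already present so the function is safe to run more than once.
# add_audio_config is a hypothetical helper name, not part of Festival.
add_audio_config() {
    rcfile="${1:-$HOME/.festivalrc}"
    touch "$rcfile"
    while IFS= read -r line; do
        # -x matches whole lines, -F treats the text as a fixed string
        grep -qxF -- "$line" "$rcfile" || printf '%s\n' "$line" >> "$rcfile"
    done <<'EOF'
(Parameter.set 'Audio_Command "aplay -q -c 1 -t raw -f s16 -r $SR $FILE")
(Parameter.set 'Audio_Method 'Audio_Command)
EOF
}

# Usage: add_audio_config            (writes to ~/.festivalrc)
#        add_audio_config /tmp/rc    (writes to another file)
```

The heredoc is quoted ('EOF') so that $SR and $FILE are written literally; Festival fills them in when it runs the command.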

Getting it to read the clipboard

Now, if you want it to read info from your clipboard, install this:

sudo apt-get install xclip

And type this:

xclip -o|festival --tts
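One detail worth knowing: without a -selection option, xclip -o prints the PRIMARY selection (whatever text is currently highlighted), not the text you copied with Ctrl+C. The sketch below wraps both cases in one function. The name speak_selection is my own, not something xclip or Festival provides, and it assumes both tools are installed.

```shell
#!/bin/sh
# Speak one X selection. $1 is the selection name: "primary" (highlighted
# text, xclip's default) or "clipboard" (text copied with Ctrl+C).
# speak_selection is a hypothetical helper name.
speak_selection() {
    selection="${1:-primary}"
    case "$selection" in
        primary|secondary|clipboard) ;;
        *) echo "unknown selection: $selection" >&2; return 2 ;;
    esac
    xclip -o -selection "$selection" | festival --tts
}

# Usage: speak_selection             (reads the highlighted text)
#        speak_selection clipboard   (reads what was copied with Ctrl+C)
```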

Now, you can go a step further and create a shortcut key for reading text. Here’s a good one:


#!/bin/bash
# This script reads the information from the clipboard out loud.
# Run it a second time to stop the reading.

# Look for festival being run.
running=$(pgrep festival)

if [ -z "$running" ]; then
    # Not running yet: read it
    xclip -o | festival --tts
else
    # Already running: kill it
    killall festival; killall aplay; sleep .1; killall aplay
fi

I call it talk.sh. Be sure to make it executable with chmod +x talk.sh.

Assigning a Shortcut

Now, to assign it to a shortcut key. I'm using Ubuntu, which uses GNOME. If you use something else, you're on your own. Otherwise, click System->Keyboard Shortcuts. Then add the path to the script and assign a shortcut.

I assigned it to the Windows-A keystroke. You can press it once to start reading and again to stop. Unfortunately, the script assumes you only have one instance of festival running.

Adjusting the Playback Speed

If you want it to read faster, change the Audio_Command line in the .festivalrc file:

(Parameter.set 'Audio_Command "aplay -q -c 1 -t raw -f s16 -r $(($SR*140/100)) $FILE")

The 140/100 means 140% of the original speed, which seems about right to me for most texts.
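To see what rate aplay actually receives, the same integer arithmetic can be tried on its own. This is only a sketch: scaled_rate is a made-up helper, and 16000 Hz is an assumed sample rate for illustration (the real $SR depends on the voice). Because the trick simply plays the samples back at a higher rate, the pitch goes up along with the speed.

```shell
#!/bin/sh
# Sample rate after applying a speed percentage, using the same integer
# arithmetic as the Audio_Command line. scaled_rate is a hypothetical helper.
scaled_rate() {
    rate="$1"      # voice sample rate in Hz (16000 below is an assumption)
    percent="$2"   # 100 = normal speed
    echo $(( rate * percent / 100 ))
}

scaled_rate 16000 140   # prints 22400
scaled_rate 16000 100   # prints 16000
```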

Improving Voices

The default voices in Festival do not sound great. Here’s a bash script to add new voices. These are the best I could find anywhere:

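#!/bin/bash
# Scratch directory for the downloads (any writable path works)
dir="${dir:-$HOME/festival-voices}"
mkdir -p "$dir"
cd "$dir" || exit 1

#Download the voices
for voice in awb bdl clb rms slt jmk
do
  wget "http://hts.sp.nitech.ac.jp/archives/2.0.1/festvox_nitech_us_${voice}_arctic_hts-2.0.1.tar.bz2"
done

#Unpack (one archive per tar call; "tar xvf *.bz2" would only open the first)
for archive in *.bz2
do
  tar xvf "$archive"
done

#Install
sudo mkdir -p /usr/share/festival/voices/us
sudo mv lib/voices/us/* /usr/share/festival/voices/us/
sudo mv lib/hts.scm /usr/share/festival/hts.scm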
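After the install step it is worth checking what actually landed in the voices folder. Here is a small sketch for that. The helper name list_voices is my own, and the default path is simply the directory used by the install script above.

```shell
#!/bin/sh
# Print the name of each voice directory under a Festival voices folder.
# list_voices is a hypothetical helper; the default path matches the
# directory used by the install step.
list_voices() {
    voices_dir="${1:-/usr/share/festival/voices/us}"
    for path in "$voices_dir"/*/; do
        [ -d "$path" ] || continue   # skips the unexpanded glob when empty
        basename "$path"
    done
}

# Usage: list_voices                   (default system folder)
#        list_voices ./lib/voices/us   (check the unpacked files first)
```

Festival can also report the voices it sees: typing (voice.list) at the festival prompt should show the same names.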

Setting a Default Voice

The default voice in Festival is supposed to be configurable, but that did not work for me. I had to change /usr/share/festival/voices.scm directly. Simply update the default-voice-priority-list. It should look something like this:

(defvar default-voice-priority-list
  '(nitech_us_slt_arctic_hts
    kal_diphone
    cmu_us_bdl_arctic_hts
    cmu_us_jmk_arctic_hts
    cmu_us_slt_arctic_hts
    cmu_us_awb_arctic_hts
    ; cstr_rpx_nina_multisyn       ; restricted license (lexicon)
    ; cstr_rpx_jon_multisyn        ; restricted license (lexicon)
    ; cstr_edi_awb_arctic_multisyn ; restricted license (lexicon)
    ; cstr_us_awb_arctic_multisyn
    ked_diphone
    don_diphone
    rab_diphone
    en1_mbrola
    us1_mbrola
    us2_mbrola
    us3_mbrola
    gsw_diphone  ;; not publically distributed
    el_diphone
    )
  "default-voice-priority-list
   List of voice names. The first of them available becomes the default voice.")

Notice how I put nitech_us_slt_arctic_hts at the top. This is my favorite voice.
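Since this means editing a system-wide file by hand, keeping a copy of the original makes it easy to roll back. A minimal sketch follows. The helper backup_once and the .orig suffix are my own choices, not a Festival convention.

```shell
#!/bin/sh
# Copy a file to <name>.orig unless that backup already exists, so repeated
# runs never overwrite the pristine copy. backup_once is a hypothetical helper.
backup_once() {
    src="$1"
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    [ -e "$src.orig" ] || cp -p "$src" "$src.orig"
}

# Usage (the system file needs root, so run the copy with sudo):
#   sudo cp -p /usr/share/festival/voices.scm /usr/share/festival/voices.scm.orig
# or, with the helper, on a file you own:
#   backup_once ./voices.scm
```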