Difference between revisions of "Activity Team/gst-plugins-espeak"

Revision as of 16:47, 9 March 2009

gst-plugins-espeak

Uses the eSpeak library as a sound source for GStreamer.
The plugin produces audio output from the given text.

Interface

gst-plugins-espeak is a regular GStreamer plugin, and therefore a GObject.

Properties

GObject properties:

text text to pronounce
pitch pitch adjustment, -100 to 100, default is 0
rate speed in words per minute, -100 to 100, default is 0
voice use voice file of this name from espeak-data/voices
gap word gap: pause between words, in units of 10 ms at the default speed, default is 0
track track events
- 0 do not track any events (default)
- 1 track word events (see #Track words example)
- 2 track <mark name="<mark-name>"/> marks in text (see #Track marks example)
voices read-only list of supported voices/languages
caps read-only caps describing the format of the data

Events

GStreamer runs the pipeline in separate threads, so use gst.Bus messages (which are processed in the main GTK thread) instead of native GObject signals. To receive messages, set the track property. The following gst.Bus messages are supported:

espeak-word emitted before a word is spoken, message properties:
- offset offset in chars from beginning of text
- len size of word in chars
espeak-mark mark in text, message properties:
- offset offset in chars from beginning of text
- mark name of mark
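
The payload handling can be kept separate from the GStreamer plumbing. The following is a minimal sketch, assuming the message structure can be read like a mapping with the properties listed above (as message.structure is in the Python examples on this page). The name describe_event is a hypothetical helper, not part of the plugin, and the code is written to run under both Python 2 and Python 3.

```python
def describe_event(name, fields, text):
    """Return a printable description of an espeak bus message payload.

    name   -- structure name, 'espeak-word' or 'espeak-mark'
    fields -- mapping holding the message properties listed above
    text   -- the text that was given to the plugin
    (describe_event is a hypothetical helper, not a plugin API.)
    """
    if name == 'espeak-word':
        # offset and len are counted in characters from the start of text
        start = fields['offset']
        return text[start:start + fields['len']]
    if name == 'espeak-mark':
        return '%d:%s' % (fields['offset'], fields['mark'])
    return None  # not an espeak tracking message
```

In a bus callback it could be called as describe_event(message.structure.get_name(), message.structure, text).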

Usage

gst-plugins-espeak generates raw audio/x-raw-int data.

Pipeline format

The plugin adds the espeak:// URI scheme:

gst-launch espeak://Hi ! autoaudiosink

Full pipeline example:

gst-launch espeak text="Hello world" pitch=-50 rate=-50 voice=default ! autoaudiosink
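
When the description string is assembled from code, for instance to pass it to gst.parse_launch, the documented ranges can be checked first. This is only a sketch: espeak_pipeline is a hypothetical helper, and it refuses text containing double quotes rather than guessing at quoting rules (such text can be set through the text property instead).

```python
def espeak_pipeline(text, pitch=0, rate=0, voice='default'):
    """Build a gst-launch style description for the espeak element.

    Hypothetical helper; the property names and the -100..100 ranges are
    taken from the property list on this page.
    """
    for name, value in (('pitch', pitch), ('rate', rate)):
        if not -100 <= value <= 100:
            raise ValueError('%s must be within -100..100, got %r' % (name, value))
    if '"' in text:
        # Quoting is not handled here; set src.props.text from code instead.
        raise ValueError('text must not contain double quotes')
    return 'espeak text="%s" pitch=%d rate=%d voice=%s ! autoaudiosink' % (
        text, pitch, rate, voice)
```

Calling espeak_pipeline('Hello world', pitch=-50, rate=-50) yields the same description as the gst-launch arguments above.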

Python examples

To use gst-plugins-espeak in Python:

set up a regular GStreamer environment
the plugin's name is espeak
all writable properties (including text) take effect only when playback starts; to apply new values, stop the pipeline with pipe.set_state(gst.STATE_NULL) and start it again with pipe.set_state(gst.STATE_PLAYING).
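
The stop, change, restart cycle can be wrapped in one function. A minimal sketch, assuming pipe is the pipeline and src its espeak element; speak_again is a hypothetical name, and the two state constants are passed in as arguments so that the sketch does not import gst itself.

```python
def speak_again(pipe, src, null_state, playing_state, **props):
    """Stop the pipeline, apply new property values, and start it again.

    pipe, src     -- the pipeline and its espeak element
    null_state    -- pass gst.STATE_NULL
    playing_state -- pass gst.STATE_PLAYING
    props         -- writable properties to change, e.g. text='...', pitch=20
    (speak_again is a hypothetical helper, not part of the plugin.)
    """
    pipe.set_state(null_state)           # properties are only read at start
    for name, value in props.items():
        src.set_property(name, value)    # standard GObject property setter
    pipe.set_state(playing_state)
```

For example: speak_again(pipe, src, gst.STATE_NULL, gst.STATE_PLAYING, text='Goodbye', pitch=20).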

Simple example

 import gtk
 import gst

 def gstmessage_cb(bus, message, pipe):
     if message.type in (gst.MESSAGE_EOS, gst.MESSAGE_ERROR):
         pipe.set_state(gst.STATE_NULL)

 pipeline = 'espeak text="Hello, World!" ! autoaudiosink'
 pipe = gst.parse_launch(pipeline)

 bus = pipe.get_bus()
 bus.add_signal_watch()
 bus.connect('message', gstmessage_cb, pipe)

 pipe.set_state(gst.STATE_PLAYING)

 gtk.main()

Choir example

 import gtk
 import gst
 import random
 from gettext import gettext as _

 def gstmessage_cb(bus, message, pipe):
     if message.type in (gst.MESSAGE_EOS, gst.MESSAGE_ERROR):
         pipe.set_state(gst.STATE_NULL)

 def make_pipe():
     pipeline = 'espeak name=src ! autoaudiosink'
     pipe = gst.parse_launch(pipeline)

     src = pipe.get_by_name('src')
     src.props.text = _('Hello, World!')
     src.props.pitch = random.randint(-100, 100)
     src.props.rate = random.randint(-100, 100)

     voices = src.props.voices
     voice = voices[random.randint(0, len(voices)-1)]
     src.props.voice = voice[0]

     bus = pipe.get_bus()
     bus.add_signal_watch()
     bus.connect('message', gstmessage_cb, pipe)

     pipe.set_state(gst.STATE_PLAYING)

 for i in range(10):
     make_pipe()

 gtk.main()

Track words example

 import gtk
 import gst

 text = open(__file__, 'r').read()

 def gstmessage_cb(bus, message, pipe):
     if message.type in (gst.MESSAGE_EOS, gst.MESSAGE_ERROR):
         pipe.set_state(gst.STATE_NULL)
     elif message.type == gst.MESSAGE_ELEMENT and \
             message.structure.get_name() == 'espeak-word':
         offset = message.structure['offset']
         length = message.structure['len']  # avoid shadowing the built-in len
         print text[offset:offset+length]

 pipe = gst.Pipeline('pipeline')

 src = gst.element_factory_make('espeak', 'src')
 src.props.text = text
 src.props.track = 1
 pipe.add(src)

 sink = gst.element_factory_make('autoaudiosink', 'sink')
 pipe.add(sink)
 src.link(sink)

 bus = pipe.get_bus()
 bus.add_signal_watch()
 bus.connect('message', gstmessage_cb, pipe)

 pipe.set_state(gst.STATE_PLAYING)

 gtk.main()

Track marks example

 import gtk
 import gst

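 # track=2 reports <mark/> elements only, so the text needs some; these mark names are illustrative
 text = '<mark name="first"/>Hello, <mark name="second"/>World!'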

 def gstmessage_cb(bus, message, pipe):
     if message.type in (gst.MESSAGE_EOS, gst.MESSAGE_ERROR):
         pipe.set_state(gst.STATE_NULL)
     elif message.type == gst.MESSAGE_ELEMENT and \
             message.structure.get_name() == 'espeak-mark':
         offset = message.structure['offset']
         mark = message.structure['mark']
         print '%d:%s' % (offset, mark)

 pipe = gst.Pipeline('pipeline')

 src = gst.element_factory_make('espeak', 'src')
 src.props.text = text
 src.props.track = 2
 src.props.gap = 100
 pipe.add(src)

 sink = gst.element_factory_make('autoaudiosink', 'sink')
 pipe.add(sink)
 src.link(sink)

 bus = pipe.get_bus()
 bus.add_signal_watch()
 bus.connect('message', gstmessage_cb, pipe)

 pipe.set_state(gst.STATE_PLAYING)

 gtk.main()

Known issues

espeak-word does not track words containing numbers properly (at least words that consist entirely of digits); see upstream bug [1]. Use espeak-mark instead.
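
One possible way to follow that advice is to place a mark in front of every word and look the word up by mark name when an espeak-mark message arrives. This is a sketch only: mark_words is a hypothetical helper, it assumes plain input text without markup of its own, and the track property must be set to 2.

```python
import re

def mark_words(text):
    """Prefix every word with a <mark/> element named after its index.

    Returns the marked-up text and the list of words, so that the mark
    name from an espeak-mark message can be mapped back to its word.
    (mark_words is a hypothetical helper, not part of the plugin.)
    """
    words = []

    def add_mark(match):
        words.append(match.group(0))
        return '<mark name="%d"/>%s' % (len(words) - 1, match.group(0))

    # Treat every run of non-whitespace characters as one word.
    return re.sub(r'\S+', add_mark, text), words
```

With marked, words = mark_words(text), set src.props.text = marked and, in the espeak-mark handler, read the word as words[int(message.structure['mark'])].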
