Activity Team/gst-plugins-espeak
Latest revision as of 20:41, 5 October 2017
gst-plugins-espeak
eSpeak library as a sound source for GStreamer.
The plugin uses the given text to produce audio output.
Interface
gst-plugins-espeak is a valid GStreamer plugin, and thus a GObject.
Properties
GObject properties:
- text: text to pronounce
- pitch: pitch adjustment, -100 to 100, default is 0
- rate: speed in words per minute, -100 to 100, default is 0
- voice: use the voice file of this name from espeak-data/voices
- gap: word gap; pause between words, in units of 10 ms at the default speed, default is 0
- track: track events
  - 0: do not track any events (default)
  - 1: track word events (see #Track words example)
  - 2: track <mark name="<mark-name>"/> marks in the text (see #Track marks example)
- voices: read-only list of supported voices/languages
- caps: read-only caps describing the format of the data
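The gap property's 10 ms units are easy to misread; this small arithmetic sketch (plain Python, independent of GStreamer) converts a gap value into milliseconds:

```python
def gap_to_ms(gap_units):
    """Convert the espeak 'gap' property value (units of 10 ms at the
    default speed) into milliseconds."""
    return gap_units * 10

# A gap of 100 units is a one-second pause between words.
print(gap_to_ms(100))  # -> 1000
```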
Events
GStreamer uses separate threads, so users should rely on gst.Bus messages (which are processed in the main GTK thread) rather than native GObject events. To receive these messages you need to set the track property. The supported gst.Bus messages are:
- espeak-word: posted before speaking a word; message properties:
  - offset: offset in chars from the beginning of the text
  - len: size of the word in chars
- espeak-mark: posted for a mark in the text; message properties:
  - offset: offset in chars from the beginning of the text
  - mark: name of the mark
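The offset/len pair in an espeak-word message is enough to recover the word being spoken from the original text. A minimal sketch of that slicing, using an invented dictionary in place of a real gst.Bus message structure:

```python
text = 'Hello, World!'

# Stand-in for an 'espeak-word' message structure (offset and len are
# character counts from the beginning of the text).
message = {'offset': 7, 'len': 5}

word = text[message['offset']:message['offset'] + message['len']]
print(word)  # -> World
```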
Usage
gst-plugins-espeak generates raw audio/x-raw-int data.
Pipeline format
The plugin adds a new URI scheme:
 gst-launch espeak://Hi ! autoaudiosink
Full pipeline example:
 gst-launch espeak text="Hello world" pitch=-50 rate=-50 voice=default ! autoaudiosink
Python examples
To use gst-plugins-espeak in Python:
- set up the regular GStreamer environment
- the plugin's name is espeak
- all writable properties (including text) take effect only when playback starts; to apply new values you need to stop the pipe with pipe.set_state(gst.STATE_NULL) and start it again with pipe.set_state(gst.STATE_PLAYING).
Note: the examples below are for GTK+ 2 and GStreamer 0.10 and have yet to be ported to GTK+ 3 and GStreamer 1.0; more recent examples can be found in the GTK+ 3 toolkit for Sugar and in activities.
Simple example
 import gtk
 import gst

 def gstmessage_cb(bus, message, pipe):
     if message.type in (gst.MESSAGE_EOS, gst.MESSAGE_ERROR):
         pipe.set_state(gst.STATE_NULL)

 pipeline = 'espeak text="Hello, World!" ! autoaudiosink'
 pipe = gst.parse_launch(pipeline)

 bus = pipe.get_bus()
 bus.add_signal_watch()
 bus.connect('message', gstmessage_cb, pipe)

 pipe.set_state(gst.STATE_PLAYING)

 gtk.main()
Choir example
 import gtk
 import gst
 import random
 from gettext import gettext as _

 def gstmessage_cb(bus, message, pipe):
     if message.type in (gst.MESSAGE_EOS, gst.MESSAGE_ERROR):
         pipe.set_state(gst.STATE_NULL)

 def make_pipe():
     pipeline = 'espeak name=src ! autoaudiosink'
     pipe = gst.parse_launch(pipeline)

     src = pipe.get_by_name('src')
     src.props.text = _('Hello, World!')
     src.props.pitch = random.randint(-100, 100)
     src.props.rate = random.randint(-100, 100)

     voices = src.props.voices
     voice = random.choice(voices)
     src.props.voice = voice[0]

     bus = pipe.get_bus()
     bus.add_signal_watch()
     bus.connect('message', gstmessage_cb, pipe)

     pipe.set_state(gst.STATE_PLAYING)

 for i in range(10):
     make_pipe()

 gtk.main()
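The choir example picks a random voice from the voices property and uses the first element of the chosen entry as the voice name. The exact entry layout depends on the plugin; this sketch uses placeholder data just to show the indexing:

```python
import random

# Placeholder for the read-only 'voices' property; assumed here to be a
# sequence whose entries begin with the voice name (hypothetical data).
voices = [('default', 'en'), ('es', 'es'), ('fr', 'fr')]

voice = random.choice(voices)
print(voice[0])  # one of: default, es, fr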
Track words example
 import gtk
 import gst

 text = file(__file__, 'r').read()

 def gstmessage_cb(bus, message, pipe):
     if message.type in (gst.MESSAGE_EOS, gst.MESSAGE_ERROR):
         pipe.set_state(gst.STATE_NULL)
     elif message.type == gst.MESSAGE_ELEMENT and \
             message.structure.get_name() == 'espeak-word':
         offset = message.structure['offset']
         len = message.structure['len']
         print text[offset:offset + len]

 pipe = gst.Pipeline('pipeline')

 src = gst.element_factory_make('espeak', 'src')
 src.props.text = text
 src.props.track = 1
 pipe.add(src)

 sink = gst.element_factory_make('autoaudiosink', 'sink')
 pipe.add(sink)
 src.link(sink)

 bus = pipe.get_bus()
 bus.add_signal_watch()
 bus.connect('message', gstmessage_cb, pipe)

 pipe.set_state(gst.STATE_PLAYING)

 gtk.main()
Track marks example
 import gtk
 import gst

 text = '<mark name="mark to Hello"/>Hello, <mark name="mark for World"/>World!'

 def gstmessage_cb(bus, message, pipe):
     if message.type in (gst.MESSAGE_EOS, gst.MESSAGE_ERROR):
         pipe.set_state(gst.STATE_NULL)
     elif message.type == gst.MESSAGE_ELEMENT and \
             message.structure.get_name() == 'espeak-mark':
         offset = message.structure['offset']
         mark = message.structure['mark']
         print '%d:%s' % (offset, mark)

 pipe = gst.Pipeline('pipeline')

 src = gst.element_factory_make('espeak', 'src')
 src.props.text = text
 src.props.track = 2
 src.props.gap = 100
 pipe.add(src)

 sink = gst.element_factory_make('autoaudiosink', 'sink')
 pipe.add(sink)
 src.link(sink)

 bus = pipe.get_bus()
 bus.add_signal_watch()
 bus.connect('message', gstmessage_cb, pipe)

 pipe.set_state(gst.STATE_PLAYING)

 gtk.main()
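The mark names that espeak-mark messages report come from <mark/> tags embedded in the text. Independent of GStreamer, the tags can be listed with a regular expression, as in this sketch:

```python
import re

text = ('<mark name="mark to Hello"/>Hello, '
        '<mark name="mark for World"/>World!')

# Pull out the name attribute of every <mark .../> tag; the offsets in
# real espeak-mark messages are computed by espeak itself.
names = re.findall(r'<mark name="([^"]*)"/>', text)
print(names)  # -> ['mark to Hello', 'mark for World']
```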
Simple TTS example
 import gtk
 import gst
 import pango

 window = gtk.Window()
 window.connect('destroy',
                lambda sender: gtk.main_quit())

 workspace = gtk.VBox()
 window.add(workspace)

 # text widget

 scrolled = gtk.ScrolledWindow()
 workspace.pack_start(scrolled)

 text = gtk.TextView()
 scrolled.add(text)

 buffer = text.props.buffer
 buffer.props.text = file(__file__).read()

 tag = buffer.create_tag()
 tag.props.weight = pango.WEIGHT_BOLD

 # play controls

 toolbar = gtk.HBox()
 workspace.pack_end(toolbar, False)

 play = gtk.Button('Play/Resume')
 play.connect('clicked',
              lambda sender: pipe.set_state(gst.STATE_PLAYING))
 toolbar.add(play)

 pause = gtk.Button('Pause')
 pause.connect('clicked',
              lambda sender: pipe.set_state(gst.STATE_PAUSED))
 toolbar.add(pause)

 stop = gtk.Button('Stop')
 stop.connect('clicked',
              lambda sender: pipe.set_state(gst.STATE_NULL))
 toolbar.add(stop)

 # gst code

 pipe = gst.parse_launch('espeak name=src ! autoaudiosink')

 src = pipe.get_by_name('src')
 src.props.text = buffer.props.text
 src.props.track = 1  # track words

 def tts_cb(bus, message):
     if message.structure.get_name() != 'espeak-word':
         return

     offset = message.structure['offset']
     len = message.structure['len']

     buffer.remove_tag(tag, buffer.get_start_iter(), buffer.get_end_iter())
     start = buffer.get_iter_at_offset(offset)
     end = buffer.get_iter_at_offset(offset + len)
     buffer.apply_tag(tag, start, end)

 bus = pipe.get_bus()
 bus.add_signal_watch()
 bus.connect('message::element', tts_cb)

 # gtk start

 window.show_all()
 gtk.main()
Known issues
- espeak-ng requires v0.5.0; alternatively, on Debian and Ubuntu, use package version 0.4.0-3 or later, which carries the fix for Debian Bug #877750.
- espeak-word with espeak < 1.40.09 doesn't properly track words containing numbers (at least fully numeric words)
- if you are tracking espeak-word with espeak < 1.40.10, you should use gst-plugins-espeak-0.3x (or the 0.3 branch in the git repository)
Install
gstreamer-plugins-espeak is part of Sugar Platform 0.84, so just install the meta package (which one depends on your distro) along with the Sugar Platform.
XO
- attach the http://people.sugarlabs.org/~alsroot/xo/ repo (or just download the proper .rpm)
- install gstreamer-plugins-espeak package
Contacts
- Aleksey Lim
- James Cameron
- be involved and add yourself here

Resources
- Sources: https://github.com/sugarlabs/gst-plugins-espeak
- Previous sources: http://git.sugarlabs.org/projects/gst-plugins-espeak
- Tarballs: http://download.sugarlabs.org/sources/honey/gst-plugins-espeak/