Changes

USpeak (view source)

Revision as of 15:15, 28 March 2009

399 bytes added, 15:15, 28 March 2009

no edit summary

Line 65: Line 65:

I have been working towards achieving this goal for the past 6 months. The task can be accomplished by breaking the problem into the following smaller subsets and tackling them one by one:

−

# '''''Port an existing speech engine to the less powerful computers like XO.''''' ( This has been a part of the work that I have been doing so far. I chose Julius as the Speech engine as it is lighter and written in C. I have been able to compile Julius on the XO and am continuing to optimize it to make it work faster.)

+

# '''''Port an existing speech engine to the less powerful computers like XO.''''' This has been a part of the work that I have been doing so far. I chose Julius as the speech engine because it is lighter and written in C. I have been able to compile Julius on the XO and am continuing to optimize it to make it run faster. The XO-1 is also the bare-minimum case on which I'll be testing it: if it works there, it should work on more capable hardware as well.

# '''''Writing a system service that will take speech as an input and generate corresponding keystrokes and then proceed as if the input was given through the keyboard.''''' This method was suggested by Benjamin M. Schwartz as a simpler approach as compared to writing a speech library in Python (which would use DBUS to connect the engine to the activities) in which case changes have to be made to the existing activities to use the library.

# '''''Starting with recognition of the letters of a language's alphabet rather than full-blown speech recognition.''''' This will give an achievable target for the initial stages. As the alphabet is limited to a small number of letters for most languages, this target will be feasible considering both computational power requirements and attainable efficiency.

Line 127: Line 127:

The above code will send one character to the window. This can be looped to generate a continuous stream. (An even nicer way to do this would be to set a timer delay so that it looks like a typed stream.)

+

Sayamindu has also pointed me to the XTEST extension, which seems to be the easier way. I'll do some research on it and write my findings back in this section. It has useful routines such as XTestFakeKeyEvent and XTestFakeButtonEvent, which will make this task easier.
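The looping and timer-delay idea above can be sketched roughly as follows. This is only an illustration in Python, not the Xlib code from this page: `type_stream`, the `send_key` callback, and the 0.05 second default delay are hypothetical names and values chosen for the example. The `send_key` callback is where a real backend (for instance one built on XTestFakeKeyEvent) would be plugged in.

```python
import time


def type_stream(text, send_key, delay=0.05, sleep=time.sleep):
    """Send each character of `text` through `send_key`, pausing between keys.

    `send_key` is a hypothetical callback standing in for the real X11
    backend; `delay` (in seconds) is an assumed value, not a measured one.
    Returns the number of characters sent.
    """
    sent = 0
    for index, char in enumerate(text):
        if index > 0:
            sleep(delay)  # pause between keys so the output looks typed
        send_key(char)
        sent += 1
    return sent


if __name__ == "__main__":
    # Stand-in backend: just print the character that would be sent.
    type_stream("hello", lambda char: print("key:", char))
```

Passing `sleep` as a parameter keeps the pacing logic separate from the backend, so the loop can be exercised without an X server.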

Command Mode:

Similarly, a whole host of events can be catered to using the X11 Input Server. Words like "Close" etc (which will be defined in a list of commands that the engine will recognize) need not be parsed and broken into letters and can just be sent as events like XCloseDisplay().
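One possible way to separate command words from ordinary dictated letters is sketched below. It is a minimal Python illustration under stated assumptions: the `COMMANDS` table, the action names such as `CLOSE_WINDOW`, and the `dispatch` function are all hypothetical, and the actual command list would be whatever the engine is configured to recognize.

```python
# Hypothetical command list; the real one would be defined alongside the engine.
COMMANDS = {
    "close": "CLOSE_WINDOW",
    "enter": "KEY_RETURN",
}


def dispatch(token, commands=COMMANDS):
    """Map one recognized token to a list of (kind, value) actions.

    A known command word becomes a single ("command", name) action.
    Any other token is broken into one ("key", char) action per character,
    which a keystroke backend could then send one by one.
    """
    word = token.strip().lower()
    if word in commands:
        return [("command", commands[word])]
    return [("key", char) for char in token]


if __name__ == "__main__":
    for spoken in ("Close", "hi"):
        print(spoken, "->", dispatch(spoken))
```

Keeping the lookup in a plain table means new command words can be added without touching the keystroke path.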

−

All of this basically needs to be wrapped in a single service that can run in the background. That service can be implemented as a Sugar Feature that enables starting and stopping of this service.

Mavu

52

edits
