Changes

322 bytes added, 11:42, 30 March 2009
no edit summary
Line 81:
# Writing a system service that has support for recognition of characters and a demonstration that it works by running it with Listen and Spell.
 
− # Introduce modes in the system service. Dictation mode will process input as a stream of characters as described in deliverable 1 and a new mode called command mode will process the audio input to recognize a known set of commands.
+ # Introduce modes in the system service. Dictation mode will process input as a stream of characters and send the corresponding keystrokes, while command mode will process the audio input to recognize a known set of commands. (A rough sketch of this split follows the list.)
 
# Make a recording tool/activity so users can use it to make their own models and improve them for their own needs.
 
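As a rough illustration of how that split between the two modes might look inside the service (a Python sketch; the class name and the command table below are illustrative, not taken from the proposal), the service keeps track of the active mode and either forwards recognized characters as keystrokes or looks the utterance up in a table of known commands:

 # Sketch of mode handling in the speech service; names are illustrative only.
 DICTATION, COMMAND = "dictation", "command"
 
 def send_keystroke(char):
     # Placeholder: dictation mode would forward the character to the focused
     # window via X11, as sketched under "Dictation Mode" below.
     pass
 
 # Hypothetical table mapping spoken commands to actions.
 COMMANDS = {
     "scroll down": lambda: None,
     "next activity": lambda: None,
 }
 
 class SpeechService(object):
     def __init__(self):
         self.mode = DICTATION
 
     def on_recognized(self, text):
         if self.mode == DICTATION:
             # Dictation mode: treat the result as a stream of characters.
             for char in text:
                 send_keystroke(char)
         else:
             # Command mode: only react to utterances from the known set.
             action = COMMANDS.get(text.strip().lower())
             if action is not None:
                 action()
 
 service = SpeechService()
 service.on_recognized("hello")        # dictation: forwards h, e, l, l, o
 service.mode = COMMAND
 service.on_recognized("scroll down")  # command: runs the mapped action
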
Line 113:
Dictation Mode:
 
− This can be done via simple calls to the X11 Server. Here is a snippet of how that can be done.
+ In this mode, the user's speech will be recognized and the corresponding keystrokes will be sent as is. This can be done via simple calls to the X11 Server. Here is a snippet of how that can be done.
    
  // Get the currently focused window.
 
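Only the first comment of that snippet is visible in this diff. As a rough illustration of the same idea in Python, using the python-xlib bindings (an assumption of this sketch; the proposal's own snippet talks to Xlib from C, and the character-to-keysym mapping here is simplified), the service could look up the focused window and send it a key press/release pair for each recognized character:

 from Xlib import X, XK, display
 from Xlib.protocol import event
 
 def send_keystroke(char):
     d = display.Display()
     # Get the currently focused window.
     focused = d.get_input_focus().focus
     # Map the recognized character to a keysym, then to a keycode.
     keycode = d.keysym_to_keycode(XK.string_to_keysym(char))
     # Send a KeyPress followed by a KeyRelease to the focused window.
     for kind in (event.KeyPress, event.KeyRelease):
         ev = kind(time=X.CurrentTime, root=d.screen().root, window=focused,
                   child=X.NONE, root_x=0, root_y=0, event_x=0, event_y=0,
                   state=0, same_screen=1, detail=keycode)
         focused.send_event(ev, propagate=True)
     d.sync()
 
 send_keystroke("a")   # types 'a' into whichever window currently has focus
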
Line 165:
Major Components:
 
− # A language model browser which shows all the current samples and dictionary. Can create new ones or delete exisiting ones.
+ # A language model browser which shows all the current samples and dictionary. Can create new ones or delete existing ones. (A small sketch follows this list.)
 
# Ability to edit/record new samples and input new dictionary entries and save changes.
 
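As a very rough sketch of the browser part (Python/PyGTK, with an invented list of sample files; the real tool would read them from the user's model directory), the current samples could simply be listed in a gtk.TreeView:

 import gtk
 
 # Hypothetical sample list; the real tool would scan the model directory.
 samples = ["sample-001.wav", "sample-002.wav"]
 
 store = gtk.ListStore(str)
 for name in samples:
     store.append([name])
 
 view = gtk.TreeView(store)
 view.append_column(gtk.TreeViewColumn("Sample", gtk.CellRendererText(), text=0))
 
 win = gtk.Window()
 win.set_title("Language model browser (sketch)")
 win.connect("destroy", gtk.main_quit)
 win.add(view)
 win.show_all()
 gtk.main()
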
Line 177:
The coding will be done in C, shell scripts and Python; recording will be done on an external computer and the compiled model will be stored on the XO. I own an XO from my previous efforts, so I plan to work natively on it and test the performance in real time.
 
+ The recording utility will be implemented using PyGTK for the UI and <code>aplay</code> and <code>arecord</code> for the play and record commands.
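
A minimal sketch of the record and play helpers behind that UI (Python; the file name, duration and sample rate are illustrative), assuming <code>arecord</code> and <code>aplay</code> from alsa-utils are available on the XO:

 import subprocess
 
 def record_sample(path, seconds=5, rate=16000):
     # Record a mono, 16-bit WAV sample for model training.
     subprocess.call(["arecord", "-f", "S16_LE", "-c", "1",
                      "-r", str(rate), "-d", str(seconds), path])
 
 def play_sample(path):
     # Play a previously recorded sample back to the user.
     subprocess.call(["aplay", path])
 
 record_sample("sample-001.wav")
 play_sample("sample-001.wav")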
    
----
 
Line 206 → Line 207:
'''Fourth Week:'''
 
+ * Add a few basic commands.
 
* Implement the mode menu.
 
− * Add command mode.
+ * Put the existing functionality in command mode and make provisions for the dictation mode.

− '''Milstone 1 Completed'''
+ '''Milestone 1 Completed'''
       
'''Fifth Week:'''
 
 
* Complete the interface
 
− * Start writing code for the model browser and recorder.
+ * Start writing code for the language browser and recorder.
    
'''Sixth Week:'''
 
− * Complete the language model browser.
+ * Complete the language browser.
 
* Write the recording and dictionary creation code for the tool.
 
 
* Package everything in an activity.
 
Line 230 → Line 232:
'''Infinity and Beyond:'''
 
− * Continue with pursuit of perfecting this system on Sugar by increasing accuracy, performing algorithmic optimizations and making new Speech Oriented Activities. :)
+ * Continue the pursuit of perfecting this system on Sugar by increasing accuracy, performing algorithmic optimizations and making new Speech Oriented Activities.
     