08:08, 28 March 2009
'''The rest of this wiki page will refer to the steps proposed for GSoC '09 as the project.'''
 
'''I. The Speech Service:'''
    
The speech service will be a daemon running in the background that can be activated to provide speech input to the Sugar interface. The user can start this daemon and 'initiate' it via a hotkey. The daemon will transfer the audio to the Julius speech engine and process its output to generate a stream of keystrokes that are passed as an input method to other activities. The generated text can also be any Unicode character or string and is not restricted to X11 XKeyEvent data (which helps with foreign languages).
 
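When run in module or stdout mode, Julius reports each hypothesis on a line such as `sentence1: <s> HELLO WORLD </s>` (the exact shape depends on configuration). A minimal sketch, assuming that output format, of how the service could extract the recognized text before turning it into key events:

```python
import re

def extract_text(julius_line):
    """Pull the recognized words out of a Julius 'sentence1:' line.

    Assumes output of the form 'sentence1: <s> WORD WORD </s>';
    returns None for lines that are not final recognition results.
    """
    match = re.match(r"sentence1:\s*(.*)", julius_line)
    if match is None:
        return None
    words = match.group(1).split()
    # Drop the sentence-boundary markers Julius places around the hypothesis.
    words = [w for w in words if w not in ("<s>", "</s>")]
    # The joined words become the Unicode text handed to the focused activity.
    return " ".join(words)
```

Because the service works with the decoded string rather than raw key symbols, the text it forwards can carry any Unicode characters.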
So the architecture will be like:
 
                       Speech Engine <---> Service <---> Activity
    
Flow:
 
# User activates the Speech Service.
# The service listens for audio when a particular key combo is pressed (a small popup/notification can be shown to indicate that the daemon is listening).
# The service passes the spoken audio to the speech engine.
# The output of the speech engine is grabbed and analyzed to determine what the input is.
# If the input is a recognized 'Word Command', the service performs that command; otherwise it generates key events as mentioned above and sends them to the currently focused window.
# This continues until the user deactivates the Speech Service.
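The dispatch in step 5 can be sketched as a small function. Everything here is hypothetical scaffolding: the command table is illustrative, and `send_keystrokes` stands in for the key-event injection described above.

```python
# Illustrative table of recognized 'Word Commands'; a real service would
# map these to window-manager or activity actions.
WORD_COMMANDS = {
    "next page": lambda: "paged",
    "scroll down": lambda: "scrolled",
}

def handle_utterance(text, send_keystrokes):
    """Run a recognized 'Word Command', else type the text.

    send_keystrokes is the (hypothetical) callback that injects key
    events into the currently focused window.
    """
    command = WORD_COMMANDS.get(text.lower())
    if command is not None:
        return command()
    # Not a command: forward the text to the focused window as key events.
    send_keystrokes(text)
    return None
```

The service's main loop would call this once per recognized utterance until the user deactivates it.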
    
This approach will simplify quite a few aspects and will be efficient.  
 
'''II. Demonstrate its utility using Listen and Spell:'''
    
Beyond this, I would like to implement an activity that demonstrates the use of this service well. I plan to implement Speak Spell, a spelling activity where children can spell out the words shown to them. Single-character recognition can have very high recognition rates.
 
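One reason single-character recognition can be so accurate is that Julius supports restricted finite-state grammars, so the engine can be told that each utterance is exactly one letter. A hypothetical fragment in Julius's grammar format (the phoneme symbols are illustrative and depend on the acoustic model in use):

```
# letter.grammar -- each utterance is exactly one letter
S : NS_B LETTER NS_E

# letter.voca -- category word lists with pronunciations (illustrative)
% NS_B
<s>     sil
% NS_E
</s>    sil
% LETTER
A       ey
B       b iy
C       s iy
```

Constraining the search space to 26 candidates per utterance makes misrecognition far less likely than in open dictation.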