Changes

no edit summary
Line 23: Line 23:  
'''Where are you located, and what hours (UTC) do you tend to work? (We also try to match mentors by general time zone if possible.)'''
 
'''Where are you located, and what hours (UTC) do you tend to work? (We also try to match mentors by general time zone if possible.)'''
   −
I live in Asunción, Paraguay. Standard time zone is UTC/GMT -4 hours.
+
I live in Asunción, Paraguay. Standard time zone is UTC/GMT -4 hours. I plan to work on this project in the afternoon, probably from 10 AM to 6 PM UTC.
I plan to work on this project in the afternoon, probably from 10 AM to 6 PM UTC.
   
    
 
    
 
'''Have you participated in an open-source project before? If so, please send us URLs to your profile pages for those projects, or some other demonstration of the work that you have done in open-source. If not, why do you want to work on an open-source project this summer?'''
 
'''Have you participated in an open-source project before? If so, please send us URLs to your profile pages for those projects, or some other demonstration of the work that you have done in open-source. If not, why do you want to work on an open-source project this summer?'''
Line 40: Line 39:  
* A tiny feature for the Birdie Twitter client: https://github.com/birdieapp/birdie/pull/68
 
* A tiny feature for the Birdie Twitter client: https://github.com/birdieapp/birdie/pull/68
   −
Even though these contributions are small, I think they show that I am familiar with the open-source development workflow and that I am motivated to collaborate.
+
Even though these contributions are small, I think they show that I am familiar with the open-source development workflow and that I am motivated to collaborate. I have been an open-source software user for more than 5 years now, and for me this project is a great chance to give something back to the community.
 
  −
I have been an open-source software user for more than 5 years now, and for me this project is a great chance to give something back to the community.
        Line 53: Line 50:  
'''Describe your project in 10-20 sentences. What are you making? Who are you making it for, and why do they need it? What technologies (programming languages, etc.) will you be using?'''
 
'''Describe your project in 10-20 sentences. What are you making? Who are you making it for, and why do they need it? What technologies (programming languages, etc.) will you be using?'''
   −
The main goal of Sugar Listens is to provide an easy-to-use speech recognition API to educational content developers, within the Sugar Learning Platform.
+
The main goal of Sugar Listens is to provide an easy-to-use speech recognition API to educational content developers, within the Sugar Learning Platform. This will allow developers to integrate speech-enabled interfaces into their Sugar Activities, letting users interact with Sugar through voice commands.
 
  −
This will allow developers to integrate speech-enabled interfaces into their Sugar Activities, letting users interact with Sugar through voice commands.
  −
 
  −
Introducing voice user interfaces to Sugar Activities will enable richer, and arguably more natural, human-computer interactions.
     −
Perhaps more importantly, such interfaces are a promising opportunity to make Sugar available to people with certain disabilities.
+
Introducing voice user interfaces to Sugar Activities will enable richer, and arguably more natural, human-computer interactions. Perhaps more importantly, such interfaces are a promising opportunity to make Sugar available to people with certain disabilities.
   −
I will use Pocketsphinx, an open-source speech recognition engine developed as a research project at Carnegie Mellon University, to implement the core speech recognition capabilities.
+
I will use Pocketsphinx, an open-source speech recognition engine developed as a research project at Carnegie Mellon University, to implement the core speech recognition capabilities. The Voxforge Project provides acoustic models for several languages, one of which should be used according to the language of choice.
   −
The Voxforge Project provides acoustic models for several languages, one of which should be used according to the language of choice.
+
Appropriate models should probably be downloaded according to the locale of the system, to avoid wasting resources such as disk space and bandwidth. In order to provide a high-level API to access speech-recognition functionality, Pocketsphinx will be exposed as a D-Bus service available to Sugar Activities.
   −
Appropriate models should probably be downloaded according to the locale of the system, to avoid wasting resources such as disk space and bandwidth.
+
My programming language of choice will be Python. It is the main Sugar Platform language and Python bindings are available for Pocketsphinx. Expected results of this project include not only the code, but also proper documentation of the API and a proof-of-concept voice-user interface for a Sugar Activity. An idea I have is to add new speech recognition blocks to Turtle Blocks.
 
  −
In order to provide a high-level API to access speech-recognition functionality, Pocketsphinx will be exposed as a D-Bus service available to Sugar Activities.
  −
 
  −
My programming language of choice will be Python. It is the main Sugar Platform language and Python bindings are available for Pocketsphinx.
  −
 
  −
Expected results of this project include not only the code, but also proper documentation of the API and a proof-of-concept voice-user interface for a Sugar Activity. An idea I have is to add new speech recognition blocks to Turtle Blocks.
      
Additionally, packaging the implemented solution as a .rpm package ready to be included in the repositories is desirable.
 
Additionally, packaging the implemented solution as a .rpm package ready to be included in the repositories is desirable.
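The locale-based model selection described above could look roughly like the following. This is only a minimal sketch: the model identifiers, the `AVAILABLE_MODELS` mapping and the function names are hypothetical placeholders, not actual Voxforge or Pocketsphinx names, and no download logic is shown.

```python
"""Sketch: choose an acoustic model from the system locale (all names hypothetical)."""

# Hypothetical mapping from language code to a model identifier.
# The proposal does not name concrete Voxforge models or URLs.
AVAILABLE_MODELS = {
    "en": "voxforge-en",
    "es": "voxforge-es",
    "de": "voxforge-de",
}
DEFAULT_LANGUAGE = "en"


def language_from_locale(locale_name):
    """Extract the language code from a POSIX locale string such as 'es_PY.UTF-8'."""
    if not locale_name:
        return DEFAULT_LANGUAGE
    # Drop the encoding ('.UTF-8') and modifier ('@euro'), then the territory ('_PY').
    base = locale_name.split(".")[0].split("@")[0]
    language = base.split("_")[0].lower()
    return language or DEFAULT_LANGUAGE


def select_model(locale_name):
    """Return the model for the locale's language, falling back to the default."""
    language = language_from_locale(locale_name)
    return AVAILABLE_MODELS.get(language, AVAILABLE_MODELS[DEFAULT_LANGUAGE])


if __name__ == "__main__":
    print(select_model("es_PY.UTF-8"))
```

Only the model matching the selected language would then need to be fetched, which is how this approach saves disk space and bandwidth.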
Line 88: Line 75:  
| 02/06 - 08/06 ||  Allow Activities to publish their custom language models and acoustic dictionaries.<br>Define a custom grammar-based language model for Turtle Blocks.<br>Publish the custom language model from Turtle Blocks to the speech recognition daemon to use it instead of the default one.
 
| 02/06 - 08/06 ||  Allow Activities to publish their custom language models and acoustic dictionaries.<br>Define a custom grammar-based language model for Turtle Blocks.<br>Publish the custom language model from Turtle Blocks to the speech recognition daemon to use it instead of the default one.
 
|-
 
|-
| 09/06 - 15/06 ||  Test and bugfix custom models support:<br>* Test support for custom acoustic models.<br>* Test support for custom grammar-based language models.<br>* Test support for custom statistical language models.
+
| 09/06 - 15/06 ||  Test and bugfix custom models support, which should include custom acoustic models and custom (statistical and grammar-based) language models.
 
|-
 
|-
 
| 16/06 - 22/06 ||  Download and use Voxforge models according to the locale of the system.<br>Smart acoustic/language models setting on Activity startup/close.<br>The speech recognition daemon should restart only if there are any model changes associated with Activity switches.
 
| 16/06 - 22/06 ||  Download and use Voxforge models according to the locale of the system.<br>Smart acoustic/language models setting on Activity startup/close.<br>The speech recognition daemon should restart only if there are any model changes associated with Activity switches.
Line 106: Line 93:  
'''Convince us, in 5-15 sentences, that you will be able to successfully complete your project in the timeline you have described. This is usually where people describe their past experiences, credentials, prior projects, schoolwork, and that sort of thing, but be creative. Link to prior work or other resources as relevant.'''
 
'''Convince us, in 5-15 sentences, that you will be able to successfully complete your project in the timeline you have described. This is usually where people describe their past experiences, credentials, prior projects, schoolwork, and that sort of thing, but be creative. Link to prior work or other resources as relevant.'''
   −
I am a 24-year-old last-year Computer Science Engineering student at Universidad Nacional de Asunción, Paraguay.
+
I am a 24-year-old last-year Computer Science Engineering student at Universidad Nacional de Asunción, Paraguay. I am also a member of Juky Paraguay, a group for Paraguayan Sugar developers to write code, share ideas and mostly have fun.
   −
I am also a member of Juky Paraguay, a group for Paraguayan Sugar developers to write code, share ideas and mostly have fun.
+
I have been working on my engineering thesis project, which has a strong focus on speech recognition and voice-enabled user interfaces, for almost a year now. Its title loosely translates to: “Design of Speech Recognition Based User Interfaces”. Some of my early work can be found at: https://github.com/jorgeramirez/step
   −
I have been working on my engineering thesis project, which has a strong focus on speech recognition and voice-enabled user interfaces, for almost a year now. Its title loosely translates to: “Design of Speech Recognition Based User Interfaces”.
+
As a part of my thesis, I developed a voice-user interface to control TamTam Listens, an existing Sugar Activity for music composition. For TamTam Listens, I programmed a daemon process to run the Pocketsphinx speech recognition engine, which produced text output based on user-pronounced voice commands. Text output was later parsed to get the commands in the appropriate format.
   −
Some of my early work can be found at: https://github.com/jorgeramirez/step
+
Recognized commands were published through a D-Bus service which allowed TamTam Listens to integrate speech recognition with
 +
minimum coupling. The final step was to modify TamTam Listens in order to make the graphical interface respond to the commands. Later on, a usability study was conducted with 12 users in order to draw conclusions about speech-based user interfaces.
   −
As a part of my thesis, I developed a voice-user interface to control TamTam Listens, an existing Sugar Activity for music composition.
+
The architecture of the developed solution resembles the one included in the project description to a great degree. I used, and in consequence I am familiar with, Pocketsphinx, Voxforge and D-Bus. Although some improvements are still needed, such as multi-language support, I believe my experience with the field and the tools would be of great help to the success of the project.
 
  −
Later on, a usability study was conducted with 12 users in order to draw conclusions about speech-based user interfaces.
  −
 
  −
The architecture of the developed solution resembles the one included in the project description to a great degree. I used, and in consequence I am familiar with, Pocketsphinx, Voxforge and D-Bus.
  −
 
  −
Although some improvements are still needed, such as multi-language support, I believe my experience with the field and the tools would be of great help to the success of the project.
      
<big>'''You and the community'''</big>
 
<big>'''You and the community'''</big>
Line 126: Line 108:  
'''If your project is successfully completed, what will its impact be on the Sugar Labs community? Give 3 answers, each 1-3 paragraphs in length. The first one should be yours. The other two should be answers from members of the Sugar Labs community, at least one of whom should be a Sugar Labs GSoC mentor. Provide email contact information for non-GSoC mentors.'''
 
'''If your project is successfully completed, what will its impact be on the Sugar Labs community? Give 3 answers, each 1-3 paragraphs in length. The first one should be yours. The other two should be answers from members of the Sugar Labs community, at least one of whom should be a Sugar Labs GSoC mentor. Provide email contact information for non-GSoC mentors.'''
   −
'''Me:''' As mentioned before, speech-enabled user interfaces for Sugar Activities will allow richer, and perhaps more natural, interactions between users and the computer.
+
'''Me:''' As mentioned before, speech-enabled user interfaces for Sugar Activities will allow richer, and perhaps more natural, interactions between users and the computer. Personally, the most meaningful reward would be to make Sugar Activities (and educational opportunities in general) accessible to more people.
 
  −
Personally, the most meaningful reward would be to make Sugar Activities (and educational opportunities in general) accessible to more people.
      
'''Martín Abente Lahaye:''' Speech-recognition technologies are interaction mechanisms that, nowadays, have evolved from "alternative" to "extended". Proof of this is the proliferation of such technologies in a wide range of domains, from smartphone assistants, medical-record transcriptions, smart cars, and TV command controls to many others. In this regard, not much has been seen in the education domain.
 
'''Martín Abente Lahaye:''' Speech-recognition technologies are interaction mechanisms that, nowadays, have evolved from "alternative" to "extended". Proof of this is the proliferation of such technologies in a wide range of domains, from smartphone assistants, medical-record transcriptions, smart cars, and TV command controls to many others. In this regard, not much has been seen in the education domain.
Line 139: Line 119:     
If I get stuck at some point, I would probably look for a hint in the documentation and/or
 
If I get stuck at some point, I would probably look for a hint in the documentation and/or
the community.
+
the community. If neither helps, I would probably work on another feature while my mentor
 
  −
If neither helps, I would probably work on another feature while my mentor
   
is not available.
 
is not available.
   Line 159: Line 137:  
'''Describe a great learning experience you had as a child.'''
 
'''Describe a great learning experience you had as a child.'''
   −
When I was 8 years old, I remember having trouble understanding some basic math
+
When I was 8 years old, I remember having trouble understanding some basic math concept that I was supposed to learn at the time. I can’t remember what it was exactly, though. I recall that one day in class, a classmate asked precisely about that concept. Instead of just telling him I didn’t get it yet, I’m not sure why, I did my best to explain it to him.
concept that I was supposed to learn at the time. I can’t remember what it was exactly, though.
  −
 
  −
I recall that one day in class, a classmate asked precisely about that concept. Instead of just telling him I didn’t get it yet, I’m not sure why, I did my best to explain it to him.
  −
 
  −
During my explanation, I remember finally understanding the concept. Something just clicked inside my head. Trying to help a friend helped me to get rid of that annoying learning block. I felt awesome.
     −
I learned an important lesson that day: while learning is itself a rewarding process, learning by helping others is a much more fulfilling experience.
+
During my explanation, I remember finally understanding the concept. Something just clicked inside my head. Trying to help a friend helped me to get rid of that annoying learning block. I felt awesome. I learned an important lesson that day: while learning is itself a rewarding process, learning by helping others is a much more fulfilling experience.
    
'''Is there anything else we should have asked you or anything else that we should know that might make us like you or your project more?'''
 
'''Is there anything else we should have asked you or anything else that we should know that might make us like you or your project more?'''