Speech-synthesis

Revision as of 15:51, 28 March 2009 by Chiragjain1989

About you

1. What is your name?

    Chirag Jain

2. What is your email address?

    chiragjain1989[AT]gmail[DOT]com

3. What is your Sugar Labs wiki username?

    chiragjain1989

4. What is your IRC nickname?

    chirag

5. What is your primary language? (We have mentors who speak multiple languages and can match you with one of them if you'd prefer.)

      Hindi and English

6. Where are you located, and what hours do you tend to work? (We also try to match mentors by general time zone if possible.)

    [TODO]

7. Have you participated in an open-source project before? If so, please send us URLs to your profile pages for those projects, or some other demonstration of the work that you have done in open-source. If not, why do you want to work on an open-source project this summer?

    I was not aware of open source before I entered college, but I then heard a lot about it from seminars and from my seniors. I started participating in coding events, and my first open-source event was the AI Challenge organized during our technical fest, for which I wrote the simulator code.

    Link: http://www.code.google.com/p/artificial-intelligence

    I also wrote an open-source Sudoku solver in C++ using backtracking. The algorithm has exponential time complexity.

    Link: http://www.code.google.come/p/sudoku-crazy

    Now that I know more about open source, I want to gain hands-on experience in open-source development. GSoC is an opportunity to apply my technical skills, learn new things and, at the same time, contribute something to society.


About your project

8. What is the name of your project?

    Speech Synthesis

9. Describe your project in 10-20 sentences. What are you making? Who are you making it for, and why do they need it? What technologies (programming languages, etc.) will you be using?

    I am trying to integrate speech into core Sugar, that is, to provide speech synthesis as a basic functionality in Sugar. In my view, and according to a survey [EXPLAIN], language learning can be a great experience when it is done with speech. The literacy rate can be increased by 6-10% if speech accompanies text, because our brains remember sounds more easily than text. I am therefore making this for children aged 5-15 who are learning a language, and for older learners as well.

I discussed this project at length with alsroot, assimd and besmac on IRC. The main points of discussion were:

1. The main aim for speech synthesis in Sugar is to integrate speech into core Sugar.

2. Integrating speech into core Sugar means providing a speech generator as a basic functionality. If any window containing text is open in Sugar, the selected text can be read out by an application running in the background.

3. The other aim is to develop a GUI for speech configuration, which will also act as a configuration management tool.

4. This tool can include basic facilities such as changing the volume, pitch, voice, accent and language.

5. Choosing the accent according to the locale is another important feature we are aiming for. Espeak already provides different accents for different languages.

6. Another nice idea, suggested by assimd, is a keyboard speaker: whenever the user presses a key, the activity speaks it aloud.
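The configuration options in points 4 and 5 could be modelled roughly as follows. This is only a sketch under stated assumptions, not the project's implementation: `SpeechSettings`, `voice_for_locale`, `build_espeak_command` and the `LOCALE_VOICES` table are hypothetical names, the default values are illustrative, and the sketch drives the `espeak` command-line program directly (through its `-v`, `-p`, `-s` and `-a` options) instead of going through a plugin layer.

```python
from dataclasses import dataclass

# Hypothetical locale-to-voice table. Real voice names should be checked
# against the output of `espeak --voices` on the target system.
LOCALE_VOICES = {"en_US": "en-us", "en_GB": "en", "hi_IN": "hi"}


@dataclass
class SpeechSettings:
    """User-adjustable speech options (illustrative defaults)."""
    voice: str = "en"    # passed to espeak -v
    pitch: int = 50      # passed to espeak -p, 0 to 99
    speed: int = 170     # passed to espeak -s, words per minute
    volume: int = 100    # passed to espeak -a (amplitude), 0 to 200


def voice_for_locale(locale_name, default="en"):
    """Pick a voice for a locale such as 'en_US', else fall back to default."""
    return LOCALE_VOICES.get(locale_name, default)


def build_espeak_command(text, settings):
    """Return the argument list for running espeak with these settings."""
    if not 0 <= settings.pitch <= 99:
        raise ValueError("pitch must be between 0 and 99")
    if not 0 <= settings.volume <= 200:
        raise ValueError("volume must be between 0 and 200")
    return [
        "espeak",
        "-v", settings.voice,
        "-p", str(settings.pitch),
        "-s", str(settings.speed),
        "-a", str(settings.volume),
        text,
    ]
```

On a machine with espeak installed, the returned list could be passed to `subprocess.call` to speak the text. A configuration GUI would then only need to read and write the four fields of the settings object.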

    Some rough ideas for implementation:

7. There are two options for a layer over the TTS engine espeak: one is speech dispatcher, which was created as a GSoC project last year, and the other is the gstreamer plugin.

8. Both of these use espeak. Listen and Spell uses speechd. However, when I discussed this with alsroot on IRC, he told me that using speechd is a bad idea because it has become a system daemon and requires root privileges to work. Therefore the gstreamer plugin is the best option.

9. For the GUI, PyGTK can be used.

10. To implement speech in the Sugar core, my idea is to use the clipboard module, which takes care of copy and paste in Sugar. Using this module, the entire selected text can be sent to the speech activity to be spoken.

12. For the keyboard speaker, we can simply store the keystrokes in a file and then send the file to the speech generator.

13. The basic idea is to provide a Read button in core Sugar (like the Home button) that is always present, so that if a user selects any text in the current window and presses the button, the text is spoken aloud.
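The keyboard speaker in point 12 needs some way to turn a key press into words before anything is sent to the speech generator. A minimal sketch of that step, assuming key presses arrive as GDK-style key names such as "Return" or "space"; `SPOKEN_NAMES`, `spoken_name` and `keystrokes_to_text` are hypothetical names, and handing the resulting string to the speech generator is left out:

```python
# Hypothetical table of spoken names for keys that are not plain letters or
# digits. The key names follow GDK's keyval naming, which is an assumption
# about how key presses would be reported to the activity.
SPOKEN_NAMES = {
    "space": "space",
    "Return": "enter",
    "BackSpace": "backspace",
    "Tab": "tab",
    "comma": "comma",
    "period": "full stop",
}


def spoken_name(key_name):
    """Return the words to speak for one key press."""
    if key_name in SPOKEN_NAMES:
        return SPOKEN_NAMES[key_name]
    # Announce upper-case letters explicitly so they differ from lower case.
    if len(key_name) == 1 and key_name.isupper():
        return "capital " + key_name.lower()
    # Lower-case letters, digits and unknown keys are spoken as they are.
    return key_name


def keystrokes_to_text(key_names):
    """Join the spoken names of several key presses into one utterance."""
    return ", ".join(spoken_name(name) for name in key_names)
```

Speaking each key as it is pressed would call `spoken_name` once per key, while the file-based approach described above would collect the key names first and call `keystrokes_to_text` on the whole batch.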


10. What is the timeline for development of your project? The Summer of Code work period is 7 weeks long, May 23 - August 10; tell us what you will be working on each week. (As the summer goes on, you and your mentor will adjust your schedule, but it's good to have a plan at the beginning so you have an idea of where you're headed.) Note that you should probably plan to have something "working and 90% done" by the midterm evaluation (July 6-13); the last steps always take longer than you think, and we will consider cancelling projects which are not mostly working by then.

     [TODO]

11. Convince us, in 5-15 sentences, that you will be able to successfully complete your project in the timeline you have described. This is usually where people describe their past experiences, credentials, prior projects, schoolwork, and that sort of thing, but be creative. Link to prior work or other resources as relevant.

     [TODO]

You and the community

12. If your project is successfully completed, what will its impact be on the Sugar Labs community? Give 3 answers, each 1-3 paragraphs in length. The first one should be yours. The other two should be answers from members of the Sugar Labs community, at least one of whom should be a Sugar Labs GSoC mentor. Provide email contact information for non-GSoC mentors.

     In my view, the main aim of Sugar Labs is to spread literacy in developing nations. It is a common experience that we learn faster by listening to things than by reading them. Providing speech in core Sugar will be like making Sugar 10-15% more effective. When children aged 5-15 who are learning languages hear the speech again and again, they will be able to learn very quickly. Moreover, they will now be able to listen to a story or any other text rather than just reading it. Another potential advantage is for blind students, who cannot read the text but can learn the language by listening to it.
    [TO MORE PARA]

13. Sugar Labs will be working to set up a small (5-30 unit) Sugar pilot near each student project that is accepted to GSoC so that you can immediately see how your work affects children in a deployment. We will make arrangements to either supply or find all the equipment needed. Do you have any ideas on where you would like your deployment to be, who you would like to be involved, and how we can help you and the community in your area begin it?

     I would greatly appreciate Sugar Labs' efforts if this is being planned. I think my home town, which is still underdeveloped and has many primary schools, would be the best place to set up this pilot. I have many friends there who are involved in such activities, and they would love to contribute here as well. There is also a primary school near my home where we can easily test the activity.

14. What will you do if you get stuck on your project and your mentor isn't around?

     Some very helpful seniors of mine are already associated with OLPC through other projects, and they are ready to help me in every way they can.
     If the problem still can't be resolved, I can always ask on IRC.
     Google is also a great option.
     I can also post the problem on the Sugar mailing list.

15. How do you propose you will be keeping the community informed of your progress and any problems or questions you might have over the course of the project?

     I will regularly post my progress reports on my wiki page.
     I can also mail my progress reports to the Sugar mailing list.

Miscellaneous

 
An example of the kind of screenshot of your first modification to your development environment which you should include in your application. Note that the drop-down menu text has Mel's email address in place of the word "Restart" - your screenshot should contain your email instead.

16. We want to make sure that you can set up a development environment before the summer starts. Please send us a link to a screenshot of your Sugar development environment with the following modification: when you hover over the XO-person icon in the middle of Home view, the drop-down text should have your email in place of "Restart." See the image on the right for an example. It's normal to need assistance with this, so please visit our IRC channel, #sugar on irc.freenode.net, and ask for help.

     [TODO]

17. What is your t-shirt size? (Yes, we know Google asks for this already; humor us.)

     Extra Large

18. Describe a great learning experience you had as a child.

     [TODO]

19. Is there anything else we should have asked you or anything else that we should know that might make us like you or your project more?

     [TODO]