Difference between revisions of "Activities/Read Etexts"

From Sugar Labs

Revision as of 09:58, 18 June 2009

Description & Goals

"Outside of a dog, a book is man's best friend. Inside of a dog it's too dark to read." -- Groucho Marx

The Read Etexts activity is meant to allow the XO laptop to read Project Gutenberg ETexts, which are plain text files. The original goal of this Activity was to create a stopgap for reading plain text files until the core Read activity was able to do that. Read Etexts has become much more than that, adding features that core Read does not have, like text to speech with word highlighting, and most recently the ability to search the Project Gutenberg offline catalog and download books.

Since the ManyBooks.net website offers Project Gutenberg titles as PDFs you might wonder why you would need an Activity to read plain text files. It is a matter of personal preference. If you have a choice between a text file and a PDF, you may find that the text file is easier on the eyes than a PDF, takes up less space in the Journal (especially in zip format), and uses less memory to read. You will also find that the offline catalog search (Books tab) is a really convenient way to download books.

The interface to Read Etexts is very similar to the core Read activity, which should not be surprising as the toolbar code was adapted from Read's toolbar. You can use the up and down arrows or the game controller to scroll pages, and the '+' and '-' keys to adjust the font size. Use Page Up and Page Down to move to the previous and next pages respectively.

Project Gutenberg is a website where you can download thousands of public domain books for free. There are books for every interest: classics, history, children's novels, science fiction, and much, much more. Browse By Library of Congress Class: Language and Literatures: Juvenile belles lettres will give you a list of books suitable for young readers.

Read Etexts can read books in plain text format or in Zip format. These are by far the most popular formats on the Gutenberg website. If for some reason you cannot use the Catalog search to get a book, you can also download books from the website using the Browse activity. You should download one of the Zip file formats. These can be encoded as us-ascii text or as iso-8859-1; Read Etexts can handle either one. The iso-8859-1 encoding is used for books that need accent marks, etc. Save the Zip file to the Journal, change the Journal entry name to match the title of the book, and then resume it using the Read Etexts option on the Resume menu. See the first screenshot.
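To illustrate why a single code path can handle both encodings, here is a minimal Python sketch of loading an etext from either a plain text file or a Zip file. It is not the activity's actual code: the function name `read_etext` and the rule of picking the first `.txt` member are assumptions made for the example.

```python
import zipfile


def read_etext(path):
    """Return the text of a plain-text or zipped etext as a str."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            # Assumption: the archive holds one text file; take the first .txt member.
            names = [n for n in archive.namelist() if n.lower().endswith(".txt")]
            if not names:
                raise ValueError("no .txt file found in %s" % path)
            raw = archive.read(names[0])
    else:
        with open(path, "rb") as handle:
            raw = handle.read()
    # iso-8859-1 maps every byte to a character and agrees with us-ascii
    # on bytes 0-127, so one decoder covers both encodings.
    return raw.decode("iso-8859-1")
```

Because us-ascii is a subset of iso-8859-1, decoding everything as iso-8859-1 never fails and gives the right result for both kinds of file.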

Current Features

  • Currently Read Etexts can be used to read Gutenberg Etexts, either as text files or as zip files containing one text file. The toolbars include Activity, Read (skip to page), Edit (copy to clipboard, search for text) and View (zoom text bigger or smaller). The Books toolbar comes up if you launch Read Etexts from the Activity ring. This toolbar supports searching the Project Gutenberg offline catalog and downloading books.
  • Book sharing is supported.
  • The power management code from the core Read activity has been added, with a few minor changes, and seems to work OK.
  • A new feature is text to speech with Karaoke highlighting. The purpose of this is to produce a tool to help someone learn to read. Support for text to speech on the XO laptop is done using a gstreamer plugin for espeak. This plugin currently is not part of the software included on the XO, but is installed on Sugar on a Stick. You do not need this plugin installed to use Read Etexts, but you will of course not have text to speech working unless you do.
  • The Books toolbar lets you search for books in Project Gutenberg's offline catalog. Enter words that you would expect to find in the title or author of a book, then press Enter. A table will appear in the lower half of the screen listing book titles and authors that contain all of the words. Select a book from that table and click the download button, and the book will arrive in a minute or so. The download tries to get the best available version of the book. For instance, it will try to download an 8 bit version of the book, and if there is none it will try to get a 7 bit version. (8 bit files contain accents and diacritical marks; the 7 bit versions do not. Not every text has both versions.)
  • You can download several books to the Journal in one session. Each book will be given a Journal entry when it downloads.
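The catalog search and the 8 bit to 7 bit fallback described in the list above can be sketched roughly as follows. This is an illustration only, not the activity's code. The catalog record layout, the function names, the `fetch` callable, and the file naming (`<id>-8.zip` for 8 bit, `<id>.zip` for 7 bit) are all assumptions made for the example, as is matching words as case-insensitive substrings of the combined title and author.

```python
def search_catalog(entries, query):
    """Return entries whose title or author contains every query word."""
    words = query.lower().split()
    if not words:
        return []
    matches = []
    for entry in entries:
        # Assumption: each entry is a dict with "title" and "author" keys.
        haystack = (entry["title"] + " " + entry["author"]).lower()
        if all(word in haystack for word in words):
            matches.append(entry)
    return matches


def download_best(book_id, fetch):
    """Try the 8 bit file first, then fall back to the 7 bit file.

    `fetch` is a hypothetical callable that takes a file name and returns
    its bytes, or None when the file does not exist.
    """
    # Assumed naming: "<id>-8.zip" is the 8 bit version, "<id>.zip" the 7 bit one.
    for name in ("%s-8.zip" % book_id, "%s.zip" % book_id):
        data = fetch(name)
        if data is not None:
            return name, data
    return None, None
```

Passing the download step in as `fetch` keeps the preference order separate from the network code, so the fallback can be tried out with a plain dictionary of file names.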

Using Text to Speech

Read Etexts uses software called Speech Dispatcher to read text aloud and to perform callbacks which enable the word being spoken to be highlighted. Speech Dispatcher is not yet included with the normal XO software distribution, but can be installed using the instructions found here.

To start text to speech you simply press the check mark button on the XO's display (Numeric Keypad "End" on a standard keyboard). This button will also pause and resume speech. Only the current page will be spoken, and always starting from the first word on the page unless you are resuming after pausing. You need to have the text control containing the text to be spoken in focus. I use the check button because you can use it when the XO is folded into its ebook reader configuration. There is also a Play/Pause button on the Speech tab of the toolbar that you can use instead.

If you do not have the Python bindings for speech-dispatcher installed you will not see the Speech toolbar. This is intentional. The toolbar is very much like the one in the Speak activity and was adapted from its code. It allows you to change the language, pitch, and rate of speech. You can only do this while the Activity is not speaking. You can pause the speech, change its rate, pitch, or language, and then resume.

The latest version of Read Etexts supports either speech-dispatcher or the gstreamer espeak plugin developed by Aleksey Lim for the Sugar project. This plugin should be part of Sugar on a Stick and future releases of Sugar for the XO. This plugin works much better than speech-dispatcher does and does not require running a daemon program or doing any configuration.
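Whichever speech back end is used, highlighting the spoken word requires mapping a callback from the speech engine to a range of characters on the page. The sketch below shows one way to precompute those ranges. It assumes the callback reports the index of the word being spoken, which may not match what either back end actually delivers, and the function names are hypothetical.

```python
import re


def word_spans(page_text):
    """Return (start, end, word) for each whitespace-delimited word on a page."""
    return [(m.start(), m.end(), m.group()) for m in re.finditer(r"\S+", page_text)]


def span_for_word(spans, word_index):
    """Return the (start, end) character range to highlight, or None if out of range."""
    if 0 <= word_index < len(spans):
        start, end, _ = spans[word_index]
        return start, end
    return None
```

A callback handler would call `span_for_word` with the reported index and apply a highlight to that range in the text view. Computing the spans once per page keeps the work done inside each callback small.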

Sharing Documents

This activity uses code adapted from the core Read activity for document sharing over the network. To share a document with someone that person must also have the Read Etexts activity installed, and it should be the same version for best results. You can invite an individual to join the activity or share it with the whole neighborhood, but either way only those who have the activity installed will see the invitation.

When someone accepts the invitation to join the activity, a copy of the document is sent to their computer to read. When they exit the activity the document will be saved in the Journal. The Journal entry will be titled "Read Etexts Activity", not the title of the book. Of course the recipient can modify this title to match the actual title of the book, but the activity currently will not do this for them.

This is a bit different from the core Read activity because I actually save a copy of the received document in the Journal, whereas Read does not. If you try to resume a shared Read activity when the document is not currently being shared you will get an empty document.

Planned Features

  • I plan to add an annotation feature that enables the user to highlight passages in the text and attach notes to pages in the text. These annotations and highlights will be stored in an XML file that will be included in the Zip file containing the document. When you share a document your annotations and highlights will go along with it. For the recipient it will be sort of like buying a used textbook that has all the important stuff already marked up. Text to Speech may or may not read these notes along with the text.
  • I plan to allow multiple bookmarks in a document, and have those bookmarks stored in the XML file with the annotations and highlights. This will be in addition to the current feature that remembers where you left off when you last read a document.
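Since these features are only planned, no file format exists yet. Purely as an illustration of the idea, the sketch below stores bookmarks and page notes in an XML file inside the book's Zip file and reads them back. The element names, the `annotations.xml` member name, and all function names are invented for the example.

```python
import xml.etree.ElementTree as ET
import zipfile


def build_annotations(bookmarks, notes):
    """Serialize bookmarks (page numbers) and notes ({page: text}) to XML bytes."""
    root = ET.Element("annotations")  # hypothetical element names throughout
    for page in sorted(bookmarks):
        ET.SubElement(root, "bookmark", page=str(page))
    for page, text in sorted(notes.items()):
        note = ET.SubElement(root, "note", page=str(page))
        note.text = text
    return ET.tostring(root, encoding="utf-8")


def add_annotations(zip_path, xml_bytes, member="annotations.xml"):
    """Append the annotation file to an existing book archive."""
    with zipfile.ZipFile(zip_path, "a") as archive:
        archive.writestr(member, xml_bytes)


def load_annotations(zip_path, member="annotations.xml"):
    """Read bookmarks and notes back out of the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        root = ET.fromstring(archive.read(member))
    bookmarks = [int(b.get("page")) for b in root.findall("bookmark")]
    notes = {int(n.get("page")): n.text or "" for n in root.findall("note")}
    return bookmarks, notes
```

Keeping the annotations in the same Zip file means they travel with the book when it is shared. Note that appending the same member name twice creates a duplicate entry in the archive, so a real implementation would need to rewrite the archive when annotations change.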

Bugs

  • Text to speech does not work perfectly. The gstreamer plugin combined with newer versions of espeak works much better than the original speech-dispatcher code did, but in Sugar on a Stick on some machines the text highlighting lags badly behind the spoken words.
  • Sharing documents works, but the progress report text does not. You only see the final totals, not the counts of bytes downloaded so far. View Slides had a similar problem which I fixed. When I tried the same code to fix Read Etexts it caused the Activity to hang, possibly because of multi-threading issues. Text documents are generally small enough that it is tolerable not to have this working.
  • There is no word-wrap feature for books with long lines. At the moment I consider this more of a feature than a bug, because all of the many thousands of Project Gutenberg Etexts available use a standard line width so word-wrapping isn't really necessary.

Source

http://git.sugarlabs.org/projects/readetexts