Difference between revisions of "Activities/Get Internet Archive Books"

From Sugar Labs

Revision as of 12:12, 29 June 2009

Description & Goals

The Internet Archive is a website containing around a million public domain ebooks created by scanning page images from books in various libraries. Because of this, the ebooks' pages look like the books they came from, including illustrations and other page decorations. It may be the best source of free books for younger readers, as well as for books in languages other than English.

This Activity will use the Advanced Search capabilities of the Internet Archive website to let the user browse the website's catalog, get information on the books in it, and download those books to the Journal. Its user interface is similar to the offline catalog search of Read Etexts. Where that Activity is used for both getting books and reading them, this one concerns itself only with getting the books, so that they may be read with the Read Activity.

Current Features

The Activity allows searching on Title and Author. The books found are listed in a table containing Author, Title, Volume (if any) and Language. Selecting an entry in the table displays other metadata about the book above the table: the book's description, subject, publisher, etc. The user may then download the selected book to the Journal, where it is given a title meta tag containing the title and author, and an appropriate MIME type.
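As an illustration of the Title and Author search described above, here is a minimal sketch of how an Advanced Search request could be built. It is not the Activity's actual code: the endpoint URL, the field list, the `mediatype:texts` filter, the JSON output choice and the helper name `build_search_url` are all assumptions made for this example, and search terms are not escaped.

```python
from urllib.parse import urlencode

# Assumed endpoint: the Internet Archive's public Advanced Search script.
SEARCH_ENDPOINT = "https://archive.org/advancedsearch.php"

# Fields for the results table plus the detail area above it.
# The exact list is an assumption made for this sketch.
RESULT_FIELDS = ["identifier", "creator", "title", "volume", "language",
                 "description", "subject", "publisher"]


def build_search_url(title="", author="", rows=50):
    """Return an Advanced Search URL for texts matching title and/or author.

    Hypothetical helper: terms are passed through unescaped, so characters
    that are special in the query syntax (quotes, parentheses) are not handled.
    """
    clauses = ["mediatype:texts"]
    if title:
        clauses.append("title:(%s)" % title)
    if author:
        clauses.append("creator:(%s)" % author)
    params = [("q", " AND ".join(clauses))]
    # One fl[] parameter per requested field.
    params += [("fl[]", field) for field in RESULT_FIELDS]
    params += [("rows", str(rows)), ("output", "json")]
    return SEARCH_ENDPOINT + "?" + urlencode(params)
```

Calling `build_search_url(title="alice", author="carroll")` yields a URL that can be fetched with any HTTP client; the rows of the response would then fill the Author, Title, Volume and Language columns.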

The metadata for the book is stored in the Journal entry's Description field.
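The following sketch shows one way the Journal metadata described above could be assembled from a search result. The helper name `build_journal_metadata`, the "Title, Author" layout, the label order and the dictionary keys on the input side are assumptions for illustration; the output keys `title`, `description` and `mime_type` are meant to match the names a Journal entry's metadata normally uses, but the code is kept free of Sugar imports so it runs anywhere.

```python
def build_journal_metadata(book, mime_type="application/pdf"):
    """Build Journal metadata for a downloaded book (hypothetical helper).

    `book` is assumed to be a dict holding one search result, with values
    that are either strings or lists of strings.
    """
    title = book.get("title", "Unknown title")
    author = book.get("creator")
    if author:
        # Title tag containing both title and author.
        title = "%s, %s" % (title, author)

    # Everything else goes into the Description field, one item per line.
    labels = (("Description", "description"), ("Subject", "subject"),
              ("Publisher", "publisher"), ("Language", "language"),
              ("Volume", "volume"))
    lines = []
    for label, key in labels:
        value = book.get(key)
        if not value:
            continue
        if isinstance(value, (list, tuple)):
            value = "; ".join(value)
        lines.append("%s: %s" % (label, value))

    return {"title": title,
            "description": "\n".join(lines),
            "mime_type": mime_type}
```

In an Activity, the returned values would be copied into the metadata of the new Journal entry before the downloaded file is written to it.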

Planned Features

I plan to support other formats for downloading and will add new formats when it is possible to use these formats in Sugar Activities. The most common formats are:

  • PDF
  • Black and White PDF
  • DJVU
  • Text

Unlike Project Gutenberg texts, the Internet Archive's text files are created with OCR software, with no attempt to format or proofread them, so I probably won't offer the Text format.
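A possible way to handle the formats listed above is a small table mapping each one to a file name suffix and a MIME type. This is only a sketch: the suffixes follow a naming pattern commonly seen on Internet Archive items, but they are an assumption here and not every item has every file. The download URL layout and the helper name `download_url` are also assumptions.

```python
# Format name -> (assumed file name suffix, MIME type).
FORMATS = {
    "PDF": (".pdf", "application/pdf"),
    "Black and White PDF": ("_bw.pdf", "application/pdf"),
    "DJVU": (".djvu", "image/vnd.djvu"),
    "Text": ("_djvu.txt", "text/plain"),
}


def download_url(identifier, format_name):
    """Return (url, mime_type) for one book in one format.

    Hypothetical helper: `identifier` is the item identifier returned by a
    catalog search. Raises KeyError for a format not in FORMATS.
    """
    suffix, mime_type = FORMATS[format_name]
    url = "https://archive.org/download/%s/%s%s" % (
        identifier, identifier, suffix)
    return url, mime_type
```

Keeping the MIME type next to the suffix means the Journal entry can be tagged correctly for whichever format is downloaded, and adding a format later is a one-line change.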

Bugs

  1. Progress reporting for downloads currently works in my test environment but not when running on an XO.
  2. I'd like the Activity to start up with the search field having the focus. I put in code that should do this, but it isn't working.
  3. The results table displays nicely without horizontal scrolling in my test environment but needs horizontal scrolling on the XO.

Source

http://git.sugarlabs.org/projects/get-internet-archive-books