Activities/Wikipedia/HowTo
This page describes how to create the data files needed to build a Wikipedia activity such as
[http://activities.sugarlabs.org/es-ES/sugar/addon/4401 Wikipedia es] or [http://activities.sugarlabs.org/es-ES/sugar/addon/4411 Wikipedia en].
The general idea is to download an XML dump (backup) of the current state of the Wikipedia pages, process it to select a number of pages, and compress them for inclusion in an activity. Optionally, you can also download the images used in those pages.
You will need a computer with plenty of disk space and a working Sugar environment, either installed from the packages provided by your Linux distribution or running in a virtual machine. The Wikipedia XML file is large (almost 6 GB for the Spanish Wikipedia, more for the English one), and you need additional space for the temporary files. The process also takes a long time, but it is automatic: you only need to check the state at the end of each stage.
This page is a work in progress. If you have questions or the information provided is not good enough, please contact me at gonzalo at laptop dot org and I will try to improve it.
== Download the wikipedia base activity ==
Download the Wikipedia base from http://dev.laptop.org/~gonzalo/wikibase.zip. This file includes the activity and the tools to create the data files.
Create a directory inside your Activities directory, for example WikipediaEs.activity, and unzip wikibase.zip inside it.
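The unpacking step can be sketched in Python as follows. This is only an illustration: the helper name unpack_wikibase is hypothetical, and the example call assumes that wikibase.zip has already been downloaded and that your activities live in ~/Activities.

```python
import zipfile
from pathlib import Path


def unpack_wikibase(zip_path, activities_dir, activity_name="WikipediaEs.activity"):
    """Extract wikibase.zip into <activities_dir>/<activity_name>.

    Returns the path of the activity directory that was created.
    """
    target = Path(activities_dir) / activity_name
    # Create the activity directory (and any missing parents) if needed.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target


# Example call (assumption: wikibase.zip is in the current directory and
# activities are stored in ~/Activities):
# unpack_wikibase("wikibase.zip", Path.home() / "Activities")
```

Running unzip by hand from a terminal gives the same result; the function only makes the directory layout explicit.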
== Download a dump ==
Wikipedia provides an almost daily XML dump for every language.
This test was done with the Spanish dump. The file used was eswiki-20111112-pages-articles.xml.bz2 from http://dumps.wikimedia.org/eswiki/20110810/
Create a directory inside the activity directory you created and download the Wikipedia dump file into it.
The first two letters of the directory name must be the language code, for example es_es or en_us.
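A minimal Python sketch of this step is shown below. The function names make_dump_dir and download_dump are hypothetical, and the check that the two-letter code is followed by an underscore or the end of the name is an assumption based on the examples es_es and en_us. The dump URL is left as a parameter; take it from the dump listing for your language.

```python
import re
import shutil
import urllib.request
from pathlib import Path


def make_dump_dir(activity_dir, dir_name):
    """Create <activity_dir>/<dir_name> after checking the language prefix.

    Assumption: a valid name starts with two lowercase letters followed by
    an underscore or the end of the name (es_es, en_us, ...).
    """
    if not re.match(r"^[a-z]{2}(_|$)", dir_name):
        raise ValueError(
            "directory name must start with a two-letter language code: %r" % dir_name
        )
    dump_dir = Path(activity_dir) / dir_name
    dump_dir.mkdir(parents=True, exist_ok=True)
    return dump_dir


def download_dump(url, dump_dir):
    """Stream the file at `url` into dump_dir, keeping the remote file name."""
    dest = Path(dump_dir) / url.rsplit("/", 1)[-1]
    # copyfileobj copies in chunks, so the multi-gigabyte dump is never
    # held in memory all at once.
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest
```

The download is streamed to disk because the dump is several gigabytes; a tool such as wget works equally well.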
[...] in another directory to accelerate the process.
== Modify your activity to use the data files ==
Modify the file activity_es.py, changing the lines:
 self.WIKIDB = 'es_new/eswiki-20111112-pages-articles.xml'
 self.HOME_PAGE = '/static/index_es.html'
to point to your new data files, or create a new file, for example activity_pt.py.
If you create a new file, you will also need to modify the file activity/activity.info to point to it.
You can also create a new icon, or modify the existing activity/activity-wikipedia-es.svg file.
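If you prefer to generate the new file instead of editing by hand, the two assignments can be rewritten with a small Python helper. This is a sketch, not part of the Wikipedia base tools: the function name retarget_activity is hypothetical, and it assumes the two lines appear in activity_es.py exactly in the self.WIKIDB = '...' and self.HOME_PAGE = '...' form shown above. The Portuguese file names in the example call are placeholders.

```python
import re


def retarget_activity(source_text, wikidb, home_page):
    """Return source_text with WIKIDB and HOME_PAGE pointing at new values."""

    def replace(attr, value, text):
        # Match: self.<attr> = '<anything>' (single or double quotes).
        pattern = r"(self\.%s\s*=\s*)(['\"]).*?\2" % attr
        new_text, count = re.subn(
            pattern, lambda m: m.group(1) + repr(value), text
        )
        if count == 0:
            raise ValueError("no assignment to self.%s found" % attr)
        return new_text

    text = replace("WIKIDB", wikidb, source_text)
    return replace("HOME_PAGE", home_page, text)


# Example call (assumption: placeholder names for a Portuguese activity):
# with open("activity_es.py") as src:
#     new_source = retarget_activity(
#         src.read(),
#         "pt_new/ptwiki-pages-articles.xml",
#         "/static/index_pt.html",
#     )
# with open("activity_pt.py", "w") as dst:
#     dst.write(new_source)
```

The helper raises an error when an assignment is not found, so a silent mismatch between the template and the real file is noticed immediately.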