Line 3: |
Line 3: |
| === Object of this HowTo === | | === Object of this HowTo === |
| | | |
− | This HowTo explain how to update the data files in the wikipedia activities or create new activities with other languages or selection of different. | + | This HowTo explains how to update the data files in the wikipedia activities or how to create new activities with other languages or different selections of articles. |
| | | |
− | Is not very difficult if you already have a Sugar environment. If you have doubts or the information provided is not adequate, please contact me at gonzalo at laptop dot org or in the sugar-devel mailing list and I will try to help and improve this page.
| + | The procedure is not very difficult if you already have a Sugar environment set up. If you have doubts or the information provided is not adequate, please contact me at ''gonzalo at laptop dot org'' or in the sugar-devel mailing list and I will try to help and improve this page. |
| | | |
− | If you want create a wikipedia activity in your language, and do not have the technical resources, but can help translating a few files and doing quality control, contact me and I will help you to create the activity. | + | If you want to create a wikipedia activity in your language, and do not have the technical resources, but can help translating a few files and doing quality control, contact me and I will help you to create the activity. |
| | | |
| === How to Create a new wikipedia activity or update an existing activity === | | === How to Create a new wikipedia activity or update an existing activity === |
| | | |
| This page describes how to generate the data files needed to create a wikipedia activity like | | This page describes how to generate the data files needed to create a wikipedia activity like |
− | [http://activities.sugarlabs.org/es-ES/sugar/addon/4401 Wikipedia es] or [http://activities.sugarlabs.org/es-ES/sugar/addon/4411 Wikipedia en] | + | [http://activities.sugarlabs.org/es-ES/sugar/addon/4401 Wikipedia es] or [http://activities.sugarlabs.org/es-ES/sugar/addon/4411 Wikipedia en]. |
| | | |
− | The general idea is to download an XML dump-file (backup) containing the current Wikipedia pages for a given language, this will be processed to select certain pages and compress them into a self-contained Sugar activity. Whether or not to include the images from the wiki articles will have a large impact on the size of the activity. | + | The general idea is to download an XML dump-file (backup) containing the current Wikipedia pages for a given language, then process the dump and select certain pages and compress them into a self-contained Sugar activity. Whether or not to include the images from the wiki articles will have a large impact on the size of the activity. |
| | | |
− | Generating a Wikipedia activity requires a computer with a lot of available disk space, ideally lots of RAM and a working Sugar environment. It is probably best to use packages provided by your favorite Linux distribution or in a virtual machine. The wikipedia xml file is very large (almost 6 GB for the Spanish wikipedia, and it is even bigger in English), and you will need lots of space to generate temporary files. The process has a long run-time, but it is mostly automated, although you will need to confirm success at each stage of the process before moving on to the next. | + | Generating a Wikipedia activity requires a computer with a lot of available disk space, ideally lots of RAM and a working Sugar environment. It is probably best to use packages provided by your favorite Linux distribution or in a virtual machine. The wikipedia xml file is very large (almost 6 GB for the Spanish wikipedia, and it is even bigger in English), and you will need lots of space to generate temporary files. The process does take a lot of time, but it is mostly automated, although you will need to confirm success at each stage of the process before moving on to the next one. |
| | | |
| == Download the wikipedia base activity == | | == Download the wikipedia base activity == |
| | | |
− | You will need to download the wikipedia base from http://dev.laptop.org/~gonzalo/wikiserver/WikipediaBase-33.xo. This package includes the activity and the tools to create the data files. | + | You will need to download the wikipedia base from http://dev.laptop.org/~gonzalo/wikiserver/WikipediaBase-35.xo. This package includes the activity and the tools to create the data files. |
| | | |
− | You need unzip it in your Activities directory, or install it, if you do not have other wikipedia activity already installed. | + | You need to unzip it in your Activities directory, or install it, if you do not have another wikipedia activity already installed. |
| | | |
− | The git repository is here http://dev.laptop.org/git/projects/wikiserver | + | The git repository is here https://github.com/godiard/wikipedia-activity . |
| | | |
| == Download a Wikipedia dump file== | | == Download a Wikipedia dump file== |
Line 128: |
Line 128: |
| | | |
| mv eswiki-20111112-pages-articles.xml.processed_expanded eswiki-20111112-pages-articles.xml.processed | | mv eswiki-20111112-pages-articles.xml.processed_expanded eswiki-20111112-pages-articles.xml.processed |
− | ../tools2/create_index.py --delete_all | + | ../tools2/create_index.py --delete_old |
| | | |
− | The option --delete_all is used to remove the old index | + | The option --delete_old is used to remove the old index |
| | | |
| If you want to include images in your wikipedia activity, you can go again to your data directory and do: | | If you want to include images in your wikipedia activity, you can go again to your data directory and do: |
Line 157: |
Line 157: |
| in the header and replace the entities for stroke_color and fill_color, after that. | | in the header and replace the entities for stroke_color and fill_color, after that. |
| | | |
− | * activity_''lang''.py: is the startup class, sets the configuration values and starts the server. | + | |
| + | * '''DEPRECATED, SEE BELOW:''' activity_''lang''.py: is the startup class, sets the configuration values and starts the server. |
| You can copy the class from another language and set the parameters. You need to set the name of the class | | You can copy the class from another language and set the parameters. You need to set the name of the class |
| to the same value as the exec value in the activity/activity.info.lang file. | | to the same value as the exec value in the activity/activity.info.lang file. |
Line 166: |
Line 167: |
| If you base your favorites list on a translation of the home page from another language, it would be a good idea to translate the home page too. | | If you base your favorites list on a translation of the home page from another language, it would be a good idea to translate the home page too. |
| | | |
− | Now, you can test your changes, starting the wikipedia server: | + | |
| + | '''DEPRECATED, SEE BELOW:''' Now, you can test your changes, starting the wikipedia server: |
| | | |
| ./activity_''lang''.py es_lat/eswiki-20111112-pages-articles.xml 8000 | | ./activity_''lang''.py es_lat/eswiki-20111112-pages-articles.xml 8000 |
Line 183: |
Line 185: |
| | | |
| Now, in the directory dist, a new .xo file will be created and you can distribute it. | | Now, in the directory dist, a new .xo file will be created and you can distribute it. |
| + | |
| + | === Notes on updates in the process === |
| + | |
| + | After version 38, to make the process more standard and allow distributions to package the activity,
| + | we added a standard setup.py. To use it, you need to add the wikipedia initialization
| + | parameters to the activity.info file, as shown in the file activity.info.en_simple:
| + | |
| + | https://github.com/godiard/wikipedia-activity/blob/master/activity/activity.info.en_simple |
| + | |
| + | [Wikipedia] |
| + | path = en_simple/simplewiki-20130724-pages-articles.xml |
| + | port = 8011 |
| + | home_page = /static/index_en_simple.html |
| + | templateprefix = Template: |
| + | wpheader = From Wikipedia, The Free Encyclopedia |
| + | wpfooter = Content available under the |
| + | <a href="/static/es-gfdl.html">GNU Free Documentation License</a>. |
| + | <br/> Wikipedia is a registered trademark of the non-profit |
| + | Wikimedia Foundation, Inc.<br/><a href="/static/about_en.html"> |
| + | About Wikipedia</a> |
| + | resultstitle = Search results for '%s'. |
| + | |
| + | Another important change is that you no longer need to create an activity_<lang>.py file,
| + | because the activity starts and reads its configuration from the activity.info file. The "exec" line needs to be:
| + | |
| + | exec = sugar-activity activity.WikipediaActivity |
| + | |
| + | Then to create the .xo you can do: |
| + | |
| + | ./setup.py dist_xo es_lat/eswiki-20111112-pages-articles.xml |
| + | |
| + | or to create the sources tar.bz2 file: |
| + | |
| + | ./setup.py dist_source es_lat/eswiki-20111112-pages-articles.xml |
| + | |
| + | With this new version, testing the wiki can be done on the command line doing: |
| + | |
| + | ./test_server.py es_lat/eswiki-20111112-pages-articles.xml 8000 |
| + | |
| + | Both parameters are optional; if they are not provided, the values in the activity.info file are used.
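
The [Wikipedia] section shown above can be read with Python's standard configparser module. What follows is a minimal sketch, not the activity's actual code: the function name read_wikipedia_settings and the SAMPLE text are illustrative assumptions. Interpolation is disabled because the resultstitle value contains a literal %s, which configparser's default interpolation would reject.

```python
import configparser

# Hypothetical excerpt of an activity.info file, based on the example above.
SAMPLE = """\
[Wikipedia]
path = en_simple/simplewiki-20130724-pages-articles.xml
port = 8011
home_page = /static/index_en_simple.html
templateprefix = Template:
resultstitle = Search results for '%s'.
"""


def read_wikipedia_settings(text):
    """Return the [Wikipedia] settings from activity.info-style text."""
    # interpolation=None keeps the literal '%s' in resultstitle untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser["Wikipedia"]
    return {
        "path": section["path"],
        "port": section.getint("port"),  # stored as text, converted to int
        "home_page": section["home_page"],
        "templateprefix": section["templateprefix"],
        "resultstitle": section["resultstitle"],
    }


settings = read_wikipedia_settings(SAMPLE)
print(settings["path"], settings["port"])
```

A check like this can help confirm that the path and port in activity.info are the ones you expect before building the .xo.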
| | | |
| == Other changes needed == | | == Other changes needed == |
Line 189: |
Line 231: |
| | | |
| If the images are not displayed in the pages after the files have been processed, check whether the image identifier is included in the set imageKeywords in the file mwlib/parser.py. For example, in the Quechua wikipedia, the image identifier is "rikcha", and we needed to add it because it was not included. | | If the images are not displayed in the pages after the files have been processed, check whether the image identifier is included in the set imageKeywords in the file mwlib/parser.py. For example, in the Quechua wikipedia, the image identifier is "rikcha", and we needed to add it because it was not included. |
| + | |
| + | == More tools == |
| + | |
| + | === Big image files === |
| + | |
| + | In some cases a small group of images is very big. If you want to remove them to get a smaller activity, you can do:
| + | |
| + | mkdir big-images |
| + | find images -size +100k -exec mv {} big-images \; |
| + | |
| + | (In this example, images larger than 100k are moved to another directory.)
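
If you prefer to do the same from Python, for example to get a list of what was moved, the find command can be approximated as follows. This is only a sketch under stated assumptions: the function name move_big_images is hypothetical and not part of the wikipedia tools, and, like the mv command above, it overwrites a file in the target directory if another one with the same name is moved there.

```python
import shutil
from pathlib import Path


def move_big_images(images_dir, target_dir, min_bytes=100 * 1024):
    """Move every regular file larger than min_bytes out of images_dir.

    Returns the sorted names of the files that were moved.
    """
    images = Path(images_dir)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    # Collect the candidates first so that moving files does not
    # interfere with the directory walk.
    candidates = [p for p in images.rglob("*") if p.is_file()]

    moved = []
    for path in candidates:
        if path.stat().st_size > min_bytes:
            shutil.move(str(path), str(target / path.name))
            moved.append(path.name)
    return sorted(moved)
```

Calling move_big_images("images", "big-images") from the data directory would correspond to the two shell commands above; a different threshold can be passed through min_bytes.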
| + | |
| + | == Old information == |
| + | |
| + | http://wiki.laptop.org/go/User:Godiard/WkipediaDataRebuild |