Changes

Line 3: Line 3:  
=== Object of this HowTo ===
This HowTo explains how to update the data files in the wikipedia activities, or how to create new activities for other languages or with different selections of articles.
The procedure is not very difficult if you already have a Sugar environment set up. If you have doubts or the information provided is not adequate, please contact me at ''gonzalo at laptop dot org'' or on the sugar-devel mailing list and I will try to help and improve this page.
If you want to create a wikipedia activity in your language and do not have the technical resources, but can help by translating a few files and doing quality control, contact me and I will help you create the activity.
    
=== How to Create a new wikipedia activity or update an existing activity ===
    
This page describes how to generate the data files needed to create a wikipedia activity like
[http://activities.sugarlabs.org/es-ES/sugar/addon/4401 Wikipedia es] or [http://activities.sugarlabs.org/es-ES/sugar/addon/4411 Wikipedia en].
The general idea is to download an XML dump file (backup) containing the current Wikipedia pages for a given language, then process the dump to select certain pages and compress them into a self-contained Sugar activity. Whether or not to include the images from the wiki articles will have a large impact on the size of the activity.
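To get a feel for what such a dump contains before running the real tools, the page titles can be listed by streaming through the XML. This is only a minimal sketch, assuming Python 3 and the usual MediaWiki export layout (<code>page</code> elements with a <code>title</code> child); the function name <code>iter_page_titles</code> is hypothetical and this is not how the tools in the base activity process the dump.

```python
import io
import xml.etree.ElementTree as ET


def iter_page_titles(source):
    """Yield page titles from a MediaWiki XML dump without loading it all.

    `source` may be a file name or an open binary file object.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        # Tags carry an XML namespace such as "{http://...}page"; drop it.
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag == "page":
            for child in elem:
                if child.tag.rsplit("}", 1)[-1] == "title":
                    yield child.text
            elem.clear()  # release the content of the finished page


if __name__ == "__main__":
    # Tiny stand-in for a real dump, for illustration only.
    sample = (b'<mediawiki><page><title>Sugar</title></page>'
              b'<page><title>Wikipedia</title></page></mediawiki>')
    for title in iter_page_titles(io.BytesIO(sample)):
        print(title)
```

Streaming matters here because the dump is far too large to parse into memory in one go.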
Generating a Wikipedia activity requires a computer with a lot of available disk space, ideally lots of RAM, and a working Sugar environment. It is probably best to use the packages provided by your favorite Linux distribution, or to work in a virtual machine. The wikipedia XML file is very large (almost 6 GB for the Spanish wikipedia, and even bigger for English), and you will need lots of space for temporary files. The process takes a long time, but it is mostly automated, although you will need to confirm success at each stage before moving on to the next one.
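Because the dump and the temporary files need so much room, it can be worth checking the free space before starting. The following is a minimal sketch, assuming Python 3; the function name <code>has_enough_space</code> and the 20 GB default are illustrative assumptions, not figures from this HowTo (the text above only says the Spanish dump is almost 6 GB).

```python
import shutil


def has_enough_space(path=".", required_gb=20.0):
    """Return True if the filesystem holding `path` has `required_gb` GB free.

    The 20 GB default is an assumed safety margin, not a measured requirement.
    """
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024 ** 3


if __name__ == "__main__":
    if has_enough_space("."):
        print("Enough free space to start.")
    else:
        print("Free some disk space before downloading the dump.")
```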
    
== Download the wikipedia base activity ==
You will need to download the wikipedia base from http://dev.laptop.org/~gonzalo/wikiserver/WikipediaBase-35.xo. This package includes the activity and the tools to create the data files.
You need to unzip it in your Activities directory, or install it if you do not have another wikipedia activity already installed.
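A .xo bundle is a zip archive, so the unzip step can also be scripted. This is a minimal sketch, assuming Python 3; the function name <code>unpack_bundle</code> is hypothetical, and the location of the Activities directory (commonly <code>~/Activities</code>) is an assumption you should adjust to your setup.

```python
import zipfile
from pathlib import Path


def unpack_bundle(bundle_path, activities_dir):
    """Extract a .xo bundle (a zip archive) into the Activities directory.

    Returns the sorted names of the top-level entries that were extracted.
    """
    activities_dir = Path(activities_dir)
    activities_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(bundle_path) as bundle:
        bundle.extractall(activities_dir)
        top_level = {name.split("/")[0] for name in bundle.namelist()}
    return sorted(top_level)


if __name__ == "__main__":
    # Assumed locations: adjust the bundle path and target directory.
    print(unpack_bundle("WikipediaBase-35.xo", Path.home() / "Activities"))
```

Note that Python's zipfile module does not restore the executable bit of extracted files, so the scripts in the bundle may need <code>chmod +x</code> afterwards; the command-line <code>unzip</code> tool does not have this limitation.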
The git repository is at https://github.com/godiard/wikipedia-activity.
    
== Download a Wikipedia dump file==
Line 128: Line 128:     
  mv eswiki-20111112-pages-articles.xml.processed_expanded eswiki-20111112-pages-articles.xml.processed
  ../tools2/create_index.py --delete_old
The option --delete_old is used to remove the old index.
    
If you want to include images in your wikipedia activity, you can go back to your data directory and run:
Line 157: Line 157:  
in the header and replace the entities for stroke_color and fill_color, after that.

* '''DEPRECATED, SEE BELOW:''' activity_''lang''.py: the startup class; it sets the configuration values and starts the server.
You can copy the class from another language and set the parameters. The name of the class must match
the exec value in the activity/activity.info.lang file.
Line 166: Line 167:  
If you create your favorites list based on a translation of the home page from another language, it would be a good idea to translate the home page too.

'''DEPRECATED, SEE BELOW:''' Now you can test your changes by starting the wikipedia server:
    
  ./activity_''lang''.py es_lat/eswiki-20111112-pages-articles.xml 8000
Line 183: Line 185:     
Now, in the directory dist, a new .xo file will be created and you can distribute it.

=== Notes on updates in the process ===

After version 38, we added a standard setup.py to make the process more standard and to allow the activity to be packaged in distributions. To use it, you need to add the wikipedia initialization parameters to the activity.info file, as shown in the file activity.info.en_simple:

https://github.com/godiard/wikipedia-activity/blob/master/activity/activity.info.en_simple

  [Wikipedia]
  path = en_simple/simplewiki-20130724-pages-articles.xml
  port = 8011
  home_page = /static/index_en_simple.html
  templateprefix = Template:
  wpheader = From Wikipedia, The Free Encyclopedia
  wpfooter = Content available under the
    <a href="/static/es-gfdl.html">GNU Free Documentation License</a>.
    <br/> Wikipedia is a registered trademark of the non-profit
    Wikimedia Foundation, Inc.<br/><a href="/static/about_en.html">
    About Wikipedia</a>
  resultstitle = Search results for '%s'.
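The section above follows the INI layout that Python's <code>configparser</code> understands, so it can be checked quickly before building. This is a minimal sketch, assuming Python 3; the function name <code>read_wikipedia_settings</code> is hypothetical, the sample is shortened, and this is not necessarily how the activity itself reads the file.

```python
import configparser

# Shortened copy of the [Wikipedia] section, for illustration only.
SAMPLE = """\
[Wikipedia]
path = en_simple/simplewiki-20130724-pages-articles.xml
port = 8011
home_page = /static/index_en_simple.html
templateprefix = Template:
resultstitle = Search results for '%s'.
"""


def read_wikipedia_settings(text):
    """Parse the [Wikipedia] section of activity.info text into a dict."""
    # interpolation=None keeps the literal '%s' in resultstitle; with the
    # default interpolation, reading that value would raise an error.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser["Wikipedia"]
    settings = dict(section)
    settings["port"] = section.getint("port")
    return settings


if __name__ == "__main__":
    for key, value in read_wikipedia_settings(SAMPLE).items():
        print(key, "=", value)
```

With <code>configparser</code>, a multi-line value such as wpfooter is only read as one value if its continuation lines are indented, as they are in the listing above.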

Another important change is that you no longer need to create an activity_<lang>.py file, because the activity starts and reads its configuration from the activity.info file. The "exec" line needs to be:

  exec = sugar-activity activity.WikipediaActivity

Then, to create the .xo, you can do:

  ./setup.py dist_xo es_lat/eswiki-20111112-pages-articles.xml

or, to create the sources tar.bz2 file:

  ./setup.py dist_source es_lat/eswiki-20111112-pages-articles.xml

With this new version, you can test the wiki from the command line:

  ./test_server.py es_lat/eswiki-20111112-pages-articles.xml 8000

Both parameters are optional; if they are not provided, the values in the activity.info file will be used.
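The fallback from command-line arguments to activity.info values can be pictured as follows. This is a hypothetical sketch, assuming Python 3; <code>resolve_server_args</code> is an illustrative name and this is not the code of test_server.py.

```python
def resolve_server_args(argv, defaults):
    """Return (path, port), preferring command-line values over defaults.

    `argv` holds the arguments after the script name; `defaults` is a dict
    with "path" and "port" keys, e.g. values read from activity.info.
    """
    path = argv[0] if len(argv) > 0 else defaults["path"]
    port = int(argv[1]) if len(argv) > 1 else int(defaults["port"])
    return path, port


if __name__ == "__main__":
    import sys

    # Assumed defaults, standing in for the values from activity.info.
    defaults = {"path": "en_simple/simplewiki-20130724-pages-articles.xml",
                "port": 8011}
    print(resolve_server_args(sys.argv[1:], defaults))
```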
    
== Other changes needed ==
Line 189: Line 231:     
If, after processing the files, the images are not displayed in the pages, check whether the image identifier is included in the set imageKeywords in the file mwlib/parser.py. For example, in the Quechua wikipedia the image identifier is "rikcha", and we needed to add it because it was not included.

== More tools ==

=== Big image files ===

In some cases a small group of images is very big. If you want to remove them to get a smaller activity, you can do:

  mkdir big-images
  find images -size +100k -exec mv {} big-images \;

(This example moves images larger than 100k to another directory.)
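The same clean-up can be scripted if you want a list of what was moved. This is a minimal sketch, assuming Python 3; the function name <code>move_big_images</code> is hypothetical, and the directory names follow the example above. Like the <code>find</code> command, it walks subdirectories and moves files larger than 100 KiB.

```python
import shutil
from pathlib import Path


def move_big_images(images_dir="images", dest_dir="big-images", min_kib=100):
    """Move files larger than `min_kib` KiB from images_dir into dest_dir.

    Returns the sorted names of the files that were moved.
    """
    images = Path(images_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    # Build the list first so moving files does not disturb the directory walk.
    candidates = [p for p in images.rglob("*") if p.is_file()]
    for path in candidates:
        if path.stat().st_size > min_kib * 1024:
            shutil.move(str(path), str(dest / path.name))
            moved.append(path.name)
    return sorted(moved)


if __name__ == "__main__":
    for name in move_big_images():
        print("moved", name)
```

Keeping the big files in a separate directory, rather than deleting them, makes it easy to put some of them back if an important image turns out to be missing.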

== Old information ==

http://wiki.laptop.org/go/User:Godiard/WkipediaDataRebuild