Difference between revisions of "Service/mirrors"

From Sugar Labs
m (remove cacheboy from manually mantained list)
 
(14 intermediate revisions by 6 users not shown)
Line 2: Line 2:
 
[[Category:Resource]]
 
[[Category:Resource]]
  
==Introduction==
+
== Introduction ==
A content delivery network or content distribution network (CDN) is a system of computers containing copies of data, placed at various points in a network so as to maximize bandwidth for access to the data from clients throughout the network. A client accesses a copy of the data near to the client, as opposed to all clients accessing the same central server, thereby causing a bottleneck near that server.
+
A content delivery network or Content Delivery Network (CDN) is a system of computers containing copies of data, placed at various points in a network so as to maximize bandwidth for access to the data from clients throughout the network. A client accesses a copy of the data near to the client, as opposed to all clients accessing the same central server, thereby causing a bottleneck near that server.
  
Mirrorbrain, Bouncer, Fedora Mirror Manager, and Cacheboy are four possible choices for scaling up the Sugar Labs content delivery network. Long term, Cacheboy looks like it might be the best fit for Sugar Labs, but the project must become more stable before becoming the primary Sugar Labs CDN. Bouncer is currently being used successfully by Mozilla, but its development ended at the end of 2008. Mozilla is currently investigating Mirrorbrain, which looks promising.
+
== Goals ==
 +
* Reduce bandwidth at primary download server.
 +
* Improve quality of service for users.
 +
* Move content closer to users, thus reducing latency.
 +
 
 +
== Architecture ==
 +
The Sugar Labs Content Delivery Network uses [http://www.mirrorbrain.org/ MirrorBrain] as a redirector. The redirector, which lives in a Sugar Labs data center, keeps track of which files are available on which mirror. When a user requests a file, the redirector points the user to the correct mirror and automatically starts the file download.
 +
 
 +
== Mirrors ==
 +
The current list of available mirrors is available at http://mirrors.sugarlabs.org/
 +
 
 +
== Considerations ==
 +
 
 +
=== Bandwidth ===
 +
 
 +
To run a mirror you need a lot of bandwidth! You should look at [http://stats.sugarlabs.org/download.sugarlabs.org/ the total bandwidth used by all the mirrors].
  
==Goals==
+
If you have trouble with bandwidth, you should look at [http://www.cloudflare.com CloudFlare].
* Reduce bandwith at primary download server.
+
 
* Improve quality of service for users.
+
=== HDD Space ===
* Move content closer to users.
+
 
 +
Hosting a mirror takes a lot of space.  If you don't have a lot of space you can only choose to mirror some parts.  For example exclude all directories but the activities (~13gb):
 +
 
 +
  rsync -avzh rsync://download.sugarlabs.org/pub --exclude 'dextrose' --exclude 'hexoquinasa' --exclude 'images' --exclude 'sources' --exclude 'docs' --exclude 'packages' --exclude 'soas' /rsync/download.sugarlabs.org
 +
 
 +
== Setting up a new mirror ==
 +
 
 +
=== For mirror administrators ===
 +
All you need is a web server with enough bandwidth to serve the files. To set up a new mirror, the site administrator needs to:
 +
 
 +
* First lets make a directory to store the data:
 +
 
 +
  mkdir /rsync
 +
  mkdir /rsync/download.sugarlabs.org
 +
 
 +
* Then lets use rsync to download the data (warning: takes a long time)
 +
 
 +
  rsync -avzh rsync://download.sugarlabs.org/pub /rsync/download.sugarlabs.org
 +
 
 +
* Save the rsync command as a shell script and make it executable:
 +
 
 +
  echo "rsync -avzh rsync://download.sugarlabs.org/pub /rsync/download.sugarlabs.org" > /rsync/download.sugarlabs.org/sync.sh
 +
  chmod 774 /rsync/download.sugarlabs.org/sync.sh
 +
 
 +
* Then lets make this to sync automatically.  We can use a cron job to do that.  You could make sync every 2 hours:
 +
 
 +
  echo "0 */2 * * * /rsync/download.sugarlabs.org/sync.sh" > asloSyncCronJob.txt
 +
  crontab asloSyncCronJob.txt
  
==Mirrors==
+
If you don't want it to sync every 2 hours, have a look at [https://www.digitalocean.com/community/tutorials/how-to-use-cron-to-automate-tasks-on-a-vps a cron tutorial] to change that value.
</noinclude>
 
<includeonly>==[[Infrastructure Team/Content Delivery Network|Content Delivery Network]]==</includeonly>
 
 
=== Local Mirrors ===
 
  
Here is a list of localized mirrors. Please select a mirror close to you and try to avoid the main site:
+
* Publish the files via HTTP.  Look at your http server documentation on how to do that.  You could set up a virtual host to serve these files: [https://www.digitalocean.com/community/tutorials/how-to-set-up-nginx-server-blocks-virtual-hosts-on-ubuntu-14-04-lts in nginx] [https://www.digitalocean.com/community/tutorials/how-to-set-up-apache-virtual-hosts-on-ubuntu-12-04-lts in apache]
  
{|class="schedule sortable"
+
* Set up an rsync daemon so we can view the status of your mirror. To do so, create a rsyncd.conf file and open it:
! http !! ftp !! rsync !! country !! sync !!mirrorhost !! stats !! contact
 
|-
 
| [http://download.sugarlabs.org/ download.sugarlabs.org]
 
| [ftp://download.sugarlabs.org/ ftp]
 
| rsync://download.sugarlabs.org/
 
| Boston MA
 
| 0
 
| Sunjammer
 
| -
 
| [[User:Bernie|Bernie]]
 
|-
 
| [http://ftp.nluug.nl/pub/os/Linux/distr/Sugar ftp.nluug.nl]
 
| [ftp://ftp.nluug.nl/pub/os/Linux/distr/Sugar ftp]
 
| rsync://ftp.nluug.nl/sugar
 
| The Netherlands
 
| 1x24h
 
| [http://nluug.nl NLUUG]
 
| [http://ftp.nluug.nl/.statistics stats]
 
|  [[User:Marten|Marten]]
 
|-
 
|}
 
  
===Becoming a Mirror===
+
  sudo nano rsyncd.conf
rsync -avz --exclude=soas/snapshots  --exclude=soas/xoimages  rsync://sunjammer.sugarlabs.org/pub ~/d3.sl.o
 
  
==== Monitoring ====
+
Then insert the following config:
  
There is some basic information on the status of our mirrors.
+
  log file = /rsync/log
  
You can find it here: http://martenvijn.nl/sugar_mirror.html
+
  [sugarlabs]
 +
      path = /rsync/download.sugarlabs.org
 +
      comment = PUT SOME INFORMATION HERE - LIKE A MOTD
 +
      read only = true
 +
      list = yes
  
===[http://mirrorbrain.org/ Mirrorbrain]===
+
Save and quit nano. Then start rsyncd so it can serve your files:
Mirrorbrain is used by several major projects, including openSUSE and OpenOffice. It is quite stable, under active development, and well documented.
 
  
====Installation from scratch====
+
  rsync --daemon --config=/etc/rsyncd.conf
Please see http://mirrorbrain.org/docs/installation/debian/ .  Our Sugar Labs installation instructions have become the official mirrorbrain on ubuntu instructions.
 
  
====Installation from packages====
+
* Alert the [[Infrastructure_Team/Contacts|Sugar Labs System Administrators]] that they would like their mirror into rotation, including the following information in the request:
Add the following line to sources.list to get the latest packages for the opensuse build system.
+
** Name and URL of the mirror operator (e.g. organization)
 +
** Name and email address of the administrative contact
 +
** [http://en.wikipedia.org/wiki/ISO_3166-1_alpha-2 ISO 3166-1 alpha-2] country code of the server location
 +
** HTTP base URL of the files on the mirror  (typically http://mirrors.example.org/sugarlabs/)
 +
** rsync base URL of the files on the mirror (typically rsync://mirrors.example.org/sugarlabs/)
  
sudo vim /etc/apt/sources.list
+
Please contact sysadmin AT sugarlabs DOT org if you are interested in hosting a mirror.
  
deb http://download.opensuse.org/repositories/Apache:/MirrorBrain/xUbuntu_9.04/ /
+
=== For Sugar Labs sysadmins ===
  
====Install the mirrorbrain packages====
+
To add a new mirror to the MirrorBrain redirector:
  
  sudo apt-get update
+
* Choose a name for the mirror, usually the host name.
  sudo apt-get install mirrorbrain mirrorbrain-tools mirrorbrain-scanner libapache2-mod-mirrorbrain libapache2-mod-autoindex-mb
+
* Register the mirror with MirrorBrain:
 +
  sudo -u mirrorbrain mb new <mirror name> --operator-name <operator name> \
 +
  --operator-url <operator URL> -a <admin name> -e <admin email> \
 +
  -c <country code> -H <base HTTP URL> -R <base rsync URL> -F <base FTP URL>
 +
* Scan and enable the mirror:
 +
  sudo -u mirrorbrain mb scan -e <mirror name>
 +
* Export the list of mirrors for mirmon (a hourly cronjob does this, but if you don't want to wait...):
 +
mb export --format=mirmon-apache | sudo -u mirrorbrain tee /srv/mirrorbrain/mirmon/mirrorlist-export
 +
* Finally, re-run mirmon to ensure it can check the health of the mirror (this is also done by a cronjob, but our patience is very short):
 +
sudo -u mirrorbrain mirmon -v -get all -c /etc/mirmon.conf

Latest revision as of 20:10, 9 August 2014


Introduction

A content delivery network (CDN) is a system of computers holding copies of data, placed at various points in a network so as to maximize the bandwidth available to clients throughout the network. Each client fetches a copy of the data from a nearby server, instead of all clients accessing the same central server and causing a bottleneck near that server.

Goals

  • Reduce bandwidth at primary download server.
  • Improve quality of service for users.
  • Move content closer to users, thus reducing latency.

Architecture

The Sugar Labs Content Delivery Network uses MirrorBrain as a redirector. The redirector, which lives in a Sugar Labs data center, keeps track of which files are available on which mirror. When a user requests a file, the redirector points the user to the correct mirror and automatically starts the file download.
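
To see which mirror the redirector picks for a given file, one option is to look at the response headers. The following is a minimal sketch, assuming the redirector answers with an HTTP redirect carrying a Location header; the function name redirect_target is made up for this example.

```shell
#!/bin/sh
# Print the value of the first Location header found in HTTP response
# headers read from stdin. Header names are matched case-insensitively.
redirect_target() {
    # Strip the carriage returns that terminate HTTP header lines.
    tr -d '\r' | awk 'tolower($1) == "location:" { print $2; exit }'
}
```

It could be used as `curl -sI <URL of a file on download.sugarlabs.org> | redirect_target`. If nothing is printed, the server answered without a redirect.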

Mirrors

The current list of mirrors is available at http://mirrors.sugarlabs.org/

Considerations

Bandwidth

To run a mirror you need a lot of bandwidth! To get an idea of how much, look at the total bandwidth used by all the mirrors (http://stats.sugarlabs.org/download.sugarlabs.org/).

If you have trouble with bandwidth, you could look at CloudFlare (http://www.cloudflare.com).

HDD Space

Hosting a mirror takes a lot of space. If you are short on space, you can choose to mirror only some parts. For example, to exclude every directory except the activities (~13 GB):

 rsync -avzh rsync://download.sugarlabs.org/pub --exclude 'dextrose' --exclude 'hexoquinasa' --exclude 'images' --exclude 'sources' --exclude 'docs' --exclude 'packages' --exclude 'soas' /rsync/download.sugarlabs.org
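
If the list of excluded directories changes often, it may be easier to keep it in one variable and generate the command from it. This is only a sketch: the names EXCLUDED_DIRS and partial_sync_command are invented here, and the function prints the command instead of running it.

```shell
#!/bin/sh
# Directories to leave out of the partial mirror (names from the example above).
EXCLUDED_DIRS="dextrose hexoquinasa images sources docs packages soas"

# Print the rsync command line for a partial mirror into destination $1.
# Nothing is executed; the caller decides what to do with the output.
partial_sync_command() {
    dest=$1
    cmd="rsync -avzh"
    for dir in $EXCLUDED_DIRS; do
        cmd="$cmd --exclude '$dir'"
    done
    printf '%s\n' "$cmd rsync://download.sugarlabs.org/pub $dest"
}
```

Calling `partial_sync_command /rsync/download.sugarlabs.org` prints the command for review. Adding rsync's `--dry-run` option to the printed command before running it shows what would be transferred without copying anything.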

Setting up a new mirror

For mirror administrators

All you need is a web server with enough bandwidth to serve the files. To set up a new mirror, the site administrator needs to:

  • First, create a directory to store the data:
 mkdir /rsync
 mkdir /rsync/download.sugarlabs.org
  • Then use rsync to download the data (warning: this takes a long time):
 rsync -avzh rsync://download.sugarlabs.org/pub /rsync/download.sugarlabs.org
  • Save the rsync command as a shell script and make it executable:
 echo "rsync -avzh rsync://download.sugarlabs.org/pub /rsync/download.sugarlabs.org" > /rsync/download.sugarlabs.org/sync.sh
 chmod 774 /rsync/download.sugarlabs.org/sync.sh
  • Then make the sync run automatically with a cron job. For example, to sync every 2 hours (note that crontab <file> replaces the current user's existing crontab, so merge in any entries you already have first):
 echo "0 */2 * * * /rsync/download.sugarlabs.org/sync.sh" > asloSyncCronJob.txt
 crontab asloSyncCronJob.txt

If you don't want it to sync every 2 hours, have a look at a cron tutorial (https://www.digitalocean.com/community/tutorials/how-to-use-cron-to-automate-tasks-on-a-vps) to change that value.
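
Because the initial download takes a long time, a cron run may start while the previous one is still going. One way to guard against that is a lock around the rsync call in sync.sh. The sketch below is an assumption about how that could look, not part of the official instructions; LOCK_DIR, SYNC_CMD and run_sync are hypothetical names.

```shell
#!/bin/sh
# Hypothetical settings: where the lock lives and which command does the sync.
LOCK_DIR="${LOCK_DIR:-/tmp/sugarlabs-mirror-sync.lock}"
SYNC_CMD="${SYNC_CMD:-rsync -avzh rsync://download.sugarlabs.org/pub /rsync/download.sugarlabs.org}"

# Run SYNC_CMD under a lock. mkdir is atomic, so only one caller can create
# LOCK_DIR. Returns 75 if another run holds the lock, otherwise the exit
# status of SYNC_CMD.
run_sync() {
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "previous sync still running, skipping" >&2
        return 75
    fi
    # SYNC_CMD is left unquoted on purpose so it splits into command + arguments.
    $SYNC_CMD
    status=$?
    rmdir "$LOCK_DIR"
    return "$status"
}
```

To use it as sync.sh, add a final line that calls `run_sync`. If the script is killed mid-run the lock directory stays behind and has to be removed by hand; on Linux, flock(1) is a common alternative that avoids this.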

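  • Publish the files via HTTP. See your HTTP server's documentation for how to do that. You could set up a virtual host to serve these files, either in nginx (https://www.digitalocean.com/community/tutorials/how-to-set-up-nginx-server-blocks-virtual-hosts-on-ubuntu-14-04-lts) or in Apache (https://www.digitalocean.com/community/tutorials/how-to-set-up-apache-virtual-hosts-on-ubuntu-12-04-lts).
  • Set up an rsync daemon so we can view the status of your mirror. To do so, create the file /etc/rsyncd.conf and open it:
 sudo nano /etc/rsyncd.conf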

Then insert the following config:

 log file = /rsync/log
 [sugarlabs]
     path = /rsync/download.sugarlabs.org
     comment = PUT SOME INFORMATION HERE - LIKE A MOTD
     read only = true
     list = yes

Save and quit nano. Then start rsyncd so it can serve your files:

 rsync --daemon --config=/etc/rsyncd.conf
  • Ask the Sugar Labs System Administrators to add your mirror to the rotation, including the following information in the request:
    • Name and URL of the mirror operator (e.g. organization)
    • Name and email address of the administrative contact
    • ISO 3166-1 alpha-2 country code of the server location
    • HTTP base URL of the files on the mirror (typically http://mirrors.example.org/sugarlabs/)
    • rsync base URL of the files on the mirror (typically rsync://mirrors.example.org/sugarlabs/)

Please contact sysadmin AT sugarlabs DOT org if you are interested in hosting a mirror.
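
Before sending the request, it may be worth confirming that the rsync daemon configured above really exposes the sugarlabs module. Asking an rsync daemon for its module list prints one module per line, name first. The helper below is a sketch with an invented name, has_module, that checks such a listing read from stdin.

```shell
#!/bin/sh
# Succeed if the rsync module listing on stdin contains a module named $1.
# Each listing line starts with the module name, followed by its comment.
has_module() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}
```

On the mirror itself this could be run as `rsync rsync://localhost/ | has_module sugarlabs && echo "module is visible"`.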

For Sugar Labs sysadmins

To add a new mirror to the MirrorBrain redirector:

  • Choose a name for the mirror, usually the host name.
  • Register the mirror with MirrorBrain:
sudo -u mirrorbrain mb new <mirror name> --operator-name <operator name> \
 --operator-url <operator URL> -a <admin name> -e <admin email> \
 -c <country code> -H <base HTTP URL> -R <base rsync URL> -F <base FTP URL>
  • Scan and enable the mirror:
sudo -u mirrorbrain mb scan -e <mirror name>
  • Export the list of mirrors for mirmon (an hourly cron job does this, but if you don't want to wait...):
mb export --format=mirmon-apache | sudo -u mirrorbrain tee /srv/mirrorbrain/mirmon/mirrorlist-export
  • Finally, re-run mirmon to ensure it can check the health of the mirror (this is also done by a cronjob, but our patience is very short):
sudo -u mirrorbrain mirmon -v -get all -c /etc/mirmon.conf
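
The last three steps are always run in the same order, so they could be wrapped in a small helper. This is a sketch only: run and enable_mirror are made-up names, the commands are copied from the steps above, and registering the mirror with mb new is deliberately left out because its arguments differ per mirror.

```shell
#!/bin/sh
# Path of the mirmon export file, as used in the steps above.
EXPORT_FILE=/srv/mirrorbrain/mirmon/mirrorlist-export

# Execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Scan and enable mirror $1, refresh the mirmon export, then re-run mirmon.
# Stops at the first step that fails.
enable_mirror() {
    name=$1
    run sudo -u mirrorbrain mb scan -e "$name" &&
    run sh -c "mb export --format=mirmon-apache | sudo -u mirrorbrain tee $EXPORT_FILE" &&
    run sudo -u mirrorbrain mirmon -v -get all -c /etc/mirmon.conf
}
```

Running `DRY_RUN=1 enable_mirror <mirror name>` prints the three commands for review without executing them.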