Line 1: |
Line 1: |
| <noinclude>{{TOCright}}</noinclude> | | <noinclude>{{TOCright}}</noinclude> |
| + | Discovery One is the name of the cluster of machines hosting activities.sugarlabs.org. Activities.sugarlabs.org is a system for encouraging developers of different skill levels to cooperatively develop, edit, and distribute learning activities for the Sugar Platform. |
| | | |
− | Discovery One is the name of the cluster of machines hosting activities.sugarlabs.org. In the fall of 2008 and we discovered our own TMA-1. Activities.sl.o is a very effective system for encouraging developers of different skill levels to cooperatively develop, edit, and distribute learning activities for the Sugar Platform.
| + | This section of the wiki is about setting up and maintaining the infrastructure. For information about using and improving activities.sl.o, please see [[Activity_Library| Activity Library]]. |
| | | |
− | This section of the wiki is about setting up and maintaining the infrastructure, Discovery One, necessary to keep activities.sugarlabs.org running. For information about using and and improving activities.sl.o please see [[Activity_Library| Activity Library]].
| + | ===Design=== |
| + | The prime design characteristics of a.sl.o are scalability and availability. As the a.sl.o userbase grows, each component can be scaled horizontally, across multiple physical machines. |
| + | |
| + | As of November 2009, activities.sl.o is serving 500,000 activities per month using two machines located at Gnaps. The proxy (green) is on treehouse and the rest (red) is on sunjammer. |
| | | |
− | ===Design===
| |
| [[Image:Aslo1.png]] | | [[Image:Aslo1.png]] |
| | | |
− | ===Machines=== | + | ===Components=== |
− | * [[Machine/Discovery_One/Proxy | Proxy ]] | + | * [[Machine/Discovery_One/Proxy | Proxy ]] The Proxy is the public-facing web portion of a.sl.o. The Proxy both serves static content and acts as a firewall in front of the rest of the system. |
− | * [[Machine/Discovery_One/Web | Web ]] | + | * [[Machine/Discovery_One/Web | Web ]] The Web nodes serve dynamically generated content and pass requests for activity downloads to the Content Delivery Network. |
− | * [[Machine/Discovery_One/Database | Database ]] | + | * [[Machine/Discovery_One/Database | Database ]] The Database maintains the data for the web nodes. |
| + | * [[Machine/Sunjammer | Shared File System ]] The Shared File System maintains a consistent file structure for the web nodes and the Content Delivery Network. |
| + | * [[Infrastructure_Team/Content_Delivery_Network | Content Delivery Network ]] The Content Delivery Network distributes and serves files from mirrors outside of the primary datacenter. |
| + | |
| + | ===Scaling Stage 1=== |
| + | Our first bottleneck in scaling a.sl.o is the CPU load of the web nodes. Our first step will be to split the web nodes across multiple physical machines. |
| + | |
| + | ====Considerations==== |
| + | * Cloning web nodes. Each web node is an exact clone of the others; the only difference is the assigned IP address. Tested. |
| + | * Load balancing. Add Perlbal load balancing and Heartbeat HA monitoring to the proxy. Tested. |
| + | * Common database. Point web nodes to a common database. Tested. |
| + | * Common file system. Point web nodes and CDN to a common file system. In progress. |
| + | |
| + | ====Observations==== |
| + | As of Nov 2009 |
| + | * Proxy nodes |
| + | ** At peak load, catches ~20-25% of hits before they reach the web nodes |
| + | ** Limiting factors: inodes and memory |
| + | ** VM has 2G memory and is starting to swap. |
| + | |
| + | * Web nodes |
| + | ** A dual-core 2.4 Opteron (sunjammer) can handle our peak load at ~60% CPU |
| + | ** A quad-core 2.2 AMD (treehouse) can handle ~22 transactions per second. |
| + | ** Estimate less than 4GB of memory required per web node. |
| + | |
| + | * Memcached nodes (part of web nodes) |
| + | ** ~85% hit rate |
| + | ** 1.25G of assigned memory. |
| + | |
| + | * Database nodes |
| + | ** CPU load is about 25% of a web node's, so one database node should serve 4-5 web nodes. |
| + | |
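The figures above can be turned into a rough sizing estimate. The following is a minimal sketch, not part of the a.sl.o tooling: the function names are hypothetical, a 30-day month is assumed, and the ratio of four web nodes per database node is taken from the conservative end of the 4-5 range noted above.

```python
import math

# Assumption: a 30-day month, used only to convert the monthly figure.
SECONDS_PER_MONTH = 30 * 24 * 3600


def average_rate_per_second(per_month: int) -> float:
    """Average request rate implied by a monthly total."""
    return per_month / SECONDS_PER_MONTH


def database_nodes_needed(web_nodes: int, web_nodes_per_db: int = 4) -> int:
    """Database nodes required if one database node serves a fixed
    number of web nodes (4 is the conservative end of the 4-5 estimate)."""
    return math.ceil(web_nodes / web_nodes_per_db)


if __name__ == "__main__":
    # 500,000 activities per month, as reported for November 2009.
    print(f"average downloads/s: {average_rate_per_second(500_000):.3f}")
    for n in (1, 4, 5, 9):
        print(f"{n} web node(s) -> {database_nodes_needed(n)} database node(s)")
```

The monthly total only yields an average rate; peak load is what the CPU observations above measure, so the average should be read as a lower bound rather than a sizing target.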
| + | ====Compromises==== |
| + | This design sacrifices availability for simplicity. We have several possible single points of failure: the proxy, the common file system, and the database. |
| + | |
| + | [[Image:Aslo2.png]] |
| + | |
| + | ===Scaling Stage 2+=== |
| + | Sorry Bernie this bit is likely to give you a heart attack. |
| + | |
| + | As we split the web nodes across multiple physical machines, we will be able to add redundant components for high availability. |
| + | |
| + | ====Considerations==== |
| + | * Proxy - Loadbalancers. 2+ proxies on separate physical machines which share an IP. If a machine fails the other(s) pick up the load. |
| + | * Web nodes - Individual nodes will be monitored by the Heartbeat HA monitor living on the proxies. If a web node fails, it is dropped from the load-balancing rotation. |
| + | * Memcached - Memcached is designed to be distributed. If a node fails it is dropped. |
| + | * Database - Two machines in a Master-Master configuration. Under normal operation they operate as master-slave. If the master fails, the other takes over as master. |
| + | * File system - TBD |
| + | |
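The idea of dropping a failed web node from the load-balancing rotation can be illustrated with a small round-robin sketch. This is an illustration of the concept only; it does not reproduce how Perlbal or Heartbeat work internally. The `Rotation` class, the node names, and the health-check callback are all hypothetical.

```python
from typing import Callable, List


class Rotation:
    """Round-robin over web nodes, skipping any node that fails its health check."""

    def __init__(self, nodes: List[str], is_healthy: Callable[[str], bool]):
        self._nodes = list(nodes)
        self._is_healthy = is_healthy
        self._next = 0  # index of the node to try first on the next pick

    def pick(self) -> str:
        """Return the next healthy node, trying each node at most once."""
        for _ in range(len(self._nodes)):
            node = self._nodes[self._next]
            self._next = (self._next + 1) % len(self._nodes)
            if self._is_healthy(node):
                return node
        raise RuntimeError("no healthy web nodes available")


if __name__ == "__main__":
    down = {"web2"}  # hypothetical: web2 has failed its health check
    rotation = Rotation(["web1", "web2", "web3"], lambda n: n not in down)
    print([rotation.pick() for _ in range(4)])
```

Because the health check is consulted on every pick, a node that recovers rejoins the rotation without any extra bookkeeping.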
| + | [[Image:Aslo3.png]] |
| | | |
| == Location == | | == Location == |
− | Hosted by [[Machine/treehouse|treehouse]] | + | * Hosted by [[Machine/sunjammer|sunjammer]] |
| + | * Hosted by [[Machine/treehouse|treehouse]] |
| + | |
| | | |
| == Admins == | | == Admins == |
Line 23: |
Line 77: |
| This machine is a clone from the VM-Template base904.img on treehouse and runs | | This machine is a clone from the VM-Template base904.img on treehouse and runs |
| Ubuntu server 9.04. | | Ubuntu server 9.04. |
− |
| |
− | ===External Services===
| |
− | * Shared File System
| |
− | Aslo depends on each web node having access to a common file system. This is currently set up as a NFS share on sunjammer
| |
− |
| |
− | * Content Delivery Network
| |
− | Aslo depends on the Sugar Labs content delivery next for distribution of public files.
| |
| | | |
| {{Special:PrefixIndex/{{PAGENAME}}/}} | | {{Special:PrefixIndex/{{PAGENAME}}/}} |
| | | |
| [[Category:Machine]] | | [[Category:Machine]] |