Harvest: Difference between revisions

Tch (talk | contribs)


'''''Observation:''' All the metadata names match the original names of the Journal metadata.''
== How does it work? ==
The project comprises two pieces of software: a harvest server that can be located anywhere in the cloud, and a harvest client that runs on the learner's machine. The harvest server exposes a service, accessible from the Internet, for metadata storage. The harvest client collects metadata from the Journal and sends it to the server.
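
As a rough illustration of the client-to-server flow described above, the sketch below packages Journal metadata into an HTTPS POST request without sending it. The endpoint URL, the <code>x-api-key</code> header name, and the payload layout are hypothetical placeholders; the actual protocol is defined in the harvest-client and harvest-server repositories.

```python
import json
import urllib.request


def build_report_request(server_url, api_key, entries):
    """Package Journal metadata entries into a POST request (not sent here)."""
    # Hypothetical payload layout: a JSON object wrapping the list of entries.
    body = json.dumps({"entries": entries}).encode("utf-8")
    return urllib.request.Request(
        server_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Hypothetical header name used to carry the API key.
            "x-api-key": api_key,
        },
        method="POST",
    )


request = build_report_request(
    "https://harvest.example.org/report",  # placeholder URL
    "secret-key",                          # placeholder API key
    [{"activity": "org.laptop.WebActivity", "mime_type": "text/html"}],
)
# urllib.request.urlopen(request) would actually deliver the report.
```

Building the request separately from sending it keeps the payload easy to inspect while debugging.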
== When does it collect? ==
* Data is collected when Sugar starts and when Sugar successfully connects to a network.
* Once it has successfully collected data, it won't send another report until the next collecting period, which is either weekly or monthly.
* To avoid service peaks, Harvest applies a random chance to decide whether the collection process runs.
* If the server is unresponsive, it won't retry for a couple of hours.
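
The rules above can be sketched as a single decision function. This is only an illustration: the function name, the 50% chance, the two-hour back-off, and the use of elapsed time to model the collecting period are assumptions, not values taken from harvest-client.

```python
import random

RETRY_DELAY = 2 * 60 * 60    # assumed back-off after a failed attempt, in seconds
WEEK = 7 * 24 * 60 * 60      # weekly collecting period, in seconds


def should_collect(now, last_success, last_failure, period=WEEK,
                   chance=0.5, rng=random.random):
    """Return True if a collection attempt should run right now."""
    # A report was already sent during the current collecting period.
    if last_success is not None and now - last_success < period:
        return False
    # The server was unresponsive recently, so wait before retrying.
    if last_failure is not None and now - last_failure < RETRY_DELAY:
        return False
    # The random chance spreads clients out to avoid service peaks.
    return rng() < chance
```

A caller would evaluate <code>should_collect(time.time(), last_success, last_failure)</code> when Sugar starts and again when a network connection is established.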
== What are the advantages? ==
* Learners' data is never copied or transferred out of their machines.
* Collection happens continuously over time, so the sampling is very fine-grained.
* It is very lightweight. It can be deployed on a central server.
* It does not require OS customization. The client is based on Sugar's web service framework, and it can be installed on any existing Sugar 0.100+ distribution.
== What is implemented so far? ==
Pretty much everything concerning metadata collection.
=== Harvest server ===
* Back-end service for storage.
* SSL data encryption.
* API Key authorization.
* Control scripts based on systemd.
* DB migrations and continuous integration support.
* RPM packaging.
=== Harvest client ===
* Journal metadata collection.
* Web service extension.
* Extension controls from the web service control panel.
* Random selection.
* Exclusive log for debugging.
* Hashed serial numbers.
* Restricted retry policy.
* RPM packaging.
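
To illustrate the "hashed serial numbers" item, here is a minimal sketch of replacing a machine serial number with a digest before reporting it. The choice of SHA-256 and the sample serial are assumptions; the algorithm that harvest-client actually uses should be checked in its source code.

```python
import hashlib


def hash_serial(serial_number):
    """Return a hex digest that stands in for the raw serial number."""
    # SHA-256 is an assumed choice for this sketch.
    return hashlib.sha256(serial_number.encode("utf-8")).hexdigest()


digest = hash_serial("SHC12345678")  # hypothetical serial number
```

The same serial always yields the same digest, so the server can group reports by machine without storing the serial itself.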
== Code ==
* https://github.com/tchx84/harvest-client
* https://github.com/tchx84/harvest-server
== External dependencies ==


=== Custom Groups ===
  ]}


=== Network traffic measurements ===


Harvest-monitor is a lightweight daemon that uses custom iptables counters to measure network traffic. This is an optional feature. If it is available, harvest-client collects these measurements and reports them to the server.


The source code can be found at: https://github.com/tchx84/harvest-monitor
The RPM package can be downloaded from: http://www.sugarlabs.org/~tch/repos/f18/harvest-monitor-0.2.0-2.noarch.rpm
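
As a hedged sketch of how iptables counters can be read, the function below sums the packet and byte columns from the text printed by <code>iptables -L &lt;chain&gt; -v -n -x</code>. The chain name and the sample listing are made up for illustration, and harvest-monitor may gather its counters differently.

```python
def parse_counters(listing):
    """Sum packet and byte counters from `iptables -L <chain> -v -n -x` output."""
    packets = total_bytes = 0
    for line in listing.splitlines():
        fields = line.split()
        # Rule rows start with two integers (pkts, bytes); header lines do not.
        if len(fields) >= 2 and fields[0].isdigit() and fields[1].isdigit():
            packets += int(fields[0])
            total_bytes += int(fields[1])
    return packets, total_bytes


# Hypothetical listing for a custom chain with a single counting rule.
SAMPLE = """\
Chain harvest-in (1 references)
    pkts      bytes target     prot opt in     out     source               destination
     120    45000            all  --  *      *       0.0.0.0/0            0.0.0.0/0
"""

totals = parse_counters(SAMPLE)
```

The <code>-x</code> flag makes iptables print exact counter values instead of rounded ones with K/M suffixes, which keeps the parsing to plain integers.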


=== Spent time tracking ===

This is based on downstream sugar-toolkit and sugar-toolkit-gtk3 patches by Manuel Quiñones and Martin Abente.


* sugar-toolkit downstream patches: https://github.com/manuq/sugar-toolkit/tree/spent-time
* sugar-toolkit-gtk3 downstream patches: https://github.com/manuq/sugar-toolkit-gtk3/tree/spent-time-3


The precision of the time tracking can be improved by taking into account power management events and other Sugar UI events. To do so, harvest-tracker must be used.


* harvest-tracker source code: https://github.com/tchx84/harvest-tracker
* harvest-tracker package: http://www.sugarlabs.org/~tch/repos/f18/harvest-tracker-0.3.0-1.noarch.rpm


== RPMs ==