Summer of Code/2018/attentive migration of wiki activity pages to git
Google Summer of Code 2018 Project Proposal
Note: This is a work in progress. The proposal will be further modified and improved, upon discussions and feedback from the Sugar Labs community
This wiki-page does not yet reflect a few updates made to the proposal. Please read the .pdf
The Final pdf is available here
Project Name: Attentive Migration of wiki activity pages to git
Author : Rudra Sadhu
About Me
- Name: Rudra Sadhu
- Email: rdrsadhu@gmail.com
- Sugar Labs wiki username: rdrsadhu (user-page at User: rdrsadhu)
- IRC nickname on irc.freenode.net: rdrsadhu
- Languages: English, Hindi, Bengali
- Where am I located? and what hours(UTC) do I tend to work?
- I'm located in Kolkata, India (UTC +05:30)
- Working Hours: Flexible (Whatever works for both me and my mentor)
- Past experiences with Open-Source projects:
- actively contributing to Open-Source since August, 2017
- organisation: Oppia
- Firstly, why Oppia (and now Sugar Labs)?
- I've always believed that 'there is no process as powerful as education to change the world'
- The missions of the Oppia Foundation and Sugar Labs are similar (if not exactly the same). Read here
- My first pull request: Fix #3258: Add share-to-classroom Button ~Happiness is when pull requests get merged :)
- What else did I chip in?
- While playing around, I found an issue in the application,
- Recorded it : https://github.com/oppia/oppia/issues/3750
- Submitted fix: https://github.com/oppia/oppia/pull/3754 ~Felt Awesome!
- Created outline for Mathematics Lessons(Geometry)
- 6 other Merged pull-requests. Complete list here
- These are either code refactoring (which shows my ability to understand complex codebases and my knowledge of an intermediate git workflow: dealing with branches, merge conflicts, etc.)
- or trivial documentation fixes (which show my attention to detail)
- What did I gain out of contributing to Open Source projects and why I'm again up for it?
- I learned a lot! and more importantly, had fun learning.
- A sense of pride and satisfaction, to be able to contribute for a good cause, which impacts people around the globe.
- Mentions on the Credits Page. Even if I die tomorrow, this will stay on.
- I've got nothing to lose :D
- I've tasted failures as well, and I'm not uncomfortable to talk about them. Well, because they often teach us a lot more than successes.
- this pull-request of mine did not get merged.
About Project
- Project Name: Attentive Migration of wiki activity pages to git
- Description:
- At present, the Activities#Sugar_Activities section in Sugar Labs wiki lists 345 pages.
- Each of these pages serves as a kind of homepage (containing all relevant information) for a different Sugar Activity.
- Example: Activities/MakeyMakey is the homepage for the MakeyMakey activity, whereas its source code is hosted at https://github.com/sugarlabs/makeymakey
- Sugar Labs is moving towards a GitHub style of development (see discussion)
- For any change to an activity, it is cumbersome for the developer to update both the GitHub repository and its corresponding wiki-page documentation
- Thus, it would be beneficial (and more maintainable) in the long run if the content of these 345 wiki pages lived only in their corresponding GitHub repositories.
- Objective:
- Write a program(or script) to migrate all of the above-mentioned wiki-content to their appropriate GitHub repositories in Markdown(or reStructuredText) format, with detailed attention to quality.
- Promised Outcome:
- 100% of the mentioned wiki-pages(from wikitext markup) moved to their GitHub repositories(in Markdown/reStructuredText format).
- Something like https://github.com/rdrsadhu/portfolio-activity, from Activities/Portfolio
- this is a fork of the original repository https://github.com/sugarlabs/portfolio-activity
- I'm aware that the converted Markdown syntax of README.md file is not optimal.
- I intend to discuss a few things with the community, before starting to optimize the conversion process.
- Also, all the current wiki-pages will be deprecated(deleted/moved as desired by the community)
- Technologies used:
- Programming Languages: Python 3.6
- Tools: git, pandoc
- APIs: MediaWiki API
- High-Level overview of the Migration Process:
- For each page to migrate
- Step 0: Determine which GitHub repository it will be added to
- Step 1: From wiki.sugarlabs.org/activities/<activity_page>, get the source file (wikitext) which is to be converted (to Markdown/reStructuredText)
- Step 2: Get associated extra files(images etc) used in the wiki-page
- Step 3: Convert wikitext content to Markdown/reStructuredText format
- Step 4: Add converted files to GitHub repository mentioned in Step 0
- Step 5: Update all other pages in the wiki which link to this to-be-deprecated page, so that they point to the new GitHub link
- Step 6: Deprecate wiki-page and associated extra files from wiki.sugarlabs.org
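Steps 0 to 2 of the overview above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the beta script itself: the `API_URL` location, the function names, and the rule of grouping pages by the path segment after `Activities/` are all hypothetical. The query parameters (`action=query`, `prop=revisions`, `rvprop=content`) are standard MediaWiki API parameters.

```python
import re
from collections import defaultdict
from urllib.parse import urlencode

# Assumption: the wiki exposes its API at this path; the real location may differ.
API_URL = "https://wiki.sugarlabs.org/api.php"

# Matches [[File:name|...]] and [[Image:name]] links. Files listed inside
# <gallery> tags are NOT covered by this simple pattern.
FILE_LINK = re.compile(r"\[\[(?:File|Image):([^|\]]+)", re.IGNORECASE)


def group_by_activity(page_titles):
    """Step 0 (hypothetical rule): group wiki pages by activity name,
    so that sub-pages end up with their parent activity."""
    groups = defaultdict(list)
    for title in page_titles:
        parts = title.split("/")
        if len(parts) > 1 and parts[0] == "Activities":
            key = parts[1]
        else:
            key = parts[0]
        groups[key].append(title)
    return dict(groups)


def wikitext_query_url(page_title):
    """Step 1: build the API URL that returns the raw wikitext of a page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "titles": page_title,
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def referenced_files(wikitext):
    """Step 2: list the files (images etc.) a page refers to, without repeats."""
    names = []
    for match in FILE_LINK.finditer(wikitext):
        name = match.group(1).strip()
        if name not in names:
            names.append(name)
    return names
```

For example, `group_by_activity` places `Activities/Turtle Art/Tutorials/<wiki-page-name>` pages under the single key `Turtle Art`. The final mapping from a key to a repository would still need manual review.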
- Detailed Technical description of Migration Process
- Step 1: using MediaWiki API, query wiki.sugarlabs.org to get list of all pages to convert
- Step 2: For each page, research and find out(or discuss as needed with mentor/community) where to place them in GitHub
- there isn't a one-to-one correspondence between the pages and the git repositories
- for example:
- in the section Activities#Sugar_Activities, there are at least 20 pages as Activities/Turtle Art/Tutorials/<wiki-page-name> , which I believe should be in the same GitHub repository
- Some git repositories already have a README.md file, which contains information not available in the corresponding wiki-page (See activity-git-repo : activity-wiki-page)
- Some wiki pages contain user documentation which should ideally be moved to the help-activity
- Some wikipages have already been moved to GitHub and thus repeating the work would result in duplication
- The variety is a lot more, which I plan to extensively figure out and discuss as needed, before the coding-period starts.
- Step 3: For each GitHub repository to push converted files:
- 3.1: Fork the repository (create a copy of the original repository into my-GitHub-account)
- 3.2: Pull the forked repository (get a local copy of it, in my development machine)
- 3.3: Checkout a new branch for changes
- 3.4: For each file to be pushed into the repository:
- Using MediaWiki API, query a specific wiki-page, to get its raw source text(in wikitext format), and all associated files(images etc) used
- create a safe backup of the original raw wikitext
- generate one temporary duplicate file(containing raw wikitext), for conversion
- using pandoc, convert
- while the conversion is not perfect (it is difficult, or sometimes not possible, to reproduce the exact functionality of a wiki-page in a Markdown file; for example, image galleries)
- modify the temporary wikitext file, with attention to every small detail
- and convert again
- 3.5: Test all changes
- 3.6: Commit all newly generated files (also the safe wikitext backup; in-case the community decides to keep it for future reference)
- 3.7: git push to Forked Repository
- 3.8: Send a Pull Request to original Repository
- 3.9: After Pull Request gets merged;
- Update all other pages in the wiki, which links to these to-be-deprecated pages with the new GitHub link
- using MediaWiki API, delete/move contents of original wiki-page
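The conversion and git parts of Step 3 (3.3, 3.4, 3.6 and 3.7) could look roughly like the sketch below. It is a minimal illustration under assumptions, not the actual script: the function names, file names and branch name are hypothetical, and the `gfm` output format assumes pandoc 2.0 or newer (`rst` could be passed instead for reStructuredText).

```python
import subprocess


def pandoc_command(source_path, output_path, target="gfm"):
    """Build the pandoc call that converts a MediaWiki file to `target`."""
    return [
        "pandoc",
        "--from=mediawiki",
        "--to=" + target,
        "--output=" + output_path,
        source_path,
    ]


def git_commands(branch, paths, message):
    """Build the git calls for steps 3.3, 3.6 and 3.7, in order."""
    return [
        ["git", "checkout", "-b", branch],      # 3.3: new branch for the changes
        ["git", "add", "--"] + list(paths),     # 3.6: stage the generated files
        ["git", "commit", "-m", message],       # 3.6: commit them
        ["git", "push", "origin", branch],      # 3.7: push to the forked repository
    ]


def run_all(commands, cwd):
    """Run each command inside the local clone, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, cwd=cwd, check=True)
```

A run for one page might call `run_all([pandoc_command("Portfolio.wiki", "README.md")], cwd=clone_dir)`, followed by `run_all(git_commands("wiki-migration", ["README.md", "Portfolio.wiki"], "Migrate wiki page"), cwd=clone_dir)`, where `clone_dir` is the local copy of the fork. Building the commands separately from running them keeps the modify-and-convert-again loop of step 3.4 easy to repeat.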
- Important:
- Adding documentation to the activity repositories may sometimes unnecessarily enlarge the activity bundle size;
- in such a scenario, a custom setup.py will be written which excludes the documentation folder from the bundle.
- Project Deliverables
- 3 clear, quantitatively measurable goals (cumulative page counts) to determine project success/failure status, which will help in evaluation
- Month 01: Complete high-quality end-to-end migration of 100 pages
- Month 02: Complete high-quality end-to-end migration of 300 pages
- Month 03: Complete high-quality end-to-end migration of 345 pages
- Project Timeline:
- I will work 45-50 hours a week, and have decided the goals accordingly.
| Week | Date (2018) | Period | Task |
| --- | --- | --- | --- |
| | Today - April 22 | | Research the variety of content in the different wiki-pages to be converted; discuss with the community anything important which comes up; learn more about the syntax of wikitext and the different functionalities of the MediaWiki API |
| | April 23 - May 13 | Community Bonding | Set up development environment (install necessary software, packages, libraries); set up a dedicated blog for GSoC 2018; dive deep into the codebase of help-activity and other important areas of the project; learn how to write a custom setup.py to keep the size of the activity bundle in check; blog post about my plans for Coding Month 01 |
| 01 | May 14 - May 20 | Coding Month 01 (May 14 - June 10) | Write script; test script extensively |
| 02 | May 21 - May 27 | Coding Month 01 | Complete end-to-end migration of 30 pages |
| 03 | May 28 - June 03 | Coding Month 01 | Complete end-to-end migration of 65 pages |
| 04 | June 04 - June 10 | Coding Month 01 | Complete end-to-end migration of 100 pages; blog post about my plans for Coding Month 02 |
| 05 | June 11 - June 17 | Coding Month 02 (June 11 - July 08), First Midterm Evaluations | Quality check of all pages migrated during Coding Month 01; submit evaluations; complete end-to-end migration of 150 pages |
| 06 | June 18 - June 24 | Coding Month 02 | Complete end-to-end migration of 200 pages |
| 07 | June 25 - July 01 | Coding Month 02 | Complete end-to-end migration of 250 pages |
| 08 | July 02 - July 08 | Coding Month 02 | Complete end-to-end migration of 300 pages; blog post about my plans for Coding Month 03 |
| 09 | July 09 - July 15 | Coding Month 03 (July 09 - August 05), Second Midterm Evaluations | Quality check of all pages migrated during Coding Month 02; submit evaluations; complete end-to-end migration of 345 pages |
| 10 | July 16 - July 22 | Coding Month 03 | Final quality check of all migrated pages; detailed documentation about everything related to this project |
| 11 | July 23 - July 29 | Coding Month 03 | Clean up development environment; buffer time to accommodate any due/extra/unknown work |
| 12 | July 30 - August 05 | Coding Month 03 | Buffer time to accommodate any due/extra/unknown work |
| | August 06 - August 14 | Final Evaluations | Submit evaluations; be humble, and thank everyone for the opportunity :) |
- Convince us, in 5-15 sentences, that you will be able to successfully complete your project in the timeline you have described. Link to prior work or other resources as relevant
- Find the beta version of proposed script here
- I implemented some basic features of the script, which
- given any Sugar Activity pagename and forked_repo_url, migrates the page(without any manual supervision) and commits all generated files into the local copy of forked-repository
- The script is commented to explain what each part is doing
- The script follows, roughly the same workflow stated in Detailed Technical description of Migration Process above
- I tried the script on 3 different pages of Activities.
- Here are the results:
- Note: I'm aware of the sub-optimal output from pandoc, and that is what I plan to refine during the summer. It will require a lot of careful observation (and maybe some manual cleanup as well, depending on the diverse page content) and thus was not possible to implement in this limited timeframe.
- I do not have any other commitments during the summer, and will be able to devote my complete time and energy to implementing this project at the best possible quality.
- Moreover, I'm excited to learn and willing to push myself.
Project and the Community
If the project is successfully completed, what will be its impact on the Sugar Labs community?
- Answer 1: Rudra Sadhu:
- All information about any one Sugar Activity will live inside its GitHub repository, which will help
- Activity Maintainers
- New members of the community
- Users
- and ultimately everyone :)
- GitHub is the world's leading software development platform (see the Stack Overflow 2018 developer survey)
- thus hosting a complete repository with all the documentation will provide greater exposure and help to attract more developers and traffic to the community
- Answer 2: James Cameron (quozl@laptop.org):
- As stated in this GitHub conversation
- "Migration is needed because Wiki is not being used now that GitHub is being used. Maintaining the Wiki in addition to GitHub is not sustainable. So we want documentation to move from Wiki to GitHub.
- So that when an activity is updated in GitHub, the documentation can be updated in the same place at the same time."
- As stated in this thread
- "Originally documentation was separate because we had non-coding developers and tool chains that varied by type of developer. Now we use GitHub the tool chains are combined.
- With the project as described, documentation will be concentrated in the source code repository for an activity, reducing ongoing maintenance.
- We have less active Wiki contributors than we ever did, and in the current threat environment a Wiki requires significant monitoring and administration; we recently lost some system administrators and gained new ones; using GitHub allows us to outsource system administration."
- Answer 3: SL Community Member-2:
- Please share your views
What will I do if I get stuck on the project and my mentor isn't around?
- I know where to get information, or ask for help (google, stackoverflow.com, etc)
- Read documentation of related tools
- Ask other Sugar Labs members via IRC, developer mailing lists
- Ask help from Sugar Labs GSoC coordinator
How do I propose to keep the community informed of my progress and any problems or questions I might have over the course of the project?
- daily updates on IRC
- weekly blog post
- about what I did the previous week
- what I plan to do the next week
- communication via mailing-lists (as and when needed)
- all the code I write and every pull request/issue I open
- will be thoroughly commented in an easy-to-understand format
- Live Record of Activity migration status for all the 345 pages.
- I intend to create a wiki-page/git-repo which at any time will reflect the current migration status of all the 345 pages. Something like this
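One possible shape for such a live record is sketched below. It is only an illustration: the status names (`pending`, `merged`), the table layout and the function names are hypothetical, while the milestone numbers are the deliverables stated earlier in this proposal.

```python
# Cumulative deliverables from this proposal: (period, pages migrated by then).
MILESTONES = [("Month 01", 100), ("Month 02", 300), ("Month 03", 345)]


def status_table(statuses):
    """Render a {wiki page: status} mapping as a table, sorted by page name."""
    lines = ["| Wiki page | Status |", "| --- | --- |"]
    for page in sorted(statuses):
        lines.append("| {} | {} |".format(page, statuses[page]))
    return "\n".join(lines)


def milestones_reached(statuses):
    """Return the milestones already met, counting only merged pages."""
    done = sum(1 for status in statuses.values() if status == "merged")
    return [name for name, target in MILESTONES if done >= target]
```

The mapping itself could be kept in a small file inside the tracking repository and updated whenever a pull request is opened or merged, so the rendered table always reflects the current state of all 345 pages.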
Miscellaneous
Send us a link to a pull request or merge request you have made on a Sugar or Sugar activity bug
- I went ahead and migrated 4 pages out of the 345 listed. Find the pull request at https://github.com/godiard/help-activity/pull/38
Describe a great learning experience you had as a child
- Learning how to ride a bicycle surely has been one great experience.
- Though my father initially helped to ease out the fear; just some weeks later I was alone all by myself, deciding when to speed and how to stop, all without any manual. :D
- I did fall a few(read more than a few) times too.
- What came out of the process to me, is one very important life lesson: to keep balance, we must look forward and continue moving.
Is there anything else we should have asked you or anything else that we should know that might make us like you or your project more?
- Yes,
- I learned a lot in the last 6 weeks: from wikitext markup to the MediaWiki API, pandoc, Sphinx, reStructuredText and a bunch of other stuff. I had not even heard about most of these technologies before coming across this project. So, thank you for the opportunity :)
- I've always tried to prove my worth by action. The script is completely working and maybe just needs some refinement.
- Proof: 4 of the 345 pages have already been migrated and thus I know the intricate difficulties a lot better than my competitors(why? because well, I actually did the work)
- Personally, the field of education is very close to my heart. I help out local middle school students to learn basic Computer Science concepts using simple flowcharts and other intuitive stuff, in my hometown(a small town with inadequate quality learning options)
- and thus for the common goal, I believe I'll continue to work with the Sugar Labs organization and you all in the coming future.
Feel free to discuss your opinion about everything mentioned above via developer list thread
Thanks, Rudra Sadhu
Drafted with ❤ by another Open Source enthusiast, just like you :)