TI1 - 2022/23 - Critical - Moodle SRE
Size | Medium |
---|---|
Budget Epic Name | CTP Maintenance Budget |
Feature Lead | Alistair Spark |
Team | David Kwaw, Nikola Bozhkov |
This feature covers the need for Moodle to be proactively monitored and for performance issues to be dealt with before they cause any CIs.
This ties in with the idea of an LA Data Availability team and, more generally, an Application SRE function (Site Reliability Engineering: https://cloud.google.com/blog/products/devops-sre/how-sre-teams-are-organized-and-how-to-get-started).
Key areas of focus for TI1:
- Active monitoring during the start of the academic year and the end-of-term assessment period
- Engage auto-scaling and monitor that its scaling speed is adequate for peaks
- Refine the coursemodinfo caching infrastructure to reduce cost while fulfilling the bandwidth requirement
- Drive load testing of Moodle 4.1 and pipeline-based load testing
- Continue pushing performance-related core trackers
Some of the key activities that still need to be progressed:
- Post CI strands of work (Catalyst development but exchange and test)
- Regrading issue - https://wrms.catalyst.net.nz/wr.php?request_id=378838
- CloudFront / S3 signed URLs ( ) - if not completed in TI2
- Active monitoring of the Redis / frontends / etc during peaks of load
- Drill through any blips in response times and document causes
- Push for resolution of any identified flaws
- Explore options for automating load testing (will need to time bound the effort on this)
- Improve the CI comms channel: arrange ISD News editing by SO, and reach out to Mike Haward about the Status page to get it reset to be generic - https://www.ucl.ac.uk/isd/moodle-under-maintenance
- Create a Moodle maintenance/outage page that can be used for traffic redirection in the event of a Moodle outage. This page needs to be editable by the Moodle team. Consider setting up a Moodle_Status Twitter feed as a short-term measure if we are unable to obtain an editable page.
Moodle uptime is critical, and this feature will always come before anything else. We currently rely on Catalyst to develop fixes for us. This will change over time, but we are well resourced, so it should not be seen as a barrier.
This information is provided by Digital Education (https://www.ucl.ac.uk/isd/digital-education-team-information) and licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.