TI2 - 2022/23 - Critical - Moodle SRE
Size | Medium |
---|---|
Budget Epic Name | CTP Maintenance Budget |
Feature Lead | Alistair Spark |
Team | Nikola Bozhkov, Catalyst EU/AU |
This feature covers the need for Moodle to be proactively monitored, and for performance issues to be dealt with before they cause any CIs.
This ties in with the idea of an LA Data Availability team and, more generally, with an Application SRE function (Site Reliability Engineering: https://cloud.google.com/blog/products/devops-sre/how-sre-teams-are-organized-and-how-to-get-started).
Key areas of focus for TI2:
- Continue monitoring auto-scaling and refine its scaling parameters for peaks
- Refine the coursemodinfo caching infrastructure to reduce cost while still fulfilling the bandwidth requirement
- Drive load testing of Moodle 4.1 and pipeline-based load testing
Some of the key recurring activities encapsulated here:
- Active monitoring of Redis, the frontends, etc. during peaks of load
- Drill through any blips in response times and document causes
- Push for resolution of any identified flaws
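As an illustration of the "drill through any blips in response times" activity, the sketch below flags samples that jump well above a recent baseline. It is a minimal example under stated assumptions, not a description of the monitoring actually in place: the function name `find_blips`, the window size, the threshold factor, and the sample data are all hypothetical.

```python
from statistics import median


def find_blips(samples, window=5, factor=2.0):
    """Return indices of samples that exceed `factor` times the median
    of the preceding `window` samples (hypothetical blip definition)."""
    blips = []
    for i in range(window, len(samples)):
        # The baseline is the median of the previous `window` samples,
        # so a single earlier spike does not distort it much.
        baseline = median(samples[i - window:i])
        if baseline > 0 and samples[i] > factor * baseline:
            blips.append(i)
    return blips


# Hypothetical response times in milliseconds, with one spike.
response_ms = [120, 118, 125, 122, 119, 121, 480, 123, 120, 117]
print(find_blips(response_ms))  # prints [6]
```

Using a median rather than a mean for the baseline is a deliberate choice here: it keeps a single spike from raising the baseline and hiding the samples that follow it.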
Moodle uptime is critical, and this feature will always come before anything else. We currently rely on Catalyst to develop fixes for us; this will change over time, but we are well resourced, so this should not be seen as a barrier.
This information is provided by Digital Education (https://www.ucl.ac.uk/isd/digital-education-team-information) and licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.