Last reviewed and updated on: 31st October 2022
The core of OpenClassrooms’ operations and services is conducted and delivered online. Our students and, where relevant, their employers can expect access to our learning resources and support platforms 24/7. It is essential, therefore, that our systems, processes and procedures are designed to maximise accessibility and minimise disruption for employees and customers in unplanned and unforeseen circumstances, and that we have plans in place to deal with the unexpected.
This Continuity Plan outlines the mechanisms that OpenClassrooms has in place to continue to operate in the event of a disaster.
The plan has two objectives:
1. To avert or minimise the effects of a disaster
2. In the event of a disaster, to resume full operation with minimum disruption
The plan is owned by the Chief Technology Officer and reviewed and tested regularly, at least once per year.
1. Disaster scenarios and critical losses
Disaster scenarios could result from a wide range of causes, e.g. system or equipment failure, human error, serious adverse weather conditions, natural disaster, theft, sabotage, cyber-attack, vandalism or serious accident. They have the potential to lead to losses that are likely to have a major impact on OpenClassrooms’ ability to operate.
A risk-based business impact analysis has identified the following potential losses critical to OpenClassrooms’ operation:
- Loss of access to critical IT infrastructure
- Database corruption / data loss
- Loss of access to physical infrastructure
- Loss of human resources and expertise resulting in disruption to service
1.1 IT infrastructure and databases
OpenClassrooms operates a paperless, Software-as-a-Service (SaaS) business model. Our IT systems, the software tools employees rely on to perform their daily duties, and the learning platforms and databases used by employees and students are therefore cloud-based and fully accessible remotely by all authorised users. This ensures that in the event of disruption to office space, employees in all functional capacities have full remote access to business-critical systems, applications and data.
Servers are professionally hosted off-site via Amazon Web Services, a multi-compliant cloud hosting service (https://aws.amazon.com/fr/compliance/programs/), and mirrored to provide full resilience. Server management is handled by Claranet (https://www.claranet.co.uk/cybersecurity/audit-and-compliance).
Scheduled maintenance and disruption to the service of the platform are communicated to users via a message on the site, which wherever possible will include the time by which service is expected to resume.
Database backup procedures are carried out as follows:
- Every database is fully backed up every day between 2am and 5am.
- A snapshot is taken every day and stored for two weeks (14 days).
- During the 14-day backup storage window, every database can be restored to any point in time using the point-in-time recovery system from AWS RDS.
- Database backups are tested daily with a full restore to staging databases.
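The 14-day restore window described above can be expressed as a simple check. The following is a minimal sketch, not OpenClassrooms' actual tooling: the function name `is_restorable` and the constant `RETENTION_DAYS` are hypothetical, and the sketch assumes the window is counted back from the current time.

```python
from datetime import datetime, timedelta, timezone

# Retention window taken from the plan: snapshots are stored for 14 days.
RETENTION_DAYS = 14


def is_restorable(target: datetime, now: datetime) -> bool:
    """Return True if `target` lies inside the backup storage window ending at `now`."""
    earliest = now - timedelta(days=RETENTION_DAYS)
    return earliest <= target <= now


now = datetime(2022, 10, 31, 12, 0, tzinfo=timezone.utc)
print(is_restorable(now - timedelta(days=3), now))   # True: inside the window
print(is_restorable(now - timedelta(days=20), now))  # False: older than 14 days
```

A check of this kind could be run before a restore is attempted, so that a request for a point in time outside the window is rejected early.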
Application:
Every build artefact is stored in AWS S3 and can be re-deployed at any time. Our application is launched from at least two different servers in two physically and logically isolated locations. Servers can be added on demand at short notice in the event of a server issue or an increase in server load. Using two different locations ensures continued platform availability if an issue occurs in one of them.
Architecture:
Our infrastructure is described in Terraform, an infrastructure-as-code framework, and this description has been fully tested by recreating a test environment for benchmarking purposes. This permits easy audit of our current infrastructure and allows us to recreate it should a major event destroy our current infrastructure provider or make it unavailable.
| Disaster scenario | Disaster recovery action | Communication plan | Responsible owner |
|---|---|---|---|
| Infrastructure failure resulting in database corruption and/or data loss | Restore the database from the most appropriate point in time. Estimated recovery time: under 1 hour from reporting the failure. This procedure has been used in production after a failed migration that resulted in data loss; it limited the loss to less than 10 minutes of data. | The responsible owner will: ensure database users are informed of any data loss that may have occurred; advise users of any remedial action they may be required to undertake to retrieve the data; where necessary, escalate the matter to the OpenClassrooms crisis communication plan. | Chief Technology Officer |
| Database server outage | The production database runs in High Availability mode. In case of a master database server outage, the system automatically promotes a standby replica. Failover takes 60 to 120 seconds to complete and does not require human intervention. | N/A | Chief Technology Officer |
| Web server outage | A new web server is automatically started to replace the faulty server. A web server outage doesn't impact production. | The responsible owner will: ensure the externally hosted ‘status page’ is updated, notifying users of current status and anticipated recovery time; identify any students whose synchronous session schedules may have been affected and prompt the rescheduling of service delivery to students (e.g. rescheduling of delivery sessions, extensions to assessment deadlines, etc.), ensuring that students are not unreasonably disadvantaged. | Chief Technology Officer |
| A full AWS Availability Zone crash | No outage or only a short outage expected: every app group and database is available in at least two availability zones. | N/A | Chief Technology Officer |
| A full AWS Region crash | Start a new infrastructure from scratch with Terraform, restore the database backup and change DNS. Estimated recovery time: 1 day. | This scenario would result in full website unavailability for 24 hours. The responsible owner will: ensure the externally hosted ‘status page’ is updated, notifying users of current status and anticipated recovery time; communicate with students and other site users, advising them of the unavailability and any consequent loss of data; identify any students whose synchronous session schedules may have been affected and prompt the rescheduling of service delivery to students (e.g. rescheduling of delivery sessions, extensions to assessment deadlines, etc.), ensuring that students are not unreasonably disadvantaged. | Chief Technology Officer |
| A full AWS crash | Start a new infrastructure on Google Public Cloud from scratch with Terraform, restore the database backup and change DNS. Estimated recovery time: 5 days. | The responsible owner will: invoke OpenClassrooms’ crisis communication plan; update the externally hosted ‘status page’, notifying users of current status and anticipated recovery time; communicate with students and other site users, advising them of the unavailability and any consequent loss of data; identify any students whose synchronous session schedules may have been affected and prompt the rescheduling of service delivery to students (e.g. rescheduling of delivery sessions, extensions to assessment deadlines, etc.), ensuring that students are not unreasonably disadvantaged. | Chief Technology Officer |
| Malicious cyber attack | In case of a malicious cyber attack targeting the platform, the immediate response is to activate the 'I'm under attack mode' on our CDN to mitigate the attack. After mitigation, an investigation must be undertaken to understand what was impacted by the attack and to take responsive action accordingly. | Where a data breach requires public communication with customers or the press, invoke OpenClassrooms’ crisis communication plan. | Chief Technology Officer |
| Data breach or loss of personal data | This could range from a small data breach (e.g. a small file on a USB key) to a major data breach (e.g. a full database backup becoming publicly available online). The responsible owner will ensure that appropriate immediate mitigating action is taken. In the event of a small data breach, the first step is to take any measure necessary to stop the breach. In the event of a major data breach, the responsible owner is responsible for implementing a contingency action plan, which could include taking part of the website offline if needed to mitigate the breach. Once immediate mitigating action has been taken, analysis will be undertaken to diagnose how the breach may have been exploited and which parties/customers are impacted. Any data protection and GDPR repercussions will be considered and action taken accordingly. | Where a data breach requires public communication with customers or the press, the responsible owner will invoke OpenClassrooms’ crisis communication plan. | Chief Technology Officer and Data Protection Officer |
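The estimated recovery times in the table above can feed the "anticipated recovery time" shown on the status page. The sketch below is illustrative only: the dictionary keys and the function `expected_recovery_by` are hypothetical names, and each value is the upper bound of the estimate given in the table for that scenario.

```python
from datetime import datetime, timedelta

# Upper-bound recovery estimates copied from the table above.
# The keys are hypothetical identifiers chosen for this sketch.
ESTIMATED_RECOVERY = {
    "database_corruption": timedelta(hours=1),
    "database_server_outage": timedelta(seconds=120),
    "aws_region_crash": timedelta(days=1),
    "full_aws_crash": timedelta(days=5),
}


def expected_recovery_by(scenario: str, reported_at: datetime) -> datetime:
    """Return the latest time by which service is expected to resume."""
    return reported_at + ESTIMATED_RECOVERY[scenario]


reported = datetime(2022, 10, 31, 9, 0)
print(expected_recovery_by("aws_region_crash", reported))  # 2022-11-01 09:00:00
```

Scenarios without a stated estimate (for example a web server outage, which is not expected to impact production) are deliberately left out of the mapping.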
1.2 Physical infrastructure
| Disaster scenario | Disaster recovery action | Communication plan | Responsible owner |
|---|---|---|---|
| Temporary loss of access to office premises | In accordance with OpenClassrooms’ SaaS business model, in the event of temporary inaccessibility of OpenClassrooms’ administrative and HQ premises (<5 days), all employees will implement remote working arrangements to ensure continuity of all business-critical functions. | Invoke OpenClassrooms’ crisis communication plan. | Chief Human Resources Officer |
| Long-term loss of access to office premises | | Invoke OpenClassrooms’ crisis communication plan. | Facilities Manager |
1.3 Disruption of service
| Disaster scenario | Disaster recovery action | Communication plan | Responsible owner |
|---|---|---|---|
| Extensive absence among employees (e.g. due to severe sickness outbreak) resulting in disruption to services to students | The responsible owner will instigate plans for the rescheduling of service delivery to students (e.g. rescheduling of delivery sessions, extensions to assessment deadlines, etc.), ensuring that students are not unreasonably disadvantaged. Where appropriate, temporary workers will be recruited. Where appropriate, support will be extended to absent employees or their families. | Invoke OpenClassrooms’ crisis communication plan. | Chief Human Resources Officer |
| Extensive absence among mentors (e.g. due to severe sickness outbreak) resulting in disruption to services to students | The responsible owner will instigate plans for the rescheduling of service delivery to students (e.g. rescheduling of delivery sessions, extensions to assessment deadlines, etc.), ensuring that students are not unreasonably disadvantaged. Where appropriate, temporary workers will be recruited. | Invoke OpenClassrooms’ crisis communication plan. | VP Customer Success |
| Transportation (public or private) failure prevents face-to-face events | Where face-to-face meetings have been scheduled (e.g. Progress Review) and a transportation failure makes travel impossible for the attendee from OpenClassrooms, the Employer and Learner must be notified as soon as this is known. A video call or a new date should be arranged as an alternative. | Invoke OpenClassrooms’ crisis communication plan. | VP Customer Success |
2. Emergency Contacts
Up-to-date emergency contacts for students, including external links to ESFA contacts, are provided in the student handbook and via the OpenClassrooms website.
Internal emergency contacts are published on Notion. Key contacts for learners are listed below.
| Organisation | Area | Website | Phone | Email |
|---|---|---|---|---|
| OpenClassrooms | Learning and Mentor related | https://openclassrooms.zendesk.com/hc/en-us/requests/new | 0161 768 1880 | Via the form on our website: oc.cm/contact |
| OpenClassrooms | Labs House Office | labs.com/location/labs-house/ | 020 3761 2800 | |
| | Platform/IT Issue | https://openclassrooms.zendesk.com/hc/en-us/requests/new | 0161 768 1880 | Via the form on our website: oc.cm/contact |
3. Crisis Communication Plans
3.1 PRE-CRISIS
| Action | DRI | Agenda |
|---|---|---|
| Prepare Crisis Communications Manual: main messages; holding statements; crisis contacts and organization; crisis centre organization; Q&As; circulation of information; stakeholders mapping | VP Brand Communication and Impact | Ongoing |
| Prepare Crisis response infrastructure: crisis room setup; specific phone lines; activation of status page with crisis contacts; crisis communications team setup (who’s in, who’s out) | VP Brand Communication and Impact | Ongoing |
| Prepare Crisis protocols. Level 1: no effect on stakeholders (e.g. short website outage, spam attack). Level 2: effects on at least two stakeholders among employees, students and mentors (e.g. long website outage, reduced data breach). Level 3: broad effect on all stakeholders (e.g. massive data breach). | VP Brand Communication and Impact | Ongoing |
| Prepare Crisis spokespeople and response protocols (who speaks when and to whom) | VP Brand Communication and Impact | Ongoing |
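The three crisis levels defined in the protocols above can be read as a classification over the set of affected stakeholder groups. The following is a minimal sketch under stated assumptions: the function `crisis_level` is hypothetical, and because the protocols do not say how to classify a crisis affecting exactly one group, the sketch returns `None` for that case and leaves it to human assessment.

```python
from typing import Optional, Set

# Stakeholder groups named in the Level 2 protocol.
STAKEHOLDERS = frozenset({"employees", "students", "mentors"})


def crisis_level(affected: Set[str]) -> Optional[int]:
    """Map the set of affected stakeholder groups to a crisis level (1, 2 or 3).

    Returns None when exactly one group is affected: the protocols do not
    define that case, so this sketch defers to human judgement (assumption).
    """
    unknown = affected - STAKEHOLDERS
    if unknown:
        raise ValueError(f"unknown stakeholder groups: {sorted(unknown)}")
    if not affected:
        return 1  # Level 1: no effect on stakeholders
    if affected == STAKEHOLDERS:
        return 3  # Level 3: broad effect on all stakeholders
    if len(affected) >= 2:
        return 2  # Level 2: effects on at least two stakeholders
    return None


print(crisis_level(set()))                    # 1
print(crisis_level({"students", "mentors"}))  # 2
print(crisis_level(set(STAKEHOLDERS)))        # 3
```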
3.2 CRISIS
| Action | DRI | Agenda |
|---|---|---|
| Gather information and assess level of crisis (1, 2 or 3), decide on answer | CEO, with help and advice of CPO, VP Communications and affected executives | As soon as alert emerges |
| Open Crisis room following crisis protocol, create Crisis team and send all unnecessary personnel home | VP Brand Communication and Impact + Crisis communications team | Alert + 30 minutes |
| Gather as much information as possible, establish secure line, take control over status page and other communications channels | VP Brand Communication and Impact + Crisis communications team | Alert + 60 minutes |
| Issue first holding statement to the media. Advise relevant officials and stakeholders. Publish first holding statement on social media feeds, status page and website. | VP Brand Communication and Impact + Crisis communications team | As soon as ready, depending on availability of information |
| Issue complete statement to the media. Advise relevant officials and stakeholders. Publish complete statement on social media feeds, status page and website. | VP Brand Communication and Impact + Crisis communications team | As soon as ready, depending on availability of information |
| Provide media with complete overview of situation. Advise relevant officials and stakeholders. Disseminate through social media. | CEO with VP Brand Communication and Impact + Crisis communications team | As soon as ready, depending on availability of information |
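The two time-boxed steps in the agenda above ("Alert + 30 minutes" and "Alert + 60 minutes") translate directly into deadlines once the alert time is known. The sketch below is illustrative: the function `crisis_deadlines` and the shortened step labels are assumptions made for the example, not part of the plan.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# Offsets taken from the "Agenda" column above; step labels are shortened.
TIMED_STEPS = [
    ("Open crisis room and create crisis team", timedelta(minutes=30)),
    ("Establish secure line and take control of status page", timedelta(minutes=60)),
]


def crisis_deadlines(alert_at: datetime) -> List[Tuple[str, datetime]]:
    """Return each time-boxed step with its deadline relative to the alert."""
    return [(step, alert_at + offset) for step, offset in TIMED_STEPS]


for step, due in crisis_deadlines(datetime(2022, 10, 31, 14, 0)):
    print(f"{due:%H:%M}  {step}")
```

The remaining steps are scheduled "as soon as ready" and so have no fixed offset to compute.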
3.3 POST-CRISIS
| Action | DRI | Agenda |
|---|---|---|
| Establish complete post-mortem, update crisis communications manual | VP Brand Communication and Impact | Ongoing |
4. General
OpenClassrooms will review and amend this plan on an annual basis, and may also revise it where necessary in response to the findings of any review or testing. The next planned review date is 31st October 2024.