Posts tagged 2025
Outsourcing IT Operations

We recently attended a very interesting seminar organised by the Global Association of Risk Professionals (GARP). Its aim was to help finance companies address APRA’s CPS 230 standard, which has a commencement date of 1 July 2025. The standard addresses the need for finance companies to develop business continuity plans and to manage their operational risks prudently. Because outsourcing IT operations to third parties is now so common, the standard focuses heavily on this practice.

Speakers were from UniSuper, NAB and Deloitte and they spoke generally about their experience in preparing for the introduction of the new standard. 

The speaker from UniSuper focused on the company’s experience of having Google inadvertently delete the entire UniSuper Google Cloud subscription, impacting over 600,000 customers. This occurred even though UniSuper had duplicate infrastructure and data in two geographies.

The conversation settled primarily on the challenge of maintaining the resilience of IT applications and data, in light of the small number of vendors in Australia.

See below for a diagram that depicts the major vendors of IT infrastructure and applications in Australia.

By way of example, before cloud and SaaS, the four major banks in Australia would each own and operate their own data centres, IT systems and networks, and purchase the application software licences needed to run their businesses. If one of the banks suffered a power outage, a fire in its data centre or an IT malfunction, only its customers were impacted.

Today, it is likely that all four banks subscribe to the services of AWS, Microsoft and Google. So, if one of these providers suffers an outage, many more bank customers could be impacted.

The dominance of Microsoft is particularly concerning because it operates cloud services and three dominant SaaS services – O365, Teams and SharePoint. For a large proportion of Australian organisations, employees working from home are especially reliant on Teams.

Another concentration of risk arises from the clustering of data centres in Melbourne. There are currently four large data centres located in close proximity to each other in Port Melbourne, with NextDC planning another very large data centre nearby. Port Melbourne is about 2 to 3 metres above the Yarra River, which is open to the sea.


Uncertainty in the US

Finally, Jeff Bezos and Mark Zuckerberg have recently made substantial changes to the policies governing The Washington Post and Facebook. Given the major changes occurring in the US, it is plausible that other companies shown in the diagram above could also make substantial changes to the way they operate, possibly impacting Australian companies.

What we used to take for granted can no longer be assumed!

We left the seminar believing that more Australian organisations should seriously consider APRA’s approach to managing their reliance on IT systems and data. 

The introduction of the Standard on 1 July 2025 adds urgency for Australia’s regulated entities!

PS: The electrical substation damaged in the recent, highly disruptive fire near Heathrow Airport apparently also supplied a number of the UK’s data centres!

Floods and Resilience

Reducing the impact of a flood

We often advise clients that the risks presented by climate change are increasing rapidly. For any organisation that has assets exposed to flooding, sea level rise or storms, mitigating these risks can be challenging. Here are a couple of success stories.

In May 2010, a major flood hit Coca-Cola’s 30,000 m² bottling plant in Nashville, Tennessee. The facility is located in a high-hazard, 100-year flood zone.

The flooded bottling facility during the 2010 flood which prompted the flood mitigation project.

Coca-Cola partnered with FM Global to develop a plan to protect their facility from future floods.

They decided they could not relocate the very large warehouse away from a known flood risk, so they developed a method to reduce the impact of the next inevitable flood. They protected critical production equipment within the facility using flood walls. This enabled them to let the flood waters flow into, and out of, the building.

Flood wall around the critical infrastructure and the flood door to allow the water to flow back out of the building.

Amazingly, they were able to verify the effectiveness of the solution during another serious flood in March 2021. See here for more details: https://www.fm.com/insights/coca-cola


Interestingly, The Reject Shop’s 26,000 m² distribution centre (DC) in Ipswich, Queensland had a very similar experience. After the extensive flooding at the start of 2011, the company installed a flood barrier system around the DC. Again, it had the opportunity to test the barrier when another flood hit the area in 2013. There was no impact on the DC’s operations!

Resilience in the Cloud

The Uptime Institute recently published an excellent paper on the costs and benefits of purchasing increased resilience from cloud providers, using AWS as a case study. The baseline for comparison was a system with no resilience installed. The study shows the cost of each increase in resilience and the associated reduction in downtime.

The author provides some wise advice:

"Unlike privately owned and co-located data centres, customers using the public cloud have no visibility or control over the datacentre used by their cloud provider. When architecting a cloud application, it is up to software developers to incorporate resiliency into their application architecture. Conversely, in a more traditional non-cloud application, data centre teams, infrastructure engineers and software developers should work together to meet resiliency requirements."

"If customers use more resources to architect resiliency, they need to pay for those additional resources. The implication is that resiliency is neither included as standard nor guaranteed. Customers should design their applications to meet availability requirements and balance this objective against the cost."

As CIOs relinquish control over the operation of their IT systems, it is imperative that the resilience requirements of each application and its data are fully specified in the service agreement with the cloud provider. APRA’s standard CPS 230, which is soon to come into effect, addresses many of the issues associated with outsourcing critical services to third parties.

Source: https://intelligence.uptimeinstitute.com/resource/cloud-availability-comes-price
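To make the trade-off between resilience and downtime concrete, here is a minimal sketch of the arithmetic involved. It is not taken from the Uptime Institute paper: the 99% availability figure is an assumption chosen for illustration, the function names are our own, and the calculation assumes that redundant deployments fail independently of each other, which real incidents do not always respect.

```python
# Illustrative only: the availability figures below are assumptions, not
# results from the Uptime Institute paper or any cloud provider's SLA.

HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def annual_downtime_hours(availability: float) -> float:
    """Expected hours of downtime per year for an availability between 0 and 1."""
    return (1.0 - availability) * HOURS_PER_YEAR


def parallel_availability(availability: float, copies: int) -> float:
    """Availability of `copies` redundant deployments, any one of which can
    serve the workload, assuming their failures are independent."""
    return 1.0 - (1.0 - availability) ** copies


if __name__ == "__main__":
    single = 0.99  # assumed availability of one deployment
    for copies in (1, 2, 3):
        combined = parallel_availability(single, copies)
        print(f"{copies} deployment(s): availability {combined:.6f}, "
              f"about {annual_downtime_hours(combined):.2f} hours down per year")
```

Each extra deployment has to be paid for, so the useful comparison is the added cost against the hours of downtime avoided. A failure that takes out every copy at once is not captured by this simple model.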

Operational Risk Management - CPS 230

The Australian Prudential Regulation Authority (APRA) has released a guide that covers the new standard on Operational Risk Management - CPS 230. The standard came into force this month.

Although APRA’s standards are intended for companies operating in the Australian financial market, we think the standard and guide provide very good advice for most organisations that are concerned about their operational resilience.

The standard addresses the following:

  • The assessment and management of a wide range of operational risks, including legal, regulatory, compliance, conduct, technology, data and change management risks.

  • Business continuity and how organisations should identify time-critical business activities and estimate their tolerance for having them unavailable. Importantly, the business continuity plan should document the recovery procedures and workarounds to use if any supporting resources (people, facilities and IT systems) become unavailable because of a disruption.

  • Development of a policy for dealing with material service providers. This policy should cover how to identify, manage and monitor the service providers that have a significant impact on the organisation’s operations. Organisations should also evaluate the risks posed by these service providers, sign formal contracts with them, track their performance and carefully manage any major changes to these arrangements.

Managing outsourced IT services

We find that many organisations have outsourced large parts of their IT infrastructure to service providers and as a result they have often yielded management and control to others.

This makes it challenging for the CIO to ensure that the recoverability of IT systems meets the needs of the business. Some Software as a Service vendors will not warrant a Recovery Time Objective (RTO). Often, the outsourced system (and its data) exists at only one location, making it a single point of failure.

It is critical that business management identifies the time-critical activities and their tolerance to disruption. These requirements should be communicated to IT management, so that the critical IT systems have the necessary resilience to support the business during disruptions.

CPS 230 outlines an excellent approach to achieve that!
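As a rough illustration of how business tolerances can be compared with what an outsourced provider actually offers, the sketch below flags the three gaps described in this post: no warranted Recovery Time Objective, an RTO longer than the business can tolerate, and a system that exists at only one location. The system names, field names and figures are all hypothetical, and this is not a method prescribed by CPS 230.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CriticalSystem:
    name: str                          # hypothetical system name
    tolerance_hours: float             # longest disruption the business accepts
    vendor_rto_hours: Optional[float]  # None if the vendor warrants no RTO
    locations: int                     # number of sites hosting the system and data


def resilience_gaps(system: CriticalSystem) -> List[str]:
    """Return the reasons a system may not meet the business's tolerance."""
    gaps = []
    if system.vendor_rto_hours is None:
        gaps.append("vendor does not warrant a Recovery Time Objective")
    elif system.vendor_rto_hours > system.tolerance_hours:
        gaps.append("warranted RTO exceeds the business tolerance")
    if system.locations < 2:
        gaps.append("single location is a single point of failure")
    return gaps


if __name__ == "__main__":
    # Hypothetical register of time-critical systems, for illustration only.
    systems = [
        CriticalSystem("payments", 4, 2, 2),
        CriticalSystem("payroll SaaS", 24, None, 1),
        CriticalSystem("CRM", 8, 12, 2),
    ]
    for s in systems:
        gaps = resilience_gaps(s)
        print(f"{s.name}: {'; '.join(gaps) if gaps else 'no gaps found'}")
```

A register like this is only as good as its inputs: the tolerances need to come from business management, and the RTOs and locations from the service agreements themselves.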