AI & Machine Learning

ChatGPT is down – not for the last time. Do you need a ‘backup generator’?

What the ChatGPT outage means for businesses' risk and bottom line in an increasingly AI-reliant world.

If you’ve tried to use ChatGPT today – or indeed don’t live under a rock – you might have noticed: it’s not working.

And if you’re one of the many people who’ve integrated it into your daily workflow, you’ll have felt that outage much like a power cut. It’s a reminder of just how embedded generative AI has become – and how vulnerable we might be when it suddenly disappears.

But this isn’t the first time ChatGPT has had a so-called ‘outage’. In fact, it’s unfair to call it that. What’s really happening is that demand has outstripped capacity: free users are hit with major lag and errors, pro users get a little more priority, and enterprise customers likely have the most protection.

Basically, it’s like when the first cold snap of the year hits. Everyone turns their heating on, and the grid struggles, which can lead to power dips.

When AI goes down, so does productivity

The interesting thing about outages like this is how unevenly they hit. Some people barely notice. Others – especially those in teams that have gone all-in on AI – are left scrambling.

It’s like when the self-checkouts go down in a supermarket: queues build up, tempers fray, customers give up and walk out. Revenue drops.

Outsourced the IT support for those checkouts? You’re probably staring down a costly emergency callout. If the person who fixes the checkouts is in-house, they’re everybody’s favourite that day.

The same logic applies to AI. If your business has outsourced its AI capability entirely to a single provider, you’re exposed. And as adoption deepens, that exposure becomes more acute.

The case for a ‘backup generator’

The question is – when do we start classing AI downtime as a critical infrastructure failure? After all, as AI augmentation in the workplace becomes ever more widespread, outages could have a very real impact on the economy.

Well, some critical infrastructure failures are inevitable. Power and the internet can go down, and they do. That’s why hospitals, for example, have emergency generators to mitigate the most disastrous consequences.

I’ve started talking about AI resilience in the same way we talk about energy resilience. If your business can’t function without AI, then you need a backup plan. 

That might mean running models locally, so you’re not entirely dependent on cloud-based services, or training your teams to manage and deploy AI internally. 
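For illustration only, here’s a minimal sketch of what that local ‘backup generator’ could look like in Python, using the open-source Hugging Face transformers library – the model named below is purely an example, and any small open-weights model your hardware and licences allow would do:

```python
# Minimal sketch of a local fallback model served on your own hardware.
# Assumes the Hugging Face transformers library is installed; the model name
# is illustrative only - substitute any open-weights model that fits your
# hardware and licensing.
from transformers import pipeline

# Downloaded once, then usable offline - no dependency on a cloud provider.
local_llm = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")

def generate_offline(prompt: str, max_new_tokens: int = 200) -> str:
    """Generate a completion entirely on local hardware."""
    result = local_llm(prompt, max_new_tokens=max_new_tokens)
    return result[0]["generated_text"]

if __name__ == "__main__":
    print(generate_offline("Summarise the risks of relying on a single AI provider."))
```

It won’t match a frontier cloud model, but it keeps the lights on while the primary provider recovers.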

You might not have to use it every day – but when you need it, you’ll be very glad it’s there. Not every business is equipped, however, to build its own local, on-site AI model ‘just in case’.

Another option, and the practice I strongly advise, is to ‘spread risk’ across more than one provider. This means your workforce has another friend to turn to when one lets them down, rather than simply resigning themselves to plummeting productivity.

However, making this seamless is easier said than done. If you’ve ever hit your free ChatGPT limit halfway through a task and had to shift your work into another tool, you’ll know the transfer can be frustrating and time-consuming.

Smart architecture: One interface, many models

One of the smartest strategies could be to build a model-agnostic interface – a single front-end that can dynamically switch between different AI backends. That way, if one provider goes down (or hikes their prices), you can pivot instantly.

Your employees stay focused on their roles and their AI-augmented output through one consistent chat interface, while behind the scenes the tool carries across all the context and information needed for them to keep using any of a set of approved base models effectively and without interruption.

But to do that well, you need to have built your context in a portable way. That means structuring your prompts, data, and workflows so they can be reused across platforms. It’s not just good practice – it’s future-proofing.
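To make that concrete, here’s a minimal, hedged sketch of the pattern in Python. The provider adapters are hypothetical placeholders – in practice each would wrap a real SDK or a locally hosted model – and the ‘portable context’ is simply a plain list of role/content messages that every backend can accept:

```python
# Sketch of a model-agnostic front-end with automatic failover.
# The adapters are hypothetical stand-ins for real providers (cloud SDKs,
# a locally hosted model, etc.); what matters is that they all accept the
# same portable context - a plain list of {"role": ..., "content": ...}
# messages - so a conversation can move between backends without losing state.
from typing import Callable, Dict, List

Message = Dict[str, str]                   # e.g. {"role": "user", "content": "..."}
Provider = Callable[[List[Message]], str]  # adapter: portable context in, reply out


class ProviderUnavailable(Exception):
    """Raised by an adapter when its backend is down or over capacity."""


def chat(messages: List[Message], providers: List[Provider]) -> str:
    """Try each approved backend in order, falling through on failure."""
    errors: List[Exception] = []
    for provider in providers:
        try:
            return provider(messages)
        except ProviderUnavailable as exc:
            errors.append(exc)             # note the failure, try the next backend
    raise RuntimeError(f"All AI backends failed: {errors}")


# Usage (adapters are hypothetical):
#   reply = chat(conversation_so_far, providers=[primary_cloud, secondary_cloud, local_fallback])
```

Because the context travels as plain data rather than living inside one vendor’s chat history, switching providers becomes a configuration change rather than a scramble.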

From pilot to production: The skills gap is real

You can probably see where I’m going with this – whether it’s creating a local AI ‘backup generator’ or building the ability to switch effectively between providers, the need for advanced AI engineering skills is the same. And we already know those skills (at least ‘ready-made’) are both costly and in short supply.

What’s more, over the last 12 months I’ve seen a clear shift: organisations are moving from AI pilots to full-scale rollouts. That’s exciting, but it also means the stakes are higher.

When AI was a side project, an outage was annoying. Now, it’s a business risk.

These are both reasons that I expect to see a sharp uptick in demand for AI engineering training. It’s no longer enough to have a few prompt wizards on the team. Businesses need people who can:

  • Build and maintain AI infrastructure.
  • Integrate multiple models and providers.
  • Ensure continuity when things go wrong.

In short: we need more AI Makers, not just AI Takers.

Resilience mindset

Today’s ChatGPT outage won’t be the last. Other providers are not immune, either. 

There’s a wider conversation to be had about provider transparency and regulation of uptime and incident reporting... but the likelihood is that there will always be flaws – and always be risk.

Sometimes, everyone turns their kettle on at half time – and the lights go out.

It’s a useful reminder that resilience isn’t just about infrastructure – it’s about mindset. If your business is serious about AI, then you need to be serious about continuity, capability, and control.

Because when it goes dark, the businesses that keep running are the ones that planned ahead.

 

Keep your teams up to date with the latest AI skills – so they can respond when a provider fails – with our expert-led AI training solutions.
