


HMRC has just signed a £175 million, ten-year contract with Quantexa, a UK-based data, analytics and AI software company. The system will pull together fragmented data across government to flag tax fraud and direct enforcement effort at scale. HMRC oversees roughly £800 billion in tax revenue a year, so by any measure this is well and truly consequential AI.
Most of the coverage has treated it as a procurement story: a large contract, a well-funded vendor, a government getting serious about compliance.
But that framing misses what I think is the more important question: what happens when someone decides to attack it? We’ve seen this before: a new class of critical system appears, organisations focus first on capability and rollout, and only later realise that adversaries will test it too.
This isn't just speculation; the pattern is already established. Wherever a system makes decisions with money or consequence attached, attackers will probe it. Fraud teams have spent years watching adversaries map the edges of rules-based detection systems, learning which transaction patterns trigger alerts and which don't.
AI systems may be more sophisticated targets, but the adversarial logic is identical.
The examples aren't theoretical:
The techniques exist and are in use. The question isn’t whether AI systems used in tax enforcement, benefits assessment or financial regulation will face adversarial pressure. It’s which ones, and how soon.
We have a useful precedent. It just played out so slowly that it's easy to forget how it happened.
In 1999, The Matrix introduced millions of people to the idea that a networked system could be infiltrated, manipulated and turned against its operators. That same year, penetration testing was a niche discipline practised by a handful of researchers. Most organisations had no idea it existed. In 2004, PCI DSS made it mandatory for card processors. At the time, it was considered a blunt regulatory intervention driven by the cost of card fraud reaching levels that demanded a response. CREST was founded in 2006 to professionalise and credential the practice. NIST published its technical pen testing standard in 2008. By the mid-2010s, no regulated organisation was procuring serious technology without a pen testing requirement somewhere in the process.
That's roughly fifteen years from niche to a standard procurement requirement. There were four key forces that drove it:
None of those four emerged all at once; they accumulated, reinforced each other and eventually made the question “has this been tested?” as unremarkable as asking whether a building has a fire alarm.
All four forces are now in motion for AI.
The regulatory direction is already clear:
None of this mandates AI red teaming as a specific, credentialled practice…yet. But the direction is unmistakable and the gap between ‘direction’ and ‘mandatory’ is closing faster than it did for pen testing. The regulatory cycle moves quicker now, and public and political attention on AI is higher than it ever was on cyber security in 2004. Insurers are also beginning to ask about AI exposure within cyber liability policies: the same pricing signal that, in retrospect, did as much as any regulation to normalise pen testing as routine hygiene.
My estimate: AI red teaming will become a procurement expectation in regulated sectors within five to seven years. Financial services, healthcare and the public sector will get there sooner.
What's missing right now is the infrastructure the cyber industry built: there’s no CREST equivalent, no agreed engagement standard, and no common methodology for what an AI red team should produce and how findings should be reported. That gap will close. The question for organisations procuring AI today is whether they wait for the standard to exist before taking it seriously (which is exactly what most organisations did with cyber, and exactly why the 2000s were so expensive).
AI red teaming sits at the intersection of three distinct capabilities:
Most AI consultancies have none of these. They understand model architecture, not adversarial tradecraft. Most pen testing firms have one (usually the hands-on technical piece) but not the threat intelligence background or the investigative depth to understand how a system's decisions can be gamed by someone who has spent time understanding its purpose.
The combination is important because the most dangerous attacks on consequential AI systems won't come from people probing APIs at random. They'll come from people who understand what the system is trying to decide, what evidence it relies on and how to shape that evidence to produce a different outcome. That requires threat intelligence thinking, not just technical testing.
This is exactly the kind of work PGI has been building toward through adjacent disciplines. Our experience across fraud, financial crime, cyber investigation and threat-led testing means we are used to examining how systems behave when someone is actively trying to manipulate an outcome, not just how they perform under normal conditions.
If you are procuring AI for anything consequential—enforcement, fraud detection, credit decisions, benefits eligibility—the primary question is probably not “is it accurate?” because accuracy is the vendor's pitch. It tells you how the system performs against known inputs.
The question that tells you more is: “Has it been attacked and what can it survive?”
That question doesn't have a standard answer yet, but it will. Organisations that start asking it now, before the framework exists, before the regulator requires it, will be better placed than those who wait to be told.
We're not watching the beginning of a new discipline. We're watching a repeat of one.
Most organisations will wait for regulation. The smarter ones won’t. If you're procuring AI for anything consequential, now is the time to test whether it can be manipulated before someone else does. We've spent years thinking like attackers and we can apply that mindset to the AI systems you rely on. Let's talk.
