The world of cyber security has not escaped the shift brought about by rapid developments in AI. While these innovations are exciting and offer real benefits in terms of efficiency and scale, they are leading some to believe that AI can replace human-led security testing altogether.
We’ve been keeping an eye on how AI has been impacting penetration testing specifically. One example that caught our attention was XBOW, a self-proclaimed ‘autonomous’ AI vulnerability scanning tool that has topped the US leaderboard of the HackerOne bug bounty programme, sparking widespread debate about the implications of using AI in security testing.
Over a 90-day period, XBOW submitted around 1,060 vulnerability reports to the programme. Of those, 132 were confirmed as valid and ‘awarded’, with a further 303 recognised as valid but not yet triaged. While these numbers are impressive at first glance, statistics can be misleading (that’s marketing for you). Cut a different way: just over one in ten of the vulnerabilities the scanner submitted were actually awarded, leaving humans to spend more time investigating false positives, which pulls them away from work that is genuinely beneficial.
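To put those figures in context, here is the quick arithmetic on the report counts quoted above (an illustrative back-of-the-envelope calculation, nothing more):

```python
# Back-of-the-envelope maths on the XBOW figures quoted above.
submitted = 1060        # reports submitted over the 90-day period
awarded = 132           # confirmed valid and awarded
valid_untriaged = 303   # recognised as valid but not yet triaged

print(f"Award rate: {awarded / submitted:.1%}")  # ~12.5%, just over 1 in 10

# Even counting the untriaged-but-valid reports generously as hits,
# well over half of all submissions still needed a human to review and dismiss.
print(f"Best-case valid rate: {(awarded + valid_untriaged) / submitted:.1%}")  # ~41.0%
print(f"Reports to triage per day: {submitted / 90:.1f}")  # ~11.8
```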
This exposes some key flaws in the claim that AI alone can replace traditional penetration tests without any level of human intervention.
There are still key limitations of AI when it comes to replacing human-led penetration testing:
While it’s true that AI excels at large-scale repetitive tasks—like identifying vulnerability patterns—it’s not able to consider business logic or the context of unpredictable or complex scenarios (such as custom-built web apps). Every organisation is unique, with its own systems, workflows, and risk profile, and therefore requires tailored security mechanisms to protect its specific business interests.
It’s widely acknowledged that AI lacks the creativity and adaptability of a human tester in building an understanding of the system being tested, and with it the capability for more advanced threat modelling and complex exploitation. This can lead AI to misjudge the level of threat, or to overlook novel or complex vulnerabilities, leaving critical gaps in your business’s risk profile.
Even though XBOW is described as a “fully autonomous AI-driven penetration tester”, human input is still required to validate the quality and accuracy of reports to comply with HackerOne's policies.
This highlights two important points:
Human testers are essential for validating findings, interpreting risk and ensuring contextual, actionable results are produced.
By design, AI systems can only work with the data they’ve been ‘trained’ on. For penetration testing, this means an AI tool is limited to detecting ‘known’ vulnerability classes it has already been taught to recognise, so it is likely to miss new or complex attack vectors, or misinterpret unusual data - a huge gap when it comes to having true oversight of your risk landscape.
Putting this into a real-world example: consider a totally custom-built web application with no third-party software components. An AI system might know how to look for common configuration issues and follow a general methodology for testing the application, but it would struggle to test the application’s custom logic. That is exactly where more nuanced vulnerabilities tend to hide, and where a human penetration tester is trained to look; the sketch below illustrates the kind of flaw we mean.
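As a purely illustrative sketch (the function and the discount rule are hypothetical, not taken from any real application), this is the kind of custom business-logic flaw that pattern-based scanning tends to miss, because nothing in it matches a known vulnerability signature:

```python
# Hypothetical discount-handling logic in a custom web app (illustrative only).
# A pattern-matching scanner sees no known CVE signature here, yet the flaw
# is serious: the server trusts a client-supplied discount percentage.

def apply_discount(order_total: float, discount_percent: float) -> float:
    """Apply a promotional discount supplied in the checkout request."""
    # BUG: no server-side check that this customer was actually issued
    # the discount, and no bounds check on the value itself.
    return order_total * (1 - discount_percent / 100)

# A human tester reading the workflow will try the obvious abuse cases:
print(apply_discount(100.0, 100))   # pays 0.0   -- a free order
print(apply_discount(100.0, 200))   # pays -100.0 -- a refund on purchase

# The fix is contextual: validate against the business rule, not a signature.
def apply_discount_safely(order_total: float, discount_percent: float) -> float:
    if not 0 <= discount_percent <= 50:   # business rule, not a known CVE
        raise ValueError("invalid discount")
    return order_total * (1 - discount_percent / 100)
```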
One major limitation of fully automating penetration testing with AI is the loss of human control. Every organisation has unique systems and security requirements, which means there are often specific considerations or constraints that need to be followed during testing. Because these nuances can be complex and context-specific, an AI system might struggle to interpret them correctly, which could lead to security issues, data breaches, inaccurate reporting and other risks.
Let’s take the example of an organisation that uses a production system for land ownership applications, where testers are instructed not to interact with certain options that would make an application public. A fully automated AI might ignore or misunderstand those constraints, unintentionally submitting large numbers of fake applications to public records.
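In a human-led engagement, constraints like these are captured in the rules of engagement and enforced in the tooling itself. A minimal sketch of what that can look like (the routes and rules here are hypothetical):

```python
# Hypothetical scope guard for the land-ownership example above.
# Human testers encode the client's constraints explicitly, so nothing
# in the test harness can touch the "make application public" action.

FORBIDDEN_ACTIONS = {
    ("POST", "/applications/{id}/publish"),   # would expose data publicly
    ("DELETE", "/applications/{id}"),         # destructive on production
}

def in_scope(method: str, route: str) -> bool:
    """Return True only if a request is permitted by the rules of engagement."""
    return (method, route) not in FORBIDDEN_ACTIONS

assert in_scope("GET", "/applications/{id}")
assert not in_scope("POST", "/applications/{id}/publish")
```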
In another scenario, imagine the same application has a searchable public database of land ownership records, but it’s vulnerable to SQL injection. An AI tool might try to ‘test’ this vulnerability by executing destructive payloads, like deleting records or even an entire database, without understanding the potential consequences. Unlike human testers, AI tools lack the contextual judgement to know the difference between safely proving a vulnerability and causing real damage. (In fact, similar behaviour has been observed in AI-assisted code generation tools like Copilot.)
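To make that distinction concrete, here is an illustrative contrast (the table and query are hypothetical) between a destructive injection payload and the benign proof a human tester would favour:

```python
# Both inputs exploit the same injectable search parameter; only one is safe.
# Vulnerable query (illustrative): the application concatenates user input:
#   SELECT * FROM land_records WHERE owner = '<user_input>'

# Destructive: "proves" the flaw by wrecking production data.
destructive_payload = "x'; DROP TABLE land_records; --"

# Benign boolean-based proof: shows the input changes the query's logic
# (it now returns every row) without modifying any data.
benign_payload = "x' OR '1'='1"

# A human tester also knows when even a read-only proof is too risky on a
# live public database, and will demonstrate the flaw in a test copy instead.
for label, payload in [("destructive", destructive_payload),
                       ("benign", benign_payload)]:
    print(f"{label}: SELECT * FROM land_records WHERE owner = '{payload}'")
```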
This reinforces why human validation and quality assurance remain essential to penetration testing: they maintain control and deliver high-quality results aligned with each client’s specific needs.
The recent development of tools like XBOW has shown that AI can significantly improve the efficiency, scale and coverage of testing. However, the real strength lies in combining AI tools with human expertise, rather than replacing it.
To get the most value out of penetration testing services, organisations need to adopt a combined approach: using AI tools alongside skilled human penetration testers. This allows teams to leverage the efficiency and scale of AI while ensuring quality of testing and a comprehensive view of your risk landscape, including complex and emerging threats.
AI tools can support and enhance penetration testing services by automating large-scale repetitive tasks, extending coverage at scale and rapidly identifying vulnerability patterns.
It’s important to remember that many vulnerabilities are unique and contextual to an organisation, and sometimes a deep understanding of individual workflows is needed to find them. Some vulnerabilities can also cause side effects in software during testing, so relying on AI in production environments with limited control is potentially dangerous.
AI is transforming the security testing landscape – and while there are clear advantages, like improved efficiency, coverage and pattern identification, there are significant limitations to consider. Without human contextual understanding and interpretation, testing can produce false positives, unsafe behaviour and other security risks.
The safest and most effective approach combines skilled human testers with AI tools and technology that enhance speed and scale.
At PGI, our experts combine deep human insight with advanced technology and tools to deliver assessments that go beyond surface-level checks. This approach ensures that our penetration testing is context-driven, evidence-based, and tailored to the unique risks of your organisation.
Get in touch with our team today to find out how we can support you with security testing and beyond.