In cybersecurity, ignoring the tools that are redefining the field is not an option. That’s why, at Pyxis, we ran proof-of-concept (POC) tests with AWS AI Security Agent as part of our offensive security activities — and we want to share what we found: how it works, what surprised us, and where its real limits are.
AWS AI Security Agent runs penetration tests through four sequential phases. This structure ensures that every finding is validated with real evidence and is reproducible:
One important differentiator: the agent can optionally analyze source code, design documentation, and architecture diagrams before starting real-time tests. This allows it to understand the application before attacking it, significantly expanding coverage depth. Each finding includes a severity rating based on CVSS v3.1, a confidence level based on successful exploitation, and detailed reproduction steps.
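To make the CVSS v3.1 severity rating concrete, here is a minimal sketch of how a base score is derived from a vector string, following the public CVSS v3.1 base-score equations. The function names (`base_score`, `severity`) are our own for illustration, and nothing here describes how the agent computes its ratings internally.

```python
import math

# Metric weights from the CVSS v3.1 specification (base metrics only).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place using the spec's integer-based method."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a vector such as 'CVSS:3.1/AV:N/AC:L/...'."""
    parts = vector.split("/")
    if parts[0].startswith("CVSS:"):
        parts = parts[1:]  # drop the version prefix
    m = dict(p.split(":") for p in parts)

    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]

    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


def severity(score: float) -> str:
    """Map a base score to the qualitative rating scale."""
    if score == 0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"
```

For example, `base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H")` evaluates to 9.8, which falls in the Critical band.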
The agent’s working model is built around the concept of a workspace. From a workspace, you create and manage individual pentests and assign access permissions to specific team members, making it easy to collaborate and control who can view or run each test.
Before starting a pentest, the agent offers several configuration options that provide precise scope control:
— Exclusion of specific paths within the target domain, avoiding out-of-scope areas.
— Exclusion of external domains the target application interacts with, to avoid unintended impact.
— Custom HTTP header configuration to identify requests generated by the agent. In our tests, we used the header pyxis: security-agent to differentiate them from legitimate traffic.
— Support for authenticated testing, allowing evaluation of functionality that requires an active session.
— Integration with source code repositories (GitHub), design documents, or architecture diagrams for white-box testing.
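As a rough illustration of what these scope controls amount to, the sketch below models path exclusions, external-domain exclusions, and the identifying header as plain Python. This is not the agent's configuration format or API: `ScopeRules`, `in_scope`, `is_agent_request`, and the example hostnames are hypothetical names we introduce here. Only the header pyxis: security-agent comes from our tests.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit

# Header we configured in our tests to tag agent-generated requests.
AGENT_HEADER_NAME = "pyxis"
AGENT_HEADER_VALUE = "security-agent"


@dataclass
class ScopeRules:
    """Hypothetical scope definition, for illustration only."""
    target_host: str
    excluded_paths: list[str] = field(default_factory=list)
    excluded_domains: list[str] = field(default_factory=list)


def in_scope(url: str, rules: ScopeRules) -> bool:
    """Return True when a URL is on the target host and not under an excluded path."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()

    # External domains the application talks to are explicitly off limits.
    for domain in rules.excluded_domains:
        if host == domain or host.endswith("." + domain):
            return False
    if host != rules.target_host:
        return False

    path = parts.path or "/"
    for excluded in rules.excluded_paths:
        prefix = excluded.rstrip("/")
        # Match the excluded path itself or anything nested below it.
        if path == prefix or path.startswith(prefix + "/"):
            return False
    return True


def is_agent_request(headers: dict[str, str]) -> bool:
    """Identify agent traffic by the custom header (header names are case-insensitive)."""
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    return lowered.get(AGENT_HEADER_NAME) == AGENT_HEADER_VALUE
```

A check like `is_agent_request` is the kind of filter a team could apply to access logs or WAF rules to separate test traffic from real users during a pentest window.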
During execution, the agent displays in real time the set of tests it’s running, allowing the team to follow progress and understand which attack vectors are being explored at any given moment.
We ran proof-of-concept tests on two web applications with completely different architectures, which allowed us to evaluate the agent’s ability to adapt across different contexts. In both cases, the agent started from a single entry point and autonomously expanded the attack surface as it discovered new endpoints.
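The idea of starting from one entry point and growing the attack surface can be sketched as a breadth-first traversal. The example below is a simplified model under our own assumptions, not a description of the agent's internals: `expand_surface` and the in-memory `SITE` map are hypothetical, and a real crawler would fetch pages and parse responses instead of reading a dictionary.

```python
from collections import deque
from typing import Callable, Iterable


def expand_surface(
    entry_point: str,
    fetch_links: Callable[[str], Iterable[str]],
    max_endpoints: int = 500,
) -> list[str]:
    """Breadth-first discovery of endpoints reachable from a single entry point."""
    seen = {entry_point}
    queue = deque([entry_point])
    discovered = []

    while queue and len(discovered) < max_endpoints:
        current = queue.popleft()
        discovered.append(current)
        for link in fetch_links(current):
            if link not in seen:  # never queue the same endpoint twice
                seen.add(link)
                queue.append(link)
    return discovered


# Hypothetical site map standing in for real HTTP responses.
SITE = {
    "/": ["/blog", "/api/members"],
    "/blog": ["/api/blog/posts", "/"],
    "/api/members": ["/api/members/1"],
}

endpoints = expand_surface("/", lambda path: SITE.get(path, []))
```

The `max_endpoints` cap is a design choice worth keeping in any real implementation, since autonomous discovery without a bound can wander far beyond the intended scope.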
The agent executed a broad and structured set of tests. The table below summarizes the categories covered:
It’s worth noting that in neither POC did we observe the agent performing network-level tasks, such as open port reconnaissance or infrastructure scanning. All tests were focused exclusively on the application layer.
Application 1 — Web Application (Node.js)
The agent discovered and evaluated 38 endpoints, including the main domain and a variety of internal APIs related to blog services, member management, forms, CRM, and analytics.
The most relevant findings from this POC:
Values highlighted in these findings included alg:none, isAdmin: true, and role: SITE_OWNER.
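The values alg:none, isAdmin: true, and role: SITE_OWNER are commonly associated with two vulnerability classes: tokens that declare no signing algorithm, and privileged fields injected into a request body (mass assignment). The finding details are not reproduced here, so the sketch below is only our assumption of the kind of defensive check involved. `ALLOWED_ALGS`, `WRITABLE_FIELDS`, and every function name are hypothetical.

```python
import base64
import json

ALLOWED_ALGS = {"HS256", "RS256"}            # assumption: algorithms the app really uses
WRITABLE_FIELDS = {"display_name", "email"}  # assumption: fields a user may edit


def encode_segment(obj: dict) -> str:
    """Encode a dict as an unpadded base64url JWT segment (demo helper)."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def jwt_header(token: str) -> dict:
    """Decode the first JWT segment without verifying anything."""
    segment = token.split(".")[0]
    padded = segment + "=" * (-len(segment) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


def has_acceptable_alg(token: str) -> bool:
    """Reject tokens whose declared algorithm is outside the allowlist.

    An allowlist also rejects 'none' and its case variants. This is a
    pre-check only; the signature must still be verified by a JWT library.
    """
    return str(jwt_header(token).get("alg", "")) in ALLOWED_ALGS


def filter_profile_update(payload: dict) -> dict:
    """Keep only user-editable fields, dropping keys such as isAdmin or role."""
    return {key: value for key, value in payload.items() if key in WRITABLE_FIELDS}


unsigned_token = encode_segment({"alg": "none", "typ": "JWT"}) + "." + encode_segment({"sub": "1"}) + "."
tampered_body = {"display_name": "Ana", "isAdmin": True, "role": "SITE_OWNER"}
```

With these definitions, `has_acceptable_alg(unsigned_token)` is false and `filter_profile_update(tampered_body)` keeps only `display_name`.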
Application 2 — Next.js Application with RESTful API
In this application, the agent started from a single endpoint and autonomously expanded scope to cover 106 endpoints, including authentication, management, and document routes.
The most significant findings from this POC:
The aspect that surprised us most was the final report. Unlike traditional automated tool reports — which typically include generic descriptions copied from vulnerability databases — AWS AI Security Agent generates descriptions tailored to the specific context of each evaluated application.
Each finding includes:
During our tests with AWS AI Security Agent, we experienced a moment that precisely illustrates the boundary between computational power and human judgment.
The agent performed a fast, thorough sweep of the web application. Within minutes, it identified the application’s structure, analyzed access policies, scanned for known vulnerability patterns, and exploited them to confirm its findings. Yet it did not flag one critical issue: a directory in the application that exposed a file containing sensitive information.
Two perspectives on the same finding
— For the agent: it was a valid, accessible route based on configured permissions.
— For our team: it was a risk affecting confidentiality that could be leveraged to gain access and damage the client’s reputation.
Without enough context about the document’s contents, the agent interpreted it as valid and legitimate information. Our team, on the other hand, understands the context, the business, and the client.
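One way a team could narrow this kind of gap is to queue every publicly reachable file for human review whenever its contents look sensitive. The sketch below is a hypothetical triage helper, not something the agent provides and not a record of how we found the file. The patterns are deliberately simple examples, and a match only means a person should look at the file.

```python
import re

# Illustrative patterns only; a real rule set would be tuned to the client.
SENSITIVE_PATTERNS = {
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "password_assignment": re.compile(
        r"(password|passwd|secret)\s*[:=]\s*\S+", re.IGNORECASE
    ),
    "email_address": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
}


def flag_for_review(body: str) -> list[str]:
    """Return the names of the patterns found in a file body, sorted alphabetically."""
    return sorted(
        name for name, pattern in SENSITIVE_PATTERNS.items() if pattern.search(body)
    )


def triage(files: dict[str, str]) -> dict[str, list[str]]:
    """Map each accessible path to its flags, keeping only paths that need review."""
    results = {}
    for path, body in files.items():
        flags = flag_for_review(body)
        if flags:
            results[path] = flags
    return results
```

Pattern matching cannot decide whether an exposure matters to the business. It only makes sure the file reaches someone who knows the client and can make that call.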
AI provides the power. We provide the judgment and the strategy.
Integrating AWS AI Security Agent changes how we scale our capabilities as a team. For applications whose software development lifecycle (SDLC) requires continuous review before each production release, AI-powered security testing enables us to:
Security testing agents are already a reality. But that doesn’t mean our team stops exploiting vulnerabilities. It means we can focus our full analytical capacity on designing more sophisticated attack strategies, exploiting more complex vulnerabilities, and building attack chains that require human context. This synergy allows for broader coverage and a significantly deeper level of analysis on every engagement.
AWS AI Security Agent operates at the application layer. The attack surfaces it covers include:
Like any tool, AWS AI Security Agent has concrete limitations that are important to understand in order to use it as a complement to your team’s work:
AWS AI Security Agent has reached general availability (GA), graduating from its preview phase. A service moving to GA in AWS generally signals that it has passed internal stability, security, and scalability validations, and that AWS considers it production-ready at scale. For an AI-powered automated pentesting tool, we also read it as a sign that adoption is growing fast enough to justify the move.
On pricing: AWS has published a pay-per-use cost structure based on pentest execution time and resources consumed, similar to other managed AWS services. This makes it an accessible option for teams that need to scale their security testing without the fixed costs of traditional pentesting tools.
For teams evaluating integrating it into their SDLC, the time to act is now: GA stability removes the uncertainty inherent to a preview and opens the door to more robust integrations within CI/CD pipelines.
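As an illustration of what a CI/CD integration could look like, the sketch below fails a pipeline step when a pentest report contains findings at or above a CVSS threshold. The report format here (a JSON list of objects with `title` and `cvss` keys) is purely our assumption for the example, as are the `gate` and `blocking_findings` names. Check the service documentation for the real export format before building on this.

```python
import json


def blocking_findings(findings: list[dict], min_cvss: float = 7.0) -> list[dict]:
    """Select findings whose CVSS score meets or exceeds the threshold."""
    return [f for f in findings if float(f.get("cvss", 0)) >= min_cvss]


def gate(report_json: str, min_cvss: float = 7.0) -> int:
    """Return a process exit code: 1 if any finding blocks the release, else 0."""
    findings = json.loads(report_json)
    blockers = blocking_findings(findings, min_cvss)
    for finding in blockers:
        print(f"BLOCKING {finding['cvss']} {finding['title']}")
    return 1 if blockers else 0


# Hypothetical report used to show the gate in action.
sample_report = json.dumps(
    [
        {"title": "Broken access control on document route", "cvss": 8.1},
        {"title": "Verbose error message", "cvss": 3.7},
    ]
)
exit_code = gate(sample_report)
```

In a pipeline, the step would call `sys.exit(gate(report_text))` so that a blocking finding stops the deployment. The default threshold of 7.0 corresponds to the High band in CVSS v3.1 and should be adjusted to each team's risk tolerance.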
Adopting new AI tools isn’t about using all of them — it’s about knowing which ones genuinely elevate the team’s work. Our experience with AWS AI Security Agent — and with other tools we’ve evaluated — confirms that pentesting has entered a new stage of maturity.
At Pyxis, AI-powered pentesting isn’t a promise or a lab experiment: it’s something we’re already applying to strengthen our clients’ applications. We understand that security in the cloud demands speed — but it also demands judgment, instinct, and context.
If your team is accelerating cloud deployments and you need security to keep pace rather than slow you down, we can help.