ZadeNor AI
Back to Blog
AI

vxcontrol/pentagi: Trending on GitHub

February 21, 2026
5 min
1,531 views
By ZadeNor AI Team
vxcontrol/pentagi: Trending on GitHub

vxcontrol/pentagi: Trending on GitHub

PentAGI

Penetration testing Artificial General Intelligence

๐Ÿš€ Join the Community! Connect with security researchers, AI enthusiasts, and fellow ethical hackers. Get support, share insights, and stay updated with the latest PentAGI developments.

โ €

๐Ÿ“– Table of Contents

Overview

Features

Quick Start

Advanced Setup

Development

Testing LLM Agents

Embedding Configuration and Testing

Function Testing with ftester

Building

Credits

License

๐ŸŽฏ Overview

PentAGI is an innovative tool for automated security testing that leverages cutting-edge artificial intelligence technologies. The project is designed for information security professionals, researchers, and enthusiasts who need a powerful and flexible solution for conducting penetration tests.

You can watch the video PentAGI overview:

โœจ Features

๐Ÿ›ก๏ธ Secure & Isolated. All operations are performed in a sandboxed Docker environment with complete isolation.

๐Ÿค– Fully Autonomous. AI-powered agent that automatically determines and executes penetration testing steps.

๐Ÿ”ฌ Professional Pentesting Tools. Built-in suite of 20+ professional security tools including nmap, metasploit, sqlmap, and more.

๐Ÿง  Smart Memory System. Long-term storage of research results and successful approaches for future use.

๐Ÿ“š Knowledge Graph Integration. Graphiti-powered knowledge graph using Neo4j for semantic relationship tracking and advanced context understanding.

๐Ÿ” Web Intelligence. Built-in browser via scraper for gathering latest information from web sources.

๐Ÿ”Ž External Search Systems. Integration with advanced search APIs including Tavily, Traversaal, Perplexity, DuckDuckGo, Google Custom Search, and Searxng for comprehensive information gathering.

๐Ÿ‘ฅ Team of Specialists. Delegation system with specialized AI agents for research, development, and infrastructure tasks.

๐Ÿ“Š Comprehensive Monitoring. Detailed logging and integration with Grafana/Prometheus for real-time system observation.

๐Ÿ“ Detailed Reporting. Generation of thorough vulnerability reports with exploitation guides.

๐Ÿ“ฆ Smart Container Management. Automatic Docker image selection based on specific task requirements.

๐Ÿ“ฑ Modern Interface. Clean and intuitive web UI for system management and monitoring.

๐Ÿ”Œ API Integration. Support for REST and GraphQL APIs for seamless external system integration.

๐Ÿ’พ Persistent Storage. All commands and outputs are stored in PostgreSQL with pgvector extension.

๐ŸŽฏ Scalable Architecture. Microservices-based design supporting horizontal scaling.

๐Ÿ  Self-Hosted Solution. Complete control over your deployment and data.

๐Ÿ”‘ Flexible Authentication. Support for various LLM providers (OpenAI, Anthropic, Ollama, AWS Bedrock, Google AI/Gemini, Deep Infra, OpenRouter, DeepSeek), Moonshot and custom configurations.

โšก Quick Deployment. Easy setup through Docker Compose with comprehensive environment configuration.

๐Ÿ—๏ธ Architecture

System Context

  flowchart TB
classDef person fill:#08427B,stroke:#073B6F,color:#fff
classDef system fill:#1168BD,stroke:#0B4884,color:#fff
classDef external fill:#666666,stroke:#0B4884,color:#fff

pentester["๐Ÿ‘ค Security Engineer
(User of the system)"]

pentagi["โœจ PentAGI
(Autonomous penetration testing system)"]

target["๐ŸŽฏ target-system
(System under test)"]
llm["๐Ÿง  llm-provider
(OpenAI/Anthropic/Ollama/Bedrock/Gemini/Custom)"]
search["๐Ÿ” search-systems
(Google/DuckDuckGo/Tavily/Traversaal/Perplexity/Searxng)"]
langfuse["๐Ÿ“Š langfuse-ui
(LLM Observability Dashboard)"]
grafana["๐Ÿ“ˆ grafana
(System Monitoring Dashboard)"]

pentester --> |Uses HTTPS| pentagi
pentester --> |Monitors AI HTTPS| langfuse
pentester --> |Monitors System HTTPS| grafana
pentagi --> |Tests Various protocols| target
pentagi --> |Queries HTTPS| llm
pentagi --> |Searches HTTPS| search
pentagi --> |Reports HTTPS| langfuse
pentagi --> |Reports HTTPS| grafana

class pentester person
class pentagi system
class target,llm,search,langfuse,grafana external

linkStyle default stroke:#ffffff,color:#ffffff

Loading

๐Ÿ”„ Container Architecture (click to expand)

  graph TB
subgraph Core Services
    UI[Frontend UI<br/>React + TypeScript]
    API[Backend API<br/>Go + GraphQL]
    DB[(Vector Store<br/>PostgreSQL + pgvector)]
    MQ[Task Queue<br/>Async Processing]
    Agent[AI Agents<br/>Multi-Agent System]
end

subgraph Knowledge Graph
    Graphiti[Graphiti<br/>Knowledge Graph API]
    Neo4j[(Neo4j<br/>Graph Database)]
end

subgraph Monitoring
    Grafana[Grafana<br/>Dashboards]
    VictoriaMetrics[VictoriaMetrics<br/>Time-series DB]
    Jaeger[Jaeger<br/>Distributed Tracing]
    Loki[Loki<br/>Log Aggregation]
    OTEL[OpenTelemetry<br/>Data Collection]
end

subgraph Analytics
    Langfuse[Langfuse<br/>LLM Analytics]
    ClickHouse[ClickHouse<br/>Analytics DB]
    Redis[Redis<br/>Cache + Rate Limiter]
    MinIO[MinIO<br/>S3 Storage]
end

subgraph Security Tools
    Scraper[Web Scraper<br/>Isolated Browser]
    PenTest[Security Tools<br/>20+ Pro Tools<br/>Sandboxed Execution]
end

UI --> |HTTP/WS| API
API --> |SQL| DB
API --> |Events| MQ
MQ --> |Tasks| Agent
Agent --> |Commands| PenTest
Agent --> |Queries| DB
Agent --> |Knowledge| Graphiti
Graphiti --> |Graph| Neo4j

API --> |Telemetry| OTEL
OTEL --> |Metrics| VictoriaMetrics
OTEL --> |Traces| Jaeger
OTEL --> |Logs| Loki

Grafana --> |Query| VictoriaMetrics
Grafana --> |Query| Jaeger
Grafana --> |Query| Loki

API --> |Analytics| Langfuse
Langfuse --> |Store| ClickHouse
Langfuse --> |Cache| Redis
Langfuse --> |Files| MinIO

classDef core fill:#f9f,stroke:#333,stroke-width:2px,color:#000
classDef knowledge fill:#ffa,stroke:#333,stroke-width:2px,color:#000
classDef monitoring fill:#bbf,stroke:#333,stroke-width:2px,color:#000
classDef analytics fill:#bfb,stroke:#333,stroke-width:2px,color:#000
classDef tools fill:#fbb,stroke:#333,stroke-width:2px,color:#000

class UI,API,DB,MQ,Agent core
class Graphiti,Neo4j knowledge
class Grafana,VictoriaMetrics,Jaeger,Loki,OTEL monitoring
class Langfuse,ClickHouse,Redis,MinIO analytics
class Scraper,PenTest tools

Loading

๐Ÿ“Š Entity Relationship (click to expand)

  erDiagram
Flow ||--o{ Task : contains
Task ||--o{ SubTask : contains
SubTask ||--o{ Action : contains
Action ||--o{ Artifact : produces
Action ||--o{ Memory : stores

Flow {
    string id PK
    string name "Flow name"
    string description "Flow description"
    string status "active/completed/failed"
    json parameters "Flow parameters"
    timestamp created_at
    timestamp updated_at
}

Task {
    string id PK
    string flow_id FK
    string name "Task name"
    string description "Task description"
    string status "pending/running/done/failed"
    json result "Task results"
    timestamp created_at
    timestamp updated_at
}

SubTask {
    string id PK
    string task_id FK
    string name "Subtask name"
    string description "Subtask description"
    string status "queued/running/completed/failed"
    string agent_type "researcher/developer/executor"
    json context "Agent context"
    timestamp created_at
    timestamp updated_at
}

Action {
    string id PK
    string subtask_id FK
    string type "command/search/analyze/etc"
    string status "success/failure"
    json parameters "Action parameters"
    json result "Action results"
    timestamp created_at
}

Artifact {
    string id PK
    string action_id FK
    string type "file/report/log"
    string path "Storage path"
    json metadata "Additional info"
    timestamp created_at
}

Memory {
    string id PK
    string action_id FK
    string type "observation/conclusion"
    vector embedding "Vector representation"
    text content "Memory content"
    timestamp created_at
}

Loading

๐Ÿค– Agent Interaction (click to expand)

  sequenceDiagram
participant O as Orchestrator
participant R as Researcher
participant D as Developer
participant E as Executor
participant VS as Vector Store
participant KB as Knowledge Base

Note over O,KB: Flow Initialization
O->>VS: Query similar tasks
VS-->>O: Return experiences
O->>KB: Load relevant knowledge
KB-->>O: Return context

Note over O,R: Research Phase
O->>R: Analyze target
R->>VS: Search similar cases
VS-->>R: Return patterns
R->>KB: Query vulnerabilities
KB-->>R: Return known issues
R->>VS: Store findings
R-->>O: Research results

Note over O,D: Planning Phase
O->>D: Plan attack
D->>VS: Query exploits
VS-->>D: Return techniques
D->>KB: Load tools info
KB-->>D: Return capabilities
D-->>O: Attack plan

Note over O,E: Execution Phase
O->>E: Execute plan
E->>KB: Load tool guides
KB-->>E: Return procedures
E->>VS: Store results
E-->>O: Execution status

Loading

๐Ÿง  Memory System (click to expand)

  graph TB
subgraph "Long-term Memory"
    VS[(Vector Store<br/>Embeddings DB)]
    KB[Knowledge Base<br/>Domain Expertise]
    Tools[Tools Knowledge<br/>Usage Patterns]
end

subgraph "Working Memory"
    Context[Current Context<br/>Task State]
    Goals[Active Goals<br/>Objectives]
    State[System State<br/>Resources]
end

subgraph "Episodic Memory"
    Actions[Past Actions<br/>Commands History]
    Results[Action Results<br/>Outcomes]
    Patterns[Success Patterns<br/>Best Practices]
end

Context --> |Query| VS
VS --> |Retrieve| Context

Goals --> |Consult| KB
KB --> |Guide| Goals

State --> |Record| Actions
Actions --> |Learn| Patterns
Patterns --> |Store| VS

Tools --> |Inform| State
Results --> |Update| Tools

VS --> |Enhance| KB
KB --> |Index| VS

classDef ltm fill:#f9f,stroke:#333,stroke-width:2px,color:#000
classDef wm fill:#bbf,stroke:#333,stroke-width:2px,color:#000
classDef em fill:#bfb,stroke:#333,stroke-width:2px,color:#000

class VS,KB,Tools ltm
class Context,Goals,State wm
class Actions,Results,Patterns em

Loading

๐Ÿ”„ Chain Summarization (click to expand) The chain summarization system manages conversation context growth by selectively summarizing older messages. This is critical for preventing token limits from being exceeded while maintaining conversation coherence.

  flowchart TD
A[Input Chain] --> B{Needs Summarization?}
B -->|No| C[Return Original Chain]
B -->|Yes| D[Convert to ChainAST]
D --> E[Apply Section Summarization]
E --> F[Process Oversized Pairs]
F --> G[Manage Last Section Size]
G --> H[Apply QA Summarization]
H --> I[Rebuild Chain with Summaries]
I --> J{Is New Chain Smaller?}
J -->|Yes| K[Return Optimized Chain]
J -->|No| C

classDef process fill:#bbf,stroke:#333,stroke-width:2px,color:#000
classDef decision fill:#bfb,stroke:#333,stroke-width:2px,color:#000
classDef output fill:#fbb,stroke:#333,stroke-width:2px,color:#000

class A,D,E,F,G,H,I process
class B,J decision
class C,K output

Loading

The algorithm operates on a structured representation of conversation chains (ChainAST) that preserves message types including tool calls and their responses. All summarization operations maintain critical conversation flow while reducing context size.

Global Summarizer Configuration Options

Parameter Environment Variable Default Description

Preserve Last SUMMARIZER_PRESERVE_LAST true Whether to keep all messages in the last section intact

Use QA Pairs SUMMARIZER_USE_QA true Whether to use QA pair summarization strategy

Summarize Human in QA SUMMARIZER_SUM_MSG_HUMAN_IN_QA false Whether to summarize human messages in QA pairs

Last Section Size SUMMARIZER_LAST_SEC_BYTES 51200 Maximum byte size for last section (50KB)

Max Body Pair Size SUMMARIZER_MAX_BP_BYTES 16384 Maximum byte size for a single body pair (16KB)

Max QA Sections SUMMARIZER_MAX_QA_SECTIONS 10 Maximum QA pair sections to preserve

Max QA Size SUMMARIZER_MAX_QA_BYTES 65536 Maximum byte size for QA pair sections (64KB)

Keep QA Sections SUMMARIZER_KEEP_QA_SECTIONS 1 Number of recent QA sections to keep without summarization

Assistant Summarizer Configuration Options

Assistant instances can use customized summarization settings to fine-tune context management behavior:

Parameter Environment Variable Default Description

Preserve Last ASSISTANT_SUMMARIZER_PRESERVE_LAST true Whether to preserve all messages in the assistant's last section

Last Section Size ASSISTANT_SUMMARIZER_LAST_SEC_BYTES 76800 Maximum byte size for assistant's last section (75KB)

Max Body Pair Size ASSISTANT_SUMMARIZER_MAX_BP_BYTES 16384 Maximum byte size for a single body pair in assistant context (16KB)

Max QA Sections ASSISTANT_SUMMARIZER_MAX_QA_SECTIONS 7 Maximum QA sections to preserve in assistant context

Max QA Size ASSISTANT_SUMMARIZER_MAX_QA_BYTES 76800 Maximum byte size for assistant's QA sections (75KB)

Keep QA Sections ASSISTANT_SUMMARIZER_KEEP_QA_SECTIONS 3 Number of recent QA sections to preserve without summarization

The assistant summarizer configuration provides more memory for context retention compared to the global settings, preserving more recent conversation history while still ensuring efficient token usage.

Summarizer Environment Configuration

Default values for global summarizer logic

SUMMARIZER_PRESERVE_LAST=true SUMMARIZER_USE_QA=true SUMMARIZER_SUM_MSG_HUMAN_IN_QA=false SUMMARIZER_LAST_SEC_BYTES=51200 SUMMARIZER_MAX_BP_BYTES=16384 SUMMARIZER_MAX_QA_SECTIONS=10 SUMMARIZER_MAX_QA_BYTES=65536 SUMMARIZER_KEEP_QA_SECTIONS=1

Default values for assistant summarizer logic

ASSISTANT_SUMMARIZER_PRESERVE_LAST=true ASSISTANT_SUMMARIZER_LAST_SEC_BYTES=76800 ASSISTANT_SUMMARIZER_MAX_BP_BYTES=16384 ASSISTANT_SUMMARIZER_MAX_QA_SECTIONS=7 ASSISTANT_SUMMARIZER_MAX_QA_BYTES=76800 ASSISTANT_SUMMARIZER_KEEP_QA_SECTIONS=3

The architecture of PentAGI is designed to be modular, scalable, and secure. Here are the key components:

Core Services

Frontend UI: React-based web interface with TypeScript for type safety

Backend API: Go-based REST and GraphQL APIs for flexible integration

Vector Store: PostgreSQL with pgvector for semantic search and memory storage

Task Queue: Async task processing system for reliable operation

AI Agent: Multi-agent system with specialized roles for efficient testing

Knowledge Graph

Graphiti: Knowledge graph API for semantic relationship tracking and contextual understanding

Neo4j: Graph database for storing and querying relationships between entities, actions, and outcomes

Automatic capturing of agent responses and tool executions for building comprehensive knowledge base

Monitoring Stack

OpenTelemetry: Unified observability data collection and correlation

Grafana: Real-time visualization and alerting dashboards

VictoriaMetrics: High-performance time-series metrics storage

Jaeger: End-to-end distributed tracing for debugging

Loki: Scalable log aggregation and analysis

Analytics Platform

Langfuse: Advanced LLM observability and performance analytics

ClickHouse: Column-oriented analytics data warehouse

Redis: High-speed caching and rate limiting

MinIO: S3-compatible object storage for artifacts

Security Tools

Web Scraper: Isolated browser environment for safe web interaction

Pentesting Tools: Comprehensive suite of 20+ professional security tools

Sandboxed Execution: All operations run in isolated containers

Memory Systems

Long-term Memory: Persistent storage of knowledge and experiences

Working Memory: Active context and goals for current operations

Episodic Memory: Historical actions and success patterns

Knowledge Base: Structured domain expertise and tool capabilities

Context Management: Intelligently manages growing LLM context windows using chain summarization

The system uses Docker containers for isolation and easy deployment, with separate networks for core services, monitoring, and analytics to ensure proper security boundaries. Each component is designed to scale horizontally and can be configured for high availability in production environments.

๐Ÿš€ Quick Start

System Requirements

Docker and Docker Compose

Minimum 2 vCPU

Minimum 4GB RAM

20GB free disk space

Internet access for downloading images and updates

Using Installer (Recommended)

PentAGI provides an interactive installer with a terminal-based UI for streamlined configuration and deployment. The installer guides you through system checks, LLM provider setup, search engine configuration, and security hardening.

Supported Platforms:

Linux: amd64 download | arm64 download

Windows: amd64 download

macOS: amd64 (Intel) download | arm64 (M-series) download

Quick Installation (Linux amd64):

Create installation directory

mkdir -p pentagi && cd pentagi

Download installer

wget -O installer.zip https://pentagi.com/downloads/linux/amd64/installer-latest.zip

Extract

unzip installer.zip

Run interactive installer

./installer

Prerequisites & Permissions:

The installer requires appropriate privileges to interact with the Docker API for proper operation. By default, it uses the Docker socket (/var/run/docker.sock) which requires either:

Option 1 (Recommended for production): Run the installer as root:

sudo ./installer

Option 2 (Development environments): Grant your user access to the Docker socket by adding them to the docker group:

Add your user to the docker group

sudo usermod -aG docker $USER

Log out and log back in, or activate the group immediately

newgrp docker

Verify Docker access (should run without sudo)

docker ps

โš ๏ธ Security Note: Adding a user to the docker group grants root-equivalent privileges. Only do this for trusted users in controlled environments. For production deployments, consider using rootless Docker mode or running the installer with sudo.

The installer will:

System Checks: Verify Docker, network connectivity, and system requirements

Environment Setup: Create and configure .env file with optimal defaults

Provider Configuration: Set up LLM providers (OpenAI, Anthropic, Gemini, Bedrock, Ollama, Custom)

Search Engines: Configure DuckDuckGo, Google, Tavily, Traversaal, Perplexity, Searxng

Security Hardening: Generate secure credentials and configure SSL certificates

Deployment: Start PentAGI with docker-compose

For Production & Enhanced Security:

For production deployments or security-sensitive environments, we strongly recommend using a distributed two-node architecture where worker operations are isolated on a separate server. This prevents untrusted code execution and network access issues on your main system.

๐Ÿ‘‰ See detailed guide: Worker Node Setup

The two-node setup provides:

Isolated Execution: Worker containers run on dedicated hardware

Network Isolation: Separate network boundaries for penetration testing

Security Boundaries: Docker-in-Docker with TLS authentication

OOB Attack Support: Dedicated port ranges for out-of-band techniques

Manual Installation

Create a working directory or clone the repository:

mkdir pentagi && cd pentagi

Copy .env.example to .env or download it:

curl -o .env https://raw.githubusercontent.com/vxcontrol/pentagi/master/.env.example

Touch examples files (example.custom.provider.yml, example.ollama.provider.yml) or download it:

curl -o example.custom.provider.yml https://raw.githubusercontent.com/vxcontrol/pentagi/master/examples/configs/custom-openai.provider.yml curl -o example.ollama.provider.yml https://raw.githubusercontent.com/vxcontrol/pentagi/master/examples/configs/ollama-llama318b.provider.yml

Fill in the required API keys in .env file.

Required: At least one of these LLM providers

OPEN_AI_KEY=your_openai_key ANTHROPIC_API_KEY=your_anthropic_key GEMINI_API_KEY=your_gemini_key

Optional: AWS Bedrock provider (enterprise-grade models)

BEDROCK_REGION=us-east-1 BEDROCK_ACCESS_KEY_ID=your_aws_access_key BEDROCK_SECRET_ACCESS_KEY=your_aws_secret_key

Optional: Local LLM provider (zero-cost inference)

OLLAMA_SERVER_URL=http://localhost:11434 OLLAMA_SERVER_MODEL=your_model_name

Optional: Additional search capabilities

DUCKDUCKGO_ENABLED=true GOOGLE_API_KEY=your_google_key GOOGLE_CX_KEY=your_google_cx TAVILY_API_KEY=your_tavily_key TRAVERSAAL_API_KEY=your_traversaal_key PERPLEXITY_API_KEY=your_perplexity_key PERPLEXITY_MODEL=sonar-pro PERPLEXITY_CONTEXT_SIZE=medium

Searxng meta search engine (aggregates results from multiple sources)

SEARXNG_URL=http://your-searxng-instance:8080 SEARXNG_CATEGORIES=general SEARXNG_LANGUAGE= SEARXNG_SAFESEARCH=0 SEARXNG_TIME_RANGE=

Graphiti knowledge graph settings

GRAPHITI_ENABLED=true GRAPHITI_TIMEOUT=30 GRAPHITI_URL=http://graphiti:8000 GRAPHITI_MODEL_NAME=gpt-5-mini

Neo4j settings (used by Graphiti stack)

NEO4J_USER=neo4j NEO4J_DATABASE=neo4j NEO4J_PASSWORD=devpassword NEO4J_URI=bolt://neo4j:7687

Assistant configuration

ASSISTANT_USE_AGENTS=false # Default value for agent usage when creating new assistants

Change all security related environment variables in .env file to improve security.

Security related environment variables

Main Security Settings

COOKIE_SIGNING_SALT - Salt for cookie signing, change to random value

PUBLIC_URL - Public URL of your server (eg. https://pentagi.example.com)

SERVER_SSL_CRT and SERVER_SSL_KEY - Custom paths to your existing SSL certificate and key for HTTPS (these paths should be used in the docker-compose.yml file to mount as volumes)

Scraper Access

SCRAPER_PUBLIC_URL - Public URL for scraper if you want to use different scraper server for public URLs

SCRAPER_PRIVATE_URL - Private URL for scraper (local scraper server in docker-compose.yml file to access it to local URLs)

Access Credentials

PENTAGI_POSTGRES_USER and PENTAGI_POSTGRES_PASSWORD - PostgreSQL credentials

NEO4J_USER and NEO4J_PASSWORD - Neo4j credentials (for Graphiti knowledge graph)

Remove all inline comments from .env file if you want to use it in VSCode or other IDEs as a envFile option:

perl -i -pe 's/\s+#.*$//' .env

Run the PentAGI stack:

curl -O https://raw.githubusercontent.com/vxcontrol/pentagi/master/docker-compose.yml docker compose up -d

Visit localhost:8443 to access PentAGI Web UI (default is [emailย protected] / admin)

Note

If you caught an error about pentagi-network or observability-network or langfuse-network you need to run docker-compose.yml firstly to create these networks and after that run docker-compose-langfuse.yml, docker-compose-graphiti.yml, and docker-compose-observability.yml to use Langfuse, Graphiti, and Observability services.

You have to set at least one Language Model provider (OpenAI, Anthropic, Gemini, AWS Bedrock, or Ollama) to use PentAGI. AWS Bedrock provides enterprise-grade access to multiple foundation models from leading AI companies, while Ollama provides zero-cost local inference if you have sufficient computational resources. Additional API keys for search engines are optional but recommended for better results.

LLM_SERVER_* environment variables are experimental feature and will be changed in the future. Right now you can use them to specify custom LLM server URL and one model for all agent types.

PROXY_URL is a global proxy URL for all LLM providers and external search systems. You can use it for isolation from external networks.

The docker-compose.yml file runs the PentAGI service as root user because it needs access to docker.sock for container management. If you're using TCP/IP network connection to Docker instead of socket file, you can remove root privileges and use the default pentagi user for better security.

Assistant Configuration

PentAGI allows you to configure default behavior for assistants:

Variable Default Description

ASSISTANT_USE_AGENTS false Controls the default value for agent usage when creating new assistants

The ASSISTANT_USE_AGENTS setting affects the initial state of the "Use Agents" toggle when creating a new assistant in the UI:

false (default): New assistants are created with agent delegation disabled by default

true: New assistants are created with agent delegation enabled by default

Note that users can always override this setting by toggling the "Use Agents" button in the UI when creating or editing an assistant. This environment variable only controls the initial default state.

Custom LLM Provider Configuration

When using custom LLM providers with the LLM_SERVER_* variables, you can fine-tune the reasoning format used in requests:

Variable Default Description

LLM_SERVER_URL

Base URL for the custom LLM API endpoint

LLM_SERVER_KEY

API key for the custom LLM provider

LLM_SERVER_MODEL

Default model to use (can be overridden in provider config)

LLM_SERVER_CONFIG_PATH

Path to the YAML configuration file for agent-specific models

LLM_SERVER_PROVIDER

Provider name prefix for model names (e.g., openrouter, deepseek for LiteLLM proxy)

LLM_SERVER_LEGACY_REASONING false Controls reasoning format in API requests

LLM_SERVER_PRESERVE_REASONING false Preserve reasoning content in multi-turn conversations (required by some providers)

The LLM_SERVER_PROVIDER setting is particularly useful when using LiteLLM proxy, which adds a provider prefix to model names. For example, when connecting to Moonshot API through LiteLLM, models like kimi-2.5 become moonshot/kimi-2.5. By setting LLM_SERVER_PROVIDER=moonshot, you can use the same provider configuration file for both direct API access and LiteLLM proxy access without modifications.

The LLM_SERVER_LEGACY_REASONING setting affects how reasoning parameters are sent to the LLM:

false (default): Uses modern format where reasoning is sent as a structured object with max_tokens parameter

true: Uses legacy format with string-based reasoning_effort parameter

This setting is important when working with different LLM providers as they may expect different reasoning formats in their API requests. If you encounter reasoning-related errors with custom providers, try changing this setting.

The LLM_SERVER_PRESERVE_REASONING setting controls whether reasoning content is preserved in multi-turn conversations:

false (default): Reasoning content is not preserved in conversation history

true: Reasoning content is preserved and sent in subsequent API calls

This setting is required by some LLM providers (e.g., Moonshot) that return errors like "thinking is enabled but reasoning_content is missing in assistant tool call message" when reasoning content is not included in multi-turn conversations. Enable this setting if your provider requires reasoning content to be preserved.

Local LLM Provider Configuration

PentAGI supports Ollama for local LLM inference, providing zero-cost operation and enhanced privacy:

Variable Default Description

OLLAMA_SERVER_URL

URL of your Ollama server

OLLAMA_SERVER_MODEL llama3.1:8b-instruct-q8_0 Default model for inference

OLLAMA_SERVER_CONFIG_PATH

Path to custom agent configuration file

OLLAMA_SERVER_PULL_MODELS_TIMEOUT 600 Timeout for model downloads (seconds)

OLLAMA_SERVER_PULL_MODELS_ENABLED false Auto-download models on startup

OLLAMA_SERVER_LOAD_MODELS_ENABLED false Query server for available models

Configuration examples:

Basic Ollama setup with default model

OLLAMA_SERVER_URL=http://localhost:11434 OLLAMA_SERVER_MODEL=llama3.1:8b-instruct-q8_0

Production setup with auto-pull and model discovery

OLLAMA_SERVER_URL=http://ollama-server:11434 OLLAMA_SERVER_PULL_MODELS_ENABLED=true OLLAMA_SERVER_PULL_MODELS_TIMEOUT=900 OLLAMA_SERVER_LOAD_MODELS_ENABLED=true

Custom configuration with agent-specific models

OLLAMA_SERVER_CONFIG_PATH=/path/to/ollama-config.yml

Default configuration file inside docker container

OLLAMA_SERVER_CONFIG_PATH=/opt/pentagi/conf/ollama-llama318b.provider.yml

Performance Considerations:

Model Discovery (OLLAMA_SERVER_LOAD_MODELS_ENABLED=true): Adds 1-2s startup latency querying Ollama API

Auto-pull (OLLAMA_SERVER_PULL_MODELS_ENABLED=true): First startup may take several minutes downloading models

Pull timeout (OLLAMA_SERVER_PULL_MODELS_TIMEOUT=900): 15 minutes in seconds

Static Config: Disable both flags and specify models in config file for fastest startup

Creating Custom Ollama Models with Extended Context

PentAGI requires models with larger context windows than the default Ollama configurations. You need to create custom models with increased num_ctx parameter through Modelfiles. While typical agent workflows consume around 64K tokens, PentAGI uses 110K context size for safety margin and handling complex penetration testing scenarios.

Important: The num_ctx parameter can only be set during model creation via Modelfile - it cannot be changed after model creation or overridden at runtime.

Example: Qwen3 32B FP16 with Extended Context

Create a Modelfile named Modelfile_qwen3_32b_fp16_tc:

FROM qwen3:32b-fp16 PARAMETER num_ctx 110000 PARAMETER temperature 0.3 PARAMETER top_p 0.8 PARAMETER min_p 0.0 PARAMETER top_k 20 PARAMETER repeat_penalty 1.1

Build the custom model:

ollama create qwen3:32b-fp16-tc -f Modelfile_qwen3_32b_fp16_tc

Example: QwQ 32B FP16 with Extended Context

Create a Modelfile named Modelfile_qwq_32b_fp16_tc:

FROM qwq:32b-fp16 PARAMETER num_ctx 110000 PARAMETER temperature 0.2 PARAMETER top_p 0.7 PARAMETER min_p 0.0 PARAMETER top_k 40 PARAMETER repeat_penalty 1.2

Build the custom model:

ollama create qwq:32b-fp16-tc -f Modelfile_qwq_32b_fp16_tc

Note: The QwQ 32B FP16 model requires approximately 71.3 GB VRAM for inference. Ensure your system has sufficient GPU memory before attempting to use this model.

These custom models are referenced in the pre-built provider configuration files (ollama-qwen332b-fp16-tc.provider.yml and ollama-qwq32b-fp16-tc.provider.yml) that are included in the Docker image at /opt/pentagi/conf/.

OpenAI Provider Configuration

PentAGI supports OpenAI's advanced language models, including the latest reasoning-capable o-series models designed for complex analytical tasks:

Variable Default Description

OPEN_AI_KEY

API key for OpenAI services

OPEN_AI_SERVER_URL https://api.openai.com/v1 OpenAI API endpoint

Configuration examples:

Basic OpenAI setup

OPEN_AI_KEY=your_openai_api_key OPEN_AI_SERVER_URL=https://api.openai.com/v1

Using with proxy for enhanced security

OPEN_AI_KEY=your_openai_api_key PROXY_URL=http://your-proxy:8080

The OpenAI provider offers cutting-edge capabilities including:

Reasoning Models: Advanced o-series models (o1, o3, o4-mini) with step-by-step analytical thinking

Latest GPT-4.1 Series: Flagship models optimized for complex security research and exploit development

Cost-Effective Options: From nano models for high-volume scanning to powerful reasoning models for deep analysis

Versatile Performance: Fast, intelligent models perfect for multi-step security analysis and penetration testing

Proven Reliability: Industry-leading models with consistent performance across diverse security scenarios

The system automatically selects appropriate OpenAI models based on task complexity, optimizing for both performance and cost-effectiveness.

Anthropic Provider Configuration

PentAGI integrates with Anthropic's Claude models, known for their exceptional safety, reasoning capabilities, and sophisticated understanding of complex security contexts:

Variable Default Description

ANTHROPIC_API_KEY

API key for Anthropic services

ANTHROPIC_SERVER_URL https://api.anthropic.com/v1 Anthropic API endpoint

Configuration examples:

Basic Anthropic setup

ANTHROPIC_API_KEY=your_anthropic_api_key ANTHROPIC_SERVER_URL=https://api.anthropic.com/v1

Using with proxy for secure environments

ANTHROPIC_API_KEY=your_anthropic_api_key PROXY_URL=http://your-proxy:8080

The Anthropic provider delivers superior capabilities including:

Advanced Reasoning: Claude 4 series with exceptional reasoning for sophisticated penetration testing

Extended Thinking: Claude 3.7 with step-by-step thinking capabilities for methodical security research

High-Speed Performance: Claude 3.5 Haiku for blazing-fast vulnerability scans and real-time monitoring

Comprehensive Analysis: Claude Sonnet models for complex security analysis and threat hunting

Safety-First Design: Built-in safety mechanisms ensuring responsible security testing practices

The system leverages Claude's advanced understanding of security contexts to provide thorough and responsible penetration testing guidance.

Google AI (Gemini) Provider Configuration

PentAGI supports Google's Gemini models through the Google AI API, offering state-of-the-art reasoning capabilities and multimodal features:

Variable Default Description

GEMINI_API_KEY

API key for Google AI services

GEMINI_SERVER_URL https://generativelanguage.googleapis.com Google AI API endpoint

Configuration examples:

Basic Gemini setup

GEMINI_API_KEY=your_gemini_api_key GEMINI_SERVER_URL=https://generativelanguage.googleapis.com

Using with proxy

GEMINI_API_KEY=your_gemini_api_key PROXY_URL=http://your-proxy:8080

The Gemini provider offers advanced features including:

Thinking Capabilities: Advanced reasoning models (Gemini 2.5 series) with step-by-step analysis

Multimodal Support: Text and image processing for comprehensive security assessments

Large Context Windows: Up to 2M tokens for analyzing extensive codebases and documentation

Cost-Effective Options: From high-performance pro models to economical flash variants

Security-Focused Models: Specialized configurations optimized for penetration testing workflows

The system automatically selects appropriate Gemini models based on agent requirements, balancing performance, capabilities, and cost-effectiveness.

AWS Bedrock Provider Configuration

PentAGI integrates with Amazon Bedrock, offering access to a wide range of foundation models from leading AI companies including Anthropic, AI21, Cohere, Meta, and Amazon's own models:

Variable Default Description

BEDROCK_REGION us-east-1 AWS region for Bedrock service

BEDROCK_ACCESS_KEY_ID

AWS access key ID for authentication

BEDROCK_SECRET_ACCESS_KEY

AWS secret access key for authentication

BEDROCK_SESSION_TOKEN

AWS session token as alternative way for authentication

BEDROCK_SERVER_URL

Optional custom Bedrock endpoint URL

Configuration examples:

Basic AWS Bedrock setup with credentials

BEDROCK_REGION=us-east-1 BEDROCK_ACCESS_KEY_ID=your_aws_access_key BEDROCK_SECRET_ACCESS_KEY=your_aws_secret_key

Using with proxy for enhanced security

BEDROCK_REGION=us-east-1 BEDROCK_ACCESS_KEY_ID=your_aws_access_key BEDROCK_SECRET_ACCESS_KEY=your_aws_secret_key PROXY_URL=http://your-proxy:8080

Using custom endpoint (for VPC endpoints or testing)

BEDROCK_REGION=us-east-1 BEDROCK_ACCESS_KEY_ID=your_aws_access_key BEDROCK_SECRET_ACCESS_KEY=your_aws_secret_key BEDROCK_SERVER_URL=https://bedrock-runtime.us-east-1.amazonaws.com

Important

AWS Bedrock Rate Limits Warning

The default PentAGI configuration for AWS Bedrock uses two primary models:

us.anthropic.claude-sonnet-4-20250514-v1:0 (for most agents) - 2 requests per minute for new AWS accounts

us.anthropic.claude-3-5-haiku-20241022-v1:0 (for simple tasks) - 20 requests per minute for new AWS accounts

These default rate limits are extremely restrictive for comfortable penetration testing scenarios and will significantly impact your workflow. We strongly recommend:

Request quota increases for your AWS Bedrock models through the AWS Service Quotas console

Use provisioned throughput models with hourly billing for higher throughput requirements

Switch to alternative models with higher default quotas (e.g., Amazon Nova series, Meta Llama models)

Consider using a different LLM provider (OpenAI, Anthropic, Gemini) if you need immediate high-throughput access

Without adequate rate limits, you may experience frequent delays, timeouts, and degraded testing performance.

The AWS Bedrock provider delivers comprehensive capabilities including:

Multi-Provider Access: Access to models from Anthropic (Claude), AI21 (Jamba), Cohere (Command), Meta (Llama), Amazon (Nova, Titan), and DeepSeek (R1) through a single interface

Advanced Reasoning: Support for Claude 4 and other reasoning-capable models with step-by-step thinking

Multimodal Models: Amazon Nova series supporting text, image, and video processing for comprehensive security analysis

Enterprise Security: AWS-native security controls, VPC integration, and compliance certifications

Cost Optimization: Wide range of model sizes and capabilities for cost-effective penetration testing

Regional Availability: Deploy models in your preferred AWS region for data residency and performance

High Performance: Low-latency inference through AWS's global infrastructure

The system automatically selects appropriate Bedrock models based on task complexity and requirements, leveraging the full spectrum of available foundation models for optimal security testing results.

Warning

Converse API Requirements

PentAGI uses the Amazon Bedrock Converse API for model interactions, which requires models to support the following features:

โœ… Converse - Basic conversation API support

โœ… ConverseStream - Streaming response support

โœ… Tool use - Function calling capabilities for penetration testing tools

โœ… Streaming tool use - Real-time tool execution feedback

Before selecting models, verify their feature support at: Supported models and model features

โš ๏ธ Important: Some models like AI21 Jurassic-2 and Cohere Command (Text) have limited chat support and may not work properly with PentAGI's multi-turn conversation workflows.

Note: AWS credentials can also be provided through IAM roles, environment variables, or AWS credential files following standard AWS SDK authentication patterns. Ensure your AWS account has appropriate permissions for Amazon Bedrock service access.

For advanced configuration options and detailed setup instructions, please visit our documentation.

๐Ÿ”ง Advanced Setup

Langfuse Integration

Langfuse provides advanced capabilities for monitoring and analyzing AI agent operations.

Configure Langfuse environment variables in existing .env file.

Langfuse valuable environment variables

Database Credentials

LANGFUSE_POSTGRES_USER and LANGFUSE_POSTGRES_PASSWORD - Langfuse PostgreSQL credentials

LANGFUSE_CLICKHOUSE_USER and LANGFUSE_CLICKHOUSE_PASSWORD - ClickHouse credentials

LANGFUSE_REDIS_AUTH - Redis password

Encryption and Security Keys

LANGFUSE_SALT - Salt for hashing in Langfuse Web UI

LANGFUSE_ENCRYPTION_KEY - Encryption key (32 bytes in hex)

LANGFUSE_NEXTAUTH_SECRET - Secret key for NextAuth

Admin Credentials

LANGFUSE_INIT_USER_EMAIL - Admin email

LANGFUSE_INIT_USER_PASSWORD - Admin password

LANGFUSE_INIT_USER_NAME - Admin username

API Keys and Tokens

LANGFUSE_INIT_PROJECT_PUBLIC_KEY - Project public key (used from PentAGI side too)

LANGFUSE_INIT_PROJECT_SECRET_KEY - Project secret key (used from PentAGI side too)

S3 Storage

LANGFUSE_S3_ACCESS_KEY_ID - S3 access key ID

LANGFUSE_S3_SECRET_ACCESS_KEY - S3 secret access key

Enable integration with Langfuse for PentAGI service in .env file.

LANGFUSE_BASE_URL=http://langfuse-web:3000 LANGFUSE_PROJECT_ID= # default: value from ${LANGFUSE_INIT_PROJECT_ID} LANGFUSE_PUBLIC_KEY= # default: value from ${LANGFUSE_INIT_PROJECT_PUBLIC_KEY} LANGFUSE_SECRET_KEY= # default: value from ${LANGFUSE_INIT_PROJECT_SECRET_KEY}

Run the Langfuse stack:

curl -O https://raw.githubusercontent.com/vxcontrol/pentagi/master/docker-compose-langfuse.yml docker compose -f docker-compose.yml -f docker-compose-langfuse.yml up -d

Visit localhost:4000 to access Langfuse Web UI with credentials from .env file:

LANGFUSE_INIT_USER_EMAIL - Admin email

LANGFUSE_INIT_USER_PASSWORD - Admin password

Monitoring and Observability

For detailed system operation tracking, integration with monitoring tools is available.

Enable integration with OpenTelemetry and all observability services for PentAGI in .env file.

OTEL_HOST=otelcol:8148

Run the observability stack:

curl -O https://raw.githubusercontent.com/vxcontrol/pentagi/master/docker-compose-observability.yml docker compose -f docker-compose.yml -f docker-compose-observability.yml up -d

Visit localhost:3000 to access Grafana Web UI.

Note

If you want to use Observability stack with Langfuse, you need to enable integration in .env file to set LANGFUSE_OTEL_EXPORTER_OTLP_ENDPOINT to http://otelcol:4318.

To run all available stacks together (Langfuse, Graphiti, and Observability):

docker compose -f docker-compose.yml -f docker-compose-langfuse.yml -f docker-compose-graphiti.yml -f docker-compose-observability.yml up -d

You can also register aliases for these commands in your shell to run it faster:

alias pentagi="docker compose -f docker-compose.yml -f docker-compose-langfuse.yml -f docker-compose-graphiti.yml -f docker-compose-observability.yml" alias pentagi-up="docker compose -f docker-compose.yml -f docker-compose-langfuse.yml -f docker-compose-graphiti.yml -f docker-compose-observability.yml up -d" alias pentagi-down="docker compose -f docker-compose.yml -f docker-compose-langfuse.yml -f docker-compose-graphiti.yml -f docker-compose-observability.yml down"

Knowledge Graph Integration (Graphiti)

PentAGI integrates with Graphiti, a temporal knowledge graph system powered by Neo4j, to provide advanced semantic understanding and relationship tracking for AI agent operations. The vxcontrol fork provides custom entity and edge types that are specific to pentesting purposes.

What is Graphiti?

Graphiti automatically extracts and stores structured knowledge from agent interactions, building a graph of entities, relationships, and temporal context. This enables:

Semantic Memory: Store and recall relationships between tools, targets, vulnerabilities, and techniques

Contextual Understanding: Track how different pentesting actions relate to each other over time

Knowledge Reuse: Learn from past penetration tests and apply insights to new assessments

Advanced Querying: Search for complex patterns like "What tools were effective against similar targets?"

Enabling Graphiti

The Graphiti knowledge graph is optional and disabled by default. To enable it:

Configure Graphiti environment variables in .env file:

Graphiti knowledge graph settings

GRAPHITI_ENABLED=true GRAPHITI_TIMEOUT=30 GRAPHITI_URL=http://graphiti:8000 GRAPHITI_MODEL_NAME=gpt-5-mini

Neo4j settings (used by Graphiti stack)

NEO4J_USER=neo4j NEO4J_DATABASE=neo4j NEO4J_PASSWORD=devpassword NEO4J_URI=bolt://neo4j:7687

OpenAI API key (required by Graphiti for entity extraction)

OPEN_AI_KEY=your_openai_api_key

Run the Graphiti stack along with the main PentAGI services:

Download the Graphiti compose file if needed

curl -O https://raw.githubusercontent.com/vxcontrol/pentagi/master/docker-compose-graphiti.yml

Start PentAGI with Graphiti

docker compose -f docker-compose.yml -f docker-compose-graphiti.yml up -d

Verify Graphiti is running:

Check service health

docker compose -f docker-compose.yml -f docker-compose-graphiti.yml ps graphiti neo4j

View Graphiti logs

docker compose -f docker-compose.yml -f docker-compose-graphiti.yml logs -f graphiti

Access Neo4j Browser (optional)

Visit http://localhost:7474 and login with NEO4J_USER/NEO4J_PASSWORD

Access Graphiti API (optional, for debugging)

Visit http://localhost:8000/docs for Swagger API documentation

Note

The Graphiti service is defined in docker-compose-graphiti.yml as a separate stack. You must run both compose files together to enable the knowledge graph functionality. The pre-built Docker image vxcontrol/graphiti:latest is used by default.

What Gets Stored

When enabled, PentAGI automatically captures:

Agent Responses: All agent reasoning, analysis, and decisions

Tool Executions: Commands executed, tools used, and their results

Context Information: Flow, task, and subtask hierarchy

GitHub and Google OAuth Integration

OAuth integration with GitHub and Google allows users to authenticate using their existing accounts on these platforms. This provides several benefits:

Simplified login process without need to create separate credentials

Enhanced security through trusted identity providers

Access to user profile information from GitHub/Google accounts

Seamless integration with existing development workflows

For using GitHub OAuth you need to create a new OAuth application in your GitHub account and set the GITHUB_CLIENT_ID and GITHUB_CLIENT_SECRET in .env file.

For using Google OAuth you need to create a new OAuth application in your Google account and set the GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET in .env file.

Docker Image Configuration

PentAGI allows you to configure Docker image selection for executing various tasks. The system automatically chooses the most appropriate image based on the task type, but you can constrain this selection by specifying your preferred images:

Variable Default Description

DOCKER_DEFAULT_IMAGE debian:latest Default Docker image for general tasks and ambiguous cases

DOCKER_DEFAULT_IMAGE_FOR_PENTEST vxcontrol/kali-linux Default Docker image for security/penetration testing tasks

When these environment variables are set, AI agents will be limited to the image choices you specify. This is particularly useful for:

Security Enforcement: Restricting usage to only verified and trusted images

Environment Standardization: Using corporate or customized images across all operations

Performance Optimization: Utilizing pre-built images with necessary tools already installed

Configuration examples:

Using a custom image for general tasks

DOCKER_DEFAULT_IMAGE=mycompany/custom-debian:latest

Using a specialized image for penetration testing

DOCKER_DEFAULT_IMAGE_FOR_PENTEST=mycompany/pentest-tools:v2.0

Note

If a user explicitly specifies a particular Docker image in their task, the system will try to use that exact image, ignoring these settings. These variables only affect the system's automatic image selection process.

๐Ÿ’ป Development

Development Requirements

golang

nodejs

docker

postgres

commitlint

Environment Setup

Backend Setup

Run once cd backend && go mod download to install needed packages.

For generating swagger files have to run

swag init -g ../../pkg/server/router.go -o pkg/server/docs/ --parseDependency --parseInternal --parseDepth 2 -d cmd/pentagi

before installing swag package via

go install github.com/swaggo/swag/cmd/[emailย protected]

For generating graphql resolver files have to run

go run github.com/99designs/gqlgen --config ./gqlgen/gqlgen.yml

after that you can see the generated files in pkg/graph folder.

For generating ORM methods (database package) from sqlc configuration

docker run --rm -v $(pwd):/src -w /src --network pentagi-network -e DATABASE_URL="{URL}" sqlc/sqlc generate -f sqlc/sqlc.yml

For generating Langfuse SDK from OpenAPI specification

fern generate --local

and to install fern-cli

npm install -g fern-api

Testing

For running tests cd backend && go test -v ./...

Frontend Setup

Run once cd frontend && npm install to install needed packages.

For generating graphql files have to run npm run graphql:generate which using graphql-codegen.ts file.

Be sure that you have graphql-codegen installed globally:

npm install -g graphql-codegen

After that you can run:

npm run prettier to check if your code is formatted correctly

npm run prettier:fix to fix it

npm run lint to check if your code is linted correctly

npm run lint:fix to fix it

For generating SSL certificates you need to run npm run ssl:generate which using generate-ssl.ts file or it will be generated automatically when you run npm run dev.

Backend Configuration

Edit the configuration for backend in .vscode/launch.json file:

DATABASE_URL - PostgreSQL database URL (eg. postgres://postgres:postgres@localhost:5432/pentagidb?sslmode=disable)

DOCKER_HOST - Docker SDK API (eg. for macOS DOCKER_HOST=unix:///Users//Library/Containers/com.docker.docker/Data/docker.raw.sock) more info

Optional:

SERVER_PORT - Port to run the server (default: 8443)

SERVER_USE_SSL - Enable SSL for the server (default: false)

Frontend Configuration

Edit the configuration for frontend in .vscode/launch.json file:

VITE_API_URL - Backend API URL. Omit the URL scheme (e.g., localhost:8080 NOT http://localhost:8080)

VITE_USE_HTTPS - Enable SSL for the server (default: false)

VITE_PORT - Port to run the server (default: 8000)

VITE_HOST - Host to run the server (default: 0.0.0.0)

Running the Application

Backend

Run the command(s) in backend folder:

Use .env file to set environment variables like a source .env

Run go run cmd/pentagi/main.go to start the server

Note

The first run can take a while as dependencies and docker images need to be downloaded to setup the backend environment.

Frontend

Run the command(s) in frontend folder:

Run npm install to install the dependencies

Run npm run dev to run the web app

Run npm run build to build the web app

Open your browser and visit the web app URL.

๐Ÿงช Testing LLM Agents

PentAGI includes a powerful utility called ctester for testing and validating LLM agent capabilities. This tool helps ensure your LLM provider configurations work correctly with different agent types, allowing you to optimize model selection for each specific agent role.

The utility features parallel testing of multiple agents, detailed reporting, and flexible configuration options.

Key Features

Parallel Testing: Tests multiple agents simultaneously for faster results

Comprehensive Test Suite: Evaluates basic completion, JSON responses, function calling, and penetration testing knowledge

Detailed Reporting: Generates markdown reports with success rates and performance metrics

Flexible Configuration: Test specific agents or test groups as needed

Specialized Test Groups: Includes domain-specific tests for cybersecurity and penetration testing scenarios

Usage Scenarios

For Developers (with local Go environment)

If you've cloned the repository and have Go installed:

Default configuration with .env file

cd backend go run cmd/ctester/*.go -verbose

Custom provider configuration

go run cmd/ctester/*.go -config ../examples/configs/openrouter.provider.yml -verbose

Generate a report file

go run cmd/ctester/*.go -config ../examples/configs/deepinfra.provider.yml -report ../test-report.md

Test specific agent types only

go run cmd/ctester/*.go -agents simple,simple_json,primary_agent -verbose

Test specific test groups only

go run cmd/ctester/*.go -groups basic,advanced -verbose

For Users (using Docker image)

If you prefer to use the pre-built Docker image without setting up a development environment:

Using Docker to test with default environment

docker run --rm -v $(pwd)/.env:/opt/pentagi/.env vxcontrol/pentagi /opt/pentagi/bin/ctester -verbose

Test with your custom provider configuration

docker run --rm
-v $(pwd)/.env:/opt/pentagi/.env
-v $(pwd)/my-config.yml:/opt/pentagi/config.yml
vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/config.yml -agents simple,primary_agent,coder -verbose

Generate a detailed report

docker run --rm
-v $(pwd)/.env:/opt/pentagi/.env
-v $(pwd):/opt/pentagi/output
vxcontrol/pentagi /opt/pentagi/bin/ctester -report /opt/pentagi/output/report.md

Using Pre-configured Providers

The Docker image comes with built-in support for major providers (OpenAI, Anthropic, Gemini, Ollama) and pre-configured provider files for additional services (OpenRouter, DeepInfra, DeepSeek, Moonshot):

Test with OpenRouter configuration

docker run --rm
-v $(pwd)/.env:/opt/pentagi/.env
vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/openrouter.provider.yml

Test with DeepInfra configuration

docker run --rm
-v $(pwd)/.env:/opt/pentagi/.env
vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/deepinfra.provider.yml

Test with DeepSeek configuration

docker run --rm
-v $(pwd)/.env:/opt/pentagi/.env
vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/deepseek.provider.yml

Test with Moonshot configuration

docker run --rm
-v $(pwd)/.env:/opt/pentagi/.env
vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/moonshot.provider.yml

Test with OpenAI configuration

docker run --rm
-v $(pwd)/.env:/opt/pentagi/.env
vxcontrol/pentagi /opt/pentagi/bin/ctester -type openai

Test with Anthropic configuration

docker run --rm
-v $(pwd)/.env:/opt/pentagi/.env
vxcontrol/pentagi /opt/pentagi/bin/ctester -type anthropic

Test with Gemini configuration

docker run --rm
-v $(pwd)/.env:/opt/pentagi/.env
vxcontrol/pentagi /opt/pentagi/bin/ctester -type gemini

Test with AWS Bedrock configuration

docker run --rm
-v $(pwd)/.env:/opt/pentagi/.env
vxcontrol/pentagi /opt/pentagi/bin/ctester -type bedrock

Test with Custom OpenAI configuration

docker run --rm
-v $(pwd)/.env:/opt/pentagi/.env
vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/custom-openai.provider.yml

Test with Ollama configuration (local inference)

docker run --rm
-v $(pwd)/.env:/opt/pentagi/.env
vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/ollama-llama318b.provider.yml

Test with Ollama Qwen3 32B configuration (requires custom model creation)

docker run --rm
-v $(pwd)/.env:/opt/pentagi/.env
vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/ollama-qwen332b-fp16-tc.provider.yml

Test with Ollama QwQ 32B configuration (requires custom model creation and 71.3GB VRAM)

docker run --rm
-v $(pwd)/.env:/opt/pentagi/.env
vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/ollama-qwq32b-fp16-tc.provider.yml

To use these configurations, your .env file only needs to contain:

LLM_SERVER_URL=https://openrouter.ai/api/v1 # or https://api.deepinfra.com/v1/openai or https://api.deepseek.com or https://api.openai.com/v1 or https://api.moonshot.ai/v1 LLM_SERVER_KEY=your_api_key LLM_SERVER_MODEL= # Leave empty, as models are specified in the config LLM_SERVER_CONFIG_PATH=/opt/pentagi/conf/openrouter.provider.yml # or deepinfra.provider.yml or deepseek.provider.yml or custom-openai.provider.yml or moonshot.provider.yml LLM_SERVER_PROVIDER= # Provider name for LiteLLM proxy (e.g., openrouter, deepseek, moonshot) LLM_SERVER_LEGACY_REASONING=false # Controls reasoning format, for OpenAI must be true (default: false) LLM_SERVER_PRESERVE_REASONING=false # Preserve reasoning content in multi-turn conversations (required by Moonshot, default: false)

For OpenAI (official API)

OPEN_AI_KEY=your_openai_api_key # Your OpenAI API key OPEN_AI_SERVER_URL=https://api.openai.com/v1 # OpenAI API endpoint

For Anthropic (Claude models)

ANTHROPIC_API_KEY=your_anthropic_api_key # Your Anthropic API key ANTHROPIC_SERVER_URL=https://api.anthropic.com/v1 # Anthropic API endpoint

For Gemini (Google AI)

GEMINI_API_KEY=your_gemini_api_key # Your Google AI API key GEMINI_SERVER_URL=https://generativelanguage.googleapis.com # Google AI API endpoint

For AWS Bedrock (enterprise foundation models)

BEDROCK_REGION=us-east-1 # AWS region for Bedrock service BEDROCK_ACCESS_KEY_ID=your_aws_access_key # AWS access key ID BEDROCK_SECRET_ACCESS_KEY=your_aws_secret_key # AWS secret access key BEDROCK_SESSION_TOKEN=your_aws_session_token # AWS session token (alternative auth method) BEDROCK_SERVER_URL= # Optional custom Bedrock endpoint

For Ollama (local inference)

OLLAMA_SERVER_URL=http://localhost:11434 OLLAMA_SERVER_MODEL=llama3.1:8b-instruct-q8_0 OLLAMA_SERVER_CONFIG_PATH=/opt/pentagi/conf/ollama-llama318b.provider.yml OLLAMA_SERVER_PULL_MODELS_ENABLED=false OLLAMA_SERVER_LOAD_MODELS_ENABLED=false

Using OpenAI with Unverified Organizations

For OpenAI accounts with unverified organizations that don't have access to the latest reasoning models (o1, o3, o4-mini), you need to use a custom configuration.

To use OpenAI with unverified organization accounts, configure your .env file as follows:

LLM_SERVER_URL=https://api.openai.com/v1 LLM_SERVER_KEY=your_openai_api_key LLM_SERVER_MODEL= # Leave empty, models are specified in config LLM_SERVER_CONFIG_PATH=/opt/pentagi/conf/custom-openai.provider.yml LLM_SERVER_LEGACY_REASONING=true # Required for OpenAI reasoning format

This configuration uses the pre-built custom-openai.provider.yml file that maps all agent types to models available for unverified organizations, using o3-mini instead of models like o1, o3, and o4-mini.

You can test this configuration using:

Test with custom OpenAI configuration for unverified accounts

docker run --rm
-v $(pwd)/.env:/opt/pentagi/.env
vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/custom-openai.provider.yml

Note

The LLM_SERVER_LEGACY_REASONING=true setting is crucial for OpenAI compatibility as it ensures reasoning parameters are sent in the format expected by OpenAI's API.

Using LiteLLM Proxy

When using LiteLLM proxy to access various LLM providers, model names are prefixed with the provider name (e.g., moonshot/kimi-2.5 instead of kimi-2.5). To use the same provider configuration files with both direct API access and LiteLLM proxy, set the LLM_SERVER_PROVIDER variable:

Direct access to Moonshot API

LLM_SERVER_URL=https://api.moonshot.ai/v1 LLM_SERVER_KEY=your_moonshot_api_key LLM_SERVER_CONFIG_PATH=/opt/pentagi/conf/moonshot.provider.yml LLM_SERVER_PROVIDER= # Empty for direct access

Access via LiteLLM proxy

LLM_SERVER_URL=http://litellm-proxy:4000 LLM_SERVER_KEY=your_litellm_api_key LLM_SERVER_CONFIG_PATH=/opt/pentagi/conf/moonshot.provider.yml LLM_SERVER_PROVIDER=moonshot # Provider prefix for LiteLLM

With LLM_SERVER_PROVIDER=moonshot, the system automatically prefixes all model names from the configuration file with moonshot/, making them compatible with LiteLLM's model naming convention.

Supported provider names for LiteLLM:

openai - for OpenAI models via LiteLLM

anthropic - for Anthropic/Claude models via LiteLLM

gemini - for Google Gemini models via LiteLLM

openrouter - for OpenRouter aggregator

deepseek - for DeepSeek models

deepinfra - for DeepInfra hosting

moonshot - for Moonshot AI (Kimi)

Any other provider name configured in your LiteLLM instance

This approach allows you to:

Use the same configuration files for both direct and proxied access

Switch between providers without modifying configuration files

Easily test different routing strategies with LiteLLM

Running Tests in a Production Environment

If you already have a running PentAGI container and want to test the current configuration:

Run ctester in an existing container using current environment variables

docker exec -it pentagi /opt/pentagi/bin/ctester -verbose

Test specific agent types with deterministic ordering

docker exec -it pentagi /opt/pentagi/bin/ctester -agents simple,primary_agent,pentester -groups basic,knowledge -verbose

Generate a report file inside the container

docker exec -it pentagi /opt/pentagi/bin/ctester -report /opt/pentagi/data/agent-test-report.md

Access the report from the host

docker cp pentagi:/opt/pentagi/data/agent-test-report.md ./

Command-line Options

The utility accepts several options:

-env - Path to environment file (default: .env)

-type - Provider type: custom, openai, anthropic, ollama, bedrock, gemini (default: custom)

-config - Path to custom provider config (default: from LLM_SERVER_CONFIG_PATH env variable)

-tests - Path to custom tests YAML file (optional)

-report - Path to write the report file (optional)

-agents - Comma-separated list of agent types to test (default: all)

-groups - Comma-separated list of test groups to run (default: all)

-verbose - Enable verbose output with detailed test results for each agent

Available Agent Types

Agents are tested in the following deterministic order:

simple - Basic completion tasks

simple_json - JSON-structured responses

primary_agent - Main reasoning agent

assistant - Interactive assistant mode

generator - Content generation

refiner - Content refinement and improvement

adviser - Expert advice and consultation

reflector - Self-reflection and analysis

searcher - Information gathering and search

enricher - Data enrichment and expansion

coder - Code generation and analysis

installer - Installation and setup tasks

pentester - Penetration testing and security assessment

Available Test Groups

basic - Fundamental completion and prompt response tests

advanced - Complex reasoning and function calling tests

json - JSON format validation and structure tests (specifically designed for simple_json agent)

knowledge - Domain-specific cybersecurity and penetration testing knowledge tests

Note: The json test group is specifically designed for the simple_json agent type, while all other agents are tested with basic, advanced, and knowledge groups. This specialization ensures optimal testing coverage for each agent's intended purpose.

Example Provider Configuration

Provider configuration defines which models to use for different agent types:

simple: model: "provider/model-name" temperature: 0.7 top_p: 0.95 n: 1 max_tokens: 4000

simple_json: model: "provider/model-name" temperature: 0.7 top_p: 1.0 n: 1 max_tokens: 4000 json: true

... other agent types ...

Optimization Workflow

Create a baseline: Run tests with default configuration to establish benchmark performance

Analyze agent-specific performance: Review the deterministic agent ordering to identify underperforming agents

Test specialized configurations: Experiment with different models for each agent type using provider-specific configs

Focus on domain knowledge: Pay special attention to knowledge group tests for cybersecurity expertise

Validate function calling: Ensure tool-based tests pass consistently for critical agent types

Compare results: Look for the best success rate and performance across all test groups

Deploy optimal configuration: Use in production with your optimized setup

This tool helps ensure your AI agents are using the most effective models for their specific tasks, improving reliability while optimizing costs.

๐Ÿงฎ Embedding Configuration and Testing

PentAGI uses vector embeddings for semantic search, knowledge storage, and memory management. The system supports multiple embedding providers that can be configured according to your needs and preferences.

Supported Embedding Providers

PentAGI supports the following embedding providers:

OpenAI (default): Uses OpenAI's text embedding models

Ollama: Local embedding model through Ollama

Mistral: Mistral AI's embedding models

Jina: Jina AI's embedding service

HuggingFace: Models from HuggingFace

GoogleAI: Google's embedding models

VoyageAI: VoyageAI's embedding models

Embedding Provider Configuration (click to expand) Environment Variables

To configure the embedding provider, set the following environment variables in your .env file:

Primary embedding configuration

EMBEDDING_PROVIDER=openai # Provider type (openai, ollama, mistral, jina, huggingface, googleai, voyageai) EMBEDDING_MODEL=text-embedding-3-small # Model name to use EMBEDDING_URL= # Optional custom API endpoint EMBEDDING_KEY= # API key for the provider (if required) EMBEDDING_BATCH_SIZE=100 # Number of documents to process in a batch EMBEDDING_STRIP_NEW_LINES=true # Whether to remove new lines from text before embedding

Advanced settings

PROXY_URL= # Optional proxy for all API calls

SSL/TLS Certificate Configuration (for external communication with LLM backends and tool servers)

EXTERNAL_SSL_CA_PATH= # Path to custom CA certificate file (PEM format) inside the container # Must point to /opt/pentagi/ssl/ directory (e.g., /opt/pentagi/ssl/ca-bundle.pem) EXTERNAL_SSL_INSECURE=false # Skip certificate verification (use only for testing)

How to Add Custom CA Certificates (click to expand) If you see this error: tls: failed to verify certificate: x509: certificate signed by unknown authority

Step 1: Get your CA certificate bundle in PEM format (can contain multiple certificates)

Step 2: Place the file in the SSL directory on your host machine:

Default location (if PENTAGI_SSL_DIR is not set)

cp ca-bundle.pem ./pentagi-ssl/

Or custom location (if using PENTAGI_SSL_DIR in docker-compose.yml)

cp ca-bundle.pem /path/to/your/ssl/dir/

Step 3: Set the path in .env file (path must be inside the container):

The volume pentagi-ssl is mounted to /opt/pentagi/ssl inside the container

EXTERNAL_SSL_CA_PATH=/opt/pentagi/ssl/ca-bundle.pem EXTERNAL_SSL_INSECURE=false

Step 4: Restart PentAGI:

docker compose restart pentagi

Notes:

The pentagi-ssl volume is mounted to /opt/pentagi/ssl inside the container

You can change host directory using PENTAGI_SSL_DIR variable in docker-compose.yml

File supports multiple certificates and intermediate CAs in one PEM file

Use EXTERNAL_SSL_INSECURE=true only for testing (not recommended for production)

Provider-Specific Limitations

Each provider has specific limitations and supported features:

OpenAI: Supports all configuration options

Ollama: Does not support EMBEDDING_KEY as it uses local models

Mistral: Does not support EMBEDDING_MODEL or custom HTTP client

Jina: Does not support custom HTTP client

HuggingFace: Requires EMBEDDING_KEY and supports all other options

GoogleAI: Does not support EMBEDDING_URL, requires EMBEDDING_KEY

VoyageAI: Supports all configuration options

If EMBEDDING_URL and EMBEDDING_KEY are not specified, the system will attempt to use the corresponding LLM provider settings (e.g., OPEN_AI_KEY when EMBEDDING_PROVIDER=openai).

Why Consistent Embedding Providers Matter

It's crucial to use the same embedding provider consistently because:

Vector Compatibility: Different providers produce vectors with different dimensions and mathematical properties

Semantic Consistency: Changing providers can break semantic similarity between previously embedded documents

Memory Corruption: Mixed embeddings can lead to poor search results and broken knowledge base functionality

If you change your embedding provider, you should flush and reindex your entire knowledge base (see etester utility below).

Embedding Tester Utility (etester)

PentAGI includes a specialized etester utility for testing, managing, and debugging embedding functionality. This tool is essential for diagnosing and resolving issues related to vector embeddings and knowledge storage.

Etester Commands (click to expand)

Test embedding provider and database connection

cd backend go run cmd/etester/main.go test -verbose

Show statistics about the embedding database

go run cmd/etester/main.go info

Delete all documents from the embedding database (use with caution!)

go run cmd/etester/main.go flush

Recalculate embeddings for all documents (after changing provider)

go run cmd/etester/main.go reindex

Search for documents in the embedding database

go run cmd/etester/main.go search -query "How to install PostgreSQL" -limit 5

Using Docker

If you're running PentAGI in Docker, you can use etester from within the container:

Test embedding provider

docker exec -it pentagi /opt/pentagi/bin/etester test

Show detailed database information

docker exec -it pentagi /opt/pentagi/bin/etester info -verbose

Advanced Search Options

The search command supports various filters to narrow down results:

Filter by document type

docker exec -it pentagi /opt/pentagi/bin/etester search -query "Security vulnerability" -doc_type guide -threshold 0.8

Filter by flow ID

docker exec -it pentagi /opt/pentagi/bin/etester search -query "Code examples" -doc_type code -flow_id 42

All available search options

docker exec -it pentagi /opt/pentagi/bin/etester search -help

Available search parameters:

-query STRING: Search query text (required)

-doc_type STRING: Filter by document type (answer, memory, guide, code)

-flow_id NUMBER: Filter by flow ID (positive number)

-answer_type STRING: Filter by answer type (guide, vulnerability, code, tool, other)

-guide_type STRING: Filter by guide type (install, configure, use, pentest, development, other)

-limit NUMBER: Maximum number of results (default: 3)

-threshold NUMBER: Similarity threshold (0.0-1.0, default: 0.7)

Common Troubleshooting Scenarios

After changing embedding provider: Always run flush or reindex to ensure consistency

Poor search results: Try adjusting the similarity threshold or check if embeddings are correctly generated

Database connection issues: Verify PostgreSQL is running with pgvector extension installed

Missing API keys: Check environment variables for your chosen embedding provider

๐Ÿ” Function Testing with ftester

PentAGI includes a versatile utility called ftester for debugging, testing, and developing specific functions and AI agent behaviors. While ctester focuses on testing LLM model capabilities, ftester allows you to directly invoke individual system functions and AI agent components with precise control over execution context.

Key Features

Direct Function Access: Test individual functions without running the entire system

Mock Mode: Test functions without a live PentAGI deployment using built-in mocks

Interactive Input: Fill function arguments interactively for exploratory testing

Detailed Output: Color-coded terminal output with formatted responses and errors

Context-Aware Testing: Debug AI agents within the context of specific flows, tasks, and subtasks

Observability Integration: All function calls are logged to Langfuse and Observability stack

Usage Modes

Command Line Arguments

Run ftester with specific function and arguments directly from the command line:

Basic usage with mock mode

cd backend go run cmd/ftester/main.go [function_name] -[arg1] [value1] -[arg2] [value2]

Example: Test terminal command in mock mode

go run cmd/ftester/main.go terminal -command "ls -la" -message "List files"

Using a real flow context

go run cmd/ftester/main.go -flow 123 terminal -command "whoami" -message "Check user"

Testing AI agent in specific task/subtask context

go run cmd/ftester/main.go -flow 123 -task 456 -subtask 789 pentester -message "Find vulnerabilities"

Interactive Mode

Run ftester without arguments for a guided interactive experience:

Start interactive mode

go run cmd/ftester/main.go [function_name]

For example, to interactively fill browser tool arguments

go run cmd/ftester/main.go browser

Available Functions (click to expand) Environment Functions

terminal: Execute commands in a container and return the output

file: Perform file operations (read, write, list) in a container

Search Functions

browser: Access websites and capture screenshots

google: Search the web using Google Custom Search

duckduckgo: Search the web using DuckDuckGo

tavily: Search using Tavily AI search engine

traversaal: Search using Traversaal AI search engine

perplexity: Search using Perplexity AI

searxng: Search using Searxng meta search engine (aggregates results from multiple engines)

Vector Database Functions

search_in_memory: Search for information in vector database

search_guide: Find guidance documents in vector database

search_answer: Find answers to questions in vector database

search_code: Find code examples in vector database

AI Agent Functions

advice: Get expert advice from an AI agent

coder: Request code generation or modification

maintenance: Run system maintenance tasks

memorist: Store and organize information in vector database

pentester: Perform security tests and vulnerability analysis

search: Complex search across multiple sources

Utility Functions

describe: Show information about flows, tasks, and subtasks

Debugging Flow Context (click to expand) The describe function provides detailed information about tasks and subtasks within a flow. This is particularly useful for diagnosing issues when PentAGI encounters problems or gets stuck.

List all flows in the system

go run cmd/ftester/main.go describe

Show all tasks and subtasks for a specific flow

go run cmd/ftester/main.go -flow 123 describe

Show detailed information for a specific task

go run cmd/ftester/main.go -flow 123 -task 456 describe

Show detailed information for a specific subtask

go run cmd/ftester/main.go -flow 123 -task 456 -subtask 789 describe

Show verbose output with full descriptions and results

go run cmd/ftester/main.go -flow 123 describe -verbose

This function allows you to identify the exact point where a flow might be stuck and resume processing by directly invoking the appropriate agent function.

Function Help and Discovery (click to expand) Each function has a help mode that shows available parameters:

Get help for a specific function

go run cmd/ftester/main.go [function_name] -help

Examples:

go run cmd/ftester/main.go terminal -help go run cmd/ftester/main.go browser -help go run cmd/ftester/main.go describe -help

You can also run ftester without arguments to see a list of all available functions:

go run cmd/ftester/main.go

Output Format (click to expand) The ftester utility uses color-coded output to make interpretation easier:

Blue headers: Section titles and key names

Cyan [INFO]: General information messages

Green [SUCCESS]: Successful operations

Red [ERROR]: Error messages

Yellow [WARNING]: Warning messages

Yellow [MOCK]: Indicates mock mode operation

Magenta values: Function arguments and results

JSON and Markdown responses are automatically formatted for readability.

Advanced Usage Scenarios (click to expand) Debugging Stuck AI Flows

When PentAGI gets stuck in a flow:

Pause the flow through the UI

Use describe to identify the current task and subtask

Directly invoke the agent function with the same task/subtask IDs

Examine the detailed output to identify the issue

Resume the flow or manually intervene as needed

Testing Environment Variables

Verify that API keys and external services are configured correctly:

Test Google search API configuration

go run cmd/ftester/main.go google -query "pentesting tools"

Test browser access to external websites

go run cmd/ftester/main.go browser -url "https://example.com"

Developing New AI Agent Behaviors

When developing new prompt templates or agent behaviors:

Create a test flow in the UI

Use ftester to directly invoke the agent with different prompts

Observe responses and adjust prompts accordingly

Check Langfuse for detailed traces of all function calls

Verifying Docker Container Setup

Ensure containers are properly configured:

go run cmd/ftester/main.go -flow 123 terminal -command "env | grep -i proxy" -message "Check proxy settings"

Docker Container Usage (click to expand) If you have PentAGI running in Docker, you can use ftester from within the container:

Run ftester inside the running PentAGI container

docker exec -it pentagi /opt/pentagi/bin/ftester [arguments]

Examples:

docker exec -it pentagi /opt/pentagi/bin/ftester -flow 123 describe docker exec -it pentagi /opt/pentagi/bin/ftester -flow 123 terminal -command "ps aux" -message "List processes"

This is particularly useful for production deployments where you don't have a local development environment.

Integration with Observability Tools (click to expand) All function calls made through ftester are logged to:

Langfuse: Captures the entire AI agent interaction chain, including prompts, responses, and function calls

OpenTelemetry: Records metrics, traces, and logs for system performance analysis

Terminal Output: Provides immediate feedback on function execution

To access detailed logs:

Check Langfuse UI for AI agent traces (typically at http://localhost:4000)

Use Grafana dashboards for system metrics (typically at http://localhost:3000)

Examine terminal output for immediate function results and errors

Command-line Options

The main utility accepts several options:

-env - Path to environment file (optional, default: .env)

-provider - Provider type to use (default: custom, options: openai, anthropic, ollama, bedrock, gemini, custom)

-flow - Flow ID for testing (0 means using mocks, default: 0)

-task - Task ID for agent context (optional)

-subtask - Subtask ID for agent context (optional)

Function-specific arguments are passed after the function name using -name value format.

๐Ÿ—๏ธ Building

Building Docker Image

docker build -t local/pentagi:latest .

Note

You can use docker buildx to build the image for different platforms like a docker buildx build --platform linux/amd64 -t local/pentagi:latest .

You need to change image name in docker-compose.yml file to local/pentagi:latest and run docker compose up -d to start the server or use build key option in docker-compose.yml file.

๐Ÿ‘ Credits

This project is made possible thanks to the following research and developments:

Emerging Architectures for LLM Applications

A Survey of Autonomous LLM Agents

๐Ÿ“„ License

PentAGI Core License

PentAGI Core: Licensed under MIT License Copyright (c) 2025 PentAGI Development Team

VXControl Cloud SDK Integration

VXControl Cloud SDK Integration: This repository integrates VXControl Cloud SDK under a special licensing exception that applies ONLY to the official PentAGI project.

โœ… Official PentAGI Project

This official repository: https://github.com/vxcontrol/pentagi

Official releases distributed by VXControl LLC

Code used under direct authorization from VXControl LLC

โš ๏ธ Important for Forks and Third-Party Use

If you fork this project or create derivative works, the VXControl SDK components are subject to AGPL-3.0 license terms. You must either:

Remove VXControl SDK integration

Open source your entire application (comply with AGPL-3.0 copyleft terms)

Obtain a commercial license from VXControl LLC

Commercial Licensing

For commercial use of VXControl Cloud SDK in proprietary applications, contact:

Email: [emailย protected]

Subject: "VXControl Cloud SDK Commercial License"


Source: https://github.com/vxcontrol/pentagi

About the Author

ZadeNor AI Team is a leading expert in AI, contributing to cutting-edge research and development in the field.