🚀 Now Available in Beta

Unified AI/LLM Security Scanner

Static code analysis + security posture audit for the OWASP LLM Top 10. Find vulnerabilities in your AI code before deployment.

$ pip install aisentry

The AI Security Gap

LLMs are being deployed to production faster than ever, but security is an afterthought.

⚠️

LLMs Deployed Without Testing

Teams rush AI features to production without security testing. No one checks if the model can be jailbroken, tricked into leaking data, or manipulated via prompt injection.

🎯

OWASP LLM Top 10 is Real

Prompt injection, insecure output handling, model theft — these aren't theoretical. They're actively exploited. Yet most teams don't know they exist, let alone how to detect them.

🧩

Fragmented Tooling

Existing SAST tools don't understand LLM-specific vulnerabilities. You need multiple tools, different configurations, separate reports. No single tool covers the OWASP LLM Top 10.

🔒

No Unified Standard

Security teams lack a common standard for LLM security: no standardized output format and no way to track vulnerabilities across both code and runtime.

One Tool. Complete Coverage.

aisentry combines static code analysis and security posture audit in a single, unified tool.

aisentry provides two commands, plus a recommended companion for live testing:

aisentry scan: static code analysis with the security posture audit combined
aisentry audit: standalone security posture assessment
garak (recommended): live testing against a running model

10 OWASP detectors: Prompt Injection, Insecure Output, Training Poisoning, Model DoS, Supply Chain, Secrets Exposure, Insecure Plugins, Excessive Agency, Overreliance, Model Theft.

61 security controls: Prompt Security (8), Model Security (8), Data Privacy (8), OWASP LLM (10), Blue Team (7), Governance (5), Supply Chain (3), Hallucination (5), Ethical AI (4), Incident Response (3). Maturity scale: Initial → Developing → Defined → Managed → Optimizing.

11 attack vectors (via garak): Prompt Injection, Jailbreak, Data Leakage, Hallucination, Model Extraction, Bias Detection, DoS, Supply Chain, Adversarial, Output Manipulation, Behavioral.

The result: code vulnerabilities found in your codebase, a security posture score with a maturity level assessment, and model vulnerabilities found via live probes, all delivered as a unified tabbed report in JSON, HTML (dark mode), or SARIF.
10 Static Detectors · 61 Security Controls · 88% FP Reduction Accuracy · 7 LLM Providers · 1.7k+ PyPI Downloads

Tested on Real-World LLM Frameworks

Evaluated against 10 major LLM frameworks, including LangChain, LlamaIndex, vLLM, and the OpenAI Python SDK.

75.4% Precision · 63.0% Recall · 68.7% F1 Score
🎯

LLM-Specific Detection Coverage

Patterns that generic SAST tools (Semgrep, Bandit) cannot detect:

LLM01 Prompt Injection (73% F1)
LLM02 Insecure Output (74% F1)
LLM04 Model DoS (80% F1)
LLM07 Insecure Plugin (93% F1)

Note: For general patterns (eval/exec/SQL), use aisentry + Bandit together. See methodology →

Real-World Repository Analysis (10 repos, 14,991 files)

Repository           Files   Findings   Findings/File
LangChain            2,501        170            0.07
LlamaIndex           4,088        999            0.24
Haystack               523         45            0.09
LiteLLM              2,792      1,623            0.58
DSPy                   231         98            0.42
OpenAI Python        1,134         40            0.04
Guidance               149         31            0.21
vLLM                 2,239      1,245            0.56
Semantic Kernel      1,241         30            0.02
Text Gen WebUI          93        131            1.41
Total               14,991      4,412            0.29

Detection Rate by OWASP LLM Category

Category                      Recall   Precision       F1
LLM07: Insecure Plugin          100%       87.5%    93.3%
LLM04: Model DoS               66.7%        100%    80.0%
LLM09: Overreliance            66.7%        100%    80.0%
LLM02: Insecure Output         70.0%       77.8%    73.7%
LLM01: Prompt Injection        66.7%       80.0%    72.7%
LLM06: Sensitive Info          71.4%       55.6%    62.5%
LLM08: Excessive Agency        50.0%       75.0%    60.0%
LLM03: Training Poisoning      40.0%        100%    57.1%
LLM05: Supply Chain            60.0%       54.5%    57.1%
LLM10: Model Theft             28.6%        100%    44.4%
📊
Transparency: current metrics are 75.4% precision, 63.0% recall, and 68.7% F1. aisentry outperforms Semgrep (6.8% recall) and Bandit (46.3% F1) on LLM-specific vulnerabilities, and performs best on Insecure Plugin (93.3% F1), Model DoS (80.0%), and Insecure Output (73.7%).

Benchmark Methodology

All metrics are computed against a ground truth testbed with labeled vulnerabilities across 10 OWASP categories. Results are fully reproducible.

📁 View Testbed & Ground Truth · 🔬 Reproduce Results · 🏷️ Labels & Annotations

ML-Powered Noise Reduction

Automatically filter common false positives with 88% accuracy using an ensemble of heuristics and an ML classifier.

Pipeline: Raw Findings (100+ from the static scan) → FP Reducer ensemble (heuristic pattern matching + RandomForest ML classifier + optional LLM verification) → Weighted Score (TP probability) → Clean Findings (high-confidence true positives, with false positives filtered out).

Common false positives filtered: model.eval() | session.exec() | base64 images | placeholder keys
🔥

PyTorch model.eval()

Not Python's dangerous eval(). Sets model to evaluation mode.

🗄️

SQLAlchemy session.exec()

Not Python's dangerous exec(). Executes SQL queries safely.

🖼️

Base64 Images

Data URIs like data:image/png;base64,... are not leaked secrets.

🔑

Placeholder Keys

Example values like your-api-key or sk-test-xxx.

🧠

Enhanced ML Reduction

To enable the ML classifier, trained on 1,000 labeled findings, install with the [ml] extra:

pip install aisentry[ml]
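
A note on the install plus a hedged usage sketch: on zsh the extras need quoting, and the scan below assumes the ML reducer activates automatically once the extra is installed (that flagless behavior is an assumption; check the CLI help for reducer-specific options).

# On zsh, quote the extras so the brackets aren't glob-expanded
pip install "aisentry[ml]"

# Assumption: the ML false-positive reducer is applied automatically
# once the [ml] extra is installed
aisentry scan ./my-project -o html -f report.html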

Built for Your Workflow

Whether you're a security engineer, developer, or platform team — we've got you covered.

🔐

For Security Engineers

  • Complete OWASP LLM Top 10 coverage
  • Evidence-based confidence scoring
  • Actionable remediation guidance
  • SARIF output for GitHub Security
💻

For Developers

  • Simple pip install, intuitive CLI
  • Works with your existing codebase
  • No config files needed
  • Clear, actionable output
⚙️

For DevOps / Platform

  • CI/CD ready out of the box
  • Multi-provider: cloud + local (Ollama)
  • JSON/SARIF for automation
  • Enterprise-ready, MIT licensed

Up and Running in Minutes

Install, scan, test. It's that simple.

Try It Now - Scan a Real Project

Clone any LLM project and generate reports in seconds:

# Install
pip install aisentry

# Clone a sample LLM project
git clone https://github.com/langchain-ai/langchain.git
cd langchain

# Generate HTML report (interactive, with audit)
aisentry scan ./libs/langchain -o html -f report.html

# Generate JSON report (for automation)
aisentry scan ./libs/langchain -o json -f report.json

# Generate SARIF report (for GitHub Code Scanning)
aisentry scan ./libs/langchain -o sarif -f report.sarif

# View the HTML report
open report.html

See example reports from real-world scans.
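
If you post-process the JSON report in automation, a jq one-liner along these lines can summarize it; the findings/severity field names here are hypothetical, so check a real report for the actual schema:

# Count findings per severity (field names are illustrative, not the guaranteed schema)
jq '[.findings[].severity] | group_by(.) | map({severity: .[0], count: length})' report.json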

1. Install

Install from PyPI with pip

pip install aisentry

2. Scan Your Code

Static analysis + security posture audit

# Scan with audit (tabbed HTML report)
aisentry scan ./my-project -o html

# Standalone security posture audit
aisentry audit ./src -o html

3. Test Live Models

For runtime testing, use Garak

# Install Garak (NVIDIA's LLM vulnerability scanner)
pip install garak

# Run probes against a model
garak --model_type openai --model_name gpt-4
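
To narrow a run to specific attack classes, garak supports probe selection; the flags and probe name below are taken from the garak CLI as commonly documented and should be verified with garak --help:

# List available probes, then run only the prompt-injection probe family
garak --list_probes
garak --model_type openai --model_name gpt-4 --probes promptinject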

Sample scan output:

aisentry v1.0.0

Scanning ./my-project...

 Analyzed 47 files
 Ran 10 OWASP detectors
 Evaluated 61 security controls

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
FINDINGS SUMMARY
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

■ CRITICAL  1   Prompt Injection (LLM01)
■ HIGH      2   Insecure Output (LLM02)
■ MEDIUM    3   Secrets Exposure (LLM06)

─────────────────────────────────────
Vulnerability Score: 67/100
Security Posture:   72/100
Maturity Level:     Developing

 Report saved to report.html

Test Any LLM Provider

Cloud providers, local models, or custom endpoints — we support them all.

🟢 OpenAI
🟠 Anthropic
🔶 AWS Bedrock
🔵 Google Vertex
🔷 Azure OpenAI
🦙 Ollama
Custom API

Frequently Asked Questions

What's the difference between scan and audit?

scan - Static code analysis for OWASP LLM Top 10 vulnerabilities in your source code.
audit - Security posture assessment evaluating 61 controls across 10 categories.
For live runtime testing, we recommend Garak.

How accurate is the detection?

Current metrics: 75.4% precision, 63.0% recall, 68.7% F1 score. We outperform Semgrep and Bandit on LLM-specific vulnerabilities. Use --mode strict for fewer false positives.
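
For example, a stricter pass over a noisy codebase might look like this (assuming --mode applies to the scan subcommand):

# Trade some recall for precision on noisy codebases
aisentry scan ./my-project --mode strict -o html -f report.html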

Does it support languages other than Python?

Currently Python-only. JavaScript/TypeScript support is planned. The architecture is extensible — see CONTRIBUTING.md if you want to help add new parsers.

Can I use this in CI/CD?

Yes! Use -o sarif for GitHub Code Scanning, or -o json for custom integrations. See CI/CD integration docs.
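
As a sketch, a CI job step can generate SARIF and hand it to your platform's upload mechanism; exit-code behavior on findings isn't documented here, so gate the build however your pipeline expects:

# Minimal CI step: install, scan, and emit SARIF for code scanning
pip install aisentry
aisentry scan ./src -o sarif -f results.sarif
# Upload results.sarif with your CI platform's SARIF ingester
# (on GitHub, the github/codeql-action/upload-sarif action)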

Is my code sent anywhere?

No. All analysis runs 100% locally. Your source code never leaves your machine.

How can I contribute?

We welcome contributions! Check out our CONTRIBUTING.md for guidelines. Good first issues are labeled good-first-issue on GitHub.

Building the Future of AI Security

Features you won't find anywhere else. Open-source and community-driven.

Status legend: Shipped · In Progress · Planned

Q1 · Foundation (January 2026) · Shipped
  • 10 OWASP LLM Detectors

    Full coverage of OWASP LLM Top 10

  • ML-based FP Reduction

    88% accuracy filtering false positives

  • Security Posture Audit

    61 controls across 10 categories

  • HTML/JSON/SARIF Reports

    CI/CD ready with GitHub integration

Q2 · Agent Security (April 2026) · In Progress
  • MCP Server Scanning FIRST

    Scan Model Context Protocol servers for over-permissioned tools

  • Agentic Flow Analysis

    Trace agent chains in LangGraph, CrewAI, AutoGen

  • RAG Pipeline Security

    Vector DB injection, unsafe loaders, PII in embeddings

  • VS Code Extension

    Real-time scanning in your editor

Q3 · Deep Analysis (July 2026) · Planned
  • Prompt Template Analyzer

    Static analysis for injection vectors in prompts

  • Guardrails Scanner

    Detect missing NeMo, LlamaGuard, Guardrails AI

  • Fine-tuning Data Security

    PII detection, poisoning indicators in training data

  • LLM Gateway Scanner

    Audit LiteLLM, AI Gateway proxy configs

Q4 · Advanced Features (October 2026) · Planned
  • Multi-modal Security NEW

    Image prompt injection, audio input sanitization

  • Runtime-Static Bridge

    Generate Garak tests from static findings

  • JavaScript/TypeScript

    Full support for JS/TS LLM applications

  • Team Dashboard

    Cloud dashboard with trend tracking

Have a feature request? We're building in the open.

Request Feature · Star on GitHub