Static code analysis + security posture audit for OWASP LLM Top 10 vulnerabilities. Find vulnerabilities in your AI code before deployment.
$ pip install aisentry
LLMs are being deployed to production faster than ever, but security is an afterthought.
Teams rush AI features to production without security testing. No one checks if the model can be jailbroken, tricked into leaking data, or manipulated via prompt injection.
Prompt injection, insecure output handling, model theft — these aren't theoretical. They're actively exploited. Yet most teams don't know they exist, let alone how to detect them.
Existing SAST tools don't understand LLM-specific vulnerabilities. Covering them means juggling multiple tools, different configurations, and separate reports.
Security teams lack a single CLI that covers the full OWASP LLM Top 10: no standardized output format, no way to track vulnerabilities across code and runtime.
aisentry combines static code analysis and security posture audit in a single, unified tool.
Evaluated against 10 major LLM frameworks, including LangChain, LlamaIndex, vLLM, and the OpenAI Python SDK.
Patterns that generic SAST tools (Semgrep, Bandit) cannot detect:
Note: For general patterns (eval/exec/SQL), use aisentry + Bandit together. See methodology →
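For a concrete sense of what "LLM-specific" means, here is an illustrative Python sketch (not taken from aisentry's rule set): untrusted input spliced straight into a prompt (LLM01) and model output trusted as HTML (LLM02). Generic SAST rules see ordinary string formatting here; an LLM-aware scanner treats the prompt as a sink and the model's reply as tainted data.

```python
# Illustrative only -- not aisentry's actual rules. Assumes the openai>=1.0 SDK.
from openai import OpenAI

client = OpenAI()

def summarize_ticket(user_ticket: str) -> str:
    # LLM01 Prompt Injection: untrusted input is interpolated directly into the
    # prompt, so "ignore previous instructions..." payloads reach the model.
    prompt = f"You are a support bot. Summarize this ticket:\n{user_ticket}"
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content or ""

def render_answer(user_ticket: str) -> str:
    # LLM02 Insecure Output Handling: the model's reply is embedded in HTML
    # unescaped, so an injected "<script>..." payload becomes stored XSS.
    return f"<div class='answer'>{summarize_ticket(user_ticket)}</div>"
```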
| Repository | Files | Findings | Findings/File |
|---|---|---|---|
| LangChain | 2,501 | 170 | 0.07 |
| LlamaIndex | 4,088 | 999 | 0.24 |
| Haystack | 523 | 45 | 0.09 |
| LiteLLM | 2,792 | 1,623 | 0.58 |
| DSPy | 231 | 98 | 0.42 |
| OpenAI Python | 1,134 | 40 | 0.04 |
| Guidance | 149 | 31 | 0.21 |
| vLLM | 2,239 | 1,245 | 0.56 |
| Semantic Kernel | 1,241 | 30 | 0.02 |
| Text Gen WebUI | 93 | 131 | 1.41 |
| **Total** | **14,991** | **4,412** | **0.29** |
| Category | Recall | Precision | F1 |
|---|---|---|---|
| LLM07: Insecure Plugin | 100% | 87.5% | 93.3% |
| LLM04: Model DoS | 66.7% | 100% | 80.0% |
| LLM09: Overreliance | 66.7% | 100% | 80.0% |
| LLM02: Insecure Output | 70.0% | 77.8% | 73.7% |
| LLM01: Prompt Injection | 66.7% | 80.0% | 72.7% |
| LLM06: Sensitive Info | 71.4% | 55.6% | 62.5% |
| LLM08: Excessive Agency | 50.0% | 75.0% | 60.0% |
| LLM03: Training Poisoning | 40.0% | 100% | 57.1% |
| LLM05: Supply Chain | 60.0% | 54.5% | 57.1% |
| LLM10: Model Theft | 28.6% | 100% | 44.4% |
All metrics are computed against a ground truth testbed with labeled vulnerabilities across 10 OWASP categories. Results are fully reproducible.
Automatically filter common false positives with 88% accuracy using ML-trained heuristics. Typical suppressions:

- `model.eval()` - not Python's dangerous `eval()`; it switches a model to evaluation mode.
- `cursor.execute(...)` - not Python's dangerous `exec()`; it runs a SQL query through the database driver.
- Data URIs such as `data:image/png;base64,...` - embedded assets, not leaked secrets.
- Placeholder values such as `your-api-key` or `sk-test-xxx` - examples, not real credentials.
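The snippet below illustrates the kind of distinction these heuristics draw (assuming PyTorch and a DB-API-style cursor; it is not aisentry's actual filter code):

```python
# Illustrative only -- not the actual filtering heuristics.
# Assumes PyTorch and a DB-API cursor (e.g. psycopg2) are available.
import torch

def real_finding(user_expr: str):
    # Python's eval() on untrusted input: a genuine code-execution risk.
    return eval(user_expr)

def suppressed_eval(model: torch.nn.Module) -> torch.nn.Module:
    # torch.nn.Module.eval() only toggles evaluation mode (dropout, batch norm);
    # it has nothing to do with Python's eval() and should not be reported.
    model.eval()
    return model

def suppressed_execute(cursor, user_id: int):
    # A parameterized DB-API execute() runs a SQL query; it is not Python's exec().
    cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
    return cursor.fetchall()

AVATAR_URI = "data:image/png;base64,iVBORw0KGgo="  # embedded asset, not a secret
EXAMPLE_KEY = "sk-test-xxx"                         # placeholder, not a real credential
```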
For ML-based classification trained on 1,000 labeled findings, install with the [ml] extra:
pip install aisentry[ml]
Whether you're a security engineer, developer, or platform team — we've got you covered.
Install, scan, test. It's that simple.
Clone any LLM project and generate reports in seconds:
```bash
# Install
pip install aisentry

# Clone a sample LLM project
git clone https://github.com/langchain-ai/langchain.git
cd langchain

# Generate HTML report (interactive, with audit)
aisentry scan ./libs/langchain -o html -f report.html

# Generate JSON report (for automation)
aisentry scan ./libs/langchain -o json -f report.json

# Generate SARIF report (for GitHub Code Scanning)
aisentry scan ./libs/langchain -o sarif -f report.sarif

# View the HTML report
open report.html
```
See example reports from real-world scans.
Install from PyPI with pip
Static analysis + security posture audit
For runtime testing, use Garak
```
aisentry v1.0.0

Scanning ./my-project...
✓ Analyzed 47 files
✓ Ran 10 OWASP detectors
✓ Evaluated 61 security controls

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 FINDINGS SUMMARY
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
■ CRITICAL  1  Prompt Injection (LLM01)
■ HIGH      2  Insecure Output (LLM02)
■ MEDIUM    3  Secrets Exposure (LLM06)
─────────────────────────────────────
Vulnerability Score: 67/100
Security Posture:    72/100
Maturity Level:      Developing

✓ Report saved to report.html
```
Cloud providers, local models, or custom endpoints — we support them all.
`scan` - Static code analysis for OWASP LLM Top 10 vulnerabilities in your source code.
`audit` - Security posture assessment evaluating 61 controls across 10 categories.
For live runtime testing, we recommend Garak.
Current metrics: 75.4% precision, 63.0% recall, 68.7% F1 score.
We outperform Semgrep and Bandit on LLM-specific vulnerabilities. Use `--mode strict` for fewer false positives.
Currently Python-only. JavaScript/TypeScript support is planned. The architecture is extensible — see CONTRIBUTING.md if you want to help add new parsers.
Yes! Use `-o sarif` for GitHub Code Scanning, or `-o json` for custom integrations.
See CI/CD integration docs.
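As a sketch of a custom integration, the script below gates a CI job on the JSON report produced by `aisentry scan . -o json -f report.json`. The field names (`findings`, `severity`) are assumptions for illustration; check an actual report for the real schema.

```python
# Minimal CI gate sketch: fail the build when the scan report contains
# critical findings. The "findings"/"severity" field names are assumptions --
# verify them against a real report.json before relying on this.
import json
import sys

with open("report.json") as fh:
    report = json.load(fh)

critical = [f for f in report.get("findings", []) if f.get("severity") == "CRITICAL"]

if critical:
    print(f"{len(critical)} critical finding(s), failing the build.")
    sys.exit(1)
print("No critical findings.")
```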
No. All analysis runs 100% locally. Your source code never leaves your machine.
We welcome contributions! Check out our CONTRIBUTING.md for guidelines.
Good first issues are labeled good-first-issue on GitHub.
Features you won't find anywhere else. Open-source and community-driven.
Full coverage of OWASP LLM Top 10
88% accuracy filtering false positives
61 controls across 10 categories
CI/CD ready with GitHub integration
Scan Model Context Protocol servers for over-permissioned tools
Trace agent chains in LangGraph, CrewAI, AutoGen
Vector DB injection, unsafe loaders, PII in embeddings
Real-time scanning in your editor
Static analysis for injection vectors in prompts
Detect missing guardrail layers (NeMo Guardrails, Llama Guard, Guardrails AI)
PII detection, poisoning indicators in training data
Audit LiteLLM, AI Gateway proxy configs
Image prompt injection, audio input sanitization
Generate Garak tests from static findings
Full support for JS/TS LLM applications
Cloud dashboard with trend tracking