pydepgate
pydepgate is a static analyzer for Python supply-chain malware that exploits interpreter startup vectors. It inspects wheels, source distributions, installed packages, and individual source files for code that executes silently before any user script runs.
The problem
Python’s interpreter runs several kinds of code automatically at startup:
.pthfiles insite-packages/. Any line beginning withimportis passed toexec()bysite.pyduring interpreter initialization.sitecustomize.pyandusercustomize.py. Imported automatically when present.__init__.pytop-level code in any imported package.setup.py. Executed duringpip installfor source distributions.- Console-script entry points. Generated and executed by
pip install.
Each is a legitimate Python feature. Each has been used in real supply-chain attacks. The .pth vector in particular is acknowledged as a security gap in CPython issue #113659 and has no upstream patch. Existing Python security tooling (pip-audit, safety, bandit) does not inspect these vectors.
This is the attack class used in the March 2026 LiteLLM supply-chain compromise, catalogued as MITRE ATT&CK T1546.018.
Installation
pip install pydepgate
Requires Python 3.11 or later. No third-party runtime dependencies.
Quick start
# Scan a wheel
pydepgate scan some-package-1.0.0-py3-none-any.whl
# Scan an installed package by name
pydepgate scan litellm
# Scan a source distribution
pydepgate scan some-package-1.0.0.tar.gz
# Scan a single file
pydepgate scan --single suspicious_module.py
Exit code 2 means at least one HIGH or CRITICAL finding. Exit code 0 is clean. See Exit Codes for the full contract.
What pydepgate detects
- Encoded payloads in source files (base64, hex, zlib, gzip, bzip2, lzma chains)
- Dynamic execution patterns (
exec,eval,compile,__import__) - Obfuscated string construction and variable chaining leading to execution
- Suspicious stdlib use (subprocess, socket, urllib, ctypes, os) in startup contexts
- Code density anomalies: high entropy identifiers, homoglyphs, deep AST nesting, low-vowel-ratio names
- PEM and DER certificate material embedded in source files
- Multi-layer encoded payloads via the recursive decode pipeline
Documentation
| Section | Contents |
|---|---|
| Getting Started | First scan, reading output, using explain |
| CLI Reference: scan | All flags for pydepgate scan |
| CLI Reference: explain | Rule and signal lookup |
| Exit Codes | Exit code contract and CI implications |
| Output Formats | Human, JSON, SARIF schemas |
| Environment Variables | All PYDEPGATE_* variables |
| Rules File | pydepgate.gate format and precedence |
| Signals Reference | All signal IDs across all namespaces |
| Guide: CI Integration | GitHub Actions, GitLab CI, pre-commit, Docker |
| Guide: Custom Rules | Suppressing false positives, scoping rules |
| Guide: Decode Payloads | --decode-payload-depth, IOC sidecars, encrypted archives |
| Guide: SARIF Integration | GitHub Code Scanning ingestion |
Relationship to PyDepGuard
PyDepGuard is a broader Python security framework covering runtime sandboxing and dependency management. The startup-vector engine in pydepgate is intended to integrate with PyDepGuard as a subsystem. Until then, the two projects are developed independently.
Use pydepgate for startup-vector static analysis. Use PyDepGuard for the full runtime security model.
License
Apache 2.0. See LICENSE.