Agent Technical Reference

Audience: Security teams, IT administrators, and compliance officers who need to understand exactly what the agent does, how it behaves, and what it can and cannot detect.


Table of Contents

  1. How the Agent Works
  2. Scanner Coverage
  3. Known Limitations
  4. Offline Resilience
  5. Security Architecture
  6. Configuration Reference
  7. Audit Log

How the Agent Works

The Sentari agent is a single statically linked binary with no runtime dependencies. It performs three core functions:

  1. Discover Python environments by walking the filesystem and inspecting known metadata locations
  2. Extract package metadata (name, version, install path, interpreter version) from each environment
  3. Report results via JSON output (Community Edition) or server upload over mTLS (Enterprise Edition)

The agent never executes pip, conda, poetry, pipenv, or any other binary. All data is read directly from the filesystem. This eliminates command injection risks and ensures the agent works on locked-down endpoints where package managers may not be installed or accessible.
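To illustrate what reading metadata directly from the filesystem can look like, the following minimal sketch parses .dist-info/METADATA files with the Python standard library. It is not the agent's implementation; the helper name read_dist_info and the returned field names are assumptions chosen for illustration.

```python
from email.parser import HeaderParser
from pathlib import Path


def read_dist_info(site_packages: str) -> list:
    """Return name, version and path for each *.dist-info under site_packages.

    Hypothetical helper: it only reads files and never runs a package manager.
    """
    packages = []
    for meta in sorted(Path(site_packages).glob("*.dist-info/METADATA")):
        # METADATA uses RFC 822-style headers, so the email parser can read it.
        text = meta.read_text(encoding="utf-8", errors="replace")
        headers = HeaderParser().parsestr(text)
        packages.append({
            "name": headers.get("Name"),
            "version": headers.get("Version"),
            "install_path": str(meta.parent),
        })
    return packages
```

Because nothing is executed, a sketch like this behaves the same whether or not pip is installed on the endpoint.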

Scan lifecycle (Enterprise Edition)

Start
  |
  v
[Drain offline cache] --> Upload any previously queued scans
  |
  v
[Walk filesystem] --> Discover Python environments up to MaxDepth
  |
  v
[Scan environments] --> Extract package metadata (parallel workers)
  |
  v
[Generate SBOM] --> Optional local CycloneDX file
  |
  v
[Upload to server] --> mTLS HTTPS POST
  |                     |
  | (success)           | (failure)
  v                     v
[Audit: upload.success] [Queue locally] --> Retry next cycle
  |
  v
[Poll server config] --> Apply any interval/scope changes
  |
  v
[Sleep scanInterval +/- jitter]
  |
  v
(repeat)
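
The final sleep step can be sketched as follows. The size of the jitter window is not stated in this document, so the 10% default below is purely an assumption, as is the function name.

```python
import random


def next_sleep_seconds(scan_interval: float, jitter_fraction: float = 0.1) -> float:
    """Return scan_interval shifted by a random offset within +/- jitter_fraction.

    jitter_fraction is a hypothetical parameter; the agent's real jitter
    range is not documented here.
    """
    jitter = scan_interval * jitter_fraction
    return scan_interval + random.uniform(-jitter, jitter)
```

Jitter of this kind spreads uploads from many endpoints over time instead of having them all contact the server at the same moment.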

Scanner Coverage

Supported environment types

Type | Discovery marker | Metadata source | What is reported
pip (global) | site-packages/ directory | .dist-info/METADATA, .egg-info/PKG-INFO | Installed packages with exact versions
venv | pyvenv.cfg file | Same as pip (reads site-packages inside venv) | Installed packages per virtual environment
conda | conda-meta/ directory | JSON metadata files (one per package) | All packages in the conda environment
Poetry | poetry.lock file | TOML lockfile | Declared dependencies and their locked versions
Pipenv | Pipfile.lock file | JSON lockfile | Declared dependencies and their pinned versions
System (Debian/Ubuntu) | /var/lib/dpkg/status | dpkg status database | Python-related system packages
System (RHEL/CentOS/Fedora) | /var/lib/rpm/rpmdb.sqlite | RPM SQLite database | Python-related system packages
Windows Registry | HKLM/HKCU\SOFTWARE\Python\PythonCore | Registry keys + delegated to pip scanner | Registered Python installations and their packages
pyenv | ~/.pyenv/versions/ path enumeration | Same as pip (reads site-packages per version) | Packages per pyenv-managed interpreter
asdf (Python plugin) | ~/.asdf/installs/python/ path enumeration | Same as pip (reads site-packages per version) | Packages per asdf-managed interpreter

pyenv and asdf installations are discovered through explicit path enumeration in addition to the general depth-based filesystem traversal. This ensures they are reliably detected regardless of nesting depth.

Legacy editable install support

Packages installed with pip install -e are detected in both forms:

  • Modern editable installs (setuptools >= 60.0): standard .dist-info metadata, detected as part of normal pip/venv scanning
  • Legacy editable installs (setuptools < 60.0): .egg-link pointer files, detected and included in the package inventory with the egg-link install marker

Per-package metadata extracted

Field | Description | Availability
name | Package name | All types
version | Installed or declared version | All types
install_path | Filesystem path to the package metadata | pip, venv, conda
env_type | Environment type (pip, venv, conda, poetry, pipenv, system_deb, system_rpm) | All types
interpreter_version | Python version for this environment | Most types (from pyvenv.cfg, conda metadata, lockfile metadata)
installer_user | OS user who owns the package files | pip, venv (Unix)
install_date | Last modification time of metadata file | pip, venv, conda, poetry, pipenv
environment | Path to the environment root | All types

Dangling virtual environment detection

The scanner detects broken virtual environments where the base Python interpreter has been removed. These are reported as scan errors rather than being silently included, ensuring your inventory reflects only functional environments.
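
One way such a check could work is to read the home key from pyvenv.cfg, which records the directory of the base interpreter, and test whether that directory still exists. The sketch below makes that assumption; the agent's actual detection logic may differ, and both function names are hypothetical.

```python
from pathlib import Path
from typing import Optional


def venv_home(pyvenv_cfg: str) -> Optional[str]:
    """Return the value of the `home` key in pyvenv.cfg, or None if absent."""
    for line in Path(pyvenv_cfg).read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "home":
            return value.strip()
    return None


def is_dangling(venv_root: str) -> bool:
    """Treat a venv as dangling when its base interpreter directory is gone.

    Assumes venv_root contains a pyvenv.cfg file.
    """
    home = venv_home(str(Path(venv_root) / "pyvenv.cfg"))
    return home is None or not Path(home).is_dir()
```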


Known Limitations

Transparency about what the scanner does not cover is essential for accurate risk assessment.

Environment types not specifically detected

The following Python installation methods have no dedicated scanner. They are discovered only if their site-packages directory falls within the scan root and depth limit:

  • PDM (uses .venv which is detected, but pdm.lock is not parsed)
  • uv (uses .venv which is detected)
  • Nix/NixOS (/nix/store/ -- non-standard layout)
  • Flatpak / Snap -- sandboxed, not visible to the host scanner
  • Microsoft Store Python (Windows) -- may not register in standard registry keys
  • Scoop / Chocolatey (Windows) -- third-party package managers not registered in the Python registry path

Depth limitations

The default MaxDepth of 12 directory levels covers most standard installations, including pyenv and asdf environments. Environments installed deeper than this are silently skipped. If your organisation uses deeply nested directory structures, increase MaxDepth in the agent configuration or via the server-managed config.

Examples of path depth:

Path | Depth from / | Discovered at MaxDepth=12?
/usr/lib/python3.12/site-packages | 4 | Yes
/home/user/.local/lib/python3.12/site-packages | 6 | Yes
/home/user/projects/app/.venv/lib/python3.12/site-packages | 8 | Yes
/home/user/.pyenv/versions/3.12.0/lib/python3.12/site-packages | 8 | Yes
/opt/tools/internal/builds/python/3.12/lib/python3.12/site-packages | 9 | Yes
/home/user/.asdf/installs/python/3.12.0/lib/python3.12/site-packages | 9 | Yes
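
To estimate the depth of a path in your own environment, you can count its components below /. The helper below is a small sketch under that assumption and is not part of the agent.

```python
from pathlib import PurePosixPath


def depth_from_root(path: str) -> int:
    """Count the directory levels between / and the given absolute path."""
    # parts looks like ('/', 'usr', 'lib', ...); drop the leading '/'.
    return len(PurePosixPath(path).parts) - 1


print(depth_from_root("/usr/lib/python3.12/site-packages"))  # 4
print(depth_from_root("/home/user/.local/lib/python3.12/site-packages"))  # 6
```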

Lockfile scanners report declared dependencies

Poetry and Pipenv scanners parse lockfiles, which describe declared dependencies with pinned versions. These may differ from what is actually installed in the corresponding virtual environment. If both a lockfile and a venv are present, the scanner reports both independently.

System package scanners use name-based filtering

Debian and RPM scanners filter packages by name (matching "python", "pip", "pypy", "jython"). This may include non-application packages (e.g., python3-dev, python-minimal) and miss Python libraries installed under non-standard names.
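
The effect of name-based filtering can be seen in a small sketch that parses dpkg status text. It assumes case-insensitive substring matching, which this document does not confirm, and the function names are hypothetical. Under that assumption a package such as libpipeline1 would be reported because its name contains "pip".

```python
KEYWORDS = ("python", "pip", "pypy", "jython")


def is_python_related(package_name: str) -> bool:
    """Assumed rule: the name contains any keyword (case-insensitive)."""
    name = package_name.lower()
    return any(keyword in name for keyword in KEYWORDS)


def python_packages(status_text: str) -> list:
    """Return (name, version) pairs from dpkg status text that pass the filter."""
    results = []
    # Stanzas in the dpkg status file are separated by blank lines.
    for stanza in status_text.strip().split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            if line[:1] in (" ", "\t"):
                continue  # continuation of a multi-line field
            key, sep, value = line.partition(":")
            if sep:
                fields[key] = value.strip()
        name = fields.get("Package", "")
        if name and is_python_related(name):
            results.append((name, fields.get("Version", "")))
    return results
```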

Symlinked directories are not traversed

To prevent infinite loops and traversal outside the scan root, the scanner skips symlinked directories during filesystem traversal. If Python environments are exposed exclusively through symlinks, they will not be discovered.
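
The traversal behaviour described above can be approximated with os.walk, which does not descend into symlinked directories when followlinks is False. This is an illustrative sketch only: the function name, the marker list, and the exact way depth is counted are assumptions.

```python
import os


def find_markers(scan_root: str, max_depth: int = 12, markers=("pyvenv.cfg",)) -> list:
    """Return paths of marker files found at most max_depth levels below scan_root."""
    found = []
    root_depth = scan_root.rstrip(os.sep).count(os.sep)
    # followlinks=False: symlinked directories are listed but never entered.
    for dirpath, dirnames, filenames in os.walk(scan_root, followlinks=False):
        depth = dirpath.rstrip(os.sep).count(os.sep) - root_depth
        if depth >= max_depth:
            dirnames[:] = []  # prune: do not descend any further
        found.extend(os.path.join(dirpath, name) for name in filenames if name in markers)
    return found
```

An environment reachable only through a symlink is absent from the result, which mirrors the limitation described in this section.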

Legacy editable installs

Packages installed via pip install -e using modern setuptools (>= 60.0) create standard .dist-info metadata and are detected normally. Legacy editable installs that use .egg-link files (setuptools below 60.0) are also detected and included in the package inventory.


Offline Resilience

The Enterprise Edition is designed for air-gapped and intermittently connected environments.

Offline scan queuing

When the server is unreachable:

  1. The scan completes normally on the endpoint
  2. Results are stored in a local SQLite queue (cache.db) in the agent's data directory
  3. An audit log entry records the upload failure
  4. On the next scan cycle, the agent attempts to drain the queue (oldest first) before running a new scan
  5. Successfully uploaded scans are marked as delivered

Important: Scans accumulate in the queue during extended outages. There is no data loss as long as the local disk has sufficient space.
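
The queue-and-drain behaviour can be sketched with SQLite as follows. The table name, column names, and function names are hypothetical and do not describe the real cache.db schema.

```python
import json
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS scan_queue (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    payload TEXT NOT NULL,
    delivered INTEGER NOT NULL DEFAULT 0
)
"""


def init_queue(db: sqlite3.Connection) -> None:
    db.execute(SCHEMA)
    db.commit()


def enqueue(db: sqlite3.Connection, scan: dict) -> None:
    """Store a scan result that could not be uploaded."""
    db.execute("INSERT INTO scan_queue (payload) VALUES (?)", (json.dumps(scan),))
    db.commit()


def drain(db: sqlite3.Connection, upload) -> int:
    """Upload queued scans oldest first; stop at the first failure.

    `upload` is a caller-supplied function that returns True on success.
    """
    rows = db.execute(
        "SELECT id, payload FROM scan_queue WHERE delivered = 0 ORDER BY id"
    ).fetchall()
    sent = 0
    for row_id, payload in rows:
        if not upload(json.loads(payload)):
            break  # server still unreachable; keep the rest queued
        db.execute("UPDATE scan_queue SET delivered = 1 WHERE id = ?", (row_id,))
        db.commit()
        sent += 1
    return sent
```

Committing after each successful upload means a crash in the middle of a drain does not cause already delivered scans to be sent again.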

Behaviour during extended outages

Duration | Scans queued (at 1h interval) | Approximate queue size
1 hour | 1 | ~50 KB
24 hours | 24 | ~1.2 MB
7 days | 168 | ~8.4 MB
30 days | 720 | ~36 MB

Queue sizes depend on the number of packages per scan. The estimates above assume ~500 packages per scan.
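
For other intervals or outage durations, the same arithmetic can be applied directly. The sketch below assumes ~50 KB per scan, as in the estimates above; the function and parameter names are illustrative only.

```python
def estimated_queue_kb(outage_hours: float, scan_interval_s: int = 3600,
                       kb_per_scan: float = 50.0) -> float:
    """Estimate queue growth: completed scans during the outage times size per scan."""
    scans = int(outage_hours * 3600 // scan_interval_s)
    return scans * kb_per_scan


print(estimated_queue_kb(24))       # 1200.0 KB, roughly 1.2 MB
print(estimated_queue_kb(24, 900))  # 4800.0 KB at a 15-minute interval
```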

What is NOT queued

  • Server configuration poll failures (agent uses its last known configuration)
  • Audit log entries (stored locally, shipped separately when connectivity resumes)
  • SBOM files (always written locally if --sbom-out is configured)

Security Architecture

Zero binary invocation

The agent reads metadata files and databases directly. It never executes pip, conda, python, or any other binary. This eliminates:

  • Command injection vulnerabilities
  • Dependency on package manager availability
  • Risk of triggering package manager side effects (network access, script execution)

Mutual TLS (mTLS)

All agent-server communication uses TLS 1.3 with mutual certificate authentication:

  1. On first registration, the agent generates an ECDSA P-256 key pair locally
  2. A Certificate Signing Request (CSR) is sent to the server -- the private key never leaves the endpoint
  3. The server's internal CA signs the CSR and returns the device certificate
  4. All subsequent communication requires both the agent and server to present valid certificates

Bootstrap security

During initial registration (before the agent has a CA certificate), you can pin the server's identity using:

sentari-agent-enterprise \
  --server-url https://sentari.yourorg.com:8000 \
  --enroll-token <token> \
  --enroll-token-file /path/to/token-file \
  --bootstrap-ca-fingerprint "aa:bb:cc:..." \
  --upload

  • --enroll-token-file reads the token from a file, as an alternative to passing it with --enroll-token on the command line. This prevents exposure of the token via /proc/<pid>/cmdline and process listings on multi-user systems
  • --bootstrap-ca-fingerprint verifies the server's TLS certificate SHA-256 fingerprint during the first connection, preventing man-in-the-middle attacks even when the system trust store is compromised
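
Fingerprint pinning of this kind can be sketched as follows. The example assumes the fingerprint is the SHA-256 digest of the DER-encoded certificate, which is the common convention but is not confirmed by this document; the function names are hypothetical.

```python
import hashlib
import hmac


def sha256_fingerprint(der_cert: bytes) -> str:
    """Return the SHA-256 digest of a DER certificate as colon-separated hex."""
    digest = hashlib.sha256(der_cert).hexdigest()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def fingerprint_matches(der_cert: bytes, pinned: str) -> bool:
    """Compare the presented certificate against the pinned fingerprint."""
    presented = sha256_fingerprint(der_cert).encode("utf-8")
    expected = pinned.strip().lower().encode("utf-8")
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(presented, expected)
```

A value in the same format can be produced from a certificate file with openssl x509 -noout -fingerprint -sha256.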

Tamper-evident audit log

Every agent action is recorded in a local SQLite audit log with SHA-256 hash chaining. Each entry includes:

  • Event type and detail
  • Timestamp
  • SHA-256 hash of (event + detail + previous hash + timestamp)

Database triggers prevent modification or deletion of audit entries. Any tampering breaks the hash chain and is detectable during server-side verification.
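
The combination of hash chaining and append-only triggers can be sketched with SQLite as shown below. The schema, trigger names, and the plain string concatenation of the hashed fields are assumptions for illustration; the agent's real audit.db layout and field encoding are not documented here.

```python
import hashlib
import sqlite3

SCHEMA = """
CREATE TABLE audit (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    event_type TEXT NOT NULL,
    detail TEXT NOT NULL,
    timestamp TEXT NOT NULL,
    prev_hash TEXT NOT NULL,
    content_hash TEXT NOT NULL
);
CREATE TRIGGER audit_no_update BEFORE UPDATE ON audit
BEGIN SELECT RAISE(ABORT, 'audit log is append-only'); END;
CREATE TRIGGER audit_no_delete BEFORE DELETE ON audit
BEGIN SELECT RAISE(ABORT, 'audit log is append-only'); END;
"""


def append_event(db: sqlite3.Connection, event_type: str, detail: str, timestamp: str) -> str:
    """Append one entry whose hash covers the previous entry's hash."""
    row = db.execute("SELECT content_hash FROM audit ORDER BY id DESC LIMIT 1").fetchone()
    prev_hash = row[0] if row else ""  # empty for the first entry
    # Assumed encoding: UTF-8 bytes of the concatenated fields.
    material = (event_type + detail + prev_hash + timestamp).encode("utf-8")
    content_hash = hashlib.sha256(material).hexdigest()
    db.execute(
        "INSERT INTO audit (event_type, detail, timestamp, prev_hash, content_hash) "
        "VALUES (?, ?, ?, ?, ?)",
        (event_type, detail, timestamp, prev_hash, content_hash),
    )
    db.commit()
    return content_hash
```

Triggers of this kind stop accidental or casual edits through SQL. A party with write access to the database file could still replace it wholesale, which is why the hash chain is verified separately.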

File permissions

File | Permissions | Contents
Data directory | 0700 | Parent directory for all agent data
device.key | 0600 | Agent's private key
device.crt | 0644 | Agent's signed certificate
ca.crt | 0644 | Server CA certificate
audit.db | 0600 | Audit log database
cache.db | 0600 | Offline scan queue
SBOM output | 0600 | Generated SBOM files

Network behaviour

  • All connections are outbound-only -- the agent initiates; no inbound ports are required
  • HTTP redirects are rejected -- the agent never follows redirects, preventing redirection attacks
  • Server error responses are truncated in logs to prevent leaking server internals
  • Server-pushed scan_root changes are validated against a denylist of sensitive directories (/etc, /root, /home, /proc, /sys, /var/log, /dev, /run)
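
A denylist check for a server-pushed scan_root could look like the sketch below. It assumes that subdirectories of a denylisted directory are also rejected and that paths are normalised before comparison; neither detail is stated in this document, and the function name is hypothetical.

```python
import posixpath
from pathlib import PurePosixPath

DENYLIST = ("/etc", "/root", "/home", "/proc", "/sys", "/var/log", "/dev", "/run")


def is_allowed_scan_root(scan_root: str) -> bool:
    """Reject relative paths, denylisted directories, and anything below them."""
    if not scan_root.startswith("/"):
        return False
    # Collapse "..", "." and repeated slashes so "/opt/../etc" is seen as "/etc".
    normalized = "/" + posixpath.normpath(scan_root).lstrip("/")
    candidate = PurePosixPath(normalized)
    for denied in map(PurePosixPath, DENYLIST):
        if candidate == denied or denied in candidate.parents:
            return False
    return True
```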

Configuration Reference

CLI flags (Enterprise Edition)

Flag | Description | Default
--upload | One-shot: scan and upload, then exit | --
--serve | Daemon: scan continuously on a schedule | --
--server-url | Server URL (overrides config file) | From config file
--config | Path to agent configuration file | --
--enroll-token | Enrollment token for first registration | --
--enroll-token-file | Path to file containing enrollment token | --
--bootstrap-ca-fingerprint | SHA-256 fingerprint of server TLS cert (hex, colon-separated) | --
--sbom-out | Write CycloneDX SBOM to this path after each scan | --
--data-dir | Override data directory | /var/lib/sentari
--version | Print version and exit | --

Configuration file (INI format)

[server]
url = https://sentari.yourorg.com:8000
cert_file = /var/lib/sentari/certs/device.crt
key_file = /var/lib/sentari/certs/device.key
ca_cert_file = /var/lib/sentari/certs/ca.crt

[scanner]
scan_root = /
max_depth = 12
interval = 3600

[proxy]
https_proxy = http://proxy.corp.example:3128
no_proxy = sentari.internal,.corp.example
auth_user = proxyuser
auth_pass_file = /etc/sentari/proxy-password
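
If you generate or validate this file with tooling, a standard INI parser is sufficient. The sketch below uses Python's configparser; the function name is hypothetical, and the fallback values are assumptions taken from the example above rather than confirmed agent defaults (only the MaxDepth default of 12 is stated in this document).

```python
import configparser


def load_agent_config(text: str) -> dict:
    """Parse the INI text and return the settings used in this example."""
    # interpolation=None keeps literal '%' characters in values intact.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {
        "url": parser.get("server", "url"),
        "scan_root": parser.get("scanner", "scan_root", fallback="/"),
        "max_depth": parser.getint("scanner", "max_depth", fallback=12),
        "interval": parser.getint("scanner", "interval", fallback=3600),
        "https_proxy": parser.get("proxy", "https_proxy", fallback=None),
    }
```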

Server-managed configuration

In daemon mode, the agent polls the server for configuration updates. The server can adjust:

  • scan_interval -- seconds between scans
  • scan_root -- filesystem root to scan (validated against a denylist)
  • max_depth -- maximum traversal depth

Changes take effect on the next scan cycle without requiring an agent restart.


Audit Log

Recorded events

Event | When | Detail
agent.registered | First successful registration | device_id=<uuid>
scan.started | Beginning of each scan cycle | hostname=<hostname>
scan.completed | Scan finished successfully | packages=<count>
scan.failed | Scan encountered a fatal error | Error message
upload.success | Results uploaded to server | packages=<count>
upload.failed | Upload failed (results queued) | Error message
config.updated | Server pushed new configuration | scan_interval=<seconds>
agent.shutdown | Agent received termination signal | signal=SIGINT or signal=SIGTERM

Integrity verification

Each audit entry contains:

  • content_hash: SHA-256 of (event_type + detail + prev_hash + timestamp)
  • prev_hash: Hash of the previous entry (empty for the first entry)

To verify the chain, iterate entries in order and confirm that each prev_hash equals the previous entry's content_hash and that each content_hash matches the recomputed value. Any mismatch indicates tampering.
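
A verification pass could be sketched as follows. The example assumes the hashed fields are concatenated as plain UTF-8 strings with no separator, which this document does not specify; the function names and the dictionary keys are chosen for illustration.

```python
import hashlib
from typing import Optional


def entry_hash(event_type: str, detail: str, prev_hash: str, timestamp: str) -> str:
    """Recompute content_hash under the assumed concatenation scheme."""
    material = (event_type + detail + prev_hash + timestamp).encode("utf-8")
    return hashlib.sha256(material).hexdigest()


def verify_chain(entries: list) -> Optional[int]:
    """Return the index of the first inconsistent entry, or None if intact."""
    expected_prev = ""  # the first entry has an empty prev_hash
    for index, entry in enumerate(entries):
        recomputed = entry_hash(
            entry["event_type"], entry["detail"], entry["prev_hash"], entry["timestamp"]
        )
        if entry["prev_hash"] != expected_prev or entry["content_hash"] != recomputed:
            return index
        expected_prev = entry["content_hash"]
    return None
```

Returning the index of the first broken entry, rather than a plain true or false, shows where in the log an investigation should start.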