Agent Technical Reference

Audience: Security teams, IT administrators, and compliance officers who need to understand exactly what the agent does, how it behaves, and what it can and cannot detect.

Table of Contents

How the Agent Works
Scanner Coverage
Known Limitations
Offline Resilience
Security Architecture
Configuration Reference
Audit Log

How the Agent Works

The Sentari agent is a single statically-linked binary with zero runtime dependencies. It performs three core functions:

Discover Python environments by walking the filesystem and inspecting known metadata locations
Extract package metadata (name, version, install path, interpreter version) from each environment
Report results via JSON output (Community Edition) or server upload over mTLS (Enterprise Edition)

The agent never executes pip, conda, poetry, pipenv, or any other binary. All data is read directly from the filesystem. This eliminates command injection risks and ensures the agent works on locked-down endpoints where package managers may not be installed or accessible.

Scan lifecycle (Enterprise Edition)

Start
  |
  v
[Drain offline cache] --> Upload any previously queued scans
  |
  v
[Walk filesystem] --> Discover Python environments up to MaxDepth
  |
  v
[Scan environments] --> Extract package metadata (parallel workers)
  |
  v
[Generate SBOM] --> Optional local CycloneDX file
  |
  v
[Upload to server] --> mTLS HTTPS POST
  |                     |
  | (success)           | (failure)
  v                     v
[Audit: upload.success] [Queue locally] --> Retry next cycle
  |
  v
[Poll server config] --> Apply any interval/scope changes
  |
  v
[Sleep scanInterval +/- jitter]
  |
  v
(repeat)
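
The final sleep step can be pictured with a minimal sketch. The source only says the interval is varied by some jitter; the 10% fraction and the function name next_sleep_seconds are assumptions made for illustration.

```python
import random
from typing import Optional


def next_sleep_seconds(scan_interval: float,
                       jitter_fraction: float = 0.1,
                       rng: Optional[random.Random] = None) -> float:
    """Return scan_interval shifted by a random offset of at most +/- jitter_fraction.

    jitter_fraction is a hypothetical parameter; the real agent's jitter
    range is not documented here.
    """
    rng = rng or random.Random()
    offset = rng.uniform(-jitter_fraction, jitter_fraction) * scan_interval
    return max(0.0, scan_interval + offset)
```

Randomising the sleep keeps a fleet of agents that started at the same moment from uploading in lockstep.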

Scanner Coverage

Supported environment types

Type	Discovery marker	Metadata source	What is reported
pip (global)	`site-packages/` directory	`.dist-info/METADATA`, `.egg-info/PKG-INFO`	Installed packages with exact versions
venv	`pyvenv.cfg` file	Same as pip (reads site-packages inside venv)	Installed packages per virtual environment
conda	`conda-meta/` directory	JSON metadata files (one per package)	All packages in the conda environment
Poetry	`poetry.lock` file	TOML lockfile	Declared dependencies and their locked versions
Pipenv	`Pipfile.lock` file	JSON lockfile	Declared dependencies and their pinned versions
System (Debian/Ubuntu)	`/var/lib/dpkg/status`	dpkg status database	Python-related system packages
System (RHEL/CentOS/Fedora)	`/var/lib/rpm/rpmdb.sqlite`	RPM SQLite database	Python-related system packages
Windows Registry	`HKLM/HKCU\SOFTWARE\Python\PythonCore`	Registry keys + delegated to pip scanner	Registered Python installations and their packages
pyenv	`~/.pyenv/versions/` path enumeration	Same as pip (reads site-packages per version)	Packages per pyenv-managed interpreter
asdf (Python plugin)	`~/.asdf/installs/python/` path enumeration	Same as pip (reads site-packages per version)	Packages per asdf-managed interpreter

pyenv and asdf installations are discovered through explicit path enumeration in addition to the general depth-based filesystem traversal. This ensures they are reliably detected regardless of nesting depth.
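
To illustrate how package metadata can be read without running pip, here is a minimal sketch that parses a .dist-info/METADATA file, which uses email-style headers. The function name read_dist_info and the returned dictionary shape are hypothetical; this is not the agent's own code.

```python
from email.parser import HeaderParser
from pathlib import Path


def read_dist_info(dist_info_dir: Path) -> dict:
    """Read name and version from <pkg>.dist-info/METADATA by parsing its headers."""
    text = (dist_info_dir / "METADATA").read_text(encoding="utf-8", errors="replace")
    headers = HeaderParser().parsestr(text)
    return {
        "name": headers.get("Name"),
        "version": headers.get("Version"),
        # Hypothetical field mapping onto the install_path field described in this document.
        "install_path": str(dist_info_dir),
    }
```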

Legacy editable install support

Packages installed with pip install -e are detected in both forms:

Modern editable installs (setuptools >= 60.0): standard .dist-info metadata, detected as part of normal pip/venv scanning
Legacy editable installs (setuptools < 60.0): .egg-link pointer files, detected and included in the package inventory with the egg-link install marker
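
A legacy .egg-link file is a small text file whose first line is the path to the project's source directory. The sketch below shows one way such a pointer could be read; the function name and the dictionary keys are assumptions for illustration.

```python
from pathlib import Path


def read_egg_link(egg_link: Path) -> dict:
    """Return the project name and source directory recorded in a .egg-link file."""
    lines = egg_link.read_text(encoding="utf-8").splitlines()
    source_path = lines[0].strip() if lines else ""
    return {
        "name": egg_link.stem,          # "mypkg.egg-link" -> "mypkg"
        "source_path": source_path,     # directory the editable install points at
        "install_marker": "egg-link",   # hypothetical key name for the marker
    }
```

The version is not stored in the pointer file itself; a scanner would have to read it from metadata in the source directory.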

Per-package metadata extracted

Field	Description	Availability
`name`	Package name	All types
`version`	Installed or declared version	All types
`install_path`	Filesystem path to the package metadata	pip, venv, conda
`env_type`	Environment type (pip, venv, conda, poetry, pipenv, system_deb, system_rpm)	All types
`interpreter_version`	Python version for this environment	Most types (from pyvenv.cfg, conda metadata, lockfile metadata)
`installer_user`	OS user who owns the package files	pip, venv (Unix)
`install_date`	Last modification time of metadata file	pip, venv, conda, poetry, pipenv
`environment`	Path to the environment root	All types

Dangling virtual environment detection

The scanner detects broken virtual environments where the base Python interpreter has been removed. These are reported as scan errors rather than being silently included, ensuring your inventory reflects only functional environments.
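
One plausible way to spot a dangling environment is to read the home key from pyvenv.cfg and check whether an interpreter still exists there. This is a sketch under that assumption; the candidate executable names and both function names are illustrative, not taken from the agent.

```python
from pathlib import Path


def parse_pyvenv_cfg(cfg_path: Path) -> dict:
    """Parse the simple 'key = value' lines of a pyvenv.cfg file."""
    values = {}
    for line in cfg_path.read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep:
            values[key.strip().lower()] = value.strip()
    return values


def is_dangling(venv_root: Path) -> bool:
    """Return True when the base interpreter directory no longer holds a Python binary."""
    home = parse_pyvenv_cfg(venv_root / "pyvenv.cfg").get("home")
    if not home:
        return True
    # Assumed executable names; a real check may look for others.
    candidates = ("python3", "python", "python.exe")
    return not any((Path(home) / name).exists() for name in candidates)
```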

Known Limitations

Transparency about what the scanner does not cover is essential for accurate risk assessment.

Environment types not specifically detected

The following Python installation methods are only discovered if their site-packages directory happens to be within the scan root and depth:

PDM (uses .venv which is detected, but pdm.lock is not parsed)
uv (uses .venv which is detected)
Nix/NixOS (/nix/store/ -- non-standard layout)
Flatpak / Snap -- sandboxed, not visible to the host scanner
Microsoft Store Python (Windows) -- may not register in standard registry keys
Scoop / Chocolatey (Windows) -- third-party package managers not registered in the Python registry path

Depth limitations

The default MaxDepth of 12 directory levels covers most standard installations, including pyenv and asdf environments. Environments installed deeper than this are silently skipped. If your organisation uses deeply nested directory structures, increase MaxDepth in the agent configuration or via the server-managed config.

Examples of path depth:

Path	Depth from `/`	Discovered at MaxDepth=12?
`/usr/lib/python3.12/site-packages`	4	Yes
`/home/user/.local/lib/python3.12/site-packages`	6	Yes
`/home/user/projects/app/.venv/lib/python3.12/site-packages`	8	Yes
`/home/user/.pyenv/versions/3.12.0/lib/python3.12/site-packages`	8	Yes
`/opt/tools/internal/builds/python/3.12/lib/python3.12/site-packages`	9	Yes
`/home/user/.asdf/installs/python/3.12.0/lib/python3.12/site-packages`	9	Yes
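
Depth in the table above can be read as the number of path components below /. The helper below is a small sketch for checking your own paths against MaxDepth, assuming the agent counts depth this way; the function name is hypothetical.

```python
from pathlib import PurePosixPath


def depth_from_root(path: str) -> int:
    """Count the directory levels between / and the given absolute POSIX path."""
    pure = PurePosixPath(path)
    if not pure.is_absolute():
        raise ValueError("expected an absolute path")
    return len(pure.parts) - 1  # parts[0] is the root "/" itself
```

For example, depth_from_root("/usr/lib/python3.12/site-packages") returns 4.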

Lockfile scanners report declared dependencies

Poetry and Pipenv scanners parse lockfiles, which describe declared dependencies with pinned versions. These may differ from what is actually installed in the corresponding virtual environment. If both a lockfile and a venv are present, the scanner reports both independently.
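
As an illustration of what "declared" means here, the sketch below reads pinned versions from the default and develop sections of a Pipfile.lock. It never looks at site-packages, so the result can drift from what is installed. The function name and output shape are assumptions.

```python
import json


def pipfile_lock_packages(lock_text: str) -> list:
    """Return declared packages from Pipfile.lock content (not what is installed)."""
    data = json.loads(lock_text)
    packages = []
    for section in ("default", "develop"):
        for name, spec in data.get(section, {}).items():
            version = spec.get("version", "")  # entries such as VCS dependencies may lack it
            if version.startswith("=="):
                version = version[2:]
            packages.append({"name": name, "version": version, "env_type": "pipenv"})
    return packages
```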

System package scanners use name-based filtering

Debian and RPM scanners filter packages by name (matching "python", "pip", "pypy", "jython"). This may include non-application packages (e.g., python3-dev, python-minimal) and miss Python libraries installed under non-standard names.
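
The effect of name-based filtering can be seen in a small sketch that parses dpkg status stanzas and keeps packages whose name contains one of the markers. Substring matching is an assumption here; the source does not say whether the agent matches substrings, prefixes, or whole words.

```python
NAME_MARKERS = ("python", "pip", "pypy", "jython")


def python_related_packages(status_text: str) -> list:
    """Filter dpkg status stanzas by package name (assumed substring match)."""
    packages = []
    for stanza in status_text.split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            if line[:1] in (" ", "\t"):
                continue  # continuation of a multi-line field such as Description
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        name = fields.get("Package", "")
        if any(marker in name.lower() for marker in NAME_MARKERS):
            packages.append({
                "name": name,
                "version": fields.get("Version", ""),
                "env_type": "system_deb",
            })
    return packages
```

Under the substring assumption, a package such as libpipeline1 would match "pip" even though it has nothing to do with Python, which is the kind of noise this limitation describes.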

Symlinked directories are not traversed

To prevent infinite loops and traversal outside the scan root, the scanner skips symlinked directories during filesystem traversal. If Python environments are exposed exclusively through symlinks, they will not be discovered.
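
A depth-limited walk that never descends into symlinked directories can be sketched as follows. It looks only for pyvenv.cfg markers, and it treats the scan root as depth 0; both choices, along with the function name, are assumptions made to keep the example short.

```python
import os


def find_venvs(scan_root: str, max_depth: int = 12) -> list:
    """Return directories containing pyvenv.cfg, skipping symlinks and deep paths."""
    found = []
    stack = [(scan_root, 0)]
    while stack:
        directory, depth = stack.pop()
        try:
            with os.scandir(directory) as it:
                entries = list(it)
        except OSError:
            continue  # unreadable directory: skip it rather than abort the walk
        for entry in entries:
            if entry.name == "pyvenv.cfg" and entry.is_file(follow_symlinks=False):
                found.append(directory)
            elif entry.is_dir(follow_symlinks=False) and depth < max_depth:
                # is_dir(follow_symlinks=False) is False for symlinks to directories
                stack.append((entry.path, depth + 1))
    return sorted(found)
```

Because symlinks are never followed, the walk cannot loop and cannot leave the scan root, at the cost of missing environments reachable only through a link.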

Legacy editable installs

Packages installed via pip install -e using modern setuptools (>= 60.0) create standard .dist-info metadata and are detected normally. Legacy editable installs that use .egg-link files (setuptools below 60.0) are also detected and included in the package inventory.

Offline Resilience

The Enterprise Edition is designed for air-gapped and intermittently connected environments.

Offline scan queuing

When the server is unreachable:

The scan completes normally on the endpoint
Results are stored in a local SQLite queue (cache.db) in the agent's data directory
An audit log entry records the upload failure
On the next scan cycle, the agent attempts to drain the queue (oldest first) before running a new scan
Successfully uploaded scans are marked as delivered

Important: Scans accumulate in the queue during extended outages. There is no data loss as long as the local disk has sufficient space.
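
The queue-and-drain behaviour described above can be sketched with a small SQLite table. The schema, the function names, and the choice to stop draining at the first failed upload are assumptions; the actual layout of cache.db is not documented here.

```python
import sqlite3
import time


def open_queue(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a hypothetical offline scan queue."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS queue ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " created_at REAL NOT NULL,"
        " payload TEXT NOT NULL,"
        " delivered INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def enqueue(db: sqlite3.Connection, payload: str) -> None:
    """Store one scan result for later upload."""
    with db:
        db.execute(
            "INSERT INTO queue (created_at, payload) VALUES (?, ?)",
            (time.time(), payload),
        )


def drain(db: sqlite3.Connection, upload) -> int:
    """Upload pending scans oldest first; stop at the first failure."""
    rows = db.execute(
        "SELECT id, payload FROM queue WHERE delivered = 0 ORDER BY id"
    ).fetchall()
    sent = 0
    for row_id, payload in rows:
        if not upload(payload):
            break  # server still unreachable: keep the rest for the next cycle
        with db:
            db.execute("UPDATE queue SET delivered = 1 WHERE id = ?", (row_id,))
        sent += 1
    return sent
```

Here upload is any callable that returns True on success, for example a function wrapping the HTTPS POST.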

Behaviour during extended outages

Duration	Scans queued (at 1h interval)	Approximate queue size
1 hour	1	~50 KB
24 hours	24	~1.2 MB
7 days	168	~8 MB
30 days	720	~36 MB

Queue sizes depend on the number of packages per scan. The estimates above assume ~500 packages per scan.
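
The table follows from simple arithmetic: one scan per interval, at roughly 50 KB per scan. The sketch below lets you redo the estimate for a different interval or payload size; the function name and the 50,000-byte default (taken from the first table row) are illustrative.

```python
def estimated_queue(outage_hours: float,
                    scan_interval_s: int = 3600,
                    bytes_per_scan: int = 50_000) -> tuple:
    """Return (queued_scans, approximate_bytes) for an outage of the given length."""
    scans = int(outage_hours * 3600 // scan_interval_s)
    return scans, scans * bytes_per_scan
```

For example, estimated_queue(24) returns (24, 1200000), which matches the ~1.2 MB row.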

What is NOT queued

Server configuration poll failures (agent uses its last known configuration)
Audit log entries (stored locally, shipped separately when connectivity resumes)
SBOM files (always written locally if --sbom-out is configured)

Security Architecture

Zero binary invocation

The agent reads metadata files and databases directly. It never executes pip, conda, python, or any other binary. This eliminates:

Command injection vulnerabilities
Dependency on package manager availability
Risk of triggering package manager side effects (network access, script execution)

Mutual TLS (mTLS)

All agent-server communication uses TLS 1.3 with mutual certificate authentication:

On first registration, the agent generates an ECDSA P-256 key pair locally
A Certificate Signing Request (CSR) is sent to the server -- the private key never leaves the endpoint
The server's internal CA signs the CSR and returns the device certificate
All subsequent communication requires both the agent and server to present valid certificates
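
For readers who want to test the server from a script, the client side of such a connection can be sketched with Python's ssl module. This is not the agent's implementation; the function names are hypothetical, and the file arguments correspond to the ca.crt, device.crt, and device.key files named in this document.

```python
import ssl


def base_client_context() -> ssl.SSLContext:
    """Client context that verifies the server and refuses anything below TLS 1.3."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)  # CERT_REQUIRED + hostname check
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx


def mtls_context(ca_file: str, cert_file: str, key_file: str) -> ssl.SSLContext:
    """Trust only the given CA and present the device certificate to the server."""
    ctx = base_client_context()
    ctx.load_verify_locations(cafile=ca_file)
    ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx
```

Starting from a bare SSLContext rather than the system defaults means only the internal CA is trusted.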

Bootstrap security

During initial registration (before the agent has a CA certificate), you can pin the server's identity using:

sentari-agent-enterprise \
  --server-url https://sentari.yourorg.com:8000 \
  --enroll-token-file /path/to/token-file \
  --bootstrap-ca-fingerprint "aa:bb:cc:..." \
  --upload

--enroll-token-file reads the token from a file instead of a CLI argument, preventing exposure through the process list (ps output or /proc/<pid>/cmdline) on multi-user systems
--bootstrap-ca-fingerprint verifies the server's TLS certificate SHA-256 fingerprint during the first connection, preventing man-in-the-middle attacks even when the system trust store is compromised
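
Fingerprint pinning amounts to hashing the certificate the server presents and comparing the result with the pinned value. The sketch below assumes the fingerprint is the SHA-256 of the DER-encoded certificate, written as colon-separated hex; the function names are hypothetical.

```python
import hashlib
import hmac


def cert_fingerprint(der_bytes: bytes) -> str:
    """Return the SHA-256 fingerprint as lowercase, colon-separated hex."""
    digest = hashlib.sha256(der_bytes).hexdigest()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def fingerprint_matches(der_bytes: bytes, pinned: str) -> bool:
    """Compare the computed fingerprint with the pinned one in constant time."""
    expected = pinned.strip().lower().encode("utf-8")
    actual = cert_fingerprint(der_bytes).encode("ascii")
    return hmac.compare_digest(actual, expected)
```

With Python's ssl module, the DER bytes of the peer certificate are available from SSLSocket.getpeercert(binary_form=True).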

Tamper-evident audit log

Every agent action is recorded in a local SQLite audit log with SHA-256 hash chaining. Each entry includes:

Event type and detail
Timestamp
SHA-256 hash of (event + detail + previous hash + timestamp)

Database triggers prevent modification or deletion of audit entries. Any tampering breaks the hash chain and is detectable during server-side verification.
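
Append-only behaviour of this kind can be built from SQLite triggers that abort any UPDATE or DELETE. The schema and trigger names below are assumptions for illustration; the actual layout of audit.db is not documented here.

```python
import sqlite3

# Hypothetical schema: the column names follow the fields described in this document.
SCHEMA = """
CREATE TABLE audit (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    event_type   TEXT NOT NULL,
    detail       TEXT NOT NULL,
    timestamp    TEXT NOT NULL,
    prev_hash    TEXT NOT NULL,
    content_hash TEXT NOT NULL
);
CREATE TRIGGER audit_no_update BEFORE UPDATE ON audit
BEGIN
    SELECT RAISE(ABORT, 'audit entries are immutable');
END;
CREATE TRIGGER audit_no_delete BEFORE DELETE ON audit
BEGIN
    SELECT RAISE(ABORT, 'audit entries are immutable');
END;
"""


def open_audit_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create an audit table whose rows can be inserted but not changed or removed."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db
```

Triggers stop accidental or casual edits through SQL. Someone with write access to the file could still drop them, which is why the hash chain is needed as a second line of defence.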

File permissions

File	Permissions	Contents
Data directory	`0700`	Parent directory for all agent data
`device.key`	`0600`	Agent's private key
`device.crt`	`0644`	Agent's signed certificate
`ca.crt`	`0644`	Server CA certificate
`audit.db`	`0600`	Audit log database
`cache.db`	`0600`	Offline scan queue
SBOM output	`0600`	Generated SBOM files
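
On Unix systems, the modes in the table can be checked or reproduced with a few standard-library calls. The sketch below writes a file with mode 0600 and reads a mode back; the function names are hypothetical, and it does not apply to Windows.

```python
import os
import stat


def write_private_file(path: str, data: bytes) -> None:
    """Create or overwrite a file that only its owner can read and write (0600)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as handle:
        os.fchmod(handle.fileno(), 0o600)  # also tightens a pre-existing file
        handle.write(data)


def mode_of(path: str) -> int:
    """Return the permission bits of a path, e.g. 0o600."""
    return stat.S_IMODE(os.stat(path).st_mode)
```

An administrator could compare mode_of() for each file in the data directory against the table to audit an endpoint.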

Network behaviour

All connections are outbound-only -- the agent initiates; no inbound ports are required
HTTP redirects are rejected -- the agent never follows redirects, preventing redirection attacks
Server error responses are truncated in logs to prevent leaking server internals
Server-pushed scan_root changes are validated against a denylist of sensitive directories (/etc, /root, /home, /proc, /sys, /var/log, /dev, /run)
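
The denylist check on scan_root can be sketched as a path-prefix test. The sketch assumes that a denylisted directory and everything beneath it are rejected, and that relative paths are refused; the source lists the directories but does not spell out these rules. Symlinks are not resolved here.

```python
import posixpath
from pathlib import PurePosixPath

DENYLIST = ("/etc", "/root", "/home", "/proc", "/sys", "/var/log", "/dev", "/run")


def scan_root_allowed(scan_root: str) -> bool:
    """Return False for denylisted directories, their descendants, and relative paths."""
    if not scan_root.startswith("/"):
        return False
    # Collapse "..", "." and repeated slashes before comparing.
    normalized = PurePosixPath(posixpath.normpath("/" + scan_root.lstrip("/")))
    for denied in DENYLIST:
        denied_path = PurePosixPath(denied)
        if normalized == denied_path or denied_path in normalized.parents:
            return False
    return True
```

Normalising first means a value such as /opt/../etc is treated as /etc and rejected.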

Configuration Reference

CLI flags (Enterprise Edition)

Flag	Description	Default
`--upload`	One-shot: scan and upload, then exit	--
`--serve`	Daemon: scan continuously on a schedule	--
`--server-url`	Server URL (overrides config file)	From config file
`--config`	Path to agent configuration file	--
`--enroll-token`	Enrollment token for first registration	--
`--enroll-token-file`	Path to file containing enrollment token	--
`--bootstrap-ca-fingerprint`	SHA-256 fingerprint of server TLS cert (hex, colon-separated)	--
`--sbom-out`	Write CycloneDX SBOM to this path after each scan	--
`--data-dir`	Override data directory	`/var/lib/sentari`
`--version`	Print version and exit	--

Configuration file (INI format)

[server]
url = https://sentari.yourorg.com:8000
cert_file = /var/lib/sentari/certs/device.crt
key_file = /var/lib/sentari/certs/device.key
ca_cert_file = /var/lib/sentari/certs/ca.crt

[scanner]
scan_root = /
max_depth = 12
interval = 3600

[proxy]
https_proxy = http://proxy.corp.example:3128
no_proxy = sentari.internal,.corp.example
auth_user = proxyuser
auth_pass_file = /etc/sentari/proxy-password
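
To sanity-check a configuration file before deploying it, the sample above can be parsed with Python's configparser. This assumes the agent's INI dialect is compatible with configparser, which the source does not state; the function name and returned keys are hypothetical.

```python
import configparser


def load_agent_config(text: str) -> dict:
    """Parse agent configuration text and return the scanner and server settings."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {
        "url": parser.get("server", "url"),
        "scan_root": parser.get("scanner", "scan_root"),
        "max_depth": parser.getint("scanner", "max_depth"),
        "interval": parser.getint("scanner", "interval"),
        "https_proxy": parser.get("proxy", "https_proxy", fallback=None),
    }
```

getint raises ValueError on a non-numeric max_depth or interval, which catches typos before the agent starts.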

Server-managed configuration

In daemon mode, the agent polls the server for configuration updates. The server can adjust:

scan_interval -- seconds between scans
scan_root -- filesystem root to scan (validated against a denylist)
max_depth -- maximum traversal depth

Changes take effect on the next scan cycle without requiring an agent restart.

Audit Log

Recorded events

Event	When	Detail
`agent.registered`	First successful registration	`device_id=<uuid>`
`scan.started`	Beginning of each scan cycle	`hostname=<hostname>`
`scan.completed`	Scan finished successfully	`packages=<count>`
`scan.failed`	Scan encountered a fatal error	Error message
`upload.success`	Results uploaded to server	`packages=<count>`
`upload.failed`	Upload failed (results queued)	Error message
`config.updated`	Server pushed new configuration	`scan_interval=<seconds>`
`agent.shutdown`	Agent received termination signal	`signal=SIGINT` or `signal=SIGTERM`

Integrity verification

Each audit entry contains:

content_hash: SHA-256 of (event_type + detail + prev_hash + timestamp)
prev_hash: Hash of the previous entry (empty for the first entry)

To verify the chain, iterate entries in order and confirm each content_hash matches the recomputed value. Any mismatch indicates tampering.
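
The verification loop can be sketched as follows. The exact byte encoding is an assumption: fields are concatenated in the listed order as UTF-8 with no separators, and the hash is hex-encoded. The real agent may serialise them differently, so treat this as an outline of the technique.

```python
import hashlib


def content_hash(event_type: str, detail: str, prev_hash: str, timestamp: str) -> str:
    """SHA-256 over the concatenated fields (assumed order and encoding)."""
    material = (event_type + detail + prev_hash + timestamp).encode("utf-8")
    return hashlib.sha256(material).hexdigest()


def append_entry(chain: list, event_type: str, detail: str, timestamp: str) -> None:
    """Add an entry linked to the previous one (empty prev_hash for the first)."""
    prev_hash = chain[-1]["content_hash"] if chain else ""
    chain.append({
        "event_type": event_type,
        "detail": detail,
        "timestamp": timestamp,
        "prev_hash": prev_hash,
        "content_hash": content_hash(event_type, detail, prev_hash, timestamp),
    })


def first_broken_index(chain: list) -> int:
    """Return the index of the first inconsistent entry, or -1 if the chain is intact."""
    expected_prev = ""
    for index, entry in enumerate(chain):
        recomputed = content_hash(entry["event_type"], entry["detail"],
                                  entry["prev_hash"], entry["timestamp"])
        if entry["prev_hash"] != expected_prev or entry["content_hash"] != recomputed:
            return index
        expected_prev = entry["content_hash"]
    return -1
```

Checking prev_hash against the preceding entry as well as recomputing content_hash means that both edited and removed entries are detected.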