Agent - Overview
The DATAZONE Control Agent is a lightweight service installed on every managed host. It collects system data, executes commands, and establishes the connection to the backend.
Features
| Feature | Description |
|---|---|
| Heartbeat | Regular transmission of system metrics (CPU, RAM, Disk, Uptime) |
| System Information | Collection of hardware, software, and network data |
| Script Execution | Receiving and executing scripts from the backend |
| Remote Shell | WebSocket-based terminal session |
| Tunnel | Port forwarding over WebSocket connection |
| SSH Server | Embedded SSH server for agent SSH tunnels |
| Auto-Update | Self-update of the agent to new versions |
| Task Processing | Receiving and executing backend tasks |
Architecture
┌─────────────────────────────────────┐
│ DATAZONE Agent │
├─────────────────────────────────────┤
│ WebSocket Client │
│ ├── Heartbeat Sender (30s) │
│ ├── Task Receiver (30s) │
│ ├── Ping/Pong (25s) │
│ ├── Shell Handler │
│ └── Tunnel Handler │
├─────────────────────────────────────┤
│ System Collector │
│ ├── Hardware Cache (one-time) │
│ ├── Metrics (per heartbeat) │
│ ├── Update Cache (hourly) │
│ └── Windows Cache (background) │
├─────────────────────────────────────┤
│ Embedded SSH Server │
├─────────────────────────────────────┤
│ Auto-Updater (5 min) │
├─────────────────────────────────────┤
│ Configuration (agent.json) │
└─────────────────────────────────────┘
Communication
The agent communicates exclusively via WebSocket with the backend:
- The agent establishes an outgoing WebSocket connection to the backend
- Messages are exchanged bidirectionally over this connection
- No incoming port needs to be opened on the host
Firewall-Friendly
Since the agent only establishes an outgoing connection, it works behind NAT and firewalls without requiring any port forwarding on the host.
Reconnect Behavior
On connection loss, the agent automatically attempts to reconnect:
- Initial delay: 1 second
- Maximum delay: 30 seconds
- Algorithm: Exponential backoff (delay doubles with each failed attempt)
- Reset: After a successful connection, the delay resets to 1 second
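The backoff schedule above can be sketched as a small generator. This is a minimal illustration, not the agent's actual code; the names `reconnect_delays` and `first_delays` are hypothetical. It assumes the cap is applied after doubling, so the sequence runs 1, 2, 4, 8, 16 and then stays at 30.

```python
from typing import Iterator, List


def reconnect_delays(initial: float = 1.0, maximum: float = 30.0) -> Iterator[float]:
    """Yield the wait (in seconds) before each successive reconnect attempt."""
    delay = initial
    while True:
        yield delay
        # Double after every failed attempt, but never exceed the cap.
        delay = min(delay * 2, maximum)


def first_delays(count: int) -> List[float]:
    """Return the first `count` delays of a fresh backoff sequence."""
    gen = reconnect_delays()
    return [next(gen) for _ in range(count)]


if __name__ == "__main__":
    print(first_delays(7))
```

The reset after a successful connection corresponds to discarding the generator and creating a new one, which starts again at 1 second.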
Heartbeat Intervals and Data Collection
Timing Overview
| Operation | Interval | Description |
|---|---|---|
| Heartbeat | 30 seconds | Sends metrics and system data to the backend |
| Task Polling | 30 seconds | Checks for new tasks from the backend |
| WebSocket Ping | 25 seconds | Detects broken connections (10s timeout) |
| Auto-Update Check | 5 minutes | Checks for new agent versions |
Heartbeat Data (All Platforms)
Each heartbeat contains the following base data:
| Field | Description | Collection |
|---|---|---|
| cpu | CPU usage percentage | Real-time (500ms measurement) |
| memory | RAM usage percentage | Real-time |
| memory_total | Total system memory | Real-time |
| disk | Disk usage percentage | Real-time |
| uptime | Uptime in seconds | Real-time |
| hostname | System hostname | Real-time |
| agent_version | Agent version | Fixed |
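To make the base fields concrete, the following sketch models them as a Python dataclass and serializes them to JSON. The field names come from the table above; everything else is an assumption for illustration: the `Heartbeat` class, the `encode_heartbeat` helper, the flat JSON layout, and bytes as the unit of `memory_total` are not documented here.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Heartbeat:
    cpu: float          # CPU usage in percent
    memory: float       # RAM usage in percent
    memory_total: int   # total system memory (unit assumed: bytes)
    disk: float         # disk usage in percent
    uptime: int         # uptime in seconds
    hostname: str       # system hostname
    agent_version: str  # fixed for the lifetime of the process


def encode_heartbeat(hb: Heartbeat) -> str:
    """Serialize the base fields as one JSON object (wire format is assumed)."""
    return json.dumps(asdict(hb), sort_keys=True)


if __name__ == "__main__":
    sample = Heartbeat(
        cpu=12.5, memory=40.0, memory_total=8 * 1024**3,
        disk=55.0, uptime=3600, hostname="web01", agent_version="1.3.0",
    )
    print(encode_heartbeat(sample))
```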
Platform-Specific Data
Linux / PVE / PBS
| Data Type | Collection Interval | Details |
|---|---|---|
| Hardware Info | Once at startup | CPU, RAM, mainboard, BIOS, serial numbers |
| Pending Updates | Hourly (1h cache) | apt list --upgradable or dnf check-update |
| Docker Containers | Per heartbeat (30s) | Name, image, status, ports |
| SSH Sessions | Per heartbeat (30s) | User, source IP, login time |
| Cron Jobs | Per heartbeat (30s) | System and user crontabs |
| Load Average | Per heartbeat (30s) | 1/5/15 minute averages |
| Services | Per heartbeat (30s) | Systemd services with status |
| Disks | Per heartbeat (30s) | Partitions, mountpoints, usage |
| Network Interfaces | Per heartbeat (30s) | IPs, MAC, bytes in/out |
| Open Ports | Per heartbeat (30s) | Listening TCP/UDP with process |
PVE-specific:
| Data Type | Collection Interval | Details |
|---|---|---|
| VMs and Containers | Per heartbeat (30s) | VMID, name, status, resources |
| Storages | Per heartbeat (30s) | Storage pools and usage |
| Backup Jobs | 30 minutes (cache) | Backup job configurations |
| Backup History | 30 minutes (cache) | Backup execution history |
PBS-specific:
| Data Type | Collection Interval | Details |
|---|---|---|
| Datastores | Per heartbeat (30s) | Datastores and usage |
| Backups | Per heartbeat (30s) | Backup list and counts |
| Sync Jobs | Per heartbeat (30s) | Synchronization status |
OPNsense
| Data Type | Collection Interval | Details |
|---|---|---|
| Hardware Info | Once at startup | CPU, RAM, BIOS |
| Interfaces | Per heartbeat (30s) | Interface status, IPs, throughput |
| Gateways | Per heartbeat (30s) | Gateway status, latency, packet loss |
| VPN Tunnels | Per heartbeat (30s) | OpenVPN, IPsec, WireGuard with status |
| Routes | Per heartbeat (30s) | Routing table (since v1.3.0) |
| Certificates | 6 hours (cache) | ACME and SSL certificates with expiration date (since v1.3.0) |
| Unbound DNS Stats | Per heartbeat (30s) | DNS queries, cache hits, top domains (since v1.3.0) |
| Nginx Virtual Hosts | 1 hour (cache) | Nginx configurations and servers (since v1.3.0) |
| Services | Per heartbeat (30s) | OPNsense services with status |
| Config Change | Per heartbeat (30s) | Last config.xml change |
| External IP | Per heartbeat (30s) | WAN IP address |
| Update Status | Per heartbeat (30s) | Available firmware updates |
Windows
Windows data is collected in the background at varying intervals to minimize system load:
| Data Type | Collection Interval | Details |
|---|---|---|
| Hardware Info | 24 hours (cache) | CPU, RAM, manufacturer, model |
| OS Version/Domain | 1 hour (cache) | Build, edition, AD domain, DC status |
| Windows Services | 2 minutes (cache) | Name, status, startup type |
| Network Interfaces | 2 minutes (cache) | Adapters, IPs, DNS |
| Installed Software | 5 minutes (cache) | Via Winget, name and version |
| Scheduled Tasks | 5 minutes (cache) | Scheduled tasks |
| Disk Information | 5 minutes (cache) | Drives, size, free space |
| Windows Updates | 10 minutes (cache) | Available updates |
| Logged-in Users | Per heartbeat (30s) | Active sessions |
| RustDesk ID/Version | Per heartbeat (30s) | RustDesk remote ID and version (if installed) |
| Domain/DC Status | 1 hour (cache) | AD domain, domain controller status |
| Product Type | 1 hour (cache) | Server / Domain Controller / Workstation |
Why Different Intervals?
Windows data is collected at staggered intervals to avoid bunching PowerShell and WMI queries. Frequently changing data (users, services) is updated more often than rarely changing data (hardware, software).
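The per-data-type intervals can be pictured as independent time-to-live caches. The sketch below is a simplified illustration under stated assumptions: the `TTLCache` class is hypothetical, and it refreshes lazily when a value is read, whereas the agent is described as collecting in the background. The clock is injectable so the expiry logic can be exercised without waiting.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Cache one collected value and re-collect it once the TTL has elapsed."""

    def __init__(self, ttl_seconds: float, collect: Callable[[], Any],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._collect = collect
        self._clock = clock
        self._value: Any = None
        self._expires_at: Optional[float] = None

    def get(self) -> Any:
        now = self._clock()
        # Collect on first use and again whenever the cached value has expired.
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._collect()
            self._expires_at = now + self._ttl
        return self._value


if __name__ == "__main__":
    # Hypothetical collector standing in for a PowerShell/WMI query.
    services = TTLCache(2 * 60, lambda: ["Spooler", "W32Time"])
    print(services.get())
```

With one cache per row of the table (for example 2 minutes for services, 24 hours for hardware), each heartbeat reads whatever value is currently cached instead of triggering a new query.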
Response Times
| Event | Response Time |
|---|---|
| Host goes offline | Max. 60 seconds (2x heartbeat interval) |
| New task | Max. 30 seconds (next task poll) |
| Agent update available | Max. 5 minutes (next auto-update check) |
| Connection loss detected | Max. 35 seconds (25s ping + 10s timeout) |
| Reconnect after failure | 1-30 seconds (exponential backoff) |
| Script execution | Immediate (via WebSocket message) |
| Shell connection | Immediate (via WebSocket message) |
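The worst-case figures in this table follow directly from the intervals in the timing overview. The short sketch below only reproduces that arithmetic; the constant and function names are hypothetical.

```python
from typing import Dict

HEARTBEAT_INTERVAL_S = 30
TASK_POLL_INTERVAL_S = 30
PING_INTERVAL_S = 25
PING_TIMEOUT_S = 10
UPDATE_CHECK_INTERVAL_S = 5 * 60


def worst_case_response_times() -> Dict[str, int]:
    """Derive the maximum response times (in seconds) from the intervals."""
    return {
        # A host counts as offline after two missed heartbeats.
        "host_offline": 2 * HEARTBEAT_INTERVAL_S,
        # A task waits at most until the next poll.
        "new_task": TASK_POLL_INTERVAL_S,
        # A new version is noticed at the next update check.
        "agent_update": UPDATE_CHECK_INTERVAL_S,
        # A ping is sent after at most 25s, then 10s pass before it times out.
        "connection_loss": PING_INTERVAL_S + PING_TIMEOUT_S,
    }


if __name__ == "__main__":
    for event, seconds in worst_case_response_times().items():
        print(f"{event}: {seconds}s")
```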
Platforms
| Platform | Binary | Notes |
|---|---|---|
| Linux (amd64) | datazone-agent-linux-amd64 | Systemd service |
| Linux (arm64) | datazone-agent-linux-arm64 | For ARM servers (Raspberry Pi, etc.) |
| Windows (amd64) | datazone-agent-windows-amd64.exe | Windows service |
| FreeBSD (amd64) | datazone-agent-freebsd-amd64 | OPNsense-compatible |
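When scripting a download, the binary name can be derived from the operating system and CPU architecture. The helper below is a hypothetical sketch: `binary_name` is not part of the agent, and the mapping from machine strings such as `x86_64` or `aarch64` to `amd64` and `arm64` is an assumption. Only the four names in the table above are produced.

```python
import platform

# Normalize common machine strings to the suffixes used in the binary names.
_ARCH_ALIASES = {"x86_64": "amd64", "amd64": "amd64", "aarch64": "arm64", "arm64": "arm64"}
_SUPPORTED = {("linux", "amd64"), ("linux", "arm64"), ("windows", "amd64"), ("freebsd", "amd64")}


def binary_name(system: str, machine: str) -> str:
    """Return the agent binary name for an OS/architecture pair from the table."""
    os_name = system.lower()
    arch = _ARCH_ALIASES.get(machine.lower())
    if (os_name, arch) not in _SUPPORTED:
        raise ValueError(f"no agent binary listed for {system}/{machine}")
    suffix = ".exe" if os_name == "windows" else ""
    return f"datazone-agent-{os_name}-{arch}{suffix}"


if __name__ == "__main__":
    # Raises ValueError on platforms that are not in the table.
    print(binary_name(platform.system(), platform.machine()))
```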
Resource Usage
| Resource | Typical Value |
|---|---|
| RAM | 10-20 MB |
| CPU | < 1% (idle) |
| Network | ~1 KB/s (heartbeat) |
| Storage | < 20 MB |
Configuration
The agent configuration is stored in a JSON file:
Linux / FreeBSD
/etc/datazone/agent.json
Windows
C:\ProgramData\DATAZONE\agent.json
Configuration Parameters
{
"server_url": "wss://control.yourdomain.com",
"token": "onboarding-token-here",
"host_type": "linux",
"hostname": "web01"
}
| Parameter | Description |
|---|---|
| server_url | WebSocket URL of the backend |
| token | Onboarding token for registration |
| host_type | System type (linux, windows, pve, pbs, opnsense) |
| hostname | Display name (optional, detected automatically) |
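Before starting the service, a configuration file can be checked for obvious mistakes. The following is a minimal sketch, not the agent's own validation: `parse_config` is a hypothetical helper, and it assumes that every parameter except `hostname` is required and that the URL must use the `ws://` or `wss://` scheme.

```python
import json
from typing import Any, Dict

REQUIRED_KEYS = ("server_url", "token", "host_type")  # assumed to be mandatory
HOST_TYPES = {"linux", "windows", "pve", "pbs", "opnsense"}


def parse_config(text: str) -> Dict[str, Any]:
    """Parse agent.json content and raise ValueError on obvious problems."""
    cfg = json.loads(text)
    if not isinstance(cfg, dict):
        raise ValueError("configuration must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if not cfg.get(key)]
    if missing:
        raise ValueError(f"missing parameters: {', '.join(missing)}")
    if cfg["host_type"] not in HOST_TYPES:
        raise ValueError(f"unknown host_type: {cfg['host_type']!r}")
    if not str(cfg["server_url"]).startswith(("ws://", "wss://")):
        raise ValueError("server_url must be a WebSocket URL")
    return cfg


if __name__ == "__main__":
    example = '{"server_url": "wss://control.yourdomain.com", "token": "t", "host_type": "linux"}'
    print(parse_config(example))
```

Note that `json.loads` raises `json.JSONDecodeError`, a subclass of `ValueError`, for malformed JSON, so callers can handle both cases with a single `except ValueError`.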
Task Execution
The agent receives tasks from the backend and executes them:
| Task Type | Timeout | Description |
|---|---|---|
| Script | 5 minutes (default) | Configurable timeout per script |
| Update | 30 minutes | System update (apt/dnf/pkg/Windows Update) |
| Reboot | - | Restarts the host after a 2-second delay |
| Agent Update | - | Self-update of the agent binary |
If delivery of a task result fails, the agent retries up to 3 times with exponential backoff (2s, 4s).
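The delivery retry can be sketched as follows. This is an illustration under assumptions, not the agent's implementation: `deliver_with_retry` and `send` are hypothetical names, "up to 3 times" is read as three delivery attempts in total (which yields exactly the two waits of 2s and 4s), and a failed delivery is assumed to surface as a `ConnectionError`.

```python
import time
from typing import Callable


def deliver_with_retry(send: Callable[[], None], attempts: int = 3,
                       base_delay: float = 2.0,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call `send` until it succeeds or the attempts are used up."""
    for attempt in range(attempts):
        try:
            send()
            return True
        except ConnectionError:
            if attempt == attempts - 1:
                return False  # last attempt failed, give up
            # Wait 2s after the first failure, 4s after the second.
            sleep(base_delay * 2 ** attempt)
    return False


if __name__ == "__main__":
    def always_failing_send() -> None:
        raise ConnectionError("backend unreachable")

    # Record the waits instead of sleeping so the demo finishes instantly.
    waits = []
    print(deliver_with_retry(always_failing_send, sleep=waits.append), waits)
```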
Next Steps
- Agent Installation - Step-by-step guide
- Troubleshooting - Resolving common issues