Agent - Overview

The DATAZONE Control Agent is a lightweight service installed on every managed host. It collects system data, executes commands, and establishes the connection to the backend.

Features

| Feature | Description |
| --- | --- |
| Heartbeat | Regular transmission of system metrics (CPU, RAM, Disk, Uptime) |
| System Information | Collection of hardware, software, and network data |
| Script Execution | Receiving and executing scripts from the backend |
| Remote Shell | WebSocket-based terminal session |
| Tunnel | Port forwarding over WebSocket connection |
| SSH Server | Embedded SSH server for agent SSH tunnels |
| Auto-Update | Self-update of the agent to new versions |
| Task Processing | Receiving and executing backend tasks |

Architecture

┌─────────────────────────────────────┐
│           DATAZONE Agent            │
├─────────────────────────────────────┤
│  WebSocket Client                   │
│  ├── Heartbeat Sender (30s)         │
│  ├── Task Receiver (30s)            │
│  ├── Ping/Pong (25s)                │
│  ├── Shell Handler                  │
│  └── Tunnel Handler                 │
├─────────────────────────────────────┤
│  System Collector                   │
│  ├── Hardware Cache (one-time)      │
│  ├── Metrics (per heartbeat)        │
│  ├── Update Cache (hourly)          │
│  └── Windows Cache (background)     │
├─────────────────────────────────────┤
│  Embedded SSH Server                │
├─────────────────────────────────────┤
│  Auto-Updater (5 min)               │
├─────────────────────────────────────┤
│  Configuration (agent.json)         │
└─────────────────────────────────────┘

Communication

The agent communicates with the backend exclusively over WebSocket:

  1. The agent establishes an outgoing WebSocket connection to the backend
  2. Messages are exchanged bidirectionally over this connection
  3. No incoming port needs to be opened on the host

Firewall-Friendly

Since the agent only establishes an outgoing connection, it works behind NAT and firewalls without requiring any port forwarding on the host.

Reconnect Behavior

On connection loss, the agent automatically attempts to reconnect:

  • Initial delay: 1 second
  • Maximum delay: 30 seconds
  • Algorithm: Exponential backoff (delay doubles with each failed attempt)
  • Reset: After a successful connection, the delay resets to 1 second
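
The reconnect schedule above can be sketched as a small generator. This is only an illustration of the listed parameters, not the agent's actual code: the names `reconnect_delays` and `first_delays` are hypothetical, and clamping the doubled value to the 30-second maximum is an assumption about how the cap is applied.

```python
from typing import Iterator

INITIAL_DELAY_S = 1.0  # initial delay from the list above
MAX_DELAY_S = 30.0     # maximum delay from the list above


def reconnect_delays(initial: float = INITIAL_DELAY_S,
                     maximum: float = MAX_DELAY_S) -> Iterator[float]:
    """Yield the wait before each reconnect attempt, doubling up to the cap."""
    delay = initial
    while True:
        yield delay
        # Assumption: the doubled delay is clamped to the maximum.
        delay = min(delay * 2, maximum)


def first_delays(count: int) -> list[float]:
    """Return the first `count` delays of a fresh backoff sequence."""
    gen = reconnect_delays()
    return [next(gen) for _ in range(count)]
```

Under these assumptions `first_delays(7)` yields 1, 2, 4, 8, 16, 30, 30 seconds. The reset after a successful connection corresponds to starting a new generator.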

Heartbeat Intervals and Data Collection

Timing Overview

| Operation | Interval | Description |
| --- | --- | --- |
| Heartbeat | 30 seconds | Sends metrics and system data to the backend |
| Task Polling | 30 seconds | Checks for new tasks from the backend |
| WebSocket Ping | 25 seconds | Detects broken connections (10s timeout) |
| Auto-Update Check | 5 minutes | Checks for new agent versions |

Heartbeat Data (All Platforms)

Each heartbeat contains the following base data:

| Field | Description | Collection |
| --- | --- | --- |
| cpu | CPU usage percentage | Real-time (500ms measurement) |
| memory | RAM usage percentage | Real-time |
| memory_total | Total system memory | Real-time |
| disk | Disk usage percentage | Real-time |
| uptime | Uptime in seconds | Real-time |
| hostname | System hostname | Real-time |
| agent_version | Agent version | Fixed |
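
To make the shape of the base data concrete, the following sketch assembles a dictionary with the field names from the table. It is an assumption-laden illustration: the actual wire format is not documented here, `memory_total` is assumed to be in bytes, `AGENT_VERSION` is a placeholder, and the helper names are hypothetical. CPU, memory, and uptime are passed in as arguments because measuring them portably needs platform-specific code.

```python
import json
import os
import shutil
import socket

AGENT_VERSION = "0.0.0"  # placeholder; the real version string is not given here


def disk_usage_percent(path: str = os.sep) -> float:
    """Return the used share of the filesystem containing `path`, in percent."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return round(usage.used / usage.total * 100, 1)


def build_heartbeat(cpu: float, memory: float,
                    memory_total: int, uptime: int) -> dict:
    """Assemble the base fields listed in the table above."""
    return {
        "cpu": cpu,                    # CPU usage in percent
        "memory": memory,              # RAM usage in percent
        "memory_total": memory_total,  # assumed unit: bytes
        "disk": disk_usage_percent(),
        "uptime": uptime,              # seconds
        "hostname": socket.gethostname(),
        "agent_version": AGENT_VERSION,
    }


if __name__ == "__main__":
    print(json.dumps(build_heartbeat(12.5, 40.0, 8 * 1024**3, 3600), indent=2))
```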

Platform-Specific Data

Linux / PVE / PBS

| Data Type | Collection Interval | Details |
| --- | --- | --- |
| Hardware Info | Once at startup | CPU, RAM, mainboard, BIOS, serial numbers |
| Pending Updates | Hourly (1h cache) | apt list --upgradable or dnf check-update |
| Docker Containers | Per heartbeat (30s) | Name, image, status, ports |
| SSH Sessions | Per heartbeat (30s) | User, source IP, login time |
| Cron Jobs | Per heartbeat (30s) | System and user crontabs |
| Load Average | Per heartbeat (30s) | 1/5/15 minute averages |
| Services | Per heartbeat (30s) | Systemd services with status |
| Disks | Per heartbeat (30s) | Partitions, mountpoints, usage |
| Network Interfaces | Per heartbeat (30s) | IPs, MAC, bytes in/out |
| Open Ports | Per heartbeat (30s) | Listening TCP/UDP with process |

PVE-specific:

| Data Type | Collection Interval | Details |
| --- | --- | --- |
| VMs and Containers | Per heartbeat (30s) | VMID, name, status, resources |
| Storages | Per heartbeat (30s) | Storage pools and usage |
| Backup Jobs | 30 minutes (cache) | Backup job configurations |
| Backup History | 30 minutes (cache) | Backup execution history |

PBS-specific:

| Data Type | Collection Interval | Details |
| --- | --- | --- |
| Datastores | Per heartbeat (30s) | Datastores and usage |
| Backups | Per heartbeat (30s) | Backup list and counts |
| Sync Jobs | Per heartbeat (30s) | Synchronization status |

OPNsense

| Data Type | Collection Interval | Details |
| --- | --- | --- |
| Hardware Info | Once at startup | CPU, RAM, BIOS |
| Interfaces | Per heartbeat (30s) | Interface status, IPs, throughput |
| Gateways | Per heartbeat (30s) | Gateway status, latency, packet loss |
| VPN Tunnels | Per heartbeat (30s) | OpenVPN, IPsec, WireGuard with status |
| Routes | Per heartbeat (30s) | Routing table (since v1.3.0) |
| Certificates | 6 hours (cache) | ACME and SSL certificates with expiration date (since v1.3.0) |
| Unbound DNS Stats | Per heartbeat (30s) | DNS queries, cache hits, top domains (since v1.3.0) |
| Nginx Virtual Hosts | 1 hour (cache) | Nginx configurations and servers (since v1.3.0) |
| Services | Per heartbeat (30s) | OPNsense services with status |
| Config Change | Per heartbeat (30s) | Last config.xml change |
| External IP | Per heartbeat (30s) | WAN IP address |
| Update Status | Per heartbeat (30s) | Available firmware updates |

Windows

Windows data is collected in the background at varying intervals to minimize system load:

| Data Type | Collection Interval | Details |
| --- | --- | --- |
| Hardware Info | 24 hours (cache) | CPU, RAM, manufacturer, model |
| OS Version/Domain | 1 hour (cache) | Build, edition, AD domain, DC status |
| Windows Services | 2 minutes (cache) | Name, status, startup type |
| Network Interfaces | 2 minutes (cache) | Adapters, IPs, DNS |
| Installed Software | 5 minutes (cache) | Via Winget, name and version |
| Scheduled Tasks | 5 minutes (cache) | Scheduled tasks |
| Disk Information | 5 minutes (cache) | Drives, size, free space |
| Windows Updates | 10 minutes (cache) | Available updates |
| Logged-in Users | Per heartbeat (30s) | Active sessions |
| RustDesk ID/Version | Per heartbeat (30s) | RustDesk remote ID and version (if installed) |
| Domain/DC Status | 1 hour (cache) | AD domain, domain controller status |
| Product Type | 1 hour (cache) | Server / Domain Controller / Workstation |

Why Different Intervals?

Windows data is collected at staggered intervals to avoid bunching PowerShell and WMI queries. Frequently changing data (users, services) is updated more often than rarely changing data (hardware, software).
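
The per-item cache intervals can be pictured as a time-to-live cache in front of each collector. The sketch below is a simplification under stated assumptions: the class name `TTLCache` and its interface are hypothetical, and it refreshes lazily on read, whereas the agent is described as refreshing in the background.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Cache one collector's result and re-run it once the TTL has elapsed."""

    def __init__(self, ttl_seconds: float, collect: Callable[[], Any],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._collect = collect
        self._clock = clock
        self._value: Any = None
        self._expires_at: Optional[float] = None  # None means "never collected"

    def get(self) -> Any:
        """Return the cached value, collecting again if it has expired."""
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._collect()
            self._expires_at = now + self._ttl
        return self._value
```

With one cache per data type (for example 120 seconds for services and 86400 seconds for hardware info), each heartbeat reads cached values and only expired collectors run again, so expensive queries do not all fire on the same heartbeat.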

Response Times

| Event | Response Time |
| --- | --- |
| Host goes offline | Max. 60 seconds (2x heartbeat interval) |
| New task | Max. 30 seconds (next task poll) |
| Agent update available | Max. 5 minutes (next auto-update check) |
| Connection loss detected | Max. 35 seconds (25s ping + 10s timeout) |
| Reconnect after failure | 1-30 seconds (exponential backoff) |
| Script execution | Immediate (via WebSocket message) |
| Shell connection | Immediate (via WebSocket message) |
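
The two derived bounds in the table follow directly from the timing values. The sketch below spells out the arithmetic and shows how a backend-side offline check might look. The function `is_offline` and the strict comparison are assumptions, since the backend logic is not described here.

```python
HEARTBEAT_INTERVAL_S = 30
PING_INTERVAL_S = 25
PING_TIMEOUT_S = 10

# "Host goes offline": two heartbeat intervals without a heartbeat.
OFFLINE_AFTER_S = 2 * HEARTBEAT_INTERVAL_S                  # 60

# "Connection loss detected": one ping interval plus the pong timeout.
CONNECTION_LOSS_AFTER_S = PING_INTERVAL_S + PING_TIMEOUT_S  # 35


def is_offline(last_heartbeat_at: float, now: float) -> bool:
    """Hypothetical check: no heartbeat for more than two intervals."""
    return now - last_heartbeat_at > OFFLINE_AFTER_S
```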

Platforms

| Platform | Binary | Notes |
| --- | --- | --- |
| Linux (amd64) | datazone-agent-linux-amd64 | Systemd service |
| Linux (arm64) | datazone-agent-linux-arm64 | For ARM servers (Raspberry Pi, etc.) |
| Windows (amd64) | datazone-agent-windows-amd64.exe | Windows service |
| FreeBSD (amd64) | datazone-agent-freebsd-amd64 | OPNsense-compatible |
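
When scripting a rollout, the matching binary can be picked from the platform table. The following is a minimal sketch, not part of the product: the function names and the architecture alias map are assumptions, while the binary names are taken from the table.

```python
import platform

# Binary names from the platform table, keyed by (OS, architecture).
BINARIES = {
    ("linux", "amd64"): "datazone-agent-linux-amd64",
    ("linux", "arm64"): "datazone-agent-linux-arm64",
    ("windows", "amd64"): "datazone-agent-windows-amd64.exe",
    ("freebsd", "amd64"): "datazone-agent-freebsd-amd64",
}

# Assumption: common spellings reported by platform.machine().
ARCH_ALIASES = {"x86_64": "amd64", "amd64": "amd64",
                "aarch64": "arm64", "arm64": "arm64"}


def binary_for(system: str, machine: str) -> str:
    """Map an OS name and machine type to the agent binary name."""
    key = (system.lower(), ARCH_ALIASES.get(machine.lower(), machine.lower()))
    if key not in BINARIES:
        raise ValueError(f"no agent binary listed for {system}/{machine}")
    return BINARIES[key]


def binary_for_this_host() -> str:
    """Resolve the binary for the machine running this script."""
    return binary_for(platform.system(), platform.machine())
```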

Resource Usage

| Resource | Typical Value |
| --- | --- |
| RAM | 10-20 MB |
| CPU | < 1% (idle) |
| Network | ~1 KB/s (heartbeat) |
| Storage | < 20 MB |

Configuration

The agent configuration is stored in a JSON file:

Linux / FreeBSD

/etc/datazone/agent.json

Windows

C:\ProgramData\DATAZONE\agent.json

Configuration Parameters

```json
{
  "server_url": "wss://control.yourdomain.com",
  "token": "onboarding-token-here",
  "host_type": "linux",
  "hostname": "web01"
}
```

| Parameter | Description |
| --- | --- |
| server_url | WebSocket URL of the backend |
| token | Onboarding token for registration |
| host_type | System type (linux, windows, pve, pbs, opnsense) |
| hostname | Display name (optional, detected automatically) |
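
A configuration file can be sanity-checked before the agent is started. The sketch below is a hypothetical helper, not the agent's own validation. It assumes that every parameter except `hostname` is required, that `server_url` must use a `ws://` or `wss://` scheme, and that a missing `hostname` falls back to the system hostname.

```python
import json
import socket
from pathlib import Path

REQUIRED_KEYS = ("server_url", "token", "host_type")  # assumption: all but hostname
HOST_TYPES = {"linux", "windows", "pve", "pbs", "opnsense"}


def parse_config(text: str) -> dict:
    """Parse and validate the JSON text of an agent.json file."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise ValueError(f"missing required parameter(s): {', '.join(missing)}")
    if config["host_type"] not in HOST_TYPES:
        raise ValueError(f"unknown host_type: {config['host_type']!r}")
    if not config["server_url"].startswith(("ws://", "wss://")):
        raise ValueError("server_url must be a WebSocket URL")
    # hostname is optional; fall back to the system hostname.
    config.setdefault("hostname", socket.gethostname())
    return config


def load_config(path: str) -> dict:
    """Read and validate a configuration file from disk."""
    return parse_config(Path(path).read_text(encoding="utf-8"))
```

For example, `load_config("/etc/datazone/agent.json")` returns the parsed dictionary or raises `ValueError` with the first problem found.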

Task Execution

The agent receives tasks from the backend and executes them:

| Task Type | Timeout | Description |
| --- | --- | --- |
| Script | 5 minutes (default) | Configurable timeout per script |
| Update | 30 minutes | System update (apt/dnf/pkg/Windows Update) |
| Reboot | - | Immediate restart with 2s delay |
| Agent Update | - | Self-update of the agent binary |
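
The per-script timeout can be illustrated with a subprocess call that is aborted after a deadline. This is a sketch under assumptions: the function `run_script` and the shape of its result dictionary are hypothetical, and only the 5-minute default comes from the table.

```python
import subprocess
import sys

DEFAULT_SCRIPT_TIMEOUT_S = 5 * 60  # default script timeout from the table


def run_script(argv: list[str],
               timeout_s: float = DEFAULT_SCRIPT_TIMEOUT_S) -> dict:
    """Run a command and report its outcome, killing it once the timeout expires."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return {"status": "timeout", "exit_code": None, "output": ""}
    return {
        "status": "ok" if proc.returncode == 0 else "failed",
        "exit_code": proc.returncode,
        "output": proc.stdout + proc.stderr,
    }


if __name__ == "__main__":
    print(run_script([sys.executable, "-c", "print('hello')"]))
```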

If delivery of a task result fails, the agent retries up to 3 times with exponential backoff (2s, 4s).
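
The retry rule can be sketched as follows. This reads the rule as three delivery attempts in total, with waits of 2 and 4 seconds between them; that reading is an assumption. The name `deliver_with_retry` and the use of `ConnectionError` to signal a failed delivery are also hypothetical.

```python
import time
from typing import Callable

MAX_ATTEMPTS = 3        # assumption: three attempts in total
RETRY_DELAYS_S = (2, 4)  # waits between attempts, from the text above


def deliver_with_retry(send: Callable[[], None],
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call `send` until it succeeds or the attempts are used up."""
    for attempt in range(MAX_ATTEMPTS):
        try:
            send()
            return True
        except ConnectionError:
            # Wait before the next attempt; no wait after the final one.
            if attempt < len(RETRY_DELAYS_S):
                sleep(RETRY_DELAYS_S[attempt])
    return False
```

Passing `sleep` as a parameter keeps the function testable without real waiting.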
