Archer — Beacon Hunting for Zeek

The north star: find the beacon

Almost every intrusion eventually phones home. An implant checks in with its command-and-control server on a schedule — a quiet, repeating heartbeat buried in millions of ordinary connections. That callback is the most reliable signal that something is wrong, and one of the hardest things on a network to see. Archer exists to find it.

It's an open-source, self-hosted platform that reads your Zeek logs and hunts C2 beacons with more depth than anything else you can stand up yourself — then hands you a browser-based workbench to confirm, attribute, and escalate. No configuration to start; it sizes itself to the box it lands on, and every finding pivots back to the raw records that produced it.

Why beacons are hard

A capable operator does everything possible to make a beacon look like noise. Each trick below defeats a naïve detector — Archer is built to beat all five:

Jitter — the check-in interval is randomized around a base cadence, smearing the clean periodicity that simple interval statistics depend on.
Low-and-slow — a callback every few hours over a long window leaves almost no data to score.
Blending — C2 rides the same destination, port 443, and CDN as legitimate telemetry, so a chatty benign channel drowns the signal.
Cloud hosting — the destination is a shared cloud front, so IP reputation tells you nothing.
Staging — the operator pivots between hosts, spreading the campaign so no single host-pair looks busy.

How Archer hunts beacons

Every external host-pair is scored on a composite of four axes — each one independently queryable and chartable:

tscore (Timing): inter-arrival regularity; Bowley skewness + MAD
dscore (Data size): payload consistency; steady size = automated
hist (Histogram): hour-of-day coverage; 24-bucket circadian shape
dur (Persistence): bounded trailing window; survives long retention
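
To make the timing axis concrete, here is a minimal sketch of how Bowley skewness and MAD can be turned into a regularity score. It is an illustration, not Archer's actual formula: the function names, the equal weighting of the two terms, and the MAD normalization by the median interval are all assumptions.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    s = sorted(values)
    n = len(s)
    half = n // 2
    lower = s[:half]
    upper = s[half + (n % 2):]
    return median(lower), median(s), median(upper)

def timing_score(timestamps):
    """Hypothetical tscore in [0, 1]; 1.0 means perfectly regular intervals."""
    ts = sorted(timestamps)
    deltas = [b - a for a, b in zip(ts, ts[1:])]
    if len(deltas) < 4:
        return 0.0  # too few intervals to judge regularity
    q1, q2, q3 = quartiles(deltas)
    iqr = q3 - q1
    # Bowley skewness is 0 for a symmetric interval distribution.
    bowley = 0.0 if iqr == 0 else (q1 + q3 - 2 * q2) / iqr
    # Median absolute deviation about the median interval.
    mad = median(abs(d - q2) for d in deltas)
    skew_score = 1.0 - abs(bowley)
    # Assumed normalization: dispersion relative to the median interval.
    mad_score = max(0.0, 1.0 - mad / q2) if q2 > 0 else 0.0
    return (skew_score + mad_score) / 2.0

clockwork = [60 * i for i in range(20)]          # a fixed 60 s heartbeat
bursty = [0, 1, 3, 303, 308, 1308, 1311, 1351, 1358, 3358, 3359]
print(timing_score(clockwork), timing_score(bursty))
```

Both statistics are quartile- and median-based, so a handful of outlier gaps (a laptop asleep overnight, say) moves the score far less than a mean or standard deviation would.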

When the statistical view is weak, a second engine takes over. The first four techniques below each target one of the evasions above, and the last two cover related tricks:

Spectral rescue → beats jitter. A Lomb-Scargle periodogram over reservoir-sampled timestamps recovers bounded-jitter beacons by finding the hidden frequency, with DC-correction and a plausibility gate to kill false peaks. Runs only when the statistical path scored weak (~4 ms/pair).
Per-channel scoring → beats blending. Archer partitions a beacon by TLS fingerprint and re-scores each channel, so a sharp C2 hiding inside a chatty benign channel to the same destination is surfaced on its own instead of being averaged away.
TLS-fingerprint attribution → beats cloud hosting. Every TLS beacon carries its JA3/JA4 with a one-click pivot to every beacon sharing it. The fingerprint is the malware's, not the host's — so it survives shared cloud fronts and turns per-pair detection into implant-family attribution.
Multi-stage staging → beats staging. Archer binds two or more internal hosts beaconing to the same rare external destination with staggered onsets: the "land on A, move to B, B calls the same C2" pattern that single-pair detection is blind to.
DGA augmentation. A beacon to an algorithmically-generated domain (high entropy + low English-bigram likelihood) gets a severity bump, with CDN/operator allowlists to spare legitimate algorithmic hostnames.
Port-hopping detection. The beacon key excludes the destination port, so a callback that rotates ports to dodge a port rule is still caught as a single beacon.
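
The idea behind the spectral rescue can be sketched in a few lines. Note the substitution: this is not a Lomb-Scargle implementation. It uses a simpler Rayleigh-style periodogram, which treats each check-in as a point event and measures how well the timestamps line up at each trial frequency. The function names, the frequency grid, and the synthetic beacon are assumptions for illustration; DC-correction, the plausibility gate, and reservoir sampling are left out.

```python
import math
import random

def rayleigh_power(timestamps, freq):
    """Normalized power of the event train at one trial frequency (Hz)."""
    c = sum(math.cos(2 * math.pi * freq * t) for t in timestamps)
    s = sum(math.sin(2 * math.pi * freq * t) for t in timestamps)
    return (c * c + s * s) / len(timestamps)

def dominant_period(timestamps, min_period, max_period, oversample=4):
    """Scan trial frequencies; return (period_seconds, power) of the peak."""
    t0 = min(timestamps)
    ts = [t - t0 for t in timestamps]
    span = max(ts)
    step = 1.0 / (oversample * span)   # grid finer than the 1/span resolution
    f = 1.0 / max_period
    f_max = 1.0 / min_period
    best_f, best_p = f, 0.0
    while f <= f_max:
        p = rayleigh_power(ts, f)
        if p > best_p:
            best_f, best_p = f, p
        f += step
    return 1.0 / best_f, best_p

# Synthetic beacon: 60 s cadence, each check-in jittered by up to +/-10 s
# around its scheduled slot (bounded jitter; an assumption for this demo).
rng = random.Random(7)
beacon = [60.0 * k + rng.uniform(-10.0, 10.0) for k in range(120)]
period, power = dominant_period(beacon, min_period=10.0, max_period=600.0)
print(f"peak near {period:.1f} s")
```

The jitter here leaves the individual intervals anywhere between 40 s and 80 s, yet the check-ins still cluster around a common 60 s phase, which is what the peak picks up. Jitter that accumulates from one sleep to the next smears that peak, which is one reason a real detector needs a gate on peak strength before trusting it.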

Every flavor of beacon

The same multi-axis engine runs over three log layers, so a beacon is caught wherever it hides — raw connections, DNS, or HTTP over a CDN:

conn (Beacon): classic C2 over TCP/TLS, plus a Port-Hopping variant
dns (DNS Beacon): the DNS-C2 heartbeat; cadence on (src, apex)
http (HTTP Beacon): C2 over CDN infra; scored per URI footprint
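
What changes between layers is mostly the grouping key. The sketch below shows one plausible way to bucket Zeek records before scoring. The field names (`ts`, `id.orig_h`, `id.resp_h`, `id.resp_p`, `query`) are standard Zeek columns; the helper names and the two-label apex heuristic are assumptions, and a real tool would likely consult the Public Suffix List instead.

```python
from collections import defaultdict

def conn_key(rec):
    """Host-pair key for conn.log; the destination port is left out on purpose."""
    return (rec["id.orig_h"], rec["id.resp_h"])

def naive_apex(query):
    """Last two labels of a DNS name (assumed heuristic, wrong for e.g. co.uk)."""
    labels = query.rstrip(".").lower().split(".")
    return ".".join(labels[-2:])

def dns_key(rec):
    """(src, apex) key for dns.log, so rotating subdomains share one series."""
    return (rec["id.orig_h"], naive_apex(rec["query"]))

def group_timestamps(records, key_fn):
    """Collect sorted timestamps per key, ready for interval scoring."""
    groups = defaultdict(list)
    for rec in records:
        groups[key_fn(rec)].append(rec["ts"])
    return {k: sorted(v) for k, v in groups.items()}

conn_log = [
    {"ts": 120.0, "id.orig_h": "10.0.0.5", "id.resp_h": "203.0.113.9", "id.resp_p": 8443},
    {"ts": 0.0, "id.orig_h": "10.0.0.5", "id.resp_h": "203.0.113.9", "id.resp_p": 443},
    {"ts": 60.0, "id.orig_h": "10.0.0.5", "id.resp_h": "203.0.113.9", "id.resp_p": 8080},
]
dns_log = [
    {"ts": 0.0, "id.orig_h": "10.0.0.5", "query": "a1.c2.example.com"},
    {"ts": 30.0, "id.orig_h": "10.0.0.5", "query": "B2.c2.example.com."},
]
conn_groups = group_timestamps(conn_log, conn_key)
dns_groups = group_timestamps(dns_log, dns_key)
```

Because `id.resp_p` never enters the conn key, the three connections above collapse into a single series even though each used a different port.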

A Hunts menu ships nine beacon-variety lenses as one-click queries — the shapes worth checking first:

textbook check-in, tasking channel, jitter-evading / spectral, clockwork, scheduled / fixed-hour, low-and-slow, persistent, DGA-backed, port-hopping

Triage built for beaconing

Five-second triage header — a beacon's detail pane leads with jitter %, a plain-English cadence ("every 47s ± 3s"), median interval, sample size, and the per-axis sub-score breakdown, so the confidence signal is readable at a glance.
The score is a query space — every sub-axis and timing metric is its own searchable field, so the shape of a staging beacon ("tight timing, short duration") becomes a query you can save and reuse.
30-day score evolution — a per-beacon chart of the composite and all four sub-axes, so you can tell whether a beacon is escalating, stable, or decaying.
TLS Fingerprints inventory — a ranked wall of every high-signal JA3/JA4 in the capture, surfacing a rare fingerprint shared across hosts even when it tripped no detector at all.

Queryable beacon fields

Each takes comparisons and [lo TO hi] ranges — for example:

type:Beacon AND tscore:>=0.9 AND dur:<=0.3

tscore, dscore, hist, dur, jitter, meanint, medint, conns, outratio
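
To show how such a query reads against a finding, here is a toy evaluator for the three term shapes used above: exact match, comparison, and inclusive [lo TO hi] range, joined with AND. It is a sketch of the semantics only, not Archer's parser. The regular expression, the function names, and the sample record are assumptions, and OR, NOT, wildcards, and date windows are not handled.

```python
import re

# field:value | field:>=n | field:<=n | field:>n | field:<n | field:[lo TO hi]
TERM = re.compile(r"^(\w+):(?:\[(\S+) TO (\S+)\]|(>=|<=|>|<)?(\S+))$")

def parse_term(term):
    """Turn one query term into a predicate over a finding dict."""
    m = TERM.match(term.strip())
    if not m:
        raise ValueError(f"unsupported term: {term!r}")
    field, lo, hi, op, value = m.groups()
    if lo is not None:
        lo_f, hi_f = float(lo), float(hi)
        # Square brackets are inclusive on both ends, as in Lucene.
        return lambda rec: field in rec and lo_f <= float(rec[field]) <= hi_f
    if op:
        num = float(value)
        compare = {
            ">=": lambda x: x >= num,
            "<=": lambda x: x <= num,
            ">": lambda x: x > num,
            "<": lambda x: x < num,
        }[op]
        return lambda rec: field in rec and compare(float(rec[field]))
    return lambda rec: str(rec.get(field)) == value

def compile_query(query):
    """Compile an AND-only query string into a single predicate."""
    preds = [parse_term(t) for t in query.split(" AND ")]
    return lambda rec: all(p(rec) for p in preds)

staging_shape = compile_query("type:Beacon AND tscore:>=0.9 AND dur:<=0.3")
finding = {"type": "Beacon", "tscore": 0.95, "dur": 0.2}  # hypothetical record
print(staging_shape(finding))
```

A term that names a field missing from the record simply fails to match, so a saved query never raises an error on findings that lack a given sub-score.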

Beaconing rarely travels alone

A real intrusion leaves more than a heartbeat. Once it has your logs, Archer also runs the rest of the kill chain and correlates it back to the beacon — these detectors are the supporting cast around the main hunt:

conn (Connections)

Strobe (high)
Data Exfiltration (high)
Lateral Movement (high)
Protocol on Unexpected Port (high)
C2 Port (high)
Long Connection (high)

dns (DNS)

DNS Tunneling (high)
NXDOMAIN Flood (high)
Subdomain DGA (high)
Suspicious TLD (med)
DoH Bypass (med)

http (HTTP)

Cobalt Strike URI (crit)
C2 URI Pattern (crit)
Domain Fronting (crit)
Suspicious File Download (high)

ssl (TLS & certs)

Malicious JA3 / JA4 (crit)
No-SNI on C2 Port (high)
Weak TLS (med)
Suspicious Certificate (med)

Cross-detector correlation ties it together: findings sharing a host-pair roll up into Correlated Activity, so a beacon that also shows DNS tunneling or a malicious JA3 lands as one conviction instead of three loose alerts.
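
The roll-up itself is easy to picture. The sketch below groups findings by host-pair and emits one combined record wherever two or more distinct detectors fired. The field names, the two-detector threshold, and the choice to inherit the highest severity in the group are all assumptions for illustration, not a description of Archer's internals.

```python
from collections import defaultdict

SEVERITY_RANK = {"med": 1, "high": 2, "crit": 3}

def correlate(findings):
    """Roll findings that share a (src, dst) host-pair into one record each."""
    by_pair = defaultdict(list)
    for f in findings:
        by_pair[(f["src"], f["dst"])].append(f)
    rolled = []
    for pair, group in by_pair.items():
        detectors = sorted({f["detector"] for f in group})
        if len(detectors) < 2:
            continue  # a lone finding stays an ordinary alert
        # Assumed policy: the roll-up takes the highest severity in the group.
        top = max(group, key=lambda f: SEVERITY_RANK[f["severity"]])
        rolled.append({"pair": pair, "detectors": detectors,
                       "severity": top["severity"]})
    return rolled

findings = [  # hypothetical findings
    {"src": "10.0.0.5", "dst": "203.0.113.9", "detector": "Beacon", "severity": "high"},
    {"src": "10.0.0.5", "dst": "203.0.113.9", "detector": "Malicious JA3 / JA4", "severity": "crit"},
    {"src": "10.0.0.7", "dst": "198.51.100.4", "detector": "Weak TLS", "severity": "med"},
]
correlated = correlate(findings)
```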

The platform around the hunt

Everything else exists to get you to a confirmed beacon faster — and keep the hunt running:

Analyst workbench — a Lucene-style query language (boolean logic, wildcards, numeric ranges, date windows); campaign & host pivots with a force-directed graph; one-click pivot to the raw Zeek records; reversible triage (ack / escalate / dismiss / suppress) that survives re-analysis by fingerprint; a "New only" filter; and per-tab CSV/JSON/TXT exports over a virtualized table.
Live threat intel — escalations enrich against VirusTotal, CrowdSec, OTX, AbuseIPDB, GreyNoise (keyless), and Censys, plus automatic free feeds (Feodo, URLhaus) and internal MISP/OpenCTI; MITRE ATT&CK mapping on every finding; a retroactive archive IOC scan over 100+ GB in minutes.
Built to run for real — optional Quiver sensors ship logs from any Linux host over rsync-on-ssh; watch mode schedules analysis hourly-to-daily; bounded-memory detection with cgroup-aware sizing; role-based access; SIEM forwarding over CEF (validated against Security Onion); and it runs fully air-gapped.

Get it running

Archer runs in Docker. Clone it, drop your Zeek logs into logs/, and start — one command sizes the container to your host and brings up the stack:

git clone https://github.com/BushidoCyb3r/Archer.git
cd Archer
sudo ./start.sh up

Then open https://<your-host>:8443 — the start script prints the exact paste-ready URL. Requires Docker & the Docker Compose plugin. Full prerequisites, air-gapped install, and the analyst playbook are in the README.


Start the hunt

Archer is free and open source. Point it at your Zeek logs and find the beacon that's been hiding in your traffic.
