How to install CrowdSec as a community WAF


CrowdSec is the modern evolution of fail2ban, but it takes a very different approach from the classic tool: it decouples detection from blocking, exposes an API, and draws on a blocklist fed by thousands of volunteer installations around the world. In 2024, after several years of active development, it’s a mature tool that deserves a spot in any administrator’s toolkit for internet-facing services. This guide walks through installation and Traefik integration for anyone considering replacing fail2ban, with emphasis on understanding why each piece exists rather than just copying commands.

Why it’s worth the switch

The most important conceptual difference between fail2ban and CrowdSec is that the latter separates who detects from who blocks. fail2ban reads logs, decides, and runs an iptables rule in the same process. CrowdSec reads logs and sends its decisions to a local API (the LAPI), which stores them in a database; those decisions are consulted by one or more bouncers, which are the components that actually block.

It sounds like a minor architectural detail, but it has important consequences. You can have a bouncer on the firewall (for SSH), another on Traefik (for HTTP), another on Nginx, and all act on the same decisions. You can change detection without touching blocking, and vice versa. And you can, if you want, anonymously contribute your detections to the community and receive in return an updated list of IPs actively attacking others right now.

The second difference is expressiveness. fail2ban uses regex over log lines; CrowdSec uses declarative YAML scenarios that combine a detection filter, a bucket capacity, a time window, and a grouping key. Writing a scenario to detect a new pattern is typically a fifteen-minute exercise, versus the hours it sometimes takes to debug a fail2ban filter.
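As a rough illustration of what such a scenario looks like, the sketch below writes a hypothetical leaky-bucket scenario to a scratch directory. The scenario name, the /login path, and the thresholds are assumptions invented for this example, not something shipped in the hub; the field names (type, filter, groupby, capacity, leakspeed, blackhole) are the ones CrowdSec scenarios use.

```shell
# Draft a hypothetical scenario in a scratch directory (nothing under /etc is touched).
draft_dir="$(mktemp -d)"

cat > "$draft_dir/login-bf.yaml" <<'EOF'
# Assumed example: a burst of failed logins coming from one IP.
type: leaky
name: example/http-login-bf        # hypothetical name
description: "Brute force against /login (illustrative)"
# Detection: which parsed events are poured into the bucket.
filter: "evt.Meta.log_type == 'http_access-log' && evt.Meta.http_path == '/login' && evt.Meta.http_status == '401'"
# Grouping: one bucket per source IP.
groupby: evt.Meta.source_ip
# Capacity: the bucket holds 5 events; the next one makes it overflow and raises an alert.
capacity: 5
# Time window: one event leaks out of the bucket every 10 seconds.
leakspeed: 10s
# Do not raise the same alert again for this IP during one minute.
blackhole: 1m
labels:
  remediation: true
EOF

echo "Draft written to $draft_dir/login-bf.yaml"
```

After review, a file like this would typically be copied into /etc/crowdsec/scenarios/ and the agent restarted.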

Installing the agent

On Debian or Ubuntu, the official package installs via the project’s script followed by the usual apt install:

curl -s https://install.crowdsec.net | sudo sh
sudo apt install crowdsec

What’s left on the system is the agent and the LAPI (a REST API exposed on localhost) in the same process. For a single machine it’s comfortable; for a fleet, you can later separate the LAPI to a dedicated host and have agents on other nodes point there. To start, it’s not needed.

Once installed, sudo cscli version and sudo systemctl status crowdsec confirm everything started. If you hit errors, they’re almost always in the acquisition config (which logs to read), which we’ll cover next.

Collections: what saves you work

A collection is a package bundling parsers (how to read a specific log format) and scenarios (attack patterns) for a technology. CrowdSec maintains an official hub where the community publishes collections; installing them takes one command each. For a typical stack with Traefik in front, WordPress behind, and Gitea on some subdomain, it’s reasonable to install at least the Traefik, WordPress, and Gitea collections, plus Nextcloud (if you use it) and Nginx (if any Nginx instance is exposed directly). The install command is sudo cscli collections install <name> and it’s painless.
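To avoid typing the command once per collection, a small wrapper can loop over a list. This is only a sketch: install_collections is a hypothetical helper, and the collection names in the example call are assumptions that should be checked against the hub (for instance with cscli hub list) before running.

```shell
# Hypothetical helper: install each hub collection passed as an argument.
# Stops at the first failure so a typo in a name does not go unnoticed.
install_collections() {
  for name in "$@"; do
    sudo cscli collections install "$name" || return 1
  done
}

# Example call (names are assumptions, verify them in the hub first):
# install_collections crowdsecurity/traefik crowdsecurity/wordpress crowdsecurity/nginx
```

Newly installed parsers and scenarios are only picked up after the agent is restarted.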

Each one covers sensible ground: WordPress detects login brute force, user scans, wp-config.php attempts, and several others. Traefik leverages the access log to detect scans, bot patterns, and basic DoS. Gitea captures registration and login attacks. You don’t need to understand every scenario from day one; trust that the authors know what they’re doing, and within days you’ll have reviewed which ones fire on your real traffic.

Telling the agent what to read

The most often-forgotten step is telling CrowdSec which logs to read. This goes in /etc/crowdsec/acquis.yaml, and the concept is simple: each block declares a source (file, journald, etc.) and a type label that must match what parsers expect. For a setup with Traefik in Docker whose access log is at /var/log/traefik/access.log, WordPress inside a container, and SSH in systemd journal, typical config includes entries for the Traefik file with label traefik, for journal filtered by sshd.service with label syslog, and for Nginx if used with label nginx.
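The setup described above could translate into something like the following draft, written to a scratch directory so it can be compared with the live file before anything is replaced. The Nginx log paths are assumptions, and the SSH unit name follows the text; on some distributions the unit is called ssh.service instead.

```shell
# Draft an acquisition file in a scratch directory for review.
draft_dir="$(mktemp -d)"

cat > "$draft_dir/acquis.yaml" <<'EOF'
# Traefik access log (path taken from the setup described in the text)
filenames:
  - /var/log/traefik/access.log
labels:
  type: traefik
---
# SSH through the systemd journal
source: journalctl
journalctl_filter:
  - "_SYSTEMD_UNIT=sshd.service"   # may be ssh.service on your distribution
labels:
  type: syslog
---
# Nginx, only if it serves traffic directly (paths are assumptions)
filenames:
  - /var/log/nginx/access.log
  - /var/log/nginx/error.log
labels:
  type: nginx
EOF

echo "Draft written to $draft_dir/acquis.yaml"
```

Each block separated by --- is an independent source; the type under labels is what the parsers match on.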

The detail to note is that labels aren’t arbitrary: parsers filter by them, so putting type: traefik-access when the parser expects type: traefik means nothing gets detected, and you won’t see an explicit error. This is probably the most common mistake among new users. After editing, sudo systemctl restart crowdsec applies the changes, and within minutes sudo cscli metrics shows whether lines are being read.

First look at what it detects

Once running with acquisition properly configured, three diagnostic commands deserve a place in your routine. sudo cscli alerts list gives you triggered alerts (detections). sudo cscli decisions list shows active decisions, which are IPs currently blocked or under captcha. sudo cscli metrics gives an aggregate view of detection rate, cache hits, and general state. If you’ve been running 24 hours and none of those commands shows anything, your acquisition almost certainly isn’t reading what you think. Better to discover that now than when a real attack arrives.

Bouncers: from detection to effective blocking

This is where the decoupled design starts to shine. At this point CrowdSec detects but still doesn’t block anything. You need at least one bouncer, and in a stack with Traefik in front, the logical choice is the Traefik plugin maintained at maxlerebourg/crowdsec-bouncer-traefik-plugin. It’s declared in Traefik’s static config as an experimental plugin and, in dynamic config, as a middleware applied to the routes you want to protect. The plugin queries the LAPI in stream mode: it caches decisions locally and refreshes them at a configurable interval, so requests are checked against the cache and the added latency is negligible.

The plugin config needs an API key specific to that bouncer. Generate one with sudo cscli bouncers add traefik-bouncer, copy the token shown once, and put it in the middleware configuration. If you lose it, delete and regenerate without drama.
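A minimal sketch of the two Traefik fragments, written to a scratch directory, might look like this. The plugin alias (bouncer), the middleware name (crowdsec), the pinned version, and the LAPI address are all assumptions for illustration; check the option names and the current release against the plugin’s README before using them.

```shell
draft_dir="$(mktemp -d)"

# Static configuration fragment: declares the plugin.
cat > "$draft_dir/traefik-static.yml" <<'EOF'
experimental:
  plugins:
    bouncer:                       # local alias for the plugin, arbitrary
      moduleName: github.com/maxlerebourg/crowdsec-bouncer-traefik-plugin
      version: v1.2.0              # placeholder: pin the release you actually tested
EOF

# Dynamic configuration fragment: defines the middleware.
cat > "$draft_dir/traefik-dynamic.yml" <<'EOF'
http:
  middlewares:
    crowdsec:                      # hypothetical middleware name
      plugin:
        bouncer:
          enabled: true
          crowdsecMode: stream     # keep a local cache of decisions
          crowdsecLapiScheme: http
          crowdsecLapiHost: "127.0.0.1:8080"   # assumption: LAPI reachable here from Traefik
          crowdsecLapiKey: "PASTE_TOKEN_FROM_CSCLI_BOUNCERS_ADD"
          updateIntervalSeconds: 60            # how often the cache is refreshed
EOF

echo "Drafts written to $draft_dir"
```

If Traefik runs in a container, 127.0.0.1 will point at the container itself, so the LAPI host usually has to be a hostname or address reachable from inside that network.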

For SSH, a firewall-bouncer is reasonable as an additional piece, applying decisions at iptables or nftables level. The package is crowdsec-firewall-bouncer-iptables or its nftables variant, depending on what you use. It configures itself; barely needs touching. The important thing is to understand you now have two bouncers, one HTTP and one network, sharing the same decision source, which avoids duplicating detection logic.

Captcha remediation: not everything is blocking

The classic WAF mistake is blocking everything suspicious and discovering a week later that you were blocking legitimate customers whose IP got reused. CrowdSec lets you emit captcha-type decisions instead of ban, and the Traefik bouncer knows how to interpret them: it shows a challenge (typically Cloudflare Turnstile, which is free) instead of cutting the visitor off. If the challenge is solved, the user passes; if not, they stay blocked.

Which decisions become captcha and which become ban lives in /etc/crowdsec/profiles.yaml. The pattern that works well in production is sending brute-force scenarios (someone trying passwords) to captcha and serious-exploit scenarios (attempting to exploit known CVEs, accessing sensitive files) to ban. A legitimate user who mistyped several times solves the captcha and moves on; an exploit bot stays out.
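The split described above could be sketched with two profiles, as below. The profile names, the durations, and the rule of matching scenarios whose name contains "http-bf" are assumptions for the example; adapt the filter to the scenarios that actually fire on your traffic. The filter is limited to HTTP scenarios on purpose, because a captcha decision means nothing to a firewall bouncer that only applies bans.

```shell
draft_dir="$(mktemp -d)"

cat > "$draft_dir/profiles.yaml" <<'EOF'
# Profiles are evaluated top to bottom; on_success: break stops at the first match.
name: http_bruteforce_captcha      # hypothetical profile name
filters:
  - Alert.Remediation == true && Alert.GetScope() == "Ip" && Alert.GetScenario() contains "http-bf"
decisions:
  - type: captcha
    duration: 4h
on_success: break
---
name: default_ip_ban               # everything else that asks for remediation
filters:
  - Alert.Remediation == true && Alert.GetScope() == "Ip"
decisions:
  - type: ban
    duration: 4h
on_success: break
EOF

echo "Draft written to $draft_dir/profiles.yaml"
```

Order matters here: if the ban profile came first, it would match every alert and the captcha profile would never be reached.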

The Turnstile integration only needs two keys (site key and secret key) obtained from the Cloudflare panel and placed in the middleware config. The widget is free and has no practical limits for normal use.
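In the middleware, the captcha settings might look like the fragment below. Treat it as a sketch: the option names follow the plugin’s README as far as I know them, the key values are placeholders, and the HTML template path is an assumption (the plugin expects a captcha page template to be available to Traefik).

```shell
draft_dir="$(mktemp -d)"

cat > "$draft_dir/traefik-captcha.yml" <<'EOF'
http:
  middlewares:
    crowdsec:                      # hypothetical middleware name
      plugin:
        bouncer:
          enabled: true
          crowdsecLapiKey: "PASTE_TOKEN_FROM_CSCLI_BOUNCERS_ADD"
          captchaProvider: turnstile
          captchaSiteKey: "TURNSTILE_SITE_KEY"       # placeholder
          captchaSecretKey: "TURNSTILE_SECRET_KEY"   # placeholder
          captchaHTMLFilePath: /captcha.html         # assumption: template mounted at this path
EOF

echo "Draft written to $draft_dir/traefik-captcha.yml"
```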

The real value of the community blocklist

Registering your installation with the central API, the CAPI (sudo cscli capi register), puts you in the community network. You start receiving a constantly updated list of IPs currently attacking others, and you contribute your own detections in anonymized form. In practice, this blocks hundreds or thousands of known IPs before they reach your services. The impact is measurable: compare your metrics before and after enabling CAPI and you’ll see how the noise from routine probing of common endpoints drops.

There’s an ethical nuance here. Sharing detections contributes to collective defense, but you’re also sending information about your traffic (even if IPs are partially anonymized). For most installations the trade-off is acceptable; for environments with strict privacy requirements, read the terms before enabling.

Monitoring: without it, operating makes no sense

CrowdSec exposes Prometheus metrics on a local HTTP endpoint that you just need to scrape. Useful names for alerting are cs_bucket_overflowed_total (total accumulated detections), cs_active_decisions (currently blocked IPs), and the cache-performance metrics. There’s an official Grafana dashboard that gives you a global view in minutes.
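A scrape job for Prometheus could be drafted as follows. It assumes the agent’s metrics endpoint listens on 127.0.0.1:6060, which is the usual default in CrowdSec’s config.yaml, and that Prometheus runs on the same host; the job name is arbitrary.

```shell
draft_dir="$(mktemp -d)"

cat > "$draft_dir/prometheus-crowdsec.yml" <<'EOF'
scrape_configs:
  - job_name: crowdsec             # arbitrary job name, reused in alert rules
    static_configs:
      - targets: ["127.0.0.1:6060"]   # assumption: default listen address and port
EOF

# Manual check of the endpoint, to be run on the CrowdSec host:
# curl -s http://127.0.0.1:6060/metrics | grep '^cs_'
echo "Draft written to $draft_dir/prometheus-crowdsec.yml"
```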

What I always configure on top of that is a basic alert routed through Alertmanager: if the agent stops emitting metrics for more than five minutes, something is failing (service down, broken acquisition, full disk). Alerts on detection spikes are useful but noisy; the absence alert is the one that saves you when something silently stops working.
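The absence alert can be expressed as a Prometheus alerting rule along these lines. The alert name, the severity label, and the job name crowdsec are assumptions; the expression simply fires when no healthy scrape of that job has existed for five minutes.

```shell
draft_dir="$(mktemp -d)"

cat > "$draft_dir/crowdsec-alerts.yml" <<'EOF'
groups:
  - name: crowdsec
    rules:
      - alert: CrowdSecMetricsMissing          # hypothetical alert name
        # True when there is no successful scrape of the "crowdsec" job.
        expr: absent(up{job="crowdsec"} == 1)
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "CrowdSec has not exposed metrics for 5 minutes"
EOF

echo "Draft written to $draft_dir/crowdsec-alerts.yml"
```

Using absent() rather than a plain up == 0 check also covers the case where the scrape target disappears from the configuration altogether.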

Whitelists and common mistakes

A day-one universal lesson is: before blocking anything seriously, make sure your own legitimate IPs are whitelisted. The office you access from, the VPN you use, CI/CD that deploys, external uptime monitors. This goes in /etc/crowdsec/parsers/s02-enrich/whitelists.yaml and accepts individual IPs and CIDR ranges. The number of admins who’ve ended up blocking themselves during the first hours of use is literally higher than the number of real attackers they’ve frustrated that same day.
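A whitelist file for that path might look like the draft below. The name and reason are made up for the example, and the addresses come from the ranges reserved for documentation, so they must be replaced with your real office, VPN, CI/CD, and monitoring sources.

```shell
draft_dir="$(mktemp -d)"

cat > "$draft_dir/whitelists.yaml" <<'EOF'
name: example/my-whitelists        # hypothetical name
description: "Trusted sources that must never be banned"
whitelist:
  reason: "office, VPN, CI/CD, uptime monitors"
  ip:
    - "203.0.113.10"               # placeholder address, replace with yours
  cidr:
    - "198.51.100.0/24"            # placeholder range, replace with yours
EOF

echo "Draft written to $draft_dir/whitelists.yaml"
```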

Another frequent mistake is not restarting the agent after each config change. CrowdSec doesn’t hot-reload most files; a systemctl restart crowdsec after editing is basic discipline.

When CrowdSec doesn’t pay off

There are scenarios where fail2ban is still simpler and more appropriate. Single server, low traffic, no need to share intelligence between nodes, no Traefik plugin because you don’t use Traefik. In those cases, CrowdSec’s additional infrastructure (agent, LAPI, separate bouncers, acquisition config) is complexity without clear benefit.

CrowdSec starts paying off when you have more than one layer to protect (web + SSH + specific apps), when sharing detections between nodes saves work, or when the community blocklist cuts significant noise for your traffic. On a single hobby VPS, maybe not worth it; on a production stack with real exposure, almost always yes.

My recommendation

If you’re going to try it, my suggestion is progressive rollout. First, install the agent, configure acquisition, and leave it a week in detection-only mode with no bouncer. See what fires, tune scenarios if noisy, add whitelists for legitimate IPs. Only then activate the first bouncer (Traefik’s is the most visible). Wait another week, add captcha remediation for brute force, and finally connect to CAPI. Reaching this maturity level takes two to three weeks of living with it.

Compared to fail2ban, you’ll quickly discover that CrowdSec offers much more visibility (the cscli alerts and cscli decisions subcommands are comfortable to work with) and much more flexibility (adding a custom scenario for your own application is a small exercise). What you pay is a couple of days of initial learning curve. For any stack with serious internet exposure, the investment is absolutely reasonable, and at six months you probably won’t even remember how fail2ban was configured.
