Back to Clickhouse
We were very happy with what we achieved using InfluxDB, its easy use of timeseries was very helpful in our work of trends, machine learning and stats.
However, using InfluxDB there is a query that we cannot ask: who was the IP that asked for a given domain query? the relationship IP-domain has a high cardinality so the engine died very often.
Thus, we decided to go back to our own recommendation from Usenix SRECON 2018 and use a ClickHouse column database to store the queries and materialized views to produce aggregations.
Now, after install ClickHouse on your servers, write cat tables.sql | clickhouse-client --multiquery
to create the database schema, and use grafana to watch the aggregations, for instance, the sum of queries by type on a time span will be:
SELECT t, Type, sum(c)
FROM $table
WHERE $timeFilter
GROUP BY t, Type
ORDER BY t, Type