Store the system logs in MariaDB
841 words, 4 minutes
I’ve used Elasticsearch on OpenBSD to store my system logs for quite long now. And if it does the job, there are a few things I don’t like so much with it.
I only used a single instance so I was warned about availability. But a sudden power outage had severe impact on my daily data. Way much more than what I expected from a Production-ready software. Rebuilding and re-indexing the data was a real pain in the ass. From time to time, I also get errors about indexing that seem to go away without doing nothing.
The latter is probably due to my low memory server. But I want to store logs for only a couple of boxes. And I don’t want to reserve 4GB of RAM just for this. This “gimme more RAM” manner really annoys me. And as I also need RAM for Logstash (to parse the data and send them to Elasticsearch), this leads to way too much resources consumption.
That said, I decided to test another way for storing the logs : using a RDBMS, namely MariaDB. I already have one running smooth. And I read Grafana was able to read data from it using SQL commands.
How it’ll work
The stock syslogd(8) will be configured to send everything it gets to a local (or remote) syslog-ng daemon. The latter will parse, filter, format and store the logs into a (remote) MySQL / MariaDB instance.
Prepare the RDBMS
I’m using mariadb-server-10.0.34v1 on OpenBSD 6.3/amd64.
First of all, I want to be able to compress the (text) data from the logs. So I had to enable a few InnoDB related options.
# vi /etc/my.cnf
(...)
innodb_file_per_table = 1
innodb_file_format = barracuda
innodb_strict_mode = 1
(...)
# rcctl restart mysqld
Then, I simply created a database and the credentials that’d be used by syslog-ng.
# mysql -u root -p
(...)
> CREATE database logsink;
> GRANT ALL PRIVILEGES ON logsink.* \
TO 'syslog-ng'@'%' IDENTIFIED BY 'changeme';
> FLUSH PRIVILEGES;
Install and configure Syslog-NG
There are drivers required by syslog-ng to store data into mysql.
# pkg_add syslog-ng libdbi-drivers-mysql
Syslog-NG will listen on all interfaces, UDP and TCP ports. This way, any other box can send its logs to him.
# vi /etc/syslog-ng/syslog-ng.conf
(...)
source s_net {
udp(port(8514));
tcp(port(8514));
};
(...)
destination d_mysql_compressed {
sql(
type(mysql)
host("127.0.0.1") username("syslog-ng") password("changeme")
database("logsink")
table("_all")
create-statement-append(ROW_FORMAT=COMPRESSED)
columns(
"seq bigint(20) unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY",
"unixtime bigint NOT NULL",
"facility varchar(16)",
"priority varchar(16)",
"level varchar(16)",
"host varchar(64) NOT NULL",
"program varchar(64) NOT NULL",
"pid smallint",
"message text",
"tag varchar(32)"
)
values(
"", "${UNIXTIME}", "$FACILITY_NUM", "$PRIORITY", "$LEVEL_NUM",
"${HOST}", "$PROGRAM", "${PID}", "${MSGONLY}", "$TAG"
)
indexes("unixtime", "host", "program", "tag")
null("")
);
};
(...)
log { source(s_net); filter(f_all); destination(d_mysql_compressed); };
(...)
# rcctl enable syslog_ng
# rcctl start syslog_ng
When this is done, configure syslogd(8).
# vi /etc/syslog.conf
(...)
*.* @127.0.0.1:8514
# rcctl restart syslogd
Explore the logs
From here, the logs should be stored in MariaDB / MySQL.
A first look at the tables shows “COMPRESSED” is better than the standard storage ; regarding disk usage.
> SELECT TABLE_NAME,ENGINE,ROW_FORMAT,TABLE_ROWS,DATA_LENGTH,INDEX_LENGTH,DATA_FREE \
FROM information_schema.tables WHERE table_schema='logsink';
+----------------+--------+------------+------------+-------------+--------------+-----------+
| TABLE_NAME | ENGINE | ROW_FORMAT | TABLE_ROWS | DATA_LENGTH | INDEX_LENGTH | DATA_FREE |
+----------------+--------+------------+------------+-------------+--------------+-----------+
| _all | InnoDB | Compressed | 92409 | 6561792 | 5275648 | 2097152 |
| _all_compact | InnoDB | Compact | 93643 | 14172160 | 10551296 | 7340032 |
+----------------+--------+------------+------------+-------------+--------------+-----------+
From the filesystem POV, the gain is also clearly visible.
-rw-rw---- 1 _mysql _mysql 3.3K Aug 3 16:22 _all.frm
-rw-rw---- 1 _mysql _mysql 16.0M Aug 6 15:52 _all.ibd
-rw-rw---- 1 _mysql _mysql 3.3K Aug 3 16:22 _all_compact.frm
-rw-rw---- 1 _mysql _mysql 36.0M Aug 6 15:52 _all_compact.ibd
Have a look at the most verbose programs is just a matter of writing SQL sentence:
> SELECT program, COUNT(program) AS messages FROM _all GROUP BY program ORDER BY messages DESC;
+---------------------+----------+
| program | messages |
+---------------------+----------+
| monit | 47596 |
| smtpd | 19689 |
| rspamd | 12884 |
| doas | 4546 |
| collectd | 4265 |
| sshd | 3018 |
| cron | 2545 |
(...)
The logs can be accessed and rendered by Grafana. A simple query can print the last logs. Add alerting when some value appear and you have a nice event-based monitoring tool.
Organize storage
There are logs that I don’t want to store. And there are some that I want to store in a specific table. This can be done in Syslog-NG using filters.
<pre>filter f_all {
not program("fetchmail");
and not program("monit");
and not filter(f_unbound);
and not filter(f_apache);
};
This will not send messages from fetchmail or monit to the compressed table. Nor will it send messages that match the f_unbound and f_apache filters. Those two guys are used to store messages in a specific table with a dedicated schema. I’ll probably write about the details some day…
Now… send all your logs to Syslog-NG rather than Logstash. Count to 10 and get your RAM back! So far, MariaDB seem to handle it pretty well.