Feuerfest

Just the private blog of a Linux sysadmin

How to stop logging HTTP-Requets from check_http in Apache 2

Photo by Sena: https://www.pexels.com/photo/wood-covered-in-snow-10944259/

I encountered a situation where the HTTP(S)-Requests done by Icinga via check_http checkplugin seriously messed up the generated website traffic reports. The software used to generate said reports wasn't capable of filtering out certain requests based on User-Agents. Restructuring the report in a way that the hits from the monitoring requests could be easily identified was also out of the question as this would generate too much follow-up work for other involved parties & systems.

The solution (or rather: workaround...) was to simply omit the logging of said monitoring requests to the logfile of the Apache vHost.

The relevant line in the virtual host config was the following:

CustomLog /var/log/apache2/host.domain-tld-access.log combined

This defines where to store the logfile in the "NCSA extended/combined log format" as specified on https://httpd.apache.org/docs/2.4/mod/mod_log_config.html#formats respectively in the /etc/apache2/apache2.conf in Debian.

root@host:~ # grep LogFormat /etc/apache2/apache2.conf
LogFormat "%v:%p %h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" vhost_combined
LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %O" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent

SetEnvIf to the rescue!

Apache2 allows us to set environment variables for each request via the SetEnvIf directive (among others). This is documented on the page for mod_setenvif: https://httpd.apache.org/docs/2.4/mod/mod_setenvif.html#setenvif

A typical line in the access.log would log like the following:

256.133.7.42 - - [30/Dec/2024:05:46:18 +0100] "GET / HTTP/1.0" 302 390 "-" "check_http/v2.3.1 (monitoring-plugins 2.3.1)

Since the requests were for the URI / we can't exclude requests based on Request_URI. Instead, we have to go for the supplied user-agent. Fortunately this is quite unique.

To catch future versions we use a wildcard-regex for the version numbers. As environment variable I choose monitoring.

# Don't log Icinga monitoring requests
SetEnvIf User-Agent "check_http/v(.*) \(monitoring-plugins (.*)\)" monitoring

Now all that is needed is to adopt the CustomLog-directive in the virtual host configuration. As the documentation for the CustomLog-directive explains we can access the environment variables via the env= option. Negating this will simply omit all requests with that environment variable set.

Therefore our resulting CustomLog line now looks like this:

CustomLog /var/log/apache2/admin.brennt.net-access.log combined env=!monitoring

Now just check your configuration for errors via apach2ctl configtest and restart your Apache webserver.

Comments

Monitoring Teamspeak3 servers with check_teamspeak3 & Icinga2

Photo by cottonbro studio: https://www.pexels.com/photo/woman-sitting-on-the-floor-among-laptops-and-tangled-cables-and-wearing-goggles-8721343/

Just a short note as no one seems to really mention this. If you want to monitor your Teamspeak 3 server with Icinga2 (or Nagios or any other compatible monitoring system): check_teamspeak3 from xicon.eu / xiconfjs still works flawlessly despite the fact that the "Last commit" being 3 years old.

A little bit of background, or: UDP monitoring is hard

Teamspeak 3 - being a voice chat - utilizes UDP for most of it's services. Understandably as speed is key in providing enjoyable voice communications and simultaneously it can cope well with a few lost packets.

This however is the main problem in monitoring. With TCP you can open a simple TCP-Connect and if the port is open assume that your service is working. With UDP: You can't as UDP won't give you any feedback if any your packets were received as UDP in its entirety lacks the "Transmission Control" part of TCP in favour of faster packet sending/progressing. Therefore you have to send a request that provokes an reply and thus enables you to check that reply against your expected "Known good/working" reply.

This means that you must delve into the depths of a protocol in order to know what your packets must include. And this is the part where it often gets complicated, time-consuming and cumbersome. More often than not people rather went with a "Let's just check if the process is running." or "If the application opens a TCP-port too, let's just check that one." solution.

I however wanted a detailed check and check_teamspeak3 exactly does this. It connects to the voice datagram port and sends the appropriate encoded UDP packets and checks the result. Et voilà we have a monitoring check that really checks the working condition.

Thanks xiconfjs!

Alternatives

For those coming here in order to search for other ways to monitor Teamspeak 3: You can of course do one of the following:

  • Check if the Teamspeak process is running
  • Examine the list of locally opened ports and check if the ports are in listening mode (lsof, netstat, etc.)
  • Expose the serverquery port and do your checks via commands (see: https://community.teamspeak.com/t/how-to-use-the-server-query/25386)
  • The FileTransfer part of Teamspeak uses a TCP port, you could verify that this port is open with a simple check_tcp servicecheck
Comments

How to monitor your APT-repositories with Icinga

Photo by Pixabay: https://www.pexels.com/photo/software-engineers-working-on-computers-256219/

During my series about unattended-upgrades (Part 1, Part 2) I noticed that a Debian Mirror I use was unresponsive for 22 days but I had nothing to notify me of this. As this also meant that unattended-upgrades didn't apply any patches I wanted a check for this which in turn will trigger unattended-upgrades when there are outstanding updates.

The problem

The monitoring-plugins provide the check_apt plugin. This is normally used to check for available packages. However, in the default configuration it doesn't execute an apt-get update as this requires root privileges. Personally I think the risk is worth the gain. As adding the -u parameter will execute an apt-get update and therefore check_apt will notify you when apt-get update finishes with a non-zero exit-code.

Take the following problem:

root@host:~# apt-get update
Hit:1 http://security.debian.org/debian-security bookworm-security InRelease
Hit:2 http://debian.tu-bs.de/debian bookworm InRelease
Get:3 http://debian.tu-bs.de/debian bookworm-updates InRelease [55.4 kB]
Reading package lists... Done
E: Release file for http://debian.tu-bs.de/debian/dists/bookworm-updates/InRelease is expired (invalid since 16d 0h 59min 33s). Updates for this repository will not be applied.

A mere check_apt wont notify you of any problems:

root@host:~# /usr/lib/nagios/plugins/check_apt
APT CRITICAL: 12 packages available for upgrade (12 critical updates). |available_upgrades=12;;;0 critical_updates=12;;;0

Making this go undetected.

Executed with the -u parameter however, we are notified of the problem:

root@host:~# /usr/lib/nagios/plugins/check_apt -u
'/usr/bin/apt-get -q update' exited with non-zero status.
APT CRITICAL: 12 packages available for upgrade (12 critical updates).  warnings detected, errors detected.|available_upgrades=12;;;0 critical_updates=12;;;0

The solution

Fixing this via Icinga is however a bit more complicated, as the standard apt CheckCommand from the Icinga Template Library (ITL) doesn't include the -u option and isn't prefixed to use sudo despite root privileges being needed. This can be checked here: https://github.com/Icinga/icinga2/blob/master/itl/command-plugins.conf#L2155 or in your local /usr/share/icinga2/include/command-icinga.conf if you happen to use Icinga.

The root cause is also the number one main problem I have with the check_apt CheckPlugin. check_apt is designed to actually install package updates when check_apt reports outstanding available updates. This however breaks the number one paradigm I have regarding monitoring systems: They should not modify the system on their own. And when they do, they should do it in the same way as it is normally done. check_apt breaks this.

Maybe that person should have read a blog article about unattended-upgrades prior to writting that plugin? 😜

Normally you utilize Event Commands for that type of scenario: "If service X is in state Y execute event command Z."

The CheckCommand check_apt_update

Therefore I recommend creating your own apt_update CheckCommand and using that.

object CheckCommand "check_apt_update" {
        command = [ "/usr/bin/sudo", + PluginDir + "/check_apt" ]

        arguments = {
                "-u" = {
                        description = "Perform an apt-get update"
                }
        }
}

Defining the service and configuring the EventCommand

Then in your service definition add a suitable event_command:

apply Service "apt repositories" to Host {
  import "hourly-service"

  check_command = "check_apt_update"

  enable_event_handler = true
  // Execute unattended-upgrades automatically if service goes critical
  event_command = "execute_unattended_upgrades"
  // For services which should be executed ON the host itself
  command_endpoint = host.vars.agent_endpoint

  assign where host.vars.distribution == "Debian"

}

Creating the EventCommand

And create the EventCommand like this:

object EventCommand "execute_unattended_upgrades" {
  command = "sudo /usr/bin/unattended-upgrades"
}

Necessary sudo rights

This requires a sudo config file for the icinga user executing that command. And the commands must be executable without the need for a TTY, hence we end up with the following:

root@host:~# cat /etc/sudoers.d/icinga2
# This line disables the need for a tty for sudo
#  else we will get all kind of "sudo: a password is required" errors
Defaults:icinga2 !requiretty

# sudo rights for Icinga2
icinga2  ALL=(ALL) NOPASSWD: /usr/bin/unattended-upgrades
icinga2  ALL=(ALL) NOPASSWD: /usr/bin/unattended-upgrade
icinga2  ALL=(ALL) NOPASSWD: /usr/bin/apt-get
icinga2  ALL=(ALL) NOPASSWD: /usr/lib/nagios/plugins/check_apt

Conclusion

This is now sufficient as I'm notified when something prevents APT from properly updating the package lists. APT itself takes care to validate the various entries inside the Release file and exits with a non-zero exit-code, so there is no need to put that logic inside of check_apt.

Setting up similar checks for other monitoring systems is of course also possible. In general raising an alarm when apt-get update throws an non-zero exit-code is a somewhat foolproof method.

Comments

Icinga2 error "check command does not exist" because of missing constant

Photo by Christina Morillo: https://www.pexels.com/photo/software-engineer-standing-beside-server-racks-1181354/

Apparently this problem kept me busy far too long, as I kept looking into the Icinga2 Master logfiles only. Main due to the service definition for my icinga CheckCommand still being from a time when it was only one Master without any Agents. This lead to it being executed on the Master and hence I never saw the problems on the agent..

Additionally the cluster and cluster-health checks only check if all endpoints are connected. Which was the case all the time. Therefore I got no error there too.

But what happened?

I defined a new CheckCommand. It worked fine on the master. Then I re-rewrote the service apply-Rule so that it matches for all Linux hosts being monitored. And then I got Check command not found for all these new service checks on all agent hosts.

I deleted the API config sync directories and restarted Icinga2 on the agents to trigger a new sync:

root@agent:/etc/icinga2# rm /var/lib/icinga2/api/zones-stage/* -rf && rm /var/lib/icinga2/api/zones/* -rf
root@agent:/etc/icinga2# systemctl restart icinga2.service

And suddenly all CheckCommands which are not part of the Icinga Template Library stopped working on the agents.

Uhm, ok. At this point I suspected I had somehow messed up my /etc/icinga2/zones.conf file some time ago. Turns out, this wasn't the case.

The root cause

Some weeks ago I defined a service check which is only executed on my Icinga2 master. However I stored the CheckCommand and Service-Configuration under /etc/icinga2/zones.d/master anyway as you never know when this comes in handy. (This has since been corrected in the article.) But the Telegram API requires a Token. And I defined that in /etc/icinga2/constants.conf - but this file isn't synced as it is outside of /etc/icinga2/zones.d/master. Something which I did on purpose, as I didn't want to sync the Token to all agents.

This apparently caused the config file sync to run into an syntax error as the constant for the Token couldn't be resolved.
But again.. This was only logged in the logfiles on the agents..

root@agent:/etc/icinga2# cat /var/log/icinga2/icinga2.log
[...]
[2024-07-17 22:39:04 +0200] information/ApiListener: Received configuration for zone 'global-templates' from endpoint 'master.domain.tld'. Comparing the timestamp and checksums.
[2024-07-17 22:39:04 +0200] information/ApiListener: Stage: Updating received configuration file '/var/lib/icinga2/api/zones-stage/global-templates//_etc/eventcommands.conf' for zone 'global-templates'.
[2024-07-17 22:39:04 +0200] information/ApiListener: Stage: Updating received configuration file '/var/lib/icinga2/api/zones-stage/global-templates//_etc/groups.conf' for zone 'global-templates'.
[2024-07-17 22:39:04 +0200] information/ApiListener: Stage: Updating received configuration file '/var/lib/icinga2/api/zones-stage/global-templates//_etc/host-templates.conf' for zone 'global-templates'.
[2024-07-17 22:39:04 +0200] information/ApiListener: Stage: Updating received configuration file '/var/lib/icinga2/api/zones-stage/global-templates//_etc/notifications.conf' for zone 'global-templates'.
[2024-07-17 22:39:04 +0200] information/ApiListener: Stage: Updating received configuration file '/var/lib/icinga2/api/zones-stage/global-templates//_etc/service-templates.conf' for zone 'global-templates'.
[2024-07-17 22:39:04 +0200] information/ApiListener: Stage: Updating received configuration file '/var/lib/icinga2/api/zones-stage/global-templates//_etc/telegrambot-notifications.conf' for zone 'global-templates'.
[2024-07-17 22:39:04 +0200] information/ApiListener: Stage: Updating received configuration file '/var/lib/icinga2/api/zones-stage/global-templates//_etc/templates.conf' for zone 'global-templates'.
[2024-07-17 22:39:04 +0200] information/ApiListener: Stage: Updating received configuration file '/var/lib/icinga2/api/zones-stage/global-templates//_etc/timeperiods.conf' for zone 'global-templates'.
[2024-07-17 22:39:04 +0200] information/ApiListener: Stage: Updating received configuration file '/var/lib/icinga2/api/zones-stage/global-templates//_etc/users.conf' for zone 'global-templates'.
[2024-07-17 22:39:04 +0200] information/ApiListener: Applying configuration file update for path '/var/lib/icinga2/api/zones-stage/global-templates' (6688 Bytes).
[2024-07-17 22:39:04 +0200] information/ApiListener: Received configuration updates (2) from endpoint 'master.domain.tld' are different to production, triggering validation and reload.
[2024-07-17 22:39:04 +0200] critical/ApiListener: Config validation failed for staged cluster config sync in '/var/lib/icinga2/api/zones-stage/'. Aborting. Logs: '/var/lib/icinga2/api/zones-stage//startup.log'
[...]

The /var/lib/icinga2/api/zones-stage/startup.log has the details:

root@agent:/etc/icinga2# cat /var/lib/icinga2/api/zones-stage/startup.log
[2024-07-17 23:36:19 +0200] information/cli: Icinga application loader (version: r2.12.3-1)
[2024-07-17 23:36:19 +0200] information/cli: Loading configuration file(s).
[2024-07-17 23:36:19 +0200] information/ConfigItem: Committing config item(s).
[2024-07-17 23:36:19 +0200] critical/config: Error: Error while evaluating expression: Tried to access undefined script variable 'TelegramBotToken'
Location: in /var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf: 46:26-46:41
/var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf(44):     HOSTDISPLAYNAME = "$host.display_name$"
/var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf(45):     SERVICEDISPLAYNAME = "$service.display_name$"
/var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf(46):     TELEGRAM_BOT_TOKEN = TelegramBotToken
                                                                                                               ^^^^^^^^^^^^^^^^
/var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf(47):     TELEGRAM_CHAT_ID = "$user.vars.telegram_chat_id$"
/var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf(48):

[2024-07-17 23:36:19 +0200] critical/config: Error: Error while evaluating expression: Tried to access undefined script variable 'TelegramBotToken'
Location: in /var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf: 20:26-20:41
/var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf(18):     NOTIFICATIONCOMMENT = "$notification.comment$"
/var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf(19):     HOSTDISPLAYNAME = "$host.display_name$"
/var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf(20):     TELEGRAM_BOT_TOKEN = TelegramBotToken
                                                                                                               ^^^^^^^^^^^^^^^^
/var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf(21):     TELEGRAM_CHAT_ID = "$user.vars.telegram_chat_id$"
/var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf(22):

[2024-07-17 23:36:19 +0200] critical/config: 2 errors
[2024-07-17 23:36:19 +0200] critical/cli: Config validation failed. Re-run with 'icinga2 daemon -C' after fixing the config.

However... The tricky part is that a config validation will succeed!

root@agent:/etc/icinga2# icinga2 daemon -C
[2024-07-18 00:00:16 +0200] information/cli: Icinga application loader (version: r2.12.3-1)
[2024-07-18 00:00:16 +0200] information/cli: Loading configuration file(s).
[2024-07-18 00:00:16 +0200] information/ConfigItem: Committing config item(s).
[2024-07-18 00:00:16 +0200] information/ApiListener: My API identity: agent.domaint.tld
[2024-07-18 00:00:16 +0200] information/ConfigItem: Instantiated 1 CheckerComponent.
[2024-07-18 00:00:16 +0200] information/ConfigItem: Instantiated 5 Zones.
[2024-07-18 00:00:16 +0200] information/ConfigItem: Instantiated 1 IcingaApplication.
[2024-07-18 00:00:16 +0200] information/ConfigItem: Instantiated 2 Endpoints.
[2024-07-18 00:00:16 +0200] information/ConfigItem: Instantiated 1 FileLogger.
[2024-07-18 00:00:16 +0200] information/ConfigItem: Instantiated 235 CheckCommands.
[2024-07-18 00:00:16 +0200] information/ConfigItem: Instantiated 1 ApiListener.
[2024-07-18 00:00:16 +0200] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2024-07-18 00:00:16 +0200] information/cli: Finished validating the configuration file(s).

And this was the reason why I was too focused on the master..

What I learned later is, that you can utilize the following command to validate the configuration from the stage-dir.
Documentation for the Config Sync: Receive Config is here.

root@agent:/var/log/icinga2# icinga2 daemon -C --define System.ZonesStageVarDir=/var/lib/icinga2/api/zones-stage/
[2024-07-21 16:28:51 +0200] information/cli: Icinga application loader (version: r2.12.3-1)
[2024-07-21 16:28:51 +0200] information/cli: Loading configuration file(s).
[2024-07-21 16:28:51 +0200] information/ConfigItem: Committing config item(s).
[2024-07-21 16:28:51 +0200] critical/config: Error: Error while evaluating expression: Tried to access undefined script variable 'TelegramBotToken'
Location: in /var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf: 20:26-20:41
/var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf(18):     NOTIFICATIONCOMMENT = "$notification.comment$"
/var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf(19):     HOSTDISPLAYNAME = "$host.display_name$"
/var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf(20):     TELEGRAM_BOT_TOKEN = TelegramBotToken
                                                                                                               ^^^^^^^^^^^^^^^^
/var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf(21):     TELEGRAM_CHAT_ID = "$user.vars.telegram_chat_id$"
/var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf(22):

[2024-07-21 16:28:51 +0200] critical/config: Error: Error while evaluating expression: Tried to access undefined script variable 'TelegramBotToken'
Location: in /var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf: 46:26-46:41
/var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf(44):     HOSTDISPLAYNAME = "$host.display_name$"
/var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf(45):     SERVICEDISPLAYNAME = "$service.display_name$"
/var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf(46):     TELEGRAM_BOT_TOKEN = TelegramBotToken
                                                                                                               ^^^^^^^^^^^^^^^^
/var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf(47):     TELEGRAM_CHAT_ID = "$user.vars.telegram_chat_id$"
/var/lib/icinga2/api/zones-stage//global-commands/_etc/telegrambot-commands.conf(48):

[2024-07-21 16:28:51 +0200] critical/config: 2 errors
[2024-07-21 16:28:51 +0200] critical/cli: Config validation failed. Re-run with 'icinga2 daemon -C' after fixing the config.

The solution

On the master I moved the 2 files from /etc/icinga2/zones.d to /etc/icinga2/conf.d and restarted the service.

root@master:/etc/icinga2# mv /etc/icinga2/zones.d/global-commands/telegrambot-commands.conf /etc/icinga2/conf.d/
root@master:/etc/icinga2# mv /etc/icinga2/zones.d/global-templates/telegrambot-notifications.conf /etc/icinga2/conf.d/
root@master:/etc/icinga2# systemctl restart icinga2.service

On the agent a simple restart is enough:

root@agent:/etc/icinga2# systemctl restart icinga2.service

And after that everything worked again.

Another problem detected - an even deeper rooted cause

In the aftermath I was curious why & how Icinga didn't notify me that the config in the stage-dir couldn't be validated. Shouldn't there be some kind of included check for this?

Yes, turns out the built-in Icinga CheckCommand does exactly this. But it was never executed on my agent. As I still had a service definition from a time when I didn't have any agents. Initially the configuration was the following:

// Checks the agent health
apply Service "icinga" {
  import "generic-service"

  check_command = "icinga"

  assign where (host.address || host.address6) && host.vars.os == "Linux"
}

This was still a remnant of having only the Icinga Master and no agents. But this lead to it being executed on the Master. Which is... Not smart if you want to validate the configuration on the Agent.

After changing it to the following:

// Checks the agent health - must be executed on the agent
apply Service "icinga" {
  import "generic-service"

  check_command = "icinga"

  command_endpoint = host.vars.agent_endpoint

  assign where host.vars.agent_endpoint
}

The check worked as intended.

Oh, and I opened a pull request to enhance Icinga's documentation regarding the config sync: https://github.com/Icinga/icinga2/pull/10101. Let's see if it get's accepted.

Comments

Icinga2 Monitoring notifications via Telegram Bot (Part 1)

Photo by Kindel Media: https://www.pexels.com/photo/low-angle-shot-of-robot-8566526/

One thing I wanted to set up for a long time was to get my Icinga2 notifications via some of the Instant Messaging apps I have on my mobile. So there was Threema, Telegram and Whatsapp to choose from.

Well.. Threema wants money for this kind of service, Whatsapp requires a business account who must be connected with the bot. And Whatsapp Business means I have to pay again? - Don't know - didn't pursue that path any further. As this either meant I would've needed to convert my private account into a business account or get a second account. No, sorry. Not something I want.

Telegram on the other hand? "Yeah, well, message the @botfather account, type /start, type /newbot, set a display and username, get your Bot-Token (for API usage) that's it. The only requirement we have? The username must end in bot." From here it was an easy decision which app I choose. (Telegram documentation here.)

Creating your telegram bot

  1. Search for the @botfather account on Telegram; doesn't matter if you use the mobile app or do it via Telegram Web.
  2. Type /start and a help message will be displayed.
  3. To create a new bot, type: /newbot
  4. Via the question "Alright, a new bot. How are we going to call it? Please choose a name for your bot." you are asked for the display name of your bot.
    • I choose something generic like "Icinga Monitoring Notifications".
  5. Likewise the question "Good. Now let's choose a username for your bot. It must end in `bot`. Like this, for example: TetrisBot or tetris_bot." asks for the username.
    • Choose whatever you like.
  6. If the username is already taken Telegram will state this and simply ask for a new username until you find one which is available.
  7. In the final message you will get your token to access the HTTP API. Note this down and save it in your password manager. We will need this later for Icinga.
  8. To test everything send a message to your bot in Telegram

That's the Telegram part. Pretty easy, right?

Testing our bot from the command line

We are now able to receive (via /getUpdates) and send messages (via /sendMessage) from/to our bot. Define the token as a shell variable and execute the following curl command to get the message that was sent to your bot. Note: Only new messages are received. If you already viewed them in Telegram Web the response will be empty. As seen in the first executed curl command.

Just close Telegram Web and the App on your phone and sent a message via curl. This should do the trick. Later we define our API-Token as a constant in the Icinga2 configuration.

For better readability I pipe the output through jq.

When there is a new message from your Telegram-Account to your bot, you will see a field with the named id. Note this number down. This is the Chat-ID from your account and we need this, so that your bot can actually send you messages.

Relevant documentation links are:

user@host:~$ TOKEN="YOUR-TOKEN"
user@host:~$ curl --silent "https://api.telegram.org/bot${TOKEN}/getUpdates" | jq
{
  "ok": true,
  "result": []
}
user@host:~$ curl --silent "https://api.telegram.org/bot${TOKEN}/getUpdates" | jq
{
  "ok": true,
  "result": [
    {
      "update_id": NUMBER,
      "message": {
        "message_id": 3,
        "from": {
          "id": CHAT-ID,
          "is_bot": false,
          "first_name": "John Doe Example",
          "username": "JohnDoeExample",
          "language_code": "de"
        },
        "chat": {
          "id": CHAT-ID,
          "first_name": "John Doe Example",
          "username": "JohnDoeExample",
          "type": "private"
        },
        "date": 1694637798,
        "text": "This is a test message"
      }
    }
  ]
}

Configuring Icinga2

Now we need to integrate our bot into the Icinga2 notification process. Luckily there were many people before us doing this, so there are already some notification scripts and example configuration files on GitHub.

I choose the scripts found here: https://github.com/lazyfrosch/icinga2-telegram

As I use the distributed monitoring I store some configuration files beneath /etc/icinga2/zones.d/. If you don't use this, feel free to store those files somewhere else. However as I define the Token in /etc/icinga2/constants.conf which isn't synced via the config file sync, I have to make sure that the Notification configuration is also stored outside of /etc/icinga2/zones.d/. Else the distributed setup will fail as the config file sync throws an syntax error on all other machines due to the missing TelegramBotToken constant.

First we define the API-Token in the /etc/icinga2/constants.conf file:

user@host:/etc/icinga2$ grep -B1 TelegramBotToken constants.conf
/* Telegram Bot Token */
const TelegramBotToken = "YOUR-TOKEN-HERE"

Afterwards we download the host and service notification script into /etc/icinga2/scripts and set the executeable bit.

user@host:/etc/icinga2/scripts$ wget https://raw.githubusercontent.com/lazyfrosch/icinga2-telegram/master/telegram-host-notification.sh
user@host:/etc/icinga2/scripts$ wget https://raw.githubusercontent.com/lazyfrosch/icinga2-telegram/master/telegram-service-notification.sh
user@host:/etc/icinga2/scripts$ chmod +x telegram-host-notification.sh telegram-service-notification.sh

Based on the notifications we want to receive, we need to define the variable vars.telegram_chat_id in the appropriate user/group object(s). An example for the icingaadmin is shown below and can be found in the icinga2-example.conf on GitHub: https://github.com/lazyfrosch/icinga2-telegram/blob/master/icinga2-example.conf along with the notification commands which we are setting up after this.

user@host:~$ cat /etc/icinga2/zones.d/global-templates/users.conf
object User "icingaadmin" {
  import "generic-user"

  display_name = "Icinga 2 Admin"
  groups = [ "icingaadmins" ]

  email = "root@localhost"
  vars.telegram_chat_id = "YOUR-CHAT-ID-HERE"
}

Notifications for Host & Service

We need to define 2 new NotificationCommand objects which trigger the telegram-(host|service)-notification.sh scripts. These are stored in /etc/icinga2/conf.d/telegrambot-commands.conf.

Note: We store the NotificationCommands and Notifications in /etc/icinga2/conf.d and NOT in /etc/icinga2/zones.d/master. This is because I have only one master in my setup which sends out notifications and as we defined the constant TelegramBotToken in /etc/icinga2/constants.conf - which is not synced via the zone config sync. Therefore we would run into an syntax error on all Icinga2 agents.

See https://admin.brennt.net/icinga2-error-check-command-does-not-exist-because-of-missing-constant for details.

Of course we could also define the constant in a file under /etc/icinga2/zones.d/master but I choose not to do so for security reasons.

user@host:/etc/icinga2$ cat /etc/icinga2/conf.d/telegrambot-commands.conf
/*
 * Notification Commands for Telegram Bot
 */
object NotificationCommand "telegram-host-notification" {
  import "plugin-notification-command"

  command = [ SysconfDir + "/icinga2/scripts/telegram-host-notification.sh" ]

  env = {
    NOTIFICATIONTYPE = "$notification.type$"
    HOSTNAME = "$host.name$"
    HOSTALIAS = "$host.display_name$"
    HOSTADDRESS = "$address$"
    HOSTSTATE = "$host.state$"
    LONGDATETIME = "$icinga.long_date_time$"
    HOSTOUTPUT = "$host.output$"
    NOTIFICATIONAUTHORNAME = "$notification.author$"
    NOTIFICATIONCOMMENT = "$notification.comment$"
    HOSTDISPLAYNAME = "$host.display_name$"
    TELEGRAM_BOT_TOKEN = TelegramBotToken
    TELEGRAM_CHAT_ID = "$user.vars.telegram_chat_id$"

    // optional
    ICINGAWEB2_URL = "https://host.domain.tld/icingaweb2"
  }
}

object NotificationCommand "telegram-service-notification" {
  import "plugin-notification-command"

  command = [ SysconfDir + "/icinga2/scripts/telegram-service-notification.sh" ]

  env = {
    NOTIFICATIONTYPE = "$notification.type$"
    SERVICEDESC = "$service.name$"
    HOSTNAME = "$host.name$"
    HOSTALIAS = "$host.display_name$"
    HOSTADDRESS = "$address$"
    SERVICESTATE = "$service.state$"
    LONGDATETIME = "$icinga.long_date_time$"
    SERVICEOUTPUT = "$service.output$"
    NOTIFICATIONAUTHORNAME = "$notification.author$"
    NOTIFICATIONCOMMENT = "$notification.comment$"
    HOSTDISPLAYNAME = "$host.display_name$"
    SERVICEDISPLAYNAME = "$service.display_name$"
    TELEGRAM_BOT_TOKEN = TelegramBotToken
    TELEGRAM_CHAT_ID = "$user.vars.telegram_chat_id$"

    // optional
    ICINGAWEB2_URL = "https://host.domain.tld/icingaweb2"
  }
}

As I want to get all notifications for all hosts and services I simply apply the notification object for all hosts and services which have a set host.name. - Same as in the example.

user@host:/etc/icinga2$ cat /etc/icinga2/conf.d/telegrambot-notifications.conf
/*
 * Notifications for alerting via Telegram Bot
 */
apply Notification "telegram-icingaadmin" to Host {
  import "mail-host-notification"
  command = "telegram-host-notification"

  users = [ "icingaadmin" ]

  assign where host.name
}

apply Notification "telegram-icingaadmin" to Service {
  import "mail-service-notification"
  command = "telegram-service-notification"

  users = [ "icingaadmin" ]

  assign where host.name
}

Checking configuration

Now we check if our Icinga2 config has no errors and reload the service:

root@host:~ # icinga2 daemon -C
[2023-09-14 22:19:37 +0200] information/cli: Icinga application loader (version: r2.12.3-1)
[2023-09-14 22:19:37 +0200] information/cli: Loading configuration file(s).
[2023-09-14 22:19:37 +0200] information/ConfigItem: Committing config item(s).
[2023-09-14 22:19:37 +0200] information/ApiListener: My API identity: hostname.domain.tld
[2023-09-14 22:19:37 +0200] information/ConfigItem: Instantiated 1 NotificationComponent.
[...]
[2023-09-14 22:19:37 +0200] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2023-09-14 22:19:37 +0200] information/cli: Finished validating the configuration file(s).
root@host:~ # systemctl reload icinga2.service

Verify it works

Log into your Icingaweb2 frontend, click the Notification link for a host or service and trigger a custom notification.

And we have a message in Telegram:

What about CheckMK?

If you use CheckMK see this blogpost: https://www.srcbox.net/posts/monitoring-notifications-via-telegram/

Lessons learned

And if you use this setup you will encounter one situation/question rather quickly: Do I really need to be woken up at 3am for every notification?

No. No you don't want to. Not in a professional context and even less in a private context.

Therefore: In part 2 we will register a second bot account and then use these to differentiate between important and unimportant notifications. The unimportant ones will be muted in the Telegram client on our smartphone. The important ones won't, and therefore are able to wake us at 3am in the morning.

Comments