Skip to content

Cleaning a broken GnuPG (gpg) key

IT

I've long said that the main tools in the Open Source security space, OpenSSL and GnuPG (gpg), are broken and only a complete re-write will solve this. And that is still pending as nobody came forward with the funding. It's not a sexy topic, so it has to get really bad before it'll get better.

Gpg has a UI that is close to useless. That won't substantially change with more bolted-on improvements.

Now Robert J. Hansen and Daniel Kahn Gillmor had somebody add ~50k signatures (read 1, 2, 3, 4 for the g{l}ory details) to their keys and - oops - they say that breaks gpg.

But does it?

I downloaded Robert J. Hansen's key off the SKS-Keyserver network. It's a nice 45MB file when de-ascii-armored (gpg --dearmor broken_key.asc ; mv broken_key.asc.gpg broken_key.gpg).

Now a friendly:

$ /usr/bin/time -v gpg --no-default-keyring --keyring ./broken_key.gpg --batch --quiet --edit-key 0x1DCBDC01B44427C7 clean save quit

pub  rsa3072/0x1DCBDC01B44427C7
     erzeugt: 2015-07-16  verfällt: niemals     Nutzung: SC  
     Vertrauen: unbekannt     Gültigkeit: unbekannt
sub  ed25519/0xA83CAE94D3DC3873
     erzeugt: 2017-04-05  verfällt: niemals     Nutzung: S  
sub  cv25519/0xAA24CC81B8AED08B
     erzeugt: 2017-04-05  verfällt: niemals     Nutzung: E  
sub  rsa3072/0xDC0F82625FA6AADE
     erzeugt: 2015-07-16  verfällt: niemals     Nutzung: E  
[ unbekannt ] (1). Robert J. Hansen <rjh@sixdemonbag.org>
[ unbekannt ] (2)  Robert J. Hansen <rob@enigmail.net>
[ unbekannt ] (3)  Robert J. Hansen <rob@hansen.engineering>

User-ID "Robert J. Hansen <rjh@sixdemonbag.org>": 49705 Signaturen entfernt
User-ID "Robert J. Hansen <rob@enigmail.net>": 49704 Signaturen entfernt
User-ID "Robert J. Hansen <rob@hansen.engineering>": 49701 Signaturen entfernt

pub  rsa3072/0x1DCBDC01B44427C7
     erzeugt: 2015-07-16  verfällt: niemals     Nutzung: SC  
     Vertrauen: unbekannt     Gültigkeit: unbekannt
sub  ed25519/0xA83CAE94D3DC3873
     erzeugt: 2017-04-05  verfällt: niemals     Nutzung: S  
sub  cv25519/0xAA24CC81B8AED08B
     erzeugt: 2017-04-05  verfällt: niemals     Nutzung: E  
sub  rsa3072/0xDC0F82625FA6AADE
     erzeugt: 2015-07-16  verfällt: niemals     Nutzung: E  
[ unbekannt ] (1). Robert J. Hansen <rjh@sixdemonbag.org>
[ unbekannt ] (2)  Robert J. Hansen <rob@enigmail.net>
[ unbekannt ] (3)  Robert J. Hansen <rob@hansen.engineering>

        Command being timed: "gpg --no-default-keyring --keyring ./broken_key.gpg --batch --quiet --edit-key 0x1DCBDC01B44427C7 clean save quit"
        User time (seconds): 3911.14
        System time (seconds): 2442.87
        Percent of CPU this job got: 99%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 1:45:56
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 107660
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 1
        Minor (reclaiming a frame) page faults: 26630
        Voluntary context switches: 43
        Involuntary context switches: 59439
        Swaps: 0
        File system inputs: 112
        File system outputs: 48
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
 

And the result is a nicely useable 3835 byte file of the clean public key. If you supply a keyring instead of --no-default-keyring it will also keep the non-self signatures that are useful for you (as you apparently know the signing party).

So it does not break gpg. It does break things that call gpg at runtime and not asynchronously. I heard Enigmail is affected, quelle surprise.

Now the main problem here is the runtime. 1h45min is just ridiculous. As Filippo Valsorda puts it:

Someone added a few thousand entries to a list that lets anyone append to it. GnuPG, software supposed to defeat state actors, suddenly takes minutes to process entries. How big is that list you ask? 17 MiB. Not GiB, 17 MiB. Like a large picture. https://dev.gnupg.org/T4592

If I were a gpg / SKS keyserver developer, I'd

  • speed this up so the edit-key run above completes in less than 10 s (just getting rid of the lseek/read dance and deferring all time-based decisions should get close)
  • (ideally) make the drop-sig import-filter syntax useful (date-ranges, non-reciprocal signatures, ...)
  • clean affected keys on the SKS keyservers (needs coordination of sysops, drop servers from unreachable people)
  • (ideally) use the opportunity to clean all keyserver filesystem and the message board over pgp key servers keys, too
  • only accept new keys and new signatures on keys extending the strong set (rather small change to the existing codebase)

That way another key can only be added to the keyserver network if it contains at least one signature from a previously known strong-set key. Attacking the keyserver network would become at least non-trivial. And the web-of-trust thing may make sense again.

Updates

09.07.2019

GnuPG 2.2.17 has been released with another set of quickly bolted together fixes:

  * gpg: Ignore all key-signatures received from keyservers.  This
    change is required to mitigate a DoS due to keys flooded with
    faked key-signatures.  The old behaviour can be achieved by adding
    keyserver-options no-self-sigs-only,no-import-clean
    to your gpg.conf.  [#4607]
  * gpg: If an imported keyblocks is too large to be stored in the
    keybox (pubring.kbx) do not error out but fallback to an import
    using the options "self-sigs-only,import-clean".  [#4591]
  * gpg: New command --locate-external-key which can be used to
    refresh keys from the Web Key Directory or via other methods
    configured with --auto-key-locate.
  * gpg: New import option "self-sigs-only".
  * gpg: In --auto-key-retrieve prefer WKD over keyservers.  [#4595]
  * dirmngr: Support the "openpgpkey" subdomain feature from
    draft-koch-openpgp-webkey-service-07. [#4590].
  * dirmngr: Add an exception for the "openpgpkey" subdomain to the
    CSRF protection.  [#4603]
  * dirmngr: Fix endless loop due to http errors 503 and 504.  [#4600]
  * dirmngr: Fix TLS bug during redirection of HKP requests.  [#4566]
  * gpgconf: Fix a race condition when killing components.  [#4577]

Bug T4607 shows that these changes are all but well thought-out. They introduce artificial limits, like 64kB for WKD-distributed keys or 5MB for local signature imports (Bug T4591) which weaken the web-of-trust further.

I recommend to not run gpg 2.2.17 in production environments without extensive testing as these limits and the unverified network traffic may bite you. Do validate your upgrade with valid and broken keys that have segments (packet groups) surpassing the above mentioned limits. You may be surprised what gpg does. On the upside: you can now refresh keys (sans signatures) via WKD. So if your buddies still believe in limiting their subkey validities, you can more easily update them bypassing the SKS keyserver network. NB: I have not tested that functionality. So test before deploying.

10.08.2019

Christopher Wellons (skeeto) has released his pgp-poisoner tool. It is a go program that can add thousands of malicious signatures to a GNUpg key per second. He comments "[pgp-poisoner is] proof that such attacks are very easy to pull off. It doesn't take a nation-state actor to break the PGP ecosystem, just one person and couple evenings studying RFC 4880. This system is not robust." He also hints at the next likely attack vector, public subkeys can be bound to a primary key of choice.

Wiping harddisks in 2019

Linux

Wiping hard disks is part of my company's policy when returning servers. No exceptions.

Good providers will wipe what they have received back from a customer, but we don't trust that as the hosting / cloud business is under constant budget-pressure and cutting corners (wipefs) is a likely consequence.

With modern SSDs there is "security erase" (man hdparm or see the - as always well maintained - Arch wiki) which is useful if the device is encrypt-by-default. These devices basically "forget" the encryption key but it also means trusting the devices' implementation security. Which doesn't seem warranted. Still after wiping and trimming, a secure erase can't be a bad idea :-).

Still there are three things to be aware of when wiping modern hard disks:

  1. Don't forget to add bs=4096 (blocksize) to dd as it will still default to 512 bytes and that makes writing even zeros less than half the maximum possible speed. SSDs may benefit from larger block sizes matched to their flash page structure. These are usually 128kB, 256kB, 512kB, 1MB, 2MB and 4MB these days.1
  2. All disks can usually be written to in parallel. screen is your friend.
  3. The write speed varies greatly by disk region, so use 2 hours per TB and wipe pass as a conservative estimate. This is better than extrapolating what you see initially in the fastest region of a spinning disk.
  4. The disks have become huge (we run 12TB disks in production now) but the write speed is still somewhere 100 MB/s ... 300 MB/s. So wiping servers on the last day before returning is not possible anymore with disks larger than 4 TB each (and three passes). Or 12 TB and one pass (where e.g. fully encrypted content allows to just do a final zero-wipe).

hard disk size one pass three passes
1 TB2 h6 h
2 TB4 h12 h
3 TB6 h18 h
4 TB8 h24 h (one day)
5 TB10 h30 h
6 TB12 h36 h
8 TB16 h48 h (two days)
10 TB20 h60 h
12 TB24 h72 h (three days)
14 TB28 h84 h
16 TB32 h96 h (four days)
18 TB36 h108 h
20 TB40 h120 h (five days)

Hard disk wipe animation


  1. As Douglas pointed out correctly in the comment below, these are IT Kilobytes and Megabytes, so 210 Bytes and 220 Bytes. So Kibibytes and Mebibytes for those firmly in SI territory. 

Apple Time Machine backups on Debian 9 (Stretch)

Debian
Warning: Superseded by v3.1.13. Do not use the below packages anymore. Update from 28.04.2022: Do not use the packages below any more. There is Netatalk 3.1.13 out with fixes for multiple remote code execution (RCE) bugs. Use packages from recent Debian again, they have been updated.

Netatalk 3.1.12 has been released which fixes an 18 year old RCE bug. The Medium write up on CVE-2018-1160 by Jacob Baines is quite an entertaining read.

The full release notes for 3.1.12 are unfortunately not even half as interesting.

Warning: Read the original blog post before installing for the first time. Be sure to read the original blog post if you are new to Netatalk3 on Debian Jessie or Stretch!
You'll get nowhere if you install the .debs below and don't know about the upgrade path from 2.2.x which is still in the Debian archive. So RTFA.

For Debian Buster (Debian 10) we'll have Samba 4.9 which has learnt (from Samba 4.8.0 onwards) how to emulate a SMB time machine share. I'll make a write up how to install this once Buster stabilizes. This luckily means there will be no need to continue supporting Netatalk in normal production environments. So I guess bug #690227 won't see a proper fix anymore. Waiting out problems helps at times, too :/.

Update instructions and downloads:

Continue reading "Apple Time Machine backups on Debian 9 (Stretch)"

Xfce 4.12 not suspending on laptop-lid close

Linux

Xfce 4.12 as default in Ubuntu/Xubuntu 18.04 LTS did not suspend a laptop after closing the lid. In fact running xfce4-power-manager --quit ; xfce4-power-manager --no-daemon --debug showed that xfce4 wasn't seeing a laptop lid close event at all.

To the contrary acpi_listen nicely finds button/lid LID close and button/lid LID open events when folding the screen and opening it up again.

As so often the wonderful docs / community of Arch Linux to the rescue. This forum thread from 2015 received the correct answer in 2017:

Xfce4 basically recognizes systemd and thus disables its built-in power-management options for handling these "button events" (but doesn't tell you so in the config UI for power-manager). Systemd is configured to handle these events by default (/etc/systemd/logind.conf has HandleLidSwitch=suspend but for unknown reasons decides not to honor that).

So best is to teach Xfce4 to handle the events again as in pre-systemd times:

xfconf-query -c xfce4-power-manager -p /xfce4-power-manager/logind-handle-lid-switch -s false

Now the UI options will work again as intended and the laptop suspends on lid close and resumes on lid open.

Update:

07.01.19: Changed XFCE -> Xfce as per Corsac's suggestion in the comments below. Thank you!

Background info:

The name "XFCE" was originally an acronym for "XForms Common Environment", but since that time it has been rewritten twice and no longer uses the XForms toolkit. The name survived, but it is no longer capitalized as "XFCE", but rather as "Xfce". The developers' current stance is that the initialism no longer stands for anything specific. After noting this, the FAQ on the Xfce Wiki comments "(suggestion: X Freakin' Cool Environment)".

(quoted from Wikipedia's Xfce article also found in the Xfce docs FAQ).

Openssh taking minutes to become available, booting takes half an hour ... because your server waits for a few bytes of randomness

Linux

So, your machine now needs minutes to boot before you can ssh in where it used to be seconds before the Debian Buster update?

Problem

Linux 3.17 (2014-10-05) learnt a new syscall getrandom() that, well, gets bytes from the entropy pool. Glibc learnt about this with 2.25 (2017-02-05) and two tries and four years after the kernel, OpenSSL used that functionality from release 1.1.1 (2018-09-11). OpenSSH implemented this natively for the 7.8 release (2018-08-24) as well.

Now the getrandom() syscall will block1 if the kernel can't provide enough entropy. And that's frequenty the case during boot. Esp. with VMs that have no input devices or IO jitter to source the pseudo random number generator from.

First seen in the wild January 2017

I vividly remember not seeing my Alpine Linux VMs back on the net after the Alpine 3.5 upgrade. That was basically the same issue.

Systemd. Yeah.

Systemd makes this behaviour worse, see issues #4271, #4513 and #10621.
Basically as of now the entropy file saved as /var/lib/systemd/random-seed will not - drumroll - add entropy to the random pool when played back during boot. Actually it will. It will just not be accounted for. So Linux doesn't know. And continues blocking getrandom(). This is obviously different from SysVinit times2 when /var/lib/urandom/random-seed (that you still have lying around on updated systems) made sure the system carried enough entropy over reboot to continue working right after enough of the system was booted.

#4167 is a re-opened discussion about systemd eating randomness early at boot (hashmaps in PID 0...). Some Debian folks participate in the recent discussion and it is worth reading if you want to learn about the mess that booting a Linux system has become.

While we're talking systemd ... #10676 also means systems will use RDRAND in the future despite Ted Ts'o's warning on RDRAND [Archive.org mirror and mirrored locally as 130905_Ted_Tso_on_RDRAND.pdf, 205kB as Google+ will be discontinued in April 2019].
Update: RDRAND doesn't return random data on pre-Ryzen AMD CPUs (AMD CPU family <23) as per systemd bug #11810. It will always be 0xFFFFFFFFFFFFFFFF (264-1). This is a known issue since 2014, see kernel bug #85991.

Debian

Debian is seeing the same issue working up towards the Buster release, e.g. Bug #912087.

The typical issue is:

[    4.428797] EXT4-fs (vda1): mounted filesystem with ordered data mode. Opts: data=ordered
[ 130.970863] random: crng init done

with delays up to tens of minutes on systems with very little external random sources.

This is what it should look like:

[    1.616819] random: fast init done
[    2.299314] random: crng init done

Check dmesg | grep -E "(rng|random)" to see how your systems are doing.

If this is not fully solved before the Buster release, I hope some of the below can end up in the release notes3.

Solutions

You need to get entropy into the random pool earlier at boot. There are many ways to achieve this and - currently - all require action by the system administrator.

Kernel boot parameter

From kernel 4.19 (Debian Buster currently runs 4.18 [Update: but will be getting 4.19 before release according to Ben via Mika]) you can set RANDOM_TRUST_CPU at compile time or random.trust_cpu=on on the kernel command line. This will make recent Intel / AMD systems trust RDRAND and fill the entropy pool with it. See the warning from Ted Ts'o linked above.

Update: Since Linux kernel build 4.19.20-1 CONFIG_RANDOM_TRUST_CPU has been enabled by default in Debian.

Using a TPM

The Trusted Platform Module has an embedded random number generator that can be used. Of course you need to have one on your board for this to be useful. It's a hardware device.

Load the tpm-rng module (ideally from initrd) or compile it into the kernel (config HW_RANDOM_TPM). Now, the kernel does not "trust" the TPM RNG by default, so you need to add

rng_core.default_quality=1000

to the kernel command line. 1000 means "trust", 0 means "don't use". So you can chose any value in between that works for you depending on how much you consider your TPM to be unbugged.

VirtIO (KVM, QEMU, ...)

For Virtual Machines (VMs) you can forward entropy from the host (that should be running longer than the VMs and have enough entropy) via virtio_rng.

So on the host, you do:

kvm ... -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0,bus=pci.0,addr=0x7

and within the VM newer kernels should automatically load virtio_rng and use that.

You can confirm with dmesg as per above.

Or check:

# cat /sys/devices/virtual/misc/hw_random/rng_available
virtio_rng.0
# cat /sys/devices/virtual/misc/hw_random/rng_current
virtio_rng.0

Patching systemd

The Fedora bugtracker has a bash / python script that replaces the systemd rnd seeding with a (better) working one. The script can also serve as a good starting point if you need to script your own solution, e.g. for reading from an entropy provider available within your (secure) network.

Chaoskey

The wonderful Keith Packard and Bdale Garbee have developed a USB dongle, ChaosKey, that supplies entropy to the kernel. Hard- and software are open source.

Jitterentropy_RNG

Kernel 4.2 introduced jitterentropy_rng which will use the jitter in CPU timings to generate randomness.

modprobe jitterentropy_rng

This apparently needs a userspace daemon though (read: design mistake) so

apt install jitterentropy-rngd (available from Buster/testing).

The current version 1.0.8-3 installs nicely on Stretch. dpkg -i is your friend.

But - drumroll - that daemon doesn't seem to use the kernel module at all.

That's where I stopped looking at that solution. At least for now. There are extensive docs if you want to dig into this yourself.

Update: The Linux kernel 5.3 will have an updated jitterentropy_rng as per Commit 4d2fa8b44. This is based on the upstream version 2.1.2 and should be worth another look.

Haveged

apt install haveged

Haveged is a user-space daemon that gathers entropy though the timing jitter any CPU has. It will only run "late" in boot but may still get your openssh back online within seconds and not minutes.

It is also - to the best of my knowledge - not verified at all regarding the quality of randomness it generates. The haveged design and history page provides and interesting read and I wouldn't recommend haveged if you have alternatives. If you have none, haveged is a wonderful solution though as it works reliably. And unverified entropy is better than no entropy. Just forget this is 2018 2019 :-).

early-rng-init-tools

Thorsten Glaser has posted newly developed early-rng-init-tools in a debian-devel thread. He provides packages at http://fish.mirbsd.org/~tg/Debs/dists/sid/wtf/Pkgs/early-rng-init-tools/ .

First he deserves kudos for naming a tool for what it does. This makes it much more easily discoverable than the trend to name things after girlfriends, pets or anime characters. The implementation hooks into the early boot via initrd integration and carries over a seed generated during the previous shutdown. This and some other implementation details are not ideal and there has been quite extensive scrutiny but none that discovered serious issues. Early-rng-init-tools look like a good option for non-RDRAND (~CONFIG_RANDOM_TRUST_CPU) capable platforms.

Linus to the rescue

Luckily end of September Linus Torvalds was fed up with the entropy starvation issue and the non-conclusive discussions about (mostly) who's at fault and ... started coding.

With the kernel 5.4 release on 25.11.2019 his patch has made it into mainline. He created a try_to_generate_entropy function that uses CPU jitter to generate seed entropy for the PRNG early in boot.

In the merge commit Linus explains:

This is admittedly partly "for discussion". We need to have a way forward for the boot time deadlocks where user space ends up waiting for more entropy, but no entropy is forthcoming because the system is entirely idle just waiting for something to happen.

While this was triggered by what is arguably a user space bug with GDM/gnome-session asking for secure randomness during early boot, when they didn't even need any such truly secure thing, the issue ends up being that our "getrandom()" interface is prone to that kind of confusion, because people don't think very hard about whether they want to block for sufficient amounts of entropy.

The approach here-in is to decide to not just passively wait for entropy to happen, but to start actively collecting it if it is missing. This is not necessarily always possible, but if the architecture has a CPU cycle counter, there is a fair amount of noise in the exact timings of reasonably complex loads.

We may end up tweaking the load and the entropy estimates, but this should be at least a reasonable starting point.

So once this kernel is available in your distribution, you should be safe from entropy starvation at boot on any platform that has hardware timers (I haven't encountered one that does not in the last decade).

Ted Ts'o reviewed the approach and was fine and Ahmed Dawish did some testing of the quality of randomness generated and that seems fine, too.

Updates

14.01.2019

Stefan Fritsch, the Apache2 maintainer in Debian, OpenBSD developer and a former Debian security team member stumbled over the systemd issue preventing Apache libssl to initialize at boot in a Debian bug #916690 - apache2: getrandom call blocks on first startup, systemd kills with timeout.

The bug has been retitled "document getrandom changes causing entropy starvation" hinting at not fixing the underlying issue but documenting it in the Debian Buster release notes.

Unhappy with this "minimal compromise" Stefan wrote a comprehensive summary of the current situation to the Debian-devel mailing list. The discussion spans over December 2018 and January 2019 and mostly iterated what had been written above already. The discussion has - so far - not reached any consensus. There is still the "systemd stance" (not our problem, fix the daemons) and the "ssh/apache stance" (fix systemd, credit entropy).

The "document in release notes" minimal compromise was brought up again and Stefan warned of the problems this would create for Buster users:

> I'd prefer having this documented in the release notes:
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=916690
> with possible solutions like installing haveged, configuring virtio-rng,
> etc. depending on the situation.

That would be an extremely user-unfriendly "solution" and would lead to 
countless hours of debugging and useless bug reports.

This is exactly why I wrote this blog entry and keep it updated. We need to either fix this or tell everybody we can reach before upgrading to Buster. Otherwise this will lead to huge amounts of systems dead on the network after what looked like a successful upgrade.

Some interesting tidbits were mentioned within the thread:

Raphael Hertzog fixed the issue for Kali Linux by installing haveged by default. Michael Prokop did the same for the grml distribution within its December 2018 release.

Ben Hutchings pointed to an interesting thread on the debian-release mailing list he kicked off in May 2018. Multiple people summarized the options and the fact that there is no "general solution that is both correct and easy" at the time.

Sam Hartman identified Debian Buster VMs running under VMware as an issue, because that supervisor does not provide virtio-rng. So Debian VMs wouldn't boot into ssh availability within a reasonable time. This is an issue for real world use cases albeit running a proprietary product as the supervisor.

16.01.2019

Daniel Kahn Gillmor wrote in to explain a risk for VMs starting right after the boot of the host OS:

If that pool is used by the guest to generate long-term secrets because it appears to be well-initialized, that could be a serious problem.
(e.g. "Mining your P's and Q's" by Heninger et al -- https://factorable.net/weakkeys12.extended.pdf)
I've just opened https://bugs.launchpad.net/qemu/+bug/1811758 to report a way to improve that situation in qemu by default.

So ... make sure that your host OS has access to a hardware random number generator or at least carries over its random seed properly across reboots. You could also delay VM starts until the crng on the host Linux is fully initialized (random: crng init done).
Otherwise your VMs may get insufficiently generated pseudo-random numbers and won't even know.

12.03.2019

Stefan Fritsch revived the thread on debian-devel again and got a few more interesting tidbits out of the developer community:

Ben Hutchings has enabled CONFIG_RANDOM_TRUST_CPU for Debian kernels from 4.19.20-1 so the problem is somewhat contained for recent CPU AMD64 systems (RDRAND capable) in Buster.

Thorsten Glaser developed early-rng-init-tools which combine a few options to try and get entropy carried across boot and generated early during boot. He received some scrutiny as can be expected but none that would discourage me from using it. He explains that this is for early boot and thus has initrd integration. It complements safer randomness sources or haveged.

16.04.2019

The Debian installer for Buster is running into the same problem now as indicated in the release notes for RC1. Bug #923675 has details. Essentially choose-mirror waits serveral minutes for entropy when used with https mirrors.

08.05.2019

The RDRAND use introduced in systemd to bypass the kernel random number generator during boot falls for a AMD pre-Ryzen bug as RDRAND on these systems doesn't return random data after a suspend / resume cycle. Added an update note to the systemd section above.

03.06.2019

Bastian Blank reports the issue is affecting Debian cloud images now as well as cloud-init generates ssh keys during boot.

10.07.2019

Added the update of jitterentropy_rng to a version based on upstream v2.1.2 into the Jitterentropy section above.

16.09.2019

The Linux Kernel Mailing List (LKML) is re-iterating the entropy starvation issue and the un-willingness of systemd to fix its usage of randomness in early boot. Ahmed S. Darwish has reported the issue leading to ext4 reproducibly blocking boot with Kernel 5.3-r8. There are a few patches floated and the whole discussion it worth reading albeit non-conclusive as of now.

Ted Ts'o says "I really very strongly believe that the idea of making getrandom(2) non-blocking and to blindly assume that we can load up the buffer with 'best efforts' randomness to be a terrible, terrible idea that is going to cause major security problems that we will potentially regret very badly. Linus Torvalds believes I am an incompetent systems designer." in this email.

In case you needed a teaser to really start reading the thread! Linus Torvalds also mentions the issue (and a primer on what "never break userspace" means) in the Linux kernel 5.3 release notes.

18.09.2019

... and Martin Steigerwald kindly noticed that I update this blog post with the relevant discussions I come across as this entropy starvation mess continues to haunt us.

25.11.2019

Added the "Linus to the rescue" section after the Linux kernel 5.4 has been released.

02.04.2020

I ran into the same issue on a Gentoo system today. Luckily OpenRC handeled this gracefully but it delayed booting: syslog-ng actually hangs the boot for some time ... waiting for entropy. Argh. The Gentoo forums thread on the topic clearly listed the options:

  1. Make syslog-ng depend on haveged by adding rc_syslog_ng_need="haveged" to /etc/rc.conf (and obviously having haveged installed)
  2. Re-compiling the kernel with CONFIG_RANDOM_TRUST_CPU=y where that is an option

  1. it will return with EAGAIN in the GRND_NONBLOCK use case. The blocking behaviour when lacking entropy is a security measure as per Bug #1559 of Google's Project Zero

  2. Update 18.12.2018: "SysVinit times" ::= "The times when most Linux distros used SysVinit over other init systems." So Wheezy and previous for Debian. Some people objected to the statement, so I added this footnote as a clarification. See the discussion in the comments below. 

  3. there is no Buster branch in the release notes repository yet (17.12.2018). Update: I wrote a section for the release notes 06.05.2019 and Paul Gevers amended and committed that. So when users of affected systems read the release notes before upgrading to Buster they will hopefully not be surprised (and worried) by the long boot delays. 

Firefox asking to be made the default browser again and again

Linux

Firefox on Linux can develop the habit to (rather randomly) ask again and again to be made the default browser. E.g. when started from Thunderbird by clicking a link it asks but when starting from a shell all is fine.

The reason to this is often two (or more) .desktop entries competing with each other.

So, walkthrough: (GOTO 10 in case you are sure to have all the basics right)

update-alternatives --display x-www-browser
update-alternatives --display gnome-www-browser

should both show firefox for you. If not

update-alternatives --config <entry>

the entry to fix the preference on /usr/bin/firefox.

Check (where available)

exo-preferred-applications

that the "Internet Browser" is "Firefox".

Check (where available)

xfce4-mime-settings

that anything containing "html" points to Firefox (or is left at a non-user set default).

Check (where available)

xdg-settings get default-web-browser

that you get firefox.desktop. If not run

xdg-settings check default-web-browser firefox.desktop

If you are running Gnome, check

xdg-settings get default-url-scheme-handler http

and the same for https.

LABEL 10:

Run

sensible-editor ~/.config/mimeapps.list

and remove all entries that contain something like userapp-Firefox-<random>.desktop.

Run

find ~/.local/share/applications -iname "userapp-firefox*.desktop"

and delete these files or move them away.

Done.

Once you have it working again consider disabling the option for Firefox to check whether it is the default browser. Because it will otherwise create those pesky userapp-Firefox-<random>.desktop files again.

Configuring Linux is easy, innit?

Unbalanced volume (channels) on headset audio

Linux

I use a headset to make phone calls and when they are mono the great awesomeness of the Linux audio stack seems to change volume only on the active channel (e.g. the right channel). So when I listen to some music (stereo) afterwards the channels are not balanced anymore and one side is louder than the other. And this persists thanks to saving the preferences across reboots. Duh.

As usually checking Pulseaudio (pavucontrol) is useless, it shows balanced channels.

But checking Alsa (alsamixer) revealed the issue and alsamixer can fix this, too:

Step 1: run alsamixer in a terminal and select your headset after pressing [F6]:

Alsamixer: Select sound card

Step 2: Select the headset audio output with [<-] and [->] cursor keys:

Alsamixer: Unbalanced channels on the headset (left / right channel loudness are different)

Step 3: Press [b] to balance the left and right channels:

Alsamixer: Balanced channels (left / right channel loudness) again

Step 4: Press [Esc] to exit alsamixer which will keep the changed settings (... great choice of key, [q] raises the left channel's loundness ...).

Step 5: Save this setting by running sudo alsactl store which should update /var/lib/alsa/asound.state with the fixed settings so they persist across reboots.

Step 6: Enjoy music again :-).

If you need to script this, amixer is the tool to use, e.g. amixer -c 1 set "Headset" 36.
1 is the card number which you see in alsamixer, "Headset" is the channel name, also from alsamixer (which can contain blanks, hence the quotes around the name) and 36 is the desired loundness level for both channels. See the screenshots above where to find the data or run aplay -l to see the cards on your PC and amixer -c 1 (with your card id) to see the channels that (virtual, USB) sound card has.

Debian Gitlab (salsa.debian.org) tricks

Debian

Debian is moving the git hosting from alioth.debian.org, an instance of Fusionforge, to salsa.debian.org which is a Gitlab instance.

There is some background reading available on https://wiki.debian.org/Salsa/. This also has pointers to an import script to ease migration for people that move repositories. It's definitely worth hanging out in #alioth on oftc, too, to learn more about salsa / gitlab in case you have a persistent irc connection.

As of now() salsa has 15,320 projects, 2,655 users in 298 groups.
Alioth has 29,590 git repositories (which is roughly equivalent to a project in Gitlab), 30,498 users in 1,154 projects (which is roughly equivalent a group in Gitlab).

So we currently have 50% of the git repositories migrated. One month after leaving beta. This is very impressive.
As Alioth has naturally accumulated some cruft, Alexander Wirt (formorer) estimates that 80% of the repositories in use have already been migrated.

So it's time to update your local .git/config URLs!

Mehdi Dogguy has written nice scripts to ease handling salsa / gitlab via the (extensive and very well documented) API. Among them is list_projects that gets you nice overview of the projects in a specific group. This is especially true for the "Debian" group that contains the former collab-maint repositories, so source code that can and shall be maintained by Debian Developers collectively.

Finding migrated repositories

Salsa can search quite quickly via the Web UI: https://salsa.debian.org/search?utf8=✓&search=htop

Salsa search screenshot

but finding the URL to clone the repository from is more clicks and ~4MB of data each time (yeah, the modern web), so

$ curl --silent https://salsa.debian.org/api/v4/projects?search="htop" | jq .
[
  {
    "id": 9546,
    "description": "interactive processes viewer",
    "name": "htop",
    "name_with_namespace": "Debian / htop",
    "path": "htop",
    "path_with_namespace": "debian/htop",
    "created_at": "2018-02-05T12:44:35.017Z",
    "default_branch": "master",
    "tag_list": [],
    "ssh_url_to_repo": "git@salsa.debian.org:debian/htop.git",
    "http_url_to_repo": "https://salsa.debian.org/debian/htop.git",
    "web_url": "https://salsa.debian.org/debian/htop",
    "avatar_url": null,
    "star_count": 0,
    "forks_count": 0,
    "last_activity_at": "2018-02-17T18:23:05.550Z"
  }
]

is a bit nicer.

Please notice the git url format is a bit odd, it's either
git@salsa.debian.org:debian/htop.git or
ssh://git@salsa.debian.org/debian/htop.git.

Notice the ":" -> "/" after the hostname. Bit me once.

Finding repositories to update

At this time I found it useful to check which of the repositories I have cloned had not yet been updated in the local .git/config:

find ~/debconf ~/my_sources ~/shared -ipath '*.git/config' -exec grep -H 'url.*git\.debian' '{}' \;

Thanks to Jörg Jaspert (Ganneff) the Debconf repositories have all been moved to Salsa now.
Hint: Bug him for his scripts if you need to do complex moves.

Updating the URLs has been an hours work on my side and there is little you can do to speed that up if - as in the Debconf case - teams have used the opportunity to clean up and things are not as easy as using sed -i.

But there is no reason to do this more than once, so for the laptops...

Speeding up migration on multiple devices

rsync -armuvz --existing --include="*/" --include=".git/config" --exclude="*" ~/debconf/ laptop:debconf/

will rsync the .git/config files that you changed to other systems where you keep partial copies.

On these a simple git pull to get up to remote HEAD or using the git_pull_all one-liner from https://daniel-lange.com/archives/99-Managing-a-project-consisting-of-multiple-git-repositories.html will suffice.

Git short URL

Stefano Rivera (tumbleweed) shared this clever trick:

git config --global url."ssh://git@salsa.debian.org/".insteadOf salsa:

This way you can git clone salsa:debian/htop.

IMAPFilter 2.6.11-1 backport for Debian Jessie AMD64 available

Debian

One of the perks you get as a Debian Developer is a @debian.org email address. And because Debian is old and the Internet used to be a friendly place this email address is plastered all over the Internet. So you get email spam, a lot of spam.

I'm using a combination of server and client site filtering to keep spam at bay. Unfortunately the IMAPFilter version in Debian Jessie doesn't even support "dry run" (-n) which is not so cool when developing complex filter rules. So I backported the latest (sid) version and agreed with Sylvestre Ledru, one of its maintainers, to share it here and see whether making an official backport is worth it. It's a straight recompile so no magic and no source code or packaging changes required.

Get it while its hot:

imapfilter_2.6.11-1~bpo8+1_amd64.deb (IMAPFilter Jessie backport)
SHA1: bedb9c39e576a58acaf41395e667c84a1b400776

Clever LUA snippets for ~/.imapfilter/config.lua appreciated.

tail -S (truncating lines to terminal width)

Open Source

The tail command has a quite glaring omission in that it can't truncate lines. Thus it wraps long log line into multiple terminal lines regardless. Which makes them very hard to read.

I used to work around this using less -S and then hitting the [F] key but that's interactive. less +F <filename> is the little known work-around for the interactive issue but that still doesn't work well with pipes (tail -f logfile | grep "ERROR:" etc).

There is a bug report from 2004 against GNU coreutils but that went nowhere.

So we're not getting a tail -S anytime soon.

Bash to the rescue: tail -S → tails

Hence I wrote this little script, tails [1kB]:

  1. #!/bin/bash -i
  2. # v2 from 170712: introduce loop to work around GNU coreutils issues with pipe/fifo/isatty
  3.  
  4. if [[ -z "$COLUMNS" ]] ; then
  5.         MYCOL=$(tput cols)
  6. else
  7.         MYCOL=${COLUMNS}
  8. fi
  9. tail "$@" | while read line; do
  10.         echo "$line" | expand | cut -c1-${MYCOL:-80}
  11. done

Now, there are some interesting bits even in this tiny script:

The bash -i results in $COLUMNS being set within the script on sane Linux bash. Otherwise that variable wouldn't be available. Because it is a shell variable and not an environment variable. You knew that, right?

Unfortunately the bash -i doesn't get $COLUMNS set on either MacOS (X) or FreeBSD, so that's where the tput cols comes into play. It outputs the column width for the current terminal to stdout.

If all that fails tails will default to 80 columns.

So tails -f /var/log/apache/access.log will now look nice.

Corner case: color

If you use color codes somewhere (grep --color=always, dmesg --color=always) tails will just truncate lines too short so they will still not wrap.
There is a slight risk that it may cut into half a color code escape sequence and mess up the terminal a bit. You could change that by removing the -i from the shebang line and setting $COLUMNS explicitly. But that then needs manual adjustment for each combination of colored lines (=count of ANSI sequences) and terminal width. Better to color after the tails invocation then where possible, e.g. tails -f /var/log/httpd/error.log | grep ':error' to watch for PHP errors and the like.

Mended corner cases: inconsistent tail behaviour

A first version of the script didn't use a loop but just had

tail "$@" | expand | cut -c1-${MYCOL:-80}

This would break tails -f on Debian (coreutils 8.23) / Ubuntu (coreutils 8.26) while removing the |expand would make it work. On Fedora 25 (coreutils 8.25) I couldn't get tails -f to work at all with that v1. The cut (so just a single command chained) already broke the pipe :-(. And nope, stdbuf didn't help.

If you have a more simple solution to work around the isatty / isfifo mess, please leave a comment!

Alternatives

If you want to show multiple log tails in parallel, highlight strings etc. multitail is worth a look.

Depending on what you want to achieve you could also tell your terminal emulator to not wrap lines:

setterm -linewrap off; less -SR +F /var/log/apache/access.log; setterm -linewrap on

Prevent Ubuntu from phoning home

Linux

Ubuntu unfortunately has decided again to implement another "phone home" feature, this time transferring your lsb_release information, CPU model and speed (from /proc/cpuinfo), uptime output, most of uname -a and curl version to a Ubuntu news web-service.

Here is the Launchpad bug report #1637800 introducing this ... web bug.

This thing runs both systemd-timer based (via /lib/systemd/system/motd-news.service and /lib/systemd/system/motd-news.timer) and on request when you log in (via /etc/update-motd.d/50-motd-news).

Ubuntu news on ssh login

There has even been a bug filed about the motd advertising HBO's Silicon Valley show.

To prevent this from running (it is enabled by default on Ubuntu 17.04 and may probably propagate down to earlier versions as well), edit /etc/default/motd-news to include

ENABLED=0

so

sed -i "s/ENABLED=1/ENABLED=0/" /etc/default/motd-news # run as root

for your automated installs.

Update:

02.07.2017: Dustin Kirkland responded to a YC "hacker news" mention of his motd spam. He mentions:

You're welcome to propose your own messages for merging, if you have a well formatted, informative message for Ubuntu users.
We'll be happy to review and include them in the future.

What could possibly go wrong?

Thunderbird startup hang (hint: Add-Ons)

Debian

If you see Thunderbird hanging during startup for a minute and then continuing to load fine, you are probably running into an issue similar to what I saw when Debian migrated Icedove back to the "official" Mozilla Thunderbird branding and changed ~/.icedove to ~/.thunderbird in the process (one symlinked to the other).

Looking at the console log (=start Thunderbird from a terminal so you see its messages), I got:

console.log: foxclocks.bootstrap._loadIntoWindow(): got xul-overlay-merged - waiting for overlay-loaded
[.. one minute delay ..]
console.log: foxclocks.bootstrap._windowListener(): got window load chrome://global/content/commonDialog.xul

Stracing confirms it hangs because Thunderbird loops waiting for a FUTEX until that apparently gets kicked by a XUL core timeout.
(Thanks for defensive programming folks!)

So in my case uninstalling the Add-On Foxclocks easily solved the problem.

I assume other Thunderbird Add-Ons may cause the same issue, hence the more generic description above.

Updating the Dell XPS 13 9360 Thunderbolt firmware to get VGA and HDMI working

IT

Last year I bought the wonderful Dell XPS 13 9360 as it is certified to work with Ubuntu Linux and is just all around an awesome device. Dell made me buy the Windows version as only that got a 1 TB NVMe-SSD option. Linux apparently is only worthy of the 512GB and below models. What product manager comes up with such a stupid idea? Are SKUs that precious? Anyways ... so I bought a Windows version and that got wiped with a Linux install immediately as that was and is its intended purpose.

Dell DA200 USB-C to HDMI/VGA/Ethernet/USB 3.0 adapter

I purchased a DA200 with the system which is Dell's USB-C to anything (HDMI/VGA/Ethernet/USB 3.0) dongle. When I got the laptop the Ethernet port and USB 3.0 via the DA200 were working right out of the box. The VGA and HDMI ports were detected by Ubuntu but there was no way to get connected screens working. They stayed black.

The device was shipped with Thunderbolt firmware NVM18 and we've been told rather quickly by Dell this would be fixed with an update. And lo and behold Dell published the firmware version NVM21 right for Christmas 2016. Now unfortunately while their BIOS updates are Windows / DOS executables that can be just shoved at the Dell UEFI flash updater and thus the main BIOS can be updated from any OS, including Linux, without any hassle, the Intel provided Thunderbolt update needs Windows to get installed. Or, well, there is a convoluted way to compile an out-of-tree Linux kernel module, download and compile a few sets of software and do it via Linux. That description read so lengthy, I didn't even try it. Additionally there seems to have been no progress at all in getting this more mainline in the last three months, so I chose the cheap route and installed Windows 10 on a USB thumb drive1.

This is done via the (unfortunately Windows only) Win2USB software (the free version is sufficient).

Update: There's a new bash script windows2usb that looks good and should work to get you a bootable Windows USB thumb drive in Linux. WinUSB (that stopped working in the Win10 area some time) has also been forked and updated into WoeUSB. And there is WinToUSBLinux, yet another shell script. Give them a try.

Once Windows has rebooted often enough to finish its own installation, you can work with the USB thumb drive install as with any Windows 10. Nice.

Dell TPM 1.2 to 2.0 firmware update

Put all the files you downloaded from Dell to update your XPS 13 into a directory on the USB thumb drive. That way Windows does not need to have any network connectivity.

I first updated the TPM 1.2 firmware to a TPM 2.0 version (DellTpm2.0_Fw1.3.2.8_V1_64.exe at the time of writing this blog entry). Now this is quite hilarious as the Windows installer doesn't do anything but putting a UEFI firmware update into the EFI partition that runs on reboot. Duh. You do need to manually clear the TPM in the BIOS' security settings section (there's a clear checkbox) to be able to program new firmware onto it.
Thunderbolt firmware upgrade progress bar Thunderbolt firmware upgrade successful Now back in Windows install the Thunderbolt drivers (Chipset_Driver_J95RR_WN32_16.2.55.275_A01.exe at the time of writing this) and then run Intel_TBT3_FW_UPDATE_NVM21_0THFT_A00_3.21.00.008.exe, which is the NVM21 Thunderbolt firmware update (or a later version).

Reboot again (into Linux if you want to) and (drumroll) the VGA and HDMI ports are working. Awesome.

An update log can be found on the USB thumb drive at Dell\UpdatePackage\Log\Intel_TBT3_FW_UPDATE_NVM21_0THFT_A00_3.log:

*** Dell Thunderbolt firmware update started on 4/6/2017 at 12:56:56***
Command: C:\Install\Intel_TBT3_FW_UPDATE_NVM21_0THFT_A00_3.21.00.008.exe 

Starting FW Update....
***TBT GPIO Power is Turning On:  No Dock or DockInfo.
***TBT GPIO power is turned on.

Thunderbolt Firmware Update SUCCEEDED
TBT Items Registry creation is Success at \SOFTWARE\Dell\ManageableUpdatePackage\Thunderbolt Controller:
User selected OK for reboot
System TBT NVM Current Version:BCD:00000018: New Version:BCD:00000021

Exit Code = 0 (Success) 
***Thunderbolt Firmware flash finished at 4/6/2017 at 13:00:23***

If Windows has added its boot loader entry into your UEFI options, you can easily remove that again with the Dell UEFI BIOS or efibootmgr from within Linux.

The whole process took me less than 30 minutes. And most of that was creating the Windows USB thumb drive. I'll keep that for future updates until Intel and Dell have sorted out the Thunderbolt update process in Linux.

Updates:

18.05.17: Intel has published a large patchset on LKML to enable Thunderbolt security levels (thus preventing DMA attacks) and get NVM firmware upgrades mainlined. Yeah!

02.05.18: Added a link to the windows2usb bash script that should remove the need to create a bootable Windows USB thumb drive with a Windows only software.

11.06.18: Added a link to WoeUSB which is currently packaged for Ubuntu in a PPA.

08.04.20: Added a link to WinToUSBLinux. A recently released shell script to create a bootable Windows USB stick from Linux.


  1. If you go the Linux route please post a minimal image somewhere (kernel, initrd, squashfs or FAT16/32 raw image) and put a link into a comment below this blog post. Thanks. 

Saving misc/jive

BSD

One thing I love about FreeBSD is the way the core team keeps the wider community updated about project news e.g. via their quarterly status reports. So while reading the FreeBSD Q4/2016 status report, I was quite surprised to find that a text filter converting English to "Jive speak" had been removed from the ports tree. FreeBSD Core members argue that "today the implicit approval implied by having it in the ports tree sends a message at odds with the project's aims."

Now this is bullshit as I'm sure FreeBSD core neither endorses Citrix (net/citrix_ica) nor Cisco (emulators/gna3, devel/libcli, graphics/py27-blockdiagcontrib-cisco and many more) but just hosts code to make living with them easier.

So the important thing here is:

Important: Switch on brain and try to memorize. Hosting is not endorsing.
It is a purely technical act and by definition agnostic to the hosted content.

In every sane jurisdiction there is the requirement to remove hosted content that violates a law. And that makes sense. It reflects the societal consensus what is still acceptable and what is not. This changes over time but there is a proven process in place for these changes to become relevant: political discussion and consequential law making.

There is very deliberately never a law against bad taste and/or offensive humor. Where such a law still exists, you're in a somewhat underdeveloped jurisdiction. Because the hosting (pun intended) society has not matured sufficiently yet. This may happen due to overly conservative or self-protective ruling classes, ideological or religious blindness. None of these are desirable for society as a whole and the scissors in your head are paving the way to go back to darker ages. So don't. Be welcoming, be tolerant.

Tolerance means accepting things you do not like. Not accepting just what endorses your personal taste, beliefs or state of mind.

Does that mean, FreeBSD should continue to host the "Jive" filter? No, it's purely their choice. But their argument that hosting is endorsing is wrong. Inclusion into a FreeBSD media may be, like Debian strictly differentiates between the main archive, which it endorses, and contrib or non-free sections which it does not endorse. But still hosts regardless. So hosting is not endorsing.

That said, here you go:

File Function sha256
jive-1.1.tar.gz Source to the "Jive" filter 3463d80ad159a27d9fcf87f163a7be5eba39dbf15c5156f052798b81271523f2
ports_misc_jive.tar.gz ports files to build the "Jive" filter under FreeBSD 47dc7b660d499d671daa18f992cdd348bd95c34e02874addd2bcf3e5c3f90b59
swedishchef.zip mirror of swedishchef.zip d0830b81aec6ad6a6ff824e1d80c9fa97d3a5447bad9f8a2b32dbd0dfb8df709

The last file above is a mirror of files hosted by John B. Chambers. He has a "chef" cgi running there allowing the conversion of English text to "Swedish Chef", "Valley Girl" or "Pig Latin". And the "Jive" variant that uses the same Lex/Yacc/Flex files as the misc/jive that used to be part of the FreeBSD ports tree and is conserved above.

If you are interested in the public part of the discussion that happened after misc/jive was marked for removal from the ports tree, check out the freebsd-ports mailing list thread.

P.S.: Valspeak is still in the ports tree as misc/valspeak ... just sayin'.

P.P.S.: apt-cache show filters # Debian & Ubuntu. Awesome. ♡

Mozilla Firefox and Thunderbird Menu font sizes

Open Source

The font size Mozilla chose for Firefox and Thunderbird menus looks awfully large on Netbook screens. It wastes space and is visually at odds with reasonably sized content. And for some weird reason you can set the content font and size via the menu but not the font and size for the drop-down menus themselves.

As the "Theme Font & Size Changer" Add-On doesn't work reliably and phones home way too often (showing a nag screen), I dug back into how to do this "manually". Probably a decade after I fixed this the first time...

You need to create the file ~/.mozilla/firefox/*/chrome/userChrome.css with * being your profile directory (<random_number>.default usually) and you most probably have to create the chrome directory first.

The same for Thunderbird resides in ~/.thunderbird/*/chrome/userChrome.css. Here again the chrome directory will most probably need to be created first.


/* Global UI font */
* { font-size: 10pt !important;
  font-family: Ubuntu !important;
}
 

needs to go into these files for Firefox or Thunderbird respectively. The curly braces are important. So copy & paste correctly. Symlinks or hardlinks are fine if those files do not need to differ between your web browser and your email client.

Restart Firefox and/or Thunderbird to see the effect.

Obviously you can choose any other font and font size in the snippet above to suit your taste and requirements.

If you are massively space-confined and don't mind a quite ugly UI, check out the Littlefox Add-on. Ugly but optimal use of the minimal screen estate with very small screens.