OpenSSH taking minutes to become available, booting takes half an hour ... because your server waits for a few bytes of randomness

Posted by Daniel Lange on Monday, 17. December 2018

So, your machine now needs minutes to boot before you can ssh in where it used to be seconds before the Debian Buster update?

Problem

Linux 3.17 (2014-10-05) learnt a new syscall getrandom() that, well, gets bytes from the entropy pool. Glibc learnt about this with 2.25 (2017-02-05) and two tries and four years after the kernel, OpenSSL used that functionality from release 1.1.1 (2018-09-11). OpenSSH implemented this natively for the 7.8 release (2018-08-24) as well.

Now the getrandom() syscall will block¹ if the kernel can't provide enough entropy. And that's frequently the case during boot. Esp. with VMs that have no input devices or IO jitter to seed the pseudo random number generator from.
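To see the blocking behaviour for yourself without actually hanging a shell, here is a minimal sketch in Python (an assumption on my side, any language with access to the syscall will do). It relies on os.getrandom() and os.GRND_NONBLOCK, which Python offers on Linux since 3.6. The function name crng_ready is made up for illustration.

```python
import os


def crng_ready() -> bool:
    """Return True if getrandom() would return immediately right now."""
    try:
        # GRND_NONBLOCK makes the syscall fail with EAGAIN instead of
        # blocking while the kernel's random pool is not yet initialized.
        os.getrandom(1, os.GRND_NONBLOCK)
        return True
    except BlockingIOError:
        # EAGAIN: a blocking caller (sshd, apache, ...) would hang here.
        return False


if __name__ == "__main__":
    print("crng initialized" if crng_ready() else "getrandom() would block")
```

Run early during boot (e.g. from a unit ordered before sshd), this should report the "would block" case on an affected system.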

First seen in the wild January 2017

I vividly remember not seeing my Alpine Linux VMs back on the net after the Alpine 3.5 upgrade. That was basically the same issue.

Systemd. Yeah.

Systemd makes this behaviour worse, see issues #4271, #4513 and #10621.
Basically as of now the entropy file saved as /var/lib/systemd/random-seed will not - drumroll - add entropy to the random pool when played back during boot. Actually it will. It will just not be accounted for. So Linux doesn't know. And continues blocking getrandom(). This is obviously different from SysVinit times² when /var/lib/urandom/random-seed (that you still have lying around on updated systems) made sure the system carried enough entropy over reboot to continue working right after enough of the system was booted.
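The difference between "adding" and "crediting" is a single ioctl. The following is a rough sketch of what crediting a saved seed looks like from userspace, not what systemd or the old init scripts do. Assumptions: the RNDADDENTROPY value below is the one from linux/random.h for x86 and ARM (other architectures encode ioctl numbers differently), the function names are hypothetical, the caller has CAP_SYS_ADMIN, and the seed is capped at 512 bytes to stay within the buffer size Python's fcntl.ioctl() accepts.

```python
import fcntl
import struct

# _IOW('R', 0x03, int[2]) from <linux/random.h>; value as seen on x86 and ARM.
RNDADDENTROPY = 0x40085203


def pack_rand_pool_info(seed: bytes, credit_bits: int) -> bytes:
    """Build a struct rand_pool_info: entropy_count (bits), buf_size (bytes), buf."""
    return struct.pack("ii", credit_bits, len(seed)) + seed


def credit_seed(seed_path: str, credit_bits: int) -> None:
    """Mix a saved seed into the kernel pool and credit it as entropy."""
    with open(seed_path, "rb") as f:
        seed = f.read(512)  # assumption: cap the seed at 512 bytes
    with open("/dev/urandom", "wb") as rng:
        # Plain write() to /dev/urandom only mixes the data in;
        # this ioctl additionally raises the kernel's entropy estimate.
        fcntl.ioctl(rng, RNDADDENTROPY, pack_rand_pool_info(seed, credit_bits))
```

Something like credit_seed("/var/lib/systemd/random-seed", 256) would be the call, the credit value being a pure judgment call. Only credit a seed you are sure is never replayed (think cloned VM images), otherwise you trade a boot delay for predictable keys.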

#4167 is a re-opened discussion about systemd eating randomness early at boot (hashmaps in PID 0...). Some Debian folks participate in the recent discussion and it is worth reading if you want to learn about the mess that booting a Linux system has become.

While we're talking systemd ... #10676 also means systems will use RDRAND in the future despite Ted Ts'o's warning on RDRAND [Archive.org mirror and mirrored locally as 130905_Ted_Tso_on_RDRAND.pdf, 205kB as Google+ will be discontinued in April 2019].
Update: RDRAND doesn't return random data on pre-Ryzen AMD CPUs (AMD CPU family <23) after a suspend / resume cycle, as per systemd bug #11810. It will then always return 0xFFFFFFFFFFFFFFFF (2⁶⁴-1). This is a known issue since 2014, see kernel bug #85991.

Debian

Debian is seeing the same issue working up towards the Buster release, e.g. Bug #912087.

The typical issue is:

[ 4.428797] EXT4-fs (vda1): mounted filesystem with ordered data mode. Opts: data=ordered
[ 130.970863] random: crng init done

with delays of up to tens of minutes on systems with very few external sources of randomness.

This is what it should look like:

[ 1.616819] random: fast init done
[ 2.299314] random: crng init done

Check dmesg | grep -E "(rng|random)" to see how your systems are doing.
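If you want to check a whole fleet rather than eyeball single machines, the timestamp of the "crng init done" line is easy to extract. A small sketch, assuming the kernel log format shown above; the function name is made up for illustration.

```python
import re

# Matches lines like "[  130.970863] random: crng init done"
CRNG_DONE = re.compile(r"\[\s*(\d+\.\d+)\]\s+random: crng init done")


def crng_init_seconds(dmesg_text: str):
    """Return the seconds after boot at which the crng was initialized, or None."""
    match = CRNG_DONE.search(dmesg_text)
    return float(match.group(1)) if match else None
```

Feed it the output of dmesg (reading the kernel log may need root on some systems) and alert on anything above a few seconds, or on None long after boot.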

If this is not fully solved before the Buster release, I hope some of the below can end up in the release notes³.

Solutions

You need to get entropy into the random pool earlier at boot. There are many ways to achieve this and - currently - all require action by the system administrator.

Kernel boot parameter

From kernel 4.19 (Debian Buster currently runs 4.18 [Update: but will be getting 4.19 before release according to Ben via Mika]) you can set CONFIG_RANDOM_TRUST_CPU at compile time or random.trust_cpu=on on the kernel command line. This will make recent Intel / AMD systems trust RDRAND and fill the entropy pool with it. See the warning from Ted Ts'o linked above.
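Whether this option can help on a given machine depends on the CPU advertising RDRAND and on what is set on the kernel command line. A minimal sketch to check both, assuming the x86 layout of /proc/cpuinfo (a "flags" line); the function names are hypothetical.

```python
def has_rdrand(cpuinfo_text: str) -> bool:
    """Return True if the first 'flags' line of /proc/cpuinfo lists rdrand."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return "rdrand" in line.partition(":")[2].split()
    return False


def trust_cpu_setting(cmdline_text: str):
    """Return the value of random.trust_cpu= from the kernel command line, or None."""
    for param in cmdline_text.split():
        if param.startswith("random.trust_cpu="):
            return param.split("=", 1)[1]
    return None
```

Call them with the contents of /proc/cpuinfo and /proc/cmdline respectively. A result of None from trust_cpu_setting only means the parameter is not given at boot; the compile-time default of the running kernel then applies.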

Update: Since Linux kernel build 4.19.20-1 CONFIG_RANDOM_TRUST_CPU has been enabled by default in Debian.

Using a TPM

The Trusted Platform Module has an embedded random number generator that can be used. Of course you need to have one on your board for this to be useful. It's a hardware device.

Load the tpm-rng module (ideally from initrd) or compile it into the kernel (config HW_RANDOM_TPM). Now, the kernel does not "trust" the TPM RNG by default, so you need to add

rng_core.default_quality=1000

to the kernel command line. 1000 means "trust", 0 means "don't use". So you can choose any value in between that works for you depending on how much you consider your TPM to be unbugged.

VirtIO (KVM, QEMU, ...)

For Virtual Machines (VMs) you can forward entropy from the host (that should be running longer than the VMs and have enough entropy) via virtio_rng.

So on the host, you do:

kvm ... -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0,bus=pci.0,addr=0x7

and within the VM newer kernels should automatically load virtio_rng and use that.

You can confirm with dmesg as per above.

Or check:

# cat /sys/devices/virtual/misc/hw_random/rng_available
virtio_rng.0
# cat /sys/devices/virtual/misc/hw_random/rng_current
virtio_rng.0
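The same check can be scripted, e.g. for monitoring. A minimal sketch that reads the two sysfs files shown above; the function name and the dictionary layout are made up, and the base directory is a parameter only so the function can be tried against a scratch directory.

```python
from pathlib import Path

HW_RANDOM = Path("/sys/devices/virtual/misc/hw_random")


def hwrng_status(base: Path = HW_RANDOM) -> dict:
    """Return the available and the currently selected hardware RNGs."""

    def read(name: str) -> str:
        try:
            return (base / name).read_text().strip()
        except OSError:
            # File missing or unreadable: treat as "no hardware RNG".
            return ""

    return {
        "available": read("rng_available").split(),
        "current": read("rng_current"),
    }
```

Inside a VM with working entropy forwarding, hwrng_status() should list virtio_rng.0 in both fields.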

Patching systemd

The Fedora bugtracker has a bash / python script that replaces the systemd rnd seeding with a (better) working one. The script can also serve as a good starting point if you need to script your own solution, e.g. for reading from an entropy provider available within your (secure) network.

Chaoskey

The wonderful Keith Packard and Bdale Garbee have developed a USB dongle, ChaosKey, that supplies entropy to the kernel. Hard- and software are open source.

Jitterentropy_RNG

Kernel 4.2 introduced jitterentropy_rng which will use the jitter in CPU timings to generate randomness.

modprobe jitterentropy_rng

This apparently needs a userspace daemon though (read: design mistake) so

apt install jitterentropy-rngd (available from Buster/testing).

The current version 1.0.8-3 installs nicely on Stretch. dpkg -i is your friend.

But - drumroll - that daemon doesn't seem to use the kernel module at all.

That's where I stopped looking at that solution. At least for now. There are extensive docs if you want to dig into this yourself.

Update: The Linux kernel 5.3 will have an updated jitterentropy_rng as per Commit 4d2fa8b44. This is based on the upstream version 2.1.2 and should be worth another look.

Haveged

apt install haveged

Haveged is a user-space daemon that gathers entropy through the timing jitter any CPU has. It will only run "late" in boot but may still get your openssh back online within seconds and not minutes.

It is also - to the best of my knowledge - not verified at all regarding the quality of randomness it generates. The haveged design and history page provides an interesting read and I wouldn't recommend haveged if you have alternatives. If you have none, haveged is a wonderful solution though as it works reliably. And unverified entropy is better than no entropy. Just forget this is ~~2018~~ 2019.

early-rng-init-tools

Thorsten Glaser has posted newly developed early-rng-init-tools in a debian-devel thread. He provides packages at http://fish.mirbsd.org/~tg/Debs/dists/sid/wtf/Pkgs/early-rng-init-tools/ .

First he deserves kudos for naming a tool for what it does. This makes it much more easily discoverable than the trend to name things after girlfriends, pets or anime characters. The implementation hooks into the early boot via initrd integration and carries over a seed generated during the previous shutdown. This and some other implementation details are not ideal and there has been quite extensive scrutiny but none that discovered serious issues. Early-rng-init-tools look like a good option for non-RDRAND (~CONFIG_RANDOM_TRUST_CPU) capable platforms.

Linus to the rescue

Luckily, at the end of September 2019, Linus Torvalds was fed up with the entropy starvation issue and the inconclusive discussions about (mostly) who's at fault and ... started coding.

With the kernel 5.4 release on 25.11.2019 his patch has made it into mainline. He created a try_to_generate_entropy function that uses CPU jitter to generate seed entropy for the PRNG early in boot.

In the merge commit Linus explains:

This is admittedly partly "for discussion". We need to have a way forward for the boot time deadlocks where user space ends up waiting for more entropy, but no entropy is forthcoming because the system is entirely idle just waiting for something to happen.

While this was triggered by what is arguably a user space bug with GDM/gnome-session asking for secure randomness during early boot, when they didn't even need any such truly secure thing, the issue ends up being that our "getrandom()" interface is prone to that kind of confusion, because people don't think very hard about whether they want to block for sufficient amounts of entropy.

The approach here-in is to decide to not just passively wait for entropy to happen, but to start actively collecting it if it is missing. This is not necessarily always possible, but if the architecture has a CPU cycle counter, there is a fair amount of noise in the exact timings of reasonably complex loads.

We may end up tweaking the load and the entropy estimates, but this should be at least a reasonable starting point.

So once this kernel is available in your distribution, you should be safe from entropy starvation at boot on any platform that has hardware timers (I haven't encountered one that does not in the last decade).
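To get a feeling for the idea of "noise in the exact timings", here is a toy illustration in Python. It is explicitly not the kernel's algorithm and not something to generate keys with: it merely hashes the differences between consecutive high-resolution timer readings. The function name and the number of rounds are arbitrary assumptions.

```python
import hashlib
import time


def jitter_sample(rounds: int = 4096) -> bytes:
    """Hash the deltas between consecutive timer readings (toy example only)."""
    digest = hashlib.sha256()
    previous = time.perf_counter_ns()
    for _ in range(rounds):
        now = time.perf_counter_ns()
        # The delta varies slightly from round to round with caches,
        # interrupts and scheduling; that variation is the "jitter".
        digest.update((now - previous).to_bytes(8, "little"))
        previous = now
    return digest.digest()
```

Two calls to jitter_sample() will practically never return the same 32 bytes, but how much real entropy sits in those bytes is exactly the question the kernel's entropy estimates have to answer.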

Ted Ts'o reviewed the approach and was fine with it, and Ahmed Darwish did some testing of the quality of randomness generated and that seems fine, too.

Updates

14.01.2019

Stefan Fritsch, the Apache2 maintainer in Debian, OpenBSD developer and a former Debian security team member, stumbled over the systemd issue preventing Apache's libssl from initializing at boot in Debian bug #916690 - apache2: getrandom call blocks on first startup, systemd kills with timeout.

The bug has been retitled "document getrandom changes causing entropy starvation" hinting at not fixing the underlying issue but documenting it in the Debian Buster release notes.

Unhappy with this "minimal compromise", Stefan wrote a comprehensive summary of the current situation to the debian-devel mailing list. The discussion spanned December 2018 and January 2019 and mostly reiterated what had been written above already. The discussion has - so far - not reached any consensus. There is still the "systemd stance" (not our problem, fix the daemons) and the "ssh/apache stance" (fix systemd, credit entropy).

The "document in release notes" minimal compromise was brought up again and Stefan warned of the problems this would create for Buster users:

> I'd prefer having this documented in the release notes:
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=916690
> with possible solutions like installing haveged, configuring virtio-rng,
> etc. depending on the situation.

That would be an extremely user-unfriendly "solution" and would lead to 
countless hours of debugging and useless bug reports.

This is exactly why I wrote this blog entry and keep it updated. We need to either fix this or tell everybody we can reach before upgrading to Buster. Otherwise this will lead to huge amounts of systems dead on the network after what looked like a successful upgrade.

Some interesting tidbits were mentioned within the thread:

Raphael Hertzog fixed the issue for Kali Linux by installing haveged by default. Michael Prokop did the same for the grml distribution within its December 2018 release.

Ben Hutchings pointed to an interesting thread on the debian-release mailing list he kicked off in May 2018. Multiple people summarized the options and the fact that there is no "general solution that is both correct and easy" at the time.

Sam Hartman identified Debian Buster VMs running under VMware as an issue, because that hypervisor does not provide virtio-rng. So Debian VMs wouldn't boot into ssh availability within a reasonable time. This is an issue for real world use cases albeit running a proprietary product as the hypervisor.

16.01.2019

Daniel Kahn Gillmor wrote in to explain a risk for VMs starting right after the boot of the host OS:

If that pool is used by the guest to generate long-term secrets because it appears to be well-initialized, that could be a serious problem. (e.g. "Mining your P's and Q's" by Heninger et al -- https://factorable.net/weakkeys12.extended.pdf) I've just opened https://bugs.launchpad.net/qemu/+bug/1811758 to report a way to improve that situation in qemu by default.

So ... make sure that your host OS has access to a hardware random number generator or at least carries over its random seed properly across reboots. You could also delay VM starts until the crng on the host Linux is fully initialized (random: crng init done).
Otherwise your VMs may get insufficiently generated pseudo-random numbers and won't even know.
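Delaying VM starts on the host can be as simple as polling getrandom() before launching the first guest. A minimal sketch, assuming Python 3.6+ on the host; the function name, the timeout and the poll interval are arbitrary choices for illustration.

```python
import os
import time


def wait_for_crng(timeout: float = 600.0, poll: float = 1.0) -> bool:
    """Poll until the host crng is initialized; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Succeeds only once the kernel has logged "crng init done".
            os.getrandom(1, os.GRND_NONBLOCK)
            return True
        except BlockingIOError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll)
```

A VM start script would call wait_for_crng() first and refuse to launch guests (or at least log loudly) when it returns False.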

12.03.2019

Stefan Fritsch revived the thread on debian-devel again and got a few more interesting tidbits out of the developer community:

Ben Hutchings has enabled CONFIG_RANDOM_TRUST_CPU for Debian kernels from 4.19.20-1 so the problem is somewhat contained for recent AMD64 CPUs (RDRAND capable) in Buster.

Thorsten Glaser developed early-rng-init-tools which combine a few options to try and get entropy carried across boot and generated early during boot. He received some scrutiny as can be expected but none that would discourage me from using it. He explains that this is for early boot and thus has initrd integration. It complements safer randomness sources or haveged.

16.04.2019

The Debian installer for Buster is running into the same problem now as indicated in the release notes for RC1. Bug #923675 has details. Essentially choose-mirror waits several minutes for entropy when used with https mirrors.

08.05.2019

The RDRAND use introduced in systemd to bypass the kernel random number generator during boot falls for an AMD pre-Ryzen bug, as RDRAND on these systems doesn't return random data after a suspend / resume cycle. Added an update note to the systemd section above.

03.06.2019

Bastian Blank reports the issue is now affecting Debian cloud images as well, because cloud-init generates ssh keys during boot.

10.07.2019

Added the update of jitterentropy_rng to a version based on upstream v2.1.2 into the Jitterentropy section above.

16.09.2019

The Linux Kernel Mailing List (LKML) is revisiting the entropy starvation issue and the unwillingness of systemd to fix its usage of randomness in early boot. Ahmed S. Darwish has reported the issue leading to ext4 reproducibly blocking boot with kernel 5.3-rc8. A few patches have been floated and the whole discussion is worth reading, albeit inconclusive as of now.

Ted Ts'o says "I really very strongly believe that the idea of making getrandom(2) non-blocking and to blindly assume that we can load up the buffer with 'best efforts' randomness to be a terrible, terrible idea that is going to cause major security problems that we will potentially regret very badly. Linus Torvalds believes I am an incompetent systems designer." in this email.

In case you needed a teaser to really start reading the thread! Linus Torvalds also mentions the issue (and a primer on what "never break userspace" means) in the Linux kernel 5.3 release notes.

18.09.2019

... and Martin Steigerwald kindly noticed that I update this blog post with the relevant discussions I come across as this entropy starvation mess continues to haunt us.

25.11.2019

Added the "Linus to the rescue" section after the Linux kernel 5.4 has been released.

02.04.2020

I ran into the same issue on a Gentoo system today. Luckily OpenRC handled this gracefully but it delayed booting: syslog-ng actually hangs the boot for some time ... waiting for entropy. Argh. The Gentoo forums thread on the topic clearly listed the options:

Make syslog-ng depend on haveged by adding rc_syslog_ng_need="haveged" to /etc/rc.conf (and obviously having haveged installed)
Re-compiling the kernel with CONFIG_RANDOM_TRUST_CPU=y where that is an option

¹ It will return with EAGAIN in the GRND_NONBLOCK use case. The blocking behaviour when lacking entropy is a security measure as per Bug #1559 of Google's Project Zero.
² Update 18.12.2018: "SysVinit times" ::= "The times when most Linux distros used SysVinit over other init systems." So Wheezy and previous for Debian. Some people objected to the statement, so I added this footnote as a clarification. See the discussion in the comments below.
³ There is no Buster branch in the release notes repository yet (17.12.2018). Update: I wrote a section for the release notes 06.05.2019 and Paul Gevers amended and committed that. So when users of affected systems read the release notes before upgrading to Buster they will hopefully not be surprised (and worried) by the long boot delays.

Comments

Corsac on Monday, 17. December 2018:

Actually the old initscript was doing exactly the same thing as the systemd unit file. It's just writing to /dev/urandom (https://sources.debian.org/src/sysvinit/2.93-1/debian/src/initscripts/etc/init.d/urandom/?hl=66#L56) so entropy wasn't credited either.

It would be nice to have a way for the users to tell the system that the random seed is to be trusted enough to provide entropy for a freshly booted system, but doing that by default is asking for trouble.

Daniel Lange on Monday, 17. December 2018:

Yes, it wasn't credited. But without getrandom() being used and blocking, nobody needed to care. There was randomness available and it could be read. (And in case there wasn't enough entropy, it would still be read. This is the case the now prevalent getrandom() use case mends. Which is fine in itself but comes with collateral damage¹.)

¹ we'll see how bad this gets with Buster released. And that - luckily - still depends on how much we can mend up to that release date.

Theo on Monday, 17. December 2018:

OpenBSD trusts /etc/random.seed.

Why would you trust /bin/sh or /sbin/init but not your saved random seed? Doesn't compute.

Philipp Kern on Tuesday, 18. December 2018:

Because the seed isn't stored in /etc but in /var and that's unavailable at the time.

Alexander Patrakov on Monday, 17. December 2018:

If you order a new physical server from Hetzner and ask for Ubuntu 18.10 (I haven't checked their Debian image), you'll get haveged installed by default.

Michael Biebl on Tuesday, 18. December 2018:

"This is obviously different from SysVinit times when /var/lib/urandom/random-seed (that you still have laying around on updated systems) made sure the system carried enough entropy over reboot to continue working right after enough of the system was booted."

Could you please fix that. This statement is simply false.

sysvinit (or rather /etc/init.d/urandom) behaves exactly as systemd-random-service.

Thank you

Daniel Lange on Tuesday, 18. December 2018:

No, it is not wrong.

I never said SysVinit does something substantially different from systemd (reg. carrying over the RNG seed). But that is the problem.

The kernel offers a new interface (getrandom()). Userspace is using that. Systemd itself needs entropy during boot. SysVinit didn't. So the systemd-based boot flow needs to change the way entropy is carried over from SysVinit times to a more modern approach. Because init is responsible to make sure systems can boot and daemons can start doing their jobs.

Actually, the funny thing here is, SysVinit itself doesn't know anything about RNGs, i.e. https://git.savannah.gnu.org/git/sysvinit.git doesn't have a single line of code dealing with RNGs.

It was (is) all handled in the shell scripts every distro made around the core init daemon.

To the contrary (because systemd is a suite of things and not just a lean "PID 1" as it's called these days):
https://github.com/systemd/systemd has /units/systemd-random-seed.service.in, /src/random-seed/random-seed.c and /src/basic/random-util.c.

Reviewing that code ... it is wrong on so many levels, I'm considering writing another blog article about this.

For example (as I run short on time and that article will be 2019 if at all):

Commit 68534345b "solves" the problem partially by relying (as I already said in the main blog article) on RDRAND. Regardless of the stance the system admin has on this function (random.trust_cpu=on/off). And it uses it ignoring Intel's specs (which is documented in the comments). And it should be using RDSEED anyways, cf. https://software.intel.com/en-us/blogs/2012/11/17/the-difference-between-rdrand-and-rdseed .

pseudo_random_bytes contains a complete PRNG despite the kernel providing one. And it is insufficiently initialized. Cf. https://marc.info/?t=141807240600001&r=1&w=2 .

Michael Biebl on Tuesday, 18. December 2018:

Got it. You simply want to bash systemd not to report accurately about the current situation. Thanks for making that clear

Daniel Lange on Tuesday, 18. December 2018:

No. As you see, the first issue I had with this topic personally was on Alpine Linux that runs busybox init.

This only gets systemd specific if we consider Debian (or Ubuntu). Fedora already works-around this via random.trust_cpu=on and making virtio-rng a kernel default (cf. https://fedoraproject.org/wiki/Common_F28_bugs#Boot_process_is_very_slow_or_appears_to_hang_with_kernel_4.16.4_onwards and https://bugzilla.redhat.com/show_bug.cgi?id=1572944).

Debian bug #912087 shows you play ping-pong with openssh about whose responsibility it is to fix this. As this is a general boot-time entropy problem, I consider this to be in systemd's ball park but you closed the bug as "wontfix" citing: "Sigh, and there is nothing that systemd can do to fix this, so I don't see a point re-assigning this to systemd (again)."

That is utterly wrong and the amount of crypto code already in systemd shows this beyond doubt. At least upstream systemd (which you may or may not package in time for Buster) will contain a partial (see above comments) fix using RDRAND on platforms that support this.

A fellow DD reported "I once saw the 'crng init done' only after 1h40" in the above bug report. This is seriously broken. Esp. for embedded platforms (he works on these) that do not usually have RDRAND.

So this is not bashing, this is warning developers, professional sysadmins (and at some time in the future users) about what may happen on Buster upgrades and - hopefully - getting you and other developers to fix it before users get affected.

I'll be monitoring this issue, update the mitigation options in the blog article and I will advise my community and customers accordingly and provide fixes. I'll also try to get this mentioned in the release notes if it is still a problem at Buster release.

I hope all of this will not be necessary because Debian Buster will just carry entropy properly (read: improved to handle the getrandom() use case) across reboots.

Philipp Kern on Tuesday, 18. December 2018:

I still hold that you make your argument unnecessarily weak by bashing systemd as part of this - by implying that it creates a mess when it makes problems more obvious (be it ordering or in this case missing entropy). You could similarly blame the kernel for offering an API that can't be used. It happens that systemd needs entropy but anything else in sysvinit land might too - say the web server on the embedded device you mention that needs to have random bytes for its portion of the TLS master key. systemd made this obvious very early during boot, but in the end the question is how to make the kernel behave properly so that entropy is available early. You state examples for that and that's helpful. If you make the argument that it's harder to do it early enough now with systemd initializing its hash tables that's fine too. Then you can have an argument about where this needs fixing. To some degree the systemd people can never do you right with how you present your argument: They need weak entropy, they use RDRAND to unblock the boot, it's still a bug in your view. Then let's fix the kernel, which is - as you state - to some degree already in progress.

Daniel Lange on Tuesday, 18. December 2018:

Breaking userspace should be avoided. Which is why the kernel made a separate API. That is working according to specification™ and has seen broad adoption. Which seems fine except for the depleted entropy boot case.

Now systemd is needed to fix userspace for the boot case. That's not ideal. But unfortunately what an init system needs to do. Esp. one that has never had problems breaking things for the sake of progress elsewhere. Which is fine as it is consistent with the will to innovate that drives systemd.

The criticism of the cryptography implemented is completely independent of that. If you do crypto, do it proper.
So in this case: RDSEED as per Intel's spec, shield against compiler optimization etc. Best is just not to do crypto though. Why does systemd have to contain another PRNG implementation? Esp. as the kernel has the whole lot, much better verified, already. If systemd were seeding that proper, none of the PRNG and auxiliary functions would be needed.

This is getting slightly hilarious. Google Project Zero finds a flaw in Linux PRNG seeding on early boot. So everything gets amended to be more honest about available entropy (blocking). Instead of now doing everything to provide more entropy, we make another PRNG (much weaker than the Linux one), seed it badly (deliberately, that is documented) and call it a day?

I agree, there are many options to improve the situation. The kernel can be amended, grub could learn about carrying a seed. The initrd could learn about where to find seeds. All these options seem less logical and more work to me compared to "read the seed file and push it into the kernel RNG rings with proper credit".

Corsac on Tuesday, 18. December 2018:

Daniel, this issue is quite complex indeed, but I don't think this blog post and the various comments do anything to make it simpler.

Systemd unit file and the sysvrc bash script do exactly the same thing and would have exactly the same results. Systemd might be in a better position to actually fix the mess because it has tighter control over the way machines boot nowadays. If anything, it seems like a convincing argument to use systemd rather than something else, distro-wide.

Where exactly is it indicated that someone would make another PRNG “much weaker than the Linux one”. All I've seen for now is about seeding the Linux PRNG.

Daniel Lange on Tuesday, 18. December 2018:

No idea whether this blog post makes things simpler. It makes the issue more prominently known. Which has a chance to spur activity towards a solution or - the less favourable case - at least inform developers, sysadmins and ultimately users about why their systems are hanging at boot in 6 or so months.

Full ack on your other statements. With systemd we are in a better position than without it. And we should probably {drop | discourage use | not support}¹ any other init for the Buster release (or we need to fix the entropy availability across boots for these too).

Check /src/basic/random-util.c with a cup of your favourite beverage and some time.

¹ I'm not on the release team and have no idea what the proper approach here is. But any init sees the same issue as per the busybox init from Alpine Linux mentioned in the article.

Alexander Patrakov on Tuesday, 18. December 2018:

I object to the statement that the set of random-related issues is the same between inits. Yes, the random number generator is initialized in exactly the same way. Yes, services take a long time to start both ways. The key difference is that systemd has a timeout for starting a service, and other inits don't. So with systemd, Apache will fail, and with sysvinit it will come up after some unreasonable amount of time.

Daniel Lange on Tuesday, 18. December 2018:

Good point. This is the -ETOOMUCHFUNCTIONALITY thing with systemd. While time-outs sound like a good idea, they are not always one.

Sunday I had systemd respawn Apaches at a rate of 1000/s because logrotate (hup'ing Apache) and backups freezing the VM filesystem for a split second had a very unlucky timing co-incidence. That wasn't systemd's fault at all but a stupid init would not have made the system load 30+ and eat up all RAM while trying to write coredumps via systemd-coredump (which are denied in prod anyways, ulimit -c 0).
We're currently setting up a test case with

/etc/systemd/coredump.conf:
[Coredump]
Storage=none
ProcessSizeMax=0

to see whether that changes behaviour over just setting ulimits for system users like in the 90s. And then we're checking whether systemd-coredump, systemd-coredump.socket ("Too many incoming connections (16), dropping connection.") or the process manager side is the place to fix. The (default) StartLimitBurst, StartLimitInterval did not bite at all so we're digging deep.

Oh, and for added beauty, systemd-journal dumps core too, when under such load:

[1265452.121615] systemd-coredump[19523]: MESSAGE=Process 412 (systemd-journal) of user 0 dumped core.
[1265452.121618] systemd-coredump[19523]: Coredump diverted to /var/lib/systemd/coredump/core.systemd-journal.0.e92a4b9b564d481c91ed5bb36a6c33cf.412.1544935100000000.lz4
[1265452.121620] systemd-coredump[19523]: Stack trace of thread 412:
[1265452.121622] systemd-coredump[19523]: #0  0x00007f15945e7036 journal_file_move_to_object (libsystemd-shared-238.so)
[1265452.121624] systemd-coredump[19523]: #1  0x00007f15945e8407 journal_file_find_data_object_with_hash (libsystemd-shared-238.so)
[1265452.121626] systemd-coredump[19523]: #2  0x00007f15945e85d9 journal_file_append_data (libsystemd-shared-238.so)
[1265452.121627] systemd-coredump[19523]: #3  0x00007f15945ea951 journal_file_append_entry (libsystemd-shared-238.so)
[1265452.121629] systemd-coredump[19523]: #4  0x000055ef2ef7b6c8 dispatch_message_real (systemd-journald)
[1265452.121631] systemd-coredump[19523]: #5  0x000055ef2ef775e5 server_read_dev_kmsg (systemd-journald)

(That is a Red Hat system so not a Debian issue.)

Corsac on Tuesday, 18. December 2018:

Thanks for the systemd clarification, could you put it in the post itself as well so it doesn't confuse users? Especially I don't think it's true that “systemd made it worse” (all the bugs you point are about entropy credit, which the shell script didn't do either).

I'm not sure it really makes the issue more prominently known. It sure adds some noise to the various existing bugs though.

About the systemd PRNG, the code seems ok to me at first sight: it will only use it internally, when getrandom() fails (because Linux PRNG is unseeded) and the request is for low entropy.

Theo on Monday, 16. September 2019:

You're wrong. Lennart deliberately wants to see the world burn here. From the patch series Daniel linked above: "Quite frankly, I don't think this is something to fix in the kernel. Let the people putting together systems deal with this. Let them provide a creditable hw rng, and let them pay the price if they don't. Lennart" https://lore.kernel.org/linux-ext4/20190915085907.GC29771@gardel-login/

Martin McMartypants on Friday, 21. December 2018:

You mean "lying around", not "laying around".

Daniel Lange on Friday, 21. December 2018:

Thank you!

I'm lying here upon the shore;
I lie here every day.
I've lain here many times before;
I lay here yesterday.

I'd lay my head upon the floor
If you'd lie down by me.
I've laid it there five times or more;
(I lied—it's only three).

(from https://forum.wordreference.com/threads/lie-lay-lying-laying.2432265/#post-12234174 by member Rover_KE)

Matt Domsch on Thursday, 18. July 2019:

We solved this in Fedora / RHEL back in 2009 for Dell systems using the TPM approach, which at the time had a userspace daemon to read the TPM. Now the kernel can read it directly. 10 years later I'm somewhat surprised it's coming up as a significant failure mode again.

https://bugzilla.redhat.com/show_bug.cgi?id=529767

Onlyjob on Wednesday, 29. January 2020:

Worth mentioning BitBabbler the hardware (USB) TRNG with excellent native Debian support through "bit-babbler" package.

IMHO BitBabbler devices are awesome - I highly recommend them.

Gerry Johnson on Wednesday, 21. October 2020:

For those who may be interested in this topic, I strongly recommend this open access paper. It deals with the subject of boot time entropy starvation and proposes a quite clever solution

https://ieeexplore.ieee.org/document/9050782
