What's a CPU to do when it has nothing to do?
It would be reasonable to expect doing nothing to be an easy, simple task for a kernel, but it isn't. At Kernel Recipes 2018, Rafael Wysocki discussed what CPUs do when they don't have anything to do, how the kernel handles this, problems inherent in the current strategy, and how his recent rework of the kernel's idle loop has improved power consumption on systems that aren't doing anything.
The idle loop, one of the kernel subsystems that Wysocki maintains, controls what a CPU does when it has no processes to run. Precise to a fault, Wysocki defined his terms: for the purposes of this discussion, a CPU is an entity that can take instructions from memory and execute them at the same time as any other entities in the same system are doing likewise. On a simple, single-core single-processor system, that core is the CPU. If the processor has multiple cores, each of those cores is a CPU. If each of those cores exposes multiple interfaces for simultaneous instruction execution, which Intel calls "hyperthreading", then each of those threads is a CPU.
A CPU is idle if there are no tasks for it to run. Or, again more precisely, the Linux kernel has a number of internal scheduling classes, including the special idle class. If there are no tasks to run on a given CPU in any of those classes save the idle class, the CPU is regarded as idle. If the hardware doesn't make allowance for this, then the CPU will have to run useless instructions until it is needed for real work. However, this is a wildly inefficient use of electricity, so most CPUs support a number of lower-power states into which the kernel can put them until they are needed to do useful work.
Idle states are not free to enter or exit. Entry and exit both require some time, and moreover power consumption briefly rises slightly above normal for the current state on entry to idle and above normal for the destination state on exit from idle. Although increasingly deep idle states consume decreasing amounts of power, they have increasingly large costs to enter and exit. This implies that for short idle periods, a fairly shallow idle state is the best use of system resources; for longer idle periods, the costs of a deeper idle state will be justified by the increased power savings while idle. It is therefore in the kernel's best interests to predict how long a CPU will be idle before deciding how deeply to idle it. This is the job of the idle loop.
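The trade-off described above can be sketched in a few lines of Python. This is only an illustration, not the kernel's code: the `IdleState` type, the three states, and their numbers are all made up for the example. The one idea it borrows is that each state has a break-even idle time (called `target_residency_us` here) below which entering it costs more than it saves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdleState:
    name: str
    exit_latency_us: int      # time needed to leave the state (illustrative)
    target_residency_us: int  # idle time needed before the state pays off


# Hypothetical table, ordered from shallow to deep; the numbers are invented.
STATES = [
    IdleState("shallow", 2, 2),
    IdleState("medium", 50, 200),
    IdleState("deep", 200, 1000),
]


def pick_state(predicted_idle_us, states=STATES):
    """Return the deepest state whose break-even time fits the prediction."""
    choice = states[0]  # fall back to the shallowest state
    for state in states:
        if state.target_residency_us <= predicted_idle_us:
            choice = state
    return choice


print(pick_state(100).name)   # shallow
print(pick_state(5000).name)  # deep
```

A short predicted idle only ever justifies the shallow state; the deep state is chosen only when the prediction exceeds its break-even time.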
In this loop, the CPU scheduler notices that a CPU is idle because it has no work for the CPU to do. The scheduler then calls the governor, which does its best to predict the appropriate idle state to enter. There are currently two governors in the kernel, called "menu" and "ladder". They are used in different cases, but they both try to do roughly the same thing: keep track of system state when a CPU idles and how long it ended up idling for. This is done in order to predict how long a freshly-idle CPU is likely to remain so, and thus what idle state is most appropriate for it.
This job is made particularly difficult by the CPU scheduler's clock tick. This is a timer that is run by the CPU scheduler for the purpose of time-sharing the CPU: if you are going to run multiple jobs on a single CPU, each job can only be run for a while, then periodically put aside in favor of another job. This tick doesn't need to run on a CPU that is idle, since there are no jobs between which the CPU should be shared. Moreover, if the tick is allowed to run on an otherwise-idle CPU, it will prevent the governor from selecting deep idle states by limiting the time for which the CPU is likely to remain idle. So in kernels 4.16 and older, the scheduler disables the tick before calling the governor. When the CPU is woken by an interrupt, the scheduler makes a decision about whether there's work to do and, if so, reactivates the tick.
If the governor predicts a long idle, and the idle period turns out to be long, the governor "wins": the CPU will enter a deep idle state and power will be saved. But if the governor predicts long idle and the period turns out to be short, the governor "loses" because the costs of entering a deep idle state are not repaid by power savings over the short idle period. Worse, if the governor predicts a short idle period, it loses regardless of the actual idle duration: if the actual duration is long, potential power savings have been missed out on, and if it's short, the costs of stopping and restarting the tick have been paid needlessly. Or to put it another way, because stopping and starting the tick have a cost, there is no point in stopping the tick if the governor is going to predict a short idle.
Wysocki considered trying to redesign the governor to work around this, but concluded that the essential problem is that the tick is stopped before the governor is invoked, thus before the recommended idle state is known. He therefore reworked the idle loop for kernel 4.17 so that the decision about stopping the tick is taken after the governor has made its recommendation of the idle state. If the recommendation is for a long idle, the tick is stopped so as not to wake the CPU prematurely. If the recommendation is for a short idle, the tick is left on to avoid paying the cost of turning it off. That means the tick is also a safety net that will wake the CPU in the event that the idle turns out to be longer than predicted and give the governor another chance to get it right.
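The reordered decision can be sketched roughly as follows. This is a simplified model rather than the actual kernel logic: the function name `plan_idle`, the 250 Hz tick period, and the rule "long means at least one tick period" are assumptions made for the example.

```python
TICK_PERIOD_US = 4000  # assumed: a 250 Hz scheduler tick


def plan_idle(predicted_idle_us, tick_period_us=TICK_PERIOD_US):
    """Decide whether to stop the tick, given the governor's prediction.

    Returns (stop_tick, max_sleep_us). max_sleep_us is None when nothing
    bounds the sleep other than a real interrupt.
    """
    # Assumed threshold: only stop the tick if the predicted idle period
    # is at least one tick long; otherwise stopping it is wasted effort.
    stop_tick = predicted_idle_us >= tick_period_us
    # With the tick left running, the CPU wakes no later than the next
    # tick, which gives the governor another chance to predict.
    max_sleep_us = None if stop_tick else tick_period_us
    return stop_tick, max_sleep_us


print(plan_idle(500))    # (False, 4000)
print(plan_idle(20000))  # (True, None)
```

The point of the sketch is the ordering: the prediction is an input to the tick decision, rather than the tick being stopped before any prediction exists.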
When the idled CPU is woken by an interrupt, whether from the tick that was left running or by some other event, the scheduler immediately makes a decision about whether there's work to do. If there is, the tick is restarted if need be; but if there is not, the governor is immediately re-invoked. Since that means the governor can now be invoked both when the tick is running and when it is stopped, the governor had to be reworked to take this into account.
Re-examining the win/loss table from earlier, Wysocki expects things to be improved by this rework. If long idle is predicted, the tick is still stopped, so nothing changes; we win if the actual idle is long, and lose if it's short. But if short idle is predicted, we're better off: if the actual idle is short, we've saved the cost of stopping and restarting the tick, and if the actual idle is long, the unstopped timer will wake us up and give us another bite at the prediction cherry.
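The two win/loss tables can be written out as a small Python function. The labels "old" and "new" and the outcome name "retry" are chosen for this example; the outcomes themselves follow the description in the text.

```python
def outcome(scheme, predicted, actual):
    """Classify one idle period as 'win', 'loss' or 'retry'.

    scheme is 'old' (tick stopped before the governor runs) or 'new'
    (tick stopped only after a long-idle prediction).
    """
    if predicted == "long":
        # Same in both schemes: the tick is stopped and a deep state entered.
        return "win" if actual == "long" else "loss"
    if scheme == "old":
        # Short prediction with the tick already stopped: either savings are
        # missed (long idle) or the tick cost was paid for nothing (short).
        return "loss"
    # New scheme, short prediction: the tick keeps running, so a short idle
    # costs nothing extra and a long idle is interrupted for another try.
    return "win" if actual == "short" else "retry"


for scheme in ("old", "new"):
    for predicted in ("long", "short"):
        for actual in ("long", "short"):
            print(scheme, predicted, actual, outcome(scheme, predicted, actual))
```

Only the short-prediction rows differ between the two schemes, which is where the rework is expected to help.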
Since game theory is no substitute for real-world data, Wysocki tested this on a number of systems. The graph above is characteristic of all the systems tested and shows power consumption against time on a system that is idle. The green line is with the old idle loop, the red is with the new: power consumption is less under the new scheme, and moreover it is much more predictable than before. Not all CPUs tested showed as large a gap between the green and red lines, but all showed a flat red line beneath a bumpy green one. As Wysocki put it, this new scheme predicts short idles less often than the old scheme did, but it is right about them being short more often.
In response to a question from the audience, Wysocki said that the work is architecture-independent. Intel CPUs will benefit from it particularly, because they have a comparatively large array of idle states from which the governor may select, giving the governor the best chance of doing well if it predicts correctly; but ARM CPUs, for example, will also benefit.
A 20% drop in idle power consumption may seem small as victories go, but it's not. Any system that wants to be able to cope reasonably well with peak loads will need spare capacity in normal operation, which will manifest as idle time. The graph above shows CPU usage on my mail/talk/file-transfer/VPN/NTP/etc. server over the past year; the bright yellow is idle time. Saving 20% of that power will please my co-location provider very much indeed, and it's good for the planet, too.
[We would like to thank LWN's travel sponsor, The Linux Foundation, for assistance with travel funding for Kernel Recipes.]
Index entries for this article | |
---|---|
GuestArticles | Yates, Tom |
Conference | Kernel Recipes/2018 |
What's a CPU to do when it has nothing to do?
Posted Oct 5, 2018 19:40 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]
Mine Bitcoin!
/me runs and hides
What's a CPU to do when it has nothing to do?
Posted Oct 5, 2018 19:49 UTC (Fri) by atai (subscriber, #10977) [Link]
What's a CPU to do when it has nothing to do?
Posted Oct 5, 2018 20:04 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]
Why, you can even allow attaching eBPF programs to the idle loop to customize it.
What's a CPU to do when it has nothing to do?
Posted Oct 6, 2018 10:26 UTC (Sat) by smurf (subscriber, #17840) [Link]
At least, that's the theory. In practice it is no longer possible to mine bitcoins with the CPU.
What's a CPU to do when it has nothing to do?
Posted Oct 6, 2018 15:37 UTC (Sat) by zlynx (guest, #2285) [Link]
The cost of power runs up quickly.
In the Pentium III days I used to run SETI@home and Folding@home, but I have not done so in the last 10+ years.
What's a CPU to do when it has nothing to do?
Posted Oct 11, 2018 16:24 UTC (Thu) by nilsmeyer (guest, #122604) [Link]
What's a CPU to do when it has nothing to do?
Posted Oct 5, 2018 20:14 UTC (Fri) by halla (subscriber, #14185) [Link]
Nothing doing, of course.
What's a CPU to do when it has nothing to do?
Posted Oct 5, 2018 20:40 UTC (Fri) by josh (subscriber, #17465) [Link]
(Bonus if you're willing to name companies asking. And if anyone has samples of what such code looks like, to put into scanners and similar.)
What's a CPU to do when it has nothing to do?
Posted Oct 5, 2018 21:39 UTC (Fri) by simcop2387 (subscriber, #101710) [Link]
What's a CPU to do when it has nothing to do?
Posted Oct 5, 2018 22:23 UTC (Fri) by halla (subscriber, #14185) [Link]
What's a CPU to do when it has nothing to do?
Posted Oct 6, 2018 8:41 UTC (Sat) by pkern (subscriber, #32883) [Link]
What's a CPU to do when it has nothing to do?
Posted Oct 6, 2018 15:31 UTC (Sat) by zlynx (guest, #2285) [Link]
Do some experiments. If I log into my laptop remotely with SSH and SIGSTOP my already idle GUI session and then wait 10 minutes, Power History shows only about half a watt usage. Maybe less, the graph is small.
The screen is off, the NVMe drive is idle, the networking idles, the CPU idles, the PCIe bus goes into low power. There's nothing else to do. I suppose the system could place the RAM into low power refresh.
Other systems like Windows suspend user processes and offer ways to opt out for email or browser notifications to get a bit of runtime every two minutes. But that does not need any kernel support.
What's a CPU to do when it has nothing to do?
Posted Oct 6, 2018 17:37 UTC (Sat) by pkern (subscriber, #32883) [Link]
What's a CPU to do when it has nothing to do?
Posted Oct 6, 2018 18:47 UTC (Sat) by mjg59 (subscriber, #23239) [Link]
What's a CPU to do when it has nothing to do?
Posted Oct 6, 2018 18:56 UTC (Sat) by zlynx (guest, #2285) [Link]
Are you sure you can do better than half a watt?
What's a CPU to do when it has nothing to do?
Posted Oct 6, 2018 19:05 UTC (Sat) by mjg59 (subscriber, #23239) [Link]
What's a CPU to do when it has nothing to do?
Posted Oct 6, 2018 19:11 UTC (Sat) by zlynx (guest, #2285) [Link]
Things might be better than 0.5W in actual suspend. Remember when I said it looked like 0.5W I was talking about the laptop as it was idling. Fully powered on, with a connected SSH session, but no activity happening.
What's a CPU to do when it has nothing to do?
Posted Oct 6, 2018 19:21 UTC (Sat) by pkern (subscriber, #32883) [Link]
What's a CPU to do when it has nothing to do?
Posted Oct 8, 2018 3:42 UTC (Mon) by jfred (guest, #126493) [Link]
What's a CPU to do when it has nothing to do?
Posted Oct 8, 2018 3:46 UTC (Mon) by jfred (guest, #126493) [Link]
What's a CPU to do when it has nothing to do?
Posted Oct 7, 2018 21:22 UTC (Sun) by isoma (guest, #127702) [Link]
To me, Linux's version of Modern Standby would mean making the power consumption under system power state “freeze” (aka “s2idle”) end up very close to S1 ACPI sleep (the state you get from writing “standby” to /sys/power/state). Modern Standby means I can let my laptop / tablet go to sleep and see it instantly wake up with one button press - even on a USB keyboard or Bluetooth mouse. I can have that today, at the expense of battery life. On my laptop (XPS 13 Developer Edition) the “freeze” state power consumption is a lot higher than I'd like it to be. I'd imagine it's a common story on lots of PC devices that use Linux: idle power use is low, but it could be lots lower. Some improvements fit best into userland. After (say) 96 hours in freeze / idle, the system might want to look at battery use and consider using deep S3 / S4 sleep with scheduled wakeups. Maybe also wake up to poll the network. Once idle laptops and tablets are using about the same power as S1 sleep, we can one day look forward to convenience similar to Windows and iOS: beeps on new email; vibration for alerts we're interested in; waking the screen up to let us know about a video call from a friend. One day.
What's a CPU to do when it has nothing to do?
Posted Oct 8, 2018 9:06 UTC (Mon) by rjw@sisk.pl (subscriber, #39252) [Link]
However, this presentation was not about system-wide suspend at all, but rather about what happens if the system is idle and not suspended (that is a valid use case too, especially for servers).
On laptops it usually is a good idea to set up user space to suspend the whole system when it has been idle for a certain amount of time (either via s2idle or via ACPI S3 if supported), but what happens before it is suspended does matter too.
What's a CPU to do when it has nothing to do?
Posted Oct 8, 2018 9:36 UTC (Mon) by zlynx (guest, #2285) [Link]
What's a CPU to do when it has nothing to do?
Posted Oct 8, 2018 14:59 UTC (Mon) by drag (guest, #31333) [Link]
Sounds like a job for systemd --user.
Maybe systemd can be made aware of power levels and 'pause' notification processes. Then at pre-defined times it can 'pulse' the system, wake up the notification processes and give them a time window so they can do their checks, then put them back to sleep.
That would be something very cool to have.
What's a CPU to do when it has nothing to do?
Posted Oct 18, 2018 8:51 UTC (Thu) by Wol (subscriber, #4433) [Link]
I now have my laptop configured "don't sleep if plugged in", and I have to remember to make sure it's plugged in before I try anything like that. If I'm doing a 20min file transfer (over 100Mb ethernet!), it's a real pain to babysit the laptop to prevent it sleeping.
Cheers,
Wol
What's a CPU to do when it has nothing to do?
Posted Oct 18, 2018 18:26 UTC (Thu) by raven667 (subscriber, #5198) [Link]
What's a CPU to do when it has nothing to do?
Posted Oct 19, 2018 16:44 UTC (Fri) by flussence (subscriber, #85566) [Link]
What's a CPU to do when it has nothing to do?
Posted Oct 28, 2018 21:45 UTC (Sun) by farnz (subscriber, #17727) [Link]
How does cron know that there are processes that want to wake the system from deep sleep, and ensure that they're all told to run at once the moment the system is out of sleep for any reason?
AFAICT, cron only solves the other side of the problem - waking from sleep on a timetable.
What's a CPU to do when it has nothing to do?
Posted Oct 8, 2018 9:23 UTC (Mon) by eru (subscriber, #2753) [Link]
What's a CPU to do when it has nothing to do?
Posted Oct 8, 2018 11:35 UTC (Mon) by tonyblackwell (guest, #43641) [Link]
but I guess the point is valid.
What's a CPU to do when it has nothing to do?
Posted Oct 8, 2018 16:22 UTC (Mon) by eru (subscriber, #2753) [Link]
What's a CPU to do when it has nothing to do?
Posted Oct 8, 2018 15:34 UTC (Mon) by cdamian (subscriber, #1271) [Link]
The second one was much harder to read, but I got it from the context.
What's a CPU to do when it has nothing to do?
Posted Oct 9, 2018 17:23 UTC (Tue) by naptastic (guest, #60139) [Link]
The red line on that graph is suspiciously flat. Is anyone already testing this independently? Also, what's this going to do to latency-sensitive workloads, such as realtime A/V?
What's a CPU to do when it has nothing to do?
Posted Oct 9, 2018 20:07 UTC (Tue) by rweikusat2 (subscriber, #117920) [Link]
What's a CPU to do when it has nothing to do?
Posted Oct 9, 2018 20:52 UTC (Tue) by bfields (subscriber, #19510) [Link]
Just an obvious nitpick: the CPU as far as I know doesn't process audio samples one at a time, it moves them in batches in and out of buffers, so the sample rate isn't the same thing as the frequency with which the CPU has to wake up to process the audio stream.
What's a CPU to do when it has nothing to do?
Posted Oct 10, 2018 12:16 UTC (Wed) by rweikusat2 (subscriber, #117920) [Link]
If you want some numbers on this, the second section of this text
https://www.edn.com/design/consumer/4376143/Fundamentals-...
provides a short overview.
The patch described in the text tweaks the handling of a CPU which is already considered idle. The kernel decidedly shouldn't consider a CPU working with some audio stream idle and if it did, this change wouldn't make matters worse.
What's a CPU to do when it has nothing to do?
Posted Oct 10, 2018 20:23 UTC (Wed) by flussence (subscriber, #85566) [Link]
snd-hda-intel would be better to look at here, because that actually has improved power efficiency over the years by using huge DMA buffers; a system playing audio through pulseaudio and otherwise idle shows very low numbers in powertop.
What's a CPU to do when it has nothing to do?
Posted Oct 10, 2018 21:46 UTC (Wed) by rweikusat2 (subscriber, #117920) [Link]
What's a CPU to do when it has nothing to do?
Posted Oct 11, 2018 2:35 UTC (Thu) by bfields (subscriber, #19510) [Link]
The kernel certainly can consider a CPU working with an audio stream to be idle, and you can probably verify this by experimenting with powertop while playing audio.
What's a CPU to do when it has nothing to do?
Posted Oct 11, 2018 8:22 UTC (Thu) by cladisch (✭ supporter ✭, #50193) [Link]
What's a CPU to do when it has nothing to do?
Posted Oct 11, 2018 14:46 UTC (Thu) by bfields (subscriber, #19510) [Link]
Yes. And even then it's still common to buffer a few milliseconds worth, which is more than enough for the CPU to drop to a lower-power state. Though in practice of course that depends on how much processing it takes to generate the sound.
What's a CPU to do when it has nothing to do?
Posted Oct 18, 2018 9:06 UTC (Thu) by Wol (subscriber, #4433) [Link]
If you're playing a CD, you can have a 5 second lag and, provided the output buffer always has something in it, no-one will notice. If you're running the monitors, it only takes something like 5 milliseconds (I don't know the exact figure) of lag between the analog and digital audio, and the band will be getting headaches.
Cheers,
Wol
What's a CPU to do when it has nothing to do?
Posted Oct 18, 2018 9:31 UTC (Thu) by farnz (subscriber, #17727) [Link]
Further, with the CD example, you can start playing when the buffer has one CD Audio frame in it (1/75th of a second), and run the CD reader at a higher speed until you have 5 seconds of data in the buffer, before slowing it back to 1x to keep the buffer filled. If a hiccup happens, you have 5 seconds to recover before the user hears an issue.
This gets you no lag (you play the moment you've read a single frame), and a 5 second buffer against read issues, system overload etc. Obviously, you can't do this unless you can (as in the CD example) read ahead and have data waiting. Plus, when it's time to stop, you can just drop the buffer and stop immediately.
Monitor audio is problematic not because you can't tolerate delay, but because you can neither tolerate delay nor read ahead into the future; thus, you can't build up a 5 second safety buffer.
What's a CPU to do when it has nothing to do?
Posted Oct 18, 2018 14:46 UTC (Thu) by bfields (subscriber, #19510) [Link]
Not an expert, so I may be misusing these numbers, but I believe the numbers under /sys/devices/system/cpu/cpu0/cpuidle/*/latency are times in microseconds to return from the various states. On my system they vary between 2μs and 166μs. So I think it's possible to enter C-states even when waking up every millisecond?
What's a CPU to do when it has nothing to do?
Posted Oct 19, 2018 1:58 UTC (Fri) by leemeans (guest, #127765) [Link]
Intel CPUs will benefit from it particularly, .....; but ARM CPUs, for example, will also benefit.
Why is there a 'but' for the ARM architecture here?
Besides, just before this sentence I read that 'Wysocki said that the work is architecture-independent.' Does that imply that Intel CPUs will benefit a lot from this work but ARM CPUs won't? Or maybe not?