Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The Linux evdev sound API is very slow to open(2) the device #6

Open
NoSuck opened this issue Apr 19, 2019 · 19 comments
Open

The Linux evdev sound API is very slow to open(2) the device #6

NoSuck opened this issue Apr 19, 2019 · 19 comments
Assignees

Comments

@NoSuck
Copy link

NoSuck commented Apr 19, 2019

Thank you for overseeing this project.

Cursed to use Arch Linux on one, last machine for the foreseeable future, one of the beep packages I use recently pointed its upstream URL your way (1.3-4 -> 1.4.4-1).

I set up permissions using the beep group method described in PERMISSIONS.md and immediately noticed sluggish, staggered response. Consider the following eight-note scale:

■ Test () { beep -l 250 -f 262 ; beep -l 250 -f 294 ; beep -l 250 -f 330 ; beep -l 250 -f 349 ; beep -l 250 -f 392 ; beep -l 250 -f 440 ; beep -l 250 -f 494 ; beep -l 250 -f 523 ; } ; time Test

real    0m2.466s
user    0m0.012s
sys     0m0.008s

In other words, your fork takes 23% longer to execute than it should--and it staggers from note to note. Do you know why? Do you know of any workarounds?

Thank you for reading.

■ uname -srvmo
Linux 5.0.7-arch1-1-ARCH #1 SMP PREEMPT Mon Apr 8 10:37:08 UTC 2019 x86_64 GNU/Linux
@ndim ndim self-assigned this Apr 19, 2019
@ndim
Copy link
Member

ndim commented Apr 19, 2019

I would just use a single beep invocation for playing a sequence of beeps:

#!/bin/sh
Test () { beep -l 250 -f 262 -n -l 250 -f 294 -n -l 250 -f 330 -n -l 250 -f 349 -n -l 250 -f 392 -n -l 250 -f 440 -n -l 250 -f 494 -n -l 250 -f 523 ; }
time Test

real	0m2.011s
user	0m0.000s
sys	0m0.002s

As to why new beep would be slower than old beep, I will have to think about that a bit first.

@NoSuck
Copy link
Author

NoSuck commented Apr 19, 2019

Thank you for the feedback. To clarify, slowdown is introduced by beep invocations, which is why the Test function I provided is the way it is. What sort of time output does your system show for the original Test?

@ndim
Copy link
Member

ndim commented Apr 20, 2019

Sorry I misunderstood your actual concern.

The original Test() reports

real	0m2.088s
user	0m0.005s
sys	0m0.012s

here. That gives around 11ms per invocation, which is consistent with the

real	0m2.011s

of the one invocation Test().

My system is an about 5 to 8 year old AMD A6-3500 3 core 2GHz(-ish?) with 16GB not at all swapping, running Fedora 29.

@NoSuck
Copy link
Author

NoSuck commented Apr 20, 2019

Thank you.

I will try the other permissions setups (“all other users” with GROUP="beep" and “users currently logged into virtual console”). I am unable to do this now, as my changes to udev rules files do not seem to take effect with udevadm control --reload and udevadm trigger, and I am unable to reboot the system until later.

Just to be safe, I downgraded beep to the original for a side-by-side comparison:

#!/usr/bin/env -S bash --

beep -l 250 -f 262
beep -l 250 -f 294
beep -l 250 -f 330
beep -l 250 -f 349
beep -l 250 -f 392
beep -l 250 -f 440
beep -l 250 -f 494
beep -l 250 -f 523
■ time ./beepTest # 1.3-4

real    0m2.021s
user    0m0.011s
sys     0m0.009s

■ time ./beepTest # 1.4.4-1
real    0m2.524s
user    0m0.017s
sys     0m0.004s

I also compared strace output. It was identical in all the parts that could be reasonably expected to be identical. 😕

By the way, what were the “few serious security issues for beep in 2018” mentioned in PERMISSIONS.md? I never considered setuid a serious problem, so I am curious as to what actually happened. Were incidents reported?

@ndim
Copy link
Member

ndim commented Apr 20, 2019

The permission setup happens whenever the pcspkr.ko kernel module is loaded and creates the devices, which usually happens once on system bootup, or later when you manually modprobe pcspkr.

The permission setup does not happen when you run beep.

What is different between old beep and new beep?

Old beep, for each beep it does, opens /dev/tty0 (and failing that, /dev/vc/0) and calls the KIOCSOUND ioctl on it to start and stop the sound. If you give old beep a different device to use (such as e.g. /dev/input/by-path/platform-pcspkr-event-spkr), it will use that instead.

New beep opens /dev/input/by-path/platform-pcspkr-event-spkr once and writes a few bytes to it to start and stop the sound however many times requested. However, new beep checks a few more things, and has a bit more infrastructure code to initialize, which is probably where it loses a bit of time. As to where exactly new beep loses those milliseconds, I will have to investigate by profiling the program.

I do not see those few milliseconds as a huge issue in practical use, but now that you have pointed this out I do want to know how exactly this happens, how this can be examined, and also how this could be improved (and what are the tradeoffs). Not my biggest priority, though.

As to the security issues, see CVE-2018-0492 and CVE-2018-1000532. Both are covered in the johnath#11 issue.

@NoSuck
Copy link
Author

NoSuck commented Apr 21, 2019

As to the security issues, see CVE-2018-0492 and CVE-2018-1000532.

I was wondering about incidents reported, partly because I think it would be hilarious but mostly because I am curious about the CVE's relevance in the real world.

Thank you for maintaining an interest in this bug. I am sorry I was not able to help much in pinpointing the cause.

@ndim
Copy link
Member

ndim commented Apr 21, 2019

Being unfamiliar with either of them, I have just tried gprof and oprofile on beep and did not find any useful data.

After inserting exit(EXIT_SUCCESS); at various points within the beep sources and running time beep -f 2222 -l 1000 a number of times, my impression is that beep_drivers_detect() takes about 80% of beep's startup time.

That is 100% code which I have written after taking over the beep source in order to separate the specific beeping API from the main beep program.

The question is whether it makes sense for beep to support more than the evdev API. If not, the need for having more than one driver goes away and we will not need the beep driver infrastructure, saving about 80% of program startup.

@ndim
Copy link
Member

ndim commented Apr 22, 2019

The actual line which takes about 80% of startup time appears to be, as determined again via exit(EXIT_SUCCESS); and _exit(EXIT_SUCCESS);

/* const char *device_name = "/dev/input/by-path/platform-pcspkr-event-spkr"; */
const int fd = open(device_name, O_WRONLY);

Why? No idea.

Opening that device_name cannot be worked around at all, so I do not see what I could do about that.

@ndim
Copy link
Member

ndim commented Apr 22, 2019

I just checked these two programs which just open the device file.

Representing old beep with the console KIOCSOUND API:

#include <stdlib.h>
#include <unistd.h>

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(void)
{
  const int fd = open("/dev/tty5", O_WRONLY);
  close(fd);
  return EXIT_SUCCESS;
}

and representing new beep with the evdev API:

#include <stdlib.h>
#include <unistd.h>

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(void)
{
  const int fd = open("/dev/input/by-path/platform-pcspkr-event-spkr", O_WRONLY);
  close(fd);
  return EXIT_SUCCESS;
}

It turns out that the console API skeleton takes between 1ms and 2ms and the evdev API skeleton takes 9ms to 14ms (BTW, directly opening the target of the symlink does not change the evdev times, FWIW).

So I would blame the additional 8ms to 12ms on the kernel, and, from a beep point of view, close this bug without a fix. If you are using beep for a sequence of beeps, use -n instead of invoking beep multiple times, and the timing issue should disappear.

@ndim
Copy link
Member

ndim commented Apr 22, 2019

If I change these skeleton test programs to run the open() and close() in a loop for 1000 times, it turns out that opening /dev/tty5 1000 times takes about 8ms, while opening /dev/input/by-path/platform-pcspkr-event-spkr 1000 times takes between 14.8s and 17.5s, i.e. about 2000 times as long.

No idea why.

@NoSuck
Copy link
Author

NoSuck commented Apr 23, 2019

You are very diligent. I tried the looped test and saw even worse results than you did (with su):

■ time ./testKernelEvdev # /dev/tty5

real    0m0.011s
user    0m0.000s
sys     0m0.011s
■ time ./testKernelEvdev # /dev/input/by-path/platform-pcspkr-event-spkr

real    0m59.463s
user    0m0.027s
sys     0m0.024s
#include <stdlib.h>
#include <unistd.h>

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

void open_and_close(void)
{
	//const int fd = open("/dev/input/by-path/platform-pcspkr-event-spkr", O_WRONLY);
	const int fd = open("/dev/tty5", O_WRONLY);
	close(fd);
}

int main(void)
{
	for(unsigned int u = 0 ; u < 1000 ; u++)
	{
		open_and_close();
	}
	return EXIT_SUCCESS;
}

How frustrating.

ndim added a commit that referenced this issue Dec 20, 2019
As we found out in <#6>,
opening the character device special files for the more modern
evdev API takes about 5000 times the time needed for opening the
device for the old console API:

    console API
        device name: /dev/ttyN (e.g. /dev/tty4)
        time per open-and-close cycle: about 3us

    evdev API
        device name: /dev/input/by-path/platform-pcspkr-event-spkr
        time per open-and-close cycle: about 15ms

To help with measuring those times, and timing them with
/usr/bin/time we have now added "issue-6-benchmark".

Run "make issue-6-benchmark" to build "issue-6-benchmark".

Read "issue-6-benchmark.c" or run "issue-6-benchmark --help"
for usage information.

One peculiar thing to note is that apparently, the evdev API
takes neither User time nor System time, but still needs wall
clock time:

	Command being timed: "./issue-6-benchmark 5118303 /dev/tty2"
	User time (seconds): 0.54
	System time (seconds): 13.37
	Percent of CPU this job got: 94%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:14.75

	Command being timed: "./issue-6-benchmark 958 /dev/input/by-path/platform-pcspkr-event-spkr"
	User time (seconds): 0.00
	System time (seconds): 0.07
	Percent of CPU this job got: 0%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:14.66
@ndim
Copy link
Member

ndim commented Dec 31, 2019

Running issue-6-benchmark on my system from a virtual console gives the following results, which I am going to interpret below.

/usr/bin/time -v ./issue-6-benchmark 5407515 /dev/tty4
device_name    = /dev/tty4
counters[succ] = 5407515
counters[fail] = 0
Command being timed: "./issue-6-benchmark 5407515 /dev/tty4"
User time (seconds): 0.58
System time (seconds): 14.89
Percent of CPU this job got: 93%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:16.51
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 1760
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 79
Voluntary context switches: 0
Involuntary context switches: 10087
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0

/usr/bin/time -v ./issue-6-benchmark 1237 /dev/input/by-path/platform-pcspkr-event-spkr
device_name    = /dev/input/by-path/platform-pcspkr-event-spkr
counters[succ] = 1237
counters[fail] = 0
Command being timed: "./issue-6-benchmark 1237 /dev/input/by-path/platform-pcspkr-event-spkr"
User time (seconds): 0.00
System time (seconds): 0.09
Percent of CPU this job got: 0%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:16.63
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 1852
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 80
Voluntary context switches: 2475
Involuntary context switches: 7
Swaps: 0
File system inputs: 0
File system outputs: 8
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0

The first obvious thing is that 5407515 times opening the console device /dev/tty4 takes 16.51 seconds, and 1237 times opening the evdev device /dev/input/by-path/platform-pcspkr-event-spkr takes 16.63 seconds. So opening the evdev device takes about 13.4ms, and opening the console device takes about 3.1us, which makes opening the evdev device about 4400 times as slow.

Now, let us read the output of /usr/bin/time -v to see what the differences are.

For the console device, opening 5407515 times takes 0.58 seconds of user time, 14.89 seconds of system time, which about adds up to 16.51 seconds of wall clock time. Also, the console device causes 0 voluntary context switches and 10087 involuntary context switches. So far, so good.

For the evdev device on the other hand, opening 1237 times takes 0.00 seconds of user time, 0.09 seconds of system time, which is just about nothing compared to the 16.63 seconds of wall clock time. So for more than 99% of the elapsed wall clock time, the computer appears to be doing things that are neither user time nor system time.

Also, the evdev device causes 2475 voluntary context switches and 7 involuntary context switches. This is remarkably different from the case of the console device: Every open(2)-and-close(2) cycle appears to cause almost exactly two voluntary context switches.

Are just 2475 voluntary context switches taking all that 16.54 seconds of neither-system-nor-user time?

Also, why is the time taken by context switches not be counted as system time?

The time taken for the 7 involuntary context switches appears negligible, as the console device test case manages 10087 of those without causing taking too much time that is neither user time nor system time.

At this time, I have not yet dug further down into the Linux kernel (glibc is probably not to fault, unless it exhibits different behaviour depending on the filename string passed to open(2)) to find out where those context switches might be caused, why opening those different devices does context switches so wildly differently, and whether this could be changed to improve open(2) times for beep without compromising other aspects of the device behaviour.

@ndim ndim changed the title Your fork is slow. The evdev sound API is very slow. Dec 31, 2019
@ndim ndim changed the title The evdev sound API is very slow. The Linux evdev sound API is very slow. Jan 1, 2020
@ndim ndim pinned this issue Jan 24, 2020
ndim added a commit that referenced this issue Feb 6, 2020
Record resource usage internally in addition to recording
the elapsed wall clock time.

This generates a report internally with all the resource
and time usage which appears to be of importance first.

At the end, it still runs externally via "/usr/bin/time -v"
to make sure everything /usr/bin/time knows about is reported.

Helps with #6
but does not fix it.
@ndim
Copy link
Member

ndim commented Feb 6, 2020

The following is example output of running issue-6-benchmark on my machine (AMD A6-3500 3 core APU from around 2011). No idea who would be the proper person to ask why the differences shown are there.

Writable TTY device found: /dev/tty4

Running benchmarks (this will take a minute or two).

Beginning with a quick test run to dimension the actual benchmark.
  Measuring 1000000 repeats for device /dev/tty4
      counters[succ] = 1000000
      counters[fail] = 0
    Time spent: 2.78443 s
  Measuring 230 repeats for device /dev/input/by-path/platform-pcspkr-event-spkr
      counters[succ] = 230
      counters[fail] = 0
    Time spent: 3.68627 s

Now for some actual benchmarks, measured internally:
  Measuring 9086240 repeats for device /dev/tty4
      counters[succ] = 9086240
      counters[fail] = 0
    Time and resource usage (according to getrusage(2) and clock_gettime(2)):
      usr time (seconds):                 1.008093
      sys time (seconds):                23.070988
      wall clock time (seconds):         25.096588
      voluntary context switches:         0
      involuntary context switches:   13079
      major (I/O) page faults:            0
      minor (reclaim) page faults:        0
    Time unaccounted for (wall clock - usr - sys):
      in seconds                          1.017507
      as fraction of wall clock time      4.05%

  Measuring 1579 repeats for device /dev/input/by-path/platform-pcspkr-event-spkr
      counters[succ] = 1579
      counters[fail] = 0
    Time and resource usage (according to getrusage(2) and clock_gettime(2)):
      usr time (seconds):                 0.002100
      sys time (seconds):                 0.049265
      wall clock time (seconds):         25.334364
      voluntary context switches:      3158
      involuntary context switches:       1
      major (I/O) page faults:            0
      minor (reclaim) page faults:        0
    Time unaccounted for (wall clock - usr - sys):
      in seconds                         25.282999
      as fraction of wall clock time     99.80%

Summary:
    console device: /dev/tty4
        time per open(2)-and-close(2):       2.762 us
        open(2)-and-close(2) rate:      362050.807 / s
    evdev device: /dev/input/by-path/platform-pcspkr-event-spkr
        time per open(2)-and-close(2):   16044.562 us
        open(2)-and-close(2) rate:          62.326 / s

So opening an evdev device takes about 5808.95 times as long
as opening a console device.

Notes:
  * All measured times can only be relied upon when this benchmark
    is running on an otherwise idle machine.
  * The weird thing is that for the evdev device, almost 100% of
    the elapsed wall clock time is spent doing something which is
    *neither* system time *nor* user time. What time is it then?

Using the external '/usr/bin/time -v' command for the most details:
  /usr/bin/time -v ./issue-6-benchmark 9086240 /dev/tty4
      counters[succ] = 9086240
      counters[fail] = 0
	Command being timed: "./issue-6-benchmark 9086240 /dev/tty4"
	User time (seconds): 1.02
	System time (seconds): 22.96
	Percent of CPU this job got: 95%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:25.01
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 1892
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 81
	Voluntary context switches: 1
	Involuntary context switches: 12748
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

  /usr/bin/time -v ./issue-6-benchmark 1579 /dev/input/by-path/platform-pcspkr-event-spkr
      counters[succ] = 1579
      counters[fail] = 0
	Command being timed: "./issue-6-benchmark 1579 /dev/input/by-path/platform-pcspkr-event-spkr"
	User time (seconds): 0.00
	System time (seconds): 0.06
	Percent of CPU this job got: 0%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:25.36
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 1848
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 81
	Voluntary context switches: 3159
	Involuntary context switches: 1
	Swaps: 0
	File system inputs: 0
	File system outputs: 8
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

@deoxal
Copy link

deoxal commented Oct 2, 2022

How can I add a rule to be able to write to /dev/vc/0 or /dev/tty0? I want to try this.

Even using beep -l 1 -r 3000 seems to have delayed tones sometimes.

Also why was the evdev API added?

Is it because of the vulnerabilities? Nothing in those CVEs seem to indicate it was inherent to the console API.

@ndim
Copy link
Member

ndim commented Oct 2, 2022

I do not know whether it is possible to add a rule for non-root users to write to /dev/tty0. I would be very surprised however if that was a safe thing to do.

I also do not consider it a big problem if there is some delay between starting beep and the first tone of many actually starting. There are no realtime guarantees when launching the beep process either, so if it happens to be quick, that is always down to luck.

So... about the APIs.

The "console" API for the PC speaker (my term) works on the virtual console device (ioctl_console(2)). That is the white text on dark background you get from a classical VGA text console or a more modern framebuffer device virtual console, using agetty or a similar program.

So if you log in on a virtual console, you are running on a specific TTY device (e.g. /dev/tty3 if you are using the virtual console you get with Ctrl+Alt+F3, some distributions used to call that device /dev/vc/3, maybe back in devfs days or so).

Controlling the PC speaker beep via that console device made sense back then (probably in the 1990s), because you were obviously always running on a virtual console. I am not familiar with why the /dev/tty0 device exists, so the following is speculation: There is no virtual console to be reached with Alt+F0 as there is no F0 key, so /dev/tty0 might be explicitly for accessing the console ioctls without needing to be logged in via a virtual console like e.g. from a daemon process - which usually runs as root. So I guess it did not matter that /dev/tty0 was only accessible by root.

However, if you log in via graphical login and then start a virtual terminal program like gnome-terminal or xterm or if you log in remotely via ssh, you do not get a /dev/ttyNN virtual console device you can do your ioctl(2) on but something entirely different, which does not give you access to the computer's PC speaker hardware because such a thing does not exist in the more abstract terminal models of a ssh session or a graphical user interface.

Now while it is possible to have setuid-root executables or run executables via sudo for device access to the /dev/tty0 device not associated with any virtual console or just using some user's /dev/ttyNN device for PC speaker access, that gives the complete process root permissions instead of just granting access to the one device it actually needs the root permissions for. And as it is very difficult to get everything right inside the executable (which is where the beep CVEs came from), it is good practise to only have the absolute minimum set of setuid-root and sudo-run executables on your system which are strictly necessary for the operation of the system. And even in the very few setuid programs we are still using today have CVEs found from time to time, and there are a lot more security focused minds looking at those than are looking at some stupid little mostly irrelevant utility like beep.

Anyway, someone came up with a better approach and created the new evdev API (my term) for accessing the PC speaker: A dedicated device for the PC speaker which uses the input event device infrastructure and allows the admin to set arbitrary permissions on that specific device independently from everything else on the system. The dedicated device makes it possible to give access to the PC speaker on a very finely grained basis using standard unix permissions (and later ACLs), and there is no need for any setuid or sudo tricks with any program using the PC speaker, and the kernel can do the security part which it is much better at than some little beep utility can ever be.

I do not actually know whether these were the actual reasons for the evdev API to be added. However, I can observe that the evdev API just makes sense, while the console API just has the air of a legacy API which was originally added without much thought about what Linux would be used for 10 or 20 years after its first implementation.

So unless you use your computer like in the 1990s (text only and without being connected to today's networks), use the evdev API and avoid the console API.

I guess the console API is OK to use if you are actually logged into a virtual console and therefore happen to have access to a /dev/ttyNN device as whatever non-root user you are running as anyway, or if you happen to be root for a reason entirely unrelated to allowing access to the PC speaker. For everything else, there is the perfectly working evdev API.

@ndim
Copy link
Member

ndim commented Oct 2, 2022

For the record, here is the issue-6-benchmark output from a newer Ryzen 5 5600X 6 core 12 threads processor machine.

Summarizing, the overall performance has increased from the old machine (A6-3500 APU 3 core) to the new machine by a bit over 30% for the console API, and by not quite 30% for the evdev API, so the ratio of console open+close to evdev open+close has not changed much (from 5800 times to 5900 times).

Writable TTY device found: /dev/tty4

Running benchmark(s). This will literally take a minute or two.

For reproducible results, make sure the CPU load is very close to 0,
that there is a sufficient amount of available RAM, and there is
no heavy I/O load. Stopping web browsers running Javascript might help.

Beginning with a quick test run to dimension the actual benchmark:
  Measuring 250000 repeats for /dev/tty4
      counters[succ] = 250000
      counters[fail] = 0
    Time spent: 0.528172 s
  Measuring 250 repeats for /dev/input/by-path/platform-pcspkr-event-spkr
      counters[succ] = 250
      counters[fail] = 0
    Time spent: 3.16464 s

Now for some actual benchmarks, measured internally:
  Measuring 11975265 repeats for /dev/tty4
      counters[succ] = 11975265
      counters[fail] = 0
    Time and resource usage (according to getrusage(2) and clock_gettime(2)):
      usr time (seconds):                 0.798095
      sys time (seconds):                23.790903
      wall clock time (seconds):         25.264538
      voluntary context switches:         0
      involuntary context switches:     101
      major (I/O) page faults:            0
      minor (reclaim) page faults:        0
    Time unaccounted for (wall clock - usr - sys):
      in seconds                          0.675540
      as fraction of wall clock time      2.67%

  Measuring 1999 repeats for /dev/input/by-path/platform-pcspkr-event-spkr
      counters[succ] = 1999
      counters[fail] = 0
    Time and resource usage (according to getrusage(2) and clock_gettime(2)):
      usr time (seconds):                 0.000356
      sys time (seconds):                 0.010703
      wall clock time (seconds):         24.929452
      voluntary context switches:      3998
      involuntary context switches:       0
      major (I/O) page faults:            0
      minor (reclaim) page faults:        0
    Time unaccounted for (wall clock - usr - sys):
      in seconds                         24.918393
      as fraction of wall clock time     99.96%

Summary:
    console device: /dev/tty4
        time per open(2)-and-close(2):       2.110 us
        open(2)-and-close(2) rate:      473995.015 / s
    evdev device: /dev/input/by-path/platform-pcspkr-event-spkr
        time per open(2)-and-close(2):   12470.962 us
        open(2)-and-close(2) rate:          80.186 / s

So opening an evdev device takes 5911.17 times as long
as opening a console device.

Notes:
  * All measured times can only be relied upon when this benchmark
    is running on an otherwise idle machine, as wall clock time matters.
  * The really weird thing is that for the evdev device, almost 100% of
    the elapsed wall clock time is spent doing something which is
    *neither* system time *nor* user time. What kind of time is it then?

Using the external command '/usr/bin/time -v' for the most details:
  /usr/bin/time -v ./issue-6-benchmark 11975265 /dev/tty4
      counters[succ] = 11975265
      counters[fail] = 0
	Command being timed: "./issue-6-benchmark 11975265 /dev/tty4"
	User time (seconds): 0.84
	System time (seconds): 23.65
	Percent of CPU this job got: 97%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:25.16
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 1828
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 81
	Voluntary context switches: 1
	Involuntary context switches: 84
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

  /usr/bin/time -v ./issue-6-benchmark 1999 /dev/input/by-path/platform-pcspkr-event-spkr
      counters[succ] = 1999
      counters[fail] = 0
	Command being timed: "./issue-6-benchmark 1999 /dev/input/by-path/platform-pcspkr-event-spkr"
	User time (seconds): 0.01
	System time (seconds): 0.00
	Percent of CPU this job got: 0%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:25.16
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 1788
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 83
	Voluntary context switches: 3999
	Involuntary context switches: 0
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

@deoxal
Copy link

deoxal commented Oct 2, 2022

No I don't think the initial setup needs to be faster in any situation. I'm saying playing multiple tones with -n seems like there are tiny but perceptible delays using a low value for -l, less than 3 I think.

Even for music there is no reason to use an -l value that low. The only case I can think of to do that would be sound effects in a retro style game.

What do you mean by a non-user though, did you mean non-privileged user? Besides users writing to file, the only thing I can of that can be described as doing that is the kernel but that doesn't make sense to me.

Here's the time using:

beep -l 0 -r 50

real 0m4.931s
user 0m.000s
sys 0m.002s

and the time with

beep -e /dev/tty6 -l 0 -r 50

real 0m4.908s
user 0m.000s
sys 0m.002s

@ndim
Copy link
Member

ndim commented Oct 3, 2022

No I don't think the initial setup needs to be faster in any situation. I'm saying playing multiple tones with -n seems like there are tiny but perceptible delays using a low value for -l, less than 3 I think.

Then we are not talking about console API vs evdev API. Have you read the beep(1) man page about the -d and -D options and the default value of 100ms?

(I am not saying -d/-D makes sense)

Even for music there is no reason to use an -l value that low. The only case I can think of to do that would be sound effects in a retro style game.

So... nothing beep(1) needs to worry about. You are not writing that retro style game in sh, are you?

What do you mean by a non-user though, did you mean non-privileged user? Besides users writing to file, the only thing I can of that can be described as doing that is the kernel but that doesn't make sense to me.

Sorry, typo. Yes, non-root or non-privileged users.

Here's the time using:

beep -l 0 -r 50

real 0m4.931s
user 0m.000s
sys 0m.002s

and the time with

beep -e /dev/tty6 -l 0 -r 50

real 0m4.908s
user 0m.000s
sys 0m.002s

Nothing surprising here. The evdev API takes a few ms longer for opening the device, and 49 delays of 100ms between the 50 notes gives the same 4900ms.

@deoxal
Copy link

deoxal commented Oct 5, 2022

No I would not write a game in any POSIX shell. It seemed like evdev's time was diverging from the console's time which would affect C++ etc but I didn't think closely enough about it.

@ndim ndim changed the title The Linux evdev sound API is very slow. The Linux evdev sound API is very slow to open(2) the device Oct 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants