diff --git a/README.md b/README.md
index 520d8ff..121deec 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,128 @@
 # ryu
 
+[![GoDoc](https://godoc.org/github.com/cespare/ryu?status.svg)](https://godoc.org/github.com/cespare/ryu)
+
 This is a Go implementation of [Ryu](https://github.com/ulfjack/ryu), a fast
 algorithm for converting floating-point numbers to strings.
 
-TODO: more description.
+The API is:
+
+```
+func AppendFloat32(b []byte, f float32) []byte
+func AppendFloat64(b []byte, f float64) []byte
+func FormatFloat32(f float32) string
+func FormatFloat64(f float64) string
+```
+
+These functions are the equivalents of calling strconv.FormatFloat or
+strconv.AppendFloat using the formatter `'e'` and precision `-1`:
+
+```
+// These are the same:
+const f float32 = 1.234
+s1 := ryu.FormatFloat32(f)
+s2 := strconv.FormatFloat(float64(f), 'e', -1, 32)
+```
+
+## Benchmarks
+
+These benchmarks were taken with Go 1.12beta1 on Linux/amd64 using an
+Intel i7-8700K.
+
+```
+name                                     old time/op  new time/op  delta
+FormatFloat32-12                          128ns ± 1%    50ns ± 2%  -60.82%  (p=0.000 n=7+8)
+FormatFloat64-12                          129ns ± 4%    65ns ± 5%  -49.54%  (p=0.000 n=7+8)
+AppendFloat32/0e+00-12                   24.4ns ± 1%   3.0ns ± 1%  -87.88%  (p=0.000 n=8+8)
+AppendFloat32/1e+00-12                   26.5ns ± 1%  13.2ns ± 3%  -49.98%  (p=0.000 n=8+8)
+AppendFloat32/3e-01-12                   52.2ns ± 1%  32.5ns ± 2%  -37.73%  (p=0.000 n=8+8)
+AppendFloat32/1e+06-12                   41.2ns ± 1%  17.9ns ± 1%  -56.45%  (p=0.000 n=8+7)
+AppendFloat32/-1.2345e+02-12             83.3ns ± 2%  34.2ns ± 1%  -58.90%  (p=0.000 n=8+8)
+AppendFloat64/0e+00-12                   24.5ns ± 2%   3.3ns ± 2%  -86.50%  (p=0.000 n=8+8)
+AppendFloat64/1e+00-12                   26.9ns ± 1%  14.5ns ± 1%  -46.06%  (p=0.001 n=8+6)
+AppendFloat64/3e-01-12                   53.0ns ± 1%  42.5ns ± 0%  -19.75%  (p=0.001 n=8+6)
+AppendFloat64/1e+06-12                   41.4ns ± 1%  21.1ns ± 1%  -49.05%  (p=0.000 n=8+8)
+AppendFloat64/-1.2345e+02-12             83.8ns ± 1%  43.3ns ± 1%  -48.32%  (p=0.000 n=8+8)
+AppendFloat64/6.226662346353213e-309-12  25.5µs ± 1%   0.0µs ± 1%  -99.84%  (p=0.000 n=8+8)
+```
+
+The test `TestRandomBenchmark` gathers statistics about the distribution of call
+latencies for random float64 values. Here is the summary for one sample of 50,000
+random floats:
+
+```
+    ryu_test.go:279: after sampling 50000 float64s:
+        ryu: min = 2ns max = 90ns median = 41ns mean = 41ns
+        strconv (stdlib): min = 8ns max = 25845ns median = 106ns mean = 154ns
+```
+
+The `strconv.FormatFloat` latency is bimodal because of an infrequently-taken
+slow path that is orders of magnitude more expensive
+(https://golang.org/issue/15672).
+
+## Size optimization
+
+The Ryu algorithm requires several lookup tables. Ulf Adams's C library
+implements a size optimization (`RYU_OPTIMIZE_SIZE`) which greatly reduces the
+size of the float64 tables in exchange for a little more CPU cost.
+
+I have a WIP implementation of this optimization on the `size` branch. A binary
+built using that version is 7.96 kB smaller. The benchmark results take a hit as
+compared with the non-size-optimized build:
+
+```
+name                                     old time/op  new time/op  delta
+FormatFloat32-12                         50.0ns ± 2%  49.4ns ± 1%        ~  (p=0.183 n=8+8)
+FormatFloat64-12                         65.0ns ± 5%  72.1ns ± 5%  +10.96%  (p=0.000 n=8+8)
+AppendFloat32/0e+00-12                   2.95ns ± 1%  2.98ns ± 1%        ~  (p=0.072 n=8+8)
+AppendFloat32/1e+00-12                   13.2ns ± 3%  13.1ns ± 1%        ~  (p=0.275 n=8+8)
+AppendFloat32/3e-01-12                   32.5ns ± 2%  32.4ns ± 1%        ~  (p=0.742 n=8+8)
+AppendFloat32/1e+06-12                   17.9ns ± 1%  17.6ns ± 1%   -2.12%  (p=0.001 n=7+8)
+AppendFloat32/-1.2345e+02-12             34.2ns ± 1%  34.4ns ± 1%        ~  (p=0.426 n=8+8)
+AppendFloat64/0e+00-12                   3.31ns ± 2%  3.29ns ± 1%        ~  (p=0.394 n=8+8)
+AppendFloat64/1e+00-12                   14.5ns ± 1%  14.6ns ± 4%        ~  (p=0.641 n=6+8)
+AppendFloat64/3e-01-12                   42.5ns ± 0%  50.0ns ± 1%  +17.44%  (p=0.001 n=6+8)
+AppendFloat64/1e+06-12                   21.1ns ± 1%  21.1ns ± 2%        ~  (p=0.452 n=8+8)
+AppendFloat64/-1.2345e+02-12             43.3ns ± 1%  50.9ns ± 1%  +17.57%  (p=0.000 n=8+8)
+AppendFloat64/6.226662346353213e-309-12  40.6ns ± 1%  47.7ns ± 1%  +17.38%  (p=0.000 n=8+8)
+```
+
+However, it's still generally faster than strconv:
+
+```
+name                                     old time/op  new time/op  delta
+FormatFloat32-12                          129ns ± 2%    49ns ± 1%  -61.72%  (p=0.000 n=8+8)
+FormatFloat64-12                          130ns ± 3%    72ns ± 5%  -44.32%  (p=0.000 n=7+8)
+AppendFloat32/0e+00-12                   24.5ns ± 2%   3.0ns ± 1%  -87.83%  (p=0.000 n=8+8)
+AppendFloat32/1e+00-12                   26.4ns ± 1%  13.1ns ± 1%  -50.26%  (p=0.000 n=7+8)
+AppendFloat32/3e-01-12                   52.6ns ± 2%  32.4ns ± 1%  -38.43%  (p=0.000 n=8+8)
+AppendFloat32/1e+06-12                   41.3ns ± 2%  17.6ns ± 1%  -57.51%  (p=0.000 n=8+8)
+AppendFloat32/-1.2345e+02-12             83.5ns ± 1%  34.4ns ± 1%  -58.82%  (p=0.000 n=8+8)
+AppendFloat64/0e+00-12                   24.6ns ± 2%   3.3ns ± 1%  -86.63%  (p=0.000 n=8+8)
+AppendFloat64/1e+00-12                   26.7ns ± 1%  14.6ns ± 4%  -45.51%  (p=0.000 n=8+8)
+AppendFloat64/3e-01-12                   52.7ns ± 1%  50.0ns ± 1%   -5.17%  (p=0.000 n=8+8)
+AppendFloat64/1e+06-12                   41.2ns ± 1%  21.1ns ± 2%  -48.61%  (p=0.000 n=7+8)
+AppendFloat64/-1.2345e+02-12             83.7ns ± 1%  50.9ns ± 1%  -39.17%  (p=0.000 n=8+8)
+AppendFloat64/6.226662346353213e-309-12  25.8µs ± 2%   0.0µs ± 1%  -99.81%  (p=0.000 n=8+8)
+```
+
+## Notes
+
+This package is a fairly direct Go translation of Ulf Adams's C library at
+https://github.com/ulfjack/ryu. This code is also licensed with Apache 2.0 as a
+derived work of that code.
+
+This package requires Go 1.12 (expected to be released February 2019).
+
+For a small fraction of inputs, Ryu gives a different value than strconv does
+for the last digit. This is due to a bug in strconv: https://golang.org/issue/29491.
+
+## Future work
+
+My plan is to incorporate this into strconv (see
+https://golang.org/issue/15672). Then everyone will benefit from the faster
+algorithm and there will be no need for this library.
+
+If you would like to contribute, I'm interested in any bugfixes or clear-cut
+optimizations, but given the above I don't intend to add more features or APIs
+to this package.
diff --git a/go.mod b/go.mod
index 877495b..a24fc5d 100644
--- a/go.mod
+++ b/go.mod
@@ -1,5 +1,3 @@
 module github.com/cespare/ryu
 
 go 1.12
-
-require github.com/kr/pretty v0.1.0
diff --git a/go.sum b/go.sum
index a1aa49e..e69de29 100644
--- a/go.sum
+++ b/go.sum
@@ -1,5 +0,0 @@
-github.com/kr/pretty v0.1.0 h1:L/CwN0zerZDmRFUapSPitk6f+Q3+0za1rQkzVuMiMFI=
-github.com/kr/pretty v0.1.0/go.mod h1:dAy3ld7l9f0ibDNOQOHHMYYIIbhfbHSm3C4ZsoJORNo=
-github.com/kr/pty v1.1.1/go.mod h1:pFQYn66WHrOpPYNljwOMqo10TkYh1fy3cYio2l3bCsQ=
-github.com/kr/text v0.1.0 h1:45sCR5RtlFHMR4UwH9sdQ5TC8v0qDQCHnXt+kaKSTVE=
-github.com/kr/text v0.1.0/go.mod h1:4Jbv+DJW3UT/LiOwJeYQe1efqtUx/iVham/4vfdArNI=
diff --git a/ryu.go b/ryu.go
index bbedc43..f149243 100644
--- a/ryu.go
+++ b/ryu.go
@@ -15,6 +15,8 @@
 // Ulf Adams which may be found at https://github.com/ulfjack/ryu. That source
 // code is licensed under Apache 2.0 and this code is derivative work thereof.
 
+// Package ryu implements the Ryu algorithm for quickly converting floating
+// point numbers into strings.
 package ryu
 
 import (
diff --git a/ulfjack/.gitignore b/ulfjack/.gitignore
deleted file mode 100644
index a1675c9..0000000
--- a/ulfjack/.gitignore
+++ /dev/null
@@ -1 +0,0 @@
-/bench
diff --git a/ulfjack/bench.c b/ulfjack/bench.c
deleted file mode 100644
index ec121f8..0000000
--- a/ulfjack/bench.c
+++ /dev/null
@@ -1,45 +0,0 @@
-#include
-#include
-#include
-#include
-
-#include "ryu/ryu.h"
-
-int64_t time_sub(const struct timespec *t0, const struct timespec *t1) {
-  int64_t nsec = (int64_t)t0->tv_sec * 1000000000 + (int64_t)t0->tv_nsec;
-  nsec -= (int64_t)t1->tv_sec * 1000000000 + (int64_t)t1->tv_nsec;
-  return nsec;
-}
-
-int main(int argc, char **argv) {
-  printf("%s\n", f2s((float)(6.400023450830159e+08)));
-  return 0;
-
-  struct timespec start, end;
-  int64_t elapsed;
-  int64_t iters = 0;
-
-  char buf[40];
-  int sink;
-
-  clock_gettime(CLOCK_MONOTONIC, &start);
-  for (;;) {
-    for (int i = 0; i < 10000; i++) {
-      d2s_buffered(1.0, buf);
-      sink += buf[2];
-    }
-    clock_gettime(CLOCK_MONOTONIC, &end);
-
-    iters += 10000;
-    elapsed = time_sub(&end, &start);
-    if (elapsed >= 1000000000) {
-      break;
-    }
-  }
-
-  double secs = (double)elapsed / 1000000000.0;
-  printf("%lu iters in %lf secs: %.2lf ns/iter\n", iters, secs, (double)elapsed / (double)iters);
-  if (argc == 1000) {
-    printf("%d\n", sink);
-  }
-}
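
Reviewer note on the README text added above: it says the four new functions match `strconv.FormatFloat` / `strconv.AppendFloat` with format `'e'` and precision `-1`. As a minimal sketch of what that reference output looks like, the program below uses only the standard library. It does not import `ryu`, so it shows the strings the README says the new package should reproduce rather than exercising the package itself. The helper names `formatE32`, `formatE64`, and `appendE64` are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatE32 is a hypothetical helper: the strconv call that the README
// describes as equivalent to ryu.FormatFloat32.
func formatE32(f float32) string {
	return strconv.FormatFloat(float64(f), 'e', -1, 32)
}

// formatE64 is a hypothetical helper: the strconv call that the README
// describes as equivalent to ryu.FormatFloat64.
func formatE64(f float64) string {
	return strconv.FormatFloat(f, 'e', -1, 64)
}

// appendE64 is a hypothetical helper: the strconv call that the README
// describes as equivalent to ryu.AppendFloat64.
func appendE64(b []byte, f float64) []byte {
	return strconv.AppendFloat(b, f, 'e', -1, 64)
}

func main() {
	// The float32 inputs behind the AppendFloat32 benchmark names,
	// plus the 1.234 constant from the README snippet.
	for _, f := range []float32{0, 1, 0.3, 1e6, -123.45, 1.234} {
		fmt.Println(formatE32(f))
	}

	// The 64-bit variant and the append form, which writes into an
	// existing buffer instead of allocating a new string.
	fmt.Println(formatE64(-123.45))
	fmt.Println(string(appendE64([]byte("x="), 1.234)))
}
```

The first five printed values line up with the sub-benchmark names in the tables (`0e+00`, `1e+00`, `3e-01`, `1e+06`, `-1.2345e+02`), which suggests those names are the inputs rendered in this same format.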