
Mimalloc v3 memory usage regression #1014

Open
Jarred-Sumner opened this issue Feb 17, 2025 · 1 comment

Jarred-Sumner (Contributor) commented Feb 17, 2025

We use mimalloc in Bun, and locally testing mimalloc dev3 in several of our benchmarks yielded positive results, which is exciting, but a handful of our memory-leak stress tests regress in CI. I'm mostly filing this issue so you have another data point for the v3 branch - not asking you to fix a bug or anything. We will stick with v2 for now but will give v3 another go in the future.

Logs from this test:

```
2025-02-16 02:22:19 PST | Start RSS 2704
2025-02-16 02:22:19 PST | RSS Growth 227
2025-02-16 02:22:19 PST | RSS Growth 198
2025-02-16 02:22:19 PST | RSS Growth 203
2025-02-16 02:22:19 PST | RSS Growth 275
2025-02-16 02:22:19 PST | RSS Growth 91
2025-02-16 02:22:19 PST | RSS Growth 150
2025-02-16 02:22:19 PST | RSS Growth 283
2025-02-16 02:22:19 PST | RSS Growth 151
2025-02-16 02:22:19 PST | RSS Growth 246
2025-02-16 02:22:19 PST | RSS Growth 66
2025-02-16 02:22:19 PST | RSS Growth 262
2025-02-16 02:22:19 PST |
2025-02-16 02:22:19 PST | 156 |             lastRSS = rss;
2025-02-16 02:22:19 PST | 157 |           }
2025-02-16 02:22:19 PST | 158 |           Bun.gc(true);
2025-02-16 02:22:19 PST | 159 |
2025-02-16 02:22:19 PST | 160 |           const rss = (process.memoryUsage.rss() / 1024 / 1024) | 0;
2025-02-16 02:22:19 PST | 161 |           expect(rss).toBeLessThan(4092);
2025-02-16 02:22:19 PST | ^
2025-02-16 02:22:19 PST | error: expect(received).toBeLessThan(expected)
2025-02-16 02:22:19 PST |
2025-02-16 02:22:19 PST | Expected: < 4092
2025-02-16 02:22:19 PST | Received: 4905
2025-02-16 02:22:19 PST |
2025-02-16 02:22:19 PST | at <anonymous> (/var/lib/buildkite-agent/builds/ip-172-31-3-37/bun/bun/test/js/bun/http/bun-serve-static.test.ts:161:23)
2025-02-16 02:22:19 PST | static > /big > stress (access .body) > arrayBuffer [5291.13ms]
2025-02-16 02:22:19 PST | Start RSS 7207
```
[OOM killer kicks in]

https://buildkite.com/bun/bun/builds/11698#01950e27-d67a-40a6-b4ec-e381cd978a17

This particular test, bun-serve-static.test.ts, makes many HTTP requests that respond with a large amount of data, repeatedly appending to a buffer internally (the .body getter forces it to stream internally). HTTP requests are processed on a dedicated thread with its own mimalloc heap. The main thread receives chunks of the HTTP response from the HTTP thread and appends them to an internal buffer (which uses mi_malloc instead of the heap-specific methods).

Note that Bun.gc() internally calls mi_collect(0).

```ts
test.each(["arrayBuffer", "blob", "bytes", "text"])(
  "%s",
  async method => {
    const byteSize = static_responses[path][method]?.size;

    const bytes = method === "blob" ? static_responses[path] : await static_responses[path][method]();

    // macOS limits backlog to 128.
    // When we do the big request, reduce the number of connections but increase the number of iterations.
    const batchSize = Math.ceil((byteSize > 1024 * 1024 ? 48 : 64) / (isWindows ? 8 : 1));
    const iterations = Math.ceil((byteSize > 1024 * 1024 ? 10 : 12) / (isWindows ? 8 : 1));

    async function iterate() {
      let array = new Array(batchSize);
      const route = `${server.url}${path.substring(1)}`;
      for (let i = 0; i < batchSize; i++) {
        array[i] = fetch(route)
          .then(res => {
            expect(res.status).toBe(200);
            expect(res.url).toBe(route);
            if (label === "access .body") {
              res.body;
            }
            return res[method]();
          })
          .then(output => {
            expect(output).toStrictEqual(bytes);
          });
      }

      await Promise.all(array);

      Bun.gc();
    }

    for (let i = 0; i < iterations; i++) {
      await iterate();
    }

    Bun.gc(true);
    const baseline = (process.memoryUsage.rss() / 1024 / 1024) | 0;
    let lastRSS = baseline;
    console.log("Start RSS", baseline);
    for (let i = 0; i < iterations; i++) {
      await iterate();
      const rss = (process.memoryUsage.rss() / 1024 / 1024) | 0;
      if (lastRSS + 50 < rss) {
        console.log("RSS Growth", rss - lastRSS);
      }
      lastRSS = rss;
    }
    Bun.gc(true);

    const rss = (process.memoryUsage.rss() / 1024 / 1024) | 0;
    expect(rss).toBeLessThan(4092);
    const delta = rss - baseline;
    console.log("Final RSS", rss);
    console.log("Delta RSS", delta);
  },
  40 * 1000,
);
```
daanx (Collaborator) commented Feb 17, 2025

Thanks -- and nice project! This is interesting, as I believe v3 should always use less memory than v2. Is this the latest dev3? (There have been a bunch of important bug fixes.)

Does this also happen if you call mi_collect(true) just before reading the RSS?
(Also, if you can give me a way to repro locally without too much hassle, that would give me a way to test further -- but no worries if this is not so easy.)
