Memory fragmentation is very high in memory store #289
Hmmm, would it be possible to get a few printouts of prometheus over a couple days?
I can't think of anything that has changed that has a high chance of a memory leak. The only thing I can really think of is if there's some tokio::spawn that is not finishing.
On Monday, September 18th, 2023 at 11:54 AM, Chris Staite wrote:
With the following configuration, the storage server running only turbo-cache got OOM killed after running for ~4 days, even though the machine has 16 GB of RAM. Not sure if there's a memory leak somewhere in the recent code?
{
"stores": {
"CAS_MAIN_STORE": {
"fast_slow": {
"fast": {
"memory": {
"eviction_policy": {
// 4gb
"max_bytes": 4000000000
}
}
},
"slow": {
"filesystem": {
"content_path": "/root/.cache/turbo-cache/content_path-cas",
"temp_path": "/root/.cache/turbo-cache/tmp_path-cas",
"eviction_policy": {
// 150gb.
"max_bytes": 150000000000,
// 2gb
"evict_bytes": 2000000000
}
}
}
}
},
"AC_MAIN_STORE": {
"fast_slow": {
"fast": {
"memory": {
"eviction_policy": {
// 500mb
"max_bytes": 500000000
}
}
},
"slow": {
"filesystem": {
"content_path": "/root/.cache/turbo-cache/content_path-cas_ac",
"temp_path": "/root/.cache/turbo-cache/tmp_path-cas_ac",
"eviction_policy": {
// 10gb.
"max_bytes": 10000000000,
}
}
}
}
}
},
"servers": [{
"listen_address": "0.0.0.0:50052",
"services": {
"cas": {
"main": {
"cas_store": "CAS_MAIN_STORE"
},
},
"ac": {
"main": {
"ac_store": "AC_MAIN_STORE"
}
},
"capabilities": {},
"bytestream": {
"cas_stores": {
"main": "CAS_MAIN_STORE",
},
// According to grpc/grpc.github.io#371 16KiB - 64KiB is optimal.
"max_bytes_per_stream": 64000, // 64kb.
}
}
}]
}
Thinking about this, I wonder if the evicting map's hashmap is too big due to so many files? If we can get stats on how many files are on disk, I can do some napkin math on it.
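The napkin math mentioned above can be sketched roughly. This is a hypothetical estimate: the 10M file count, the 40-byte key size, and the 48-byte per-entry bookkeeping overhead are all assumptions for illustration, not values from this issue.

```rust
// Hypothetical napkin math for the evicting map's hashmap overhead.
// Assumed values (NOT measured in this issue): 10M cached files, a
// 32-byte digest key plus an 8-byte size field, and ~48 bytes of
// per-entry HashMap bucket + LRU bookkeeping overhead.
fn main() {
    let num_files: u64 = 10_000_000; // assumption
    let key_bytes: u64 = 32 + 8; // digest + size field (assumption)
    let per_entry_overhead: u64 = 48; // hash bucket + LRU links (assumption)
    let total_bytes = num_files * (key_bytes + per_entry_overhead);
    // 10M entries * 88 bytes = 880 MB of pure bookkeeping, before any file data.
    println!("approx map overhead: {} MB", total_bytes / 1_000_000);
}
```

Even under these rough assumptions, the map's bookkeeping alone can reach hundreds of megabytes at tens of millions of entries, so the file count matters.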
CAS and AC metrics (after running for 16 hours, and again after restart) were attached as screenshots, not captured here.
For some reason my metrics endpoint is only showing connected clients. Is there something I need to do to enable the eviction map stats?
I'll look into this soon using some memory profilers. As for the metrics, there's probably a store that is not properly publishing downstream stores' metrics, or the version of TurboCache you are using is out of date.
Yes, it was an old version... Updated now and there are plenty of metrics. Will check to confirm whether the issue was already solved, since this was an old build. Usage after restart: (metrics output not captured here)
I reduced the in-memory cache to 3 GB, but the usage has already climbed back up there. (Metrics output not captured here.)
I don't know how much memory it says the system is using now from those metrics, but here's a bit of a breakdown of what I see: (breakdown not captured here)
14 GB of usage.
That accounts for maybe half the RAM. Can you keep track of this over a day or two and see how it is growing? I'll try to carve out some time soon to properly memory-profile it. I'm hoping it is not some zombie futures or something in the tokio runtime.
I came across something that might explain part of the issue: memory fragmentation. It might be worth trying a different allocator that deals with fragmentation better.
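Swapping the allocator in Rust is a one-line change via the `#[global_allocator]` attribute. A minimal runnable sketch below uses std's `System` allocator to show the mechanism; the jemalloc variant via the `tikv-jemallocator` crate is shown only as a comment, since it is an external dependency and an assumption about which crate the branch uses.

```rust
use std::alloc::System;

// Demonstrates the global-allocator hook using std's System allocator.
// Every heap allocation in the program now routes through GLOBAL.
#[global_allocator]
static GLOBAL: System = System;

// Hypothetical jemalloc variant (requires the external tikv-jemallocator crate):
// #[global_allocator]
// static GLOBAL: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;

fn main() {
    let buf: Vec<u8> = vec![0u8; 1024];
    println!("allocated {} bytes through the swapped-in allocator", buf.len());
}
```

Because the hook is a single static, trying out jemalloc (or any other allocator) does not require touching the rest of the codebase.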
@chrisstaite-menlo, can you try compiling this branch: https://github.com/TraceMachina/turbo-cache/tree/test-jmalloc (or just apply the latest commit in that branch)?
Looking into this, I've used a few tools to diagnose it. The only thing showing interesting results is … The way I'm testing is by modifying the … I also tried https://docs.rs/dhat/latest/dhat/. This hints that there is not a proper leak, but it is possible that somewhere the program is not dropping memory when expected. I also disabled metrics for this test, so it is possible that metrics are causing it (but unlikely). (The search continues.)
I tried building with your branch but I get: (error output not captured here)
Ah, it needed … (fix not captured here)
After restart with jemalloc: (metrics not captured here)
I might have a lead. Attached is a trace file that you can decode by uploading it to … This branch holds onto the main future in a static so it does not go out of scope, but I made sure there were no active requests when the report was generated, so everything should be idle.
@chrisstaite-menlo, can you possibly try removing FastSlowStore and MemoryStore? I'm hoping that the kernel's disk cache will make up for most of the performance loss from losing MemoryStore. I'm still looking into it, but it'd be nice to narrow down the repro case, since MemoryStore interferes with the metrics a lot.
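Narrowing the repro as suggested would mean pointing the CAS at the filesystem store directly, dropping the fast_slow/memory wrapper. A hypothetical minimal sketch, reusing the paths and sizes from the config quoted above (the exact shape the project accepts is an assumption):

```
"stores": {
  "CAS_MAIN_STORE": {
    "filesystem": {
      "content_path": "/root/.cache/turbo-cache/content_path-cas",
      "temp_path": "/root/.cache/turbo-cache/tmp_path-cas",
      "eviction_policy": {
        // 150gb
        "max_bytes": 150000000000
      }
    }
  }
}
```

With this shape, every read and write hits the filesystem store, so any remaining memory growth cannot be attributed to MemoryStore.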
One idea is that it's not a memory "leak", but rather a memory inefficiency... The way … (rest of the explanation not captured here)
@chrisstaite-menlo, before doing what I asked above, can we try this branch?
Will be Tuesday before I can try. Will give time for the jemalloc version to soak.
Just checked up after the weekend and it has only been up 19 hours, meaning it must have OOMed and restarted yesterday. This is with the jemalloc version.
Re-building with the always-copy commit now; will report back.
After running quite a lot of builds with that change, the memory looks like this (jemalloc change removed): (metrics not captured here)
Still a little early to tell, but I think that might be behaving.
Yeah, it's early to say, but there's a 50% chance this is it. There is no true leak; something is holding a pointer, because if I destroy the root future, memory only references static data (about 65 KB). I'm also pretty sure it's the memory store, because I got nothing when I used only the filesystem store.
I'm going to reopen this, as I believe it is still happening, just reduced now. From the little research I've done so far, it appears that glibc's allocator is terrible for long-lived data like the memory store, and most people recommend using jemalloc. So this ticket is now more about testing jemalloc and seeing if it's worth adding the dependency.
We need to add a soak test to our internal infrastructure. We obviously cannot block PRs for a week, but we should know if a problem surfaces.
@chrisstaite-menlo, any update on whether this change fixed your issues? I'd like to begin tuning. ref: #749
@allada, it currently gets 3.5 days before being OOM killed, which is much better than the 0.5 days previously. I have yet to add the environment variable as requested.
Have just re-started with … (environment variable not captured here)
I spoke to @chrisstaite-menlo in Slack and he said that setting … (setting not captured here)
With the size-partitioning store, memory usage has grown from 3.3 GB to 3.8 GB over 13 days. This still seems reasonably high considering the harsh 100 MB in-memory cache and only caching objects of 100 KB. Will continue to monitor whether it increases any more over time. Still, staying up longer than 48 hours is a big improvement.
Great! Are you by chance using an image built with nix? If so, it might be worth trying, because in my testing the memory seemed much more stable, since it uses … (detail not captured here)
Yes, using a nix image.
Now up to 3.9 GB. There can't be any more storage being used, since the stores are all maxed out; therefore I believe there is a small memory leak occurring somehow. Virtual memory usage is at 164.1 GB.
MemoryStore now pre-allocates max_bytes from the eviction policy, then installs a custom allocator in that pool that the store will use. In testing, memory fragmentation appears to have dropped significantly, the number of sys-calls spent managing memory decreased significantly, and user time was slightly reduced. In addition, we remove two full copies of our data when storing it in the memory store.

This PR also adds two new crates (likely to be published):

"Heap Allocator": a fixed heap allocator based on the TLSF (Two-Level Segregated Fit) algorithm. TLSF is a fast, constant-time memory allocation algorithm designed for real-time applications. It organizes free memory into segregated lists based on block sizes, allowing quick allocation and deallocation with minimal fragmentation.

"Alloc Bytes": provides custom byte buffers allocated using a user-provided allocator. It offers both mutable and immutable byte buffers, enabling efficient memory allocation strategies in systems with custom memory requirements.

closes: TraceMachina#289
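The constant-time lookup that TLSF is named for can be illustrated with a small sketch. This is hypothetical, not the crate's actual code; the 16-subdivision second level and the bit layout are assumptions. A requested size maps to a (first-level, second-level) free-list bucket in O(1) using bit operations.

```rust
// Hypothetical sketch of TLSF's two-level index mapping.
// First level: power-of-two size class (floor(log2(size))).
// Second level: linear subdivision of that class into 2^SL_INDEX_LOG2
// buckets, read straight out of the bits below the top set bit.
const SL_INDEX_LOG2: u32 = 4; // 16 subdivisions per power of two (assumption)

fn tlsf_mapping(size: usize) -> (u32, u32) {
    // Assumes size >= 2^SL_INDEX_LOG2 so the shift below cannot underflow.
    let fl = usize::BITS - 1 - size.leading_zeros(); // floor(log2(size))
    let sl = ((size >> (fl - SL_INDEX_LOG2)) as u32) & ((1u32 << SL_INDEX_LOG2) - 1);
    (fl, sl)
}

fn main() {
    // 1024 = 2^10 exactly: first-level bucket 10, second-level 0.
    assert_eq!(tlsf_mapping(1024), (10, 0));
    // 1536 = 1024 + 512: halfway through the 2^10 range, so sl = 8 of 16.
    assert_eq!(tlsf_mapping(1536), (10, 8));
    println!("ok");
}
```

Because both indices come from shifts and masks rather than list scans, allocation cost stays bounded regardless of how many free blocks exist, which is why TLSF suits a fixed pre-allocated pool like the one described above.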