Probably problem with SSO for std::string. #36

itwasntme-mk · 2020-09-03T04:47:59Z

Code:

    std::string str("~~~~~~~~~~~~~~~");

    for (int i = 0; i != 1000000; ++i)
      g_log.info("Info line: %s @%d", str, i);

Result in:

munmap_chunk(): invalid pointer
Aborted

Randomly I saw also:

munmap_chunk(): invalid size

If str has more than 15 characters problem doesn't appear. @mattiasflodin Could you take a look at that? Probably it is similar to #35 (also short string).

The text was updated successfully, but these errors were encountered:

mattiasflodin · 2020-10-03T14:26:28Z

Hi, just wanted to let you know I've reproduced this and I'll have a look at it. Thanks for the report.

serpent7776 · 2020-11-27T17:18:29Z

Is there any progress on this?

I noticed that increasing input_buffer_capacity when creating basic_log instance reduces chance of getting crash.

@mattiasflodin Are you sure using relaxed atomics is ok here? https://github.com/mattiasflodin/reckless/blob/v3.0.2/reckless/include/reckless/detail/platform.hpp#L76
Wouldn't it be better to use std::atomic?
The TODO note there is in particular concerning.

mopleen · 2020-12-09T15:02:43Z

Maybe this demo will be useful https://github.com/mopleen/reckless_demo_crash
What's interesting is that the crash does not happen on the system running libc 2.17

Also, logging only long strings does not crash either. Short string is what's triggering the problem.

mattiasflodin · 2021-04-10T12:09:54Z

@serpent7776 @mopleen Thank you for your patience with this bug and your efforts to provide reproduction demos and inspecting the code. I have been busy with some other projects so this had to be put on hold for a while. Anyway, I believe I have figured out what is going wrong. The usual suspect would of course be multithreading and incorrect use of atomics, but in fact this error is not related to that.

The problem lies in the ring buffer, which is implemented by mapping multiple virtual addresses to the same physical memory in an attempt to avoid some wraparound checks. basic_string in libstdc++ has an optimization for strings of 16 characters or less, in which it stores the data in a local array instead of allocating memory. When the string is destroyed, it performs this comparison to determine whether the memory should be deallocated or not:

_M_dispose()
{
    if (!_M_is_local())
        _M_destroy(_M_allocated_capacity);
}

_M_is_local() operates by comparing the buffer pointer to the address of the internal array. If they don't match, it assumes that the memory was allocated and attempts to deallocate it.

For some reason that I'm currently looking into, the same virtual address is not being used when the object is destroyed as when it is created (even though they refer to the same physical memory), which causes _M_is_local to return false and basic_string then attempts to free a buffer that was never allocated.

It may turn out that my ring buffer is too clever for its own good and that I need to change its implementation. I hope not, because that could make the performance of pushing very large objects unintuitive. But so far I believe there is just a bug in some pointer arithmetic related to the buffer.

basic_log::output_worker calculates frame positions by taking the "front" address (current read pointer) from mpsc_ring_buffer and then advancing that until it reaches the end of the batch. However, if the batch size is large enough to go past the end of the ring buffer then no wrap around is applied to the frame pointer. This isn't strictly a buffer overrun because, due to the nature of the ring buffer implementation, the same physical memory is mapped in a second time at the end and we are in fact meant to go past the end in situations where objects overlap the end. However, we can run into trouble when an object was constructed in the first virtual memory-map and then accessed through the second virtual-memory map. For plain old data it works fine, but if the object has code that depends on the object's this-pointer staying consistent (a reasonable expectation) then formatting or destroying the object can lead to undefined behavior. This surfaced as github issue #36 in which the libstdc++ implementation of std::string had optimizations that would make use of the address of a class member to determine which deallocation strategy must be used. The fix is fairly simple: instead of using the frame pointer as loop invariant we use the ring buffer's 64-bit position to determine when the end is reached, and whenever we need to use the frame pointer we make sure that it is consistently wrapped to always start in the first virtual-memory block.

mattiasflodin · 2021-04-10T16:02:23Z

Resolved in v3.0.3.

serpent7776 · 2021-04-16T19:05:13Z

Seems it's working fine now, thanks.

pastorom mentioned this issue Sep 28, 2020

Consistent crash with example from README #37

Closed

mattiasflodin added the bug label Apr 10, 2021

mattiasflodin closed this as completed Apr 10, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Probably problem with SSO for std::string. #36

Probably problem with SSO for std::string. #36

itwasntme-mk commented Sep 3, 2020

mattiasflodin commented Oct 3, 2020

serpent7776 commented Nov 27, 2020

mopleen commented Dec 9, 2020 •

edited

Loading

mattiasflodin commented Apr 10, 2021 •

edited

Loading

mattiasflodin commented Apr 10, 2021

serpent7776 commented Apr 16, 2021

Probably problem with SSO for std::string. #36

Probably problem with SSO for std::string. #36

Comments

itwasntme-mk commented Sep 3, 2020

mattiasflodin commented Oct 3, 2020

serpent7776 commented Nov 27, 2020

mopleen commented Dec 9, 2020 • edited Loading

mattiasflodin commented Apr 10, 2021 • edited Loading

mattiasflodin commented Apr 10, 2021

serpent7776 commented Apr 16, 2021

mopleen commented Dec 9, 2020 •

edited

Loading

mattiasflodin commented Apr 10, 2021 •

edited

Loading