Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dlt-user: fix memory issues #667

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

alexmohr
Copy link
Contributor

@alexmohr alexmohr commented Jul 17, 2024

  • improve free after failures in init
  • check if freeing in some methods
  • add missing free in unit tests

This fixed #522 on my machines but your that's not the main purpose of this PR. Therefore before resolving #522 more testing is required.

  1. Fixes the following address sanitizer issue
exiting
    #0 0x7ffff7625fc6 in __interceptor_sigaltstack ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:9986
    #1 0x7ffff768c493 in __asan::PlatformUnpoisonStacks() ../../../../src/libsanitizer/asan/asan_posix.cpp:44
    #2 0x7ffff769202c in __asan_handle_no_return ../../../../src/libsanitizer/asan/asan_rtl.cpp:612
    #4 0x7ffff7073ac2 in start_thread nptl/pthread_create.c:442
    #5 0x7ffff710584f  (/lib/x86_64-linux-gnu/libc.so.6+0x12684f)

 at offset 160 in frame

  This frame has 2 object(s):
    [32, 88) 'log_new' (line 5017)
    [128, 160) 'handle' (line 5016) <== Memory access at offset 160 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
Thread T37879 created by T0 here:
    #0 0x7ffff762a685 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cpp:216

SUMMARY: AddressSanitizer: stack-buffer-overflow ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:9986 in __interceptor_sigaltstack
Shadow bytes around the buggy address:
  0x10007e637b10: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 f8 f2
  0x10007e637b20: f2 f2 00 00 00 00 00 00 00 00 00 00 00 00 00 06
  0x10007e637b30: f2 f2 f2 f2 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007e637b40: 04 f3 f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00
  0x10007e637b50: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
=>0x10007e637b60: 00 00 00 f2 f2 f2 f2 f2 00 00 00 00[f3]f3 f3 f3
  0x10007e637b70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007e637b80: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
  0x10007e637b90: f1 f1 00 00 f2 f2 f8 f2 f2 f2 f8 f8 f2 f2 00 00
  0x10007e637ba0: 00 00 00 00 00 00 00 00 00 00 00 f3 f3 f3 f3 f3
  0x10007e637bb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==3067906==ABORTING
Process finished with exit code 1
  1. Fixes some minor sanitzer issues in the unit tests due to the leaked objects.
  2. Fixes a crash in libdlt due to presumably double freeing
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1  0x0000007cd8711fe4 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2  0x0000007cd86cde20 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x0000007cd86baf0c in __GI_abort () at abort.c:79
#4  0x0000007cd8705ea8 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7cd87e4998 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#5  0x0000007cd871c080 in malloc_printerr (str=str@entry=0x7cd87df978 "malloc_consolidate(): unaligned fastbin chunk detected") at malloc.c:5659
#6  0x0000007cd871c9d8 in malloc_consolidate (av=av@entry=0x7cd882aa78 <main_arena>) at malloc.c:4745
#7  0x0000007cd871df64 in _int_free (av=0x7cd882aa78 <main_arena>, p=0x5aafe40790, have_lock=<optimized out>) at malloc.c:4669
#8  0x0000007cd87208f8 in __GI___libc_free (mem=<optimized out>) at malloc.c:3386
#9  0x0000007cd8bbb858 in dlt_free () at /src/lib/dlt_user.c:1190
#10 0x0000007cd86d08fc in __cxa_finalize (d=0x7cd8bf0000) at cxa_finalize.c:83
#11 0x0000007cd8bb9de0 in __do_global_dtors_aux () from /usr/lib/libdlt.so.2
#12 0x0000007cdeafcee8 in _dl_call_fini (closure_map=closure_map@entry=0x7cdeacf530) at dl-call_fini.c:43
#13 0x0000007cdeb002d0 in _dl_fini () at dl-fini.c:78
#14 0x0000007cd86d0398 in __run_exit_handlers (status=0, listp=0x7cd882a698 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:116
#15 0x0000007cd86d046c in __GI_exit (status=<optimized out>) at exit.c:146
#16 0x0000007cd86bb1c4 in __libc_start_call_main (main=main@entry=0x5a80cc67ec <main(int, char const* const*)>, argc=argc@entry=67, argv=argv@entry=0x7febd193c8)
    at ../sysdeps/nptl/libc_start_call_main.h:74
#17 0x0000007cd86bb29c in __libc_start_main_impl (main=0x5a80cc67ec <main(int, char const* const*)>, argc=67, argv=0x7febd193c8, init=<optimized out>, fini=<optimized out>,
    rtld_fini=<optimized out>, stack_end=<optimized out>) at ../csu/libc-start.c:389
#18 0x0000005a80cf5530 in _start () at ../sysdeps/aarch64/start.S:81

This stacktrace above has been detected in an endurance test of dlt. After making the changes the issue did not occur during a 30 hour test again.
Please note that our internal testing was not based on master but the release we're currently using on the actual systems and this patch has been ported from that release to master.

The program was tested solely for our own use cases, which might differ from yours.
Licensed under Mozilla Public License Version 2.0

Alexander Mohr, [email protected], Mercedes-Benz Tech Innovation GmbH, imprint

@alexmohr alexmohr marked this pull request as draft August 1, 2024 17:06
@alexmohr alexmohr marked this pull request as ready for review August 1, 2024 17:37
@@ -485,7 +488,8 @@ DltReturnValue dlt_init(void)

/* Initialize common part of dlt_init()/dlt_init_file() */
if (dlt_init_common() == DLT_RETURN_ERROR) {
dlt_user_init_state = INIT_UNITIALIZED;
dlt_user_init_state = INIT_DONE;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to leave the state set to INIT_IN_PROGRESS and check for this state in dlt_free() as well.

It is quite confusing to set state to INIT_DONE in failure case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure if I'm understanding you correct, because the whole idea is to prevent having init and free running at the same time.
Would it be acceptable to add another state into the enum "INIT_ERROR" to be used in this case?

memcpy(log->buffer + log->size, data, length);
if (data != NULL) {
memcpy(log->buffer + log->size, data, length);
}
log->size += length;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log size should only be increased if copy to buffer has happened. please move into brackets of if-statement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops.. done.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same for args_num?

@@ -2042,7 +2058,7 @@ static DltReturnValue dlt_user_log_write_generic_attr(DltContextData *log, const
if (log == NULL)
return DLT_RETURN_WRONG_PARAMETER;

if (!DLT_USER_INITALIZED) {
if (!DLT_USER_INITALIZED_NOT_FREEING) {
dlt_vlog(LOG_WARNING, "%s dlt_user_init_state != INIT_DONE\n", __FUNCTION__);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor remark:
this can now fail due to many different reasons. it would be good to log the state of
dlt_user_freeing and dlt_user_init_state as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, please check if the log message is to your liking now :) If not please let me know how it should look like.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar messages is used in other places. if you like the message I'll change in these places as well.

* improve free after failures in init
* check if freeing in some methods
* add missing free in unit tests

Signed-off-by: Alexander Mohr <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

dlt_init() and dlt_free() functions are not thread safe
2 participants