Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash in AudioStreamAAudio::read #1995

Open
sonicdebris opened this issue Apr 8, 2024 · 14 comments
Open

Crash in AudioStreamAAudio::read #1995

sonicdebris opened this issue Apr 8, 2024 · 14 comments
Assignees

Comments

@sonicdebris
Copy link

sonicdebris commented Apr 8, 2024

Android version(s): 9 to 14
Android device(s): several (see below)
Oboe version: 1.7.5
App name used for testing: play store reports

We have been getting reports of this crash for a (long) while from google play dev console:

  #00  pc 0x000000000004e574  /apex/com.android.runtime/lib64/bionic/libc.so (abort+180)
  #01  pc 0x0000000000038b40  /system/lib64/libaaudio_internal.so (aaudio::IsochronousClockModel::convertLatestTimeToPosition(long) const+208)
  #02  pc 0x00000000000344d8  /system/lib64/libaaudio_internal.so (aaudio::AudioStreamInternalCapture::processDataNow(void*, int, long, long*)+272)
  #03  pc 0x00000000000337f4  /system/lib64/libaaudio_internal.so (aaudio::AudioStreamInternal::processData(void*, int, long)+332)
  #04  pc 0x00000000007c7e80 ourlibrary.so (oboe::AudioStreamAAudio::read(void*, int, long)+643) (BuildId: f367ae05b14f6252284b0d34b9707bb8ae2a962a)
[...]
  #45  pc 0x00000000007c791c  ourlibrary.so (oboe::AudioStreamAAudio::callOnAudioReady(AAudioStreamStruct*, void*, int)+524) (BuildId: f367ae05b14f6252284b0d34b9707bb8ae2a962a)
  #46  pc 0x0000000000022430  /system/lib64/libaaudio_internal.so (aaudio::AudioStream::maybeCallDataCallback(void*, int)+192)
  #47  pc 0x000000000002411c  /system/lib64/libaaudio_internal.so (aaudio::AudioStreamLegacy::callDataCallbackFrames(unsigned char*, int)+300)
  #48  pc 0x000000000002477c  /system/lib64/libaaudio_internal.so (aaudio::AudioStreamLegacy::processCallbackCommon(int, void*)+956)
  #49  pc 0x00000000000792c4  /system/lib64/libaudioclient.so (android::AudioTrack::processAudioBuffer()+2372)
  #50  pc 0x00000000000785c8  /system/lib64/libaudioclient.so (android::AudioTrack::AudioTrackThread::threadLoop()+312)

I think it's some assert in libaaudio. We tried to investigate our code, make it more robust but we weren't able to affect the rate of this particular crash.

In short, we are reading from the input stream using the syncrhonous API while in the output stream callback.

Interestingly, the distribution of devices with the most occurrences doesn't match the one for the app install base. Here are the top 10 devices (accounting for ~35% of the crashes):

motorola austin           Mediatek MT6833V/NZA
Redmi sunny               Qualcomm SM6150
T-Mobile Bethpage         Mediatek MT6833
T-Mobile Augusta          Mediatek MT6833
T-Mobile TorreyPines      Mediatek MT6833
Redmi camellian           Mediatek MT6833V/ZA
Redmi secret              Mediatek MT6785
T-Mobile Sprout           Mediatek MT6833V/ZA
Redmi miel                Mediatek MT6781V/CD
POCO rosemary             Mediatek MT6785

More interestingly, the Mediatek MT6833 is by far the most represented in the whole list (a couple hundred devices).

Hopefully someone here has some hunch on what we might be doing wrong, or a workaround for some known device-specific issue.

@sonicdebris sonicdebris added the bug label Apr 8, 2024
@philburk
Copy link
Collaborator

I notice from the crash dump that the output callback is using Legacy mode and the input read() is using MMAP.
That situation may be more common on the MT6833.

There could also be timing differences that cause race conditions.

Are you using a shared_ptr for your stream variables?

std::shared_ptr<oboe::AudioStream> mStream;

That can help prevent some use-after-free bugs.

@philburk
Copy link
Collaborator

philburk commented Apr 10, 2024

Do you think this could be happening when the user plugs in or unplugs a headset?

@flamme
Copy link
Collaborator

flamme commented Apr 10, 2024

It is hard to tell if there is any headset connection change just with the stack trace. We used to have an old bug b/220061515 reporting a similar crash with sound amplifier on android S. It is closed due to no longer reproduced. That may be the same root cause while the issue is not really fixed. I will take a look.

@flamme
Copy link
Collaborator

flamme commented Apr 11, 2024

Looking more into the code, it looks like this could be caused by the audio stream is freed when the data callback is fired. In aaudio, the audio stream is held by the clients. @sonicdebris, could you please share if you are using a shared pointer for your stream variables as Phil mentioned above? It will also be good to know if you will free the stream immediately when receiving disconnect error.

@sonicdebris
Copy link
Author

sonicdebris commented Apr 17, 2024

Hello, sorry for the delay in the response.

We are using shared pointers for the streams, and using AudioStreamBuilder::openStream to set them up.

We handle stream errors by implementing AudioStreamErrorCallback::onError(), where we stop and close the stream (and reset the pointer).

We protect all accesses to the stream with a mutex, and in the audio callback we do a try-lock, followed by a null-check on the stream pointer, so the audio callback will just do nothing in case some other function is messing with the stream or has reset the pointer.

I might add that we have the same issue reported on crashlytics, where we log when onError is called and also when audio route changes happen (we listen to such events on the jvm side by using AudioManager.registerAudioDeviceCallback), but in the logs associated to the crashes we don't see any such events related to stream errors or route changes.

@philburk
Copy link
Collaborator

@sonicdebris - you wrote:

We handle stream errors by implementing AudioStreamErrorCallback::onError(), where we stop and close the stream (and reset the pointer).

Do you return true from your onError() call? If you return false then Oboe might close the stream again!

@sonicdebris
Copy link
Author

Yes, we do return true

@philburk
Copy link
Collaborator

philburk commented May 8, 2024

The reason you see more crashes on MTK6833 may be because Mediatek may have set compiler options that are detecting numeric overflows. Or there may be lower memory on those devices.

@philburk
Copy link
Collaborator

I notice that you are using Oboe v1.7.5.

Please try using the latest version v1.9.0 from GitHub.
We plan to release a prebuilt Oboe library this week or next.
We have done a lot of work to make Oboe use shared_ptr as much as possible and to avoid raw pointers.
That seems to help.

@sonicdebris
Copy link
Author

Thanks for the heads-up! I do not see a tag/release for 1.9.0 but I guess I can just reference this commit:
e38c228

@philburk
Copy link
Collaborator

philburk commented Jul 1, 2024

V1.9.0 is released now.

@philburk
Copy link
Collaborator

philburk commented Jul 1, 2024

but in the logs associated to the crashes we don't see any such events related to stream errors or route changes.

So the crashes just happen at odd times unrelated to peripheral disconnects.

Hmmm. Is there any chance that you are storing a pointer to C++ object as a long data value in a Java Object?
If the Java object is destroyed then do you free the C++ object?
Is it possible that object is getting garbage collected prematurely?
I ask because I had a similar bug with a Java app a long time ago and it acted similarly.

@philburk
Copy link
Collaborator

philburk commented Jul 8, 2024

I recommend statically allocating the object that contains your audio engine.
Then you know it will not get deleted while audio is running.

I will close this if I do not hear back.

@sonicdebris
Copy link
Author

We deployed a client build with oboe 1.9.0 but the rate for the crash hasn't been affected.

Hmmm. Is there any chance that you are storing a pointer to C++ object as a long data value in a Java Object?

We are doing something similar, but not directly. The oboe stream objects are part of an "AudioIO" class, which is in turn the native peer of a java object. We use djinni to have JNI bindings automatically generated, but in principle it works in such way that the "native peer long" field is the address of a shared pointer. When the java object is gone (which happens based on a mechanism relying on phantom references), the native peer's shared pointer is released and eventually the destructor for the "AudioIO" C++ class is called. There, we throw an assert if the IO had not already been stopped explicitly (we do try to enforce that IO is started/stopped precisely when needed), but on prod builds after that we still perform the streams stop and destruction (under protection by the mutex mentioned before).

I'll investigate more in this direction, maybe there is something weird happening in the destructor, thanks for the suggestions!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants