Audio Live Mode with Tool Use #62

dselman · 2025-02-03T23:33:23Z

Description of the feature request:

The documentation is quite vague on this issue: https://ai.google.dev/api/multimodal-live#function-calling

Specifically, it says "Audio inputs and audio outputs negatively impact the model's ability to use function calling," which is what I think I am seeing. When I use tools with audio input and output, the tool responses I send back seem to be interpreted as part of the conversation, or at least the audio side of the conversation gets very confused.

Is this a bug or a known limitation? Should we expect improvements for this use case?

What problem are you trying to solve with this feature?

I would like to conduct an audio live mode interaction (audio in and out) while calls to some tools are performed in the background.

Any other information you'd like to share?

No response

hapticdata · 2025-02-08T14:41:11Z

#47 things are always getting better 😉

hapticdata added the duplicate label Feb 8, 2025
