Log LLM tool call for streamed response #545
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@            Coverage Diff            @@
##              main      #545   +/-  ##
=========================================
  Coverage   100.00%   100.00%
=========================================
  Files          133       133
  Lines        10263     10308    +45
  Branches      1399      1405     +6
=========================================
+ Hits         10263     10308    +45

☔ View full report in Codecov by Sentry.
'message': final_completion.choices[0].message if final_completion.choices else None,
'usage': final_completion.usage,
I worry about this having a significantly different shape from the other response data dicts.
This is the same shape as the existing non-streamed chat completion, which is why it displays nicely in the UI.
logfire/logfire/_internal/integrations/llm_providers/openai.py
Lines 81 to 91 in af707e2
def on_response(response: ResponseT, span: LogfireSpan) -> ResponseT:
    """Updates the span based on the type of response."""
    if isinstance(response, LegacyAPIResponse):  # pragma: no cover
        on_response(response.parse(), span)  # type: ignore
        return cast('ResponseT', response)
    if isinstance(response, ChatCompletion):
        span.set_attribute(
            'response_data',
            {'message': response.choices[0].message, 'usage': response.usage},
        )
I'll admit it's not exactly the same: the message object here is a ParsedChatCompletionMessage, a subclass of the non-streamed chat completion message ChatCompletionMessage. It has the "parsed" and tool call "parsed_arguments" fields added. These "ParsedX" classes are the ones returned by the client.beta.chat.completions.parse method.
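For reference, here's a minimal sketch of where these "ParsedX" objects come from (the CityInfo model and prompt below are illustrative, not from this PR):

from pydantic import BaseModel
from openai import OpenAI

class CityInfo(BaseModel):
    name: str
    population: int

client = OpenAI()
# The beta parse method returns a ParsedChatCompletion whose message is a
# ParsedChatCompletionMessage with the extra "parsed" field populated.
completion = client.beta.chat.completions.parse(
    model='gpt-4o',
    messages=[{'role': 'user', 'content': 'Tell me about Paris.'}],
    response_format=CityInfo,
)
message = completion.choices[0].message  # ParsedChatCompletionMessage
print(message.parsed)  # a CityInfo instance (or None if the model refused)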
Aside: it would be handy to have the response dicts that get special handling on the frontend documented or codified (e.g. using TypedDicts), ideally with the order of precedence. For example, I found that "combined_chunk_content" takes precedence over "message", which is why I excluded it.
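Something like this, perhaps (a rough sketch only; these TypedDicts don't exist in the codebase today, and the field list just mirrors the attributes discussed in this thread):

from typing import Any, TypedDict

class ChatResponseData(TypedDict, total=False):
    # Shape set by on_response for (non-streamed) chat completions.
    message: Any  # ChatCompletionMessage or ParsedChatCompletionMessage
    usage: Any  # token usage stats
    # Shape set for streamed responses; per the comment above, the frontend
    # gives this precedence over "message" when both are present.
    combined_chunk_content: str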
ah, i didn't realise we were already using this shape. yes, creating some types sounds good. @dmontagu any thoughts? should we consider these shapes stable?
Thanks! Are you happy for me to merge?
cc @willbakst in case you want to take a look, but I won't wait to merge.
Yes, happy to merge! Thanks
The current LLM streamed-response parsing collects only the message content/text. This PR generalizes it to allow collecting any state needed to generate the response_data. Streamed OpenAI chat completions now include tool calls in the Logfire UI. Behaviour for Anthropic and for OpenAI completions (non-chat) remains the same.
Fixes #542
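For anyone skimming, the gist of the generalization (a simplified sketch, not the PR's actual implementation; the class and method names here are made up):

from dataclasses import dataclass, field
from typing import Any

@dataclass
class StreamState:
    """Accumulates streamed chunks and builds the final response_data."""

    content_chunks: list[str] = field(default_factory=list)
    tool_calls: dict[int, dict[str, Any]] = field(default_factory=dict)

    def record_chunk(self, chunk: Any) -> None:
        delta = chunk.choices[0].delta if chunk.choices else None
        if delta is None:
            return
        if delta.content:
            self.content_chunks.append(delta.content)
        for tc in delta.tool_calls or []:
            # Tool call deltas arrive in pieces keyed by index: the function
            # name comes once, the JSON arguments as concatenated fragments.
            call = self.tool_calls.setdefault(tc.index, {'name': '', 'arguments': ''})
            if tc.function and tc.function.name:
                call['name'] = tc.function.name
            if tc.function and tc.function.arguments:
                call['arguments'] += tc.function.arguments

    def get_response_data(self) -> dict[str, Any]:
        return {
            'combined_chunk_content': ''.join(self.content_chunks),
            'tool_calls': list(self.tool_calls.values()) or None,
        }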