[Bug]: StreamingResponse attributes shared between different StreamingResponse instances #2320

Open
daniel-ideable opened this issue Feb 24, 2025 · 2 comments · May be fixed by #2321
Labels
bug Something isn't working Python Change/fix applies to Python. If all three, use the 'JS & dotnet & Python' label

Comments


daniel-ideable commented Feb 24, 2025

Language

Python

Version

latest

Description

The StreamingResponse class in the Python implementation has a critical flaw: most of its properties are defined at the class level, making them static variables. This means these properties (_stream_id, _next_sequence, _queue, etc.) are shared between all instances of StreamingResponse. This shared state leads to race conditions and incorrect behaviour when handling concurrent streaming requests.

If conversations get mixed, this becomes a security issue: one user can receive content generated for another.
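A minimal sketch of the underlying Python behaviour, for readers who have not run into it before. The two classes below are hypothetical stand-ins, not the teams-ai implementation; only the attribute name _queue and the method name queue_text_chunk() are borrowed from the description above. It shows that a mutable attribute declared in the class body is a single object that every instance mutates, while one created in __init__ is private to each instance.

```python
class SharedStateStream:
    """Hypothetical stand-in, NOT the teams-ai class."""

    # Declared in the class body: one list object shared by every instance.
    _queue = []

    def queue_text_chunk(self, text):
        # In-place mutation goes to the shared class-level list.
        # (A plain assignment such as `self._queue = [...]` would instead
        # create a separate attribute on this one instance.)
        self._queue.append(text)


class IsolatedStream:
    """Hypothetical stand-in with per-instance state."""

    def __init__(self):
        # Created per instance: each stream owns its own list.
        self._queue = []

    def queue_text_chunk(self, text):
        self._queue.append(text)


stream_a, stream_b = SharedStateStream(), SharedStateStream()
stream_a.queue_text_chunk("A1")
stream_b.queue_text_chunk("B1")
shared_a_queue = list(stream_a._queue)  # contains B's chunk as well

iso_a, iso_b = IsolatedStream(), IsolatedStream()
iso_a.queue_text_chunk("A1")
iso_b.queue_text_chunk("B1")
isolated_a_queue = list(iso_a._queue)  # contains only A's chunk
```

How far each attribute of the real class is actually shared depends on whether it is mutated in place or rebound through self, so the sketch only covers the clearest case.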

Imagine a bot handling two concurrent user requests (Request A and Request B):

  1. Request A: A user starts a conversation, and a StreamingResponse instance (let's call it stream_a) is created. It receives a unique _stream_id (e.g., "stream_id_A") when its first queued activity is answered. The bot begins processing the request and queues updates using queue_text_chunk().

  2. Request B: Before stream_a finishes sending all its updates and calls end_stream(), a second user starts a conversation. A new StreamingResponse instance (stream_b) is created. Crucially, because the properties are class-level, stream_b overwrites the _stream_id of stream_a (e.g., with "stream_id_B") when its own first queued activity is answered.

  3. Race condition (the class is not prepared for a concurrent environment) and incorrect sequence numbers:

    • _stream_id problem (as in step 2): If a task related to stream_b is scheduled and executed before a task related to stream_a finishes, stream_a might start using stream_b's _stream_id, sending responses to the wrong client.
    • _next_sequence problem: stream_b starts sending updates with sequence numbers that fall in the middle of a sequence rather than at the start the client expects for Request B. If stream_a had already sent updates with sequence numbers 1 and 2, stream_b might start at sequence number 3, even though it is the beginning of the conversation for Request B.
  4. Incorrect response: When the response for stream_a finally returns to the client, it uses the context of stream_b. That is, the stream_a response could contain chunks belonging to a different stream, or it might be sent to the client that requested stream_b, resulting in a mixed-up or incorrect response.
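The scenario above can be approximated without Teams using plain asyncio. This is a rough model under stated assumptions, not the library's code: SharedStream, send_chunk(), handle() and the "stream_id_A"/"stream_id_B" values are all hypothetical, and the state is written through the class on purpose so that it behaves as shared, which is the situation the steps describe.

```python
import asyncio


class SharedStream:
    """Hypothetical model of the failure mode; NOT the teams-ai implementation."""

    # State kept on the class so that every instance reads and writes the same values.
    _stream_id = None
    _next_sequence = 1

    def __init__(self, name):
        self.name = name
        self.started = False
        self.sent = []  # (stream_id, sequence, text) tuples emitted by this instance

    async def send_chunk(self, text):
        cls = type(self)
        if not self.started:
            # First activity of this stream: the ID that comes back is stored
            # on the class, overwriting whatever another stream put there.
            cls._stream_id = f"stream_id_{self.name}"
            self.started = True
        sequence = cls._next_sequence
        cls._next_sequence += 1
        await asyncio.sleep(0)  # simulated network call; other tasks run here
        # After resuming, the shared ID may belong to a different stream.
        self.sent.append((cls._stream_id, sequence, text))


async def handle(stream, chunks):
    for chunk in chunks:
        await stream.send_chunk(chunk)


async def main():
    stream_a, stream_b = SharedStream("A"), SharedStream("B")
    # Request B arrives while Request A is still streaming.
    await asyncio.gather(
        handle(stream_a, ["a1", "a2"]),
        handle(stream_b, ["b1", "b2"]),
    )
    return stream_a.sent, stream_b.sent


a_sent, b_sent = asyncio.run(main())
print("stream_a sent:", a_sent)
print("stream_b sent:", b_sent)
```

In this model stream_a ends up tagging its chunks with stream_b's ID, and stream_b's first chunk does not carry sequence number 1, which matches steps 3 and 4.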

Reproduction Steps

  1. Implement a dummy StreamingResponse (for example, one that consumes a LangChain agent).
  2. Add a StreamingResponse that sends the astream response from the Agent.
  3. Create a StreamingResponse instance.
  4. Call the agent's astream method with the user's input.
  5. Use queue_text_chunk() to queue response chunks.
  6. Ensure the logic starts an async task to process the queue.
  7. Testing (Multi-User - Preferred):
    7.1. Using Teams (or similar), have two users (User A and User B) interact with the bot concurrently.
    7.2. User A sends a message.
    7.3. Immediately after, User B sends a message (before User A's response completes).
  8. Testing (Single User - Alternative):
    8.1. If multi-user testing is not possible, use a single user.
    8.2. Send a message to the bot.
    8.3. Immediately send another message before the first response finishes.
daniel-ideable added the bug label Feb 24, 2025
lilyydu added the Python label Feb 27, 2025
@saichaitanya1729

Hi @daniel-ideable, I am not sure if this is related, but I get an intermittent error saying "Missing streamId from end stream activity." while conversing with the bot.

I use a custom model that yields chunks, which are sent to the StreamingResponse class from the teams-ai library, which then streams the responses.
I see this error in the logs intermittently: (BadSyntax) Missing streamId from end stream activity.

The error message seems pretty straightforward, but I am not able to find the root cause. Looking at the send_activity() function in StreamingResponse, the response_id is set as the stream_id, so I am not sure why stream_id is missing in the end_stream() activity.

Just wondering if you have encountered something like this during your development.

@daniel-ideable
Author

I am not sure if I encountered the same issue, but I think it is very likely related to the core concurrency issue. The root cause lies in how the StreamingResponse class handles state. Specifically, certain attributes, including stream_id, are defined at the class level rather than the instance level.

When multiple concurrent streaming requests occur (such as those generated by asynchronous tasks), the stream_id can be overwritten or missing while waiting for an assignment. As you mentioned, there's a distinct possibility that the end_stream() activity is executing before the stream_id has been correctly assigned in the initial response of the streaming sequence.

The most frequent problems I encountered were writing to a streaming response that had already finished, and a new stream continuing a sequence-number series already started by another stream.
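For comparison, here is a sketch of what instance-level state could look like. It is only an illustration of the general remedy, not necessarily what #2321 does; IsolatedStream, send_chunk(), handle() and the ID format are hypothetical names, and the real class has more state than this.

```python
import asyncio


class IsolatedStream:
    """Hypothetical sketch with per-instance state; NOT the teams-ai class."""

    def __init__(self, name):
        self.name = name
        # Everything that describes one stream lives on the instance.
        self._stream_id = None
        self._next_sequence = 1
        self._ended = False
        self.sent = []  # (stream_id, sequence, text) tuples emitted by this instance

    async def send_chunk(self, text):
        if self._ended:
            # Guards against writing to a stream that has already finished.
            raise RuntimeError("stream already ended")
        if self._stream_id is None:
            self._stream_id = f"stream_id_{self.name}"
        sequence = self._next_sequence
        self._next_sequence += 1
        await asyncio.sleep(0)  # simulated network call; other tasks run here
        self.sent.append((self._stream_id, sequence, text))

    async def end_stream(self):
        self._ended = True


async def handle(stream, chunks):
    for chunk in chunks:
        await stream.send_chunk(chunk)
    await stream.end_stream()


async def main():
    stream_a, stream_b = IsolatedStream("A"), IsolatedStream("B")
    await asyncio.gather(
        handle(stream_a, ["a1", "a2"]),
        handle(stream_b, ["b1", "b2"]),
    )
    return stream_a.sent, stream_b.sent


fixed_a_sent, fixed_b_sent = asyncio.run(main())
print("stream_a sent:", fixed_a_sent)
print("stream_b sent:", fixed_b_sent)
```

With the state on the instance, each stream keeps its own ID and its sequence starts at 1 regardless of how the two tasks interleave.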
