
Issue streaming Chinese characters using azure-ai-inference SDK #39565

Closed
bombert opened this issue Feb 5, 2025 · 12 comments
Assignees
Labels
AI Model Inference Issues related to the client library for Azure AI Model Inference (\sdk\ai\azure-ai-inference) bug This issue requires a change to an existing behavior in the product in order to be resolved. Client This issue points to a problem in the data-plane of the library. customer-reported Issues that are reported by GitHub users external to the Azure organization. needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team Service Attention Workflow: This issue is responsible by Azure service team.

Comments

@bombert

bombert commented Feb 5, 2025

Type of issue

Code doesn't work

Description

The Azure AI Foundry SDK (azure-ai-inference) cannot handle streaming responses that contain Chinese characters.

Bug Recreation

from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential
key = "<your_key>"
endpoint = "<end_point>"
model_name = "<model_name>"

client = ChatCompletionsClient(
    endpoint=endpoint,
    credential=AzureKeyCredential(key),
)

response = client.complete(
    stream=True,
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="9.11 and 9.8, which is greater? Please response in Chinese!!!")
    ],
    max_tokens=2048,
    model=model_name
)

for update in response:
    if update.choices:
        print(update.choices[0].delta.content or "", end="")

FATAL

In azure/ai/inference/models/_patch.py, line 342:

line_list: List[str] = re.split(r"(?<=\n)", element.decode("utf-8"))

throws a UnicodeDecodeError when element ends partway through a multi-byte UTF-8 character, e.g. b'\xe6\x8e\xa8\xe8\x8d' (the last two bytes are an incomplete character).
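A minimal reproduction of just the decode failure, independent of the SDK. The byte string below is '推' (b'\xe6\x8e\xa8') followed by the first two bytes of the three-byte encoding of '荐' (b'\xe8\x8d\x90'), i.e. a chunk boundary that falls mid-character:

```python
# '推' encodes to b'\xe6\x8e\xa8'; '荐' encodes to b'\xe8\x8d\x90'.
# A stream chunk cut after the first two bytes of '荐' cannot be
# decoded on its own:
partial = b"\xe6\x8e\xa8\xe8\x8d"

try:
    partial.decode("utf-8")
except UnicodeDecodeError as e:
    print(e.reason)  # → unexpected end of data
```

This is exactly the error shown in the traceback further down the thread: the server is free to split SSE chunks at arbitrary byte offsets, so any multi-byte script (Chinese, Japanese, emoji, …) can trigger it.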

Fixing Suggestion

Catch the exception, buffer the incomplete bytes, and prepend them to the next chunk.

try:
    if self._incomplete_element:
        element = self._incomplete_element + element
        self._incomplete_element = b""
    line_list: List[str] = re.split(r"(?<=\n)", element.decode("utf-8"))
except UnicodeDecodeError:
    self._incomplete_element = element
    return False
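The same buffering can be expressed with the standard library's incremental decoder, which holds back a trailing partial multi-byte sequence instead of raising. This is a sketch of the general technique, not the SDK's actual patch:

```python
import codecs

# An incremental UTF-8 decoder buffers incomplete trailing bytes
# across calls instead of raising UnicodeDecodeError.
decoder = codecs.getincrementaldecoder("utf-8")()

# Simulate a stream split mid-character: the first chunk ends two
# bytes into the three-byte encoding of '荐'.
chunk1 = b"\xe6\x8e\xa8\xe8\x8d"   # '推' + first 2 bytes of '荐'
chunk2 = b"\x90"                   # final byte of '荐'

part1 = decoder.decode(chunk1)     # '推' — partial bytes are held back
part2 = decoder.decode(chunk2)     # '荐' — completed on the next chunk
print(part1 + part2)               # → 推荐
```

Compared with catching UnicodeDecodeError, the incremental decoder also handles the case where the complete prefix of a chunk is usable immediately, rather than deferring the whole chunk to the next read.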
@github-actions github-actions bot added customer-reported Issues that are reported by GitHub users external to the Azure organization. needs-triage Workflow: This is a new issue that needs to be triaged to the appropriate team. question The issue doesn't require a change to the product in order to be resolved. Most issues start as that labels Feb 5, 2025
@bombert bombert changed the title azure ai foundry SDK inference cannot handle steam contains Chinese character azure ai foundry SDK inference cannot handle steaming contains Chinese character Feb 5, 2025
@bombert bombert changed the title azure ai foundry SDK inference cannot handle steaming contains Chinese character azure ai foundry SDK inference cannot handle the steamings containing Chinese character Feb 5, 2025
@xiangyan99 xiangyan99 added bug This issue requires a change to an existing behavior in the product in order to be resolved. Service Attention Workflow: This issue is responsible by Azure service team. AI and removed question The issue doesn't require a change to the product in order to be resolved. Most issues start as that needs-triage Workflow: This is a new issue that needs to be triaged to the appropriate team. labels Feb 5, 2025
@github-actions github-actions bot added the needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team label Feb 5, 2025

github-actions bot commented Feb 5, 2025

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @achauhan-scc @kingernupur @luigiw @needuv @paulshealy1 @singankit.

@kristapratico kristapratico added the Client This issue points to a problem in the data-plane of the library. label Feb 5, 2025
@kingernupur
Member

Tagging @jhakulin

@dargilco dargilco self-assigned this Feb 10, 2025
@dargilco dargilco added AI Model Inference Issues related to the client library for Azure AI Model Inference (\sdk\ai\azure-ai-inference) and removed AI labels Feb 10, 2025
@dargilco
Member

@bombert Thanks for reporting this and suggesting a fix! I'm looking into it.

@dargilco
Member

@bombert which AI model did you use?

@dargilco
Member

@bombert I've tried several models but was not able to reproduce the issue. I'll need to see the issue for myself in order to fix it. Please share the model name and how the model was deployed. What is the endpoint value above? Feel free to redact the personal resource name if relevant. Were you using an Azure OpenAI endpoint, the GitHub Models endpoint https://models.inference.ai.azure.com, a Serverless API endpoint, a Managed Compute endpoint, or a default deployment via an AI Foundry Project into an AI Services resource?

@howieleung
Member

@dargilco I think this is the line of code to be fixed.

(screenshot of the code in question omitted)

I think the HTTP header on the response carries the encoding. BaseAsyncAgentEventHandler has an initialize function. See if you can add the response header as an argument, pass it in from the caller, and consume it in the decode function.
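A hedged sketch of that idea (the helper below is an illustration, not the SDK's actual API): pull the charset parameter out of the response's Content-Type header and fall back to UTF-8 when it is absent, then use that name when decoding chunks.

```python
def charset_from_content_type(content_type: str, default: str = "utf-8") -> str:
    """Return the charset parameter of a Content-Type header value,
    or the default when no charset parameter is present."""
    for part in content_type.split(";")[1:]:
        key, _, value = part.strip().partition("=")
        if key.lower() == "charset" and value:
            return value.strip('"')
    return default

print(charset_from_content_type("text/event-stream; charset=utf-8"))  # → utf-8
print(charset_from_content_type("application/json"))                  # → utf-8
```

Note that honoring the header only fixes a wrong codec name; it does not by itself fix chunks split mid-character, which still needs buffering or an incremental decoder.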

@dargilco
Member

@howieleung the issue was reported on the Inference SDK, not the AI Projects SDK. But of course, we need to make sure both can handle this rare case. I'll follow up with you 1:1.

@dargilco dargilco changed the title azure ai foundry SDK inference cannot handle the steamings containing Chinese character Issue streaming Chinese characters using azure-ai-inference SDK (was: "azure ai foundry SDK inference cannot handle the steamings containing Chinese character") Feb 11, 2025
@dargilco dargilco changed the title Issue streaming Chinese characters using azure-ai-inference SDK (was: "azure ai foundry SDK inference cannot handle the steamings containing Chinese character") Issue streaming Chinese characters using azure-ai-inference SDK Feb 11, 2025
@TshkOtsk

I'm seeing the same issue.

AI model: DeepSeek-R1
Endpoint: Azure AI model inference endpoint (https://<DEPLOY_NAME>.services.ai.azure.com/models)

Code:

client = ChatCompletionsClient(
    endpoint=AZUREAI_ENDPOINT_URL,
    credential=AzureKeyCredential(AZUREAI_ENDPOINT_KEY)
)
deploy_name = AZUREAI_DEPLOYMENT_NAME

@app.get("/chat")
async def chat_endpoint(message: str):
    conversation_history.append({"role": "user", "content": message})
    
    messages = [SystemMessage(content="You are a helpful assistant.")]
    for entry in conversation_history:
        if entry["role"] == "user":
            messages.append(UserMessage(content=entry["content"]))
        else:
            # assistant turns should be AssistantMessage, not UserMessage
            # (requires: from azure.ai.inference.models import AssistantMessage)
            messages.append(AssistantMessage(content=entry["content"]))

    response = client.complete(
        stream=True,
        messages=messages,
        max_tokens=4096,
        model=deploy_name,
    )
    
    def event_generator():
        assistant_response = ""
        for chunk in response:
            if chunk.choices and chunk.choices[0].delta and chunk.choices[0].delta.content:
                delta = chunk.choices[0].delta.content
                print(delta)
                assistant_response += delta
                yield f"data: {delta}\n\n"
        conversation_history.append({"role": "assistant", "content": assistant_response})
        yield "data: [DONE]\n\n"
    
    return StreamingResponse(event_generator(), media_type="text/event-stream")

Error messages:

ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/starlette/responses.py", line 268, in __call__
    await wrap(partial(self.listen_for_disconnect, receive))
  File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/starlette/responses.py", line 264, in wrap
    await func()
  File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/starlette/responses.py", line 233, in listen_for_disconnect
    message = await receive()
  File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 531, in receive
    await self.message_event.wait()
  File "/Users/.pyenv/versions/3.10.4/lib/python3.10/asyncio/locks.py", line 214, in wait
    await fut
asyncio.exceptions.CancelledError: Cancelled by cancel scope 10c300d30

During handling of the above exception, another exception occurred:

  + Exception Group Traceback (most recent call last):
  |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 403, in run_asgi
  |     result = await app(  # type: ignore[func-returns-value]
  |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
  |     return await self.app(scope, receive, send)
  |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
  |     await super().__call__(scope, receive, send)
  |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/starlette/applications.py", line 112, in __call__
  |     await self.middleware_stack(scope, receive, send)
  |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/starlette/middleware/errors.py", line 187, in __call__
  |     raise exc
  |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/starlette/middleware/errors.py", line 165, in __call__
  |     await self.app(scope, receive, _send)
  |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
  |     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
  |     raise exc
  |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
  |     await app(scope, receive, sender)
  |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/starlette/routing.py", line 715, in __call__
  |     await self.middleware_stack(scope, receive, send)
  |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/starlette/routing.py", line 735, in app
  |     await route.handle(scope, receive, send)
  |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/starlette/routing.py", line 288, in handle
  |     await self.app(scope, receive, send)
  |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/starlette/routing.py", line 76, in app
  |     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
  |     raise exc
  |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
  |     await app(scope, receive, sender)
  |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/starlette/routing.py", line 74, in app
  |     await response(scope, receive, send)
  |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/starlette/responses.py", line 261, in __call__
  |     async with anyio.create_task_group() as task_group:
  |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 767, in __aexit__
  |     raise BaseExceptionGroup(
  | exceptiongroup.ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
  +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/starlette/responses.py", line 264, in wrap
    |     await func()
    |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/starlette/responses.py", line 245, in stream_response
    |     async for chunk in self.body_iterator:
    |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/starlette/concurrency.py", line 60, in iterate_in_threadpool
    |     yield await anyio.to_thread.run_sync(_next, as_iterator)
    |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    |     return await get_async_backend().run_sync_in_worker_thread(
    |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2461, in run_sync_in_worker_thread
    |     return await future
    |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 962, in run
    |     result = context.run(func, *args)
    |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/starlette/concurrency.py", line 49, in _next
    |     return next(iterator)
    |   File "/Users/deepseek-chat/deepseekChat.py", line 179, in event_generator
    |     for chunk in response:
    |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/azure/ai/inference/models/_patch.py", line 413, in __next__
    |     self._done = self._read_next_block()
    |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/azure/ai/inference/models/_patch.py", line 426, in _read_next_block
    |     return self._deserialize_and_add_to_queue(element)
    |   File "/Users/.pyenv/versions/3.10.4/lib/python3.10/site-packages/azure/ai/inference/models/_patch.py", line 342, in _deserialize_and_add_to_queue
    |     line_list: List[str] = re.split(r"(?<=\n)", element.decode("utf-8"))
    | UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 4088-4089: unexpected end of data
    +------------------------------------

@dargilco
Member

Thank you @TshkOtsk, let me try with DeepSeek-R1

@dargilco
Member

@TshkOtsk Unfortunately I cannot reproduce this with DeepSeek-R1 model either (deployed to my AI Foundry project). I tried with and without Content Safety (which affects the chunking of the streamed response). Did you use exactly the System and User messages specified by @bombert above, or something else?

@dargilco
Member

Good news @TshkOtsk, I was able to reproduce this once and I have SDK logs. Working on a proper fix and unit-tests. Will update here when I have an update.

@dargilco
Member

@TshkOtsk @bombert I just released azure-ai-inference version 1.0.0b9 with a fix for this issue. Please give it a try. Thank you for reporting this!
