When AssistantAgent gets called consecutively without new user messages, getting "Expected last role User or Tool" error when using Mistral models #5044

Open
JMLX42 opened this issue Jan 14, 2025 · 13 comments

Comments

@JMLX42
Contributor

JMLX42 commented Jan 14, 2025

What happened?

On some occasions, the chat completion is called with its last message's role set to "assistant".

Example:

{
    "model": "mistral-large-latest",
    "messages": [
        {
            "content": "You are a GitLab assistant: your purpose is to help users discuss a specific GitLab issue.",
            "role": "system"
        },
        {
            "content": "Please handle the following todo: GitLab instance URL: https://gitlab.com/api/v4\n\n Todo ID: 493871141\n Todo action: directly_addressed\n Todo state: pending\n Todo target ID: 25\n Todo target type: Issue\n Todo target URL: https://gitlab.com/lx-industries/wally-the-wobot/tests/repl-tests/-/issues/25#note_2296444245\n\n Project ID: 45010942\n Project name: LX Industries / Wally The Wobot / tests / REPL Tests\n Project path: lx-industries/wally-the-wobot/tests/repl-tests\n Project default branch: main\n Project description: \n",
            "role": "user"
        },
        {
            "content": "Please reply to the user.",
            "role": "user"
        },
        {
            "tool_calls": [
                {
                    "id": "iRvM4muS4",
                    "function": {
                        "arguments": "{\"todo_id\": 493871141, \"project_id\": 45010942, \"target_url\": \"https://gitlab.com/lx-industries/wally-the-wobot/tests/repl-tests/-/issues/25#note_2296444245\", \"target_type\": \"Issue\", \"target_id\": 25}",
                        "name": "get_todo_discussion_id"
                    },
                    "type": "function"
                }
            ],
            "role": "assistant"
        },
        {
            "content": "e7764e059fad9a55ff30dbd4b2bf108b5205e486",
            "role": "tool",
            "tool_call_id": "iRvM4muS4"
        },
        {
            "content": "[{\"name\": \"list_issue_notes\", \"arguments\": {\"project_id\": 45010942, \"issue_iid\": 25, \"discussion_id\": \"e7764e059fad9a55ff30dbd4b2bf108b5205e486\"}}]",
            "role": "assistant"
        }
    ]
}

The Mistral API does not support that:

An error occurred: litellm.BadRequestError: MistralException - Error code: 400 - {'object': 'error', 'message': 'Expected last role User or Tool (or Assistant with prefix True) for serving but got assistant', 'type': 'invalid_request_error', 'param': None, 'code': None}
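
The rule quoted in that error can be checked locally before a request is sent. This is a minimal sketch, assuming messages are plain dicts in the shape shown above. `last_role_is_servable` is a hypothetical helper, not part of AutoGen, LiteLLM, or the Mistral SDK, and the accepted roles are taken only from the error text:

```python
from typing import Any, Dict, List


def last_role_is_servable(messages: List[Dict[str, Any]]) -> bool:
    """Return True if the final message satisfies the rule quoted in the error.

    Per the error text, the last role must be "user" or "tool", or
    "assistant" with prefix set to True.
    """
    if not messages:
        return False
    last = messages[-1]
    role = last.get("role")
    if role in ("user", "tool"):
        return True
    return role == "assistant" and last.get("prefix") is True


# Shortened stand-in for the failing request above.
failing = [
    {"role": "system", "content": "You are a GitLab assistant."},
    {"role": "user", "content": "Please reply to the user."},
    {"role": "assistant", "content": "[{\"name\": \"list_issue_notes\"}]"},
]
print(last_role_is_servable(failing))       # False
print(last_role_is_servable(failing[:-1]))  # True
```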

What did you expect to happen?

I expect no errors.

My understanding is that the next message is expected to be a handoff. But with Mistral's API, that's not possible.

How can we reproduce it (as minimally and precisely as possible)?

Run Mistral via LiteLLM:

How I run LiteLLM:

compose.yml

---
services:
  litellm:
    image: ghcr.io/berriai/litellm:main-v1.58.1@sha256:0bd93bb9062e4cb004c8f85c5eb8bf0469f1830f8c888f0f1b1f196d2747774e
    volumes:
      - ./config.yml:/app/config.yml:ro
    ports:
      - 4000:4000
    command: ["--config", "/app/config.yml", --detailed_debug]

config.yml

---

model_list: 
  - model_name: mistral-large-latest
    litellm_params:
      model: mistral/mistral-large-latest
      api_base: https://api.mistral.ai/v1/
      api_key: the-api-key
    model_info:
      id: mistral-large-latest
      max_tokens: 131072

litellm_settings:
  drop_params: true

general_settings:

Then, in my app:

        model_client = OpenAIChatCompletionClient(
            model="mistral-large-latest",
            api_key="notneeded",
            base_url="http://0.0.0.0:4000",
            model_capabilities={
                "json_output": False,
                "vision": False,
                "function_calling": True,
            },
        )

AutoGen version

0.4.1

Which package was this bug in

AgentChat

Model used

mistral-large-latest

Python version

3.12

Operating system

Ubuntu 24.04

Any additional info you think would be helpful for fixing this bug

No response

@JMLX42 JMLX42 changed the title Using Mistral models [bug] Using Mistral models Jan 14, 2025
@JMLX42 JMLX42 changed the title [bug] Using Mistral models "Expected last role User or Tool " error when using Mistral models Jan 14, 2025
@JMLX42 JMLX42 changed the title "Expected last role User or Tool " error when using Mistral models "Expected last role User or Tool" error when using Mistral models Jan 14, 2025
@JMLX42
Contributor Author

JMLX42 commented Jan 14, 2025

I think it's because tool call results are not considered as agent "inner messages".
Thus they are not transferred to other agents.
To work around this, AutoGen adds a new "normal" assistant message with a summary of the tool call:

https://github.com/microsoft/autogen/blob/main/python/packages/autogen-agentchat/src/autogen_agentchat/agents/_assistant_agent.py#L418-L432

@ekzhu
Collaborator

ekzhu commented Jan 14, 2025

because tool call results are not considered as agent "inner messages".

Tool call results are also part of the inner messages:

inner_messages.append(tool_call_result_msg)
yield tool_call_result_msg

Though the summary is returned directly as the response if reflect_on_tool_use is not set, as you mentioned.

Could you try to add a new assistant message to its model context before:

  • Returning due to handoff

yield Response(
    chat_message=HandoffMessage(content=handoffs[0].message, target=handoffs[0].target, source=self.name),
    inner_messages=inner_messages,
)
return

  • Returning tool summary message:

yield Response(
    chat_message=ToolCallSummaryMessage(content=tool_call_summary, source=self.name),
    inner_messages=inner_messages,
)

Let's see if adding these additional assistant messages can help.

@ekzhu
Collaborator

ekzhu commented Jan 14, 2025

Related #2828

@JMLX42
Contributor Author

JMLX42 commented Jan 14, 2025

Tool call results are also part of the inner messages:

If the messages with role = tool are indeed part of the inner messages, they are part of the discussion and sent to the other agents. Correct?

Then why the extra assistant message?

        {
            "tool_calls": [
                {
                    "id": "iRvM4muS4",
                    "function": {
                        "arguments": "{\"todo_id\": 493871141, \"project_id\": 45010942, \"target_url\": \"https://gitlab.com/lx-industries/wally-the-wobot/tests/repl-tests/-/issues/25#note_2296444245\", \"target_type\": \"Issue\", \"target_id\": 25}",
                        "name": "get_todo_discussion_id"
                    },
                    "type": "function"
                }
            ],
            "role": "assistant"
        },
        {
            "content": "e7764e059fad9a55ff30dbd4b2bf108b5205e486",
            "role": "tool",
            "tool_call_id": "iRvM4muS4"
        },
        {
            "content": "[{\"name\": \"list_issue_notes\", \"arguments\": {\"project_id\": 45010942, \"issue_iid\": 25, \"discussion_id\": \"e7764e059fad9a55ff30dbd4b2bf108b5205e486\"}}]",
            "role": "assistant"
        }

Let's see if adding these additional assistant messages can help.

@ekzhu my understanding is those messages will have "role": "assistant", thus they will still trigger the "Expected last role User or Tool" error.

@JMLX42
Contributor Author

JMLX42 commented Jan 14, 2025

If I comment out this block, then my swarm does not work anymore.

2025-01-14 19:02:04 DEBUG    wally.system handle_todo TaskResult(messages=[TextMessage(source='user', models_usage=None, content='\nPlease handle the following todo:\n\nGitLab instance URL: https://gitlab.com/api/v4\n\nTodo ID: 493914053\nTodo action: directly_addressed\nTodo state: pending\nTodo target ID: 25\nTodo target type: Issue\nTodo target URL: https://gitlab.com/lx-industries/wally-the-wobot/tests/repl-tests/-/issues/25#note_2296618800\n\nProject ID: 45010942\nProject name: LX Industries / Wally The Wobot / tests / REPL Tests\nProject path: lx-industries/wally-the-wobot/tests/repl-tests\nProject default branch: main\nProject description: N/A\n', type='TextMessage'), ToolCallRequestEvent(source='todo_agent', models_usage=RequestUsage(prompt_tokens=499, completion_tokens=15), content=[FunctionCall(id='call_081za2p9NTGt4EUF92GRl2X5', arguments='{}', name='transfer_to_issue_discussion_agent')], type='ToolCallRequestEvent'), ToolCallExecutionEvent(source='todo_agent', models_usage=None, content=[FunctionExecutionResult(content='Please reply to the user.', call_id='call_081za2p9NTGt4EUF92GRl2X5')], type='ToolCallExecutionEvent'), HandoffMessage(source='todo_agent', models_usage=None, target='issue_discussion_agent', content='Please reply to the user.', type='HandoffMessage'), ToolCallRequestEvent(source='issue_discussion_agent', models_usage=RequestUsage(prompt_tokens=1682, completion_tokens=73), content=[FunctionCall(id='call_vygFzicDnhbRUAjTy9zoM7ho', arguments='{"todo_id":493914053,"project_id":45010942,"target_url":"https://gitlab.com/lx-industries/wally-the-wobot/tests/repl-tests/-/issues/25#note_2296618800","target_type":"Issue","target_id":25}', name='get_todo_discussion_id')], type='ToolCallRequestEvent'), ToolCallExecutionEvent(source='issue_discussion_agent', models_usage=None, content=[FunctionExecutionResult(content='3728bda8b4c959eb525ebab263f3100879e38cba', call_id='call_vygFzicDnhbRUAjTy9zoM7ho')], type='ToolCallExecutionEvent')], stop_reason=None

@ekzhu
Collaborator

ekzhu commented Jan 14, 2025

The extra assistant message comes from reflection on tool use:

if self._reflect_on_tool_use:
    # Generate another inference result based on the tool call and result.
    llm_messages = self._system_messages + await self._model_context.get_messages()
    result = await self._model_client.create(llm_messages, cancellation_token=cancellation_token)
    assert isinstance(result.content, str)
    # Add the response to the model context.
    await self._model_context.add_message(AssistantMessage(content=result.content, source=self.name))
    # Yield the response.
    yield Response(
        chat_message=TextMessage(content=result.content, source=self.name, models_usage=result.usage),
        inner_messages=inner_messages,
    )

When you comment out the whole block, it won't work because the code doesn't yield a Response anymore.

my understanding is those messages will have "role": "assistant", thus they will still trigger the "Expected last role User or Tool" error.

Yes, that's correct. I missed that.

Is there a chat template somewhere mistral can accept?

@jackgerrits do you think this is a case for model family and addressing the discrepancy between different model providers?

@ekzhu ekzhu changed the title "Expected last role User or Tool" error when using Mistral models When AssistantAgetn gets called consecutively without new user messages, getting "Expected last role User or Tool" error when using Mistral models Jan 14, 2025
@ekzhu ekzhu changed the title When AssistantAgetn gets called consecutively without new user messages, getting "Expected last role User or Tool" error when using Mistral models When AssistantAgent gets called consecutively without new user messages, getting "Expected last role User or Tool" error when using Mistral models Jan 14, 2025
@JMLX42
Contributor Author

JMLX42 commented Jan 14, 2025

Is there a chat template somewhere mistral can accept?

@ekzhu IDK. Do you have an example for OpenAI I could refer to?

@ekzhu ekzhu added this to the 0.4.x milestone Jan 14, 2025
@ekzhu
Collaborator

ekzhu commented Jan 14, 2025

I found this: https://github.com/mistralai/cookbook/blob/main/concept-deep-dive/tokenization/chat_templates.md. I have never used it myself.

@JMLX42
Contributor Author

JMLX42 commented Jan 14, 2025

Is there a chat template somewhere mistral can accept?

I found this: https://github.com/mistralai/cookbook/blob/main/concept-deep-dive/tokenization/chat_templates.md. I have never used it myself.

@ekzhu how does that help?

IMHO the problem is this message:

        {
            "content": "[{\"name\": \"list_issue_notes\", \"arguments\": {\"project_id\": 45010942, \"issue_iid\": 25, \"discussion_id\": \"e7764e059fad9a55ff30dbd4b2bf108b5205e486\"}}]",
            "role": "assistant"
        }

which is created by:

tool_call_summaries: List[str] = []
for i in range(len(tool_call_msg.content)):
    tool_call_summaries.append(
        self._tool_call_summary_format.format(
            tool_name=tool_call_msg.content[i].name,
            arguments=tool_call_msg.content[i].arguments,
            result=tool_call_result_msg.content[i].content,
        ),
    )
tool_call_summary = "\n".join(tool_call_summaries)
yield Response(
    chat_message=ToolCallSummaryMessage(content=tool_call_summary, source=self.name),
    inner_messages=inner_messages,
)
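
To illustrate what the loop above produces, here is a standalone sketch that uses plain strings in place of AutoGen's message objects. The format string and the sample values are assumptions chosen for illustration. Only the placeholder names `tool_name`, `arguments`, and `result` come from the snippet.

```python
from typing import List, Tuple

# Hypothetical format string; the placeholders match the snippet above.
summary_format = "{tool_name}({arguments}) -> {result}"

# (tool name, JSON arguments, result) triples standing in for the
# paired tool call and tool call result messages.
calls: List[Tuple[str, str, str]] = [
    ("get_todo_discussion_id", '{"todo_id": 1}', "abc123"),
    ("list_issue_notes", '{"issue_iid": 25}', "[]"),
]

# One formatted line per tool call, joined with newlines.
tool_call_summary = "\n".join(
    summary_format.format(tool_name=name, arguments=args, result=result)
    for name, args, result in calls
)
print(tool_call_summary)
```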

If tool call results are part of the inner messages and are sent to the other agents, what is the point of that extra ToolCallSummaryMessage message?

@JMLX42
Contributor Author

JMLX42 commented Jan 14, 2025

If tool call results are part of the inner messages and are sent to the other agents, what is the point of that extra ToolCallSummaryMessage message?

Answering my own question: the on_messages_stream() generator needs to yield a Response to end. The goal of the code block above is to create a Response based on the previous tool calls.

The problem is that Response is added to the messages sent to the chat completion API.

Even if _reflect_on_tool_use = True, it will add another message with role = "assistant", which leads to the same error.
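
The pattern described here, a stream that emits intermediate events and must finish with a final response, can be sketched with a plain async generator. `Event` and `Response` below are simplified stand-in dataclasses, not AutoGen's classes, and `run_agent` is a hypothetical function:

```python
import asyncio
from dataclasses import dataclass, field
from typing import AsyncGenerator, List, Union


@dataclass
class Event:
    content: str


@dataclass
class Response:
    chat_message: str
    inner_messages: List[Event] = field(default_factory=list)


async def run_agent() -> AsyncGenerator[Union[Event, Response], None]:
    inner: List[Event] = []
    for step in ("tool call requested", "tool call executed"):
        event = Event(step)
        inner.append(event)
        yield event  # intermediate events, emitted as they happen
    # The stream ends with a single Response; without it the caller
    # has no final message to pass on.
    yield Response(chat_message="tool call summary", inner_messages=inner)


async def collect() -> List[Union[Event, Response]]:
    return [item async for item in run_agent()]


items = asyncio.run(collect())
print(type(items[-1]).__name__)  # Response
```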

@ekzhu
Collaborator

ekzhu commented Jan 14, 2025

how does that help?

I just found this, and I don't know how exactly it can help. Just thought it may be helpful.
 

If tool call results are part of the inner messages and are sent to the other agents, what is the point of that extra ToolCallSummaryMessage message?

The inner messages are not sent to other agents. They are yielded for observability purposes only. Only the application (caller) and the group chat manager (as part of the Team) have access to these messages. Other agents can only see the chat_message field in the Response. Hence the need for ToolCallSummaryMessage. If reflect_on_tool_use=True, the model gets called a second time to produce an assistant message.
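
A rough sketch of that visibility rule, using stand-in types rather than AutoGen's own classes. `PeerAgent` and `broadcast` are hypothetical names that only model the idea that peers receive `chat_message` and nothing else:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Response:
    chat_message: str
    inner_messages: List[str] = field(default_factory=list)


@dataclass
class PeerAgent:
    name: str
    buffer: List[str] = field(default_factory=list)


def broadcast(response: Response, peers: List[PeerAgent]) -> None:
    """Forward only the final chat message; inner messages stay with the caller."""
    for peer in peers:
        peer.buffer.append(response.chat_message)


response = Response(
    chat_message="tool call summary",
    inner_messages=["tool call request", "tool call result"],
)
peers = [PeerAgent("issue_discussion_agent")]
broadcast(response, peers)
print(peers[0].buffer)  # ['tool call summary']
```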

The problem is that Response is added to the messages sent to the chat completion API.

Yes, I have already recognized this (I have edited the title and added this as an issue we would like to tackle). To be clear, though, the tool call summary message is not added to the model context, and the Response is not sent to the chat completion API, but to other agents.

I understand that when the same agent is called consecutively, it will be called with an assistant message as the last message in the context sent to the LLM. This is going to happen regardless of whether a tool call was used or not. So this is going to cause errors for model APIs that do not support "assistant, assistant, ..." in the message context.
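
For illustration only, one possible client-side workaround is to append a user turn whenever the context would otherwise end with an assistant message. This is a sketch under assumptions, not a fix adopted in this thread: `with_trailing_user_turn` is a hypothetical helper, the nudge text is an arbitrary placeholder, and whether an extra user message is acceptable depends on the application.

```python
from typing import Dict, List

Message = Dict[str, str]


def with_trailing_user_turn(
    messages: List[Message], nudge: str = "Please continue."
) -> List[Message]:
    """Return a copy of the context whose last role is not "assistant".

    If the context already ends with another role, a shallow copy is
    returned unchanged.
    """
    patched = list(messages)
    if patched and patched[-1]["role"] == "assistant":
        patched.append({"role": "user", "content": nudge})
    return patched


context = [
    {"role": "user", "content": "Please reply to the user."},
    {"role": "assistant", "content": "First reply."},
]
patched = with_trailing_user_turn(context)
print([m["role"] for m in patched])  # ['user', 'assistant', 'user']
```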

        {
            "content": "[{\"name\": \"list_issue_notes\", \"arguments\": {\"project_id\": 45010942, \"issue_iid\": 25, \"discussion_id\": \"e7764e059fad9a55ff30dbd4b2bf108b5205e486\"}}]",
            "role": "assistant"
        }

I am wondering how the last message could have been a ToolCallSummaryMessage, as it is not added to the model context. Could it be that you have reflect_on_tool_use=True and the LLM just produced this output? Then the same agent gets called again, and the last assistant message is sent to the model, causing the error.

Can you post how the agents are set up?

@JMLX42
Contributor Author

JMLX42 commented Jan 15, 2025

This is going to happen regardless of whether a tool call was used or not.

Correct.

One might argue it is a valid possibility though. I'll bring this up with the Mistral team.

I am wondering how the last message could have been a ToolCallSummaryMessage, as it is not added to the model context.

I'll investigate this.

Could it be that you have reflect_on_tool_use=True

No. Here is my entire code base:

https://gitlab.com/lx-industries/wally-the-wobot/wally/-/blob/main/wally/agents.py?ref_type=heads

Other agents can only see the chat_message field in the Response.

Ok so just to be sure I understand: that's the purpose of that Response with the ToolCallSummaryMessage. Correct?

Meaning agents will answer text or a tool call summary. Not both. Correct?

@ekzhu
Collaborator

ekzhu commented Jan 15, 2025

Ok so just to be sure I understand: that's the purpose of that Response with the ToolCallSummaryMessage. Correct?
Meaning agents will answer text or a tool call summary. Not both. Correct?

Agents will see other agents' Response.chat_message in their on_messages method, but not Response.inner_messages. To point to the specific code:

https://github.com/microsoft/autogen/blob/main/python/packages/autogen-agentchat/src/autogen_agentchat/teams/_group_chat/_chat_agent_container.py#L36-L40

The Core agent container's handler only puts the chat_message in the buffer.
