Tools with nested properties getting cut short #429

Open · Sergic-Cell opened this issue Jan 27, 2025 · 6 comments

@Sergic-Cell

Just as the title says, I believe tools set up with pydantic are having their deeper "nests" cut off before being sent to ollama.

Take this example tool definition from the LangGraph course:

from typing import List
from pydantic import BaseModel, Field

class Analyst(BaseModel):
    affiliation: str = Field(
        description="Primary affiliation of the analyst.",
    )
    name: str = Field(
        description="Name of the analyst."
    )
    role: str = Field(
        description="Role of the analyst in the context of the topic.",
    )
    description: str = Field(
        description="Description of the analyst focus, concerns, and motives.",
    )
    @property
    def persona(self) -> str:
        return f"Name: {self.name}\nRole: {self.role}\nAffiliation: {self.affiliation}\nDescription: {self.description}\n"

class Perspectives(BaseModel):
    analysts: List[Analyst] = Field(
        description="Comprehensive list of analysts with their roles and affiliations.",
    )

llm.with_structured_output(Perspectives)

This should produce an unprocessed tool JSON object along the lines of:

{
  'type': 'function',
  'function': {
    'name': 'Perspectives',
    'description': '',
    'parameters': {
      'properties': {
        'analysts': {
          'description': 'Comprehensive list of analysts with their roles and affiliations.',
          'items': {
            'properties': {
              'affiliation': {
                'description': 'Primary affiliation of the analyst.',
                'type': 'string'
              },
              'name': {
                'description': 'Name of the analyst.',
                'type': 'string'
              },
              'role': {
                'description': 'Role of the analyst in the context of the topic.',
                'type': 'string'
              },
              'description': {
                'description': 'Description of the analyst focus, concerns, and motives.',
                'type': 'string'
              }
            },
            'required': ['affiliation', 'name', 'role', 'description'],
            'type': 'object'
          },
          'type': 'array'
        }
      },
      'required': ['analysts'],
      'type': 'object'
    }
  }
}
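
For reference, pydantic itself preserves all of this nesting, and you can confirm that by inspecting the generated schema directly (a quick sketch; pydantic v2 emits nested models under $defs with $ref pointers rather than inline, but the content is equivalent):

import json

# The Analyst sub-model appears under '$defs', fully intact
print(json.dumps(Perspectives.model_json_schema(), indent=2))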

However, the tool object that comes out of the Tool.model_validate() method in ollama's _client.py module is similar to the following structure:

{
  'type': 'function',
  'function': {
    'name': 'Perspectives',
    'description': '',
    'parameters': {
      'type': 'object',
      'required': ['analysts'],
      'properties': {
        'analysts': {
          'type': 'array',
          'description': 'Comprehensive list of analysts with their roles and affiliations.'
        }
      }
    }
  }
}
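
Here is a minimal repro sketch of the client-side truncation (assuming the ollama-python version quoted below, where Property only declares type and description; pydantic ignores undeclared keys by default, so everything deeper is silently dropped):

from ollama._types import Tool

full_tool = {
    'type': 'function',
    'function': {
        'name': 'Perspectives',
        'description': '',
        'parameters': {
            'type': 'object',
            'required': ['analysts'],
            'properties': {
                'analysts': {
                    'type': 'array',
                    'description': 'Comprehensive list of analysts with their roles and affiliations.',
                    # 'items' is not a declared field on Property, so validation discards it
                    'items': {'type': 'object'},
                },
            },
        },
    },
}

# Round-tripping through the model loses everything below 'description'
print(Tool.model_validate(full_tool).model_dump())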

I am unsure if this is a pydantic issue, or one caused by the Tool definition in ollama's _types.py module:

class Tool(SubscriptableBaseModel):
  type: Optional[Literal['function']] = 'function'

  class Function(SubscriptableBaseModel):
    name: Optional[str] = None
    description: Optional[str] = None

    class Parameters(SubscriptableBaseModel):
      type: Optional[Literal['object']] = 'object'
      required: Optional[Sequence[str]] = None

      class Property(SubscriptableBaseModel):
        model_config = ConfigDict(arbitrary_types_allowed=True)

        type: Optional[str] = None
        description: Optional[str] = None

      properties: Optional[Mapping[str, Property]] = None

    parameters: Optional[Parameters] = None

  function: Optional[Function] = None

I also know that the Ollama server itself performed this sort of JSON truncation even when the full tool was given to it. I do not know if that has been fixed at the server level, but at minimum the tool sent in the client chat request should represent the full tool with all its nested properties. Earlier versions of ollama-python did not truncate the tool structure at this stage, so I would consider this a bug.
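
For illustration, here is a rough sketch of how the Property model could be made recursive so nested schemas survive validation. This is just my guess at the shape of a fix, not the actual upstream change; the field names simply mirror JSON Schema:

from typing import Mapping, Optional, Sequence
from pydantic import BaseModel

class Property(BaseModel):
  # Self-referencing fields let arbitrarily deep JSON Schema fragments
  # (array 'items', nested object 'properties') round-trip intact
  type: Optional[str] = None
  description: Optional[str] = None
  enum: Optional[Sequence[str]] = None
  items: Optional['Property'] = None
  properties: Optional[Mapping[str, 'Property']] = None
  required: Optional[Sequence[str]] = None

Property.model_rebuild()  # resolve the self-referencing forward references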

@ParthSareen (Contributor)

Hey! Thanks for pointing this out. I saw you mentioned that the server did this too - we should definitely address it on the SDK end. Have you tried reproducing the same behavior with a curl request against the server?

ParthSareen self-assigned this on Jan 28, 2025
@Sergic-Cell (Author)

I had to learn to use the REST API to test it with my input, but yes, I can confirm that this is also still happening on the backend server.

Here is my curl request (Windows CMD Prompt):
curl http://[LOCAL_IP]/api/chat -d @json.txt

Contents of json.txt:

{
  "model": "llama3.1",
  "messages": [
    {
      "role": "system",
      "content": "You are tasked with creating a set of AI analyst personas. Follow these instructions carefully:\n\n1. First, review the research topic:\nThe benefits of adopting LangGraph as an agent framework\n        \n2. Examine any editorial feedback that has been optionally provided to guide creation of the analysts: \n        \n\n    \n3. Determine the most interesting themes based upon documents and / or feedback above.\n                    \n4. Pick the top 3 themes.\n\n5. Assign one analyst to each theme.",
      "images": []
    },
    {
      "role": "user",
      "content": "Generate the set of analysts.",
      "images": []
    }
  ],
  "stream": false,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "Perspectives",
        "description": "",
        "parameters": {
          "properties": {
            "analysts": {
              "description": "Comprehensive list of analysts with their roles and affiliations.",
              "items": {
                "properties": {
                  "affiliation": {
                    "description": "Primary affiliation of the analyst.",
                    "type": "string"
                  },
                  "name": {
                    "description": "Name of the analyst.",
                    "type": "string"
                  },
                  "role": {
                    "description": "Role of the analyst in the context of the topic.",
                    "type": "string"
                  },
                  "description": {
                    "description": "Description of the analyst focus, concerns, and motives.",
                    "type": "string"
                  }
                },
                "required": ["affiliation", "name", "role", "description"],
                "type": "object"
              },
              "type": "array"
            }
          },
          "required": ["analysts"],
          "type": "object"
        }
      }
    }
  ]
}

Raw return from the Ollama server running in the command line with OLLAMA_DEBUG=1:
time=2025-01-28T10:16:09.910-05:00 level=DEBUG source=routes.go:1470 msg="chat request" images=0 prompt="<|start_header_id|>system<|end_header_id|>\n\nYou are tasked with creating a set of AI analyst personas. Follow these instructions carefully:\n\n1. First, review the research topic:\nThe benefits of adopting LangGraph as an agent framework\n \n2. Examine any editorial feedback that has been optionally provided to guide creation of the analysts: \n \n\n \n3. Determine the most interesting themes based upon documents and / or feedback above.\n \n4. Pick the top 3 themes.\n\n5. Assign one analyst to each theme.\n\nYou are a helpful assistant with tool calling capabilities. When you receive a tool call response, use the output to format an answer to the original user question.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nGiven the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.\n\nRespond in the format {\"name\": function name, \"parameters\": dictionary of argument name and its value}. Do not use variables.\n\n[{\"type\":\"function\",\"function\":{\"name\":\"Perspectives\",\"description\":\"\",\"parameters\":{\"type\":\"object\",\"required\":[\"analysts\"],\"properties\":{\"analysts\":{\"type\":\"array\",\"description\":\"Comprehensive list of analysts with their roles and affiliations.\"}}}}}]\n\nGenerate the set of analysts.<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"

The raw prompt formatted for readability:

<|start_header_id|>system<|end_header_id|>
You are tasked with creating a set of AI analyst personas. Follow these instructions carefully:

1. First, review the research topic:
The benefits of adopting LangGraph as an agent framework
       
2. Examine any editorial feedback that has been optionally provided to guide creation of the analysts:

3. Determine the most interesting themes based upon documents and / or feedback above.
                   
4. Pick the top 3 themes.
                   
5. Assign one analyst to each theme.
                   
You are a helpful assistant with tool calling capabilities. When you receive a tool call response, use the output to format an answer to the original user question.
<|eot_id|>
<|start_header_id|>user<|end_header_id|>
Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.
Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}. Do not use variables.
[
  {
    "type":"function",
    "function":{
      "name":"Perspectives",
      "description":"",
      "parameters":{
        "type":"object",
        "required":["analysts"],
        "properties":{
          "analysts":{
            "type":"array",
            "description":"Comprehensive list of analysts with their roles and affiliations."
          }
        }
      }
    }
  }
]
Generate the set of analysts.
<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>

The output returned to the curl command (note that, having never seen the nested schema, the model invents its own key casing and omits the description field entirely):
{"model":"llama3.1","created_at":"2025-01-28T15:16:19.3524297Z","message":{"role":"assistant","content":"","tool_calls":[{"function":{"name":"Perspectives","arguments":{"analysts":[{"Affiliation":"Automation Research Institute","Name":"LangGraph_Automator","Role":"AI Analyst"}]}}},{"function":{"name":"Perspectives","arguments":{"analysts":[{"Affiliation":"Voice Recognition Research Lab","Name":"LangGraph_Voice","Role":"AI Analyst"}]}}},{"function":{"name":"Perspectives","arguments":{"analysts":[{"Affiliation":"Data Science Research Institute","Name":"LangGraph_Analyzer","Role":"AI Analyst"}]}}}]},"done_reason":"stop","done":true,"total_duration":13501671000,"load_duration":3239903100,"prompt_eval_count":249,"prompt_eval_duration":2413000000,"eval_count":499,"eval_duration":7024000000}

As for what's going on in the backend, I'm not too well versed in Golang so I don't really know. However, I would guess it has something to do with the Tool struct definitions in the ollama/api/types.go file:

type Tools []Tool

func (t Tools) String() string {
      bts, _ := json.Marshal(t)
      return string(bts)
}

func (t Tool) String() string {
      bts, _ := json.Marshal(t)
      return string(bts)
}

// Message is a single message in a chat sequence. The message contains the
// role ("system", "user", or "assistant"), the content and an optional list
// of images.
type Message struct {
      Role      string      `json:"role"`
      Content   string      `json:"content"`
      Images    []ImageData `json:"images,omitempty"`
      ToolCalls []ToolCall  `json:"tool_calls,omitempty"`
}

func (m *Message) UnmarshalJSON(b []byte) error {
      type Alias Message
      var a Alias
      if err := json.Unmarshal(b, &a); err != nil {
            return err
      }

      *m = Message(a)
      m.Role = strings.ToLower(m.Role)
      return nil
}

type ToolCall struct {
      Function ToolCallFunction `json:"function"`
}

type ToolCallFunction struct {
      Index     int                       `json:"index,omitempty"`
      Name      string                    `json:"name"`
      Arguments ToolCallFunctionArguments `json:"arguments"`
}

type ToolCallFunctionArguments map[string]any

func (t *ToolCallFunctionArguments) String() string {
      bts, _ := json.Marshal(t)
      return string(bts)
}

type Tool struct {
      Type     string       `json:"type"`
      Function ToolFunction `json:"function"`
}

type ToolFunction struct {
      Name        string `json:"name"`
      Description string `json:"description"`
      Parameters  struct {
            Type       string   `json:"type"`
            Required   []string `json:"required"`
            Properties map[string]struct {
                  Type        string   `json:"type"`
                  Description string   `json:"description"`
                  Enum        []string `json:"enum,omitempty"`
            } `json:"properties"`
      } `json:"parameters"`
}

func (t *ToolFunction) String() string {
      bts, _ := json.Marshal(t)
      return string(bts)
}

Most notably this struct:

type ToolFunction struct {
      Name        string `json:"name"`
      Description string `json:"description"`
      Parameters  struct {
            Type       string   `json:"type"`
            Required   []string `json:"required"`
            Properties map[string]struct {
                  Type        string   `json:"type"`
                  Description string   `json:"description"`
                  Enum        []string `json:"enum,omitempty"`
            } `json:"properties"`
      } `json:"parameters"`
}

This is in line with the "cut-off point" we are getting in the server debug raw prompt: the anonymous Properties struct only declares type, description, and enum, so json.Unmarshal silently drops anything nested below that (such as items or nested properties). I'm guessing the struct just needs to be made recursive so arbitrarily deep schemas survive.

@ParthSareen (Contributor)

Okay this is really good for me to know. I also plan on having structured outputs for tool calling so it'll end up looking much better (hopefully!).

I'll get a fix in for this after some of the new engine work - which also covers structured outputs. I'm curious though - what has your experience been with nested tools? I'd imagine the reliability is pretty low.

@Sergic-Cell (Author)

Thanks! And hmmm, I haven't really tried tool calling outside of Ollama, and you already know how nested tools come out here (they just result in pydantic validation errors due to the attributes missing from the truncation).

What I can add is that, to work within that limitation, I needed a workaround that moved away from the pydantic method of defining complex tools and just passed the tool as a dictionary. The dictionary would still only be three levels or so deep; I would jam the extra "dimensions" needed into the innermost description. Kinda hacky.

In the case I provided, where the final cut-off is the list of analysts, my dictionary would look something like:

  {
    "type":"function",
    "function":{
      "name":"Perspectives",
      "description":"",
      "parameters":{
        "type":"object",
        "required":["analysts"],
        "properties":{
          "analysts":{
            "type":"array",
            "description":"Comprehensive list of analyst objects with their roles and affiliations. Each analyst has an affiliatation of type string, a name of type string, etc... Example json: ["name":"John", "Affiliation": "Researcher", etc...]
          }
        }
      }
    }
  }

At least with llama3.1, this would return a consistent enough JSON string that I could unmarshal it (but only after some serious cleaning, sanitization, and JSON repair for edge cases).

So far it's not the most reliable, but it can deliver a complete workflow 😅. I imagine quality would be better with a formal fix.
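
For completeness, here is roughly how I pass that flattened dictionary straight to the client, bypassing the pydantic tool definition entirely (a sketch assuming the ollama Python client's chat() signature; flat_tool is the dictionary above):

from ollama import chat

response = chat(
    model='llama3.1',
    messages=[{'role': 'user', 'content': 'Generate the set of analysts.'}],
    tools=[flat_tool],  # plain dict instead of a pydantic-derived tool
)
print(response.message.tool_calls)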

@ParthSareen (Contributor) commented Jan 28, 2025

Yeah, this isn't ideal for sure. Have you tried structured outputs instead of tool calling? I think it might fit your case better. The extra work you'd have to do is unmarshal the returned JSON and then execute a tool based on it.

Maybe give this a look in the meantime? https://ollama.com/blog/structured-outputs
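
Roughly, the flow would look like this (a sketch along the lines of that blog post, reusing your Perspectives model - not tested against your exact setup):

from ollama import chat

response = chat(
    model='llama3.1',
    messages=[{'role': 'user', 'content': 'Generate the set of analysts.'}],
    format=Perspectives.model_json_schema(),  # full nested schema, no tool layer involved
)

# Parse the constrained JSON straight back into the pydantic model
perspectives = Perspectives.model_validate_json(response.message.content)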

@Sergic-Cell (Author)

I'll take a look at it in the meantime - I hadn't really looked into structured outputs before. Thanks for the recommendation.
