Hi, team.
I am debugging an application that uses Stagehand, and I noticed that the cache is not working well.
I attached a simple code sample to help validate this.
When running the application, I see many cache misses, even when it is executed three times in a row against a static page.
While debugging, I saw that the key used to read from the cache and the key used to write to it are different because of a single attribute of the ZodObject: the _cached attribute. The ZodObject sits inside the cacheOptions object, which is hashed to produce the cache key.
On the first iteration, the object that is hashed for the lookup is:
{"model": "gpt-4o-mini","messages": [{"role": "system","content": "You are extracting content on behalf of a user. If a user asks you to extract a 'list' of information, or 'all' information, YOU MUST EXTRACT ALL OF THE INFORMATION THAT THE USER REQUESTS. You will be given: 1. An instruction 2. A list of DOM elements to extract from. Print the exact text from the DOM elements with all symbols, characters, and endlines as is. Print null or an empty string if no new information is found. "},{"role": "user","content": "Instruction: extract page title\nDOM: 0:<h1>Example Domain</h1>\n1:Example Domain\n2:<p>This domain is for use in illustrative examples in documents. You may use this\n domain in literature without prior coordination or asking for permission.</p>\n3:This domain is for use in illustrative examples in documents. You may use this\n domain in literature without prior coordination or asking for permission.\n4:<a href=\"https://www.iana.org/domains/example\">More information...</a>\n5:More information...\n"}],"temperature": 0.1,"top_p": 1,"frequency_penalty": 0,"presence_penalty": 0,"response_model": {"schema": {"_def": {"unknownKeys": "strip","catchall": {"_def": {"typeName": "ZodNever"},"~standard": {"version": 1,"vendor": "zod"}},"typeName": "ZodObject"},"~standard": {"version": 1,"vendor": "zod"},"_cached": null},"name": "Extraction"}}
and when setting the cache, the cachedOption object (used to hash) is:
{
"model": "gpt-4o-mini",
"messages": [
{
"role": "system",
"content": "You are extracting content on behalf of a user. If a user asks you to extract a 'list' of information, or 'all' information, YOU MUST EXTRACT ALL OF THE INFORMATION THAT THE USER REQUESTS. You will be given: 1. An instruction 2. A list of DOM elements to extract from. Print the exact text from the DOM elements with all symbols, characters, and endlines as is. Print null or an empty string if no new information is found. "
},
{
"role": "user",
"content": "Instruction: extract page title\nDOM: 0:<h1>Example Domain</h1>\n1:Example Domain\n2:<p>This domain is for use in illustrative examples in documents. You may use this\n domain in literature without prior coordination or asking for permission.</p>\n3:This domain is for use in illustrative examples in documents. You may use this\n domain in literature without prior coordination or asking for permission.\n4:<a href=\"https://www.iana.org/domains/example\">More information...</a>\n5:More information...\n"
}
],
"temperature": 0.1,
"top_p": 1,
"frequency_penalty": 0,
"presence_penalty": 0,
"response_model": {
"schema": {
"_def": {
"unknownKeys": "strip",
"catchall": {
"_def": {
"typeName": "ZodNever"
},
"~standard": {
"version": 1,
"vendor": "zod"
}
},
"typeName": "ZodObject"
},
"~standard": {
"version": 1,
"vendor": "zod"
},
- "_cached": null
+ "_cached": {
+ "shape": {
+ "pageTitle": {
+ "_def": {
+ "checks": [],
+ "typeName": "ZodString",
+ "coerce": false
+ },
+ "~standard": {
+ "version": 1,
+ "vendor": "zod"
+ }
+ }
+ },
+ "keys": [
+ "pageTitle"
+ ]
+ }
},
"name": "Extraction"
}
}
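In the getCache function these objects are hashed, and because _cached differs, the resulting keys are different, so the entry that was written is never found on the next lookup.
Every cache miss triggers another LLM call, so the extra misses directly increase the AI cost.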
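To make the mismatch concrete, here is a minimal sketch of what I think is happening. It does not use Stagehand or Zod: `SchemaLike` is a hypothetical stand-in for the serialized ZodObject, and I am assuming the key is a SHA-256 hash of `JSON.stringify(options)`, which may not match the real implementation. `hashOptionsStable` is only one possible workaround (dropping `_cached` before hashing), not a proposal for the actual fix.

```typescript
import { createHash } from "node:crypto";

// Hypothetical stand-in for the ZodObject that ends up inside the hashed options.
interface SchemaLike {
  _def: { typeName: string };
  _cached: { keys: string[] } | null;
}

// Assumed key derivation: SHA-256 over the JSON-serialized options.
function hashOptions(options: unknown): string {
  return createHash("sha256").update(JSON.stringify(options)).digest("hex");
}

// Same derivation, but every `_cached` property is omitted before hashing.
function hashOptionsStable(options: unknown): string {
  const json = JSON.stringify(options, (key, value) =>
    key === "_cached" ? undefined : value,
  );
  return createHash("sha256").update(json).digest("hex");
}

const schema: SchemaLike = { _def: { typeName: "ZodObject" }, _cached: null };
const options = {
  model: "gpt-4o-mini",
  response_model: { schema, name: "Extraction" },
};

// Keys computed at lookup time, while `_cached` is still null.
const lookupKey = hashOptions(options);
const stableLookupKey = hashOptionsStable(options);

// Simulate the schema being used between lookup and store:
// the object now carries a populated `_cached` value.
schema._cached = { keys: ["pageTitle"] };

// Keys computed at store time, after `_cached` was populated.
const storeKey = hashOptions(options);
const stableStoreKey = hashOptionsStable(options);

console.log(lookupKey === storeKey); // false: the stored entry is never found again
console.log(stableLookupKey === stableStoreKey); // true: the key ignores `_cached`
```

If the real key derivation works roughly like `hashOptions`, any internal state that the schema object gains between the lookup and the store will change the key in the same way.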
If I have misunderstood something, please correct me.