Update caching.md
Update document for semantic cache
ayush-portkey authored Jun 14, 2023
1 parent 8402191 commit ff96fe9
Showing 1 changed file with 31 additions and 12 deletions.
43 changes: 31 additions & 12 deletions caching.md
@@ -1,24 +1,43 @@
# Caching
# Caching in Portkey

Portkey supports caching across text & chat completions. When the exact same request comes in to Portkey, we can return the response from our cache.
Portkey offers two types of caching to speed up response retrieval: a fixed string matching cache (`simple`) and a semantic cache.

This could be useful if you have fixed input prompts or are testing the app with the same inputs.
## Fixed String Matching Cache
The fixed string matching cache is the traditional caching mechanism where an exact match is performed on the input prompts. If the exact same request is received again, Portkey can directly return the response from the cache without executing the model.
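
To make the exact-match behaviour concrete, here is a minimal sketch of a lookup keyed on the full request. It is an illustration only, not Portkey's implementation: the `cache_key` helper, the `ExactMatchCache` class, the SHA-256 key scheme, and the `example-model` name are all assumptions made for this example.

```python
import hashlib
import json


def cache_key(request_body: dict) -> str:
    """Hash a canonical JSON form so identical requests map to the same key."""
    canonical = json.dumps(request_body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ExactMatchCache:
    """Hypothetical in-memory cache that only hits on identical requests."""

    def __init__(self):
        self._store = {}

    def get(self, request_body: dict):
        return self._store.get(cache_key(request_body))

    def put(self, request_body: dict, response: str) -> None:
        self._store[cache_key(request_body)] = response


cache = ExactMatchCache()
cache.put({"model": "example-model", "prompt": "What is caching?"}, "cached answer")

# Same content (key order differs): treated as the same request.
hit = cache.get({"prompt": "What is caching?", "model": "example-model"})

# One character differs (no "?"): an exact-match cache misses.
miss = cache.get({"model": "example-model", "prompt": "What is caching"})
```

The second lookup misses even though the prompt is nearly identical to the cached one.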

### Enabling cache

To enable caching, pass the following headers in your requests.
### Enabling Fixed String Matching Cache
To enable the fixed string matching cache, include the following headers in your requests:

```sh
"x-portkey-cache": true
"Cache-Control": "max-age:1000"
"x-portkey-cache": "simple"
"Cache-Control": "max-age:1000"
```
The `x-portkey-cache` header enables or disables cache storage and retrieval. The `Cache-Control` header accepts a `max-age` value in seconds, which sets the maximum age of the cached response. If the `Cache-Control` header is not provided, Portkey automatically caches requests for 7*24*60*60 seconds (7 days) whenever caching is enabled through `x-portkey-cache`.
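
As a rough sketch of how a client might assemble these headers and work out the resulting cache lifetime, consider the helpers below. `build_cache_headers` and `effective_ttl_seconds` are hypothetical names invented for this example and are not part of any Portkey SDK. The `max-age:<seconds>` format is copied from the header example above.

```python
DEFAULT_TTL_SECONDS = 7 * 24 * 60 * 60  # 7 days, the default described above


def build_cache_headers(mode="simple", max_age_seconds=None):
    """Return request headers that enable caching (hypothetical helper)."""
    headers = {"x-portkey-cache": mode}
    if max_age_seconds is not None:
        headers["Cache-Control"] = f"max-age:{max_age_seconds}"
    return headers


def effective_ttl_seconds(headers):
    """Return max-age if present, else the documented 7-day default."""
    value = headers.get("Cache-Control")
    if value is None:
        return DEFAULT_TTL_SECONDS
    name, _, seconds = value.partition(":")
    if name.strip() != "max-age" or not seconds.strip().isdigit():
        raise ValueError(f"unsupported Cache-Control value: {value!r}")
    return int(seconds)


explicit = build_cache_headers(max_age_seconds=1000)
implicit = build_cache_headers()
```

Here `explicit` carries both headers and yields a lifetime of 1000 seconds, while `implicit` omits `Cache-Control` and falls back to the 7-day default.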

The `x-portkey-cache` enables or disables cache storage and retrieval. The `Cache-Control` header accepts `max-age` in seconds. The minimum value for `Cache-Control` is 30. If you don't provide this header, we will automatically cache requests for `7*24*60*60 seconds` (7 days) when the `x-portkey-cache` is set to `true`.
### Invalidating Fixed String Matching Cache
You can force refresh the fixed string matching cache by using the x-portkey-cache-force-refresh header. Setting it to true ensures that the cache is invalidated, and a new value is stored in the cache.

### Invalidating Cache
```sh
"x-portkey-cache-force-refresh": true
```
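
The effect of a force refresh can be sketched as a lookup that skips the cache read but still writes the fresh result back. This is an assumed model of the behaviour described above, not Portkey's code; `fetch` and `fake_model` are hypothetical names.

```python
def fetch(prompt, cache, call_model, force_refresh=False):
    """Return a cached response unless a refresh is forced (illustrative only)."""
    if not force_refresh and prompt in cache:
        return cache[prompt]
    response = call_model(prompt)  # cache miss or forced refresh
    cache[prompt] = response       # store the new value for later requests
    return response


calls = []


def fake_model(prompt):
    """Stand-in for a model call that records how often it runs."""
    calls.append(prompt)
    return f"response #{len(calls)}"


cache = {}
first = fetch("hello", cache, fake_model)                       # miss: calls the model
second = fetch("hello", cache, fake_model)                      # hit: served from cache
third = fetch("hello", cache, fake_model, force_refresh=True)   # forced: calls the model again
fourth = fetch("hello", cache, fake_model)                      # hit: returns the refreshed value
```

Only the first and third lookups reach the model, and the fourth returns the value stored by the forced refresh.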

You can choose to force refresh cache by using the `x-portkey-cache-force-refresh` header. Setting it to `true` ensures that the cache is invalidated, and a new value is stored in the cache.
## Semantic Cache
The semantic cache in Portkey goes beyond exact string matching and takes into account the contextual similarity between input prompts. It uses cosine similarity to determine if the similarity between the input and a cached request exceeds a certain threshold. If the similarity threshold is met, Portkey retrieves the response from the cache.
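
The threshold check can be illustrated with a small sketch. The embedding vectors, the `0.95` threshold, and the function names below are assumptions for the example. The document does not state which embedding model or threshold Portkey actually uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_lookup(query_embedding, entries, threshold=0.95):
    """Return the best cached response whose similarity meets the threshold."""
    best_response, best_score = None, threshold
    for embedding, response in entries:
        score = cosine_similarity(query_embedding, embedding)
        if score >= best_score:
            best_response, best_score = response, score
    return best_response


# Toy 3-dimensional embeddings; real embeddings have many more dimensions.
entries = [([1.0, 0.0, 0.0], "answer about caching")]

near = semantic_lookup([0.99, 0.1, 0.0], entries)  # similar direction: cache hit
far = semantic_lookup([0.0, 1.0, 0.0], entries)    # orthogonal: cache miss
```

A higher threshold makes hits rarer but safer, while a lower one increases the hit rate at the risk of returning a response to a different question.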

### Enabling Semantic Cache
To enable the semantic cache feature, use the following header in your requests:

```sh
"x-portkey-cache-force-refresh": true
"x-portkey-cache": "semantic"
```

Setting the x-portkey-cache header to "semantic" enables the semantic cache functionality.

### Implementation Details
The `Cache-Control` header also applies to the semantic cache and controls the maximum age of the cached response.

To force refresh the semantic cache and invalidate existing entries, use the `x-portkey-cache-force-refresh` header as described earlier.

Because the semantic cache considers the contextual similarity of input prompts rather than requiring an exact match, it can retrieve responses more efficiently.

Choose the appropriate caching mechanism based on your use case to improve performance and minimize unnecessary model executions in Portkey.
