docs: ratelimitAI #364

Open · wants to merge 1 commit into base: main
8 changes: 8 additions & 0 deletions mint.json
@@ -616,6 +616,14 @@
}
]
},
{
"group": "Ratelimit AI (TS)",
"pages": [
"redis/sdks/ratelimit-ai/overview",
"redis/sdks/ratelimit-ai/getting-started",
"redis/sdks/ratelimit-ai/features"
]
},
{
"group": "Ratelimit (TS)",
"pages": [
98 changes: 98 additions & 0 deletions redis/sdks/ratelimit-ai/features.mdx
@@ -0,0 +1,98 @@
---
title: "Features"
---

## Ratelimit Types

Ratelimit AI provides three types of rate limits specifically designed for LLM API usage. You can use these limits individually or combine them based on your needs.

### Requests Per Minute (RPM)

The number of requests allowed per minute. This limit is useful for:

- Controlling the number of requests made to the LLM API in a short period
- Preventing API throttling
- Managing concurrent request load

```typescript Example
const ratelimit = new RatelimitAI({
RPM: 100, // 100 requests per minute
});
```

The RPM limit is checked before each request. If a user exceeds the limit in a minute, subsequent requests will be rate limited until the minute window resets.
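Conceptually, the RPM check is a fixed one-minute window: requests increment a counter that resets when the window rolls over. The sketch below illustrates that windowing logic in plain TypeScript; the actual library keeps this state in Upstash Redis rather than in process memory.

```typescript
// Illustrative fixed-window RPM counter. The real implementation stores the
// counter in Redis so it works across serverless invocations.
class FixedWindowRPM {
  private count = 0;
  private window = -1;

  constructor(private readonly limit: number) {}

  // Returns true if the request fits in the current one-minute window.
  allow(nowMs: number = Date.now()): boolean {
    const current = Math.floor(nowMs / 60_000);
    if (current !== this.window) {
      this.window = current; // new minute: the counter resets
      this.count = 0;
    }
    return ++this.count <= this.limit;
  }
}
```

Once the limit is hit, `allow` keeps returning `false` until the next minute begins, matching the behavior described above.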

### Requests Per Day (RPD)

The number of requests allowed per day. This limit is useful for:

- Implementing daily usage quotas
- Setting up tiered access levels

```typescript Example
const ratelimit = new RatelimitAI({
RPD: 1000, // 1000 requests per day
});
```

The RPD counter resets at midnight UTC, making it ideal for implementing quotas like "1000 requests per day per API key".
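Daily quotas pair naturally with tiered access: choose the limit configuration from the user's plan and pass it to the constructor. A sketch, where the tier names and numbers are illustrative (only the `RPM`/`RPD`/`TPM` options come from this SDK):

```typescript
// Hypothetical plan tiers; adjust names and numbers to your pricing model.
type Limits = { RPM?: number; RPD?: number; TPM?: number };

const TIER_LIMITS: Record<"free" | "pro" | "enterprise", Limits> = {
  free: { RPM: 10, RPD: 200 },
  pro: { RPM: 100, RPD: 5_000 },
  enterprise: { RPM: 1_000, RPD: 100_000 },
};

function limitsForTier(tier: keyof typeof TIER_LIMITS): Limits {
  return TIER_LIMITS[tier];
}

// Usage: const ratelimit = new RatelimitAI(limitsForTier(user.tier));
```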

### Tokens Per Minute (TPM)

A specialized limit that tracks total token usage per minute, which is essential for LLM API cost management:

- Counts both input (prompt) and output (completion) tokens
- Provides more granular cost control than request-based limits
- Aligns with LLM provider pricing models

```typescript Example
const ratelimit = new RatelimitAI({
TPM: 5000, // 5000 tokens per minute
});
```
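Because TPM counts prompt and completion tokens together, each call charges their sum against the per-minute budget. A small illustration; the `usage` shape mirrors what LLM APIs typically report and is not part of this SDK:

```typescript
// Illustrative: how a TPM budget accounts for a single call's tokens.
interface TokenUsage {
  promptTokens: number;     // input (prompt) tokens
  completionTokens: number; // output (completion) tokens
}

// Tokens one call charges against the TPM budget.
function totalTokens(usage: TokenUsage): number {
  return usage.promptTokens + usage.completionTokens;
}

// Whether a call still fits into the remaining budget for this minute.
function fitsInBudget(usage: TokenUsage, usedThisMinute: number, tpm: number): boolean {
  return usedThisMinute + totalTokens(usage) <= tpm;
}
```

For example, with `TPM: 5000`, a call using a 120-token prompt and an 80-token completion consumes 200 tokens of the budget.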

## Request Scheduling

When rate limits are hit, Ratelimit AI can automatically schedule requests for later processing.

```typescript Example
const ratelimit = new RatelimitAI({
RPM: 100,
RPD: 1000,
callback: "https://api.example.com/retry"
});
```
<Note>
This feature requires the `QSTASH_TOKEN` environment variable to be set.
</Note>


When a request hits the rate limit:

1. The request is automatically scheduled in QStash.
2. QStash waits until the rate limit resets.
3. The request is executed (the prompt is sent to the LLM API) and the response (completion) is delivered to your callback URL.
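Your callback endpoint receives the completion once QStash has executed the scheduled request. The payload shape below is an assumption for illustration, not a documented contract; check the actual delivery format before relying on it:

```typescript
// Hypothetical callback payload; field names are illustrative only.
interface RetryCallbackBody {
  identifier: string; // the rate-limited user
  completion: string; // the LLM response text
}

// Framework-agnostic parsing/validation; wire this into your HTTP route.
function handleRetryCallback(rawBody: string): RetryCallbackBody {
  const body = JSON.parse(rawBody) as Partial<RetryCallbackBody>;
  if (typeof body.identifier !== "string" || typeof body.completion !== "string") {
    throw new Error("unexpected callback payload");
  }
  return body as RetryCallbackBody;
}
```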

## Analytics

RatelimitAI can collect analytics about your rate limit usage. Analytics tracking is disabled by default and can be enabled during initialization:

```typescript Example
const ratelimit = new RatelimitAI({
// ...
analytics: true
});
```

When analytics is enabled, RatelimitAI will collect information about the number of requests made, rate limit successes, and failures. This data can be viewed in the [Upstash Console](https://console.upstash.com).

### Dashboard

The Upstash Console provides a Rate Limit Analytics dashboard where you can monitor your usage. Access it by clicking the three dots menu in your Redis database page and selecting **Rate Limit Analytics**.

The dashboard displays three main categories of requests: allowed requests (successful API calls), rate-limited requests (identifiers that hit a limit), and denied requests (blocked API calls). You can view this data over time and see usage patterns for each rate limit type.

<Note>
If you've configured RatelimitAI with a custom prefix, enter the same prefix in the dashboard's top left corner to filter your analytics data.
</Note>

For each rate-limited request, the analytics system records the identifier, timestamp, limit type (RPM/RPD/TPM), and status. For token-based limits, it also tracks the number of tokens used. This information helps you understand your API usage patterns and optimize your rate limit configurations.
89 changes: 89 additions & 0 deletions redis/sdks/ratelimit-ai/getting-started.mdx
@@ -0,0 +1,89 @@
---
title: Getting Started
---

## Steps
<Steps>
<Step title="Create Redis Instance">
    For rate limiting to work, Ratelimit AI needs an Upstash Redis database and its
    credentials. To create one, follow the [Upstash Redis
    "Get Started" guide](https://upstash.com/docs/redis/overall/getstarted).
</Step>

<Step title="Installation">
    First, we need to install `@upstash/ratelimit-ai`. You can install the package with npm, pnpm, or bun from your terminal.
<CodeGroup>
```bash npm
npm install @upstash/ratelimit-ai
```

```bash pnpm
pnpm add @upstash/ratelimit-ai
```

```bash bun
bun add @upstash/ratelimit-ai
```
</CodeGroup>
</Step>

<Step title="Environment Variables">
Set up your environment variables. You can find these in your [Upstash Console](https://console.upstash.com/).

```bash .env
UPSTASH_REDIS_REST_URL=****
UPSTASH_REDIS_REST_TOKEN=****
# Optional: For request scheduling
QSTASH_TOKEN=****
```
</Step>

<Step title="Initialize RatelimitAI">
Create a new instance of RatelimitAI with your desired configuration:
```typescript
import { fetchRatelimitAI, RatelimitAI } from "@upstash/ratelimit-ai/openai";
import { createOpenAI } from '@ai-sdk/openai';

const ratelimit = new RatelimitAI({
RPM: 100, // Requests per minute
RPD: 1000, // Requests per day
TPM: 5000, // Tokens per minute
analytics: true, // Optional: Enable analytics
});

const userid = "user1";
const openai = createOpenAI({
fetch: fetchRatelimitAI(userid, ratelimit)
});
```

RatelimitAI will automatically:
- Track request counts (RPM, RPD)
- Track token counts (TPM)
- Handle rate limiting
- Schedule requests if configured
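With the wrapped `fetch` in place, calls made through the Vercel AI SDK are rate limited per user transparently. A sketch of a call site, assuming valid OpenAI credentials (the model id is illustrative):

```typescript
import { generateText } from "ai";
import { createOpenAI } from "@ai-sdk/openai";
import { fetchRatelimitAI, RatelimitAI } from "@upstash/ratelimit-ai/openai";

const ratelimit = new RatelimitAI({ RPM: 100, RPD: 1000, TPM: 5000 });

// Every request made by this client is checked against the limits first.
const openai = createOpenAI({
  fetch: fetchRatelimitAI("user1", ratelimit),
});

const { text } = await generateText({
  model: openai("gpt-4o-mini"), // illustrative model id
  prompt: "Write a haiku about rate limits.",
});
console.log(text);
```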
</Step>

<Step title="Optional: Enable Request Scheduling">
If you want to automatically retry rate-limited requests, add QStash integration:

```typescript
const ratelimit = new RatelimitAI({
RPM: 100,
RPD: 1000,
TPM: 5000,
callback: "https://api.example.com/retry", // Your callback URL
analytics: true,
});
```

When a request is rate-limited, RatelimitAI will schedule a retry using QStash. You can then handle the retry in your backend.

</Step>
</Steps>

## Next Steps
Now that you have RatelimitAI set up, you can:
- Learn about available features in the [Features](./features) section
- See integration examples in the [Examples](./examples) section
- Monitor your usage in the [Upstash Console](https://console.upstash.com/ratelimit)
72 changes: 72 additions & 0 deletions redis/sdks/ratelimit-ai/overview.mdx
@@ -0,0 +1,72 @@
---
title: Overview
---

# Upstash Ratelimit AI

Upstash Ratelimit AI is a specialized rate-limiting library for Large Language Model (LLM) API providers (OpenAI, Anthropic, Azure, etc.), built on top of Upstash Redis. This connectionless library provides token-aware rate limiting with built-in support for request scheduling and analytics.

## Quick Links

<CardGroup cols={3}>

<Card
title="GitHub Repository"
icon="github"
href="https://upstash.com/docs/redis/sdks/ratelimit-ts/features#caching"
>
Have a look at the source code
</Card>


<Card
title="Getting Started"
icon="flag-checkered"
href="./getting-started"
>
Start using Ratelimit AI
</Card>

<Card
title="Features"
icon="wand-magic-sparkles"
href="./features"
>
See the things you can do with Ratelimit AI
</Card>
</CardGroup>

## Features

<CardGroup cols={2}>
<Card title="Multiple Rate Limits" icon="chart-line" href="#">
Support for RPM (Requests Per Minute), RPD (Requests Per Day), and TPM
(Tokens Per Minute)
</Card>
<Card title="Token Counting" icon="calculator" href="#">
Automatic token counting for both prompts and responses
</Card>
<Card title="Analytics" icon="chart-bar" href="#">
Built-in analytics support with Upstash Console integration
</Card>
<Card title="Request Scheduling" icon="clock" href="#">
Automatic request scheduling with QStash when rate limits are hit
</Card>
<Card title="Serverless First" icon="cloud" href="#">
Designed for serverless environments including Edge functions
</Card>
<Card title="Easy Integration" icon="puzzle-piece" href="#">
Simple integration with popular LLM providers and AI SDKs
</Card>
</CardGroup>

## Examples

<CardGroup cols={2}>
<Card title="OpenAI" href="#">
Rate limit OpenAI API calls with token tracking on Vercel AI SDK
</Card>
<Card title="Anthropic" href="#">
Rate limit Anthropic Claude API with request scheduling on Vercel AI SDK
</Card>
</CardGroup>