
Rate Limits

The Kanva API implements rate limiting to ensure fair usage and system stability.

Current Limits

| Tier | Requests/Minute | Concurrent Jobs |
|---|---|---|
| Free | 60 | 5 |
| Standard | 300 | 20 |
| Enterprise | Custom | Custom |
> **Info:** These limits are subject to change. Contact support for the current limits on your account.

Rate Limit Headers

Every API response includes rate limit information:

```
X-RateLimit-Limit: 300
X-RateLimit-Remaining: 299
X-RateLimit-Reset: 1705312800
```
| Header | Description |
|---|---|
| `X-RateLimit-Limit` | Maximum requests allowed per window |
| `X-RateLimit-Remaining` | Requests remaining in the current window |
| `X-RateLimit-Reset` | Unix timestamp (seconds) when the window resets |
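These headers let you throttle proactively rather than waiting for a 429. A small helper (a sketch, not part of any official SDK) that turns the reset timestamp into a wait duration:

```python
import time

def seconds_until_reset(headers):
    """Seconds until the current rate-limit window resets (0 if already past)."""
    reset = int(headers.get("X-RateLimit-Reset", 0))
    return max(0.0, reset - time.time())
```

Pass it `response.headers` from any API response; a return value of 0 means the window has already rolled over.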

Handling 429 Errors

When you exceed the rate limit, you receive a `429 Too Many Requests` response:

```json
{
  "success": false,
  "error": "Rate limit exceeded. Please retry after 45 seconds."
}
```

Check the `Retry-After` header to see how long to wait before retrying:

```python
import time

import requests

response = requests.post(url, ...)

if response.status_code == 429:
    retry_after = int(response.headers.get("Retry-After", 60))
    print(f"Rate limited. Waiting {retry_after} seconds...")
    time.sleep(retry_after)
    # Retry the request
```

Best Practices

Implement Exponential Backoff

```python
import random
import time

def request_with_backoff(func, max_retries=5):
    for attempt in range(max_retries):
        response = func()

        if response.status_code != 429:
            return response

        # Exponential backoff with jitter
        wait = (2 ** attempt) + random.uniform(0, 1)
        time.sleep(wait)

    raise Exception("Max retries exceeded")
```
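With this schedule the base delay doubles on each attempt (1, 2, 4, 8, 16 seconds for `max_retries=5`), and the random jitter spreads out clients that were rate-limited at the same moment. The waits can be previewed without sleeping:

```python
import random

# Base delays double each attempt; jitter adds up to 1 extra second.
waits = [(2 ** attempt) + random.uniform(0, 1) for attempt in range(5)]
```

In the worst case the five attempts wait roughly 31–36 seconds in total before giving up.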

Use Batch Predictions

Instead of many single-item requests, batch multiple predictions:

```python
# Bad: 100 separate requests
for item in items:
    predict({"feature": [item]})

# Good: 1 batch request
predict({"feature": items})
```

Cache Results

If you frequently predict the same inputs, cache the results:

```python
from functools import lru_cache

@lru_cache(maxsize=1000)
def predict_cached(feature_tuple):
    return predict({"feature": list(feature_tuple)})
```
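Note that `lru_cache` requires hashable arguments, which is why the features are passed as a tuple rather than a list. A self-contained sketch of the effect (the `predict` stub below stands in for the real API call):

```python
from functools import lru_cache

call_count = 0

def predict(payload):
    """Stand-in for the real API call; counts how often it is hit."""
    global call_count
    call_count += 1
    return {"prediction": sum(payload["feature"])}

@lru_cache(maxsize=1000)
def predict_cached(feature_tuple):
    return predict({"feature": list(feature_tuple)})

predict_cached((1.0, 2.0))
predict_cached((1.0, 2.0))  # served from the cache; predict() is not called again
```

Each cache hit saves one request against your rate limit.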

Monitor Usage

Track your API usage to stay within limits:

```python
import time

class RateLimitTracker:
    def __init__(self):
        self.remaining = None
        self.reset_time = None

    def update(self, response):
        self.remaining = int(response.headers.get("X-RateLimit-Remaining", 0))
        self.reset_time = int(response.headers.get("X-RateLimit-Reset", 0))

    def should_wait(self, threshold=10):
        if self.remaining is not None and self.remaining < threshold:
            wait_time = self.reset_time - time.time()
            if wait_time > 0:
                return wait_time
        return 0
```
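You can exercise the tracker without live traffic by driving it with a stubbed response object. The snippet below repeats the class so it runs standalone; `FakeResponse` is illustrative scaffolding, and only the `X-RateLimit-*` headers come from the API:

```python
import time

class RateLimitTracker:
    def __init__(self):
        self.remaining = None
        self.reset_time = None

    def update(self, response):
        self.remaining = int(response.headers.get("X-RateLimit-Remaining", 0))
        self.reset_time = int(response.headers.get("X-RateLimit-Reset", 0))

    def should_wait(self, threshold=10):
        if self.remaining is not None and self.remaining < threshold:
            wait_time = self.reset_time - time.time()
            if wait_time > 0:
                return wait_time
        return 0

class FakeResponse:
    """Stub carrying only the headers the tracker reads."""
    def __init__(self, headers):
        self.headers = headers

tracker = RateLimitTracker()
tracker.update(FakeResponse({
    "X-RateLimit-Remaining": "3",
    "X-RateLimit-Reset": str(int(time.time()) + 30),
}))
wait = tracker.should_wait()  # remaining is below the threshold of 10, so wait > 0
```

In real code, call `tracker.update(response)` after every request and sleep for `tracker.should_wait()` seconds before the next one.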

Increasing Limits

For higher limits, contact Human-Driven AI sales to discuss enterprise plans.