
Screenshot API Rate Limits: Understanding and Optimizing Usage

Navigate API rate limits effectively. Maximize throughput while staying within limits through smart batching and caching.

Asad Ali · October 25, 2025 · 6 min read

Rate limits protect API infrastructure and ensure fair usage across customers. Understanding how rate limits work helps you maximize throughput while avoiding errors.

How Rate Limits Work

Common Limit Types

  • Requests per second: short-term burst limit (e.g., 10 req/sec)
  • Requests per minute: medium-term limit (e.g., 100 req/min)
  • Requests per day: quota-based limit (e.g., 5000 req/day)
  • Concurrent requests: simultaneous connections (e.g., 5 concurrent)
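Several of these windows usually apply at once, so a client has to check all of them before sending. A minimal sketch of tracking multiple limit windows client-side (the limits shown are the example values from the list above, not real API defaults):

```javascript
// Tracks request timestamps and checks every configured window before sending.
class MultiWindowLimiter {
  constructor(limits) {
    // limits: array of { windowMs, max }, e.g. per-second and per-minute caps
    this.limits = limits;
    this.timestamps = [];
  }

  tryAcquire(now = Date.now()) {
    // Drop timestamps older than the longest window
    const longest = Math.max(...this.limits.map((l) => l.windowMs));
    this.timestamps = this.timestamps.filter((t) => now - t < longest);

    // Allow the request only if every window still has room
    const allowed = this.limits.every(
      ({ windowMs, max }) =>
        this.timestamps.filter((t) => now - t < windowMs).length < max
    );
    if (allowed) this.timestamps.push(now);
    return allowed;
  }
}

const limiter = new MultiWindowLimiter([
  { windowMs: 1000, max: 10 },    // 10 req/sec
  { windowMs: 60 * 1000, max: 100 }, // 100 req/min
]);
```

A request is sent only when `tryAcquire()` returns true; otherwise the caller waits and retries.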

Rate Limit Headers

Most APIs return rate limit information in headers:

HTTP/1.1 200 OK
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1699574400
Retry-After: 60

Headers explained:

  • X-RateLimit-Limit: Maximum requests allowed
  • X-RateLimit-Remaining: Requests left in window
  • X-RateLimit-Reset: Unix timestamp when limit resets
  • Retry-After: Seconds to wait (on 429 response)

Reading Rate Limit Headers

// Minimal error type carrying the server's requested wait time
class RateLimitError extends Error {
  constructor(message, retryAfter) {
    super(message);
    this.name = 'RateLimitError';
    this.status = 429;
    this.retryAfter = retryAfter; // seconds
  }
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function captureWithRateLimitAwareness(url) {
  const response = await fetch(API_URL, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ url }),
  });
  
  // Extract rate limit info
  const rateLimit = {
    limit: parseInt(response.headers.get('X-RateLimit-Limit') || '0', 10),
    remaining: parseInt(response.headers.get('X-RateLimit-Remaining') || '0', 10),
    reset: parseInt(response.headers.get('X-RateLimit-Reset') || '0', 10),
  };
  
  console.log(`Rate limit: ${rateLimit.remaining}/${rateLimit.limit} remaining`);
  
  if (response.status === 429) {
    const retryAfter = parseInt(response.headers.get('Retry-After') || '60', 10);
    throw new RateLimitError(`Rate limited. Retry after ${retryAfter}s`, retryAfter);
  }
  
  return {
    data: await response.arrayBuffer(),
    rateLimit,
  };
}

Handling 429 Errors

Basic Retry

async function captureWithRetry(url, maxRetries = 3) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await captureScreenshot(url);
    } catch (error) {
      if (error instanceof RateLimitError && attempt < maxRetries) {
        console.log(`Rate limited, waiting ${error.retryAfter}s...`);
        await sleep(error.retryAfter * 1000);
        continue;
      }
      throw error;
    }
  }
}

Exponential Backoff

async function captureWithBackoff(url) {
  const maxRetries = 5;
  const baseDelay = 1000; // 1 second
  
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await captureScreenshot(url);
    } catch (error) {
      if (error.status === 429) {
        // Use the Retry-After header if available (seconds -> ms),
        // otherwise fall back to exponential backoff capped at 60s
        const delayMs = error.retryAfter
          ? error.retryAfter * 1000
          : Math.min(baseDelay * Math.pow(2, attempt), 60000);
        
        console.log(`Attempt ${attempt + 1} failed, retrying in ${delayMs}ms`);
        await sleep(delayMs);
      } else {
        throw error;
      }
    }
  }
  
  throw new Error('Max retries exceeded');
}

Jittered Backoff

Add random jitter to prevent a thundering herd, where many clients retry at the same instant:

function getJitteredDelay(attempt, baseDelay = 1000) {
  const exponentialDelay = baseDelay * Math.pow(2, attempt);
  const jitter = Math.random() * baseDelay;
  return Math.min(exponentialDelay + jitter, 120000); // Max 2 minutes
}
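The jittered delay slots into the same retry loop as before. A minimal sketch, with `getJitteredDelay` repeated so the snippet stands alone and `captureScreenshot` assumed as elsewhere in this guide:

```javascript
function getJitteredDelay(attempt, baseDelay = 1000) {
  const exponentialDelay = baseDelay * Math.pow(2, attempt);
  const jitter = Math.random() * baseDelay;
  return Math.min(exponentialDelay + jitter, 120000); // Max 2 minutes
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function captureWithJitteredBackoff(url, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await captureScreenshot(url); // assumed capture helper
    } catch (error) {
      // Rethrow non-rate-limit errors and the final failed attempt
      if (error.status !== 429 || attempt === maxRetries - 1) throw error;
      await sleep(getJitteredDelay(attempt));
    }
  }
}
```

Because each client adds its own random offset, retries spread out over time instead of arriving in synchronized waves.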

Proactive Rate Limiting

Client-Side Rate Limiter

Prevent hitting limits:

import Bottleneck from 'bottleneck';

const limiter = new Bottleneck({
  maxConcurrent: 5,      // Max simultaneous requests
  minTime: 100,          // Min 100ms between requests (10/sec)
  reservoir: 100,        // Max 100 requests
  reservoirRefreshAmount: 100,
  reservoirRefreshInterval: 60 * 1000, // Per minute
});

async function rateLimitedCapture(url) {
  return limiter.schedule(() => captureScreenshot(url));
}

Token Bucket Implementation

class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillRate = refillRate; // tokens per second
    this.lastRefill = Date.now();
  }
  
  refill() {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate);
    this.lastRefill = now;
  }
  
  async acquire() {
    this.refill();
    
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    
    // Calculate wait time
    const waitTime = (1 - this.tokens) / this.refillRate * 1000;
    await sleep(waitTime);
    return this.acquire();
  }
}

const bucket = new TokenBucket(10, 10); // 10 capacity, 10/sec refill

async function throttledCapture(url) {
  await bucket.acquire();
  return captureScreenshot(url);
}

Batch Processing Strategies

Sequential with Delays

async function captureSequentially(urls, delayMs = 100) {
  const results = [];
  
  for (const url of urls) {
    results.push(await captureScreenshot(url));
    await sleep(delayMs);
  }
  
  return results;
}

Controlled Concurrency

import pLimit from 'p-limit';

async function captureWithConcurrency(urls, concurrency = 5) {
  const limit = pLimit(concurrency);
  
  return Promise.all(
    urls.map(url => limit(() => captureScreenshot(url)))
  );
}

Adaptive Rate Limiting

Adjust speed based on rate limit headers:

class AdaptiveRateLimiter {
  constructor() {
    this.minDelay = 50;
    this.currentDelay = 100;
    this.maxDelay = 2000;
  }
  
  async capture(url) {
    await sleep(this.currentDelay);
    
    try {
      const result = await captureWithRateLimitAwareness(url);
      this.adjustDelay(result.rateLimit);
      return result.data;
    } catch (error) {
      if (error instanceof RateLimitError) {
        this.currentDelay = Math.min(this.currentDelay * 2, this.maxDelay);
      }
      throw error;
    }
  }
  
  adjustDelay(rateLimit) {
    const utilizationPercent = 
      ((rateLimit.limit - rateLimit.remaining) / rateLimit.limit) * 100;
    
    if (utilizationPercent > 80) {
      // Slow down
      this.currentDelay = Math.min(this.currentDelay * 1.5, this.maxDelay);
    } else if (utilizationPercent < 50 && this.currentDelay > this.minDelay) {
      // Speed up
      this.currentDelay = Math.max(this.currentDelay * 0.8, this.minDelay);
    }
    
    console.log(`Adjusted delay to ${Math.round(this.currentDelay)}ms (${utilizationPercent.toFixed(0)}% utilized)`);
  }
}

Caching to Reduce API Calls

Cache Layer

const cache = new Map();
const CACHE_TTL = 60 * 60 * 1000; // 1 hour

async function cachedCapture(url, options = {}) {
  const cacheKey = `${url}:${JSON.stringify(options)}`;
  const cached = cache.get(cacheKey);
  
  if (cached && Date.now() < cached.expiresAt) {
    console.log('Cache hit:', url);
    return cached.data;
  }
  
  const data = await captureScreenshot(url, options);
  
  cache.set(cacheKey, {
    data,
    expiresAt: Date.now() + CACHE_TTL,
  });
  
  return data;
}

Redis Caching for Distributed Systems

import Redis from 'ioredis';
import { createHash } from 'crypto';

const redis = new Redis();

// Hash the URL + options into a short, stable cache key
const md5 = (input) => createHash('md5').update(input).digest('hex');

async function redisCachedCapture(url, options = {}) {
  const cacheKey = `screenshot:${md5(url + JSON.stringify(options))}`;
  
  const cached = await redis.getBuffer(cacheKey);
  if (cached) {
    return cached;
  }
  
  const data = await captureScreenshot(url, options);
  await redis.setex(cacheKey, 3600, data); // 1 hour TTL
  
  return data;
}

Monitoring Usage

Track Rate Limit Status

const rateLimitMetrics = {
  requests: 0,
  rateLimited: 0,
  lastRateLimit: null,
};

async function trackedCapture(url) {
  rateLimitMetrics.requests++;
  
  try {
    const result = await captureWithRateLimitAwareness(url);
    
    // Log when getting close to limit
    if (result.rateLimit.remaining < 10) {
      console.warn(`Low rate limit: ${result.rateLimit.remaining} remaining`);
    }
    
    return result.data;
  } catch (error) {
    if (error instanceof RateLimitError) {
      rateLimitMetrics.rateLimited++;
      rateLimitMetrics.lastRateLimit = new Date();
    }
    throw error;
  }
}

// Report metrics (skip until the first request to avoid dividing by zero)
setInterval(() => {
  if (rateLimitMetrics.requests === 0) return;
  const hitRate = rateLimitMetrics.rateLimited / rateLimitMetrics.requests * 100;
  console.log(`Rate limit hit rate: ${hitRate.toFixed(2)}%`);
}, 60000);

Dashboard Metrics

app.get('/api/metrics/rate-limits', (req, res) => {
  const { requests, rateLimited, lastRateLimit } = rateLimitMetrics;
  res.json({
    totalRequests: requests,
    rateLimited,
    hitRate: requests ? (rateLimited / requests * 100).toFixed(2) : '0.00',
    lastRateLimit,
  });
});

Best Practices

1. Respect Rate Limits

Always honor Retry-After headers:

if (response.status === 429) {
  const retryAfter = parseInt(response.headers.get('Retry-After') || '60', 10);
  await sleep(retryAfter * 1000);
}

2. Implement Circuit Breaker

Stop requests when repeatedly rate limited:

class CircuitBreaker {
  constructor(threshold = 5, timeout = 60000) {
    this.failures = 0;
    this.threshold = threshold;
    this.timeout = timeout;
    this.state = 'closed';
    this.nextAttempt = 0;
  }
  
  async execute(fn) {
    if (this.state === 'open' && Date.now() < this.nextAttempt) {
      throw new Error('Circuit breaker open');
    }
    
    try {
      const result = await fn();
      this.failures = 0;
      this.state = 'closed';
      return result;
    } catch (error) {
      if (error.status === 429) {
        this.failures++;
        if (this.failures >= this.threshold) {
          this.state = 'open';
          this.nextAttempt = Date.now() + this.timeout;
        }
      }
      throw error;
    }
  }
}
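As a usage sketch, the breaker wraps each capture so that after repeated 429s the client fails fast instead of hammering the API. The class is repeated here so the snippet stands alone; `captureScreenshot` is assumed as elsewhere in this guide:

```javascript
class CircuitBreaker {
  constructor(threshold = 5, timeout = 60000) {
    this.failures = 0;
    this.threshold = threshold;
    this.timeout = timeout;
    this.state = 'closed';
    this.nextAttempt = 0;
  }

  async execute(fn) {
    if (this.state === 'open' && Date.now() < this.nextAttempt) {
      throw new Error('Circuit breaker open');
    }

    try {
      const result = await fn();
      this.failures = 0;
      this.state = 'closed';
      return result;
    } catch (error) {
      if (error.status === 429) {
        this.failures++;
        if (this.failures >= this.threshold) {
          this.state = 'open';
          this.nextAttempt = Date.now() + this.timeout;
        }
      }
      throw error;
    }
  }
}

const breaker = new CircuitBreaker(5, 60000);

async function guardedCapture(url) {
  // Rejects immediately with 'Circuit breaker open' while the breaker is tripped
  return breaker.execute(() => captureScreenshot(url));
}
```

After the timeout elapses, the next call is allowed through; a success closes the breaker again.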

3. Pre-calculate Capacity

Plan batch jobs around limits:

const DAILY_LIMIT = 5000;
const urls = getUrlsToCapture(); // e.g., 10000 URLs

if (urls.length > DAILY_LIMIT) {
  console.log(`Warning: ${urls.length} URLs exceeds daily limit of ${DAILY_LIMIT}`);
  console.log(`Will take ${Math.ceil(urls.length / DAILY_LIMIT)} days`);
}

Conclusion

Effective rate limit management involves:

  1. Reading headers - Track remaining quota
  2. Retry with backoff - Handle 429s gracefully
  3. Proactive limiting - Stay under limits
  4. Caching - Reduce duplicate requests
  5. Monitoring - Track usage patterns

With these strategies, you'll maximize API throughput while maintaining reliable service.


Ready for high-volume screenshot automation?

Get your free API key → 100 free screenshots to get started.

See also: Batch Processing Guide →

rate limits
optimization
api
performance

About the Author

Asad Ali

Full-Stack Developer and Founder of ZTabs with 8+ years of experience building scalable web applications and APIs. Specializes in performance optimization, SaaS development, and modern web technologies.

Credentials: Founder & CEO at ZTabs, Full-Stack Developer, Expert in Next.js, React, Node.js, and API optimization

Ready to capture your first screenshot?

Get started with 100 free screenshots. No credit card required.
