Notion API Rate Limits 2026: Complete Guide with Retry Strategies

Matthew Diakonov · 9 min read

The Notion API enforces rate limits per integration token, not per workspace or per user. If you are building an integration that syncs databases, processes pages in bulk, or reacts to changes in real time, understanding these limits is the difference between a smooth integration and one that chokes under load.

This guide covers every rate limit that applies to the Notion API in 2026, explains the token bucket mechanism Notion uses, and provides production-tested retry strategies with code examples.

Current Rate Limit Structure

Notion updated its rate limit headers and behavior with the 2026-04-01 API version. The core limits remain the same, but the response format changed.

| Limit | Value | Applies To |
|---|---|---|
| Sustained request rate | 3 requests/second per integration | All REST endpoints |
| Burst allowance | Up to 10 requests/second | Short bursts before the throttle kicks in |
| Rate limit response | HTTP 429 with Retry-After header | Returned when the limit is exceeded |
| Search endpoint | ~1 request/second effective | Stricter internal throttle |
| Pagination cost | 1 request per page of results | 500 rows at 100/page = 5 requests |
| Webhook subscriptions | 50 per integration | No per-second limit on inbound deliveries |
| Bulk operations | 100 items per request | Available since the 2026-02-01 API version |

The 3 requests/second sustained rate is the limit that matters most. Every API call, whether it reads a single page property or queries an entire database, counts as one request against this budget.

How the Token Bucket Works

Notion uses a token bucket algorithm rather than a fixed window. This distinction matters because it affects how you should time your requests.

[Diagram: a token bucket with a maximum of 10 tokens, refilling at 3 tokens/second and draining 1 token per request. Bucket full: requests OK. Bucket low: slow down. Bucket empty: HTTP 429.]

Here is how the bucket behaves:

  • Capacity: 10 tokens (the burst limit)
  • Refill rate: 3 tokens per second
  • Cost per request: 1 token
  • When empty: Notion returns HTTP 429 with a Retry-After header

If your integration is idle for a few seconds, the bucket fills to 10. You can then send 10 requests in quick succession before hitting the limit. After the burst, you need to sustain at or below 3 requests per second.
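The bucket behavior can be modeled in a few lines. This is an illustrative simulation of the parameters above (capacity 10, 3 tokens/second refill, 1 token per request), not Notion's actual implementation:

```typescript
// Simulates the token bucket described above. Timestamps are passed in
// explicitly so the refill math is deterministic and easy to test.
class TokenBucket {
  private tokens: number;
  private lastRefill = 0;

  constructor(
    private capacity = 10, // burst limit
    private refillPerSec = 3, // sustained rate
  ) {
    this.tokens = capacity;
  }

  // Try to take one token at time `nowMs`; returns false when the bucket
  // is empty (the point at which Notion would return HTTP 429).
  tryConsume(nowMs: number): boolean {
    const elapsedSec = (nowMs - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSec * this.refillPerSec,
    );
    this.lastRefill = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Running this model reproduces the behavior described above: a full bucket absorbs a 10-request burst, the 11th immediate request is rejected, and one second of idling buys back 3 requests.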

Rate Limit Headers (2026 Format)

Starting with the 2026-04-01 API version, Notion returns standard rate limit headers on every response:

X-RateLimit-Limit: 3
X-RateLimit-Remaining: 2
X-RateLimit-Reset: 1713024001
Retry-After: 1

The X-RateLimit-Remaining header tells you how many tokens are left in your bucket. The X-RateLimit-Reset header gives you a Unix timestamp for when the bucket will have at least one token available. Previous API versions only returned Retry-After on 429 responses.
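A small helper can turn these headers into a wait time before the next call. The header names follow the 2026 format described above; treat them as subject to change:

```typescript
// Derive how long to pause from X-RateLimit-Remaining and
// X-RateLimit-Reset (a Unix timestamp in seconds).
function msUntilReset(headers: Headers, nowMs: number = Date.now()): number {
  const remaining = Number(headers.get("X-RateLimit-Remaining") ?? "1");
  if (remaining > 0) return 0; // tokens left in the bucket, no need to wait

  const resetSec = Number(headers.get("X-RateLimit-Reset") ?? "0");
  return Math.max(0, resetSec * 1000 - nowMs);
}
```

Calling `msUntilReset(response.headers)` after each request lets you pause proactively instead of waiting for a 429.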

Per-Endpoint Behavior

Not all endpoints consume rate limit tokens equally in practice. While each call costs exactly one token, some endpoints are more likely to chain into multiple calls due to pagination or nested data fetching.

| Endpoint | Typical Token Cost | Why |
|---|---|---|
| GET /v1/pages/{id} | 1 | Single page, no pagination |
| POST /v1/databases/{id}/query | 1 to 10+ | Depends on result count and page size |
| POST /v1/search | 1 to 5 | Pagination plus stricter internal throttle |
| GET /v1/blocks/{id}/children | 1 to 20+ | Deep pages with many nested blocks |
| PATCH /v1/pages/{id} | 1 | Single write |
| POST /v1/pages | 1 | Single create |
| POST /v1/bulk/pages | 1 | Up to 100 pages in one call (since Feb 2026) |
| GET /v1/views/{id} | 1 | Views API (since Apr 2026) |

The bulk operations endpoint (POST /v1/bulk/pages) is the most efficient way to stay under rate limits when writing multiple pages. One request handles up to 100 pages, compared to 100 individual PATCH calls.
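A bulk write then reduces to splitting the work into batches of 100. The endpoint URL and payload shape below follow this article's description of POST /v1/bulk/pages; verify them against the current API reference before relying on them:

```typescript
// Split an array into batches of at most `size` items.
function chunk<T>(items: T[], size = 100): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// One request per 100 pages instead of 100 individual POST calls.
async function bulkCreatePages(pages: object[], token: string): Promise<void> {
  for (const batch of chunk(pages, 100)) {
    await fetch("https://api.notion.com/v1/bulk/pages", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Notion-Version": "2026-02-01",
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ pages: batch }),
    });
  }
}
```

With this batching, 5,000 pages cost 50 requests instead of 5,000 — each batch still costs one token, so the loop stays comfortably under the sustained rate.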

Retry Strategy: Exponential Backoff with Jitter

The simplest reliable retry strategy for Notion's rate limits combines exponential backoff with random jitter. Here is a production-ready implementation:

async function notionRequest(
  url: string,
  options: RequestInit,
  maxRetries = 5
): Promise<Response> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const response = await fetch(url, options);

    // Anything other than 429 (success or a non-rate-limit error) goes
    // straight back to the caller.
    if (response.status !== 429) {
      return response;
    }

    if (attempt === maxRetries) {
      throw new Error(
        `Notion rate limit exceeded after ${maxRetries} retries`
      );
    }

    // Prefer the server-provided Retry-After (in seconds); fall back to
    // exponential backoff if the header is missing or malformed.
    const retryAfterSec = Number(response.headers.get("Retry-After"));
    const baseDelay =
      retryAfterSec > 0
        ? retryAfterSec * 1000
        : Math.pow(2, attempt) * 1000;
    // Random jitter spreads retries from concurrent workers apart.
    const jitter = Math.random() * 500;

    await new Promise((resolve) => setTimeout(resolve, baseDelay + jitter));
  }

  throw new Error("Unreachable");
}

The key decisions in this implementation:

  1. Respect Retry-After: Notion tells you exactly how long to wait. Use it.
  2. Exponential backoff as fallback: If the header is missing, double the wait time each attempt (1s, 2s, 4s, 8s, 16s).
  3. Random jitter: Prevents the thundering herd problem when multiple workers hit the limit simultaneously.

Rate Limit Strategies by Integration Type

Different integration patterns require different approaches to staying within the 3 requests/second limit.

Sync Integrations (Database Mirroring)

If you are syncing a Notion database to an external system, the most common pattern is periodic polling. Here is how to calculate your budget:

Requests per sync = (total_rows / page_size) + 1
Sync time at 3 req/sec = requests / 3

For a 1,000-row database at 100 rows per page:

  • 11 requests per full sync (10 pages + 1 initial query)
  • ~3.7 seconds minimum sync time
  • At 3 syncs per minute, you use 33 of your 180 available requests per minute (18%)
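The budget arithmetic above can be captured in a small helper, following the same formula (pages of results plus one initial query):

```typescript
// Estimate the request count and minimum duration of one full sync.
function syncBudget(totalRows: number, pageSize = 100, reqPerSec = 3) {
  const requests = Math.ceil(totalRows / pageSize) + 1; // pages + initial query
  const minSeconds = requests / reqPerSec; // floor at the sustained rate
  return { requests, minSeconds };
}
```

For the 1,000-row example, `syncBudget(1000)` yields 11 requests and a minimum sync time of about 3.7 seconds, matching the numbers above.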

With the 2026-03-01 webhook support, you can eliminate polling entirely for change detection. Webhooks do not count against your rate limit because Notion pushes events to your endpoint rather than you pulling from theirs.
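Where webhooks are not available, incremental polling with a last_edited_time filter achieves a similar reduction: only rows edited since the previous sync come back. A sketch of the query body, where `lastSyncedAt` is assumed to be the ISO timestamp you stored after the last successful sync:

```typescript
// Build a database query body that returns only rows edited after the
// previous sync, oldest first so the stored cursor can advance safely.
function changedSinceQuery(lastSyncedAt: string) {
  return {
    filter: {
      timestamp: "last_edited_time",
      last_edited_time: { after: lastSyncedAt },
    },
    sorts: [{ timestamp: "last_edited_time", direction: "ascending" }],
  };
}
```

An unchanged database then costs one request per poll (an empty result page) instead of a full re-fetch.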

Bulk Write Integrations (Import/Migration)

When writing large amounts of data into Notion, the 2026-02-01 bulk operations endpoint reduces your request count by up to 100x:

| Approach | Requests for 5,000 pages | Time at 3 req/sec |
|---|---|---|
| Individual POST /v1/pages | 5,000 | ~28 minutes |
| Bulk POST /v1/bulk/pages | 50 | ~17 seconds |

Real-Time Integrations (Chat Bots, Dashboards)

For integrations that need to respond to user actions in real time, the 3 req/sec limit is usually not a problem because user-driven requests are naturally spaced. The risk comes from background processes (syncs, indexing) consuming tokens that the real-time path needs.

Separate your integration tokens: use one token for user-facing requests and a different token for background sync. Each token gets its own 3 req/sec bucket.

Common Mistakes That Trigger Rate Limits

These patterns cause most rate limit issues in Notion integrations:

1. Fetching all block children recursively without throttling. A page with 50 nested blocks and sub-blocks can require 50+ API calls. Add a delay between recursive calls.

2. Polling unchanged databases. If your database has not changed since the last sync, the full query still costs the same number of requests. Use webhooks or the last_edited_time filter to skip unchanged data.

3. Not using pagination cursors. Starting a query from scratch instead of continuing with a cursor means re-fetching pages you have already seen.

4. Running multiple integration instances with the same token. If you deploy your integration to three servers, all three share the same 3 req/sec bucket. Use a centralized rate limiter or separate tokens.

5. Ignoring the search endpoint throttle. The search endpoint has a stricter internal limit (~1 req/sec). Building a search-heavy integration requires caching results locally.
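For mistake 4, a minimal in-process limiter looks like the sketch below: it hands out send slots spaced ~334 ms apart, which keeps everything funneled through it under 3 requests/second. This only works within a single process; coordinating several servers on one token would need shared state (e.g. Redis), which is beyond this sketch:

```typescript
// Serializes requests by reserving evenly spaced send slots.
class RateLimiter {
  private nextSlot = 0;

  constructor(private minGapMs = 334) {} // ~3 requests/second

  // Pure scheduling step: given the current time, reserve the next slot
  // and return how long the caller should wait before firing.
  reserve(nowMs: number): number {
    const start = Math.max(nowMs, this.nextSlot);
    this.nextSlot = start + this.minGapMs;
    return start - nowMs;
  }

  // Wrap any request-making function so it respects the pacing.
  async schedule<T>(fn: () => Promise<T>): Promise<T> {
    const waitMs = this.reserve(Date.now());
    if (waitMs > 0) await new Promise((r) => setTimeout(r, waitMs));
    return fn();
  }
}
```

Usage: create one `RateLimiter` per integration token and route every call through `limiter.schedule(() => notionRequest(...))`.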

Monitoring Your Rate Limit Usage

Track the X-RateLimit-Remaining header on every response to build visibility into your consumption:

function logRateLimit(response: Response, endpoint: string) {
  const remaining = response.headers.get("X-RateLimit-Remaining");
  const limit = response.headers.get("X-RateLimit-Limit");

  if (remaining !== null) {
    console.log(
      `[notion] ${endpoint} - ${remaining}/${limit} tokens remaining`
    );

    if (parseInt(remaining, 10) <= 1) {
      console.warn(`[notion] Rate limit nearly exhausted on ${endpoint}`);
    }
  }
}

Log these values to your monitoring system and set alerts when remaining drops below 2. This gives you early warning before your integration starts receiving 429 responses.

What Happens When You Hit the Limit

When Notion returns HTTP 429, the response body includes a JSON error:

{
  "object": "error",
  "status": 429,
  "code": "rate_limited",
  "message": "Rate limited. Please retry after 1 second.",
  "request_id": "abc123-def456"
}

The Retry-After header is authoritative. Do not retry before the specified time. Repeated violations within a short window can extend the backoff period, and persistent abuse may result in temporary token revocation.

Rate Limits vs. Usage Limits

Rate limits (requests per second) are separate from Notion's workspace usage limits. Free workspaces have content limits (1,000 blocks per page for API-created content), and API integrations on free plans have a monthly request quota. These are not the same as the per-second rate limit.

| Limit Type | Free Plan | Plus Plan | Business/Enterprise |
|---|---|---|---|
| Rate limit (req/sec) | 3 | 3 | 3 |
| Monthly API requests | 10,000 | Unlimited | Unlimited |
| Blocks per API page | 1,000 | Unlimited | Unlimited |
| Webhook subscriptions | 10 | 50 | 50 |

If you are building integrations for customers on free plans, you need to handle both per-second rate limits and monthly quota exhaustion.
