Run a 2M-user API backfill

In this example we:

  • Split 2,000,000 ids into 2,000 chunks.

  • Run up to 1,000 Burla workers.

  • Sleep inside each worker so the global rate stays around 1,000 requests per second (1,000 workers, one call per second each).

  • Stream successful rows and retryable failures to a JSONL file.

A 5,000-id test does not tell you much if it never hits the provider's real limit.

Dataset: user ids to backfill

The input is a plain text file with one user id per line.

import json
import os
import time
from pathlib import Path

import httpx
from burla import remote_parallel_map

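API_KEY = os.environ["API_KEY"]  # provider credential, read from the environment
OUT_PATH = Path("/workspace/shared/api-backfill/users.jsonl")  # JSONL output file
CHUNK = 1_000  # ids per chunk: 2,000,000 / 1,000 = 2,000 chunks
MAX_PARALLELISM = 1_000  # upper bound on live workers
SECONDS_BETWEEN_CALLS_PER_WORKER = 1.0  # 1,000 workers x 1 call/s = about 1,000 requests/s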
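
The one-id-per-line input described above could be loaded with something like the following. This is a minimal sketch: the helper name `load_ids` is hypothetical, and the location of the id file is not given here, so the caller has to supply it.

```python
from pathlib import Path


def load_ids(path: Path) -> list[str]:
    """Read one user id per line, skipping blank lines and stray whitespace."""
    with path.open(encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]
```

Stripping each line keeps a trailing newline or an accidental empty line from turning into a bogus API request later.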

Step 1: Chunk the ids

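With CHUNK = 1_000, the 2,000,000 ids become 2,000 chunks. Each chunk is large enough to amortize worker startup and small enough that its results can be streamed out as soon as it finishes. At one call per second, a single chunk takes at least about 17 minutes.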
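
A minimal sketch of the chunking step, assuming the ids are already in a list. The helper names `chunk_ids` and `num_chunks` are hypothetical, not part of Burla.

```python
def chunk_ids(ids: list[str], size: int = 1_000) -> list[list[str]]:
    """Split ids into consecutive lists of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [ids[i : i + size] for i in range(0, len(ids), size)]


def num_chunks(total: int, size: int) -> int:
    """Number of chunks needed for `total` ids (ceiling division)."""
    return (total + size - 1) // size


print(chunk_ids(["a", "b", "c", "d", "e"], size=2))  # [['a', 'b'], ['c', 'd'], ['e']]
print(num_chunks(2_000_000, 1_000))                  # 2000
```

The last chunk may be shorter than the rest, so the worker should not assume a fixed chunk length.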

Step 2: Put pacing near the HTTP call

The worker owns local pacing and retry behavior. A 429 becomes a paced retry, not a cluster-wide retry storm.
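
One way the worker's pacing and retry loop could look is sketched below. Several pieces are assumptions for illustration: `fetch` is a hypothetical callable that wraps the real `httpx` request and returns `(status_code, payload)`, the set of retryable statuses is a guess at what the provider uses, and the row shape (`ok`, `status`, `retryable`) is invented for this example.

```python
import time
from typing import Any, Callable

# Assumed set of statuses worth retrying; adjust to the provider's documentation.
RETRYABLE = {429, 500, 502, 503, 504}


def backfill_chunk(
    ids: list[str],
    fetch: Callable[[str], tuple[int, Any]],
    pause: float = 1.0,
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> list[dict]:
    """Fetch every id in the chunk, pausing after each call, retries included."""
    rows = []
    for user_id in ids:
        for attempt in range(1, max_attempts + 1):
            status, payload = fetch(user_id)
            sleep(pause)  # pace every call so retries never exceed the per-worker rate
            if status == 200:
                rows.append({"id": user_id, "ok": True, "data": payload})
                break
            retryable = status in RETRYABLE
            if not retryable or attempt == max_attempts:
                # Record the failure instead of raising, so one bad id
                # does not discard the rest of the chunk.
                rows.append(
                    {"id": user_id, "ok": False, "status": status, "retryable": retryable}
                )
                break
    return rows
```

Because the sleep follows every call, a 429 costs one more paced request rather than an immediate burst. Ids that are still failing with a retryable status after `max_attempts` are written out as rows marked `retryable`, so a later pass can pick them up.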

Step 3: Smoke test the real behavior

Use a small run that still exercises real HTTP calls and retry handling.
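
After a smoke run, a quick tally of outcomes shows whether the retry path was exercised at all. The sketch below assumes each output row is a dict with an `ok` flag and, for failures, a `status` code. That schema is hypothetical, so adapt the keys to whatever your worker actually writes.

```python
from collections import Counter


def summarize(rows: list[dict]) -> dict[str, int]:
    """Count successes and failures, grouping failures by HTTP status."""
    counts = Counter()
    for row in rows:
        if row.get("ok"):
            counts["ok"] += 1
        else:
            counts[f"http_{row.get('status')}"] += 1
    return dict(counts)


sample_rows = [
    {"id": "1", "ok": True},
    {"id": "2", "ok": False, "status": 429},
    {"id": "3", "ok": True},
]
print(summarize(sample_rows))  # {'ok': 2, 'http_429': 1}
```

A summary with no `http_429` entries suggests the run never reached the provider's limit, so it says little about how the full backfill will behave.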

Step 4: Cap live workers

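max_parallelism is the global throttle. With at most 1,000 workers alive and each one pausing 1.0 seconds between calls, the fleet sends at most about 1,000 requests per second.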
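
The arithmetic behind the cap, plus one way to stream finished chunks to the JSONL file, can be sketched as follows. The helper names are hypothetical. The commented call at the end is an assumption about how `remote_parallel_map` would be invoked: `worker_fn` and `chunks` are placeholders, and `generator=True` is assumed to yield each chunk's result as it completes, so check the Burla documentation before relying on it.

```python
import json
from pathlib import Path
from typing import Iterable


def max_request_rate(max_parallelism: int, seconds_between_calls: float) -> float:
    """Upper bound on fleet-wide requests per second (ignores request latency)."""
    return max_parallelism / seconds_between_calls


def stream_to_jsonl(results: Iterable[list[dict]], out_path: Path) -> int:
    """Append each finished chunk's rows to out_path, one JSON object per line."""
    written = 0
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("a", encoding="utf-8") as f:
        for rows in results:
            for row in rows:
                f.write(json.dumps(row) + "\n")
                written += 1
            f.flush()  # push each finished chunk to the file promptly
    return written


rate = max_request_rate(1_000, 1.0)
print(rate)              # 1000.0
print(2_000_000 / rate)  # 2000.0 seconds, roughly 33 minutes at best

# Assumed shape of the real call (worker_fn and chunks are placeholders):
#   results = remote_parallel_map(worker_fn, chunks,
#                                 max_parallelism=1_000, generator=True)
#   stream_to_jsonl(results, OUT_PATH)
```

The 2,000-second figure is a floor, not an estimate: request latency and retries both add to it. Writing rows as each chunk arrives means an interrupted run still leaves a usable partial output.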

What's the point?

Rate limits are where toy parallelism lies. A local async script can look great until it turns into a retry storm.

The useful question is: can I finish the whole backfill without breaking the provider's contract? That means chunking, local sleeps, global concurrency, and output streaming. Burla handles the worker fleet; your function still owns the API behavior.