Run any Python function on 1,000 computers in 1 second.

Burla is the simplest way to scale Python. It has only one function: `remote_parallel_map`. It's open-source, works with GPUs and any Docker image, and scales up to 10,000 CPUs in a single call.

Enable anyone to process terabytes of data in minutes.

  • Scalable: See our demo where we process 2.4TB of parquet files in 76s using 10,000 CPUs.

  • Flexible: Run any Python code, in any Docker container, on any hardware, in parallel.

  • Easy to learn: One function, two required arguments. Even beginners can apply Burla in minutes.

Our open-source web platform makes it easy to manage data and monitor long-running pipelines.

How it works:

Burla only has one function: `remote_parallel_map`. When called, it runs the given function on every input in the given list, each on a separate computer.

from burla import remote_parallel_map

my_inputs = [1, 2, 3]

def my_function(my_input):
    print("I'm running on my own separate computer in the cloud!")
    return my_input
    
return_values = remote_parallel_map(my_function, my_inputs)

Running code in the cloud with Burla feels the same as coding locally:

  • Anything you print appears in your local terminal.

  • Exceptions thrown in your code are thrown on your local machine.

  • Responses are quick: you can run a million function calls in a couple of seconds!
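
A quick illustration of that local feel (a minimal sketch; it assumes Burla re-raises the original exception type on your machine):

from burla import remote_parallel_map

def divide_100_by(x):
    print(f"dividing 100 by {x}")  # appears in your local terminal
    return 100 / x

try:
    remote_parallel_map(divide_100_by, [1, 2, 0])
except ZeroDivisionError as e:
    # the error raised in the cloud shows up here, on your machine
    print(f"caught locally: {e}")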

Features:

📦 Automatic Package Sync

Burla clusters automatically (and very quickly) install any missing Python packages into all containers in the cluster.
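
For instance, a function can import a package that isn't pre-installed in the container, and Burla installs it before the run (a minimal sketch; whether `numpy` is actually missing depends on your image):

from burla import remote_parallel_map

def vector_length(vector):
    import numpy as np  # installed into every container automatically if missing
    return float(np.linalg.norm(vector))

lengths = remote_parallel_map(vector_length, [[3, 4], [5, 12]])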

🐋 Custom Containers

Easily run code in any Linux-based Docker container, public or private. Just paste an image URI in the settings, then hit start!

📂 Network Filesystem

Need to get big data into/out of the cluster? Burla automatically mounts a cloud storage bucket to ./shared in every container.
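
For example, every function call can read from and write to that bucket as if it were a local folder (a minimal sketch; the file names are illustrative):

from burla import remote_parallel_map

def save_square(i):
    # ./shared is the cloud storage bucket mounted into every container
    with open(f"./shared/square_{i}.txt", "w") as f:
        f.write(str(i * i))

remote_parallel_map(save_square, list(range(100)))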

⚙️ Variable Hardware Per-Function

The `func_cpu` and `func_ram` args make it possible to assign more hardware to some functions and less to others, unlocking new ways to simplify pipelines and architecture. See the sketch below.
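
For example, a heavier step can request a bigger slice of hardware per call (a minimal sketch; it assumes `func_cpu` is a CPU count and `func_ram` is in GB, consistent with the pipeline example further down):

from burla import remote_parallel_map

def train_model(params):
    ...  # CPU- and memory-hungry training code goes here
    return params

param_grid = [{"max_depth": d} for d in (4, 8, 12)]

# each call to train_model gets 16 CPUs and 64GB of RAM (assumed units)
results = remote_parallel_map(train_model, param_grid, func_cpu=16, func_ram=64)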

Create massive pipelines with plain Python:

Build pipelines that fan in/out over thousands of machines, then aggregate data in one big machine. The network filesystem mounted at `./shared` makes it easy to pass big data between steps.

from burla import remote_parallel_map

# Run `process_file` on many smaller machines
results = remote_parallel_map(process_file, files)

# Combine results on one large machine
result = remote_parallel_map(combine_results, [results], func_ram=256)
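
Here's one way the two steps might hand data off through the network filesystem (a hedged sketch; `process_file`, `combine_results`, and the file layout are hypothetical, not part of Burla's API):

from burla import remote_parallel_map

def process_file(filename):
    # read a raw file from the shared bucket, write a processed copy back
    with open(f"./shared/{filename}") as f:
        processed = f.read().upper()  # stand-in for real processing
    out_path = f"./shared/processed_{filename}"
    with open(out_path, "w") as f:
        f.write(processed)
    return out_path

def combine_results(paths):
    # runs once, on one big machine, with every processed file visible at ./shared
    with open("./shared/combined.txt", "w") as f:
        for path in paths:
            with open(path) as processed_file:
                f.write(processed_file.read())

files = ["a.txt", "b.txt", "c.txt"]
paths = remote_parallel_map(process_file, files)
remote_parallel_map(combine_results, [paths], func_ram=256)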

Quick Demo: Hyper-parameter tuning XGBoost with 1,000 CPUs

Get started now!


Questions? Schedule a call, or email [email protected]. We're always happy to talk.