Burla is the simplest way to scale any data pipeline.
Burla scales to 10,000 CPUs in a single function call, and supports GPUs and custom containers.
Load data in parallel from cloud storage, then write results in parallel from thousands of VMs at once.
This creates a pipeline like:
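The load → process → write flow above can be sketched as below. `remote_parallel_map(fn, inputs)` is Burla's entry point; everything else here is illustrative: a temp directory stands in for the mounted bucket, the file names and uppercase transform are placeholders, and the snippet falls back to a plain local loop when no cluster is reachable.

```python
import tempfile
from pathlib import Path

# A temp dir stands in for the cloud bucket Burla mounts into every
# container (an assumption for this sketch, not the real mount path).
bucket = Path(tempfile.mkdtemp())
(bucket / "inputs").mkdir()
(bucket / "outputs").mkdir()
for i in range(4):
    (bucket / "inputs" / f"part-{i}.txt").write_text(f"record {i}")

def process_file(name: str) -> str:
    # On a real cluster each call runs on its own VM, reading and
    # writing through the shared mount in parallel.
    text = (bucket / "inputs" / name).read_text()
    (bucket / "outputs" / name).write_text(text.upper())  # placeholder transform
    return name

names = sorted(p.name for p in (bucket / "inputs").iterdir())

try:
    from burla import remote_parallel_map  # needs a configured Burla cluster
    done = remote_parallel_map(process_file, names)
except Exception:
    done = [process_file(n) for n in names]  # local fallback keeps the sketch runnable
```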
Monitor progress in the dashboard:
Cancel bad runs, filter logs to watch individual inputs, or monitor output files in the UI.
How it works:
With Burla, running code in the cloud feels the same as coding on your laptop:
When functions are run with remote_parallel_map:
Anything they print appears locally (and inside Burla's dashboard).
Any exceptions they raise are re-raised locally.
Any packages or local modules they use are (very quickly) cloned on remote machines.
Code starts running in under one second, even with millions of inputs or thousands of machines.
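A small sketch of that laptop-like behavior: prints stream back to your terminal and exceptions propagate as if the function had run locally. The `double` function and its inputs are made up, and the snippet falls back to a plain loop when no cluster is reachable.

```python
def double(x: int) -> int:
    print(f"processing {x}")  # shows up in your terminal and in the dashboard
    if x < 0:
        raise ValueError(f"bad input: {x}")  # surfaces locally, mid-run
    return x * 2

inputs = [1, 2, 3]
try:
    from burla import remote_parallel_map  # needs a configured Burla cluster
    doubled = remote_parallel_map(double, inputs)
except Exception:
    doubled = [double(x) for x in inputs]  # local fallback keeps the sketch runnable
```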
Features:
📦 Automatic Package Sync
Burla automatically (and very quickly) clones your Python packages on every remote machine where code is executed.
🐋 Custom Containers
Easily run code in any Docker container.
Public or private, just paste an image URI in the settings, then hit start!
📁 Network Filesystem
Need to get big data into/out of the cluster? Burla automatically mounts a cloud storage bucket to a folder in every container.
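For example, each worker can read one shard straight off the mount and return an aggregate. The mount path, shard layout, and word counting are all illustrative assumptions; a temp directory stands in for the bucket so the sketch runs without a cluster, with a local fallback around the Burla call.

```python
import tempfile
from pathlib import Path

# Hypothetical stand-in for the folder where Burla mounts the bucket.
mount = Path(tempfile.mkdtemp())
for i in range(3):
    (mount / f"shard-{i}.txt").write_text("one two three\n" * (i + 1))

def count_words(shard: str) -> int:
    # Inside a container this read goes straight to cloud storage.
    return len((mount / shard).read_text().split())

shards = [p.name for p in mount.glob("shard-*.txt")]
try:
    from burla import remote_parallel_map  # needs a configured Burla cluster
    counts = remote_parallel_map(count_words, shards)
except Exception:
    counts = [count_words(s) for s in shards]  # local fallback

total = sum(counts)  # order-independent aggregate
```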
⚙️ Variable Hardware Per-Function
The func_cpu and func_ram arguments let you assign big hardware to some functions and less to others.
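A sketch of per-function sizing, assuming `func_cpu` counts CPUs and `func_ram` is in gigabytes (check Burla's docs for the exact units). The workload and the values 16 and 64 are illustrative, and the snippet falls back to a local loop when no cluster is reachable.

```python
def heavy_step(n: int) -> int:
    # Placeholder for a CPU- or memory-hungry computation.
    return sum(range(n))

inputs = [10, 100]
try:
    from burla import remote_parallel_map  # needs a configured Burla cluster
    # Hypothetical sizing: 16 CPUs and 64 GB of RAM per function call.
    results = remote_parallel_map(heavy_step, inputs, func_cpu=16, func_ram=64)
except Exception:
    results = [heavy_step(n) for n in inputs]  # local fallback ignores the sizing
```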