API Reference

remote_parallel_map is the only function in the Burla Python package.


burla.remote_parallel_map

Run a Python function on many remote computers at the same time.

remote_parallel_map(
  function_,
  inputs,
  func_cpu=1,
  func_ram="dynamic",
  func_gpu=None,
  image=None,
  grow=False,
  max_parallelism=None,
  detach=False,
  generator=False,
  spinner=True
)

Run the provided function_ on every item in inputs at the same time, using all available workers. Inputs beyond the number of available workers are queued and processed sequentially on each worker.

If grow=True, Burla automatically boots and assigns additional workers to minimize runtime.

While running:

  • If the provided function_ raises an exception, the exception is raised on the client machine.

  • Your print statements (anything written to stdout/stderr) are streamed back to your local machine, appearing like they would have if running the same code locally.

When finished, remote_parallel_map returns a list of the objects returned by each function_ call. Optionally, it can return a generator that yields results as they become available.
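The behavior described above can be sketched as follows. This is a minimal illustration, assuming the burla package is installed and a cluster is already running; square and run_on_cluster are hypothetical names, and the import is deferred into the function so the file can be loaded without a cluster.

```python
def square(x):
    # Runs remotely, once per input; print output is streamed back to the client.
    print(f"squaring {x}")
    return x * x


def run_on_cluster(inputs):
    # Deferred import: assumes the burla package is installed and a cluster is running.
    from burla import remote_parallel_map

    try:
        results = remote_parallel_map(square, inputs)
    except Exception as error:
        # An exception raised inside square surfaces here, on the client machine.
        print(f"remote call failed: {error!r}")
        raise
    # Results come back in no particular order, so sort before comparing.
    return sorted(results)
```

Calling run_on_cluster(list(range(1000))) would then run square once per input across the available workers.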

Parameters

Name

Type

Description

function_

Callable

Any Python function smaller than 100MB (1M lines) when pickled.

inputs

List[Any]

List of elements passable to function_. Tuples are unpacked into *args by default.

func_cpu

int

(Optional) Number of CPUs made available to each running instance of function_. The maximum possible value is determined by your cluster machine type. Defaults to 1.

func_ram

int or "dynamic"

(Optional) Amount of RAM (GB) made available to each running instance of function_. Defaults to "dynamic": Burla starts with as many parallel function calls as the requested CPUs allow, then gives each call more RAM by lowering parallelism on any node where workers run out of memory. Pass an integer, such as 8, to reserve a fixed amount of RAM per function call instead. Max fixed RAM is determined by your cluster machine type.

func_gpu

str

(Optional) Allocate one GPU per call to function_. One of: "A100" / "A100_40G", "A100_80G", "H100" / "H100_80G". Defaults to None (no GPU).

image

str

(Optional) If provided, only nodes running this container image are eligible. When grow=True and no matching nodes are available, newly booted nodes will run this image. Defaults to None (no image filter).

grow

bool

(Optional) Automatically adds nodes to complete the provided work as quickly as possible. These nodes inherit existing settings. Defaults to False.

max_parallelism

int

(Optional) Maximum number of function_ instances allowed to be running at the same time.

detach

bool

(Optional) If True, the job continues to run independently on the cluster if stopped locally. Detached jobs can run in the background indefinitely. Defaults to False.

generator

bool

(Optional) Set to True to return a Generator instead of a List. The generator yields outputs as they are produced, instead of all at once at the end. Defaults to False.

spinner

bool

(Optional) Set to False to hide the status indicator/spinner.
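As a rough illustration of the parameters above, the sketch below first emulates the documented tuple-unpacking rule locally, then shows a call that sets several optional arguments. call_with_input, add, and run_with_resources are hypothetical helpers; call_with_input is a local stand-in for the documented default and is not Burla's actual implementation. The resource values are arbitrary and assume a cluster machine type that can satisfy them.

```python
def call_with_input(function_, item):
    # Local stand-in for the documented default: tuples are unpacked into *args,
    # anything else is passed as a single argument. Not Burla's actual code.
    if isinstance(item, tuple):
        return function_(*item)
    return function_(item)


def add(a, b):
    return a + b


# Each tuple becomes one call: add(1, 2), add(3, 4), add(5, 6).
inputs = [(1, 2), (3, 4), (5, 6)]
local_results = [call_with_input(add, item) for item in inputs]


def run_with_resources(inputs):
    # Deferred import: assumes the burla package is installed and a cluster is running.
    from burla import remote_parallel_map

    return remote_parallel_map(
        add,
        inputs,
        func_cpu=2,          # 2 CPUs per running call
        func_ram=8,          # fixed 8 GB per call instead of "dynamic"
        max_parallelism=10,  # at most 10 calls running at once
        spinner=False,       # hide the status indicator
    )
```

Here local_results only demonstrates how tuple inputs map onto positional arguments; run_with_resources(inputs) would perform the same calls remotely.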

Returns

Type

Description

List or Generator

List of objects returned by function_ in no particular order. If generator=True, returns a generator yielding objects returned by function_ in the order they are produced.
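Because results are not returned in input order, one possible pattern is to return each input alongside its output and rebuild the mapping on the client. The sketch below assumes the burla package is installed and a cluster is running; tag_with_input, collect, and stream_results are hypothetical names used only for illustration.

```python
def tag_with_input(x):
    # Return the input with the output so results can be matched up
    # even though they arrive in completion order, not input order.
    return x, x * x


def collect(pairs):
    # Works on any iterable of (input, output) pairs, including a generator.
    return {key: value for key, value in pairs}


def stream_results(inputs):
    # Deferred import: assumes the burla package is installed and a cluster is running.
    from burla import remote_parallel_map

    # generator=True yields each result as soon as it is produced.
    results = remote_parallel_map(tag_with_input, inputs, generator=True)
    return collect(results)
```

The same collect helper works with the default list return, since it only requires an iterable of pairs.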


