Overview

burla.remote_parallel_map:

Run an arbitrary Python function on many remote computers at the same time.

Basic use:

remote_parallel_map requires two arguments:

# Arg 1: Any python function.
def my_function(my_input):
    print(my_input)
    return my_input * 2

# Arg 2: List of inputs for `my_function`.
my_inputs = [1, 2, 3]

Then remote_parallel_map can be called like:

from burla import remote_parallel_map

outputs = remote_parallel_map(my_function, my_inputs)

print(list(outputs))

When run, remote_parallel_map calls my_function on every object in my_inputs at the same time, each on a separate CPU in the cloud (the maximum number of CPUs depends on your cluster settings).

In under 1 second, the three function calls are made, all at the same time: my_function(1), my_function(2), my_function(3).

Stdout produced on the remote machines is streamed back to the client (your machine). The return value of each function call is also collected and sent back to the client. The following is displayed in the client's terminal:

1
2
3
[2, 4, 6]

In the above example, each function call was made inside a separate container, each with its own isolated filesystem.
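
For example, a file written during one function call is not visible to any other call. A minimal sketch illustrating this (the marker path and function name are just illustrative, and this assumes a running cluster):

import os
from burla import remote_parallel_map

def check_isolation(my_input):
    # Each call runs in its own container, so this marker file
    # never exists when a different input is being processed.
    marker_already_exists = os.path.exists("/tmp/marker.txt")
    with open("/tmp/marker.txt", "w") as f:
        f.write(str(my_input))
    return marker_already_exists

# Every call sees a fresh, isolated filesystem, so every result is False.
print(list(remote_parallel_map(check_isolation, [1, 2, 3])))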

Other arguments:

remote_parallel_map(
  function_,
  inputs,
  func_cpu=1,
  func_ram=4,
  background=False,
  spinner=True,
  generator=False,
  max_parallelism=None,
)

The func_cpu and func_ram arguments can be used to assign more resources to each individual function call, up to the maximum amount your cluster's machine type supports.
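
For example, to give every individual function call more resources (the values below are illustrative, and my_function / my_inputs are from the example above):

from burla import remote_parallel_map

# Give each function call 4 CPUs and 16GB of RAM.
outputs = remote_parallel_map(my_function, my_inputs, func_cpu=4, func_ram=16)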

background makes remote_parallel_map exit as soon as the function and inputs have been uploaded, after which the job continues to run independently on the cluster in the background.
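
For example (again using my_function and my_inputs from the example above):

from burla import remote_parallel_map

# Returns as soon as the function and inputs are uploaded;
# the job keeps running on the cluster in the background.
remote_parallel_map(my_function, my_inputs, background=True)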

spinner can be set to False to turn off the spinner, which also displays status messages from the cluster, such as the state of the current job.
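
For example, to silence the spinner and its status messages:

outputs = remote_parallel_map(my_function, my_inputs, spinner=False)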

max_parallelism can be used to limit the number of function calls running at the same time. By default, the cluster runs as many function calls in parallel as its resources allow.
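
For example, to run at most 10 function calls at once, even if the cluster has capacity for more (the value is illustrative):

outputs = remote_parallel_map(my_function, my_inputs, max_parallelism=10)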

Any Questions?

remote_parallel_map has a few optional arguments; see the API-Reference for the full API documentation.

We're always happy to talk face to face.

Schedule a call with us!