API/CLI Reference

API-Reference

`burla.remote_parallel_map`

Run any Python function on many remote computers at the same time. This is the only function in the Burla Python package.

remote_parallel_map(
  function_,
  inputs,
  func_cpu=1,
  func_ram=4,
  detach=False,
  spinner=True,
  generator=False,
  max_parallelism=None,
)

Runs the provided function_ on each item in inputs at the same time, each on a separate CPU. If there are more inputs than available CPUs, the extra inputs are queued and each worker processes its queue sequentially. remote_parallel_map can reliably queue millions of inputs.
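A minimal usage sketch, assuming the package is installed, `burla login` has been run, and the function is importable as `from burla import remote_parallel_map` (inferred from the `burla.remote_parallel_map` heading). The names `square` and `square_all` are illustrative and not part of Burla.

```python
def square(x):
    """Work to run remotely; called once per element of the inputs list."""
    return x * x


def square_all(numbers):
    """Square every number on the cluster and return the results sorted."""
    # Imported inside the function so this module still loads without Burla installed.
    from burla import remote_parallel_map

    results = remote_parallel_map(square, list(numbers))
    # The returned list is in no particular order, so sort it before relying on position.
    return sorted(results)
```

Calling `square_all(range(1000))` would submit 1,000 calls to `square`; any that exceed the available CPUs wait in the queue.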

While running:

If the provided function_ raises an exception, the exception, including stack trace, is re-raised on the client machine in a way that looks like it was running locally.
Your print statements (anything written to stdout/stderr) are streamed back to your local machine, appearing like they would have if running the same code locally.
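Because exceptions are re-raised on the client, an ordinary `try`/`except` around the call should be enough to handle remote failures. The sketch below is hedged: `parse_port` and `parse_ports` are hypothetical helpers, and it assumes the re-raised exception keeps its original type, which the description implies but does not state outright.

```python
def parse_port(text):
    """Return a valid TCP port number, raising ValueError otherwise."""
    port = int(text)  # int() raises ValueError on non-numeric text
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


def parse_ports(texts):
    """Parse all ports remotely; return None if any remote call failed."""
    # Imported inside the function so this module still loads without Burla installed.
    from burla import remote_parallel_map

    try:
        return sorted(remote_parallel_map(parse_port, list(texts)))
    except ValueError as error:
        # Assumption: the remote exception is re-raised locally with its original type.
        print(f"a remote call failed: {error}")
        return None
```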

When finished, remote_parallel_map returns a list of the objects returned by each function_ call.

Parameters

| Name | Type | Description |
| --- | --- | --- |
| `function_` | Callable | Any Python function that is <100MB (1M lines) when pickled. |
| `inputs` | List[Any] | List of elements passable to `function_`. Tuples are unpacked into `*args` by default. |
| `func_cpu` | int | (Optional) Number of CPUs made available to each running instance of `function_`. Max possible value is determined by your cluster machine type. Defaults to 1. |
| `func_ram` | int | (Optional) Amount of RAM (GB) made available to each running instance of `function_`. Max possible value is determined by your cluster machine type. Defaults to 4. |
| `detach` | bool | (Optional) If True, the job continues to run independently on the cluster if it is stopped locally. Detached jobs have been tested running in the background for at least a week. |
| `spinner` | bool | (Optional) Set to False to hide the status indicator/spinner. |
| `generator` | bool | (Optional) Set to True to return a Generator instead of a List. The generator yields outputs as they are produced, instead of all at once at the end. |
| `max_parallelism` | int | (Optional) Maximum number of `function_` instances allowed to be running at the same time. |
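The following sketch shows how several of these parameters might be combined, including tuple inputs that are unpacked into positional arguments. `scale` and `scale_pairs` are hypothetical names, and the resource values are arbitrary; the maximums you can request depend on your cluster machine type.

```python
def scale(value, factor):
    """Takes two positional arguments, so each input must be a 2-tuple."""
    return value * factor


def scale_pairs(pairs):
    """Run scale(value, factor) remotely for every (value, factor) tuple."""
    # Imported inside the function so this module still loads without Burla installed.
    from burla import remote_parallel_map

    return remote_parallel_map(
        scale,
        list(pairs),         # e.g. [(1, 10), (2, 10)] -> scale(1, 10), scale(2, 10)
        func_cpu=2,          # CPUs available to each running call
        func_ram=8,          # GB of RAM available to each running call
        spinner=False,       # hide the status indicator
        max_parallelism=50,  # at most 50 calls running at the same time
    )
```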

Returns

| Type | Description |
| --- | --- |
| List or Generator | List of objects returned by `function_`, in no particular order. If `generator=True`, returns a generator that yields the objects returned by `function_` in the order they are produced. |
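With `generator=True`, results can be handled one at a time as they arrive rather than after the whole job finishes. A small sketch under that assumption; `word_count` and `total_words` are hypothetical names used only for illustration.

```python
def word_count(line):
    """Count whitespace-separated words in one line of text."""
    return len(line.split())


def total_words(lines):
    """Sum word counts, consuming each remote result as soon as it is yielded."""
    # Imported inside the function so this module still loads without Burla installed.
    from burla import remote_parallel_map

    total = 0
    for count in remote_parallel_map(word_count, list(lines), generator=True):
        # Each iteration handles one finished call; order follows completion, not input order.
        total += count
    return total
```

A running sum suits the generator form because it does not depend on the order in which results are yielded.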

CLI-Reference

Burla's CLI contains the following commands:

- `burla install`: Deploy self-hosted Burla in your Google Cloud project.
- `burla login`: Connect your computer to the cluster you last logged into in the browser.

The global arg --help can be placed after any command or command group to see CLI documentation.

`burla install`

Deploy a self-hosted Burla instance in your current Google Cloud Project. Running burla install multiple times will update the existing installation with the latest version.

Description:

Installs Burla inside the Google Cloud project that your gcloud CLI is currently pointing to. For a more user-friendly installation guide see: Installation: Self-Hosted

To view your current gcloud project run: gcloud config get project

To change your current gcloud project run: gcloud config set project <desired-project-id>

Prerequisites:

Have the gcloud CLI installed (how do I install the gcloud CLI?).
Be logged in to the gcloud CLI (how do I log in?) by running both gcloud auth login and gcloud auth application-default login.
Have a Google Cloud user account with at least the minimum required permissions to install Burla. Alternatively, just run burla install; if you're missing any permissions, it will tell you which ones.

To run burla install you need one of the three permission sets below. Any one of them will work.

Simplest possible permissions

Burla can be installed with either of the following roles:

Project Owner (roles/owner)
Project Editor (roles/editor)

Service admin based permissions (more specific)

Burla can be installed by users having the following generic roles:

Service Usage Admin (roles/serviceusage.serviceUsageAdmin)
Cloud Run Admin (roles/run.admin)
Compute Network Admin (roles/compute.networkAdmin)
Secret Manager Admin (roles/secretmanager.admin)
Firestore Database Admin (roles/datastore.owner)

Exact minimum required permissions (very specific)

Service Usage API (serviceusage.googleapis.com):
- serviceusage.services.enable for enabling:
  - compute.googleapis.com
  - run.googleapis.com
  - firestore.googleapis.com
  - cloudresourcemanager.googleapis.com
  - secretmanager.googleapis.com
Compute Engine API (compute.googleapis.com):
- compute.firewalls.create
- compute.firewalls.get (to check if firewall rule exists)
- compute.networks.updatePolicy
Secret Manager API (secretmanager.googleapis.com):
- secretmanager.secrets.create
- secretmanager.secrets.get
- secretmanager.versions.add
Firestore API (firestore.googleapis.com):
- datastore.databases.create
- datastore.databases.get
- datastore.documents.create
- datastore.documents.write
Cloud Run API (run.googleapis.com):
- run.services.create
- run.services.update
- run.services.get
- run.services.setIamPolicy (for --allow-unauthenticated flag)

Here is an IAM role definition for this permission set:

title: "Burla Installation Role"
description: "Minimum permissions needed to install Burla"
stage: "GA"
includedPermissions:
- serviceusage.services.enable
- compute.firewalls.create
- compute.firewalls.get
- compute.networks.updatePolicy
- secretmanager.secrets.create
- secretmanager.secrets.get
- secretmanager.versions.add
- datastore.databases.create
- datastore.databases.get
- datastore.documents.create
- datastore.documents.write
- run.services.create
- run.services.update
- run.services.get
- run.services.setIamPolicy

On install, your Google account (the one you are currently logged in to gcloud with) is set as the only account authorized to access this new Burla deployment.

We encourage you to check out _install.py in the client for even more specific installation details.

`burla login`

Connects your computer to the Burla cluster you most recently logged into in your browser. Authorizes your machine to call remote_parallel_map on this cluster.

Description:

Launches the "Authorize this Machine" page in your default web browser.

If there is no auth cookie (you have not yet logged into the dashboard), the command raises a simple error asking you to log in to your cluster dashboard first.

When the "Authorize" button is hit, a new auth token is created and sent to your machine.

This token is saved in the text file burla_credentials.json. This file is stored in your operating system's recommended user data directory, which is determined using the appdirs Python library.
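To see roughly where such a file would live on your machine, you can ask appdirs directly. This is only a sketch: `credentials_path` is a hypothetical helper, and the application name (and optional author) that Burla passes to appdirs is not stated here, so the value you supply is an assumption and the real location may differ.

```python
from pathlib import Path


def credentials_path(appname, appauthor=None):
    """Build a candidate path to burla_credentials.json in the user data directory."""
    # Third-party dependency (pip install appdirs), imported lazily so the module loads without it.
    from appdirs import user_data_dir

    # appname / appauthor are caller-supplied guesses, not values confirmed by Burla.
    return Path(user_data_dir(appname, appauthor)) / "burla_credentials.json"
```

For example, `credentials_path("burla")` returns a per-user path whose parent directory depends on the operating system.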

This token is refreshed each time burla login is run, or after a certain amount of time passes.

Questions? Schedule a call with us, or email [email protected]. We're always happy to talk.