Install & setup

Installation

pip install lamindb

You can configure the installation using extras, e.g.,

pip install 'lamindb[jupyter,bionty]'

Supported extras are:

# commonly used
jupyter   # parse Jupyter notebook metadata
bionty    # basic biological ontologies
# cloud backends
aws       # AWS (s3fs, etc.)
gcp       # Google Cloud (gcfs, etc.)
# biological artifact formats
fcs       # FCS artifacts (flow cytometry)
# storage backends
zarr      # store & stream arrays with zarr
# others
erdiagram # display schema graphs

If you’d like to run LaminDB in a Docker container, see github.com/laminlabs/lamindb-docker.

Sign up & log in

  1. Sign up for a free account (see the note below) and copy the API key.

  2. Log in on the command line:

    lamin login <email> --key <API-key>
    

Note

An account is free & signing up takes 1 min.

Lamin does not store or see any of your data, but only basic metadata about you (email address, etc.).

If you register a LaminDB instance on LaminHub, Lamin only stores the storage location (AWS S3 or GCP bucket names, directory names).

For more, see the documentation, the source code, or the privacy policy.

On the command line, you can log in with either email or handle:

lamin login testuser1@lamin.ai
lamin login testuser1

If no API key is cached in your environment, copy it from your lamin.ai account and pass it:

lamin login <email> --key <API-key>

Log out:

lamin logout

Init an instance

Initialize an instance with lamin init on the command line, using these options:

  • storage: a default storage location for the instance (e.g. s3://my-bucket, gs://my-bucket, ./my-data-dir)

  • name (optional): a name for the instance (e.g., my-assets)

  • db (optional): a Postgres database connection URL; omit it to use SQLite

  • schema (optional): comma-separated string of schema modules
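
As a rough illustration of how these options map onto a command line, the following Python sketch assembles a lamin init invocation. The helper name build_init_command is hypothetical, and the --name flag is assumed by analogy with the --storage, --db, and --schema flags; the function only builds the argument list and does not run anything.

```python
import shlex
from typing import List, Optional, Sequence


def build_init_command(
    storage: str,
    name: Optional[str] = None,
    db: Optional[str] = None,
    schema: Sequence[str] = (),
) -> List[str]:
    """Assemble the argument list for `lamin init` (hypothetical helper)."""
    cmd = ["lamin", "init", "--storage", storage]
    if name is not None:
        cmd += ["--name", name]  # assumed flag name, inferred from the option list
    if db is not None:
        cmd += ["--db", db]  # leave unset to get a SQLite instance
    if schema:
        cmd += ["--schema", ",".join(schema)]  # comma-separated schema modules
    return cmd


# Render the command as a shell-safe string for inspection
print(shlex.join(build_init_command("./mydata", schema=["bionty"])))
```

The resulting list could be passed to subprocess.run if you want to script instance creation.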

Examples

Local storage + SQLite

If you are only interested in tracking artifacts and their transformations, init your local SQLite instance via:

lamin init --storage ./mydata

Mount the Bionty schema module:

lamin init --storage mydata --schema bionty

S3 + SQLite

lamin init --storage s3://<bucket_name> --schema bionty

GCP + Postgres

lamin init --storage gs://<bucket_name> --db postgresql://<user>:<pwd>@<hostname>:<port>/<dbname> --schema bionty
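
Passwords containing characters such as "@" or "/" can break a connection URL of the form shown above. A minimal Python sketch for assembling the URL with percent-encoded credentials follows; postgres_url is a hypothetical helper, and it assumes the database driver accepts percent-encoded credentials, as standard Postgres connection URIs do.

```python
from urllib.parse import quote


def postgres_url(user: str, pwd: str, hostname: str, port: int, dbname: str) -> str:
    """Build postgresql://<user>:<pwd>@<hostname>:<port>/<dbname> (hypothetical helper)."""
    # Percent-encode the credentials so reserved characters don't split the URL
    safe_user = quote(user, safe="")
    safe_pwd = quote(pwd, safe="")
    return f"postgresql://{safe_user}:{safe_pwd}@{hostname}:{port}/{dbname}"


# Placeholder values for illustration only
print(postgres_url("alice", "p@ss/word", "db.example.com", 5432, "lamindb"))
```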

Load an instance

Load your own instance:

lamin load <instance_name>

Load somebody else’s instance:

lamin load <account_handle>/<instance_name>
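
The two identifier forms above differ only in whether an owner handle is given. The Python sketch below illustrates that distinction; split_instance_identifier is a hypothetical helper and is not how lamin resolves identifiers internally.

```python
from typing import Tuple


def split_instance_identifier(identifier: str, current_user: str) -> Tuple[str, str]:
    """Split '<account_handle>/<instance_name>' into (owner, name)."""
    owner, sep, name = identifier.partition("/")
    if not sep:
        # No slash: treat the identifier as one of the current user's own instances
        return current_user, identifier
    return owner, name


print(split_instance_identifier("mydata", current_user="testuser1"))
print(split_instance_identifier("testuser1/testdb", current_user="someone-else"))
```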

Access settings

Now, let’s look at a specific example:

!lamin init --storage mydata --schema bionty
❗ instance exists with id bad64064a38a5d18ae1654872904d661, but database is not loadable: re-initializing
💡 connected lamindb: testuser1/mydata

Print settings:

!lamin info
Current user: testuser1
- handle: testuser1
- email: testuser1@lamin.ai
- uid: DzTjkKse
Auto-connect in Python: True
Current instance: testuser1/mydata
- owner: testuser1
- name: mydata
- storage root: /home/runner/work/lamindb/lamindb/docs/mydata
- storage region: None
- db: sqlite:////home/runner/work/lamindb/lamindb/docs/mydata/bad64064a38a5d18ae1654872904d661.lndb
- schema: {'bionty'}
- git_repo: None

Settings persist in ~/.lamin/ and can also be accessed via lamindb.setup.settings.

import lamindb as ln
💡 connected lamindb: testuser1/mydata
ln.setup.settings.user
Current user: testuser1
- handle: testuser1
- email: testuser1@lamin.ai
- uid: DzTjkKse
ln.setup.settings.instance
Current instance: testuser1/mydata
- owner: testuser1
- name: mydata
- storage root: /home/runner/work/lamindb/lamindb/docs/mydata
- storage region: None
- db: sqlite:////home/runner/work/lamindb/lamindb/docs/mydata/bad64064a38a5d18ae1654872904d661.lndb
- schema: {'bionty'}
- git_repo: None
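
Because settings persist in ~/.lamin/, you can also inspect what is stored there without importing lamindb. The following sketch simply lists the directory; list_settings_files is a hypothetical helper, and it makes no assumption about the names or format of the files inside.

```python
from pathlib import Path
from typing import List, Optional


def list_settings_files(settings_dir: Optional[Path] = None) -> List[str]:
    """Return the sorted names of entries in the settings directory."""
    # Default to ~/.lamin/, the location where settings persist
    directory = settings_dir if settings_dir is not None else Path.home() / ".lamin"
    if not directory.is_dir():
        return []  # nothing has been persisted yet
    return sorted(entry.name for entry in directory.iterdir())


print(list_settings_files())
```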

Note

  • The user who creates an instance is its owner. Ownership can be transferred in the hub.

  • Advanced users could also consider the Python setup API: lamindb.setup.

Update default storage

It’s easiest to see and update the default storage location in the Python API via ln.settings.storage:

import lamindb as ln
ln.settings.storage  # set via ln.settings.storage = "s3://other-bucket"
#> s3://default-bucket

You can also change it via the CLI:

lamin set --storage s3://other-bucket

Close an instance

Loading an instance means loading an environment for managing your datasets.

Loading a new instance automatically closes the previously loaded one.

If you want to close the instance without loading a new one, use lamin close.

Migrate an instance

If you are an admin and you haven’t set up automated deployments of migrations, you can use two commands to create and deploy migrations:

  • lamin migrate create

  • lamin migrate deploy

Unless you manage a custom plugin schema, you’ll never need to create a migration.

You’ll receive a logged warning when deploying a migration is advisable.

What does this warning look like?

Here is an example:

% lamin load testdb
🔶 

Your database is not up to date with your installed Python library.

The database misses the following migrations:
[<Migration lnschema_core.0014_rename_ref_field_featureset_registry>, <Migration lnschema_core.0015_artifact_initial_version_artifact_version>]

Only if you are an admin and manage migrations manually, deploy them to the database:
lamin migrate deploy

Otherwise, downgrade your Python library to match the database!

✅ loaded instance: testuser1/testdb
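
If you capture the log output of lamin load, for example in a CI job, a small parser can pull out the names of the pending migrations. This is a sketch that assumes the warning keeps the <Migration ...> format shown in the example above; missing_migrations is a hypothetical helper, not part of lamindb.

```python
import re
from typing import List

# Matches entries such as "<Migration lnschema_core.0014_rename_...>"
MIGRATION_PATTERN = re.compile(r"<Migration ([\w.]+)>")


def missing_migrations(log_text: str) -> List[str]:
    """Extract migration names from a captured warning message."""
    return MIGRATION_PATTERN.findall(log_text)


sample = (
    "The database misses the following migrations:\n"
    "[<Migration lnschema_core.0014_rename_ref_field_featureset_registry>, "
    "<Migration lnschema_core.0015_artifact_initial_version_artifact_version>]"
)
for migration in missing_migrations(sample):
    print(migration)
```

An empty result means the captured text contained no entries in that format.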

Create a migration

You need to have the schema package installed locally:

git clone https://github.com/my-org/lnschema-custom
cd lnschema-custom
pip install -e .

Edit the registries in your schema.

Then, call

lamin migrate create

to create the migration script.

When you’re happy with it, commit it to your GitHub repo and, ideally, make a new release.

To deploy the migration, call lamin migrate deploy.

Note

The lamin migrate commands are a wrapper around Django’s migration manager.

Delete an instance

Deleting an instance won’t delete your data, only the metadata managed by LaminDB:

!lamin delete --force mydata
💡 deleting instance testuser1/mydata