Install & setup LaminDB#

Installation#

pip install lamindb  # basic data management

You can configure the installation using extras, e.g.,

pip install 'lamindb[jupyter,bionty]'

Supported extras are:

# commonly used
jupyter   # parse Jupyter notebooks
bionty    # manage basic biological entities
# cloud backends
aws       # AWS (s3fs, etc.)
gcp       # Google Cloud (gcfs, etc.)
# biological file formats
fcs       # manage FCS files (flow cytometry)
# storage backends
zarr      # store & stream arrays with zarr
# database backends
postgres  # postgres tools
# others
erdiagram # visualize ER diagrams, also needs graphviz

Sign up & log in#

Why do I have to sign up? → Data flow needs a user identity to answer questions like: Who modified which data when? Who shares this with me?

An account is free & signing up takes 1 min.

Note

Lamin does not store or see any of your data. It only stores basic metadata about you (email address, etc.) & your LaminDB instances (S3 bucket names, etc.): see the open-source client & the privacy policy.

On the command line, you can log in with either email or handle:

lamin login testuser1@lamin.ai
lamin login testuser1

If you don’t have a cached password in your environment, you need to pass it:

lamin login <email> --password <password>

Help#

Access help on the CLI:

lamin -h
lamin init -h

Docker#

If you’d like to run LaminDB in a Docker container, see github.com/laminlabs/lamindb-docker.

Init an instance#

Parameters#

  • storage: a default storage location (e.g. s3://my-bucket, gs://my-bucket, ./my-data-dir)

  • name (optional): a name for the instance (e.g., my-project-assets, analyses-team-x, ml-experiments-y)

  • db (optional): a SQL database URI (defaults to SQLite)

  • schema (optional): comma-separated schema names

    • available plugins are listed here

    • contact us to learn about customized enterprise plugins
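As a rough illustration of how these parameters map onto the command line, here is a minimal Python sketch that assembles a `lamin init` call. The helper `build_init_command` is hypothetical and not part of LaminDB. The `--storage`, `--db`, and `--schema` flags appear in the examples on this page; passing `name` as `--name` is an assumption made by analogy with them.

```python
import shlex


def build_init_command(storage, name=None, db=None, schema=None):
    """Assemble a `lamin init` command line from the parameters above.

    Hypothetical helper: it only builds the string, it does not run it.
    """
    args = ["lamin", "init", "--storage", storage]
    if name is not None:
        # Assumption: `name` is passed as `--name`, like the other flags.
        args += ["--name", name]
    if db is not None:
        args += ["--db", db]
    if schema:
        # Schema names are passed as a single comma-separated value.
        args += ["--schema", ",".join(schema)]
    # shlex.join quotes any argument that the shell would otherwise split.
    return shlex.join(args)


print(build_init_command("s3://my-bucket", schema=["bionty", "lamin1"]))
```

Building the command as a list and quoting it at the end keeps paths with spaces intact if the string is later pasted into a shell.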

Examples#

Local storage + SQLite#

If you are only interested in tracking files and their transformations, init your local SQLite instance via:

lamin init --storage ./mydata

Mount the Bionty schema module:

lamin init --storage mydata --schema bionty

S3 + SQLite#

lamin init --storage s3://<bucket_name> --schema bionty,lamin1

GCP + Postgres#

lamin init --storage gs://<bucket_name> --db postgresql://<user>:<pwd>@<hostname>:<port>/<dbname> --schema bionty,lamin1
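Passwords that contain characters such as `@` or `/` can break the `--db` URI. The sketch below builds the URI with percent-encoded credentials using only the Python standard library. The helper `postgres_uri` and all example values are hypothetical, and the sketch assumes that whatever parses the URI decodes percent-encoding, as standard URI parsers do.

```python
from urllib.parse import quote, urlsplit


def postgres_uri(user, pwd, hostname, port, dbname):
    """Build a postgresql:// URI, percent-encoding user and password."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(pwd, safe='')}"
        f"@{hostname}:{port}/{dbname}"
    )


# Example values are placeholders, not real credentials.
uri = postgres_uri("lamin", "p@ss/word", "db.example.com", 5432, "lamindb")
print(uri)

# Sanity check: the URI still splits into the intended components.
parts = urlsplit(uri)
print(parts.hostname, parts.port, parts.path.lstrip("/"))
```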

Load an instance#

Load your own instance:

lamin load <instance_name>

Load somebody else’s instance:

lamin load <account_handle/instance_name>
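The two forms differ only in whether the account handle is given. A small Python sketch of that convention follows; the function `parse_instance_identifier` is hypothetical, and treating a bare name as belonging to the current user is an assumption based on the two commands above.

```python
def parse_instance_identifier(identifier, current_user):
    """Split 'owner/name' or 'name' into an (owner, name) pair."""
    owner, sep, name = identifier.partition("/")
    if not sep:
        # No slash: assume the instance belongs to the current user.
        return current_user, identifier
    if not owner or not name or "/" in name:
        raise ValueError(f"expected 'name' or 'owner/name', got {identifier!r}")
    return owner, name


print(parse_instance_identifier("mydata", current_user="testuser1"))
print(parse_instance_identifier("testuser1/mydata", current_user="someone-else"))
```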

Access settings#

Now, let’s look at a specific example:

!lamin init --storage mydata --schema bionty
✅ saved: User(id='DzTjkKse', handle='testuser1', email='testuser1@lamin.ai', name='Test User1', updated_at=2023-09-26 15:22:35)
✅ saved: Storage(id='WqPucd3n', root='/home/runner/work/lamindb/lamindb/docs/mydata', type='local', updated_at=2023-09-26 15:22:35, created_by_id='DzTjkKse')
💡 loaded instance: testuser1/mydata
💡 did not register local instance on hub (if you want, call `lamin register`)

Print settings:

!lamin info
Current user: testuser1
- handle: testuser1
- email: testuser1@lamin.ai
- id: DzTjkKse
Current instance: testuser1/mydata
- owner: testuser1
- name: mydata
- storage root: /home/runner/work/lamindb/lamindb/docs/mydata
- storage region: None
- db: sqlite:////home/runner/work/lamindb/lamindb/docs/mydata/mydata.lndb
- schema: {'bionty'}
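If you need these values in a script, the printed layout is simple enough to parse. The following Python sketch turns text shaped like the output above into nested dictionaries. The function `parse_info` is hypothetical, and the output format is not a stable interface, so the Python settings API is the more robust route.

```python
def parse_info(text):
    """Parse `lamin info`-style output into {section: {key: value}}."""
    sections = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("- ") and current is not None:
            # "- key: value" lines belong to the most recent section.
            key, _, value = line[2:].partition(": ")
            sections[current][key] = value
        elif ": " in line:
            # "Current user: ..." style lines open a new section.
            current = line.partition(": ")[0]
            sections[current] = {}
    return sections


sample = """\
Current user: testuser1
- handle: testuser1
- email: testuser1@lamin.ai
Current instance: testuser1/mydata
- owner: testuser1
- name: mydata
- db: sqlite:////home/runner/work/lamindb/lamindb/docs/mydata/mydata.lndb
"""
info = parse_info(sample)
print(info["Current instance"]["name"])
```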

Settings persist in ~/.lamin/ and can also be accessed via lamindb.setup.settings.

import lamindb as ln
💡 loaded instance: testuser1/mydata (lamindb 0.54.2)
ln.setup.settings.user
Current user: testuser1
- handle: testuser1
- email: testuser1@lamin.ai
- id: DzTjkKse
ln.setup.settings.instance
Current instance: testuser1/mydata
- owner: testuser1
- name: mydata
- storage root: /home/runner/work/lamindb/lamindb/docs/mydata
- storage region: None
- db: sqlite:////home/runner/work/lamindb/lamindb/docs/mydata/mydata.lndb
- schema: {'bionty'}

Note

  • The user who creates an instance is its owner. Ownership can be transferred in the hub.

  • Advanced users could also consider the Python setup API: lamindb.setup.

Update default storage#

It’s easiest to see and update the default storage in the Python API using ln.settings.storage:

import lamindb as ln
ln.settings.storage  # set via ln.settings.storage = "s3://other-bucket"
#> s3://default-bucket

You can also change it using the CLI:

lamin set --storage s3://other-bucket

Close an instance#

Loading an instance means loading an environment for managing your datasets.

When you load a new instance, the previously loaded instance is closed automatically.

If you want to close the instance without loading a new one, use lamin close.

Migrate an instance#

If you are an admin and you haven’t set up automated deployments of migrations, you can use two commands to create and deploy migrations:

  • lamin migrate create

  • lamin migrate deploy

Unless you manage a custom plugin schema, you’ll never need to create a migration.

You’ll receive a logged warning when deploying a migration is advisable.

What does this warning look like?

Here is an example:

% lamin load testdb
🔶 

Your database is not up to date with your installed Python library.

The database misses the following migrations:
[<Migration lnschema_core.0014_rename_ref_field_featureset_registry>, <Migration lnschema_core.0015_file_initial_version_file_version>]

Only if you are an admin and manage migrations manually, deploy them to the database:
lamin migrate deploy

Otherwise, downgrade your Python library to match the database!

✅ loaded instance: testuser1/testdb
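To act on such a warning in a script, for instance to report which schema packages are behind, the migration names can be pulled out of the log text. This is a minimal sketch that assumes the `<Migration package.name>` format shown above; the function `missing_migrations` is hypothetical.

```python
import re
from collections import defaultdict

# Matches entries such as <Migration lnschema_core.0014_rename_...>
MIGRATION_RE = re.compile(r"<Migration (\w+)\.(\w+)>")


def missing_migrations(log_text):
    """Group missing migration names by schema package."""
    grouped = defaultdict(list)
    for package, migration in MIGRATION_RE.findall(log_text):
        grouped[package].append(migration)
    return dict(grouped)


warning = (
    "[<Migration lnschema_core.0014_rename_ref_field_featureset_registry>, "
    "<Migration lnschema_core.0015_file_initial_version_file_version>]"
)
for package, names in missing_migrations(warning).items():
    print(package, len(names), "missing")
```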

Create a migration#

You need to have the schema package installed locally:

git clone https://github.com/my-org/lnschema-custom
cd lnschema-custom
pip install -e .

Edit the registries in your schema.

Then, call

lamin migrate create

to create the migration script.

When you’re happy with the migration script, commit it to your GitHub repo, and ideally make a new release.

To deploy the migration, call lamin migrate deploy.

Note

The lamin migration commands are a wrapper around Django’s migration manager.

Delete an instance#

Deleting an instance won’t delete your data, just the metadata managed by LaminDB:

!lamin delete --force mydata
💡 deleting instance testuser1/mydata
✅     deleted instance settings file: /home/runner/.lamin/instance--testuser1--mydata.env
✅     instance cache deleted
✅     deleted '.lndb' sqlite file
❗     consider manually deleting your stored data: /home/runner/work/lamindb/lamindb/docs/mydata
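To check whether an instance's settings file is still present, you can reconstruct its path. The naming pattern in this sketch is inferred from the log line above (`instance--<owner>--<name>.env` inside `~/.lamin/`) and is not a documented contract; the function `instance_settings_file` is hypothetical.

```python
from pathlib import Path


def instance_settings_file(owner, name, settings_dir=None):
    """Return the presumed path of an instance's settings file."""
    if settings_dir is None:
        # Settings persist in ~/.lamin/ according to this page.
        settings_dir = Path.home() / ".lamin"
    # Assumption: file name pattern inferred from the delete log output.
    return Path(settings_dir) / f"instance--{owner}--{name}.env"


path = instance_settings_file("testuser1", "mydata")
print(path, "exists" if path.exists() else "is absent")
```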