Init an instance#
import lamindb as ln
from laminci.db import setup_local_test_postgres
from pathlib import Path
🔶 You haven't yet setup an instance using the CLI: Please call `ln.setup.init()` or `ln.setup.load()`
We already set up a user account for “testuser1@lamin.ai” and chose the handle testuser1.
ln.setup.login("testuser1") # CLI: lamin login testuser1
✅ Logged in with email testuser1@lamin.ai and id DzTjkKse
Local database & storage#
SQLite#
!lamin delete mydata
💬 Deleting instance testuser1/mydata
Traceback (most recent call last):
File "/opt/hostedtoolcache/Python/3.9.16/x64/bin/lamin", line 8, in <module>
sys.exit(main())
File "/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/lndb/__main__.py", line 139, in main
return delete(
File "/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/lndb/_delete.py", line 17, in delete
raise RuntimeError(
RuntimeError: Instance settings do not exist locally. Did you provide a wrong instance name? Could you try loading it?
ln.setup.init(storage="./mydata") # CLI: lamin init --storage ./mydata
💬 Not registering instance on hub, if you want, call `lamin register`
💬 Loading schema modules: core==0.34.0
✅ Loaded instance: testuser1/mydata
✅ Created & loaded instance: testuser1/mydata
This automatically assigns an instance name that equals the name of the storage root along with a few other settings:
ln.setup.settings.instance
Current instance: testuser1/mydata
- owner: testuser1
- name: mydata
- storage root: /home/runner/work/lndb/lndb/docs/guide/mydata
- storage region: None
- db: sqlite:////home/runner/work/lndb/lndb/docs/guide/mydata/mydata.lndb
- schema: set()
assert ln.setup.settings.instance.storage.is_cloud == False
assert ln.setup.settings.instance.owner == ln.setup.settings.user.handle
assert ln.setup.settings.instance.name == "mydata"
assert ln.setup.settings.storage.root.as_posix() == Path("mydata").resolve().as_posix()
assert ln.setup.settings.storage.cache_dir is None
assert ln.setup.settings.storage.id is not None
assert (
ln.setup.settings.instance.db
== f"sqlite:///{Path('./mydata').resolve().as_posix()}/mydata.lndb"
)
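The relationship between the storage root, the instance name, and the SQLite URL can be sketched with the standard library alone. This is only an illustration of the pattern visible in the settings printed above; `derive_local_sqlite_settings` is a hypothetical helper and not part of lamindb, whose actual implementation may differ.

```python
from pathlib import Path


def derive_local_sqlite_settings(storage: str) -> dict:
    """Hypothetical helper (not a lamindb function).

    Mirrors the pattern in the printed settings: the instance name is the
    last component of the storage root, and the SQLite file sits inside
    the storage root as <name>.lndb.
    """
    root = Path(storage).resolve()  # absolute storage root
    name = root.name  # e.g. "mydata"
    db = f"sqlite:///{root.as_posix()}/{name}.lndb"
    return {"name": name, "storage_root": root.as_posix(), "db": db}


settings = derive_local_sqlite_settings("./mydata")
print(settings["name"])
print(settings["db"])
```

On a POSIX system the absolute path starts with a slash, which is why the URL in the output above contains four slashes after `sqlite:`.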
If you want to register it on the hub at lamin.ai, call:
ln.setup.register()
Postgres#
pgurl = setup_local_test_postgres()
💬 Created Postgres test instance: 'postgresql://postgres:pwd@0.0.0.0:5432/pgtest'
It runs in docker container 'pgtest'
A connection string for Postgres looks like this:
pgurl
'postgresql://postgres:pwd@0.0.0.0:5432/pgtest'
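To make the parts of such a connection string explicit, it can be split with Python's standard `urllib.parse`. This is a small sketch for reading the URL and does not involve lamindb; the variable names are chosen here for illustration.

```python
from urllib.parse import urlsplit

# The connection string of the local test database shown above.
pgurl = "postgresql://postgres:pwd@0.0.0.0:5432/pgtest"

parts = urlsplit(pgurl)
dialect = parts.scheme  # database dialect
user = parts.username  # database user
host = parts.hostname  # host the server listens on
port = parts.port  # port as an integer
database = parts.path.lstrip("/")  # database name without the leading slash

print(dialect, user, host, port, database)
```

The database name at the end of the URL is the part that reappears as the instance name in the log output of the init call that follows.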
Let us call init:
ln.setup.init(storage="./mydatapg", db=pgurl)
🔶 Instance metadata exists, but DB might have been corrupted or deleted. Re-initializing the DB.
💬 Not registering instance on hub, if you want, call `lamin register`
💬 Loading schema modules: core==0.34.0
✅ Loaded instance: testuser1/pgtest
✅ Created & loaded instance: testuser1/pgtest
assert ln.setup.settings.instance.name == "pgtest"
assert ln.setup.settings.instance.storage.is_cloud == False
assert ln.setup.settings.instance.owner == ln.setup.settings.user.handle
assert ln.setup.settings.instance.dialect == "postgresql"
assert ln.setup.settings.instance.db == pgurl
assert ln.setup.settings.storage.id is not None
assert (
ln.setup.settings.instance.storage.root.as_posix()
== Path("mydatapg").absolute().as_posix()
)
assert ln.setup.settings.instance.storage.cache_dir is None
!lamin delete pgtest
!docker stop pgtest && docker rm pgtest
💬 Deleting instance testuser1/pgtest
💬 instance settings '.env' deleted
💬 current instance settings /home/runner/.lamin/current_instance.env deleted
💬 consider deleting your stored data manually: /home/runner/work/lndb/lndb/docs/guide/mydatapg
pgtest
pgtest
Custom instance name#
pgurl = setup_local_test_postgres()
💬 Created Postgres test instance: 'postgresql://postgres:pwd@0.0.0.0:5432/pgtest'
It runs in docker container 'pgtest'
Instead of having the instance name be auto-determined from storage or db, you can provide a custom name:
ln.setup.init(
storage="./mystorage", name="mydata2", db=pgurl
) # CLI: lamin init --storage ./mystorage --name "mydata2" --db ...
💬 Not registering instance on hub, if you want, call `lamin register`
💬 Loading schema modules: core==0.34.0
✅ Loaded instance: testuser1/mydata2
✅ Created & loaded instance: testuser1/mydata2
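The naming behavior seen across the examples on this page can be summarized as a precedence rule: an explicit name wins, otherwise the database name is used, otherwise the storage root. The sketch below encodes that rule as inferred from the log outputs only; `resolve_instance_name` is a hypothetical function and not lamindb's API, and the real logic may handle more cases.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import urlsplit


def resolve_instance_name(
    storage: str, db: Optional[str] = None, name: Optional[str] = None
) -> str:
    """Hypothetical sketch of the naming precedence observed in the logs."""
    if name is not None:
        # An explicitly provided name takes priority.
        return name
    if db is not None:
        # Assumption: the database name from the connection string is used.
        return urlsplit(db).path.lstrip("/")
    parts = urlsplit(storage)
    if parts.scheme in ("s3", "gs"):
        # Assumption: for cloud storage, the bucket name is used.
        return parts.netloc
    # Local storage: the last component of the storage root.
    return Path(storage).resolve().name


pgurl = "postgresql://postgres:pwd@0.0.0.0:5432/pgtest"
print(resolve_instance_name("./mydata"))
print(resolve_instance_name("./mydatapg", db=pgurl))
print(resolve_instance_name("./mystorage", db=pgurl, name="mydata2"))
```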
assert ln.setup.settings.instance.name == "mydata2"
assert ln.setup.settings.instance.storage.is_cloud == False
assert ln.setup.settings.instance.owner == ln.setup.settings.user.handle
assert ln.setup.settings.instance.dialect == "postgresql"
assert ln.setup.settings.instance.db == pgurl
assert ln.setup.settings.storage.id is not None
assert (
ln.setup.settings.instance.storage.root.as_posix()
== Path("mystorage").absolute().as_posix()
)
assert ln.setup.settings.instance.storage.cache_dir is None
# test calling register()
# ln.setup.register()
# test calling load
ln.setup.load("mydata2")
assert ln.setup.settings.instance.name == "mydata2"
!lamin delete mydata2
!docker stop pgtest && docker rm pgtest
💬 Found cached instance metadata: /home/runner/.lamin/testuser1-instance-mydata2.env
✅ Loaded instance: testuser1/mydata2
💬 Deleting instance testuser1/mydata2
💬 instance settings '.env' deleted
💬 current instance settings /home/runner/.lamin/current_instance.env deleted
💬 consider deleting your stored data manually: /home/runner/work/lndb/lndb/docs/guide/mystorage
pgtest
pgtest
Configure with cloud storage#
AWS#
You need access to AWS S3, for example configured via aws configure (AWS CLI).
Let us look at the special case of an SQLite instance:
ln.setup.init(
storage="s3://lndb-setup-ci"
) # CLI: lamin init --storage "s3://lndb-setup-ci"
2023-05-30 15:19:23,698:INFO - Found credentials in environment variables.
🔶 SQLite file s3://lndb-setup-ci/lndb-setup-ci.lndb does not exist
🔶 Instance metadata exists, but DB might have been corrupted or deleted. Re-initializing the DB.
2023-05-30 15:19:24,700:INFO - Found credentials in environment variables.
✅ Registered instance on hub: https://lamin.ai/testuser1/None
💬 Loading schema modules: core==0.34.0
✅ Loaded instance: testuser1/lndb-setup-ci
✅ Created & loaded instance: testuser1/lndb-setup-ci
ln.setup.settings.instance
Current instance: testuser1/lndb-setup-ci
- owner: testuser1
- name: lndb-setup-ci
- storage root: s3://lndb-setup-ci
- storage region: us-east-1
- db: sqlite:////home/runner/.cache/lamindb/lndb-setup-ci/lndb-setup-ci.lndb
- schema: set()
ln.setup.settings.instance._sqlite_file
S3Path('s3://lndb-setup-ci/lndb-setup-ci.lndb')
ln.setup.settings.instance._sqlite_file_local
PosixPath('/home/runner/.cache/lamindb/lndb-setup-ci/lndb-setup-ci.lndb')
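The two paths above suggest that the SQLite file lives in the bucket while a local copy is kept in a cache directory, laid out as cache directory, then bucket name, then file name. The following sketch reproduces that mapping under this assumption; `local_cache_path` is a hypothetical helper, and the cache directory is passed in explicitly rather than looked up from lamindb.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def local_cache_path(remote_file: str, cache_dir: str) -> PurePosixPath:
    """Hypothetical helper: map a remote file URL to a local cache path.

    Assumed layout (inferred from the output above):
    <cache_dir>/<bucket>/<key>
    """
    parts = urlsplit(remote_file)
    bucket = parts.netloc  # e.g. "lndb-setup-ci"
    key = parts.path.lstrip("/")  # e.g. "lndb-setup-ci.lndb"
    return PurePosixPath(cache_dir) / bucket / key


cached = local_cache_path(
    "s3://lndb-setup-ci/lndb-setup-ci.lndb", "/home/runner/.cache/lamindb"
)
print(cached)
```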
# test
assert ln.setup.settings.storage.is_cloud == True
assert str(ln.setup.settings.storage.root) == "s3://lndb-setup-ci/"
assert ln.setup.settings.storage.region == "us-east-1"
assert (
str(ln.setup.settings.instance._sqlite_file)
== "s3://lndb-setup-ci/lndb-setup-ci.lndb"
)
assert ln.setup.settings.storage.id is not None
# do the same for an S3 bucket in Europe
ln.setup.init(storage="s3://lndb-setup-ci-eu-central-1", name="lndb-setup-ci-europe")
assert ln.setup.settings.storage.region == "eu-central-1"
assert ln.setup.settings.instance.name == "lndb-setup-ci-europe"
assert (
str(ln.setup.settings.instance._sqlite_file)
== "s3://lndb-setup-ci-eu-central-1/lndb-setup-ci-europe.lndb"
)
assert ln.setup.settings.storage.id is not None
ln.setup.delete("lndb-setup-ci-europe")
🔶 SQLite file s3://lndb-setup-ci-eu-central-1/lndb-setup-ci-europe.lndb does not exist
🔶 Instance metadata exists, but DB might have been corrupted or deleted. Re-initializing the DB.
2023-05-30 15:19:32,833:INFO - Found credentials in environment variables.
✅ Registered instance on hub: https://lamin.ai/testuser1/lndb-setup-ci-europe
💬 Loading schema modules: core==0.34.0
✅ Loaded instance: testuser1/lndb-setup-ci-europe
✅ Created & loaded instance: testuser1/lndb-setup-ci-europe
💬 Deleting instance testuser1/lndb-setup-ci-europe
💬 instance settings '.env' deleted
💬 current instance settings /home/runner/.lamin/current_instance.env deleted
💬 instance cache deleted
💬 consider deleting your stored data manually: s3://lndb-setup-ci-eu-central-1/
💬 deleted '.lndb' sqlite file
💬 please manually delete your remote instance on lamin.ai
GCP#
You need to authenticate for Google Cloud.
Either set the environment variable
export GOOGLE_APPLICATION_CREDENTIALS=<HOME-DIR>/.lndb/<GOOGLE CLOUD PROJECT>.json
Alternatively, if you set up the gcloud CLI, log in with gcloud auth application-default login.
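Before calling init with a gs:// storage location, it can help to check whether the environment variable is set and points to an existing file. The sketch below uses only the standard library; `find_gcp_credentials_file` is a hypothetical helper and not part of lamindb or any Google library.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_gcp_credentials_file(
    env: Optional[Mapping[str, str]] = None,
) -> Optional[Path]:
    """Hypothetical helper: return the credentials file if it is usable.

    Returns None if GOOGLE_APPLICATION_CREDENTIALS is unset, empty, or
    does not point to an existing file.
    """
    env = os.environ if env is None else env
    value = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not value:
        return None
    path = Path(value).expanduser()
    return path if path.is_file() else None


if find_gcp_credentials_file() is None:
    print("GOOGLE_APPLICATION_CREDENTIALS is not set or the file is missing")
```

A result of None does not necessarily mean you are unauthenticated: if you logged in through the gcloud CLI instead, the credentials are stored elsewhere and this check does not look for them.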
# ln.setup.init(storage="gs://lndb-setup-ci-us")
# ln.setup.delete("lndb-setup-ci-us")
Re-initialize an existing instance#
If we accidentally call init on an existing instance, it is loaded instead:
assert ln.setup.init(storage="mydata") == "migrate-unnecessary"
💬 Found cached instance metadata: /home/runner/.lamin/testuser1-instance-mydata.env
✅ Loaded instance: testuser1/mydata
assert ln.setup.settings.storage.id is not None