Track the redun run
LaminDB tracks only the inputs, outputs, and execution ID of a redun run.
More fine-grained information is tracked by redun itself.
import lamindb as ln
import lamindb.schema as lns
import os
import json
ln.nb.header()
author | Test User1 (testuser1) |
id | CKhlMCFA52oD |
version | 0 |
time_init | 2022-11-13 21:36 |
time_run | 2023-03-09 17:02 |
pypackage | lamindb==0.30.3 |
ℹ️ Instance: testuser1/fasta
ℹ️ Added notebook: CKhlMCFA52oD v0
ℹ️ Added run: tP9qjkt7COobBAKeBCTY
Query input data and create a run
This is an artificial situation in which we pretend that our input data comes from a notebook with ID 0ymQDuqM5Lwq:
input_dobjects = (
    ln.select(ln.DObject).join(lns.Run).join(lns.Notebook, id="0ymQDuqM5Lwq")
)
input_dobjects.df()
id | name | suffix | size | hash | source_id | storage_id | created_at | updated_at |
---|---|---|---|---|---|---|---|---|
KqkKXJ457k00kO010gve | MYC | .fasta | 535 | yT6x3fflTrhfifJIZqpNGQ | NsKuQJ3EYpITaGUS03jt | 23mKzOkS | 2023-03-09 17:02:39 | None |
dsgk069xk49S3Dc2RcqE | PO5F1 | .fasta | 476 | q-HnUbqF7Z5qCeTUv9xDig | NsKuQJ3EYpITaGUS03jt | 23mKzOkS | 2023-03-09 17:02:39 | None |
Ed25tCL5xDLDAChkBJ7x | SOX2 | .fasta | 413 | rhr_rPQ9hb3bDGAIDNJj4Q | NsKuQJ3EYpITaGUS03jt | 23mKzOkS | 2023-03-09 17:02:39 | None |
iM7uwupZurkXw1us8Fsg | KLF4 | .fasta | 608 | 52g58tOwGbdCohjFCkfcpA | NsKuQJ3EYpITaGUS03jt | 23mKzOkS | 2023-03-09 17:02:39 | None |
Let’s create a LaminDB run record and link it against the input files:
pipeline = ln.select(lns.Pipeline, name="lamin-redun-fasta").one()  # load a pipeline
# create a run record that is linked against the pipeline
run = lns.Run(name="Test run", pipeline=pipeline)
run.inputs = input_dobjects.all()  # link inputs to run
run = ln.add(run)  # add run to DB
run
Run(id='1Sn4xFNvlGCiPVD95aIk', name='Test run', pipeline_id='R8QwchFP', pipeline_v='0.1.0', created_by='DzTjkKse', created_at=datetime.datetime(2023, 3, 9, 17, 2, 44))
Execute redun
To pass the run ID to the redun CLI, export run.id as an environment variable:
os.environ["LNDB_RUN_ID"] = run.id
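Child processes inherit the parent's environment, so a value set through os.environ is visible to any command started from the notebook afterwards. A minimal sketch of that mechanism, using only the standard library; the run ID is the one shown above, and the child command is a stand-in for illustration, not the redun CLI:

```python
import os
import subprocess
import sys

# Run ID copied from the run record above; in the notebook this is run.id.
os.environ["LNDB_RUN_ID"] = "1Sn4xFNvlGCiPVD95aIk"

# Stand-in child process (not redun): it prints the variable it inherited.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['LNDB_RUN_ID'])"],
    capture_output=True,
    text=True,
    check=True,
)
inherited_run_id = child.stdout.strip()
print(inherited_run_id)
```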
Let us now call the workflow (tagging with the run.id is optional, but makes it convenient to query the execution via redun log later):
!redun run workflow.py main --run-id $LNDB_RUN_ID --tag run-id=$LNDB_RUN_ID 1> redun_stdout.txt 2>redun_stderr.txt
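Outside a notebook, the same invocation could be issued from plain Python. The sketch below mirrors the shell command above, including the two redirect files; the helper names build_redun_command and run_redun are hypothetical, and the fallback run ID is the one from this guide.

```python
import os
import subprocess


def build_redun_command(run_id: str) -> list:
    """Return the argv list for the redun call shown above (hypothetical helper)."""
    return [
        "redun", "run", "workflow.py", "main",
        "--run-id", run_id,
        "--tag", f"run-id={run_id}",
    ]


def run_redun(run_id: str) -> int:
    """Run redun, writing stdout and stderr to the same files as the shell command."""
    with open("redun_stdout.txt", "w") as out, open("redun_stderr.txt", "w") as err:
        completed = subprocess.run(build_redun_command(run_id), stdout=out, stderr=err)
    return completed.returncode


# Only build and print the command here; call run_redun(...) to actually execute it.
command = build_redun_command(os.environ.get("LNDB_RUN_ID", "1Sn4xFNvlGCiPVD95aIk"))
print(" ".join(command))
```

Passing the arguments as a list avoids shell quoting issues, at the cost of handling the redirects explicitly.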
Inspect the output:
!cat redun_stdout.txt
File(path=/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/docs/guide/data/results.tgz, hash=0d9727b7)
!tail -1 redun_stderr.txt
2023-03-09 17:02:50,732:INFO - Execution duration: 1.74 seconds
!redun log --exec --exec-tag run-id=$LNDB_RUN_ID --format json --no-pager > redun_exec.json
with open("redun_exec.json") as f:  # close the file once it is parsed
    redun_exec = json.load(f)
redun_exec
{'_type': 'Execution',
'_version': 1,
'args': '["/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/.nox/build-3-9/bin/redun", "run", "workflow.py", "main", "--run-id", "1Sn4xFNvlGCiPVD95aIk", "--tag", "run-id=1Sn4xFNvlGCiPVD95aIk"]',
'id': '97b0ca53-4b09-470c-b97f-085463abd45d',
'job_id': 'f6263b60-03fc-4361-a0a2-57e1889889b0'}
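Note that the args field in this record is itself a JSON-encoded string rather than a list. As a small consistency check, it can be decoded to confirm that the execution was started with the expected LaminDB run ID. The sketch below reuses the record exactly as printed above; the variable names argv and run_id_from_args are illustrative.

```python
import json

# Execution record as printed above.
redun_exec = {
    "_type": "Execution",
    "_version": 1,
    "args": '["/home/runner/work/redun-lamin-fasta/redun-lamin-fasta/.nox/build-3-9/bin/redun", "run", "workflow.py", "main", "--run-id", "1Sn4xFNvlGCiPVD95aIk", "--tag", "run-id=1Sn4xFNvlGCiPVD95aIk"]',
    "id": "97b0ca53-4b09-470c-b97f-085463abd45d",
    "job_id": "f6263b60-03fc-4361-a0a2-57e1889889b0",
}

# "args" holds a JSON string, so decode it to recover the argument list.
argv = json.loads(redun_exec["args"])

# The value following "--run-id" is the run ID passed on the command line.
run_id_from_args = argv[argv.index("--run-id") + 1]
print(run_id_from_args)
```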
Track redun outputs and execution ID
run = ln.select(lns.Run, id=run.id).one()
run.external_id = redun_exec["id"]
ln.add(run)
Run(id='1Sn4xFNvlGCiPVD95aIk', name='Test run', external_id='97b0ca53-4b09-470c-b97f-085463abd45d', pipeline_id='R8QwchFP', pipeline_v='0.1.0', created_by='DzTjkKse', created_at=datetime.datetime(2023, 3, 9, 17, 2, 44))
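The execution ID in the redun output above has the shape of a UUID. Assuming that holds for the redun version in use, a small guard before writing the value to external_id might look like the following; is_uuid is a hypothetical helper, not part of LaminDB or redun.

```python
import uuid


def is_uuid(value: str) -> bool:
    """Return True if value parses as a UUID (hypothetical helper)."""
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


# Execution ID copied from the redun output above.
print(is_uuid("97b0ca53-4b09-470c-b97f-085463abd45d"))
```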
Here, there is just a single output file to track:
dobject = ln.DObject(data="data/results.tgz", source=run)
ln.add(dobject)
DObject(id='7Yq6PwJoF6OmnyGUrHWw', name='results', suffix='.tgz', size=83769, hash='-2S5ssheWzZo5ykRz5-K8g', source_id='1Sn4xFNvlGCiPVD95aIk', storage_id='23mKzOkS', created_at=datetime.datetime(2023, 3, 9, 17, 2, 53))
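The 22-character hash in the record above has the length of a 16-byte digest encoded as URL-safe base64 without padding. The sketch below computes a value of that shape from an MD5 digest. That LaminDB derives its hash this way is an assumption, not something this guide states, and b64_md5 is a hypothetical helper.

```python
import base64
import hashlib


def b64_md5(data: bytes) -> str:
    """URL-safe base64 of the MD5 digest, with '=' padding removed."""
    digest = hashlib.md5(data).digest()  # always 16 bytes
    return base64.urlsafe_b64encode(digest).decode().rstrip("=")


# Any input yields a 22-character string, the same length as the hash column.
print(len(b64_md5(b"example bytes")))
```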
View the database content
ln.view()
****************
* module: core *
****************
DObject
id | name | suffix | size | hash | source_id | storage_id | created_at | updated_at |
---|---|---|---|---|---|---|---|---|
KqkKXJ457k00kO010gve | MYC | .fasta | 535 | yT6x3fflTrhfifJIZqpNGQ | NsKuQJ3EYpITaGUS03jt | 23mKzOkS | 2023-03-09 17:02:39 | None |
dsgk069xk49S3Dc2RcqE | PO5F1 | .fasta | 476 | q-HnUbqF7Z5qCeTUv9xDig | NsKuQJ3EYpITaGUS03jt | 23mKzOkS | 2023-03-09 17:02:39 | None |
Ed25tCL5xDLDAChkBJ7x | SOX2 | .fasta | 413 | rhr_rPQ9hb3bDGAIDNJj4Q | NsKuQJ3EYpITaGUS03jt | 23mKzOkS | 2023-03-09 17:02:39 | None |
iM7uwupZurkXw1us8Fsg | KLF4 | .fasta | 608 | 52g58tOwGbdCohjFCkfcpA | NsKuQJ3EYpITaGUS03jt | 23mKzOkS | 2023-03-09 17:02:39 | None |
7Yq6PwJoF6OmnyGUrHWw | results | .tgz | 83769 | -2S5ssheWzZo5ykRz5-K8g | 1Sn4xFNvlGCiPVD95aIk | 23mKzOkS | 2023-03-09 17:02:53 | None |
Notebook
id | v | name | title | created_by | created_at | updated_at |
---|---|---|---|---|---|---|
0ymQDuqM5Lwq | 0 | 1-redun | Track redun workflows | DzTjkKse | 2023-03-09 17:02:38 | None |
CKhlMCFA52oD | 0 | 2-redun-run | Track the redun run | DzTjkKse | 2023-03-09 17:02:44 | None |
Pipeline
id | v | name | reference | created_by | created_at | updated_at |
---|---|---|---|---|---|---|
R8QwchFP | 0.1.0 | lamin-redun-fasta | https://github.com/laminlabs/redun-lamin-fasta | DzTjkKse | 2023-03-09 17:02:38 | None |
Run
id | name | external_id | pipeline_id | pipeline_v | notebook_id | notebook_v | created_by | created_at |
---|---|---|---|---|---|---|---|---|
NsKuQJ3EYpITaGUS03jt | None | None | None | None | 0ymQDuqM5Lwq | 0 | DzTjkKse | 2023-03-09 17:02:38 |
tP9qjkt7COobBAKeBCTY | None | None | None | None | CKhlMCFA52oD | 0 | DzTjkKse | 2023-03-09 17:02:44 |
1Sn4xFNvlGCiPVD95aIk | Test run | 97b0ca53-4b09-470c-b97f-085463abd45d | R8QwchFP | 0.1.0 | None | None | DzTjkKse | 2023-03-09 17:02:44 |
Storage
id | root | type | region | created_at | updated_at |
---|---|---|---|---|---|
23mKzOkS | /home/runner/work/redun-lamin-fasta/redun-lami... | local | None | 2023-03-09 17:02:34 | None |
User
id | email | handle | name | created_at | updated_at |
---|---|---|---|---|---|
DzTjkKse | testuser1@lamin.ai | testuser1 | Test User1 | 2023-03-09 17:02:34 | None |
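The tables above are linked through their ID columns: a DObject's source_id points to a Run, and a Run points to either a Pipeline or a Notebook. The sketch below traces that chain with plain dictionaries holding values copied from the tables. It illustrates the relationships only and does not use the LaminDB API; describe_source is a hypothetical helper.

```python
# Records copied from the tables above, reduced to the linking columns.
dobjects = {
    "7Yq6PwJoF6OmnyGUrHWw": {"name": "results", "source_id": "1Sn4xFNvlGCiPVD95aIk"},
    "KqkKXJ457k00kO010gve": {"name": "MYC", "source_id": "NsKuQJ3EYpITaGUS03jt"},
}
runs = {
    "1Sn4xFNvlGCiPVD95aIk": {"pipeline_id": "R8QwchFP", "notebook_id": None},
    "NsKuQJ3EYpITaGUS03jt": {"pipeline_id": None, "notebook_id": "0ymQDuqM5Lwq"},
}
pipelines = {"R8QwchFP": "lamin-redun-fasta"}
notebooks = {"0ymQDuqM5Lwq": "1-redun"}


def describe_source(dobject_id: str) -> str:
    """Follow DObject -> Run -> Pipeline or Notebook and name the origin."""
    run = runs[dobjects[dobject_id]["source_id"]]
    if run["pipeline_id"] is not None:
        return f"pipeline {pipelines[run['pipeline_id']]}"
    return f"notebook {notebooks[run['notebook_id']]}"


for dobject_id, record in dobjects.items():
    print(record["name"], "<-", describe_source(dobject_id))
```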