Track notebooks & scripts#

In addition to tracking Python scripts, LaminDB tracks interactive analyses performed in notebooks.

When you call ln.track() in a notebook or script, input and output data are automatically registered and associated with the run.

Note

Provenance tracking of notebooks & scripts is analogous to tracking pipelines & UI data manipulation; see Project flow.

Setup#

!lamin init --storage ./test-track
✅ saved: User(uid='DzTjkKse', handle='testuser1', name='Test User1', updated_at=2024-03-04 14:09:23 UTC)
✅ saved: Storage(uid='Pjx0mAVw', root='/home/runner/work/lamindb/lamindb/docs/test-track', type='local', updated_at=2024-03-04 14:09:23 UTC, created_by_id=1)
💡 loaded instance: testuser1/test-track
💡 did not register local instance on lamin.ai
import lamindb as ln

ln.settings.verbosity = "hint"
💡 lamindb instance: testuser1/test-track

Initiate tracking#

Call inside a notebook or script:

ln.transform.stem_uid = "9priar0hoE5u"
ln.transform.version = "0"
ln.track()
💡 Assuming editor is Jupyter Lab.
💡 notebook imports: lamindb==0.68.0
💡 saved: Transform(uid='9priar0hoE5u6K79', name='Track notebooks & scripts', short_name='track', version='0', type=notebook, updated_at=2024-03-04 14:09:26 UTC, created_by_id=1)
💡 saved: Run(uid='48Q3qtOVXHVqpOKLz95K', run_at=2024-03-04 14:09:26 UTC, transform_id=1, created_by_id=1)
💡 tracked pip freeze > /home/runner/.cache/lamindb/run_env_pip_48Q3qtOVXHVqpOKLz95K.txt
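
The log above hints at how the two identifiers relate: the saved Transform uid 9priar0hoE5u6K79 begins with the stem_uid set before ln.track(), followed by four more characters. The sketch below checks that relation with plain string operations. The helper name split_transform_uid is hypothetical (not part of LaminDB), and reading the 4-character suffix as version-specific is an assumption drawn from this single log line.

```python
from __future__ import annotations


def split_transform_uid(uid: str, stem_uid: str) -> tuple[str, str]:
    """Split a transform uid into (stem, suffix).

    Hypothetical helper for illustration only; not LaminDB API.
    """
    if not uid.startswith(stem_uid):
        raise ValueError(f"{uid!r} does not start with stem {stem_uid!r}")
    return uid[: len(stem_uid)], uid[len(stem_uid):]


# Values copied from the log output above
stem, suffix = split_transform_uid("9priar0hoE5u6K79", "9priar0hoE5u")
print(stem, suffix)
```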

LaminDB now automatically tracks all input and output data.
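
To make "tracks all input and output data" concrete, here is a library-free sketch of the bookkeeping involved: a run record that notes whatever is loaded through it as an input and whatever is saved through it as an output. All names (ToyRun, load, save, the in-memory store) are hypothetical and only illustrate the idea; they are not LaminDB's API or its implementation.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyRun:
    """Hypothetical stand-in for a tracked run; not LaminDB's Run class."""

    transform_name: str
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)

    def load(self, key: str, store: dict[str, bytes]) -> bytes:
        # Reading data through the run records the key as an input
        data = store[key]
        self.inputs.append(key)
        return data

    def save(self, key: str, data: bytes, store: dict[str, bytes]) -> None:
        # Writing data through the run records the key as an output
        store[key] = data
        self.outputs.append(key)


# A dict plays the role of the storage location in this toy example
store = {"raw.csv": b"a,b\n1,2\n"}
run = ToyRun(transform_name="Track notebooks & scripts")
raw = run.load("raw.csv", store)
run.save("clean.csv", raw.upper(), store)
print(run.inputs, run.outputs)
```

Because every read and write goes through the run object, the lineage from raw.csv to clean.csv can be reconstructed afterwards without any manual annotation.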

Save run reports and source artifact#

To save a notebook together with its run report & source artifact, use the CLI command:

lamin save <notebook_artifact>

See how transforms with execution reports look in LaminHub:

Query for a notebook or script#

In the API, filter the Transform registry to obtain a notebook record:

import lamindb as ln


# The name must match the saved Transform record (see the ln.track() log above)
transform = ln.Transform.filter(name="Track notebooks & scripts").one()
# The notebook is linked to its source artifact (stripped of its output cells)
# and to its execution report (which includes the notebook's output cells)
transform.source_artifact
transform.latest_report

On LaminHub, use the UI filter in the Transforms view.

# clean up test instance
!lamin delete --force test-track
!rm -r test-track
💡 deleting instance testuser1/test-track
✅     deleted instance settings file: /home/runner/.lamin/instance--testuser1--test-track.env
✅     instance cache deleted
✅     deleted '.lndb' sqlite file
❗     consider manually deleting your stored data: /home/runner/work/lamindb/lamindb/docs/test-track