Model data

Reference: lamindb.schema

Any LaminDB instance can mount an arbitrary number of schema modules in its SQL database.

Each schema module is a Python package in which every SQL table is represented by a Python class (an ORM).
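To make the idea of "one class per table" concrete, here is a minimal sketch that uses only the Python standard library (dataclasses and sqlite3) rather than LaminDB's actual ORM machinery. The Transform class and the create_table_sql helper are hypothetical names chosen for illustration; every column is stored as TEXT to keep the example short.

```python
import sqlite3
from dataclasses import astuple, dataclass, fields
from typing import Optional


# Hypothetical entity for illustration only; this is not LaminDB's Transform class.
@dataclass
class Transform:
    id: str
    name: str
    title: Optional[str] = None


def create_table_sql(model) -> str:
    """Derive a CREATE TABLE statement from the fields of a dataclass."""
    columns = []
    for f in fields(model):
        # Fields that default to None may be NULL; all others are NOT NULL.
        nullable = f.default is None
        column = f"{f.name} TEXT" + ("" if nullable else " NOT NULL")
        if f.name == "id":
            column += " PRIMARY KEY"
        columns.append(column)
    return f"CREATE TABLE {model.__name__.lower()} ({', '.join(columns)})"


# Create the table in an in-memory database and round-trip one record.
conn = sqlite3.connect(":memory:")
conn.execute(create_table_sql(Transform))

record = Transform(id="t1", name="09-schema")
conn.execute("INSERT INTO transform VALUES (?, ?, ?)", astuple(record))

row = conn.execute("SELECT id, name, title FROM transform").fetchone()
restored = Transform(*row)
print(create_table_sql(Transform))
print(restored == record)
```

A real ORM adds type mapping, relationships, and query building on top of this, but the core correspondence is the same: class attributes become columns and instances become rows.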

You can set up your own modules or reach out for support within our enterprise plan.

import lamindb as ln
ln.track()
ℹ️ Instance: testuser1/mydata
ℹ️ User: testuser1
ℹ️ Added notebook: Transform(id='6ZBQKdB7Mvlh', v='0', name='09-schema', type=notebook, title='Model data', created_by='DzTjkKse', created_at=datetime.datetime(2023, 3, 30, 23, 17, 5))
ℹ️ Added run: Run(id='l1cjVOvwRlrb9n4OnMUQ', transform_id='6ZBQKdB7Mvlh', transform_v='0', created_by='DzTjkKse', created_at=datetime.datetime(2023, 3, 30, 23, 17, 5))

View

View the schema that is active in your DB instance using ln.schema.view().

It’s a graph of linked entities that correspond to related tables.

The central table references indexed data objects (File) that are transformed by runs (Run).

It’s a bit overwhelming, but you’ll see that many entities correspond to very fundamental concepts (like Gene or CellType). You’ll get to know them module by module.

ln.schema.view()
[Figure: schema graph rendered by ln.schema.view()]
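If you are curious how a graph of linked entities can be derived from related tables, the following sketch reads foreign keys from a small SQLite database and turns them into an adjacency mapping. It is not how ln.schema.view() is implemented. The table layout is a simplified assumption: the transform_id and created_by columns mirror the Run and Transform records shown above, while the run_id column on file and the schema_graph helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, assumed table layout; the real core schema has more tables and columns.
conn.executescript(
    """
    CREATE TABLE user (id TEXT PRIMARY KEY);
    CREATE TABLE transform (
        id TEXT PRIMARY KEY,
        created_by TEXT REFERENCES user (id)
    );
    CREATE TABLE run (
        id TEXT PRIMARY KEY,
        transform_id TEXT REFERENCES transform (id),
        created_by TEXT REFERENCES user (id)
    );
    CREATE TABLE file (
        id TEXT PRIMARY KEY,
        run_id TEXT REFERENCES run (id)  -- hypothetical link from files to runs
    );
    """
)


def schema_graph(conn):
    """Map each table name to the sorted list of tables it references."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    graph = {}
    for table in tables:
        # Each row is (id, seq, table, from, to, on_update, on_delete, match);
        # index 2 holds the name of the referenced table.
        refs = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
        graph[table] = sorted({ref[2] for ref in refs})
    return graph


for table, targets in schema_graph(conn).items():
    print(table, "->", targets)
```

Each key is a node and each listed table is an outgoing edge, which is the information a schema diagram draws as arrows between entities.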

Entities, tables, ORMs

Tip

You’ll typically access the entities in each schema module through their Python ORM, which you can look up with auto-complete.

Assuming lamindb.schema has been imported as lns, type lns.* for entities in the core schema module, lns.bionty.* for basic biological entities, etc.

Each ORM is a SQLModel, meaning it offers the functionality of both a SQLAlchemy ORM class and a Pydantic BaseModel.

You can inspect an ORM with auto-complete or with the introspection tools these classes provide, for example by listing its fields:

ln.Transform.__fields__
{'id': ModelField(name='id', type=str, required=False, default_factory='<function pipeline>'),
 'v': ModelField(name='v', type=str, required=False, default='1'),
 'name': ModelField(name='name', type=str, required=True),
 'type': ModelField(name='type', type=TransformType, required=False, default=pipeline),
 'title': ModelField(name='title', type=Optional[str], required=False, default=None),
 'reference': ModelField(name='reference', type=Optional[str], required=False, default=None),
 'created_by': ModelField(name='created_by', type=str, required=False, default_factory='<function current_user_id>'),
 'created_at': ModelField(name='created_at', type=datetime, required=True),
 'updated_at': ModelField(name='updated_at', type=Optional[datetime], required=False, default=None)}
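The required flag in the output above separates fields you must pass from fields that fall back to a default. As a rough standard-library analogue (a sketch with dataclasses, not the Pydantic mechanism itself), the same distinction can be computed by checking whether a field declares a default or a default factory. The Transform class below is a reduced, hypothetical copy of the fields listed above, and new_id and describe are illustrative helpers.

```python
from dataclasses import MISSING, dataclass, field, fields
from datetime import datetime
from typing import Optional


def new_id() -> str:
    """Hypothetical stand-in for an ID generator."""
    return "placeholder-id"


# Reduced, hypothetical copy of the fields shown above; not LaminDB's class.
@dataclass
class Transform:
    name: str
    created_at: datetime
    id: str = field(default_factory=new_id)
    v: str = "1"
    title: Optional[str] = None


def describe(model):
    """Label each field as required or optional based on its declared default."""
    summary = {}
    for f in fields(model):
        has_default = f.default is not MISSING or f.default_factory is not MISSING
        summary[f.name] = "optional" if has_default else "required"
    return summary


for name, status in describe(Transform).items():
    print(f"{name}: {status}")
```

As in the listing above, name and created_at come out as required, while id, v, and title fall back to defaults.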

You likely won’t need to, but you can access the underlying SQL table of an ORM like this:

ln.Transform.__table__
Table('core.transform', MetaData(), Column('id', AutoString(), table=<core.transform>, primary_key=True, nullable=False, default=ColumnDefault(<function pipeline at 0x7f966c3df790>)), Column('v', AutoString(), table=<core.transform>, primary_key=True, nullable=False, default=ColumnDefault('1')), Column('name', AutoString(), table=<core.transform>, nullable=False), Column('type', Enum('pipeline', 'notebook', 'app', name='transformtype'), table=<core.transform>, nullable=False, default=ColumnDefault(pipeline)), Column('title', AutoString(), table=<core.transform>), Column('reference', AutoString(), table=<core.transform>), Column('created_by', AutoString(), ForeignKey('core.user.id'), table=<core.transform>, nullable=False, default=ColumnDefault(<function current_user_id at 0x7f966c3df820>)), Column('created_at', DateTime(), table=<core.transform>, nullable=False, server_default=DefaultClause(<sqlalchemy.sql.functions.now at 0x7f966ca67f40; now>, for_update=False)), Column('updated_at', DateTime(), table=<core.transform>, onupdate=ColumnDefault(<sqlalchemy.sql.functions.now at 0x7f966ca67f10; now>)), schema=None)