Model data
Reference: lamindb.schema
Any LaminDB instance can mount an arbitrary number of schema modules in its SQL database.
Each schema module is a Python module (and package) in which every SQL table is represented by a Python class, its ORM.
You can set up your own modules or reach out for support within our enterprise plan.
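To make the "one Python class per SQL table" idea concrete, here is a minimal sketch that uses only the standard library (`dataclasses` and `sqlite3`). It is not how LaminDB or SQLModel implement their ORMs. The `Transform` class below is a hypothetical, simplified stand-in with three text columns, and the helpers `create_table`, `insert`, and `select_all` are invented for illustration.

```python
import sqlite3
from dataclasses import astuple, dataclass, fields
from typing import Optional


@dataclass
class Transform:
    """Hypothetical stand-in for one table: one attribute per column."""

    id: str
    name: str
    title: Optional[str] = None


def create_table(conn, model):
    # Derive the column list from the class definition.
    # Assumption for this sketch: every column is TEXT and the first field is the primary key.
    column_defs = []
    for i, f in enumerate(fields(model)):
        suffix = " PRIMARY KEY" if i == 0 else ""
        column_defs.append(f"{f.name} TEXT{suffix}")
    conn.execute(f"CREATE TABLE {model.__name__.lower()} ({', '.join(column_defs)})")


def insert(conn, record):
    # One placeholder per column; values come from the instance in field order.
    placeholders = ", ".join("?" for _ in fields(record))
    table = type(record).__name__.lower()
    conn.execute(f"INSERT INTO {table} VALUES ({placeholders})", astuple(record))


def select_all(conn, model):
    # Turn each row back into an instance of the class that describes the table.
    rows = conn.execute(f"SELECT * FROM {model.__name__.lower()}").fetchall()
    return [model(*row) for row in rows]


conn = sqlite3.connect(":memory:")
create_table(conn, Transform)
insert(conn, Transform(id="6ZBQKdB7Mvlh", name="09-schema", title="Model data"))
records = select_all(conn, Transform)
print(records)
```

A real ORM adds typing, validation, relationships, and sessions on top of this mapping, but the correspondence between class and table, and between attribute and column, is the same.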
import lamindb as ln
ln.track()
ℹ️ Instance: testuser1/mydata
ℹ️ User: testuser1
ℹ️ Added notebook: Transform(id='6ZBQKdB7Mvlh', v='0', name='09-schema', type=notebook, title='Model data', created_by='DzTjkKse', created_at=datetime.datetime(2023, 3, 30, 23, 17, 5))
ℹ️ Added run: Run(id='l1cjVOvwRlrb9n4OnMUQ', transform_id='6ZBQKdB7Mvlh', transform_v='0', created_by='DzTjkKse', created_at=datetime.datetime(2023, 3, 30, 23, 17, 5))
View
View the schema that is active in your DB instance using view().
It’s a graph of linked entities that correspond to related tables.
At its center is the table that references indexed data objects (File), which are transformed by runs (Run).
It’s a bit overwhelming, but you’ll see that many entities correspond to very fundamental concepts (like Gene or CellType). You’ll get to know them module by module.
ln.schema.view()
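As a rough illustration of what a "graph of linked entities" means at the SQL level, the sketch below builds two tables in an in-memory SQLite database and lists the foreign-key links between them. The table and column names (`run`, `file`, `run_id`) are hypothetical and chosen only to echo the File and Run entities mentioned above. They are not LaminDB's actual schema, and this is not how `view()` is implemented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two hypothetical tables: each file row points at the run that produced it.
conn.executescript(
    """
    CREATE TABLE run (id TEXT PRIMARY KEY);
    CREATE TABLE file (
        id TEXT PRIMARY KEY,
        run_id TEXT REFERENCES run (id)
    );
    """
)

# Nodes of the graph: all tables in the database.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# Edges of the graph: one per foreign key.
# PRAGMA foreign_key_list yields (id, seq, table, from, to, on_update, on_delete, match).
edges = []
for table in tables:
    for fk in conn.execute(f"PRAGMA foreign_key_list({table})"):
        edges.append((table, fk[3], fk[2], fk[4]))

for source, column, target, target_column in edges:
    print(f"{source}.{column} -> {target}.{target_column}")
```

A schema diagram is essentially a drawing of these nodes and edges, with one box per table and one arrow per foreign key.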
Entities, tables, ORMs
Tip
You’ll typically access the entities in each schema module through their Python ORM, which you can look up with auto-complete.
Type lns.* for entities in the core schema module, lns.bionty.* for basic biological entities, etc.
Each ORM is a SQLModel, meaning it offers all the functionality of both a SQLAlchemy ORM and a Pydantic BaseModel.
You can inspect it with auto-lookup or with any of the introspection attributes these classes provide, e.g.,
ln.Transform.__fields__
{'id': ModelField(name='id', type=str, required=False, default_factory='<function pipeline>'),
'v': ModelField(name='v', type=str, required=False, default='1'),
'name': ModelField(name='name', type=str, required=True),
'type': ModelField(name='type', type=TransformType, required=False, default=pipeline),
'title': ModelField(name='title', type=Optional[str], required=False, default=None),
'reference': ModelField(name='reference', type=Optional[str], required=False, default=None),
'created_by': ModelField(name='created_by', type=str, required=False, default_factory='<function current_user_id>'),
'created_at': ModelField(name='created_at', type=datetime, required=True),
'updated_at': ModelField(name='updated_at', type=Optional[datetime], required=False, default=None)}
You likely won’t need to, but you can access the underlying SQL table of an ORM like this:
ln.Transform.__table__
Table('core.transform', MetaData(), Column('id', AutoString(), table=<core.transform>, primary_key=True, nullable=False, default=ColumnDefault(<function pipeline at 0x7f966c3df790>)), Column('v', AutoString(), table=<core.transform>, primary_key=True, nullable=False, default=ColumnDefault('1')), Column('name', AutoString(), table=<core.transform>, nullable=False), Column('type', Enum('pipeline', 'notebook', 'app', name='transformtype'), table=<core.transform>, nullable=False, default=ColumnDefault(pipeline)), Column('title', AutoString(), table=<core.transform>), Column('reference', AutoString(), table=<core.transform>), Column('created_by', AutoString(), ForeignKey('core.user.id'), table=<core.transform>, nullable=False, default=ColumnDefault(<function current_user_id at 0x7f966c3df820>)), Column('created_at', DateTime(), table=<core.transform>, nullable=False, server_default=DefaultClause(<sqlalchemy.sql.functions.now at 0x7f966ca67f40; now>, for_update=False)), Column('updated_at', DateTime(), table=<core.transform>, onupdate=ColumnDefault(<sqlalchemy.sql.functions.now at 0x7f966ca67f10; now>)), schema=None)