lamindb.ULabel
- class lamindb.ULabel(name: str, description: Optional[str] = None, reference: Optional[str] = None, reference_type: Optional[str] = None)
Bases: Registry, HasParents, CanValidate
Universal label ontology.
- Parameters:
name (str) – A name.
description (str) – A description.
reference (Optional[str] = None) – For instance, an external ID or a URL.
reference_type (Optional[str] = None) – For instance, "url".
A ULabel record provides the easiest way to annotate a file or dataset with a label: "My project", "curated", or "Batch X":
>>> my_project = ULabel(name="My project")
>>> my_project.save()
>>> dataset.ulabels.add(my_project)
In some cases, a label is measured within a file or dataset, and a feature (a Feature record) denotes the column name in which the label is stored. For instance, the dataset might contain measurements across 2 species of the Iris flower: "setosa" & "versicolor".
See Tutorial: Features & labels to learn more.
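The distinction between a feature and its labels can be illustrated without lamindb at all. The following is a minimal sketch in plain Python; the column name `species`, the `sepal_length` values, and the row data are made-up assumptions for illustration, not part of any lamindb API.

```python
# Toy illustration only: plain Python data, no lamindb calls.
# The column names and the rows are made-up example values.
rows = [
    {"sepal_length": 5.1, "species": "setosa"},
    {"sepal_length": 7.0, "species": "versicolor"},
    {"sepal_length": 4.9, "species": "setosa"},
]

# The feature is the column in which the labels are stored.
feature_name = "species"

# The labels are the distinct values measured in that column.
labels = sorted({row[feature_name] for row in rows})

# A dataset-level label, by contrast, is not tied to any column.
dataset_labels = ["My project"]

print(feature_name, labels)
```

Here "setosa" and "versicolor" are labels measured under the feature `species`, whereas "My project" annotates the dataset as a whole.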
Note
If you work with complex entities like cell lines, cell types, tissues, etc., consider using the pre-defined biological registries in lnschema_bionty to label files & datasets.
If you work with biological samples, the only sustainable way of tracking metadata is likely to create a custom schema module.
See also
lamindb.Feature()
Dimensions of measurement for files & datasets.
Examples
Create a new label:
>>> my_project = ln.ULabel(name="My project")
>>> my_project.save()
Label a file without associating it to a feature:
>>> ulabel = ln.ULabel.filter(name="My project").one()
>>> file = ln.File("./myfile.csv")
>>> file.save()
>>> file.ulabels.add(ulabel)
>>> file.ulabels.list("name")
['My project']
Organize labels in a hierarchy:
>>> ulabels = ln.ULabel.lookup()  # create a lookup
>>> is_project = ln.ULabel(name="is_project")  # create a super-category `is_project`
>>> is_project.save()
>>> ulabels.my_project.parents.add(is_project)
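To make the idea of a parent hierarchy concrete, here is a minimal in-memory sketch. It is not how lamindb stores or resolves parents; `ToyLabel`, its `ancestors` method, and the `is_metadata` label are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # keep identity-based hashing so labels can be put in sets
class ToyLabel:
    """Hypothetical stand-in for a label record; not the lamindb class."""

    name: str
    parents: set = field(default_factory=set)

    def ancestors(self):
        """Return the names of all direct and indirect parents."""
        seen = set()
        stack = list(self.parents)
        while stack:
            label = stack.pop()
            if label.name not in seen:
                seen.add(label.name)
                stack.extend(label.parents)
        return seen


# Build a two-level hierarchy: My project -> is_project -> is_metadata.
is_metadata = ToyLabel("is_metadata")  # made-up top-level category
is_project = ToyLabel("is_project")
is_project.parents.add(is_metadata)
my_project = ToyLabel("My project")
my_project.parents.add(is_project)

print(sorted(my_project.ancestors()))
```

Grouping labels under a super-category like `is_project` means every label filed beneath it can be found by walking the parent links, which is the purpose the parents field describes.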
Query by ULabel:
>>> ln.File.filter(ulabels=my_project).first()
Fields
- id AutoField
Internal id, valid only in one DB instance.
- uid CharField
A universal random id, valid across DB instances.
- name CharField
Name or title of ulabel (required).
- description TextField
A description (optional).
- reference CharField
A reference like URL or external ID.
- reference_type CharField
Type of reference, e.g., donor_id from Vendor X.
- created_at DateTimeField
Time of creation of record.
- updated_at DateTimeField
Time of last update to record.
- created_by ForeignKey
Creator of record, a User.
- parents ManyToManyField
Parent labels, useful to hierarchically group labels (optional).
Methods
- classmethod from_values(values, **kwargs)
Bulk create validated records by parsing values for an identifier (a name, an id, etc.).
- Parameters:
values (TypeVar(ListLike, list, pd.Series, np.array)) – A list of values for an identifier, e.g. ["name1", "name2"].
field – A Registry field to look up, e.g., lb.CellMarker.name.
**kwargs – Additional conditions for creation of records, e.g., organism="human".
- Return type:
List[ULabel]
- Returns:
A list of validated records. For bionty registries, also returns knowledge-coupled records.
Notes
For more info, see tutorial: Manage biological registries.
Examples
Bulk creating from non-validated values logs warnings and returns an empty list:
>>> ulabels = ln.ULabel.from_values(["benchmark", "prediction", "test"], field="name")
>>> assert len(ulabels) == 0
Bulk create records from validated values returns the corresponding existing records:
>>> ln.save([ln.ULabel(name=name) for name in ["benchmark", "prediction", "test"]])
>>> ulabels = ln.ULabel.from_values(["benchmark", "prediction", "test"], field="name")
>>> assert len(ulabels) == 3
Bulk create records with shared kwargs:
>>> pipelines = ln.Transform.from_values(["Pipeline 1", "Pipeline 2"], field="name",
...                                      type="pipeline", version="1")
>>> pipelines
Bulk create records from bionty:
>>> import lnschema_bionty as lb
>>> records = lb.CellType.from_values(["T cell", "B cell"], field="name")
>>> records
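The validated versus non-validated behavior shown in the ULabel examples above can be modeled with a small toy function. This is only a sketch of the observable behavior (empty result before saving, one record per value after saving), not lamindb's implementation; `from_values_sketch` and `registry_names` are hypothetical names, and the sketch does not model the knowledge-coupled records that bionty registries return.

```python
def from_values_sketch(values, registry_names):
    """Toy model: return only the values already present in the registry.

    `registry_names` is a hypothetical stand-in for names saved in a registry.
    """
    validated = [v for v in values if v in registry_names]
    not_validated = [v for v in values if v not in registry_names]
    if not_validated:
        # Stand-in for the logged warning mentioned in the examples.
        print(f"warning: {len(not_validated)} value(s) not validated: {not_validated}")
    return validated


values = ["benchmark", "prediction", "test"]

# Nothing saved yet: every value fails validation, so the result is empty.
empty = from_values_sketch(values, registry_names=set())

# After "saving" the names, the same call returns one entry per value.
saved = from_values_sketch(values, registry_names=set(values))

print(len(empty), len(saved))
```

The point of the sketch is that the same call yields different results depending on what the registry already contains, which is why the examples save the labels before calling from_values a second time.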