lamindb.Feature
- class lamindb.Feature(name: str, type: str, unit: Optional[str], description: Optional[str], synonyms: Optional[str])
Bases: Registry, CanValidate
Dimensions of measurement.
See also
from_df()
Create feature records from DataFrame.
features
Manage feature annotations of files & datasets.
lamindb.ULabel()
ULabels for files & datasets.
- Parameters:
name – str
Name of the feature, typically a column name.
type – str
Simple type ("number", "category", "datetime").
unit – Optional[str] = None
Unit of measure, ideally SI ("m", "s", "kg", etc.) or "normalized" etc.
description – Optional[str] = None
A description.
synonyms – Optional[str] = None
Bar-separated synonyms.
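The bar-separated convention used for synonyms can be sketched in plain Python. The helpers `split_bar_separated` and `join_bar_separated` below are hypothetical names introduced for illustration only; they are not part of lamindb and only show one reasonable way to read and write such strings.

```python
from typing import List, Optional


def split_bar_separated(value: Optional[str]) -> List[str]:
    """Split a bar-separated string into trimmed, non-empty items.

    Hypothetical helper, not a lamindb function.
    """
    if not value:
        return []
    return [part.strip() for part in value.split("|") if part.strip()]


def join_bar_separated(items: List[str]) -> str:
    """Join items into a single bar-separated string.

    Hypothetical helper, not a lamindb function.
    """
    return "|".join(items)


# Example synonym string as it might be passed to the `synonyms` parameter.
synonyms = "weight|body mass| mass "
parsed = split_bar_separated(synonyms)
roundtrip = join_bar_separated(parsed)
```

Trimming whitespace and dropping empty items is a design choice of this sketch; the library may treat such input differently.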
Note
Features and labels denote two ways of using entities to organize data:
- A feature qualifies which entity is measured (e.g., a vector of categories).
- A label is a measured value of an entity (a category).
If re-shaping data introduced ambiguity, ask yourself what the joint measurement was: a feature qualifies variables in a joint measurement. You might be looking at a label if the data was re-shaped from there.
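The distinction can be illustrated with a small, library-free sketch. The table contents and the names `wide`, `long_rows`, `feature_names`, and `labels` are made up for this example; nothing here calls lamindb.

```python
# Wide layout (the joint measurement): each column name is a feature,
# and each categorical cell value is a label.
wide = [
    {"sample": "s1", "cell_type": "T cell", "treatment": "drug"},
    {"sample": "s2", "cell_type": "B cell", "treatment": "control"},
]

# Features: the variables of the joint measurement (all columns but the key).
feature_names = [key for key in wide[0] if key != "sample"]

# Labels: the distinct measured values, grouped by the feature they belong to.
labels = {name: sorted({row[name] for row in wide}) for name in feature_names}

# Re-shape to a long layout: one row per (sample, variable, value).
long_rows = [
    {"sample": row["sample"], "variable": name, "value": row[name]}
    for row in wide
    for name in feature_names
]

# In the long layout, "cell_type" now appears as a cell value in the
# "variable" column and could be mistaken for a label. Going back to the
# wide layout shows that it qualifies a variable, so it is a feature.
values_in_variable_column = sorted({row["variable"] for row in long_rows})
```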
Notes
For more control, you can use lnschema_bionty ORMs to manage common basic biological entities like genes, proteins & cell markers involved in expression/count measurements.
Similarly, you can define custom ORMs to manage high-level derived features like gene sets, malignancy, etc.
Examples
>>> df = pd.DataFrame({"feat1": [1, 2], "feat2": [3.1, 4.2], "feat3": ["cond1", "cond2"]})
>>> features = ln.Feature.from_df(df)
>>> features.save()
>>> # the information from the DataFrame is now available in the Feature table
>>> ln.Feature.filter().df()
id    name    type
 a   feat1     int
 b   feat2   float
 c   feat3     str
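To make the `type` column in the output above concrete, here is a minimal sketch of how a simple type could be derived from a column's values. The function `infer_simple_type` is a hypothetical helper written for this illustration; it is not how `from_df()` is implemented, and the real inference presumably relies on the DataFrame's dtypes.

```python
from typing import Dict, List


def infer_simple_type(values: List[object]) -> str:
    """Return a simple type name for a list of column values.

    Hypothetical helper for illustration; not part of lamindb.
    """
    kinds = {type(value) for value in values}
    if not kinds:
        return "object"  # empty column: nothing to infer from
    if kinds == {int}:
        return "int"
    if kinds <= {int, float}:
        return "float"  # floats, or a mix of ints and floats
    if kinds == {str}:
        return "str"
    return "object"  # anything else, including mixed types


# Same columns as in the DataFrame example above, as plain lists.
columns: Dict[str, List[object]] = {
    "feat1": [1, 2],
    "feat2": [3.1, 4.2],
    "feat3": ["cond1", "cond2"],
}
inferred = {name: infer_simple_type(values) for name, values in columns.items()}
```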
Fields
- id AutoField
Internal id, valid only in one DB instance.
- uid CharField
Universal id, valid across DB instances.
- name CharField
Name of feature (required).
- type CharField
Simple type.
If "category", consider managing categories with ULabel or another Registry for managing labels.
- unit CharField
Unit of measure, ideally SI (m, s, kg, etc.) or "normalized" etc. (optional).
- description TextField
A description.
- registries CharField
Registries that provide values for labels, bar-separated (|) (optional).
- synonyms TextField
Bar-separated (|) synonyms (optional).
- created_at DateTimeField
Time of creation of record.
- updated_at DateTimeField
Time of last update to record.
- created_by ForeignKey
Creator of record, a User.
- feature_sets ManyToManyField
Feature sets linked to this feature.
Methods
- save(*args, **kwargs)
Save.
- Return type:
None