lamindb.Feature

class lamindb.Feature(name: str, type: str, unit: str | None, description: str | None, synonyms: str | None)

Bases: Registry, CanValidate

Dimensions of measurement.

See also

from_df()

Create feature records from DataFrame.

features

Manage feature annotations of artifacts & collections.

lamindb.ULabel()

ULabels for artifacts & collections.

Parameters:
  • name (str): Name of the feature, typically a column name.

  • type (str): Simple type ("number", "category", "datetime").

  • unit (str | None = None): Unit of measure, ideally SI ("m", "s", "kg", etc.), or another convention such as "normalized".

  • description (str | None = None): A description.

  • synonyms (str | None = None): Bar-separated synonyms.

  • registries (str | None = None): Bar-separated Registries that provide values for labels.
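
The synonyms and registries parameters each take a single bar-separated string. A minimal sketch of composing and splitting such a string in plain Python is shown below. The helper names join_bar and split_bar are hypothetical and not part of lamindb; the example values are made up.

```python
def join_bar(values):
    """Join values into one bar-separated string (hypothetical helper)."""
    cleaned = [v.strip() for v in values if v.strip()]
    for v in cleaned:
        # A literal bar inside a value would be indistinguishable from a separator.
        if "|" in v:
            raise ValueError(f"value may not contain '|': {v!r}")
    return "|".join(cleaned)


def split_bar(text):
    """Split a bar-separated string into a list; None or '' gives an empty list."""
    if not text:
        return []
    return [part.strip() for part in text.split("|") if part.strip()]


synonyms = join_bar(["temp", "temperature_k"])
print(synonyms)             # temp|temperature_k
print(split_bar(synonyms))  # ['temp', 'temperature_k']
```

The resulting string is the kind of value that could be passed as synonyms; rejecting values that contain a bar keeps the round trip unambiguous.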

Note

Features and labels denote two ways of using entities to organize data:

  1. A feature qualifies what is measured: it names a variable (e.g., a vector of categories).

  2. A label is a measured value of an entity (a single category).

If re-shaping data has introduced ambiguity, ask yourself what the joint measurement was: a feature qualifies a variable in that joint measurement. If the data was re-shaped from there, you might be looking at a label.
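
The distinction can be sketched with plain Python data. In the wide, joint-measurement form below, the column names are candidates for features and the categorical values are candidates for labels. After re-shaping to long form, the former column names appear as values, which is the ambiguity the note describes. All column names and values are made up for illustration.

```python
# A joint measurement: each row is one observation, each column one variable.
wide = [
    {"cell_type": "T cell", "treatment": "drug_a", "viability": 0.91},
    {"cell_type": "B cell", "treatment": "control", "viability": 0.87},
]

# Features qualify the variables of the joint measurement (the column names).
feature_names = sorted(wide[0].keys())

# Labels are the measured values of the categorical variables.
labels = sorted({row[col] for row in wide for col in ("cell_type", "treatment")})

# Re-shaping to long form moves column names into a "variable" column, so they
# now look like values even though they were features in the joint measurement.
long = [
    {"obs": i, "variable": name, "value": value}
    for i, row in enumerate(wide)
    for name, value in row.items()
]

print(feature_names)  # ['cell_type', 'treatment', 'viability']
print(labels)         # ['B cell', 'T cell', 'control', 'drug_a']
print(len(long))      # 6
```

Going back to the wide form recovers which names were variables, and therefore features.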

Notes

For more control, you can use bionty ORMs to manage common basic biological entities like genes, proteins & cell markers involved in expression/count measurements.

Similarly, you can define custom ORMs to manage high-level derived features like gene sets, malignancy, etc.

Examples

>>> import lamindb as ln
>>> import pandas as pd
>>> df = pd.DataFrame({"feat1": [1, 2], "feat2": [3.1, 4.2], "feat3": ["cond1", "cond2"]})
>>> features = ln.Feature.from_df(df)
>>> features.save()
>>> # the information from the DataFrame is now available in the Feature table
>>> ln.Feature.filter().df()
id    name    type
 a   feat1     int
 b   feat2   float
 c   feat3     str
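
The table above has one record per column, with a type derived from the column's values. The sketch below reproduces that mapping for plain Python lists. It is an illustration only and not lamindb's implementation; the function name infer_simple_type is hypothetical, and the type strings are taken from the example output.

```python
def infer_simple_type(values):
    """Return 'int', 'float', or 'str' for a column of plain Python values."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return "int"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "float"
    return "str"


columns = {"feat1": [1, 2], "feat2": [3.1, 4.2], "feat3": ["cond1", "cond2"]}
inferred = {name: infer_simple_type(vals) for name, vals in columns.items()}
print(inferred)  # {'feat1': 'int', 'feat2': 'float', 'feat3': 'str'}
```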

Fields

id AutoField

Internal id, valid only in one DB instance.

uid CharField

Universal id, valid across DB instances.

name CharField

Name of feature (required).

type CharField

Simple type.

If “category”, consider managing categories with ULabel or another Registry for managing labels.

unit CharField

Unit of measure, ideally SI (m, s, kg, etc.) or ‘normalized’ etc. (optional).

description TextField

A description.

registries CharField

Registries that provide values for labels, bar-separated (|) (optional).

synonyms TextField

Bar-separated (|) synonyms (optional).

created_at DateTimeField

Time of creation of record.

updated_at DateTimeField

Time of last update to record.

created_by ForeignKey

Creator of record, a User.

feature_sets ManyToManyField

Feature sets linked to this feature.

Methods

classmethod from_df(df, field=None)

Create Feature records for columns.

Return type:

RecordsList

save(*args, **kwargs)

Save.

Return type:

None