Glossary#

feature#

A feature is an individual measurable property of a phenomenon [Wikipedia], a measured event like a microscopy image or transcriptomic readout of a biological system.

It’s equivalent to the term “independent variable” in statistics, but is the preferred term to denote dimensions of “feature spaces” in machine learning.

label#

A label refers to a descriptor or tag that is assigned to something to describe, identify, or categorize it.

observation#

In statistics (machine learning), an observation refers to a particular measured instance of a set of random variable.

In biology, an observation typically corresponds to measuring (reading out) a set of properties from a biological sample.

record#

A record is a data structure that consists in fields, typically of different types but in a fixed sequence [Wikipedia].

Importantly, we refer to instances of Registry as records. Once a record is inserted into a database table, it becomes a row in that table. Every Registry class (in LaminDB) has a 1:1 correspondence with a database table and a django model, every row in a database table has a 1:1 correspondence with a record.

A record often stores jointly measured variables in its fields, but in general allows updating fields when more information becomes available or changes.

sample#

In biology, a sample is an instance or part of a biological system.

In statistics (machine learning), a sample is an observation of a set of random variables (features, labels, metadata).

Depending on the observational unit chosen for representing data, the statistical sample might correspond 1:1 to a biological sample. Often, this choice presents an interesting cases, as variation across physical samples - targeted in the experimental design - can directly be explained by variation across statistical (digital) samples.

variable#

We almost always mean “random variable”, when we say “variable”.

Random variables and their observations are core to statistics [Wikipedia].

An independent variable is sometimes called a feature, “predictor variable”, “regressor”, “covariate”, “explanatory variable”, “risk factor”, “input variable”, among others [Wikipedia].

A dependent variable is sometimes called a “response variable”, “regressand”, “criterion”, “predicted variable”, “measured variable”, “explained variable”, “experimental variable”, “responding variable”, “outcome variable”, “output variable”, “target” or “label”.