A feature is an individual measurable property of a phenomenon [Wikipedia], a measured event like a microscopy image or transcriptomic readout of a biological system.
It’s equivalent to the term “independent variable” in statistics, but is the preferred term to denote dimensions of “feature spaces” in machine learning.
In statistics (machine learning), an observation refers to a particular measured instance of a set of random variable.
In biology, an observation typically corresponds to measuring (reading out) a set of properties from a biological sample.
Importantly, we refer to instances of
SQLModelas records. Once a record is inserted into a database table, it becomes a row in that table. Every
SQLModelclass (in LaminDB) has a 1:1 correspondence with a database table and a pydantic
BaseModel, every row in a database table has a 1:1 correspondence with a record.
A record often stores jointly measured variables in its fields, but in general allows updating fields when more information becomes available or changes.
In biology, a sample is an instance or part of a biological system.
In statistics (machine learning), a sample is an observation of a set of random variables (features, labels, metadata).
Depending on the observational unit chosen for representing data, the statistical sample might correspond 1:1 to a biological sample. Often, this choice presents an interesting cases, as variation across physical samples - targeted in the experimental design - can directly be explained by variation across statistical (digital) samples.
We almost always mean “random variable”, when we say “variable”.
Random variables and their observations are core to statistics [Wikipedia].
A dependent variable is sometimes called a “response variable”, “regressand”, “criterion”, “predicted variable”, “measured variable”, “explained variable”, “experimental variable”, “responding variable”, “outcome variable”, “output variable”, “target” or “label”.