Open data platform for biology
Enable learning at scale with an open-source lakehouse that natively supports biological data and data lineage.

Trace data & code
Always know where a dataset came from and what it's used for. Capture data lineage in interactive analyses & scripts with a simple function call.

Manage datasets at scale
Query flexibly across storage and databases with a biology-aware lakehouse that goes beyond tables.

Manage flexible metadata
One Python class for your LIMS: experiments, samples, datasets, models, and more. Built on the Django ORM with ontology support.

Validate & annotate datasets
Use schemas to enforce consistency. Annotate datasets with a few lines of code.

Ready for the enterprise
Your data stays in your infrastructure. Fine-grained access management. SOC 2 certified.

Build your organization's long-term memory
Transform artifacts into more useful representations: queryable datasets, predictive models, and analytical insights.
