This page is, for now, just a stub.
Lamin was influenced by many projects. Here we will attempt to list all of them.
Lamin complements workflow managers with its focus on interactive analyses, biological entities & provenance beyond deterministic workflows (app uploads & notebooks). We encourage using a workflow manager to handle scheduling, execution, errors & parameters of workflows, and then integrating successful executions into LaminDB for full provenance tracking.
Despite Lamin’s different scope, the workflow manager redun greatly influenced LaminDB. In particular, naming choices in LaminDB’s File class (.stage()) & hashing strategies for sets are inspired by redun’s File class.
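To make the idea of hashing a set more concrete, here is a minimal sketch of one common approach: hash each member, sort the member hashes, and hash the joined result so that the digest does not depend on ordering. The function names hash_bytes and hash_set are hypothetical, and the sketch is not meant to reproduce the exact strategy used by LaminDB or redun.

```python
import hashlib


def hash_bytes(content: bytes) -> str:
    """Return the hex MD5 digest of one file's content."""
    return hashlib.md5(content).hexdigest()


def hash_set(member_hashes: set[str]) -> str:
    """Combine member hashes into one digest that ignores ordering.

    Hypothetical helper: sorting the member hashes before joining them
    makes the result independent of iteration order.
    """
    joined = ":".join(sorted(member_hashes)).encode("utf-8")
    return hashlib.md5(joined).hexdigest()


# The same three "files", listed in a different order.
files_a = [b"counts", b"metadata", b"images"]
files_b = [b"images", b"counts", b"metadata"]

digest_a = hash_set({hash_bytes(c) for c in files_a})
digest_b = hash_set({hash_bytes(c) for c in files_b})
# Adding a member changes the digest of the set.
digest_c = hash_set({hash_bytes(c) for c in files_a + [b"extra"]})

print(digest_a == digest_b)  # same members, same digest
print(digest_a == digest_c)  # different members, different digest
```

A digest like this can serve as a cheap identity check for a collection of files without comparing every member.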
Like redun, Lamin aims for idempotency, but for different use cases & with a largely different design.
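As an illustration of what idempotency can mean for data registration, the sketch below keys records by a content hash, so that saving the same content twice returns the existing record instead of creating a duplicate. The Registry and Record classes are hypothetical and in-memory only. They are an assumption for illustration, not LaminDB's or redun's design.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Record:
    uid: int
    content_hash: str
    description: str


@dataclass
class Registry:
    """Hypothetical in-memory registry keyed by content hash."""

    _by_hash: dict = field(default_factory=dict)

    def save(self, content: bytes, description: str) -> Record:
        """Register content; repeated calls with equal content are no-ops."""
        content_hash = hashlib.sha256(content).hexdigest()
        existing = self._by_hash.get(content_hash)
        if existing is not None:
            # Same content was saved before: return the stored record.
            return existing
        record = Record(
            uid=len(self._by_hash) + 1,
            content_hash=content_hash,
            description=description,
        )
        self._by_hash[content_hash] = record
        return record

    def __len__(self) -> int:
        return len(self._by_hash)


registry = Registry()
first = registry.save(b"gene,count\nA,1\n", "raw counts")
second = registry.save(b"gene,count\nA,1\n", "raw counts, re-run")
third = registry.save(b"gene,count\nA,2\n", "updated counts")

print(first is second)  # re-running the save does not create a new record
print(len(registry))    # only distinct contents are registered
```

Re-running a notebook or script that calls save() therefore leaves the registry unchanged unless the content itself changed.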
Like redun & git, LaminDB is a distributed system in which any LaminDB instance can exchange & share data with any other LaminDB instance. (Currently, this feature is built into the design, but not yet fully implemented.)
LaminDB hasn’t knowingly been influenced by other workflow managers.
In LaminDB, ontologies are used to standardize & validate metadata based on the plug-in lnschema_bionty. It wraps common public ontologies for which Lamin caches curated assets on S3 for robust availability.
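The difference between validating and standardizing metadata against an ontology can be sketched with a toy, in-memory term table. Everything in this example is hypothetical: the Term class, the TOY identifiers, and the validate and standardize functions are illustrative assumptions and not the API of lnschema_bionty.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    ontology_id: str
    name: str
    synonyms: tuple = ()


# Toy ontology with placeholder identifiers (not real ontology IDs).
TOY_ONTOLOGY = [
    Term("TOY:0001", "T cell", ("T lymphocyte", "T-cell")),
    Term("TOY:0002", "B cell", ("B lymphocyte", "B-cell")),
]


def build_lookup(terms):
    """Map every lower-cased name and synonym to its term."""
    lookup = {}
    for term in terms:
        for label in (term.name, *term.synonyms):
            lookup[label.lower()] = term
    return lookup


LOOKUP = build_lookup(TOY_ONTOLOGY)
CANONICAL_NAMES = {term.name for term in TOY_ONTOLOGY}


def validate(values):
    """Return True for each value that exactly matches a canonical name."""
    return [value in CANONICAL_NAMES for value in values]


def standardize(values):
    """Replace known names and synonyms (case-insensitive) with canonical names.

    Unknown values are returned unchanged so they can be reviewed manually.
    """
    return [
        LOOKUP[value.lower()].name if value.lower() in LOOKUP else value
        for value in values
    ]


raw = ["T cell", "b lymphocyte", "T-cell", "my new cell type"]
print(validate(raw))     # [True, False, False, False]
print(standardize(raw))  # ['T cell', 'B cell', 'T cell', 'my new cell type']
```

In this sketch, validation only reports which values already match, while standardization rewrites synonyms and leaves unmapped values in place for curation.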
We’re not aware of another tool that focuses on leveraging ontologies for curation & validation, but there exist several tools that extend & harmonize ontologies for building knowledge graphs. We list two of them below.
LaminDB does not attempt to create a knowledge graph but assumes that associations between entities are mainly found through experimentation, statistics & machine learning.
Within LaminDB, connections between entities can still be mapped through the pathway entity, by using enrichment tools, or by defining relations between biological entities in a custom schema. Some relations might be added to lnschema_bionty in the future.
Biocypher is a Python package that simplifies the creation of knowledge graphs.
Built upon a modular framework, it enables users to manipulate and harmonize ontologies.
gget provides a simple, intuitive API to query existing web servers of genomic databases.
With Bionty, Lamin provides a similar tool with three important differences:
Bionty focuses on leveraging public ontologies for data management (validation, standardization, annotation) rather than queries. In comparison to gget, Bionty’s queries are more limited.
To enable robust & performant access for usage in data pipelines that bulk-validate, -standardize, or -annotate, Lamin hosts versioned ontologies on AWS S3 instead of relying on the sometimes flaky availability of existing public web servers.
Bionty can be plugged into LaminDB to easily import records from public ontologies into biological registries, managed in a simple database.