  • 2024-04-03

Training foundation models on large collections of scRNA-seq data

A few labs and companies now train models on large-scale scRNA-seq count matrices and related data modalities. But unlike for many other data types, there isn’t yet a playbook for training on datasets that don’t fit into memory.


  • 2022-08-29

nbproject: Manage Jupyter notebooks

nbproject is an open-source Python tool to help manage Jupyter notebooks with metadata, dependency, and integrity tracking. A draft-to-publish workflow creates more reproducible notebooks with context.


  • 2022-08-27

readfcs: Read FCS files

readfcs is a lightweight open-source Python package that loads data and metadata from Flow Cytometry Standard (FCS) files into DataFrame and AnnData objects, allowing users to flexibly use downstream analytical tools.


  • 2022-07-31

Key problems of data-heavy R&D

The complexity of modern R&D data often prevents it from delivering the scientific progress it promises.


  • 2022-05-04

Hello world!

We just launched lamin.ai as a place for sharing prototypes with our beta customers and collaborators. Over time, we’ll add public releases and use this blog to explain our work.