Convert a number of files to a single array store

In the previous notebooks, we’ve seen how to incrementally create a dataset and train models on it.

Once we have a dataset of validated files, we might want to combine them into one big array store.

This is what the CELLxGENE team did for the data in the CELLxGENE portal: a large number of .h5ad files were concatenated into a single array store.

This requires duplicating the data that’s present in the collection of .h5ad files, but it has the advantage that one can now query slices by arbitrary metadata across the whole dataset, rather than only within individual files.

See how this looks for CELLxGENE here: CELLxGENE: scRNA-seq.