lamindb.dev.QuerySet

class lamindb.dev.QuerySet(model=None, query=None, using=None, hints=None)

Bases: QuerySet, CanValidate, IsTree

Lazily loaded records returned by queries.

See also

django QuerySet

Examples

>>> ln.ULabel(name="my label").save()
>>> queryset = ln.ULabel.filter(name="my label")
>>> queryset
<QuerySet [ULabel(id=MIeZISeF, name=my label, updated_at=2023-07-19 19:53:34, created_by_id=DzTjkKse)]>

Attributes

db property

Return the database used if this query is executed now.

ordered property

Return True if the QuerySet is ordered – i.e. has an order_by() clause or a default ordering on the model (or is empty).

query property

Methods

df(include=None)

Convert to pd.DataFrame.

By default, shows all fields that aren’t many-to-many fields, except created_at.

To include many-to-many fields, use the include parameter.

include (Optional[List[str]], default: None) – Additional (many-to-many) fields to include. Takes expressions like "labels__name" or "cell_types__name".

Return type:

DataFrame

Examples

>>> ln.save(ln.ULabel.from_values(["ULabel1", "ULabel2", "ULabel3"], field="name"))
>>> ln.ULabel.filter().df()
>>> label = ln.ULabel.filter(name="ULabel1").one()
>>> parent = ln.ULabel.filter(name="ULabel2").one()
>>> label.parents.add(parent)
>>> ln.ULabel.filter().df(include=["parents__name", "parents__created_by_id"])
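
The double-underscore expressions accepted by include can be read as "relation, then attribute". The following is a minimal sketch of that idea over plain dictionaries, not the library's implementation: flatten_records, the sample records, and the choice to name each extra column after its expression are all assumptions made for illustration.

```python
def flatten_records(records, include=None):
    """Build one flat row per record.

    Scalar fields are copied as-is. Each "relation__attribute" expression in
    `include` adds a column listing that attribute for every related record.
    """
    rows = []
    for record in records:
        # By default keep only plain fields, i.e. skip list-valued relations.
        row = {k: v for k, v in record.items() if not isinstance(v, list)}
        for expression in include or []:
            relation, attribute = expression.split("__", 1)
            row[expression] = [rel[attribute] for rel in record.get(relation, [])]
        rows.append(row)
    return rows


# Hypothetical in-memory records; "parents" stands in for a many-to-many field.
records = [
    {"name": "ULabel1", "parents": [{"name": "ULabel2"}]},
    {"name": "ULabel2", "parents": []},
]
rows = flatten_records(records, include=["parents__name"])
```

Each resulting row is a flat dictionary, so a list of such rows could be passed to pd.DataFrame to obtain a table.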

first()

Return the first result in the query set if it is non-empty, otherwise None.

Return type:

Optional[Registry]

Examples

>>> ln.save(ln.ULabel.from_values(["ULabel1", "ULabel2", "ULabel3"], field="name"))
>>> queryset = ln.ULabel.filter(name__icontains="ULabel")
>>> queryset.first()
ULabel(id=NAgTZxoo, name=ULabel1, updated_at=2023-07-19 19:25:48, created_by_id=DzTjkKse)

inspect(values, field=None, **kwargs)

Inspect whether values are mappable to a field.

Being mappable means that an exact match exists.

  • values (TypeVar(ListLike, list, pd.Series, np.array)) – Values that will be checked against the field.

  • field (Union[str, TypeVar(StrField, str, DeferredAttribute), None], default: None) – The field to map the values against. Examples are ‘ontology_id’ to map against the source ID or ‘name’ to map against the ontology’s name field.

  • mute – Mute logging.

See also

validate()

Examples

>>> import lnschema_bionty as lb
>>> lb.settings.organism = "human"
>>> ln.save(lb.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol"))
>>> gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
>>> result = lb.Gene.inspect(gene_symbols, field=lb.Gene.symbol)
✅ 2 terms (50.00%) are validated
🔶 2 terms (50.00%) are not validated
    🟠 detected synonyms
    to increase validated terms, standardize them via .standardize()
>>> result.validated
['A1CF', 'A1BG']
>>> result.non_validated
['FANCD1', 'FANCD20']
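
To make "mappable means an exact match exists" concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the library's code: validate_exact, inspect_exact, and InspectResult are hypothetical names, and the registered values are held in a simple list instead of a registry.

```python
from typing import NamedTuple


class InspectResult(NamedTuple):
    validated: list
    non_validated: list


def validate_exact(values, known_values):
    """Return one boolean per value: True only if an exact match exists."""
    known = set(known_values)
    return [value in known for value in values]


def inspect_exact(values, known_values):
    """Split values into exactly matched and unmatched ones."""
    flags = validate_exact(values, known_values)
    validated = [v for v, ok in zip(values, flags) if ok]
    non_validated = [v for v, ok in zip(values, flags) if not ok]
    return InspectResult(validated, non_validated)


# Same symbols as in the example above; the "registry" is just a list here.
registered_symbols = ["A1CF", "A1BG", "BRCA2"]
gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
flags = validate_exact(gene_symbols, registered_symbols)
result = inspect_exact(gene_symbols, registered_symbols)
```

Because matching is exact, a synonym such as "FANCD1" or a differently cased symbol is reported as not validated, even when standardize() could map it.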

list(field=None)

Populate a list with the results. If field is given, the list holds the values of that field instead of the records.

Return type:

List[Registry]

Examples

>>> ln.save(ln.ULabel.from_values(["ULabel1", "ULabel2", "ULabel3"], field="name"))
>>> queryset = ln.ULabel.filter(name__icontains="ULabel")
>>> queryset.list()
[ULabel(id=NAgTZxoo, name=ULabel1, updated_at=2023-07-19 19:25:48, created_by_id=DzTjkKse),
ULabel(id=bnsAgKRC, name=ULabel2, updated_at=2023-07-19 19:25:48, created_by_id=DzTjkKse),
ULabel(id=R8xhAJNE, name=ULabel3, updated_at=2023-07-19 19:25:48, created_by_id=DzTjkKse)]
>>> queryset.list("name")
['ULabel1', 'ULabel2', 'ULabel3']

lookup(field=None, **kwargs)

Return an auto-complete object for a field.

  • field (Optional[TypeVar(StrField, str, DeferredAttribute)], default: None) – The field to look up the values for. Defaults to first string field.

  • return_field – The field to return. If None, returns the whole record.

Return type:

NamedTuple

Returns:

A NamedTuple of lookup information of the field values with a dictionary converter.

See also

search()

Examples

>>> import lnschema_bionty as lb
>>> lb.settings.organism = "human"
>>> lb.Gene.from_bionty(symbol="ADGB-DT").save()
>>> lookup = lb.Gene.lookup()
>>> lookup.adgb_dt
>>> lookup_dict = lookup.dict()
>>> lookup_dict['ADGB-DT']
>>> lookup_by_ensembl_id = lb.Gene.lookup(field="ensembl_gene_id")
>>> lookup_by_ensembl_id.ensg00000002745
>>> lookup_return_symbols = lb.Gene.lookup(field="ensembl_gene_id", return_field="symbol")
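
An auto-complete object of this kind can be pictured as a named tuple whose attribute names are sanitized field values, plus a dict() converter keyed by the original values. The sketch below only illustrates that idea: build_lookup, to_attribute_name, the sanitizing rule, and the placeholder Ensembl IDs are assumptions, not the library's implementation.

```python
import re
from collections import namedtuple
from typing import Optional


def to_attribute_name(value: str) -> str:
    """Turn a field value into a lower-case name usable as a Python attribute."""
    name = re.sub(r"\W+", "_", value.lower()).strip("_")
    # Identifiers cannot start with a digit, so prefix those names.
    return f"x_{name}" if name[:1].isdigit() else name


def build_lookup(records, field: str, return_field: Optional[str] = None):
    """Return a named tuple with one attribute per record and a dict() converter."""
    keys = [to_attribute_name(record[field]) for record in records]
    originals = [record[field] for record in records]
    values = [
        record if return_field is None else record[return_field]
        for record in records
    ]

    class Lookup(namedtuple("Lookup", keys)):
        __slots__ = ()

        def dict(self):
            # Key the same entries by the original, unsanitized field values.
            return dict(zip(originals, self))

    return Lookup(*values)


# Hypothetical records; the IDs are placeholders, not real Ensembl identifiers.
genes = [
    {"symbol": "ADGB-DT", "ensembl_gene_id": "ENSG_EXAMPLE_1"},
    {"symbol": "BRCA2", "ensembl_gene_id": "ENSG_EXAMPLE_2"},
]
lookup = build_lookup(genes, field="symbol")
symbols_by_id = build_lookup(genes, field="ensembl_gene_id", return_field="symbol")
```

Here lookup.adgb_dt and lookup.dict()["ADGB-DT"] refer to the same record, while symbols_by_id.ensg_example_1 returns only the symbol. The sketch does not handle values that sanitize to the same name or to a Python keyword; namedtuple raises ValueError in those cases.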

one()

Return exactly one result. Raises an error if there are no results or more than one.

Return type:

Registry

Examples

>>> ln.ULabel(name="benchmark").save()
>>> ln.ULabel.filter(name="benchmark").one()
ULabel(id=gznl0GZk, name=benchmark, updated_at=2023-07-19 19:39:01, created_by_id=DzTjkKse)

one_or_none()

At most one result. Returns it if there is one, otherwise returns None.

Return type:

Optional[Registry]

Examples

>>> ln.ULabel(name="benchmark").save()
>>> ln.ULabel.filter(name="benchmark").one_or_none()
ULabel(id=gznl0GZk, name=benchmark, updated_at=2023-07-19 19:39:01, created_by_id=DzTjkKse)
>>> ln.ULabel.filter(name="non existing label").one_or_none() is None
True
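
The three single-record accessors differ only in how they treat empty and multi-element results. The sketch below contrasts them over a plain list. The exception names are hypothetical, and the assumption that one_or_none() raises on more than one result is inferred from "at most one result", not confirmed by the text above.

```python
class NoResultFound(Exception):
    """Hypothetical error for an empty result where one record is required."""


class MultipleResultsFound(Exception):
    """Hypothetical error for more than one result where at most one is allowed."""


def first(results):
    # Never raises: the first element, or None for an empty result.
    return results[0] if results else None


def one_or_none(results):
    # Zero results are fine, more than one is treated as an error (assumption).
    if len(results) > 1:
        raise MultipleResultsFound(f"expected at most 1 result, got {len(results)}")
    return results[0] if results else None


def one(results):
    # Exactly one result is required.
    if not results:
        raise NoResultFound("expected exactly 1 result, got 0")
    return one_or_none(results)
```

In short, first() suits "give me any match", one_or_none() suits optional unique matches, and one() suits lookups that must succeed.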

search(string, **kwargs)

Search records for a string.

Makes reasonable choices of which fields to search.

For instance, for File, searches key and description fields.

  • string (str) – The input string to match against the field ontology values.

  • field – The field against which the input string is matched.

  • limit – Maximum number of top results to return.

  • return_queryset – Return search result as a sorted QuerySet.

  • case_sensitive – Whether the match is case sensitive.

  • synonyms_field – Search synonyms if column is available. If None, is ignored.

Returns:

A sorted DataFrame of search results with a score in column score. If return_queryset is True, an ordered QuerySet.

See also

filter()

lookup()

Examples

>>> ln.save(ln.ULabel.from_values(["ULabel1", "ULabel2", "ULabel3"], field="name"))
>>> ln.ULabel.search("ULabel2")
              uid  score
name
ULabel2  o3FY3c5n  100.0
ULabel1  CcFPLmpq   75.0
ULabel3  Qi3c4utq   75.0
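
The ranked output above can be approximated with a generic string-similarity score. The sketch below uses difflib.SequenceMatcher from the standard library as a stand-in; the actual scoring method is not stated here, so the scores it produces differ from the 75.0 values shown. rank_by_similarity and its defaults are assumptions for illustration.

```python
from difflib import SequenceMatcher


def rank_by_similarity(names, string, limit=10, case_sensitive=False):
    """Return (name, score) pairs sorted by descending similarity to `string`."""

    def score(name):
        a, b = (name, string) if case_sensitive else (name.lower(), string.lower())
        # ratio() is in [0, 1]; scale to a 0-100 score like the table above.
        return round(100 * SequenceMatcher(None, a, b).ratio(), 1)

    ranked = sorted(
        ((name, score(name)) for name in names),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:limit]


results = rank_by_similarity(["ULabel1", "ULabel2", "ULabel3"], "ULabel2")
```

The exact match ranks first with a score of 100.0, and limit truncates the list after sorting, mirroring the limit parameter described above.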

standardize(values, field=None, **kwargs)

Maps input synonyms to standardized names.

  • values (Iterable) – Identifiers that will be standardized.

  • field (Union[str, TypeVar(StrField, str, DeferredAttribute), None], default: None) – The field representing the standardized names.

  • return_field – The field to return. Defaults to field.

  • return_mapper – If True, returns {input_value: standardized_name}.

  • case_sensitive – Whether the mapping is case sensitive.

  • mute – Mute logging.

  • bionty_aware – Whether to standardize from Bionty reference. Defaults to True for Bionty registries.

  • keep

    When a synonym maps to multiple names, determines which duplicates to mark as pd.DataFrame.duplicated:
    • “first”: returns the first mapped standardized name

    • “last”: returns the last mapped standardized name

    • False: returns all mapped standardized names.

    When keep is False, the returned list of standardized names will contain nested lists in case of duplicates.

    When a field is converted into return_field, keep marks which matches to keep when multiple return_field values map to the same field value.

  • synonyms_field – A field containing the concatenated synonyms.

Returns:

If return_mapper is False – a list of standardized names. Otherwise, a dictionary of mapped values with mappable synonyms as keys and standardized names as values.

See also

add_synonym()

Add synonyms

remove_synonym()

Remove synonyms

Examples

>>> import lnschema_bionty as lb
>>> lb.settings.organism = "human"
>>> ln.save(lb.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol"))
>>> gene_synonyms = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
>>> standardized_names = lb.Gene.standardize(gene_synonyms)
>>> standardized_names
['A1CF', 'A1BG', 'BRCA2', 'FANCD20']
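
The interplay of keep and return_mapper is easier to see on a small synonym table. The following is a minimal sketch, not the library's implementation: standardize_sketch is a hypothetical function, the synonym table is reduced to the one mapping shown in the example above, and the "|" separator for concatenated synonyms is an assumption.

```python
def standardize_sketch(values, synonyms_by_name, keep="first", return_mapper=False):
    """Map synonyms to standardized names; unknown values pass through unchanged."""
    # Invert {standardized name: "syn1|syn2"} into {synonym: [names, ...]}.
    names_by_synonym = {}
    for name, synonyms in synonyms_by_name.items():
        for synonym in filter(None, synonyms.split("|")):
            names_by_synonym.setdefault(synonym, []).append(name)

    def resolve(names):
        if keep == "first":
            return names[0]
        if keep == "last":
            return names[-1]
        if keep is False:
            # Ambiguous synonyms yield a nested list of all candidates.
            return names if len(names) > 1 else names[0]
        raise ValueError(f"unsupported keep value: {keep!r}")

    # Only values that are synonyms (and not already standardized) are mapped.
    mapper = {
        value: resolve(names_by_synonym[value])
        for value in values
        if value in names_by_synonym and value not in synonyms_by_name
    }
    if return_mapper:
        return mapper
    return [mapper.get(value, value) for value in values]


# Reduced, illustrative synonym table based on the example above.
gene_table = {"A1CF": "", "A1BG": "", "BRCA2": "FANCD1"}
gene_synonyms = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
standardized = standardize_sketch(gene_synonyms, gene_table)
mapper = standardize_sketch(gene_synonyms, gene_table, return_mapper=True)
```

With return_mapper=True only the mappable synonym appears as a key, which matches the description of the returned dictionary above.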

validate(values, field=None, **kwargs)

Validate values against existing values of a string field.

Note that this is strict validation: only exact matches are validated.

  • values (TypeVar(ListLike, list, pd.Series, np.array)) – Values that will be validated against the field.

  • field (Union[str, TypeVar(StrField, str, DeferredAttribute), None], default: None) – The field to validate the values against. Examples are ‘ontology_id’ to map against the source ID or ‘name’ to map against the ontology’s name field.

  • mute – Mute logging.

Returns:

A vector of booleans indicating if an element is validated.

See also

inspect()

Examples

>>> import lnschema_bionty as lb
>>> lb.settings.organism = "human"
>>> ln.save(lb.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol"))
>>> gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
>>> lb.Gene.validate(gene_symbols, field=lb.Gene.symbol)
✅ 2 terms (50.00%) are validated
🔶 2 terms (50.00%) are not validated
array([ True,  True, False, False])

view_tree(level=-1, limit_to_directories=False, length_limit=1000, max_files_per_dir_per_type=7)

View the tree structure of the keys.

  • level (int, default: -1) – Depth of the tree to be displayed. The default of -1 displays all levels.

  • limit_to_directories (bool, default: False) – If True, only directories will be displayed.

  • length_limit (int, default: 1000) – Maximum number of nodes to be displayed.

  • max_files_per_dir_per_type (int, default: 7) – Maximum number of files per directory per type.

Return type:

None
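
As a rough picture of how level and limit_to_directories shape the output, the sketch below renders slash-separated keys as an indented tree. It is an illustration only: render_tree and the sample keys are hypothetical, it returns the lines instead of printing them, and it omits length_limit and max_files_per_dir_per_type.

```python
def render_tree(keys, level=-1, limit_to_directories=False):
    """Render slash-separated keys as indented lines, one per node."""
    # Nested dict: directories map to dicts, files map to None.
    tree = {}
    for key in keys:
        *directories, filename = key.split("/")
        node = tree
        for directory in directories:
            node = node.setdefault(directory, {})
        node[filename] = None

    lines = []

    def walk(node, depth):
        # level == -1 means unlimited depth.
        if level != -1 and depth >= level:
            return
        for name in sorted(node):
            child = node[name]
            is_directory = child is not None
            if limit_to_directories and not is_directory:
                continue
            lines.append("    " * depth + name + ("/" if is_directory else ""))
            if is_directory:
                walk(child, depth + 1)

    walk(tree, 0)
    return lines


# Hypothetical keys for illustration.
keys = ["raw/a.csv", "raw/b.csv", "processed/c.parquet", "readme.md"]
full_tree = render_tree(keys)
top_level = render_tree(keys, level=1)
```

Calling print("\n".join(full_tree)) shows the whole hierarchy, while top_level contains only the first level of directories and files.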