lamindb.dev.QuerySet
- class lamindb.dev.QuerySet(model=None, query=None, using=None, hints=None)
Bases: QuerySet, CanValidate, IsTree
Lazily loaded queried records returned by queries.
See also
django QuerySet
Examples
>>> ln.ULabel(name="my label").save()
>>> queryset = ln.ULabel.filter(name="my label")
>>> queryset
<QuerySet [ULabel(id=MIeZISeF, name=my label, updated_at=2023-07-19 19:53:34, created_by_id=DzTjkKse)]>
Attributes
- db property
Return the database used if this query is executed now.
- ordered property
Return True if the QuerySet is ordered – i.e. has an order_by() clause or a default ordering on the model (or is empty).
- query property
Methods
- df(include=None)
Convert to pd.DataFrame.
By default, shows all fields that aren’t many-to-many fields, except created_at.
If you’d like to include many-to-many fields, use the include parameter.
- Parameters:
include (Optional[List[str]], default: None) – Additional (many-to-many) fields to include. Takes expressions like "labels__name" or "cell_types__name".
- Return type:
DataFrame
Examples
>>> ln.save(ln.ULabel.from_values(["ULabel1", "ULabel2", "ULabel3"], field="name"))
>>> ln.ULabel.filter().df()
>>> label = ln.ULabel.filter(name="ULabel1").one()
>>> parent = ln.ULabel.filter(name="ULabel2").one()
>>> label.parents.add(parent)
>>> ln.ULabel.filter().df(include=["parents__name", "parents__created_by_id"])
- first()
If non-empty, the first result in the query set, otherwise None.
- Return type:
Optional[Registry]
Examples
>>> ln.save(ln.ULabel.from_values(["ULabel1", "ULabel2", "ULabel3"], field="name"))
>>> queryset = ln.ULabel.filter(name__icontains="ULabel")
>>> queryset.first()
ULabel(id=NAgTZxoo, name=ULabel1, updated_at=2023-07-19 19:25:48, created_by_id=DzTjkKse)
- inspect(values, field=None, **kwargs)
Inspect if values are mappable to a field.
Being mappable means that an exact match exists.
- Parameters:
values (TypeVar(ListLike, list, pd.Series, np.array)) – Values that will be checked against the field.
field (Union[str, TypeVar(StrField, str, DeferredAttribute), None], default: None) – The field of values. Examples are ‘ontology_id’ to map against the source ID or ‘name’ to map against the ontologies field names.
mute – Mute logging.
Examples
>>> import lnschema_bionty as lb
>>> lb.settings.organism = "human"
>>> ln.save(lb.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol"))
>>> gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
>>> result = lb.Gene.inspect(gene_symbols, field=lb.Gene.symbol)
✅ 2 terms (50.00%) are validated
🔶 2 terms (50.00%) are not validated
🟠 detected synonyms to increase validated terms, standardize them via .standardize()
>>> result.validated
['A1CF', 'A1BG']
>>> result.non_validated
['FANCD1', 'FANCD20']
- list(field=None)
Populate a list with the results.
- Return type:
List[Registry]
Examples
>>> ln.save(ln.ULabel.from_values(["ULabel1", "ULabel2", "ULabel3"], field="name"))
>>> queryset = ln.ULabel.filter(name__icontains="ULabel")
>>> queryset.list()
[ULabel(id=NAgTZxoo, name=ULabel1, updated_at=2023-07-19 19:25:48, created_by_id=DzTjkKse),
ULabel(id=bnsAgKRC, name=ULabel2, updated_at=2023-07-19 19:25:48, created_by_id=DzTjkKse),
ULabel(id=R8xhAJNE, name=ULabel3, updated_at=2023-07-19 19:25:48, created_by_id=DzTjkKse)]
>>> queryset.list("name")
['ULabel1', 'ULabel2', 'ULabel3']
- lookup(field=None, **kwargs)
Return an auto-complete object for a field.
- Parameters:
field (Optional[TypeVar(StrField, str, DeferredAttribute)], default: None) – The field to look up the values for. Defaults to the first string field.
return_field – The field to return. If None, returns the whole record.
- Return type:
NamedTuple
- Returns:
A NamedTuple of lookup information of the field values with a dictionary converter.
Examples
>>> import lnschema_bionty as lb
>>> lb.settings.organism = "human"
>>> lb.Gene.from_bionty(symbol="ADGB-DT").save()
>>> lookup = lb.Gene.lookup()
>>> lookup.adgb_dt
>>> lookup_dict = lookup.dict()
>>> lookup_dict['ADGB-DT']
>>> lookup_by_ensembl_id = lb.Gene.lookup(field="ensembl_gene_id")
>>> lookup_by_ensembl_id.ensg00000002745
>>> lookup_return_symbols = lb.Gene.lookup(field="ensembl_gene_id", return_field="symbol")
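To illustrate the idea of an auto-complete object with a dictionary converter, here is a minimal sketch that does not use lamindb. It turns plain dictionaries into a namedtuple so that a value such as "ADGB-DT" becomes the attribute adgb_dt, and it adds a dict() method keyed by the original values. The helper names make_lookup and normalize, the normalization rule, and the uid values are assumptions made for illustration; they are not lamindb's implementation.

```python
import re
from collections import namedtuple


def normalize(value):
    """Turn a field value into an attribute-style key ("ADGB-DT" -> "adgb_dt")."""
    return re.sub(r"\W+", "_", value.strip()).lower()


def make_lookup(records, field):
    """Return a namedtuple-based lookup over `records`, keyed by `field`.

    Assumption: every normalized value is a valid, unique Python identifier
    that does not start with an underscore (a namedtuple requirement).
    """
    by_key = {normalize(record[field]): record for record in records}
    by_value = {record[field]: record for record in records}
    base = namedtuple("Lookup", list(by_key))

    class Lookup(base):
        __slots__ = ()

        def dict(self):
            """Return the records keyed by their original field values."""
            return by_value.copy()

    return Lookup(**by_key)


# Hypothetical records standing in for registry entries.
genes = [
    {"symbol": "ADGB-DT", "uid": "hypothetical-1"},
    {"symbol": "A1CF", "uid": "hypothetical-2"},
]
lookup = make_lookup(genes, field="symbol")
print(lookup.adgb_dt)            # attribute access, suitable for tab completion
print(lookup.dict()["ADGB-DT"])  # dictionary access by the original value
```

Attribute access is convenient interactively, while the dictionary form is useful when values are not valid identifiers.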
- one()
Return exactly one result. Raises an error if the query matches no records or more than one.
- Return type:
Registry
Examples
>>> ln.ULabel(name="benchmark").save()
>>> ln.ULabel.filter(name="benchmark").one()
ULabel(id=gznl0GZk, name=benchmark, updated_at=2023-07-19 19:39:01, created_by_id=DzTjkKse)
- one_or_none()
At most one result. Returns it if there is one, otherwise returns None.
- Return type:
Optional[Registry]
Examples
>>> ln.ULabel(name="benchmark").save()
>>> ln.ULabel.filter(name="benchmark").one_or_none()
ULabel(id=gznl0GZk, name=benchmark, updated_at=2023-07-19 19:39:01, created_by_id=DzTjkKse)
>>> ln.ULabel.filter(name="non existing label").one_or_none()
None
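The difference between first(), one(), and one_or_none() can be sketched on a plain Python list, without lamindb. The functions below are hypothetical stand-ins. ValueError is used in place of whatever exception lamindb actually raises, and raising when one_or_none() sees several results is an assumption based on the phrase "at most one result".

```python
from typing import List, Optional, TypeVar

T = TypeVar("T")


def first(records: List[T]) -> Optional[T]:
    """Return the first record, or None if there are none."""
    return records[0] if records else None


def one(records: List[T]) -> T:
    """Return the only record; raise if there are zero or several."""
    if len(records) != 1:
        raise ValueError(f"expected exactly one record, found {len(records)}")
    return records[0]


def one_or_none(records: List[T]) -> Optional[T]:
    """Return the only record, or None if there are none; raise if several."""
    if len(records) > 1:
        raise ValueError(f"expected at most one record, found {len(records)}")
    return records[0] if records else None


print(first(["ULabel1", "ULabel2"]))  # first tolerates several results
print(one(["benchmark"]))             # one requires exactly one result
print(one_or_none([]))                # one_or_none tolerates an empty result
```

Use one() when a missing record is a bug, and one_or_none() when absence is an expected case to handle.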
- search(string, **kwargs)
Search.
Makes reasonable choices of which fields to search.
For instance, for File, searches key and description fields.
- Parameters:
string (str) – The input string to match against the field ontology values.
field – The field to match the input string against.
limit – Maximum number of top results to return.
return_queryset – Return search result as a sorted QuerySet.
case_sensitive – Whether the match is case sensitive.
synonyms_field – Search synonyms if column is available. If None, is ignored.
- Returns:
A sorted DataFrame of search results with a score in column score. If return_queryset is True, an ordered QuerySet.
Examples
>>> ln.save(ln.ULabel.from_values(["ULabel1", "ULabel2", "ULabel3"], field="name"))
>>> ln.ULabel.search("ULabel2")
              uid  score
name
ULabel2  o3FY3c5n  100.0
ULabel1  CcFPLmpq   75.0
ULabel3  Qi3c4utq   75.0
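As a rough illustration of ranked search with a score column, the sketch below scores candidate names against a query string and sorts them from best to worst. It uses difflib.SequenceMatcher from the standard library as a stand-in scorer. This page does not say which scorer lamindb uses, so the scores here will not match the example output above except for the exact match. The function name search_names is hypothetical.

```python
from difflib import SequenceMatcher


def search_names(names, string, limit=None, case_sensitive=False):
    """Return (name, score) pairs sorted by descending similarity to `string`."""

    def score(name):
        a, b = (name, string) if case_sensitive else (name.lower(), string.lower())
        # ratio() is in [0, 1]; scale to a 0-100 score.
        return round(100 * SequenceMatcher(None, a, b).ratio(), 1)

    ranked = sorted(
        ((name, score(name)) for name in names),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:limit]  # limit=None keeps every result


results = search_names(["ULabel1", "ULabel2", "ULabel3"], "ULabel2")
for name, value in results:
    print(name, value)
```

Passing limit=1 returns only the best match, which mirrors the role of the limit parameter described above.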
- standardize(values, field=None, **kwargs)
Maps input synonyms to standardized names.
- Parameters:
values (Iterable) – Identifiers that will be standardized.
field (Union[str, TypeVar(StrField, str, DeferredAttribute), None], default: None) – The field representing the standardized names.
return_field – The field to return. Defaults to field.
return_mapper – If True, returns {input_value: standardized_name}.
case_sensitive – Whether the mapping is case sensitive.
mute – Mute logging.
bionty_aware – Whether to standardize from Bionty reference. Defaults to True for Bionty registries.
keep – When a synonym maps to multiple names, determines which duplicates to mark as pd.DataFrame.duplicated:
"first": returns the first mapped standardized name
"last": returns the last mapped standardized name
False: returns all mapped standardized names.
When keep is False, the returned list of standardized names will contain nested lists in case of duplicates.
When a field is converted into return_field, keep marks which matches to keep when multiple return_field values map to the same field value.
synonyms_field – A field containing the concatenated synonyms.
- Returns:
If return_mapper is False, a list of standardized names. Otherwise, a dictionary of mapped values with mappable synonyms as keys and standardized names as values.
See also
add_synonym(): Add synonyms
remove_synonym(): Remove synonyms
Examples
>>> import lnschema_bionty as lb
>>> lb.settings.organism = "human"
>>> ln.save(lb.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol"))
>>> gene_synonyms = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
>>> standardized_names = lb.Gene.standardize(gene_synonyms)
>>> standardized_names
['A1CF', 'A1BG', 'BRCA2', 'FANCD20']
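The interplay of keep and return_mapper can be sketched without lamindb using a plain dictionary of synonyms. The function standardize_names below is a hypothetical stand-in: it assumes synonyms are given as {standardized_name: [synonym, ...]}, that unmapped values pass through unchanged (as FANCD20 does in the example above), and that "first" and "last" refer to the order in which names appear in that dictionary. NameA and NameB are made-up names.

```python
def standardize_names(values, synonyms, keep="first", return_mapper=False):
    """Map synonyms to standardized names.

    `synonyms` maps each standardized name to a list of its synonyms.
    """
    if keep not in ("first", "last", False):
        raise ValueError("keep must be 'first', 'last', or False")

    # Invert to synonym -> [standardized names], preserving insertion order.
    names_by_synonym = {}
    for name, syns in synonyms.items():
        for syn in syns:
            names_by_synonym.setdefault(syn, []).append(name)

    def pick(names):
        if keep == "first":
            return names[0]
        if keep == "last":
            return names[-1]
        # keep is False: return every candidate when the synonym is ambiguous.
        return list(names) if len(names) > 1 else names[0]

    mapper = {v: pick(names_by_synonym[v]) for v in values if v in names_by_synonym}
    if return_mapper:
        return mapper
    # Values without a known synonym are returned unchanged.
    return [mapper.get(v, v) for v in values]


gene_synonyms = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
print(standardize_names(gene_synonyms, {"BRCA2": ["FANCD1"]}))

ambiguous = {"NameA": ["syn"], "NameB": ["syn"]}  # one synonym, two names
print(standardize_names(["syn"], ambiguous, keep=False))
```

With keep=False the ambiguous synonym yields a nested list, which matches the note above about nested lists in case of duplicates.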
- validate(values, field=None, **kwargs)
Validate values against existing values of a string field.
Note this is strict validation, only asserts exact matches.
- Parameters:
values (TypeVar(ListLike, list, pd.Series, np.array)) – Values that will be validated against the field.
field (Union[str, TypeVar(StrField, str, DeferredAttribute), None], default: None) – The field of values. Examples are ‘ontology_id’ to map against the source ID or ‘name’ to map against the ontologies field names.
mute – Mute logging.
- Returns:
A vector of booleans indicating if an element is validated.
Examples
>>> import lnschema_bionty as lb
>>> lb.settings.organism = "human"
>>> ln.save(lb.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol"))
>>> gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
>>> lb.Gene.validate(gene_symbols, field=lb.Gene.symbol)
✅ 2 terms (50.00%) are validated
🔶 2 terms (50.00%) are not validated
array([ True, True, False, False])
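Strict validation amounts to an exact-match membership test. The sketch below reproduces the boolean vector from the example above using plain Python lists instead of a registry, and also shows how the validated and non-validated lists reported by inspect() could be derived from it. The names validate_exact and split_validated are hypothetical, and a list of booleans stands in for the NumPy array that lamindb returns.

```python
def validate_exact(values, reference):
    """Return one boolean per value: True if it matches a reference value exactly."""
    known = set(reference)
    return [value in known for value in values]


def split_validated(values, reference):
    """Split values into (validated, non_validated) lists, preserving order."""
    flags = validate_exact(values, reference)
    validated = [v for v, ok in zip(values, flags) if ok]
    non_validated = [v for v, ok in zip(values, flags) if not ok]
    return validated, non_validated


registered = ["A1CF", "A1BG", "BRCA2"]
gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]

print(validate_exact(gene_symbols, registered))
print(split_validated(gene_symbols, registered))
```

Note that FANCD1 is not validated even though it is a synonym of BRCA2, because only exact matches count; standardize() is the method for mapping synonyms.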
- view_tree(level=-1, limit_to_directories=False, length_limit=1000, max_files_per_dir_per_type=7)
View the tree structure of the keys.
- Parameters:
level (int, default: -1) – Depth of the tree to be displayed. Default is -1 which means all levels.
limit_to_directories (bool, default: False) – If True, only directories will be displayed.
length_limit (int, default: 1000) – Maximum number of nodes to be displayed.
max_files_per_dir_per_type (int, default: 7) – Maximum number of files per directory per type.
- Return type:
None
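To make the level, limit_to_directories, and length_limit parameters concrete, here is a minimal sketch that renders a tree from slash-separated keys. It is not lamindb's implementation: build_tree and render_tree are hypothetical helpers, the keys are made up, max_files_per_dir_per_type is left out, and the sketch returns the lines instead of printing them. It also assumes that top-level entries sit at depth 1 and that any entry with children is a directory.

```python
def build_tree(keys):
    """Build a nested dict from keys such as "raw/a.fastq"."""
    tree = {}
    for key in keys:
        node = tree
        for part in key.split("/"):
            node = node.setdefault(part, {})
    return tree


def render_tree(tree, level=-1, limit_to_directories=False, length_limit=1000):
    """Return the tree as a list of text lines."""
    lines = []

    def walk(node, prefix, depth):
        if level != -1 and depth > level:
            return  # deeper than the requested level
        entries = sorted(node.items())
        if limit_to_directories:
            # Entries without children are treated as files and skipped.
            entries = [(name, children) for name, children in entries if children]
        for i, (name, children) in enumerate(entries):
            if len(lines) >= length_limit:
                return  # node budget exhausted
            last = i == len(entries) - 1
            lines.append(prefix + ("└── " if last else "├── ") + name)
            walk(children, prefix + ("    " if last else "│   "), depth + 1)

    walk(tree, "", 1)
    return lines


keys = ["raw/a.fastq", "raw/b.fastq", "processed/x.parquet"]  # hypothetical keys
tree = build_tree(keys)
print("\n".join(render_tree(tree)))
```

Calling render_tree(tree, level=1) shows only the two top-level directories, which is the effect the level parameter describes.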