Ensembl species -> bionty.Species().df
Downloaded from: https://www.ensembl.org/info/about/species.html
author | Sunny Sun (sunnyosun) |
id | SH5O08MYHNXe |
version | 1 |
time_init | 2022-10-24 10:21 |
time_run | 2022-10-24 14:55 |
consecutive_cells | True |
pypackage | lamindb==0.6.0 lnschema_bionty==0.4.3 pandas==1.5.0 |
Curate the species table
| | Common name | Scientific name | Taxon ID | Ensembl Assembly | Accession | Genebuild Method | Variation database | Regulation database |
| 0 | Abingdon island giant tortoise | Chelonoidis abingdonii | 106734 | ASM359739v1 | GCA_003597395.1 | Full genebuild | - | - |
| 1 | African ostrich | Struthio camelus australis | 441894 | ASM69896v1 | GCA_000698965.1 | Full genebuild | - | - |
| 2 | Agassiz's desert tortoise | Gopherus agassizii | 38772 | ASM289641v1 | GCA_002896415.1 | Full genebuild | - | - |
| 3 | Algerian mouse | Mus spretus | 10096 | SPRET_EiJ_v1 | GCA_001624865.1 | External annotation import | - | Y |
| 4 | Alpaca | Vicugna pacos | 30538 | vicPac1 | - | Projection build | - | - |
| | Common name | Taxon ID | Scientific name | Ensembl Assembly | Accession |
| 0 | Abingdon island giant tortoise | 106734 | Chelonoidis abingdonii | ASM359739v1 | GCA_003597395.1 |
| 1 | African ostrich | 441894 | Struthio camelus australis | ASM69896v1 | GCA_000698965.1 |
| 2 | Agassiz's desert tortoise | 38772 | Gopherus agassizii | ASM289641v1 | GCA_002896415.1 |
| 3 | Algerian mouse | 10096 | Mus spretus | SPRET_EiJ_v1 | GCA_001624865.1 |
| 4 | Alpaca | 30538 | Vicugna pacos | vicPac1 | - |
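The code cell that produced the curated table is not part of this export. A minimal sketch of the step, assuming it only selects and reorders columns with pandas, could look like the following. The variable names `raw`, `keep`, and `curated` are hypothetical, and only two rows from the raw table are reproduced.

```python
import pandas as pd

# Two rows copied from the raw Ensembl table, enough to show the column handling.
raw = pd.DataFrame(
    {
        "Common name": ["Algerian mouse", "Alpaca"],
        "Scientific name": ["Mus spretus", "Vicugna pacos"],
        "Taxon ID": [10096, 30538],
        "Ensembl Assembly": ["SPRET_EiJ_v1", "vicPac1"],
        "Accession": ["GCA_001624865.1", "-"],
        "Genebuild Method": ["External annotation import", "Projection build"],
        "Variation database": ["-", "-"],
        "Regulation database": ["Y", "-"],
    }
)

# Columns kept in the curated table, in the order shown in the output.
# The genebuild, variation, and regulation columns are dropped.
keep = ["Common name", "Taxon ID", "Scientific name", "Ensembl Assembly", "Accession"]
curated = raw[keep].copy()
```

Selecting with a list of column names both drops the unused columns and fixes the column order in one step.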
Generate bionty species ids
| id | Common name | Taxon ID | Scientific name | Ensembl Assembly | Accession |
| MfC | Abingdon island giant tortoise | 106734 | Chelonoidis abingdonii | ASM359739v1 | GCA_003597395.1 |
| oQH | African ostrich | 441894 | Struthio camelus australis | ASM69896v1 | GCA_000698965.1 |
| G2P | Agassiz's desert tortoise | 38772 | Gopherus agassizii | ASM289641v1 | GCA_002896415.1 |
| OC9 | Algerian mouse | 10096 | Mus spretus | SPRET_EiJ_v1 | GCA_001624865.1 |
| Tns | Alpaca | 30538 | Vicugna pacos | vicPac1 | - |
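The export does not show how the three-character ids were generated. The sketch below illustrates one possible approach, not the method actually used: hash the scientific name and encode the digest in base62. The helper `short_id`, the choice of SHA-256, and the alphabet are all assumptions, so the ids it produces will not match the ones in the table.

```python
import hashlib
import string

import pandas as pd

# Assumed alphabet: digits plus ASCII letters (base62).
BASE62 = string.digits + string.ascii_letters


def short_id(text: str, n_char: int = 3) -> str:
    """Hypothetical helper: derive a short base62 id from a SHA-256 digest."""
    number = int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")
    chars = []
    for _ in range(n_char):
        # Peel off one base62 digit per iteration.
        number, remainder = divmod(number, 62)
        chars.append(BASE62[remainder])
    return "".join(chars)


species = pd.DataFrame(
    {
        "Common name": ["Algerian mouse", "Alpaca"],
        "Taxon ID": [10096, 30538],
        "Scientific name": ["Mus spretus", "Vicugna pacos"],
    }
)

species["id"] = species["Scientific name"].map(short_id)

# Three base62 characters give only 62**3 = 238328 distinct ids,
# so a real table should be checked for collisions before indexing.
has_collisions = bool(species["id"].duplicated().any())

species = species.set_index("id")
```

Deriving the id from a hash keeps it stable across runs, at the cost of possible collisions that a random or sequential scheme would avoid.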
Push to bionty-assets.lndb
✅ Cell numbers increase consecutively: Awesome!
2022-10-24 16:54:51,715:INFO - Found credentials in shared credentials file: ~/.aws/credentials
Upload /Users/sunnysun/Documents/repos.nosync/bionty-assets/docs/ingest/ensembl_species.parquet: 1.00
ℹ️ Added notebook 'Ensembl species -> `bionty.Species().df`' (SH5O08MYHNXe, 1) by user sunnyosun.
✅ Ingested the following dobjects:
+---+-------------------------------------------------+--------------------------------------------------------------+----------------------+
| | dobject | jupynb | user |
+---+-------------------------------------------------+--------------------------------------------------------------+----------------------+
| 0 | ensembl_species.parquet (VpdUdouFahpvStwddqTwk) | 'Ensembl species -> `bionty.Species().df`' (SH5O08MYHNXe, 1) | sunnyosun (kmvZDIX9) |
+---+-------------------------------------------------+--------------------------------------------------------------+----------------------+
ℹ️ Set notebook version to 1 & wrote pypackages.
Now on S3: https://bionty-assets.s3.amazonaws.com/VpdUdouFahpvStwddqTwk.parquet
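The log above suggests that the public URL is the bucket name combined with the dobject id and the file suffix. A small sketch of that mapping, with a hypothetical helper name `s3_url` and a URL pattern inferred only from this one log line:

```python
def s3_url(bucket: str, dobject_id: str, suffix: str) -> str:
    """Hypothetical helper: build a virtual-hosted-style S3 URL for a stored object."""
    return f"https://{bucket}.s3.amazonaws.com/{dobject_id}{suffix}"


# Values taken from the ingestion log above.
url = s3_url("bionty-assets", "VpdUdouFahpvStwddqTwk", ".parquet")
```

Assuming the object is publicly readable and a parquet engine such as pyarrow is installed, `pandas.read_parquet(url)` should load the table from this URL.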