Quickstart#

Most features in pysradb are accessible both from the command line and as a Python package. Each example below shows the command-line invocation first, followed by the equivalent Python code.

Note

If you have any questions along the way, see the Python API or CLI pages for more information. You may also wish to refer to the API Documentation.


Notebooks#

A Google Colaboratory version of the most commonly used commands is available in this Colab Notebook. Colab runs Python 3.6, while pysradb requires Python 3.7+, so the notebooks no longer run on Colab; they can still be downloaded and run locally.

The following notebooks document all the possible features of pysradb:

  1. Python API

  2. Downloading datasets from SRA - command line

  3. Download multiple datasets in parallel - Python API

  4. Converting SRA-to-fastq - command line (requires conda)

  5. Downloading subsets of a project - Python API

  6. Download BAMs

  7. Metadata for multiple SRPs

  8. Multithreaded fastq downloads using Aspera Client

  9. Searching SRA/GEO/ENA


Metadata#

pysradb makes it easy to obtain metadata from SRA/EBI.

Command line:

$ pysradb metadata SRP265425

Python:

from pysradb.sraweb import SRAweb

db = SRAweb()
df = db.metadata("SRP265425")
df

Output:

study_accession     experiment_accession    experiment_title        experiment_desc organism_taxid  organism_name   library_name    library_strategy        library_source  library_selection       library_layout  sample_accession        sample_title    instrument      instrument_model        instrument_model_desc   total_spots     total_size      run_accession   run_total_spots run_total_bases
SRP265425   SRX8434255      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 63-2020-04-22   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745319              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     1311358 83306910        SRR11886735     1311358 109594216
SRP265425   SRX8434254      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 62-2020-04-22   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745320              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     2614109 204278682       SRR11886736     2614109 262305651
SRP265425   SRX8434253      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 61-2020-04-22   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745318              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     2286312 183516004       SRR11886737     2286312 263304134
SRP265425   SRX8434252      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 60-2020-04-22   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745317              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     5202567 507524965       SRR11886738     5202567 781291588
SRP265425   SRX8434251      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 38-2020-04-03   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745315              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     3313960 356104406       SRR11886739     3313960 612430817
SRP265425   SRX8434250      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 37-2020-04-03   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745316              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     5155733 565882351       SRR11886740     5155733 954342917
SRP265425   SRX8434249      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 36-2020-04-03   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745313              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     1324589 175619046       SRR11886741     1324589 216531400
SRP265425   SRX8434248      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 35-2020-04-03   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745314              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     1639851 198973268       SRR11886742     1639851 245466005
SRP265425   SRX8434247      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 68-2020-05-07   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745312              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     3921389 210198580       SRR11886743     3921389 332935558
SRP265425   SRX8434246      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 66-2020-04-22   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745311              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     14295475        2150005008      SRR11886744     14295475        2967829315
SRP265425   SRX8434245      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 65-2020-04-22   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745310              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     5124692 294846140       SRR11886745     5124692 431819462
SRP265425   SRX8434244      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 64-2020-04-22   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745309              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     2986306 205666872       SRR11886746     2986306 275400959
SRP265425   SRX8434243      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 34-2020-04-03   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745308              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     1182690 59471336        SRR11886747     1182690 86350631
SRP265425   SRX8434242      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 33-2020-04-03   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745307              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     6031816 749323230       SRR11886748     6031816 928054297
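
Assuming the metadata above has been saved as tab-separated text, it can be summarized with the standard library alone. The sketch below is a minimal illustration rather than part of pysradb: `total_bases` is a hypothetical helper, and the embedded sample keeps only three of the columns and the first two rows shown above.

```python
import csv
import io

# Two rows and three columns copied from the output above, tab-separated.
SAMPLE_TSV = (
    "study_accession\trun_accession\trun_total_bases\n"
    "SRP265425\tSRR11886735\t109594216\n"
    "SRP265425\tSRR11886736\t262305651\n"
)


def total_bases(tsv_text):
    """Sum the run_total_bases column of tab-separated metadata text."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return sum(int(row["run_total_bases"]) for row in reader)


print(total_bases(SAMPLE_TSV))
```

The same pattern works for any numeric column in the table, such as total_spots or total_size.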

Additionally, to obtain the locations of the .fastq/.sra files along with other metadata, use the --detailed flag (detailed=True in Python).

Command line:

$ pysradb metadata SRP265425 --detailed

Python:

from pysradb.sraweb import SRAweb

db = SRAweb()
df = db.metadata("SRP265425", detailed=True)
df

Output:

run_accession       study_accession experiment_accession    experiment_title        experiment_desc organism_taxid  organism_name   library_name    library_strategy        library_source  library_selection       library_layout  sample_accession        sample_title    instrument      instrument_model        instrument_model_desc   total_spots     total_size      run_total_spots run_total_bases run_alias       sra_url_alt1    sra_url_alt2    sra_url experiment_alias        isolate collected_by    collection_date geo_loc_name    host    host_disease    isolation_source        lat_lon BioSampleModel  sra_url_alt3    ena_fastq_http  ena_fastq_http_1        ena_fastq_http_2        ena_fastq_ftp   ena_fastq_ftp_1 ena_fastq_ftp_2
SRR11886735 SRP265425       SRX8434255      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 63-2020-04-22   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745319              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     1311358 83306910        1311358 109594216       IonXpress_063_R_2020_04_22_15_56_22_user_GCEID-S5-58-SARS_CoV2_SA4.bam  gs://sra-pub-src-9/SRR11886735/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta.1   https://sra-pub-sars-cov2.s3.amazonaws.com/sra-src/SRR11886735/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta.1   https://sra-download.ncbi.nlm.nih.gov/traces/sra0/SRR/011608/SRR11886735                GC-20   NA      02-Apr-2020     Australia: Victoria     Homo sapiens    COVID-19        swab    NA      Pathogen.cl             http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR118/035/SRR11886735/SRR11886735.fastq.gz                 era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR118/035/SRR11886735/SRR11886735.fastq.gz
SRR11886736 SRP265425       SRX8434254      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 62-2020-04-22   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745320              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     2614109 204278682       2614109 262305651       IonXpress_062_R_2020_04_22_15_56_22_user_GCEID-S5-58-SARS_CoV2_SA4.bam  gs://sra-pub-src-16/SRR11886736/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta.1  https://sra-download.ncbi.nlm.nih.gov/traces/sra46/SRZ/011886/SRR11886736/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta  https://sra-download.ncbi.nlm.nih.gov/traces/sra50/SRR/011608/SRR11886736               GC-51   NA      14-Apr-2020     Australia: Victoria     Homo sapiens    COVID-19        swab    NA      Pathogen.cl     https://sra-pub-sars-cov2.s3.amazonaws.com/sra-src/SRR11886736/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta.1   http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR118/036/SRR11886736/SRR11886736.fastq.gz                 era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR118/036/SRR11886736/SRR11886736.fastq.gz
SRR11886737 SRP265425       SRX8434253      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 61-2020-04-22   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745318              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     2286312 183516004       2286312 263304134       IonXpress_061_R_2020_04_22_15_56_22_user_GCEID-S5-58-SARS_CoV2_SA4.bam  gs://sra-pub-src-16/SRR11886737/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta.1  https://sra-download.ncbi.nlm.nih.gov/traces/sra29/SRZ/011886/SRR11886737/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta  https://sra-download.ncbi.nlm.nih.gov/traces/sra17/SRR/011608/SRR11886737               GC-24   NA      07-Apr-2020     Australia: Victoria     Homo sapiens    COVID-19        swab    NA      Pathogen.cl     https://sra-pub-sars-cov2.s3.amazonaws.com/sra-src/SRR11886737/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta.1   http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR118/037/SRR11886737/SRR11886737.fastq.gz                 era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR118/037/SRR11886737/SRR11886737.fastq.gz
SRR11886738 SRP265425       SRX8434252      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 60-2020-04-22   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745317              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     5202567 507524965       5202567 781291588       IonXpress_060_R_2020_04_22_15_56_22_user_GCEID-S5-58-SARS_CoV2_SA4.bam  gs://sra-pub-src-15/SRR11886738/IonXpress_060_R_2020_04_22_15_56_22_user_GCEID_S5_58_SARS_CoV2_SA4.bam.1        https://sra-download.ncbi.nlm.nih.gov/traces/sra69/SRZ/011886/SRR11886738/IonXpress_060_R_2020_04_22_15_56_22_user_GCEID_S5_58_SARS_CoV2_SA4.bam        https://sra-download.ncbi.nlm.nih.gov/traces/sra77/SRR/011608/SRR11886738               GC-23   NA      08-Apr-2020     Australia: Victoria     Homo sapiens    COVID-19        swab    NA      Pathogen.cl     https://sra-pub-sars-cov2.s3.amazonaws.com/sra-src/SRR11886738/IonXpress_060_R_2020_04_22_15_56_22_user_GCEID_S5_58_SARS_CoV2_SA4.bam.1 http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR118/038/SRR11886738/SRR11886738.fastq.gz                 era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR118/038/SRR11886738/SRR11886738.fastq.gz
SRR11886739 SRP265425       SRX8434251      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 38-2020-04-03   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745315              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     3313960 356104406       3313960 612430817       IonXpress_038_R_2020_04_03_10_09_05_user_GCEID-S5-55-SARS_CoV2_SA4.bam  gs://sra-pub-src-13/SRR11886739/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta.1  https://sra-pub-sars-cov2.s3.amazonaws.com/sra-src/SRR11886739/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta.1   https://sra-download.ncbi.nlm.nih.gov/traces/sra24/SRR/011608/SRR11886739               GC-11b  NA      24-Mar-2020     Australia: Victoria     Homo sapiens    COVID-19        swab    NA      Pathogen.cl             http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR118/039/SRR11886739/SRR11886739.fastq.gz                 era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR118/039/SRR11886739/SRR11886739.fastq.gz
SRR11886740 SRP265425       SRX8434250      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 37-2020-04-03   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745316              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     5155733 565882351       5155733 954342917       IonXpress_037_R_2020_04_03_10_09_05_user_GCEID-S5-55-SARS_CoV2_SA4.bam  gs://sra-pub-src-5/SRR11886740/IonXpress_037_R_2020_04_03_10_09_05_user_GCEID_S5_55_SARS_CoV2_SA4.bam.1 https://sra-pub-sars-cov2.s3.amazonaws.com/sra-src/SRR11886740/IonXpress_037_R_2020_04_03_10_09_05_user_GCEID_S5_55_SARS_CoV2_SA4.bam.1 https://sra-download.ncbi.nlm.nih.gov/traces/sra13/SRR/011608/SRR11886740               GC-14b  NA      28-Mar-2020     Australia: Victoria     Homo sapiens    COVID-19        swab    NA      Pathogen.cl             http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR118/040/SRR11886740/SRR11886740.fastq.gz                 era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR118/040/SRR11886740/SRR11886740.fastq.gz
SRR11886741 SRP265425       SRX8434249      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 36-2020-04-03   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745313              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     1324589 175619046       1324589 216531400       IonXpress_036_R_2020_04_03_10_09_05_user_GCEID-S5-55-SARS_CoV2_SA4.bam  gs://sra-pub-src-11/SRR11886741/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta.1  https://sra-pub-sars-cov2.s3.amazonaws.com/sra-src/SRR11886741/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta.1   https://sra-download.ncbi.nlm.nih.gov/traces/sra57/SRR/011608/SRR11886741               GC-12   NA      24-Mar-2020     Australia: Victoria     Homo sapiens    COVID-19        swab    NA      Pathogen.cl             http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR118/041/SRR11886741/SRR11886741.fastq.gz                 era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR118/041/SRR11886741/SRR11886741.fastq.gz
SRR11886742 SRP265425       SRX8434248      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 35-2020-04-03   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745314              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     1639851 198973268       1639851 245466005       IonXpress_035_R_2020_04_03_10_09_05_user_GCEID-S5-55-SARS_CoV2_SA4.bam  gs://sra-pub-src-11/SRR11886742/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta.1  https://sra-pub-sars-cov2.s3.amazonaws.com/sra-src/SRR11886742/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta.1   https://sra-download.ncbi.nlm.nih.gov/traces/sra69/SRR/011608/SRR11886742               GC-13   NA      23-Mar-2020     Australia: Victoria     Homo sapiens    COVID-19        swab    NA      Pathogen.cl             http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR118/042/SRR11886742/SRR11886742.fastq.gz                 era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR118/042/SRR11886742/SRR11886742.fastq.gz
SRR11886743 SRP265425       SRX8434247      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 68-2020-05-07   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745312              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     3921389 210198580       3921389 332935558       IonXpress_068_R_2020_05_07_11_47_51_user_GCEID-S5-60-SARS_CoV2_SA4.bam  gs://sra-pub-src-17/SRR11886743/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta.1  https://sra-download.ncbi.nlm.nih.gov/traces/sra64/SRZ/011886/SRR11886743/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta  https://sra-download.ncbi.nlm.nih.gov/traces/sra54/SRR/011608/SRR11886743               GC-55   NA      24-Apr-2020     Australia: Victoria     Homo sapiens    COVID-19        swab    NA      Pathogen.cl     https://sra-pub-sars-cov2.s3.amazonaws.com/sra-src/SRR11886743/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta.1   http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR118/043/SRR11886743/SRR11886743.fastq.gz                 era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR118/043/SRR11886743/SRR11886743.fastq.gz
SRR11886744 SRP265425       SRX8434246      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 66-2020-04-22   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745311              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     14295475        2150005008      14295475        2967829315      IonXpress_066_R_2020_04_22_11_10_56_user_GCEID-S5-57-SARS_CoV2_SA4.fastq        gs://sra-pub-src-11/SRR11886744/IonXpress_066_R_2020_04_22_11_10_56_user_GCEID-S5-57-SARS_CoV2_SA4.fastq.1      https://sra-pub-sars-cov2.s3.amazonaws.com/sra-src/SRR11886744/IonXpress_066_R_2020_04_22_11_10_56_user_GCEID-S5-57-SARS_CoV2_SA4.fastq.1       https://sra-download.ncbi.nlm.nih.gov/traces/sra20/SRR/011608/SRR11886744               GC-26   NA      07-Mar-2020     Australia: Victoria     Homo sapiens    COVID-19        swab    NA      Pathogen.cl             http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR118/044/SRR11886744/SRR11886744.fastq.gz                 era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR118/044/SRR11886744/SRR11886744.fastq.gz
SRR11886745 SRP265425       SRX8434245      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 65-2020-04-22   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745310              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     5124692 294846140       5124692 431819462       IonXpress_065_R_2020_04_22_11_10_56_user_GCEID-S5-57-SARS_CoV2_SA4.bam  gs://sra-pub-src-16/SRR11886745/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta.1  https://sra-download.ncbi.nlm.nih.gov/traces/sra69/SRZ/011886/SRR11886745/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta  https://sra-download.ncbi.nlm.nih.gov/traces/sra19/SRR/011608/SRR11886745               GC-25   NA      10-Apr-2020     Australia: Victoria     Homo sapiens    COVID-19        swab    NA      Pathogen.cl     https://sra-pub-sars-cov2.s3.amazonaws.com/sra-src/SRR11886745/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta.1   http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR118/045/SRR11886745/SRR11886745.fastq.gz                 era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR118/045/SRR11886745/SRR11886745.fastq.gz
SRR11886746 SRP265425       SRX8434244      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 64-2020-04-22   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745309              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     2986306 205666872       2986306 275400959       IonXpress_064_R_2020_04_22_15_56_22_user_GCEID-S5-58-SARS_CoV2_SA4.bam  gs://sra-pub-src-17/SRR11886746/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta.1  https://sra-download.ncbi.nlm.nih.gov/traces/sra59/SRZ/011886/SRR11886746/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta  https://sra-download.ncbi.nlm.nih.gov/traces/sra47/SRR/011608/SRR11886746               GC-21   NA      03-Apr-2020     Australia: Victoria     Homo sapiens    COVID-19        swab    NA      Pathogen.cl     https://sra-pub-sars-cov2.s3.amazonaws.com/sra-src/SRR11886746/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta.1   http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR118/046/SRR11886746/SRR11886746.fastq.gz                 era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR118/046/SRR11886746/SRR11886746.fastq.gz
SRR11886747 SRP265425       SRX8434243      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 34-2020-04-03   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745308              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     1182690 59471336        1182690 86350631        IonXpress_034_R_2020_04_03_10_09_05_user_GCEID-S5-55-SARS_CoV2_SA4.bam  gs://sra-pub-src-16/SRR11886747/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta.1  https://sra-download.ncbi.nlm.nih.gov/traces/sra77/SRZ/011886/SRR11886747/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta  https://sra-download.ncbi.nlm.nih.gov/traces/sra13/SRR/011608/SRR11886747               GC-11a  NA      24-Mar-2020     Australia: Victoria     Homo sapiens    COVID-19        swab    NA      Pathogen.cl     https://sra-pub-sars-cov2.s3.amazonaws.com/sra-src/SRR11886747/Wuhan_Hu_1_NC_045512_21500_and_subgenomics_SA4.fasta.1   http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR118/047/SRR11886747/SRR11886747.fastq.gz                 era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR118/047/SRR11886747/SRR11886747.fastq.gz
SRR11886748 SRP265425       SRX8434242      Ampliseq of SARS-CoV-2  Ampliseq of SARS-CoV-2  2697049 Severe acute respiratory syndrome coronavirus 2 33-2020-04-03   AMPLICON        VIRAL RNA       RT-PCR  SINGLE  SRS6745307              Ion Torrent S5 XL       Ion Torrent S5 XL       ION_TORRENT     6031816 749323230       6031816 928054297       IonXpress_033_R_2020_04_03_10_09_05_user_GCEID-S5-55-SARS_CoV2_SA4.bam  gs://sra-pub-src-15/SRR11886748/IonXpress_033_R_2020_04_03_10_09_05_user_GCEID_S5_55_SARS_CoV2_SA4.bam.1        https://sra-download.ncbi.nlm.nih.gov/traces/sra43/SRZ/011886/SRR11886748/IonXpress_033_R_2020_04_03_10_09_05_user_GCEID_S5_55_SARS_CoV2_SA4.bam        https://sra-download.ncbi.nlm.nih.gov/traces/sra66/SRR/011608/SRR11886748               GC-14a  NA      28-Mar-2020     Australia: Victoria     Homo sapiens    COVID-19        swab    NA      Pathogen.cl     https://sra-pub-sars-cov2.s3.amazonaws.com/sra-src/SRR11886748/IonXpress_033_R_2020_04_03_10_09_05_user_GCEID_S5_55_SARS_CoV2_SA4.bam.1 http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR118/048/SRR11886748/SRR11886748.fastq.gz                 era-fasp@fasp.sra.ebi.ac.uk:vol1/fastq/SRR118/048/SRR11886748/SRR11886748.fastq.gz
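
The ena_fastq_http column in the detailed output holds direct download URLs. The following is a minimal sketch, assuming the rows have already been read into dictionaries keyed by column name, of pairing each run with the local file name its fastq would be saved under. The names `rows` and `fastq_targets` are hypothetical and not part of the pysradb API.

```python
import posixpath
from urllib.parse import urlparse

# run_accession and ena_fastq_http values copied from the first two rows above.
rows = [
    {
        "run_accession": "SRR11886735",
        "ena_fastq_http": "http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR118/035/SRR11886735/SRR11886735.fastq.gz",
    },
    {
        "run_accession": "SRR11886736",
        "ena_fastq_http": "http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR118/036/SRR11886736/SRR11886736.fastq.gz",
    },
]


def fastq_targets(rows):
    """Map each run accession to (url, local file name); skip runs without a URL."""
    targets = {}
    for row in rows:
        url = row.get("ena_fastq_http", "").strip()
        if not url:
            continue
        file_name = posixpath.basename(urlparse(url).path)
        targets[row["run_accession"]] = (url, file_name)
    return targets


for run, (url, name) in fastq_targets(rows).items():
    print(run, name)
```

Runs with an empty ena_fastq_http field are skipped, which leaves the caller free to fall back to one of the sra_url columns.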

Converting between accession numbers#

pysradb provides a suite of commands for converting between accession numbers.
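
The accession types on this page can be told apart by their prefixes: SRP for studies, SRX for experiments, SRS for samples, SRR for runs, and GSE/GSM for GEO series and samples. As a rough sketch, the `accession_type` helper below is hypothetical, is not part of pysradb, and covers only the prefixes that appear on this page.

```python
import re

# Prefixes seen on this page; other archives use further prefixes not handled here.
PREFIX_TO_TYPE = {
    "SRP": "study",
    "SRX": "experiment",
    "SRS": "sample",
    "SRR": "run",
    "GSE": "GEO series",
    "GSM": "GEO sample",
}

# Three letters followed by digits, e.g. SRP098789.
ACCESSION_RE = re.compile(r"^([A-Z]{3})(\d+)$")


def accession_type(accession):
    """Return the record type for an accession, or None if it is not recognized."""
    match = ACCESSION_RE.match(accession.strip().upper())
    if match is None:
        return None
    return PREFIX_TO_TYPE.get(match.group(1))


print(accession_type("SRP098789"))   # study
print(accession_type("SRX2536428"))  # experiment
```

A check like this can help pick the right conversion command before calling it.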

Convert SRP to SRX#

Command line:

$ pysradb srp-to-srx SRP098789

Python:

from pysradb.sraweb import SRAweb

db = SRAweb()
df = db.srp_to_srx("SRP098789")
df

Output:

study_accession experiment_accession    experiment_title        experiment_desc organism_taxid  organism_name   library_strategy        library_source  library_selection       sample_accession        sample_title    instrument      total_spots     total_size      run_accession   run_total_spots run_total_bases study_accesssion
SRP098789       SRX2536428      GSM2476022: vehicle, 60 min, rep 5-mRNAseq; Homo sapiens; RNA-Seq       GSM2476022: vehicle, 60 min, rep 5-mRNAseq; Homo sapiens; RNA-Seq       9606    Homo sapiens    RNA-Seq TRANSCRIPTOMIC  cDNA    SRS1956378      Illumina HiSeq 2500     69422931        1545681856      SRR5227313      69422931        3540569481      SRP098789
SRP098789       SRX2536427      GSM2476021: PF-06446846, 60 min, rep 5 -mRNA-seq; Homo sapiens; OTHER   GSM2476021: PF-06446846, 60 min, rep 5 -mRNA-seq; Homo sapiens; OTHER   9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956377      Illumina HiSeq 2500     58065134        1302369810      SRR5227312      58065134        2961321834      SRP098789
SRP098789       SRX2536426      GSM2476020: vehicle, 60 min, rep 4-mRNAseq; Homo sapiens; RNA-Seq       GSM2476020: vehicle, 60 min, rep 4-mRNAseq; Homo sapiens; RNA-Seq       9606    Homo sapiens    RNA-Seq TRANSCRIPTOMIC  cDNA    SRS1956376      Illumina HiSeq 2500     63720205        1416818619      SRR5227311      63720205        3249730455      SRP098789
SRP098789       SRX2536425      GSM2476019: PF-06446846, 60 min, rep 4 -mRNA-seq; Homo sapiens; OTHER   GSM2476019: PF-06446846, 60 min, rep 4 -mRNA-seq; Homo sapiens; OTHER   9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956375      Illumina HiSeq 2500     66363585        1482728577      SRR5227310      66363585        3384542835      SRP098789
SRP098789       SRX2536424      GSM2476018: vehicle, 60 min, rep 5-Ribo-seq; Homo sapiens; RNA-Seq      GSM2476018: vehicle, 60 min, rep 5-Ribo-seq; Homo sapiens; RNA-Seq      9606    Homo sapiens    RNA-Seq TRANSCRIPTOMIC  cDNA    SRS1956374      Illumina HiSeq 2500     40062613        904488287       SRR5227309      40062613        2043193263      SRP098789
SRP098789       SRX2536423      GSM2476017: 1.5 ?M PF-067446846, 60 min, rep 5 -riboseq; Homo sapiens; OTHER    GSM2476017: 1.5 ?M PF-067446846, 60 min, rep 5 -riboseq; Homo sapiens; OTHER    9606    Homo sapiens    OTHER   TRANSCRIPTOMIC other    SRS1956373      Illumina HiSeq 2500     65591217        1499668100      SRR5227308      65591217        3345152067      SRP098789
SRP098789       SRX2536422      GSM2476016: Vehicle, 60 min, rep 4-ribo-seq; Homo sapiens; RNA-Seq      GSM2476016: Vehicle, 60 min, rep 4-ribo-seq; Homo sapiens; RNA-Seq      9606    Homo sapiens    RNA-Seq TRANSCRIPTOMIC  cDNA    SRS1956372      Illumina HiSeq 2500     66480991        1564636133      SRR5227307      66480991        3390530541      SRP098789
SRP098789       SRX2536421      GSM2476015: 1.5 ?M PF-067446846, 60 min, rep 4 -riboseq; Homo sapiens; OTHER    GSM2476015: 1.5 ?M PF-067446846, 60 min, rep 4 -riboseq; Homo sapiens; OTHER    9606    Homo sapiens    OTHER   TRANSCRIPTOMIC other    SRS1956371      Illumina HiSeq 2500     57588015        1357395400      SRR5227306      57588015        2936988765      SRP098789
SRP098789       SRX2536420      GSM2476014: vehicle, 60 min rep 3; Homo sapiens; OTHER  GSM2476014: vehicle, 60 min rep 3; Homo sapiens; OTHER  9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956370      Illumina HiSeq 2000    48405034 1530784033      SRR5227305      48405034        2420251700      SRP098789
SRP098789       SRX2536419      GSM2476013: vehicle, 60 min rep 2; Homo sapiens; OTHER  GSM2476013: vehicle, 60 min rep 2; Homo sapiens; OTHER  9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956369      Illumina HiSeq 2000    47139057 1489018603      SRR5227304      47139057        2356952850      SRP098789
SRP098789       SRX2536418      GSM2476012: vehicle, 60 min rep 1; Homo sapiens; OTHER  GSM2476012: vehicle, 60 min rep 1; Homo sapiens; OTHER  9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956368      Illumina HiSeq 2000    50956178 1495757884      SRR5227303      50956178        2547808900      SRP098789
SRP098789       SRX2536417      GSM2476011: 0.3 ?M PF-067446846, 60 min, rep 3; Homo sapiens; OTHER     GSM2476011: 0.3 ?M PF-067446846, 60 min, rep 3; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956367      Illumina HiSeq 2000     44258180        1404548468      SRR5227302      44258180        2212909000      SRP098789
SRP098789       SRX2536416      GSM2476010: 0.3 ?M PF-067446846, 60 min, rep 2; Homo sapiens; OTHER     GSM2476010: 0.3 ?M PF-067446846, 60 min, rep 2; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956366      Illumina HiSeq 2000     49129512        1536091510      SRR5227301      49129512        2456475600      SRP098789
SRP098789       SRX2536415      GSM2476009: 0.3 ?M PF-067446846, 60 min, rep 1; Homo sapiens; OTHER     GSM2476009: 0.3 ?M PF-067446846, 60 min, rep 1; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956365      Illumina HiSeq 2000     30043362        903983724       SRR5227300      30043362        1502168100      SRP098789
SRP098789       SRX2536414      GSM2476008: 1.5 ?M PF-067446846, 60 min, rep 3; Homo sapiens; OTHER     GSM2476008: 1.5 ?M PF-067446846, 60 min, rep 3; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956364      Illumina HiSeq 2000     48766213        1530350854      SRR5227299      48766213        2438310650      SRP098789
SRP098789       SRX2536413      GSM2476007: 1.5 ?M PF-067446846, 60 min, rep 2; Homo sapiens; OTHER     GSM2476007: 1.5 ?M PF-067446846, 60 min, rep 2; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956363      Illumina HiSeq 2000     49334392        1475414353      SRR5227298      49334392        2466719600      SRP098789
SRP098789       SRX2536412      GSM2476006: 1.5 ?M PF-067446846, 60 min, rep 1; Homo sapiens; OTHER     GSM2476006: 1.5 ?M PF-067446846, 60 min, rep 1; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956362      Illumina HiSeq 2000     60381365        1801283052      SRR5227297      60381365        3019068250      SRP098789
SRP098789       SRX2536411      GSM2476005: vehicle, 10 min rep 3; Homo sapiens; OTHER  GSM2476005: vehicle, 10 min rep 3; Homo sapiens; OTHER  9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956361      Illumina HiSeq 2000    52737784 1644829192      SRR5227296      52737784        2636889200      SRP098789
SRP098789       SRX2536410      GSM2476004: vehicle, 10 min rep 2; Homo sapiens; OTHER  GSM2476004: vehicle, 10 min rep 2; Homo sapiens; OTHER  9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956360      Illumina HiSeq 2000    46137148 1455541408      SRR5227295      46137148        2306857400      SRP098789
SRP098789       SRX2536409      GSM2476003: vehicle, 10 min rep 1; Homo sapiens; OTHER  GSM2476003: vehicle, 10 min rep 1; Homo sapiens; OTHER  9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956359      Illumina HiSeq 2000    76002122 1552821132      SRR5227294      76002122        3800106100      SRP098789
SRP098789       SRX2536408      GSM2476002: 0.3 ?M PF-067446846, 10 min, rep 3; Homo sapiens; OTHER     GSM2476002: 0.3 ?M PF-067446846, 10 min, rep 3; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956358      Illumina HiSeq 2000     42709138        1338829352      SRR5227293      42709138        2135456900      SRP098789
SRP098789       SRX2536407      GSM2476001: 0.3 ?M PF-067446846, 10 min, rep 2; Homo sapiens; OTHER     GSM2476001: 0.3 ?M PF-067446846, 10 min, rep 2; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956357      Illumina HiSeq 2000     60552437        1875910244      SRR5227292      60552437        3027621850      SRP098789
SRP098789       SRX2536406      GSM2476000: 0.3 ?M PF-067446846, 10 min, rep 1; Homo sapiens; OTHER     GSM2476000: 0.3 ?M PF-067446846, 10 min, rep 1; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956356      Illumina HiSeq 2000     41143319        843881081       SRR5227291      41143319        2057165950      SRP098789
SRP098789       SRX2536405      GSM2475999: 1.5 ?M PF-067446846, 10 min, rep 3; Homo sapiens; OTHER     GSM2475999: 1.5 ?M PF-067446846, 10 min, rep 3; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956355      Illumina HiSeq 2000     40462973        1287284933      SRR5227290      40462973        2023148650      SRP098789
SRP098789       SRX2536404      GSM2475998: 1.5 ?M PF-067446846, 10 min, rep 2; Homo sapiens; OTHER     GSM2475998: 1.5 ?M PF-067446846, 10 min, rep 2; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956354      Illumina HiSeq 2000     41657461        1360366732      SRR5227289      41657461        2082873050      SRP098789
SRP098789       SRX2536403      GSM2475997: 1.5 ?M PF-067446846, 10 min, rep 1; Homo sapiens; OTHER     GSM2475997: 1.5 ?M PF-067446846, 10 min, rep 1; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956353      Illumina HiSeq 2000     42082855        916745706       SRR5227288      42082855        2104142750      SRP098789

Convert SRP to SRX#

$ pysradb srp-to-srx SRP098789
from pysradb.sraweb import SRAweb

db = SRAweb()
df = db.srp_to_srx("SRP098789")
df

Output:

study_accession experiment_accession    experiment_title        experiment_desc organism_taxid  organism_name   library_strategy        library_source  library_selection       sample_accession        sample_title    instrument      total_spots     total_size      run_accession   run_total_spots run_total_bases study_accesssion
SRP098789       SRX2536428      GSM2476022: vehicle, 60 min, rep 5-mRNAseq; Homo sapiens; RNA-Seq       GSM2476022: vehicle, 60 min, rep 5-mRNAseq; Homo sapiens; RNA-Seq       9606    Homo sapiens    RNA-Seq TRANSCRIPTOMIC  cDNA    SRS1956378      Illumina HiSeq 2500     69422931        1545681856      SRR5227313      69422931        3540569481      SRP098789
SRP098789       SRX2536427      GSM2476021: PF-06446846, 60 min, rep 5 -mRNA-seq; Homo sapiens; OTHER   GSM2476021: PF-06446846, 60 min, rep 5 -mRNA-seq; Homo sapiens; OTHER   9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956377      Illumina HiSeq 2500     58065134        1302369810      SRR5227312      58065134        2961321834      SRP098789
SRP098789       SRX2536426      GSM2476020: vehicle, 60 min, rep 4-mRNAseq; Homo sapiens; RNA-Seq       GSM2476020: vehicle, 60 min, rep 4-mRNAseq; Homo sapiens; RNA-Seq       9606    Homo sapiens    RNA-Seq TRANSCRIPTOMIC  cDNA    SRS1956376      Illumina HiSeq 2500     63720205        1416818619      SRR5227311      63720205        3249730455      SRP098789
SRP098789       SRX2536425      GSM2476019: PF-06446846, 60 min, rep 4 -mRNA-seq; Homo sapiens; OTHER   GSM2476019: PF-06446846, 60 min, rep 4 -mRNA-seq; Homo sapiens; OTHER   9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956375      Illumina HiSeq 2500     66363585        1482728577      SRR5227310      66363585        3384542835      SRP098789
SRP098789       SRX2536424      GSM2476018: vehicle, 60 min, rep 5-Ribo-seq; Homo sapiens; RNA-Seq      GSM2476018: vehicle, 60 min, rep 5-Ribo-seq; Homo sapiens; RNA-Seq      9606    Homo sapiens    RNA-Seq TRANSCRIPTOMIC  cDNA    SRS1956374      Illumina HiSeq 2500     40062613        904488287       SRR5227309      40062613        2043193263      SRP098789
SRP098789       SRX2536423      GSM2476017: 1.5 ?M PF-067446846, 60 min, rep 5 -riboseq; Homo sapiens; OTHER    GSM2476017: 1.5 ?M PF-067446846, 60 min, rep 5 -riboseq; Homo sapiens; OTHER    9606    Homo sapiens    OTHER   TRANSCRIPTOMIC other    SRS1956373      Illumina HiSeq 2500     65591217        1499668100      SRR5227308      65591217        3345152067      SRP098789
SRP098789       SRX2536422      GSM2476016: Vehicle, 60 min, rep 4-ribo-seq; Homo sapiens; RNA-Seq      GSM2476016: Vehicle, 60 min, rep 4-ribo-seq; Homo sapiens; RNA-Seq      9606    Homo sapiens    RNA-Seq TRANSCRIPTOMIC  cDNA    SRS1956372      Illumina HiSeq 2500     66480991        1564636133      SRR5227307      66480991        3390530541      SRP098789
SRP098789       SRX2536421      GSM2476015: 1.5 ?M PF-067446846, 60 min, rep 4 -riboseq; Homo sapiens; OTHER    GSM2476015: 1.5 ?M PF-067446846, 60 min, rep 4 -riboseq; Homo sapiens; OTHER    9606    Homo sapiens    OTHER   TRANSCRIPTOMIC other    SRS1956371      Illumina HiSeq 2500     57588015        1357395400      SRR5227306      57588015        2936988765      SRP098789
SRP098789       SRX2536420      GSM2476014: vehicle, 60 min rep 3; Homo sapiens; OTHER  GSM2476014: vehicle, 60 min rep 3; Homo sapiens; OTHER  9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956370      Illumina HiSeq 2000    48405034 1530784033      SRR5227305      48405034        2420251700      SRP098789
SRP098789       SRX2536419      GSM2476013: vehicle, 60 min rep 2; Homo sapiens; OTHER  GSM2476013: vehicle, 60 min rep 2; Homo sapiens; OTHER  9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956369      Illumina HiSeq 2000    47139057 1489018603      SRR5227304      47139057        2356952850      SRP098789
SRP098789       SRX2536418      GSM2476012: vehicle, 60 min rep 1; Homo sapiens; OTHER  GSM2476012: vehicle, 60 min rep 1; Homo sapiens; OTHER  9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956368      Illumina HiSeq 2000    50956178 1495757884      SRR5227303      50956178        2547808900      SRP098789
SRP098789       SRX2536417      GSM2476011: 0.3 ?M PF-067446846, 60 min, rep 3; Homo sapiens; OTHER     GSM2476011: 0.3 ?M PF-067446846, 60 min, rep 3; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956367      Illumina HiSeq 2000     44258180        1404548468      SRR5227302      44258180        2212909000      SRP098789
SRP098789       SRX2536416      GSM2476010: 0.3 ?M PF-067446846, 60 min, rep 2; Homo sapiens; OTHER     GSM2476010: 0.3 ?M PF-067446846, 60 min, rep 2; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956366      Illumina HiSeq 2000     49129512        1536091510      SRR5227301      49129512        2456475600      SRP098789
SRP098789       SRX2536415      GSM2476009: 0.3 ?M PF-067446846, 60 min, rep 1; Homo sapiens; OTHER     GSM2476009: 0.3 ?M PF-067446846, 60 min, rep 1; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956365      Illumina HiSeq 2000     30043362        903983724       SRR5227300      30043362        1502168100      SRP098789
SRP098789       SRX2536414      GSM2476008: 1.5 ?M PF-067446846, 60 min, rep 3; Homo sapiens; OTHER     GSM2476008: 1.5 ?M PF-067446846, 60 min, rep 3; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956364      Illumina HiSeq 2000     48766213        1530350854      SRR5227299      48766213        2438310650      SRP098789
SRP098789       SRX2536413      GSM2476007: 1.5 ?M PF-067446846, 60 min, rep 2; Homo sapiens; OTHER     GSM2476007: 1.5 ?M PF-067446846, 60 min, rep 2; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956363      Illumina HiSeq 2000     49334392        1475414353      SRR5227298      49334392        2466719600      SRP098789
SRP098789       SRX2536412      GSM2476006: 1.5 ?M PF-067446846, 60 min, rep 1; Homo sapiens; OTHER     GSM2476006: 1.5 ?M PF-067446846, 60 min, rep 1; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956362      Illumina HiSeq 2000     60381365        1801283052      SRR5227297      60381365        3019068250      SRP098789
SRP098789       SRX2536411      GSM2476005: vehicle, 10 min rep 3; Homo sapiens; OTHER  GSM2476005: vehicle, 10 min rep 3; Homo sapiens; OTHER  9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956361      Illumina HiSeq 2000    52737784 1644829192      SRR5227296      52737784        2636889200      SRP098789
SRP098789       SRX2536410      GSM2476004: vehicle, 10 min rep 2; Homo sapiens; OTHER  GSM2476004: vehicle, 10 min rep 2; Homo sapiens; OTHER  9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956360      Illumina HiSeq 2000    46137148 1455541408      SRR5227295      46137148        2306857400      SRP098789
SRP098789       SRX2536409      GSM2476003: vehicle, 10 min rep 1; Homo sapiens; OTHER  GSM2476003: vehicle, 10 min rep 1; Homo sapiens; OTHER  9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956359      Illumina HiSeq 2000    76002122 1552821132      SRR5227294      76002122        3800106100      SRP098789
SRP098789       SRX2536408      GSM2476002: 0.3 ?M PF-067446846, 10 min, rep 3; Homo sapiens; OTHER     GSM2476002: 0.3 ?M PF-067446846, 10 min, rep 3; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956358      Illumina HiSeq 2000     42709138        1338829352      SRR5227293      42709138        2135456900      SRP098789
SRP098789       SRX2536407      GSM2476001: 0.3 ?M PF-067446846, 10 min, rep 2; Homo sapiens; OTHER     GSM2476001: 0.3 ?M PF-067446846, 10 min, rep 2; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956357      Illumina HiSeq 2000     60552437        1875910244      SRR5227292      60552437        3027621850      SRP098789
SRP098789       SRX2536406      GSM2476000: 0.3 ?M PF-067446846, 10 min, rep 1; Homo sapiens; OTHER     GSM2476000: 0.3 ?M PF-067446846, 10 min, rep 1; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956356      Illumina HiSeq 2000     41143319        843881081       SRR5227291      41143319        2057165950      SRP098789
SRP098789       SRX2536405      GSM2475999: 1.5 ?M PF-067446846, 10 min, rep 3; Homo sapiens; OTHER     GSM2475999: 1.5 ?M PF-067446846, 10 min, rep 3; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956355      Illumina HiSeq 2000     40462973        1287284933      SRR5227290      40462973        2023148650      SRP098789
SRP098789       SRX2536404      GSM2475998: 1.5 ?M PF-067446846, 10 min, rep 2; Homo sapiens; OTHER     GSM2475998: 1.5 ?M PF-067446846, 10 min, rep 2; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956354      Illumina HiSeq 2000     41657461        1360366732      SRR5227289      41657461        2082873050      SRP098789
SRP098789       SRX2536403      GSM2475997: 1.5 ?M PF-067446846, 10 min, rep 1; Homo sapiens; OTHER     GSM2475997: 1.5 ?M PF-067446846, 10 min, rep 1; Homo sapiens; OTHER     9606    Homo sapiens    OTHER   TRANSCRIPTOMIC  other   SRS1956353      Illumina HiSeq 2000     42082855        916745706       SRR5227288      42082855        2104142750      SRP098789


Downloading sequencing data#

pysradb can also be used to download either .fastq or .sra files, from both ENA and SRA.

Downloading via accession number#

$ pysradb download -p SRP098789
from pysradb.sraweb import SRAweb

db = SRAweb()
db.download("SRP098789")

It is also possible to pipe the dataframe from metadata or search to download, optionally after filtering its entries:

$ pysradb metadata SRP276671 --detailed | pysradb download
from pysradb.sraweb import SRAweb
db = SRAweb()
df = db.sra_metadata("SRP276671", detailed=True)
db.download(df=df)
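
The example above pipes the metadata straight through without filtering it. As a rough sketch of what a filtering step could look like, the hypothetical helper below (filter_tsv is not part of pysradb) keeps the header line plus the rows whose column matches a given value. It assumes the piped metadata is tab-separated text with a header row, which may differ between pysradb versions, so check the output of your installed version first. The sample rows reuse accessions from the SRP098789 output.

```python
def filter_tsv(lines, column, wanted):
    """Yield the header line, then each row whose `column` equals `wanted`.

    `lines` is any iterable of tab-separated text lines (a list, an open
    file, or sys.stdin). Assumption: the first line is the header row.
    """
    lines = iter(lines)
    header = next(lines, None)
    if header is None:  # empty input: nothing to yield
        return
    header = header.rstrip("\n")
    index = header.split("\t").index(column)  # ValueError if column is absent
    yield header
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > index and fields[index] == wanted:
            yield "\t".join(fields)


# Three columns only, for brevity; values taken from the SRP098789 table.
sample = [
    "study_accession\texperiment_accession\tlibrary_strategy",
    "SRP098789\tSRX2536428\tRNA-Seq",
    "SRP098789\tSRX2536427\tOTHER",
    "SRP098789\tSRX2536426\tRNA-Seq",
]

# Prints the header followed by the two RNA-Seq rows.
for row in filter_tsv(sample, "library_strategy", "RNA-Seq"):
    print(row)
```

Saved as a small script that feeds sys.stdin to filter_tsv and prints each yielded line, a filter like this could sit between the metadata and download commands in the pipeline.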

Ultrafast fastq downloads#

With aspera-client installed, pysradb can perform ultrafast downloads. To download all original fastqs using 8 threads:

$ pysradb download -t 8 --use_ascp -p SRP002605
from pysradb.sraweb import SRAweb

db = SRAweb()
db.download("SRP002605", use_ascp=True, threads=8)

Refer to the "Multithreaded fastq downloads using Aspera Client" notebook for (shallow) time benchmarks.
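
Because use_ascp only helps when the Aspera client is actually present, a script may want to check for it before requesting an ascp download. The sketch below shows one hedged way to do that. find_ascp, download_options, and download_fast are hypothetical helpers, not pysradb functions, and the check assumes the ascp binary is discoverable on PATH. pysradb may look for the binary elsewhere, so treat a negative result as a hint rather than proof. Only the use_ascp and threads keywords come from the example above.

```python
import shutil


def find_ascp(search_path=None):
    """Return the full path of an `ascp` executable, or None if none is found.

    `search_path` is an optional os.pathsep-separated list of directories;
    when it is None, the PATH environment variable is searched.
    """
    return shutil.which("ascp", path=search_path)


def download_options(threads=8, search_path=None):
    """Build keyword arguments for download(), enabling ascp only if present."""
    return {"use_ascp": find_ascp(search_path) is not None, "threads": threads}


def download_fast(srp, threads=8):
    """Download `srp`, using ascp when available (needs pysradb and network access)."""
    # Imported lazily so the helpers above work without pysradb installed.
    from pysradb.sraweb import SRAweb

    db = SRAweb()
    db.download(srp, **download_options(threads))


# Show which options would be used on this machine.
print(download_options(threads=8))
```

Falling back to use_ascp=False keeps the same call working on machines without the Aspera client, at the cost of slower transfers.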



List of possible pysradb operations#

$ pysradb
usage: pysradb [-h] [--version] [--citation]
               {metadb,metadata,download,search,gse-to-gsm,gse-to-srp,gsm-to-gse,gsm-to-srp,gsm-to-srr,gsm-to-srs,gsm-to-srx,srp-to-gse,srp-to-srr,srp-to-srs,srp-to-srx,srr-to-gsm,srr-to-srp,srr-to-srs,srr-to-srx,srs-to-gsm,srs-to-srx,srx-to-srp,srx-to-srr,srx-to-srs}
               ...

pysradb: Query NGS metadata and data from NCBI Sequence Read Archive.
Citation: 10.12688/f1000research.18676.1

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  --citation            how to cite

subcommands:
  {metadb,metadata,download,search,gse-to-gsm,gse-to-srp,gsm-to-gse,gsm-to-srp,gsm-to-srr,gsm-to-srs,gsm-to-srx,srp-to-gse,srp-to-srr,srp-to-srs,srp-to-srx,srr-to-gsm,srr-to-srp,srr-to-srs,srr-to-srx,srs-to-gsm,srs-to-srx,srx-to-srp,srx-to-srr,srx-to-srs}
    metadata            Fetch metadata for SRA project (SRPnnnn)
    download            Download SRA project (SRPnnnn)
    search              Search SRA/ENA for matching text
    gse-to-gsm          Get GSM for a GSE
    gse-to-srp          Get SRP for a GSE
    gsm-to-gse          Get GSE for a GSM
    gsm-to-srp          Get SRP for a GSM
    gsm-to-srr          Get SRR for a GSM
    gsm-to-srs          Get SRS for a GSM
    gsm-to-srx          Get SRX for a GSM
    srp-to-gse          Get GSE for a SRP
    srp-to-srr          Get SRR for a SRP
    srp-to-srs          Get SRS for a SRP
    srp-to-srx          Get SRX for a SRP
    srr-to-gsm          Get GSM for a SRR
    srr-to-srp          Get SRP for a SRR
    srr-to-srs          Get SRS for a SRR
    srr-to-srx          Get SRX for a SRR
    srs-to-gsm          Get GSM for a SRS
    srs-to-srx          Get SRX for a SRS
    srx-to-srp          Get SRP for a SRX
    srx-to-srr          Get SRR for a SRX
    srx-to-srs          Get SRS for a SRX
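
The conversion subcommands all follow a source-to-target naming pattern, but not every pair exists (there is no srs-to-srp, for instance). As a small illustration, the hypothetical helper below (conversion_command is not part of pysradb) derives the subcommand from an accession's three-letter prefix and checks it against the list copied from the help output above. Treating the prefix as the source type is an assumption of this sketch, and accessions with other prefixes are rejected.

```python
import re

# Conversion subcommands copied from the `pysradb` help output above.
CONVERSIONS = frozenset("""
gse-to-gsm gse-to-srp gsm-to-gse gsm-to-srp gsm-to-srr gsm-to-srs gsm-to-srx
srp-to-gse srp-to-srr srp-to-srs srp-to-srx srr-to-gsm srr-to-srp srr-to-srs
srr-to-srx srs-to-gsm srs-to-srx srx-to-srp srx-to-srr srx-to-srs
""".split())


def conversion_command(accession, target):
    """Return the argv list that converts `accession` to `target` IDs."""
    # Assumption: an accession is three letters followed by digits.
    match = re.fullmatch(r"([A-Za-z]{3})\d+", accession)
    if match is None:
        raise ValueError(f"unrecognised accession: {accession!r}")
    subcommand = f"{match.group(1).lower()}-to-{target.lower()}"
    if subcommand not in CONVERSIONS:
        raise ValueError(f"pysradb lists no {subcommand!r} subcommand")
    return ["pysradb", subcommand, accession]


print(conversion_command("SRP098789", "srx"))
# → ['pysradb', 'srp-to-srx', 'SRP098789']
```

The returned list can be passed to subprocess.run, which avoids shell quoting issues when accessions come from user input.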