Reference genomes
How to get a reference genome into Galaxy?
-
In a web browser, go to https://www.ncbi.nlm.nih.gov/assembly/
-
Enter the taxon into the search bar.
-
For example, enter Mycoplasma synoviae
- Choose the assembly you want, for example, ASM314756v1
- Under the
Assembly Definition , click on the GenBank sequence ID, for example, CP021129.1 - In the top right, click on
Send to - For
Choose Destination select File - For
Format choose FASTA - Click
Create File -
The file will download locally.
-
In Galaxy, go to
Get Data and chooseUpload File . - Click
Choose local file - Select the file.
- Click
Start - Click
Close - You should now have the file in the top of your current history.
- Re-name it, e.g. to Mycoplasma_reference.fasta
(Alternatively: request a reference genome be added to Galaxy Australia: click on the request link on the Galaxy Australia welcome page.)
How to convert this into other formats?
Tools in Galaxy may require various formats for your reference genome. For bacterial genomes, an easy way to get the required formats is to use the tool Prokka.
- Go to the Tool panel and search for “prokka” in the search box.
- Click on
Prokka - For
Contigs to annotate : yourreference.fasta file - All the other settings can be left as they are.
- Click
Execute
Prokka outputs lots of different files including a genbank, fasta and gff version of your input fasta file.