Earlier, we walked through finding reads and putting them in a selected library box, but before this button turns blue and you can assemble, you have some other decisions to make. In this video, we're going to talk about selecting the parameters for your assembly and submitting the job. The information icon here gives you additional information about everything that you need to do in this box, but first, let's talk about the assembly strategy. I click on the down arrow here, and you can see that there are a number of strategies that you can use to assemble your data. If you have Auto selected, it means that you're allowing the PATRIC assembler to choose the best strategy.

Let's talk about Unicycler. Unicycler is an assembly pipeline that can assemble Illumina-only read sets, where it functions as a SPAdes optimizer. It can also assemble long-read-only sets, like those from PacBio or Nanopore, where it runs a miniasm plus Racon pipeline. For the best possible assembly, if you have that option, you can give it both Illumina reads and long reads, and it will create a hybrid assembly. So if you have both of those, you should use Unicycler. Unicycler builds an initial assembly graph from the short reads using the de novo assembler SPAdes and then uses a novel semi-global aligner to align the long reads to it.

SPAdes. SPAdes has been around for a while now, and it's an assembler that's designed to assemble small genomes, such as those from bacteria. It uses a multisized de Bruijn graph to guide assembly.

Canu is a long-read assembler which works on third- and fourth-generation sequencing reads. It supports Nanopore sequencing, halves depth-of-coverage requirements, and improves assembly continuity. It was designed for high-noise single-molecule sequencing, such as the PacBio RS II and Sequel or the Oxford Nanopore MinION.

The metaSPAdes software combines new algorithmic ideas with proven solutions from the SPAdes toolkit to address various challenges of metagenome assembly.
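The guidance so far can be restated as a small lookup. This is only a sketch: `candidate_strategies` is a hypothetical helper, not part of PATRIC, and it simply echoes which strategies the video associates with each kind of read set. Leaving the menu on Auto hands this choice to the service instead.

```python
def candidate_strategies(has_short_reads, has_long_reads, metagenome=False):
    """Hypothetical helper (not a PATRIC function) restating the video's guidance."""
    if not (has_short_reads or has_long_reads):
        raise ValueError("at least one read library is required")
    if metagenome:
        # metaSPAdes is the strategy described for metagenome assembly.
        return ["metaSPAdes"]
    if has_short_reads and has_long_reads:
        # Both read types available: hybrid assembly with Unicycler.
        return ["Unicycler"]
    if has_long_reads:
        # Long reads only: Unicycler (miniasm plus Racon) or Canu.
        return ["Unicycler", "Canu"]
    # Short reads only: Unicycler (as a SPAdes optimizer) or SPAdes itself.
    return ["Unicycler", "SPAdes"]


if __name__ == "__main__":
    print(candidate_strategies(has_short_reads=True, has_long_reads=True))
```

The function returns a list of candidates rather than a single answer because the video does not rank Unicycler against SPAdes or Canu when only one read type is available.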
So you want to use metaSPAdes for metagenomic assembly and Unicycler for hybrid.

Plasmids are stably maintained extrachromosomal genetic elements that replicate independently from the host's chromosomes. plasmidSPAdes is an algorithm and software tool for assembling plasmids from whole genome sequencing data, and its performance has been benchmarked on a diverse set of bacterial genomes. So if you have plasmid data, try plasmidSPAdes. Or, if you have a whole genome sequence that you think includes plasmids, I might submit the whole genome using Auto, and then also submit a separate assembly with plasmidSPAdes to see if it has any luck pulling that plasmid out.

MDA is something that you use for single cells. This is something that SPAdes has developed. We are getting more and more of these sequences from single-cell isolates, and this is a specific recipe generated for those.

After all that talk, we're just going to go with Auto here. Now you need to choose an output folder for your assembly. If you have one in mind, once again, you can click on the down arrow and see the ones you most recently created, or you could click on the folder icon and create a new folder if you want to. But I don't want to do that. I'm just going to put this in my test folder. Then you need a name for your output file. We could just submit that, then copy and paste it just so I know what I'm calling it. At this point, you'll notice that this button turns blue and it's ready to submit an assembly, but let's click on the Advanced button and see what we can find there.

We can see that there are several options under Advanced. You can ask it to trim reads prior to assembly. If you wanted to do that, you'd click on the down arrow and select "True". The tool that PATRIC uses for trimming is called Trim Galore.

Now let's talk about Racon and Pilon iterations. Both Racon and Pilon take the contigs and the reads mapped to those contigs.
Each one looks for discrepancies between the assembly and the majority of the reads. Where there is a discrepancy, Racon or Pilon will correct the assembly if the majority of the reads call for that. Racon is for long reads like PacBio or Nanopore, and Pilon is for the shorter reads like Illumina or Ion Torrent. Once the assembly has been corrected with the reads, it's still possible to do another iteration to further improve the assembly, but each one takes time. Right now, we have the default set to two of each, whether you're doing a hybrid or just short reads or just long reads.

Next, we have minimum contig length and minimum contig coverage. You can adjust the numbers using these up and down arrows, but the default for the minimum contig length is 300 and for coverage is five. Once you're ready to assemble, you click the Assemble button, and your assembly job has been submitted.

Here is your third assignment, and this one is going to be a little bit tough. It's going to take you some time and some thought. What I want you to do is submit a number of assembly jobs with the test reads, the MOOC test 1 and test 2, those paired-end reads. For each job, you will assign the reads to be either Illumina or Ion Torrent, and then you'll change the assembly strategy. I want you to use every single assembly strategy and then submit it for both Illumina and Ion Torrent. We want to see if there's any difference when you do that and how that affects the assembly. Does the assembler care whether they're long or short reads? If you hit it with Canu, what's going to happen? This is just to give you some experience with all the different types of strategies. One thing that's going to be important is to name each job uniquely so that you can identify it at the end. I filled out a template to help me do this, which I would suggest that you do too. I did this in Excel. I identified the platform.
Those are all the strategies. I put in what I named each job, how long it took to assemble, and the short-read coverage. These are just some of the things I recorded that you will need later, but right now, it's important to do each of these combinations, so you'll have 1, 2, 3, 4, 5, 6 for Ion Torrent and six for Illumina, and you don't know what these rates are. So it's a clever little assignment, even if I do say so myself.
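If you would rather generate the tracking template than build it by hand in Excel, here is a minimal sketch. Everything in it is an assumption made for illustration: the column names, the `mooc_test` prefix, and especially the six strategy labels, since the video names more menu options than the six counted in the template and does not say which six were used. Swap in the strategies you actually see in the menu.

```python
import csv
import io
from itertools import product

# Columns mirror what the video says was recorded; the exact layout is assumed.
COLUMNS = ["platform", "strategy", "job_name", "assembly_time", "short_read_coverage"]


def build_job_template(platforms, strategies, prefix="mooc_test"):
    """Return one row per platform/strategy pair, each with a unique job name."""
    rows = []
    for platform, strategy in product(platforms, strategies):
        # Lowercase and replace spaces so the name is easy to spot in a job list.
        job_name = f"{prefix}_{platform}_{strategy}".lower().replace(" ", "_")
        rows.append({
            "platform": platform,
            "strategy": strategy,
            "job_name": job_name,
            "assembly_time": "",        # fill in after the job finishes
            "short_read_coverage": "",  # fill in after the job finishes
        })
    return rows


def to_csv(rows):
    """Serialise the rows as CSV text that Excel can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    platforms = ["Illumina", "Ion Torrent"]
    # Assumed list of six strategies; adjust to match the menu.
    strategies = ["auto", "unicycler", "spades", "canu", "metaspades", "plasmidspades"]
    print(to_csv(build_job_template(platforms, strategies)))
```

With two platforms and six strategies this yields twelve rows, one per job to submit, and the job name in each row encodes both choices so the results can be told apart later.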