Notes

Because labs have closed during the coronavirus pandemic, sequencing of the additional samples has been put on hold.

Discussed annotating exomes by aligning targets to assembled scaffolds.
Selected transcripts should maximize presence in both mouse and rat genomes.
The alignment step can be very error-prone. Several steps were discussed to mitigate this, including frameshift-aware alignment with MACSE, checking for premature stop codons in initial alignments, and generating an initial dS distribution of alignments to check for abnormalities.
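
The premature stop codon check mentioned above could be sketched roughly as follows. This is a minimal illustration, not the group's pipeline: it assumes the standard genetic code, FASTA-formatted codon alignments that start in frame, and that '-' and '!' are the only gap/frameshift characters (MACSE writes '!' for frameshifts). The function names are hypothetical.

```python
# Assumption: standard nuclear genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {name: sequence}."""
    parts = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            parts[name] = []
        elif name is not None:
            parts[name].append(line.upper())
    return {n: "".join(chunks) for n, chunks in parts.items()}


def premature_stops(seq, gap_chars="-!"):
    """Return 0-based codon indices of in-frame stops before the last non-gap codon."""
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # The last codon holding any real base is treated as the terminal codon,
    # so a stop there is not counted as premature.
    last = -1
    for i, codon in enumerate(codons):
        if any(ch not in gap_chars for ch in codon):
            last = i
    return [i for i, codon in enumerate(codons[:max(last, 0)])
            if codon in STOP_CODONS]


def flag_alignment(seqs):
    """Map each sequence name to its premature stop positions, skipping clean ones."""
    flagged = {}
    for name, seq in seqs.items():
        stops = premature_stops(seq)
        if stops:
            flagged[name] = stops
    return flagged


if __name__ == "__main__":
    example = ">mouse\nATGAAATAA---\n>rat\nATGTAAAAA---\n"
    print(flag_alignment(parse_fasta(example)))
```

Alignments with any flagged sequence could then be set aside for manual inspection or re-alignment before downstream analyses.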
Molecular evolution analyses will have to account for discordance and multi-nucleotide mutations (MNMs). Using gene trees and an MNM model implemented in HyPhy were brought up as ways to do so.
Pairwise and relative rate tests were discussed as other ways to detect shifts in selective pressures on branches of interest.
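
The notes do not say which relative rate test would be used. As one illustration, Tajima's relative rate test compares two ingroup sequences against an outgroup by counting sites unique to each ingroup lineage. The sketch below assumes three pre-aligned nucleotide sequences of equal length and skips any site with a gap or ambiguous base; the function name is hypothetical. Note that this test detects differences in substitution rate, not selection directly.

```python
import math


def tajima_relative_rate(seq_a, seq_b, outgroup):
    """Tajima's relative rate test on three aligned sequences.

    Returns (m_a, m_b, chi2, p), where m_a counts sites at which only
    seq_a differs (seq_b matches the outgroup) and m_b the reverse.
    """
    if not (len(seq_a) == len(seq_b) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    m_a = m_b = 0
    for a, b, o in zip(seq_a.upper(), seq_b.upper(), outgroup.upper()):
        if not {a, b, o} <= valid:
            continue  # skip gaps and ambiguity codes
        if a != b and b == o:
            m_a += 1
        elif a != b and a == o:
            m_b += 1
    total = m_a + m_b
    if total == 0:
        return m_a, m_b, 0.0, 1.0
    chi2 = (m_a - m_b) ** 2 / total
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return m_a, m_b, chi2, p


if __name__ == "__main__":
    print(tajima_relative_rate("CCCCAAAAAA", "AAAAAAAAAA", "AAAAAAAAAA"))
```

A small p-value suggests the two ingroup lineages have accumulated substitutions at different rates since they diverged.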

A Google doc has been set up to coordinate and plan workflows. The link will be distributed as requested.
A GitHub repository will be set up to compile scripts and data.

Carl and Gregg will combine exome data and work on the best assembly/mapping method. Carl will focus on assembly with SPAdes, while Gregg will develop an iterative mapping approach. We will need to figure out a way to assess and compare approaches. Gregg will set up a Box folder.
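
How the approaches will be compared is still undecided. One widely used contiguity summary that could feed into such a comparison is N50, sketched below under the assumption that scaffold lengths have already been extracted from each assembly. N50 says nothing about correctness, so it would only be one piece of an assessment.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L contain
    at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        return 0
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return ordered[-1]


if __name__ == "__main__":
    # Hypothetical scaffold lengths for illustration only.
    print(n50([80, 70, 50, 40, 30, 20]))
```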

Gregg wants to set up a single, unified location for all raw sequence data (Box folder?), with the top directory being the three folders exon-capture, exomes, and genomes, and sub-folders for each species that would contain reads, mappings, assemblies, etc.
Jake points out that Box has some restrictions, such as a 15 GB limit on single file size and a limit of 4 nested folders. If these aren't a problem then it should be OK. Gregg will keep this in mind while compiling the data.
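
Before uploading, a local copy of the data could be checked against the two restrictions mentioned above. The sketch below is only an illustration: it assumes 15 GB means 15 * 1024^3 bytes and that nesting depth is counted as the number of folder levels below the top directory. Both interpretations should be confirmed against Box itself, and the function name is hypothetical.

```python
import os


def check_box_limits(root, max_file_bytes=15 * 1024 ** 3, max_depth=4):
    """Return (too_big, too_deep) lists of paths relative to root.

    too_big:  files larger than max_file_bytes (assumed 15 GB limit).
    too_deep: folders nested more than max_depth levels below root
              (assumed meaning of the nesting limit).
    """
    root = os.path.abspath(root)
    too_big, too_deep = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == "." else len(rel.split(os.sep))
        if depth > max_depth:
            too_deep.append(rel)
        for fname in filenames:
            path = os.path.join(dirpath, fname)
            if os.path.getsize(path) > max_file_bytes:
                too_big.append(os.path.relpath(path, root))
    return sorted(too_big), sorted(too_deep)


if __name__ == "__main__":
    big, deep = check_box_limits(".")
    print("Files over the size limit:", big)
    print("Folders over the depth limit:", deep)
```

With the proposed layout (top directory, then exon-capture/exomes/genomes, then species, then reads/mappings/assemblies), the data folders would sit three levels below the top directory under this counting.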

For exomes, we are freezing the sampling at what was discussed today. Kevin will send libraries of all the Bunomys clade members to Montana for sequencing.

Gregg will use only a few of the Pseudomys exomes from the 48 Carl is using so the sampling is not too heavy from that single division. Carl and Gregg can discuss which ones would be best to use.

For whole genome sequencing, we are proposing to sequence:
- Phloeomys, Musseromys, Papagomys, and Komodomys for body size/longevity contrasts
- Crossomys and Waiomys for amphibiousness convergence, along with Pseudohydromys and Gracilimus for non-convergent sister species. Kevin notes that Hydromys chrysogaster would be another interesting sample to compare montane and amphibiousness.
- Paucidentomys for worm-sucking convergence (Rhynchomys already sequenced), along with Gracilimus and Apomys for non-convergent sister species.
Sequencing of Notomys was also mentioned, and Kevin notes that there is draft sequence data, possibly from HiSeq 2000.

Gregg will write an NIH NRSA to support the whole genome sequencing. The aims of this grant will focus on phylogenetic discordance, molecular evolution/rate variation, and molecular convergence. For all of these aims, both exomes and genomes will be used to compare these sequencing strategies.

Site designed and maintained by Gregg Thomas | Pure CSS | Page built: 09/02/2020 21:35:49 MDT