Conference call, 04.02.2020
Updates from Jeff:
- Due to labs closing because of the coronavirus pandemic, sequencing of the additional samples has been put on hold.
Updates from Carl:
- First pass of molecular evolution analyses complete for Australian sample.
Updates from Gregg:
- Assembly and mapping complete on full 176 species sample.
Moving forward
- Discussed annotation of exomes via aligning targets to assembled scaffolds.
- Selected transcripts should maximize presence in both mouse and rat genomes.
- The alignment step can be very error-prone. Several steps were discussed to mitigate this, including frameshift-aware alignment with MACSE,
checking for premature stop codons in initial alignments, and generating an initial dS distribution of alignments to check for
abnormalities.
- Molecular evolution analyses will have to account for discordance and multi-nucleotide mutations (MNMs). Using gene trees and an MNM model
implemented in HyPhy were brought up as ways to do so.
- Pairwise and relative rate tests were discussed as other ways to detect shifts in selective pressures on branches of interest.
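A minimal sketch of the premature stop codon check mentioned above, not the agreed workflow. It assumes in-frame codon alignments held as a dict of name to aligned sequence, the standard nuclear genetic code, and that "-", "!", "?", and "N" mark gaps, frameshifts, or missing data; the function names are hypothetical.

```python
# Stop codons under the standard nuclear genetic code (assumption).
STOP_CODONS = {"TAA", "TAG", "TGA"}
# Characters treated as gap/frameshift/missing data (assumption).
NON_SEQUENCE = set("-!?N")


def premature_stops(aligned_seq):
    """Return 0-based codon indices of stop codons that occur before
    the last codon containing real sequence."""
    seq = aligned_seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # Index of the last codon that is not made up only of gap-like characters.
    last = max((i for i, c in enumerate(codons) if set(c) - NON_SEQUENCE),
               default=-1)
    return [i for i, c in enumerate(codons) if c in STOP_CODONS and i < last]


def flag_alignment(alignment):
    """Map each sequence name to its premature stop positions,
    keeping only sequences that have at least one."""
    flagged = {}
    for name, seq in alignment.items():
        hits = premature_stops(seq)
        if hits:
            flagged[name] = hits
    return flagged
```

Sequences flagged this way could be set aside for manual inspection before any dS estimates are made; a terminal stop codon is deliberately not reported.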
Logistics
- A Google doc has been set up to coordinate and plan workflows. The link will be distributed as requested.
- A GitHub repository will be set up to compile scripts and data.
Meeting at LSU, 11.12.2019
Sampling was discussed in detail.
The plan:
- Carl and Gregg will combine exome data and work on the best assembly/mapping method. Carl will focus on assembly with SPAdes,
while Gregg will develop an iterative mapping approach. We will need to figure out a way to assess and compare approaches.
Gregg will set up a Box folder.
- Gregg wants to set up a single, unified location for all raw sequence data (Box folder?), with the top directory holding the three
folders exon-capture, exomes, and genomes, and sub-folders for each species that would contain reads,
mappings, assemblies, etc.
Jake points out that Box has some restrictions, such as a 15 GB single file size limit and only 4 nested folders. If these
aren't a problem then Box should be OK. Gregg will keep this in mind while compiling the data.
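A rough sketch of the proposed layout and the two Box restrictions Jake raised, for checking files before upload. It assumes 15 GB means 15 x 10^9 bytes and that nesting is counted as the number of folders above a file, starting from the three top-level folders; how Box actually counts depth was not discussed. The function names and example paths are hypothetical.

```python
from pathlib import PurePosixPath

DATA_TYPES = ("exon-capture", "exomes", "genomes")  # top-level folders from the notes
SUBFOLDERS = ("reads", "mappings", "assemblies")    # per-species folders from the notes
MAX_FILE_BYTES = 15 * 10**9  # 15 GB limit; decimal GB is an assumption
MAX_FOLDER_DEPTH = 4         # nesting limit; counting convention is an assumption


def planned_folders(species):
    """List the folders the layout would create for the given species names."""
    return [PurePosixPath(data_type) / name / sub
            for data_type in DATA_TYPES
            for name in species
            for sub in SUBFOLDERS]


def check_file(rel_path, size_bytes):
    """Return a list of problems for one file, empty if it fits the limits."""
    path = PurePosixPath(rel_path)
    problems = []
    depth = len(path.parts) - 1  # number of folders above the file
    if depth > MAX_FOLDER_DEPTH:
        problems.append(f"{path}: nested {depth} folders deep")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"{path}: {size_bytes} bytes exceeds the file size limit")
    return problems
```

Under these assumptions, a file at exomes/Notomys/reads/r1.fastq.gz sits 3 folders deep, which leaves room for one more level (for example, per sequencing run) before the nesting limit is reached.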
- For exomes, we are freezing the sampling at what was discussed today. Kevin will send libraries of all the Bunomys clade members to
Montana for sequencing.
- Gregg will use only a few of the Pseudomys exomes from the 48 Carl is using so the sampling is not too heavy from that single division.
Carl and Gregg can discuss which ones would be best to use.
- For whole genome sequencing, we are proposing to sequence:
  - Phloeomys, Musseromys, Papagomys, and Komodomys for body size/longevity contrasts
  - Crossomys and Waiomys for amphibiousness convergence along with Pseudohydromys and Gracilimus for non-convergent sister species. Kevin notes that
  Hydromys chrysogaster would be another interesting sample to compare montane and amphibiousness.
  - Paucidentomys for worm-sucking convergence (Rhynchomys already sequenced) along with Gracilimus and Apomys for non-convergent sister species.
Sequencing of Notomys was also mentioned, and Kevin notes that there is draft sequence data, possibly from HiSeq 2000.
- Whole genome sequencing will be simple shotgun sequencing.
- Gregg will write an NIH NRSA to support the whole genome sequencing. The aims of this grant will focus on phylogenetic discordance, molecular evolution/rate variation,
and molecular convergence. For all of these aims, both exomes and genomes will be used to compare these sequencing strategies.
- We will have a Skype call on Tuesday, November 26 to follow up.
http://dailymammal.com/murines-five-ways/