The evolutionary dynamics of RNA viruses, like those of all organisms, depend on the occurrence of beneficial, deleterious, and neutral mutations. Yet, despite evidence for strong genetic linkage, the majority of studies on viral adaptive evolution focus exclusively on the role that beneficial mutations play in this process. This is particularly the case in modeling studies examining patterns of antigenic evolution in acute, semi-immunizing viral pathogens such as influenza. Here, I develop a model to address the dynamic interplay between beneficial mutations that enable immune escape and deleterious mutations that reduce between-host viral transmission. In an application of this model to influenza viruses in humans, I show that deleterious mutations are expected to make antigenic evolution occur in a punctuated manner, a pattern that has been empirically observed in these viruses. I further show that deleterious mutations are expected to substantially limit influenza’s genetic and antigenic diversity in the long run, such that clonal interference does not need to be invoked to explain this pattern. Finally, I show that the dynamical effect of deleterious mutations provides a parsimonious explanation for two empirical observations in influenza: the occasional occurrence of within-subtype reassortment preceding an antigenic cluster transition and the tendency of antigenic clusters to arise from Asian, or more generally, tropical regions. This work highlights the critical importance of deleterious mutations in shaping patterns of viral adaptive evolution, even if they are not themselves the source of adaptation.
Monday, April 28th, 2014
Estimating the Tree of Life will likely involve a two-step procedure, where in the first step trees are estimated on many genes, and then the gene trees are combined into a tree on all the taxa. However, the true gene trees may not agree with the species tree, due to biological processes such as deep coalescence, gene duplication and loss, and horizontal gene transfer. While methods have been developed to estimate species trees in the presence of incomplete lineage sorting (ILS), the relative accuracy of these methods compared to the usual "concatenation" approach is debated.
In this talk, I will present results showing that coalescent-based estimation methods are impacted by gene tree estimation error, so that they can be less accurate than concatenation in many cases. I will also present new methods for estimating species trees in the presence of gene tree conflict due to ILS that are more accurate than current methods. Key to these methods is addressing gene tree estimation error more effectively. I will also present results using these techniques to estimate species trees for several biological datasets, including the Avian Tree of Life.
Individuals differ substantially in fitness components, but such realized variation in lifespan and reproductive success need not be visible to natural selection, because fitness is not a property of individuals but of groups or genotypes. I will present formulas, rooted in Markov chain theory, to compute exactly the variance in lifespan and in lifetime reproductive success among individuals with identical (genotypic) vital rates. This expected neutral variance is driven solely by individuals transitioning stochastically among stages. I illustrate how the neutral expected variation in fitness components matches that observed in real populations and how ignoring this stochastic variability and labeling it undesired noise can lead to false demographic predictions. I also present data from isogenic bacteria that illustrate that the overwhelming amount of variation in fitness components among individuals is neutral and that the amount of variance detected in the lab does not differ much from that in natural populations.
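As a rough illustration of how variance in lifespan can arise among individuals with identical vital rates, the sketch below applies the standard absorbing Markov chain result (fundamental matrix N = (I - P)^-1, with mean lifespan t = N1 and variance (2N - I)t - t^2) to a discrete-time stage model. It is not taken from the talk: the two-stage juvenile/adult model, its survival probabilities, and the function names `solve` and `lifespan_moments` are assumptions made for this example, and lifetime reproductive success is not covered.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    aug = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: some stage is never left by death")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def lifespan_moments(P):
    """Mean and variance of lifespan (in time steps) for each starting stage.

    P[i][j] is the probability that an individual in stage i survives one
    time step and is then in stage j; each row sums to less than 1, the
    deficit being the probability of death.
    """
    n = len(P)
    i_minus_p = [[(1.0 if i == j else 0.0) - P[i][j] for j in range(n)]
                 for i in range(n)]
    mean = solve(i_minus_p, [1.0] * n)   # t = N 1
    n_t = solve(i_minus_p, mean)         # N t
    # Var = (2N - I) t - t^2, evaluated stage by stage
    var = [2.0 * n_t[i] - mean[i] - mean[i] ** 2 for i in range(n)]
    return mean, var


# Hypothetical two-stage life cycle: juveniles mature with probability 0.5
# (otherwise die); adults survive each step with probability 0.8.
P = [[0.0, 0.5],
     [0.0, 0.8]]
mean, var = lifespan_moments(P)
print(mean, var)
```

Every individual in this toy model has the same transition probabilities, so all of the lifespan variance it reports comes from stochastic stage transitions and deaths rather than from differences in underlying quality.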
Hybridization among eukaryotic organisms gives rise to reticulate evolutionary histories, and subsequent potential introgression results in genome mosaics. Consequently, phylogenetic networks have been introduced to model reticulate evolutionary histories. In this talk, I will present our recent work on modeling the evolution of genes and genomes within the branches of phylogenetic networks and the methods we have developed for their inference. Further, I will describe a new comparative genomic method that combines phylogenetic networks with hidden Markov models in order to detect genomic regions with signatures of introgression.
In this talk I will focus on two common sources of gene tree incongruence in phylogenomic studies: incomplete lineage sorting and gene tree estimation error. In prior theoretical work, both were studied extensively -- but separately. I will describe an analytical framework that bridges the gap by considering simultaneously the effect on reconstruction accuracy of the number of genes and of the number of sites per gene. Using this framework, I will discuss the inconsistency of concatenation methods, the advantages of sequence-based over topology-based methods and evidence that in this setting quantity trumps quality.
Joint work with Gautam Dasarathy, Elchanan Mossel, Rob Nowak, Mike Steel and Tandy Warnow.
Tuesday, April 29th, 2014
Pervasive natural selection can strongly influence observed patterns of genetic variation, but these effects remain poorly understood when multiple selected variants segregate in nearby regions of the genome. Classical population genetics fails to account for interference between linked mutations, which grows increasingly severe as the density of selected polymorphisms increases. I will provide a bit of background on the general problem and existing theoretical approaches, and then describe a simple limit that emerges when interference is common, in which the fitness effects of individual mutations play a relatively minor role. Instead, similar to models of quantitative genetics, molecular evolution is determined by the variance in fitness within the population. I will describe how this insensitivity can be exploited to define a "coarse-grained" model, which approximates the effects of many weakly selected mutations with a smaller number of strongly selected mutations with the same variance in fitness. This approximation generates accurate and efficient predictions for silent site variability when interference is common. However, these results suggest that there is reduced power to resolve individual selection pressures when interference is sufficiently widespread.
Cancer arises via the accumulation and spread of advantageous mutations in healthy tissue. Here we discuss a model of the emergence and spread of oncogenic mutations in a spatially structured population. In this model, each lattice site is occupied by a single cell which replicates at exponential times with rate dictated by the type of the cell; the replicate then replaces one of its lattice neighbors chosen at random. In addition, oncogenic mutations occur at a low rate in the population, creating new, advantageous cell types that may either die out or become 'successful.' We will address the following questions: (1) how fast do mutations spread through tissue, and how does this depend on the parameters of the tissue/cancer type? (2) how quickly do unsuccessful mutations die out? and (3) how long does it take for cancer to initiate?
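The replication rule described above can be sketched as a small simulation. This is only an illustrative toy under simplifying assumptions that are not from the talk: the lattice is a one-dimensional ring, normal cells replicate at rate 1, a single mutant is present at time zero, and no further mutations arise. The names `simulate_ring` and `fixation_frequency` are hypothetical.

```python
import random


def simulate_ring(n_sites, mutant_rate, rng):
    """Run the process on a ring until the mutant type fixes or dies out.

    Each cell replicates after an exponential waiting time (rate 1 for
    normal cells, mutant_rate for mutants); the offspring replaces one of
    the two ring neighbors chosen uniformly at random.
    Returns (outcome, elapsed_time).
    """
    cells = [0] * n_sites          # 0 = normal cell, 1 = mutant cell
    cells[0] = 1                   # start from a single mutant
    rates = (1.0, mutant_rate)
    mutants, t = 1, 0.0
    while 0 < mutants < n_sites:
        total_rate = mutants * mutant_rate + (n_sites - mutants)
        t += rng.expovariate(total_rate)           # time to next replication
        weights = [rates[c] for c in cells]
        parent = rng.choices(range(n_sites), weights=weights)[0]
        target = (parent + rng.choice((-1, 1))) % n_sites
        mutants += cells[parent] - cells[target]   # net change in mutant count
        cells[target] = cells[parent]
    return ("fixed" if mutants == n_sites else "extinct"), t


def fixation_frequency(n_sites, mutant_rate, runs, seed=0):
    """Fraction of independent runs in which the mutant takes over the ring."""
    rng = random.Random(seed)
    fixed = sum(simulate_ring(n_sites, mutant_rate, rng)[0] == "fixed"
                for _ in range(runs))
    return fixed / runs


print(fixation_frequency(n_sites=10, mutant_rate=2.0, runs=500))
```

On a ring the mutants always form a contiguous block, so the block size performs a biased random walk and the fixation frequency can be compared with the gambler's-ruin value (1 - 1/r) / (1 - r^(-N)) for relative rate r and N sites. Higher-dimensional lattices, which are the more relevant setting for tissue, do not reduce to this simple walk.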
Epithelial cancers often emerge within genetically altered fields of premalignant cells that appear histologically normal but have a high chance of progression to malignancy. Clinical consequences of this so-called field cancerization are multifocal lesions and high recurrence risks after excision of the primary tumor. We develop a spatiotemporal stochastic model of field cancerization, combining evolutionary dynamics at the phenotypic level with a general framework for multi-stage progression to cancer. Based on the model, we derive probabilistic distributions for clinically relevant quantities such as size of the invisible premalignant field at time of cancer diagnosis, and risk of local and distant recurrences. Finally, we discuss how our model can be combined with patient-specific measurements to optimize surgical excision margins and post-operative monitoring.
Marc Ryser is no longer able to attend this workshop. Rick Durrett will be giving this talk instead.
Wednesday, April 30th, 2014
Environmental change can drive a population extinct unless it succeeds in adapting to the new conditions. How likely is a population to win the race between population decline and adaptive evolution? Previous models have addressed this issue in a panmictic population. In our study, we assume that environmental degradation progresses across a habitat, allowing for temporal refuges. We analyze the impact of several ecological factors on the probability of evolutionary rescue in this case, such as: (1) the strength of the population structure, (2) the speed and severity of the environmental change, (3) the influence of density-dependent competition, (4) the relative contributions of rescue from standing genetic variation and from new mutations. Our analysis is based on the mathematical theory of time-inhomogeneous branching processes. We find that in the interplay of various, partially antagonistic effects, the probability of evolutionary rescue can show nontrivial and unexpected dependence on ecological characteristics. In particular, we generally observe a nonmonotonic dependence on the migration rate between islands. Counterintuitively, under some circumstances, evolutionary rescue can occur more readily in the face of harsher environmental shifts, because of the reduced competition experienced by mutant individuals. Similarly, rescue sometimes occurs more readily when the entire habitat degrades rapidly, rather than progressively over time, particularly when migration is high and competition strong.
Joint work with Hildegard Uecker and Sally Otto; Uecker et al., AmNat 2014
Given a spatially structured population (of constant size) and a novel, highly beneficial allele, we are dealing with asymptotics of the fixation time of this mutant. We use the selection strength (called $\alpha$) as a scaling parameter and find various regimes dependent on the scaling of the migration strength (called $\mu$) with $\alpha$. We treat the cases $\mu \sim \alpha^p$ for $0\leq p\leq 1$ and $\mu \sim 1/\log(\alpha)$. The main tool for our proofs is an equilibrium version of a dual process, which stems from the ancestral selection graph of Krone and Neuhauser.
Joint work with Andreas Greven (Erlangen), Cornelia Pokalyuk (Lausanne) and Anton Wakolbinger (Frankfurt).
Evolutionary biologists use phylogenetic trees of extant species to study the speciation and extinction patterns of different groups of species. Most inferences are based on methods that use type-independent evolutionary rates of speciation and extinction, but recently a number of type-dependent simulation methods have come into use (Maddison et al., FitzJohn et al., Igic & Goldberg), with important consequences for the validity of some previously held beliefs. A fair amount is known about the probability distribution of ancestral trees derived from single-type branching processes, while much less is known about the same objects for multi-type branching processes (other than asymptotic results). In this talk I will present a few results in this direction. First, there is an algorithmic way to construct an ancestral tree of the standing population of a multi-type branching process in terms of a Markov chain (of vectors of types and multiplicities). This construction allows one to get explicit formulae for calculating: (a) statistical features that describe the shape of the tree (the law of coalescence times together with types on the ancestral lineages), and (b) statistical features that link types in the standing population with the shape of the tree (the law of same-type coalescence times). Second, explicit calculations can be used to compare the effect that different branching mechanisms have on the distributions of ancestral trees. I will illustrate this in a simple example of a two-type process with completely asymmetrical vs symmetrical probabilities of offspring types.
A popular line of research in evolutionary biology is to use time-calibrated phylogenies to understand the process of diversification. This is done by performing statistical inference under stochastic models of species diversification. These models thus need to be robust, biologically sound and mathematically tractable.
We first introduce some new lineage-based, stochastic models of phylogenies, featuring, e.g., protracted speciation or age-dependent extinction. Using recent mathematical results allowing the computation of tree likelihoods, we present ML parameter estimates inferred from the recent bird phylogeny of Jetz et al.
Our goal is then to obtain (these or other) lineage-based models of phylogenies, starting from an individual-based description of populations. We present in particular two non-exchangeable models of phylogenetic trees thus obtained. In the first one, speciation is modelled by genetic differentiation of individual lineages. The second one is a scaling limit of the Tilman-Levins type model where interspecific competition is only felt by older species from younger species.
Thursday, May 1st, 2014
This will be an informal attempt to marshal my thoughts and seek help from workshop attendees.
The within-host reproductive rate of pathogens is a useful proxy for what are often called virulence tradeoffs. The faster a pathogen reproduces, the more attention it attracts from the host, leading to increased likelihood of morbidity, mortality or effective immune responses. Conversely, faster reproduction can lead to higher "titers" and more effective dispersal. Pathogens need to balance these tradeoffs at different levels. Pathogens such as rabies or anthrax, which spread by killing their hosts, will see these tradeoffs differently, but may still see them operate at the level of subpopulations.
Theoretical studies of pathogens frequently focus on the time scale of epidemiological and evolutionary processes at the host population level. In viruses such as influenza and human immunodeficiency virus (HIV), however, mutations are generated sufficiently quickly that phenotypic change can occur over the shorter time scale of a single infection. I will describe some stochastic process models for this within-host evolution, both in the specific cases of influenza and HIV and in a general framework where selection acts concurrently at both within-host and between-host levels. The modeling techniques used range from stochastic simulations for the development of broadly neutralizing anti-HIV antibodies, to the theory of measure-valued processes applied to a ball-and-urn formulation of multilevel selection. In the case of seasonal influenza and HIV, these studies suggest improvements on current or planned vaccination strategies which may not have been identified without a specific within-host model. At a more abstract level, they also suggest general rules of thumb for the evolution of pathogens that experience antagonistic selection between and within hosts.
The antibody repertoire of each individual is continuously updated by the evolutionary process of B cell receptor mutation and selection. It has recently become possible to gain detailed information concerning this process through high-throughput sequencing. In this talk I will describe our work on statistical molecular evolution methods for the analysis of B cell sequence data, and their application to a very deep short-read data set of B cell receptors. We find that the mutational process is conserved across individuals but varies significantly across gene regions. We investigate selection on B cell receptors using a novel method that side-steps the difficulties encountered by previous work in differentiating between selection and motif-driven mutation; this is done through stochastic mapping and empirical Bayes estimators that compare the evolution of in frame and out of frame rearrangements. We use this new method to derive a per-residue map of selection, which we find is dominated by purifying selection, though not uniformly so.
I will talk about several theoretical results on how species adapt in continuous geography, and one descriptive tool designed for use on data. I aim to answer the following questions: When does a species faced with a new selective pressure adapt as a unit, and when do different solutions to the same evolutionary problem arise in parallel in different parts of the range? What about the case of a patchy environment: i.e. when should local adaptations be shared versus heterogeneous? How could we distinguish the two? These questions have surprisingly elegant answers, at least in the large-population-density regime, thanks to stochastic tools going back to Fisher and Kolmogorov. I will also describe a method, currently not motivated by an explicit spatial model, that uses Gaussian random fields to estimate the relative effects of simple geographic distance and environmental differences on genetic isolation.
Historically, models of quantitative trait evolution have assumed that individual mutations rarely, if ever, produce substantial changes in phenotype. Genome-wide association studies suggest that these infinitesimal models may be appropriate for some traits, like human height. However, experimental manipulations of other traits, such as gene expression, suggest that mutations that drastically alter phenotype may be common. In joint work with Michael Landis at UC Berkeley, we are exploring the development of models of quantitative trait evolution that can handle arbitrary mutational effects. We developed a neutral model using coalescent theory and I will show how this framework can be used to calculate the characteristic function for a sample of individuals from a panmictic population. We also found several scaling limits (as the number of loci controlling the trait increases), including the classical normally distributed limits and a class of alpha-stable limits in which the sampling distribution is fat-tailed. We conjecture a powerful interpretation of these limits based on de Finetti's theorem. We then applied our models to analyze gene expression data from the fungus Neurospora crassa and found that between 20 and 40 genes provide evidence for fat-tailed mutational effects.
The evolutionary theory of ageing lies in a complicated intersection among evolutionary genetics, population ecology, and mathematical demography. It is impossible to form a coherent model of the evolution of senescence without taking account of overlapping generations, environmental stochasticity, and genetic pleiotropy and epistasis on a grand scale, in addition to the elements that are standard in mathematical models of evolution. This talk will outline some recent progress and major outstanding questions, including an account of a general framework developed with Steve Evans and Ken Wachter, for analysing mutation-selection equilibrium in contexts with large numbers of nearly neutral mutations.
Friday, May 2nd, 2014
The genetic process of mutation accumulation holds particular interest for demographers, as it provides a plausible mechanism through which regularities across species in age-specific schedules of demographic rates might be explained. In Evans, Steinsaltz, and Wachter (2013) (AMS Memoir 222) a non-linear model for mutation accumulation with demographic structure was developed, in which recombination is assumed to operate on a more rapid timescale than mutation and selection, leading to a deterministic dynamic system within an infinite population context. In this talk, I discuss efforts to relate assumptions and predictions from this model to statistical summaries that can be obtained from genomic sequencing studies. I explore possible connections between the deterministic dynamical system and certain mean values calculated from finite-population Wright-Fisher models incorporating drift. A key role in empirical calibration is played by a generalization of Haldane's Principle which relates total mutation rates to a function of total loss in fitness.
TBA
The adaptive dynamics of an evolving population is constrained by epistatic interactions encoded in the underlying fitness landscape. In recent years empirical data sets have become available that offer glimpses into the structure of real fitness landscapes and motivate theoretical work on classic probabilistic models such as NK-landscapes and the rough Mount Fuji model. In this talk I report on recent efforts aimed at quantifying the accessibility of these landscapes under asexual adaptation. Results concerning structural properties, such as the number of local maxima and the existence of fitness-monotonic pathways, as well as on the dynamics of adaptive walks will be presented, with particular emphasis on the subtle role of genetic architecture in the NK-model. If time permits, the role of recombination in speeding up or slowing down adaptation on rugged fitness landscapes will also be briefly addressed.
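To make one of the structural properties mentioned above concrete, the sketch below builds a small NK landscape and counts its local maxima by exhaustive enumeration. It is a minimal illustration, not the speaker's method: it assumes binary loci, an adjacent (cyclic) interaction neighborhood, and uniform random fitness contributions, and the names `make_nk` and `count_local_maxima` are hypothetical.

```python
import itertools
import random


def make_nk(L, K, rng):
    """Return a fitness function for an NK landscape on binary genotypes.

    Locus i contributes a random value that depends on its own state and on
    the states of the next K loci (cyclically); fitness is the sum of the L
    contributions. K = 0 is additive, K = L - 1 is fully random.
    """
    tables = [[rng.random() for _ in range(2 ** (K + 1))] for _ in range(L)]

    def fitness(genotype):
        total = 0.0
        for i in range(L):
            index = 0
            for j in range(K + 1):
                index = 2 * index + genotype[(i + j) % L]
            total += tables[i][index]
        return total

    return fitness


def count_local_maxima(L, fitness):
    """Count genotypes that are fitter than all L single-mutation neighbors."""
    count = 0
    for g in itertools.product((0, 1), repeat=L):
        f = fitness(g)
        neighbors = (g[:i] + (1 - g[i],) + g[i + 1:] for i in range(L))
        if all(f > fitness(h) for h in neighbors):
            count += 1
    return count


rng = random.Random(0)
for K in (0, 2, 7):
    landscape = make_nk(8, K, rng)
    print(K, count_local_maxima(8, landscape))
```

With K = 0 the landscape is additive and has a single maximum, while larger K tends to produce more local maxima, each of which is an endpoint where an adaptive walk that accepts only beneficial single mutations would stop. Exhaustive enumeration is only practical for small L, since the number of genotypes grows as 2^L.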