Mathematics Home / Colloquium: Spring 2019

(Tentative Schedule)

Time & Location: All talks are on Thursdays in Dinwiddie 102 at 3:30 pm unless otherwise noted. Refreshments in Gibson 426 after the talk.

Organizer: Gustavo Didier

A moment-based approach to estimating molecular heterogeneity

Timothy Daley, Stanford (Host: Michelle Lacey)

Abstract:

In modern applications of high-throughput sequencing technologies, researchers may be interested in quantifying the molecular diversity of a sample (e.g., T-cell repertoire, transcriptional diversity, or microbial species diversity). In these sampling-based technologies an important detail is often overlooked in both the analysis of the data and the design of the experiments: the sampled observations often do not give a fully representative picture of the underlying population. This has long been a recognized problem in statistical ecology and in the broader statistics literature, commonly known as the missing species problem.

In classical settings, the size of the sample is usually small. New technologies such as high-throughput sequencing have allowed for the sampling of extremely large and heterogeneous populations at scales not previously attainable or even considered. New algorithms are needed that exploit the scale of the data to account for heterogeneity while remaining fast enough to scale with the size of the data. I will discuss a moment-based approach for estimating the missing species based on an extension of Chao's moment-based lower bound (Chao, 1984). We apply results from the classical moment problem to show that solutions can be obtained efficiently, allowing for estimators that are simultaneously conservative and make use of more information. By connecting the rich theory of the classical moment problem to the missing species problem, we can also clear up issues in the identifiability of the missing species.
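The starting point here, Chao's (1984) lower bound, is easy to state: if f1 and f2 are the numbers of species observed exactly once and exactly twice, then S_obs + f1^2/(2 f2) bounds the total number of species from below. A minimal sketch of that classical bound (not the speaker's extended moment-based estimator):

```python
from collections import Counter

def chao1_lower_bound(sample):
    """Chao's (1984) moment-based lower bound on the total number of species:
    S_obs + f1^2 / (2 * f2), where f1/f2 count species seen exactly once/twice."""
    counts = Counter(sample)
    s_obs = len(counts)                               # distinct species observed
    f1 = sum(1 for c in counts.values() if c == 1)    # singletons
    f2 = sum(1 for c in counts.values() if c == 2)    # doubletons
    if f2 == 0:
        # standard bias-corrected variant used when no doubletons are seen
        return s_obs + f1 * (f1 - 1) / 2.0
    return s_obs + f1 * f1 / (2.0 * f2)
```

For a sample with 4 observed species, 2 singletons, and 2 doubletons, the bound is 4 + 4/4 = 5 species.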

A theoretical framework of the scaled Gaussian stochastic process for calibrating imperfect mathematical models

Mengyang Gu, Johns Hopkins (Host: Gustavo Didier)

Abstract:

Model calibration, or data inversion, involves using experimental or field data to estimate the unknown parameters of a mathematical model. This task is complicated by the discrepancy between the model and reality, and by possible bias in the field data. The discrepancy is often modeled by a Gaussian stochastic process (GaSP), yet many studies have observed that the calibrated mathematical model can still be far from reality. Here we show that modeling the discrepancy function via a GaSP often leads to inconsistent estimation of the calibration parameters, even with an infinite number of repeated experiments and an infinite number of observations in each experiment. In this work, we develop the scaled Gaussian stochastic process (S-GaSP), a new stochastic process for modeling the discrepancy function in calibration. We establish an explicit connection between the GaSP and S-GaSP through an orthogonal series representation. We show that the predictive mean estimator in the S-GaSP calibration model converges to the reality at the same rate as the one from the GaSP model, and that the calibrated mathematical model in the S-GaSP calibration converges to the one minimizing the $L_2$ loss between the reality and the mathematical model, whereas the GaSP calibration model does not have this property. The scientific goal of this work is to use multiple interferometric synthetic-aperture radar (InSAR) interferograms to calibrate a geophysical model for Kilauea Volcano, Hawaii. Analysis of both simulated and real data confirms that our approach outperforms alternatives in both prediction and calibration. Both the GaSP and S-GaSP calibration models are implemented in the "RobustCalibration" R package on CRAN.

The normal scores estimator for the high-dimensional Gaussian copula model

Yue Zhao

Abstract:

The (semiparametric) Gaussian copula model consists of distributions whose dependence structure is described by a Gaussian copula but whose marginals are arbitrary. A Gaussian copula is in turn determined by a Euclidean parameter $R$ called the copula correlation matrix. In this talk we study the normal scores (rank correlation coefficient) estimator of $R$, also known as the van der Waerden coefficient, in high dimensions. It is well known that in fixed dimensions the normal scores estimator is the optimal estimator of $R$, i.e., it has the smallest asymptotic covariance. Curiously, though, the preferred estimators of $R$ in high dimensions are nowadays usually based on Kendall's tau or Spearman's rho. We show that the normal scores estimator in fact remains the optimal estimator of $R$ in high dimensions. More specifically, we show that the approximate linearity of the normal scores estimator in the efficient influence function, which in fixed dimensions implies its optimality, holds in high dimensions as well.
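The normal scores construction replaces each observation by the Gaussian quantile of its rescaled rank, Phi^{-1}(rank/(n+1)), and then correlates the scores. A minimal sketch for one pairwise entry of $R$, assuming continuous margins (no ties):

```python
from statistics import NormalDist, mean

def van_der_waerden_corr(x, y):
    """Normal scores (van der Waerden) estimate of one entry of the copula
    correlation matrix R: correlate the scores Phi^{-1}(rank / (n + 1))."""
    n = len(x)
    inv_cdf = NormalDist().inv_cdf

    def scores(v):
        rank = {val: i + 1 for i, val in enumerate(sorted(v))}  # no ties assumed
        return [inv_cdf(rank[val] / (n + 1)) for val in v]

    sx, sy = scores(x), scores(y)
    mx, my = mean(sx), mean(sy)
    num = sum((a - mx) * (b - my) for a, b in zip(sx, sy))
    den = (sum((a - mx) ** 2 for a in sx) * sum((b - my) ** 2 for b in sy)) ** 0.5
    return num / den
```

Because the transform depends only on ranks, the estimate is invariant to the (arbitrary) marginal distributions, which is the point of the copula setting.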

Limit theorems for eigenvectors of adjacency and normalized Laplacian for random graphs

Abstract:

We prove a central limit theorem for the components of the eigenvectors corresponding to the d largest eigenvalues of the normalized Laplacian matrix of a finite dimensional random dot product graph. As a corollary, we show that for stochastic blockmodel graphs, the rows of the spectral embedding of the normalized Laplacian converge to multivariate normals and furthermore the mean and the covariance matrix of each row are functions of the associated vertex's block membership. Together with prior results for the eigenvectors of the adjacency matrix, we then compare, via the Chernoff information between multivariate normal distributions, how the choice of embedding method impacts subsequent inference. We demonstrate that neither embedding method dominates with respect to the inference task of recovering the latent block assignments.
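The normalized Laplacian spectral embedding described above can be sketched in a few lines: sample a blockmodel graph, form D^{-1/2} A D^{-1/2}, and scale its top-d eigenvectors by the square roots of the eigenvalue magnitudes. The two-block parameters below are illustrative, not from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative two-block stochastic blockmodel: vertices 0..49 in block 0, 50..99 in block 1.
n, p_in, p_out = 100, 0.6, 0.05
blocks = np.repeat([0, 1], n // 2)
P = np.where(blocks[:, None] == blocks[None, :], p_in, p_out)
upper = np.triu(rng.random((n, n)) < P, 1)
A = (upper | upper.T).astype(float)                  # symmetric adjacency, no self-loops

# Normalized Laplacian spectral embedding: scale the top-d eigenvectors of
# D^{-1/2} A D^{-1/2} by the square roots of the eigenvalue magnitudes.
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
L = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
evals, evecs = np.linalg.eigh(L)
top = np.argsort(np.abs(evals))[-2:]                 # d = 2 blocks
X = evecs[:, top] * np.sqrt(np.abs(evals[top]))      # one embedded row per vertex
```

Per the limit theorem in the abstract, the rows of X concentrate around one point per block, which is why clustering the rows recovers block memberships.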

Location: Gibson Hall 126A

Time: 3:30

Gerardo Chowell, Georgia State (Host: James Hyman and Zhuolin Qu)

Abstract:

Forecasting the trajectory of social dynamic processes such as the spread of infectious diseases poses significant challenges that call for methods accounting for both data and model uncertainty. Here we introduce a frequentist computational bootstrap approach that weights the uncertainty derived from a set of plausible models to build an ensemble model for sequential forecasting. The power and transparency of this approach are illustrated in the context of simple dynamic differential-equation models, which we confront with real and simulated outbreak trajectories. For illustration, we generate sequential short-term ensemble forecasts of epidemic outbreaks by combining the strengths of phenomenological models that incorporate flexible epidemic growth scaling, namely the Generalized-Growth Model (GGM) and the Generalized Logistic Model (GLM). With our ensemble approach, we also address lessons from the Ebola forecasting challenge, with a particular focus on improving short-term forecasts of outbreaks that may involve a temporary downturn in case incidence.
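The GGM mentioned above is a one-parameter growth law, C'(t) = r C(t)^p, whose deceleration parameter p interpolates between constant incidence (p = 0) and exponential growth (p = 1). A minimal forward-Euler sketch of that single component (illustrative only; not the speaker's ensemble machinery):

```python
def ggm_trajectory(r, p, c0, h, steps):
    """Generalized-Growth Model: C'(t) = r * C(t)**p, integrated by forward
    Euler.  The deceleration parameter p in [0, 1] interpolates between
    constant incidence (p = 0) and exponential growth (p = 1)."""
    c, out = c0, [c0]
    for _ in range(steps):
        c += h * r * c ** p
        out.append(c)
    return out
```

With p strictly between 0 and 1 the model produces the sub-exponential growth that motivates its use in early-outbreak forecasting.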

Stability in the homology of configuration spaces

Jennifer Wilson, University of Michigan (Host: Mentor Stafa)

Abstract:

This talk will illustrate some patterns in the homology of the configuration space F_k(M), the space of ordered k-tuples of distinct points in a manifold M. For a fixed manifold M, as k increases, we might expect the topology of these configuration spaces to become increasingly complicated. Church and others showed, however, that when M is connected and open, there is a representation-theoretic sense in which the homology groups of these spaces stabilize. In this talk I will explain these stability patterns, and describe higher-order stability phenomena -- relationships between unstable homology classes in different degrees -- established in joint work with Jeremy Miller. This project was inspired by work-in-progress of Galatius--Kupers--Randal-Williams.

Convex integration and phenomenologies in turbulence

Abstract:

In this talk, I will discuss a number of recent results concerning wild weak solutions of the incompressible Euler and Navier-Stokes equations. These results build on the groundbreaking works of De Lellis and Székelyhidi Jr., who extended Nash's fundamental ideas on flexible isometric embeddings into the realm of fluid dynamics. These techniques, which go under the umbrella name of convex integration, have fundamental analogies with the phenomenological theories of hydrodynamic turbulence.

Queueing networks in heavy traffic: History and some recent results

Arka Ghosh, Iowa State

Abstract:

Stochastic processing networks arise as models in manufacturing, telecommunications, transportation, computer systems, the customer service industry, and biochemical reaction networks. Common characteristics of these networks are that they have entities (jobs, packets, vehicles, customers, or molecules) that move along routes, wait in buffers, receive processing from various resources, and are subject to the effects of stochastic variability through such quantities as arrival times, processing times, and routing protocols. The mathematical theory of queueing aims to understand, analyze, and control congestion in stochastic processing networks. In this talk, we will review some of the major developments in the last century with more emphasis on some common approximations used in the last couple of decades. In particular, we will discuss broad results for control of large networks as well as more detailed results for control of specific smaller networks, under heavy traffic approximations.

Location: Richardson Building 204

Time: 1:30

Combinatorial Reciprocity Theorems

Matthias Beck, San Francisco State University (Host: Amdeberhan)

Abstract:

A common theme in enumerative combinatorics is counting functions that turn out to be polynomials. For example, one proves in any introductory graph theory course that the number of proper k-colorings of a given graph G is a polynomial in k, the chromatic polynomial of G. Combinatorics is abundant with polynomials that count something when evaluated at positive integers, and many of these polynomials have a (completely different) interpretation when evaluated at negative integers: these instances go by the name of combinatorial reciprocity theorems. For example, when we evaluate the chromatic polynomial of G at -1, we obtain (up to a sign) the number of acyclic orientations of G, that is, those orientations of G that do not contain a coherently oriented cycle. Combinatorial reciprocity theorems appear all over combinatorics. This talk will attempt to show some of the charm (and usefulness!) these theorems exhibit. Our goal is to weave a unifying thread through various combinatorial reciprocity theorems by looking at them through the lens of geometry.
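The chromatic reciprocity above (Stanley's theorem: (-1)^n P_G(-1) equals the number of acyclic orientations of G) can be checked by brute force on a small example. For the triangle K_3, P(k) = k(k-1)(k-2), so P(3) = 6 proper 3-colorings and |P(-1)| = |(-1)(-2)(-3)| = 6 acyclic orientations. A brute-force sketch:

```python
from itertools import product

def chromatic_count(vertices, edges, k):
    """Count proper k-colorings of the graph by brute force."""
    return sum(
        all(coloring[u] != coloring[v] for u, v in edges)
        for coloring in (dict(zip(vertices, cs))
                         for cs in product(range(k), repeat=len(vertices)))
    )

def acyclic_orientations(vertices, edges):
    """Count orientations of the edges that contain no directed cycle."""
    def has_cycle(adj):
        WHITE, GRAY, BLACK = 0, 1, 2          # standard DFS colors
        color = {v: WHITE for v in vertices}
        def dfs(v):
            color[v] = GRAY
            for w in adj[v]:
                if color[w] == GRAY or (color[w] == WHITE and dfs(w)):
                    return True
            color[v] = BLACK
            return False
        return any(color[v] == WHITE and dfs(v) for v in vertices)

    total = 0
    for direction in product((0, 1), repeat=len(edges)):
        adj = {v: [] for v in vertices}
        for (u, v), d in zip(edges, direction):
            (adj[u] if d else adj[v]).append(v if d else u)
        if not has_cycle(adj):
            total += 1
    return total
```

Both counts come out to 6 for the triangle, matching the two evaluations of its chromatic polynomial.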

Degenerate Variational Integrators

John Finn, Los Alamos National Laboratory (Host: James Hyman)

Abstract:

I will introduce variational integrators for finite-dimensional ODE systems based on discretizing a variational principle. The advantage of such a procedure is that, if done with care, it preserves important geometric properties of the original system. The presentation will start with simple examples showing the utility of discretizing a variational integral rather than deriving the differential equations and discretizing these. For Lagrangian systems (with convexity properties) a phase space variational principle (Hamilton's principle) can be derived, producing the Hamiltonian equations of motion, a system of first order (rather than second order) equations. Discretization must be done carefully in order to avoid obtaining a system of higher order, which can lead to parasitic instabilities. Such a discretization leads to a degenerate variational integrator, a form of symplectic integrator. I will briefly discuss discretizations for Hamiltonian systems with canonical variables as well as important examples with noncanonical variables. I will briefly discuss the extension of these integrators to those with higher order accuracy and those with adaptive time stepping.
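As a concrete illustration of the idea (a standard textbook example, not taken from the talk): the Störmer-Verlet (leapfrog) scheme for the harmonic oscillator is a variational integrator, obtained by discretizing the action rather than the equations of motion, and its symplecticity keeps the energy error bounded for all time rather than drifting:

```python
def leapfrog(q0, p0, omega, h, steps):
    """Stormer-Verlet (leapfrog) integrator for H = p^2/2 + omega^2 q^2 / 2.
    It arises from discretizing the action integral; being symplectic, it
    keeps the energy error bounded instead of accumulating secular drift."""
    q, p = q0, p0
    for _ in range(steps):
        p -= 0.5 * h * omega ** 2 * q     # half kick
        q += h * p                        # drift
        p -= 0.5 * h * omega ** 2 * q     # half kick
    return q, p
```

After 10,000 steps of size h = 0.01 the energy of an orbit started at (q, p) = (1, 0) still differs from 1/2 by only O(h^2), in contrast with forward Euler, which spirals outward.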