The Third Workshop on Monte Carlo Methods

Complete Program Schedule

Sunday May 13, 2007

8:00-8:45

Breakfast, onsite registration (Harvard Hall 104)

8:45-9:00

Jun Liu
Opening Remark

 

Plenary Session: Harvard Hall 104

9:00-9:45

Applying MCMC to a Classic Problem in Statistics: Estimating dose-response when only treatment versus placebo is randomized -- the Efron-Feldman (1991 JASA) data reanalyzed

Don Rubin

Harvard University

9:45-10:30

Optimal Filtering of Jump-Diffusions: Extracting Latent States from Asset Prices

Nicholas Polson

University of Chicago

 

In this paper, we develop an approach for filtering state variables in the setting of continuous-time jump-diffusion models. Our method computes the filtering distribution of latent state variables conditional only on discretely observed observations in a manner consistent with the underlying continuous-time process. We provide simulation evidence to verify that our method provides accurate inference. As an application, we apply the methodology to the multivariate jump models in Duffie, Pan and Singleton (2000) using daily S&P 500 returns from 1980-2000 and we investigate option pricing implications. (Joint work with Michael Johannes and Jonathan Stroud)

Break

Parallel Sessions
Session A: Harvard Hall 103
Session B: Harvard Hall 102

10:50-11:30

Session A

Transition state ensemble of protein folding and evolutionary selection pressure through Sequential Monte Carlo and MCMC

Jie Liang

University of Illinois at Chicago

 

The transition state ensemble (TSE) is a set of intermediate conformations of protein molecules that dictates whether protein chains will eventually fold. Although parameters related to TSE can be measured experimentally, the size and shape of conformations in TSE have been elusive, as TSE is short-lived. Here we develop a strategy based on Sequential Monte Carlo to sample TSE effectively. For the protein acylphosphatase, we found TSE has diverse conformations. In contrast to previous results by Vendruscolo et al, we found overall TSE can be very different from native structure of proteins (with RMSD>12AA). Analysis of the TSE conformations suggests a likely scenario of protein folding process for acylphosphatase. To analyze how selection pressure acts on different sites of acylphosphatase and how they are related to the formation of TSE, we estimate the $\omega$ ratio (synonymous vs non-synonymous codon substitution) along the protein sequence. Existing methods require the input of DNA sequences, hence precludes their applications to remotely related sequences, where saturated substitutions occur. We describe a new method to detect selection pressure directly from amino acid sequences by applying a Bayesian MCMC method to a model where codon usages are regarded hidden parameters. (Joint work with Rong Chen, Ming Lin, and Zheng Ouyang)


Chair: Sam Kou

Session B

A (Minor) Miracle: Diagonalization of a Bose-Einstein Markov Chain

Jim Fill

The Johns Hopkins University

 

Over the past twenty years or so, quite a few techniques have been developed for bounding the mixing times of Markov chains. However, it remains true that the most thorough analysis can be given in the (miraculous) case that the chain's one-step transition matrix can be explicitly diagonalized. We present one such "miracle" for an ergodic chain whose stationary distribution is Bose-Einstein configurations (i.e., the uniform distribution over configurations) of n balls in k boxes, and another for a generalization of this chain involving the Ewens distribution on permutations. It is quite elementary to sample directly from Bose-Einstein configurations, but the chain itself is of interest as an example of the "big-moves" Burnside process, introduced by computer scientists Mark Jerrum and Leslie Goldberg and closely connected with Polya's theory of counting, for sampling uniformly from the orbits of a finite set under the action of a finite group. (This is joint work with Persi Diaconis.)


Chair: Paul Edlefsen

11:30-12:10

Session A

In Silico Folding of Small Proteins

Ulrich H.E. Hansmann

Michigan Technological University

John von Neumann Institute for Computing

 

The successful deciphering of the human genome has highlighted an old challenge in protein science: for most of the resolved protein sequences we do not know the corresponding structures and functions. Neither do we understand in detail the mechanism by which a protein folds into its biologically active form. Computer experiments offer one way to evaluate the sequence-structure relationship and the folding process but are extremely difficult for detailed protein models. This is because the energy landscape of all-atom protein models is characterized by a multitude of local minima separated by high energy barriers. Only over the last few years have been algorithms developed that allow one to overcome this multiple-minima problem in protein simulations. Prominent examples of these new techniques are parallel tempering and generalized-ensemble sampling. I will discuss these techniques and ways for their improvement in the context of simulations of small proteins (of size 30-60 residues).

We study for these molecules the folding mechanism and the relation between secondary structure formation and folding. Limitations of current energy functions are discussed.


Chair: Sam Kou

Session B

A Bayesian reassessment of k-nearest-neighbour classification

Christian Robert (joint work with Jean-Michel Marin and Mike Titterington)

Ceremade – University of Paris-Dauphine 

 

The k-nearest-neighbor procedure is a well-known deterministic method used in supervised classification. While it has been superseded by more recent methods developed in machine learning, it remains an essential tool for classifiers. This paper proposes a reassessment of this approach as a statistical technique derived from a proper probabilistic model; in particular, we modify the assessment made in a previous Bayesian analysis of this method undertaken by Holmes and Adams (2002) where the probabilistic underlying model is not completely well-defined. Once the probabilistic bases of the $k$-nearest-neighbour procedure are established, we proceed to the derivation of practical computational tools to conduct Bayesian inference on the parameters of the corresponding model. In particular, we assess the difficulties inherent to pseudo-likelihood and to path sampling approximations, and propose a perfect sampling strategy to implement a correct MCMC sampler associated with our model. Illustrations of the performances of the corresponding Bayesian classifier are provided for two benchmark datasets.

Chair: Paul Edlefsen

Lunch (Harvard Hall 104)

1:30-2:10

Session A

Bayesian Methods to Infer Protein Interactions

Hongyu Zhao

Yale University

 

Protein-protein interactions play important roles in most fundamental cellular processes. Therefore, it is important to develop effective statistical approaches to predicting protein interactions based on recently available large-scale yet noisy experimental data. In this talk we propose Bayesian methods to predict protein interactions based on interactions among domains, the functional units of proteins. We also propose a new model to associate protein interaction probabilities with domain interaction probabilities.

When our Bayesian methods are compared with a likelihood-based approach, our methods have smaller mean square errors through both simulations and theoretical justification under a special scenario. The large-scale protein-protein interaction data obtained from high throughput yeast two-hybrid experiments are used to demonstrate the advantages of the Bayesian approaches. This is joint work with Inyoung Kim and Yin Liu.

Chair: Paul Edlefsen

Session B

Statistical Physics and Statistical Computing: A Critical Link

Xiao-Li Meng

Harvard University

 

The main purpose of this talk is to demonstrate the fruitfulness of cross-fertilization between statistical physics and statistical computation, by focusing on the celebrated

Swendsen-Wang algorithm for Ising model and its recent perfect sampling implementation by M. Huber. In particular, by introducing Hellinger derivative as a measure of instantaneous changes of distributions, we provide a probabilistic insight into the algorithm's critical slowing down at the phase transition point. Namely, at or near the phase transition, an infinitesimal change in the temperature parameter of the Ising model causes an astronomical shift in the underlying state distribution. This suggests a critical slowing down in any Markov chain Monte Carlo algorithm that relies on stationarity for convergence. This insight allows us to approximate critical points of the Ising model, which are physics properties, by monitoring the coupling time of Huber's bounding chain algorithm, which is an algorithmic characteristic. This finding might provide another useful way of approximating criticality of thermodynamic systems, which is typically intractable analytically. We also speculate that whether we can turn perfect sampling from a cute pet into a workhorse for general scientific computation may depend critically on how successful we can engage, in its development, researchers from statistical physics and related scientific fields. (This is a join work with James Servidea from U.S. Department of Defense.)


Chair: Jie Liang

2:10-2:50

Session A

Bayesian Clustering with the Dirichlet Process

Shane T. Jensen

University of Pennsylvania

 

Prior distributions for unknown data distributions play an important role in non-parametric Bayesian statistics. A commonly-used prior distribution for an unknown data distribution is the Dirichlet process, which induces a random partition on the observations from the unknown data distribution.  We investigate several properties of the Dirichlet process prior and introduce several alternative priors for random partitions. We explore several large-sample results for each prior distribution as well as a simulation-based evaluation of their properties in small samples. We also compare different priors in an application to the clustering of transcription factor binding motifs.  In the context of this application, we also discuss different options for summarizing posterior inference from our clustering model.


Chair: Paul Edlefsen

Session B

Implementing Gibbs-Type Samplers Using Incompatible Draws With Applications in High-Energy Astrophysics

David van Dyk

University of California, Irvine

 

Ensuring the compatibility of conditional distributions is commonly emphasized in the constructions of Gibbs samplers: Compatibility is a necessary condition for proper convergence. In this talk I discuss a systematic strategy for the construction of MCMC samplers involving incompatible conditional distributions. This strategy is designed not only to ensure proper convergence to the target stationary distribution but also to provide chains with quicker convergence and smaller autocorrelations at stationarity. Our method can be viewed as a generalization of blocking in that the systematic strategy sometimes yields a blocked version of the parent Gibbs sampler. It is more general than blocking in that it sometimes yields a set of incompatible conditional distributions that do correspond to any standard Gibbs sampler. The method was motivated by an applied problem in high-energy astrophysics, and we illustrate its use in three problems stemming from our applied work. Our strategy can be viewed as a stochastic version of the ECME and AECM algorithms. Changing the order of the steps of these EM-type algorithms can destroy their celebrated monotone convergence. Likewise, changing the order of the conditional draws of our samplers can destroy their target stationary distribution. Thus, some care must be taken in deriving and implementing these fast samplers.


Chair: Jie Liang

2:50-3:30

Session A

The Wang-Landau algorithm: Applications and some convergence results

Yves Atchade

University of Michigan

 

The Wang-Landau algorithm (Wang & Landau (2001)) has recently generated much interest in the Physics literature due to some spectacular simulation performances. In this talk I will show that the algorithm can be naturally extended to more general states spaces and used to improve on MCMC schemes of more interest in Statistics. Then I will discuss some recent convergence results on the algorithm and some interesting open problems.

This is a joint work with Jun Liu.


Chair: Paul Edlefsen

Session B

Joint Multiple Target Tracking and Classification in Collaborative Sensor Networks

Xiaodong Wang

Columbia University

 

We address the problem of jointly tracking and classifying several targets within a sensor network where false detections are present. In order to meet the requirements inherent to sensor networks such as distributed processing and low-power consumption, a collaborative signal processing algorithm is presented. At any time, for a given tracked target, only one sensor is active. This leader node is focused on a single target but takes into account the possible existence of other targets. It is assumed that the motion model of a given target belongs to one of several classes. This class-target dynamic association is the basis of our classification criterion. We propose an algorithm based on the sequential Monte Carlo (SMC) filtering of jump Markov systems to track the dynamic of the system and make the corresponding estimates. A novel class-based resampling scheme is developed in order to get a robust classification of the targets. Furthermore, an optimal sensor selection scheme based on the maximization of the expected mutual information is integrated naturally within the SMC target tracking framework. Simulation results are presented to illustrate the excellent performance of the proposed multi-target tracking and classification scheme in a collaborative sensor network.


Chair: Jie Liang

Coffee Break

3:50-4:30

Session A

Bayesian Variable Selection in Structured High Dimensional Covariate Spaces

Nancy Zhang

Stanford University

 

In modern regression settings, it is common to have a very large set of potential covariates which have a known distance relationship to each other.  For example, with array-CGH data of cancer patients, one is interested in finding genomic regions that are associated with some clinical trait or with the measurement of another biomarker.  The covariates are the genomic locations where copy number is measured, which can be ordered linearly along chromosomes.  We approach this problem from a variable selection point of view, where we fit a regression model for the observed trait based on a small subset of biomarkers.  Our method builds on classic Bayesian approaches for variable selection, and proposes a flexible way to incorporate prior information.  For the analysis of array-CGH data, we introduce a Hidden Markov Model (HMM) on the prior distribution of the latent variables that indicate whether a covariate is selected.  We explore different sampling schemes with regards to speed of computation.  The HMM model can be extended to cases where the covariates lie in high dimensional lattices.  We show preliminary results of the application of such a model to DNA sequence analysis.


Chair: Sam Kou

Session B

Experiments with Monte Carlo methods for spatial models

Murali Haran

Penn State University

 

Hierarchical Bayesian approaches to modeling spatial data are becoming increasingly common. However, even fairly standard spatial models often result in posterior distributions that present challenges to non-expert users of Markov chain Monte Carlo (MCMC) methods. In addition to having to figure out an appropriate MCMC algorithm for each new problem, MCMC users have two long standing questions: How long should one run the chain and, once the algorithm is stopped, how accurate are the resulting estimates? I will describe some approaches for resolving these issues for a small but important class of spatial models. The first approach involves the construction of exact sampling procedures for which these questions are moot since the draws are i.i.d. The second approach involves constructing MCMC algorithms with good theoretical properties, which allows for consistent estimates of Monte Carlo standard errors to be easily computed. These standard errors can then be used to determine the run length of the MCMC algorithm. I will conclude with a comparison of these methods in the context of some data examples.

 

Chair: Gopi Goswami

4:30-5:10

Session A

Objective Bayesian Variable Selection for Logistic Regression Models with Jeffreys's Prior

Ming-Hui Chen

University of Connecticut

 

We study several theoretical properties of Jeffreys's prior for logistic regression models with a focus on its applications to variable selection problems. We show that Jeffreys's prior is symmetric and unimodal about 0 and always has lighter tails than a t distribution and heavier tails than a normal distribution. We also develop an efficient importance sampling algorithm for calculating the prior and posterior normalizing constants based on Jeffreys's prior. Moreover, we show that the prior and posterior normalizing constants under Jeffreys's prior are scale invariant in the covariates. A closed form for Jeffreys's prior is obtained for saturated logistic regression models with binary covariates. Detailed simulation studies are presented to demonstrate its properties and performance in variable selection contexts and a real dataset is also analyzed to further illustrate the proposed methodology.


Chair: Sam Kou

Session B

Particle Markov Chain Monte Carlo

Arnaud Doucet

University of British Columbia

 

We introduce a new class of Markov chain Monte Carlo algorithms which rely on proposal distributions built using Sequential Monte Carlo algorithms.

We demonstrate these algorithms on non-linear non-Gaussian state-space models and on finite mixture of Gaussians. (This is a joint work with Roman Holenstein).

Chair: Gopi Goswami

Monday May 14, 2007

 

Plenary Session: Harvard Hall 104

9:00-9:45

A new approach to Monte Carlo sampling in statistical physics and beyond

David Landau

University of Georgia

 

Chair: Rong Chen

9:45-10:30

Weakly informative priors

Andrew Gelman

Columbia University

 

Bayesians traditionally consider prior distributions that (a) represent the actual state of subject-matter knowledge, or (b) are completely or essentially noninformative. We consider an alternative strategy: choosing priors that convey some generally useful information but clearly less than we actually have for the particular problem under study. We give some examples, including the Cauchy (0, 2.5) prior distribution for logistic regression coefficients, and then briefly discuss the major unsolved problem in Bayesian inference: the construction of models that are structured enough to learn from data but weak enough to learn from data.

 

Chair: Rong Chen

10:30-11:15

Adaptive Annealing

Andrew Barron

Yale University

 

Simulated annealing is a method of sampling and stochastic optimization that suffers in general from very slow (logarithmic) cooling schedules for convergence. Here we devise a new annealing method with transition probabilities that more precisely transform random variables with the distribution at a given temperature into random variables with slightly cooled temperature. Key to this development is solution of a partial differential equation that expresses how the distribution is transformed. Its solution is facilitated in certain cases by suitable enlargement of the state space. Under conditions on the target sampling distribution it permits polynomial cooling schedules and in particular a polynomial-time sampling algorithm. Applications are discussed for computation of estimates of functions of several variables by ridge expansions such as arise using sigmoids or sinusoids.

 

Chair: Rong Chen

Lunch (Harvard Hall 104)

Parallel Sessions
Session A: Sever Hall 103
Session B: Sever Hall 203

1:00-1:40

Session A

Sequential Monte Carlo methods for tracking of partially observed jump processes

Simon Godsill

Cambridge University


Chair: Jinfeng Zhang

Session B

Exact Approximate Algorithms for Monte Carlo computations: the Expected Auxiliary Variable (EAV) principle

Christophe Andrieu

University of Bristol

 

The expected auxiliary variable method is a general framework for Monte Carlo simulation in situations where the target distribution of interest is untractable thus preventing the implementation of classical methods. The method finds application in situations where marginal computations are of interest, transdimensional move design is difficult in model selection setups, when the normalising constant of a particular distribution is unknown but required for exact computations. I will present several examples of applications of this principle as well as some theoretical results that we have recently obtained in some scenarios.


Chair: Gopi Goswami

1:40-2:20

Session A

Short-cut MCMC: An Alternative to Adaptation

Radford M. Neal

University of Toronto


Many MCMC methods have tuning parameters that must be set to suitable values in order to obtain efficient exploration of the state space. Naively adapting these parameters during an MCMC run leads to incorrect results, since the Markov property used to prove correctness no longer holds.? As an alternative, one can do MCMC updates using various parameter values in sequence, thereby ensuring that suitable parameter values are used for at least some updates - at the cost of wasting considerable time on updates with unsuitable parameter values. I show how one can employ a "short-cut" that allows a sequence of updates with unsuitable parameter values to be done using much less than the usual amount of computation time, by arranging for later updates in the sequence to retrace earlier parts of the sequence. This technique achieves the effect of adaptation from a computational standpoint, while being non-adaptive from a mathematical standpoint, and hence keeping the usual correctness properties, and allowing the usual techniques for assessing convergence and precision to be used.


Chair: Jinfeng Zhang

Session B

Multi-Scale MCMC Methods for Sampling from Products of Gaussian Mixtures

Patrick J. Wolfe (Joint work with Dan Rudoy)

Harvard University


In this talk we address the problem of sampling from a product of Gaussian mixtures, an important and ubiquitous task in a variety of modern signal processing applications. An exact solution is often computationally infeasible, thus motivating the development of efficient sampling schemes. Here we describe auxiliary variable strategies for designing Monte Carlo algorithms to sample from the resultant distribution, which is often multi-modal.? Naive Markov chain Monte Carlo (MCMC) algorithms perform poorly in such cases, motivating our development of multi-scale sampling methods. In particular, we propose two new multi-scale MCMC algorithms based on simulated and parallel tempering. Empirical results indicate that for the same computational budget, this class of methods improves performance in cases with widely separated modes. We also describe an annealing algorithm based on a fusion of importance sampling and MCMC steps through the recently proposed framework of sequential Monte Carlo samplers. Simulation results indicate that in comparison to either form of sampling technique alone, the resulting algorithm performs more robustly in multi-modal cases than those previously reported in the literature.


Chair: Gopi Goswami

2:20-3:00

Session A

Adaptive Multiple Importance Sampling (AMIS)

Antonietta Mira
University
of Insubria and Institute for Advanced Study (IUSS), Italy

 

The AMIS algorithm consists of 3 steps.  In the initialization step a set of iid uniform points are sampled in the d-dimensional unit box. The logistic transformation with scale parameter S, is used to bring the points back to $R^d$. The scale parameter is chosen using the ESS of the importance weights. Importance sampling estimates of the target mean and variance are constructed. These estimates are used to generate, from a d-dimensional Student-T(3df), the initial set of particles of the second adaptive step.  At this stage a temporal dimension is introduced and global adaptation is performed by an importance sampling version of Haario et al. (2001) adaptive Metropolis-Hastings algorithm. To achieve variance reduction an adaptive version of Owen and Zhou (2000), deterministic mixture importance sampling is defined.  As a byproduct of the mixture and actualization process we performe on the weights, all particles are on the same ``weighting scale'' and can be easily and efficiently combined to get final AMIS estimator. In the final step a Rao-Blackwellised clustering algorithm is performed on all generated particles. The number of clusters, K, is chosen via BIC.  IS estimators of mean and covariance matrices on each cluster are derived.  A K-mixture of Student-T (3df) distributions is used to  generate a final set of particles.  The mixture proportions are taken to be the number of particles belonging to each cluster. The AMIS estimator is obtained by recycling the particles generated in all 3 steps, with the corresponding importance weights.  The strength of AMIS resides in its completely adaptive and multi-purpose nature: no tuning parameter is needed and the same algorithm is proved to perform well on very diverse high-dimensional target distributions (from banana shaped to mixture with very well separated modes).

Chair: Jinfeng Zhang

Session B

A New Data Augmentation Scheme

Yuguo Chen

University of Illinois at Urbana-Champaign

 

Motivated by the statistical inference problem in population genetics, we present a new data augmentation scheme which leads to a more efficient Markov chain Monte Carlo (MCMC) algorithm. We show that the new algorithm converges to the target distribution faster than the standard MCMC algorithm. We apply our method to examples from population genetics to demonstrate its efficiency.


Chair: Gopi Goswami

3:00-3:40

Session A

Fixed-Width Output Analysis for Markov Chain Monte Carlo

Galin Jones

University of Minnesota

 

Markov chain Monte Carlo is a method of producing a correlated sample from a target distribution. Features of the target distribution are then estimated using this sample. Thus a fundamental question in MCMC is when should the sampling stop? That is, when have we achieved good estimates? I will introduce a method that stops the MCMC sampling when the width of a confidence interval is less than a user-specified value. Hence calculating Monte Carlo standard errors is a critical step in assessing the output of the simulation. In this talk I will give an overview of fixed-width methodology and two methods for calculating Monte Carlo standard errors. The main results will be illustrated in several examples.


Chair: Jinfeng Zhang

Session B

Randomized Quasi Monte Carlo for Markov chain Monte Carlo

Radu Craiu

University of Toronto

 

The local search and optimization algorithms (LSOA) have made a significant impact on MCMC methodology. The increased efficiency of LSOA relies on local exploration of the sample space.

Randomized Quasi-Monte Carlo (RQMC) algorithms are traditionally built to produce highly stratified samples. The stratification can benefit LSOAs as it produces a more structured local exploration of the sample space. In turn, this increases the efficiency of the samplers through better mixing and higher acceptance rates (in the case of Metropolis-Hastings-Green samplers). We discuss various RQMC techniques along with possible implementations in LSOAs. Particular emphasis will be put on the Multiple-Try Metropolis algorithm and its variants.

We illustrate the ideas with simulated and real data.   Some of the work has been done jointly with Xiao-Li Meng (Harvard) and Christiane Lemieux (Waterloo).


Chair: Gopi Goswami

Coffee Break

3:55-4:35

Session A

Lyapunov-type Inequalities in Importance Sampling and Perfect Simulation

Jose H. Blanchet

Harvard University

 

So-called Lyapunov bounds (also called drift conditions) are studied in rates of convergence to stationarity of MCMC algorithms. Our goal in this talk is to explain how these drift conditions can also be used in the performance analysis of sequential importance sampling algorithm and also in the design of perfect simulation schemes.

In the context of sequential importance sampling, we illustrate our techniques in the context of an algorithm proposed by Chen, Diaconis, Holmes and Liu (2005) for counting the number of binary tables with given margins. We show that their algorithm achieves asymptotic optimality (in a precise sense) as the size of the table increases to infinity.

In the context of exact simulation in MCMC, we show how quantitative bounds for rate of convergence to stationarity in the form of drift conditions (or Lyapunov bounds) allow to obtain unbiased (i.e. exact) samples from the stationary distribution of the underlying chain.

 

Chair: Paul Edlefsen

Session B

Annealing Evolutionary Stochastic Approximation Monte Carlo for Global Optimization

Faming Liang

Texas A&M University

 

In this talk, we will describe a new algorithm, the so-called annealing evolutionary stochastic approximation Monte Carlo (AESAMC) algorithm as a general optimization technique. A remarkable feature of the new algorithm is that it possesses the self-adjusting mechanism and is thus not trapped by local minima. This is important for the objective functions with multiple local minima. Under mild conditions, we show that AESAMC can converge in distribution to the global minimum at a rate of O(1/t), which is much faster than O(1/\log(t)), the convergence rate of simulated annealing. AESAMC is tested on two function optimization problems. The numerical results indicate that AESAMC can potentially outperform other leading stochastic optimization algorithms, such as simulated annealing, the genetic algorithm, and annealing stochastic approximation Monte Carlo.


Chair: Jinfeng Zhang

4:35-5:15

Session A

Hierarchical Image-Motion Segmentation using Swendsen-Wang Cuts

Adrian Barbu

Siemens Corporate Research


Swendsen-Wang Cuts is a MCMC algorithm capable of efficiently sampling arbitrary probabilities defined on graph partitions. It uses data-drive information encoded in the graph edges to propose clusters of nodes for label flipping. The algorithm is a generalization of the original Swendsen-Wang (1987) (SW) method from physics, designed for Ising/Potts priors without external fields.

We present multi-grid and multi-level extensions of the Swendsen-Wang Cuts algorithm that provide speed-up in the general case (multi-grid) and in the case of hierarchical models (multi-level). We apply the concepts to joint image and motion segmentation using hierarchical models for motion and intensity regions and a multi-level computation approach.


Chair: Paul Edlefsen

Session B

Ensemble Approach for Modeling Protein Structures: A New Prospect for An Old Problem

Jinfeng Zhang

Harvard University

 

A protein’s biological function is determined by the ensemble conformations existing under the physiological conditions. Characterizing ensemble conformations has been hampered by the difficulties associated with estimating some most fundamental properties, such as entropy and free energy. We apply sequential Monte Carlo methods (SMC) to study ensemble conformations, especially those conformations that are very similar to the experimentally determined native structure, which we name near-native structures (NNS). Two types of NNS are studied: (i) ensemble of backbone structures, for which side-chains are modeled by simplified representations; (ii) ensemble of side-chain conformations with the same backbone structure. I will present the methodologies we developed for these tasks and some interesting findings discovered through the studies. (Joint work with Jun Liu, Jie Liang, Rong Chen, and Ming Lin)

 

Chair: Jinfeng Zhang

Sunday May 13, 2007 6:00-11:00 Poster session (Terrace Room, Sheraton Commander Hotel)

List of Posters:

Metropolis algorithm and equienergy sampling for two mean field spin systems
by Leisen Fabrizio and Bassetti Federico

Keywords:

In this paper we study in details the Metropolis algorithm in connection with the two mean field spin system: the Ising model and the Blume-Emery-Griffith model. It is well-known that, in the case of the ising model, the usual choice of proposal gives rise, for low temperature, to a slowly mixing Metropolis chain; that is the spectral gap decreases exponentially fast (in the dimension N of the problem) to zero. Here we show how a slight variant in the proposal chain can avoid this phenomenon, keeping the computational mean cost similar to the cost of the usual Metropolis. More precisely we prove that, with a suitable variant in the proposal, the Metropolis chain has a spectral gap which decreases polynomially in 1/N for every temperature of the target distribution. The idea rests on allowing appropriate jumps in the same energy level of the starting state, and it is strictly connected to both the "small word Markov chains" of Guan & Krone and to the "equi-energy sampling "of Kou et al.

Flexible generalized t-link models for binary response data
by Sungduk Kim*, Ming-Hui Chen, and Dipak K. Dey

Keywords: Latent variables; Logistic regression; Markov chain Monte Carlo; Mixed effects model; Probit link; Posterior distribution.

One of most critical issues involved in modeling binary response data is the choice of the links. In this paper, we introduce a new link based on the generalized t-distribution. There are two parameters in the generalized t-link: one parameter purely controls the heaviness of the tails of the link and the second parameter controls the scale of the link. There are two major advantages offered by the generalized t-links. First, a symmetric generalized t-link with an unknown shape parameter is much more identifiable than a Student t-link with unknown degrees of freedom and a known scale parameter. Second, skewed generalized t-links with both unknown shape and scale parameters provide much more flexible and improved skewed link regression models than the existing skewed links. Various theoretical properties and attractive features of the proposed links are examined and explored in details. An efficient Markov chain Monte Carlo algorithm is developed to sample from the posterior distribution. The Deviance Information Criteria (DIC) is used for guiding the choice of links. The proposed methodology is motivated and illustrated by prostate cancer data.

Compressing Parameters in Bayesian Models with High Order Interactions
by Longhai Li & Radford M. Neal

Keywords: High Order Interactions, Bayesian Modelling, Classification, Sequence Prediction

Bayesian regression and classification with high order interactions is largely infeasible because Markov chain Monte Carlo (MCMC) would need to be applied with a huge number of parameters, which typically increases exponentially with the order. In this paper we show how to make it feasible by effectively reducing the number of parameters, exploiting the fact that many interactions have the same values for all training cases. Our method represents the sum of the parameters associated with a set of patterns that have the same value for all training cases as a single ``compressed'' parameter. Using symmetric stable distributions as the priors of the original parameters, we can easily find the priors of these compressed parameters. We therefore need to deal only with a much smaller number of compressed parameters when training the model with MCMC. After training the model we can split these compressed parameters into the original ones as needed to make predictions for test cases. We show in detail how to compress parameters for logistic sequence prediction. Experiments on both simulated and real data demonstrate a huge number of parameters can indeed be reduced by our compression method.

 

Randomized Quasi Monte Carlo for Markov chain Monte Carlo
by Radu Craiu

Keywords: Multiple-try Metropolis, ,randomized quasi-Monte Carlo, MCMC.

The local search and optimization algorithms (LSOA) have made a significant impact on MCMC methodology. The increased efficiency of LSOA relies on local exploration of the sample space. Randomized Quasi-Monte Carlo (RQMC) algorithms are traditionally build to produce highly stratified samples. The stratification can benefit LSOAs as it produces a more structured exploration of the sample space. In turn, this increases the efficiency of the samplers through better mixing and higher acceptance rates (in the case of Metropolis-Hastings-Green samplers). We discuss various RQMC techniques along with possible implementations in LSOAs. Particular emphasis will be put on the Multiple-Try Metropolis algorithm and its variants. We illustrate the ideas with simulated and real data. Parts of this talk are joint with Xiao-Li Meng (Harvard) and Christiane Lemieux (Waterloo).

Improve Parameter Inference in Population Genetics via Importance Sampling
by Hongmei Chi and Peter Beerli

Keywords: Importance sampling, scrambled quasirandom sequences, population genetics

Estimating parameters in population genetics is intensive computation. Importance sampling can be helpful to reduce the computational burden if the important distribution can be computed in a quick way. Scrambled quasirandom sequences can replace pseudorandom sequences in this sampling methods. While quasirandom sequences improve the convergence of many MC applications, one cannot conclude that they will automatically enhance the convergence of all MCs. In other words, enhanced convergence is not assured in all situations with the naive use of quasirandom sequences. We will theoretically and practically, through simulation studies, investigate the relationship of convergence and number of parameters and also investigate whether specific types of sampling algorithms help to achieve faster convergence using MC methods.

Bayesian Variable Selection in High Dimensional Structured Covariate Spaces
by Fan Li, Nancy R. Zhang

Keywords:

In modern genomic profiling studies, it is common to have a high dimensional data vector for each subject that measures some set of biomarkers. Sources of this kind of data include array-based comparative genomic hybridization (array-CGH), mass spectrometry, and high-throughput genotyping. In the analysis of such data, a frequent goal is to find biomarkers that are associated with an observed trait. For example, with array-CGH data of cancer patients, one is interested in finding genomic regions that are associated with clinical traits. We approach this problem from a variable selection point of view, where we build a regression model for the observed trait based on a small subset of biomarkers. However, this is an unconventional variable selection problem, because the dimension of the covariate space is comparable, and often much larger, than the number of subjects in the study. Also a common feature of such problems is that the covariate space is highly structured, and we propose a method for including such structure information in the variable selection framework. Our method builds on the classic Bayesian approaches for variable selection (George and McCulloch, 1993). However, existing methods become computationally very intensive when the covariate space is large, to the point of becoming inapplicable. We propose to utilize the known structure of the co-variate space to improve the efficiency of the stochastic search. Faster convergence of the Monte Carlo Markov Chain (MCMC) is achieved by introducing a more informative prior that reflects the known structure of the covariate space. We further develop computational and approximation schemes to speed up the computation.

Energization of charged particles in Saturn’s inner magnetosphere due to the action of a stochastic electric field simulated through a Monte Carlo Method
by Elizabeth Martínez-Gómez

Keywords: Monte Carlo method, random process, energization, stochastic forces

Our knowledge of energetic particles in Saturn’s inner magnetosphere is based on observations made during the flybys of Pioneer 11, Voyager 1, Voyager 2, and recently by Cassini. The most important features of the energetic particle population in the inner Saturnian magnetosphere are: 1) the rings and the many large and small satellites inside this region reduce the population of particles whose energies are higher than 0.5 MeV to values of the order of 10^3 times less than would otherwise be present, 2) the sputtering and outgassing of the surfaces of the satellites injects particles into the system and by some physical process, particles of the resultant plasma are accelerated to energies of the order of tens of keV, 3) the radial distribution of very energetic protons Ep > tens of MeV exhibits three major peaks associated with rings and satellites, 4) a proton population Ep~1 MeV lies outside the orbit of Enceladus, 5) a proton population Ep < 0.25 MeV has an apparent origin associated with Dione, Tethys, Enceladus, E-ring, Mimas and G-ring, 6) a population of low-energy electrons is associated with the satellites. In this work I propose a mechanism to explain the energetic particle population observed in Saturn’s inner magnetosphere based on the stochastic behavior of the electric field. To simulate the stochastic electric field I employ a Monte Carlo Method taking into account the magnetic field fluctuations obtained from the observations made by Voyager 1 and Voyager 2 spacecraft. Assuming different initial conditions, like the source of charged particles and the distribution function of their velocities, I find that particles injected with very low energies ranging from 0.104 eV to 0.526 keV can be accelerated to reach much higher energies ranging from 0.944 eV to 0.547 keV after a few seconds.

MCMC Methods for Sampling Conditioned Diffusions
by Dr Alexandros Beskos

Keywords:

We present a Markov chain Monte Carlo (MCMC) methodology for sampling conditioned diffusion processes. The method builds on Langevin SPDEs defined recently in the literature which evolve on the pathspace and sample from the target diffusion in equilibrium. We show that, in this infinite-dimensional pathspace context, the choice of the numerical method for solving the SPDEs is absolutely critical for the properties of the algorithm. Analytically, we discretise the SPDEs in the algorithmic time direction to obtain proposals for a Metropolis-Hastings update. The resulting MCMC algorithm is inefficient unless one uses an IMPLICIT numerical scheme when discretising the SPDEs. Only then will the algorithm propose paths of the correct quadratic variation.

Monte Carlo Methods for Symmetric Stable and Related Models
by Raoul LePage

Keywords: Symmetric stable models, Monte Carlo estimation and prediction

Symmetric stable random processes are gaining ground in diverse applications which include signal processing, financial process models, spatial models, and image processing. One thinks of such processes as allowing for long tailed behaviors, but in fact they also allow for more granular and dependent structures as compared with their smoother normal domain of attraction counterparts. In particular, different realizations of such processes may exhibit markedly different behaviors, a fact which argues against merely using the available observations to obtain unconditionally evaluated predictions or estimates. The fact, due to the author, that symmetric stable processes are themselves mixtures of Gaussian processes having explicit series expansions involving model parameters, seems to have been largely ignored when attempting their statistical analysis. As will be shown, one may exploit the series representations and undertake estimation and prediction in models attracted to the symmetric stable environment using direct Monte Carlo (not MCMC), without the necessity of developing approximations of stable probability densities in multiple dimensions as some are attempting. A conditional invariance principle due to LePage, Podgorski and Ryznar may be used to establish the consistency of our MC approach.

Screening for Differentially Expressed Genes Using Bayes Factors
by Fang Yu*, Ming-Hui Chen, Lynn Kuo

Keywords: Gene selection, Bayes factor, Calibrating value, Multilevel model, Prior predictive distribution

A common interest in microarray data analysis is to identify genes having different expression levels between two conditions. The existing methods include using two-sampled t-statistics, a modified t-statistics (SAM), semiparametric hierarchical Bayesian models, and nonparametric permutation tests. All of these methods essentially compare two population means. In this paper, we consider using the Bayes factor to compare gene expression levels. The Bayes factor approach is quite attractive and flexible in evaluating the evidence for a gene to be differentially expressed as it allows us to compare not only two population means but also the population distributions. To facilitate the use of the Bayes factor, we propose a new calibration approach that weighs two types of error probabilities differently from the prior predictive distribution of the Bayes factor for each gene and at the same time controls overall error rates for all genes under consideration. Moreover, a novel gene selection algorithm based on the calibration of the Bayes factor is developed and the theoretical properties of the proposed method are carefully examined. Our method is shown to have smaller false discovery rate (FDR) and false non-discovery rate (FNDR) than several existing methods through simulations. Finally, a real dataset from an affymetric microarray experiment to identify genes associated with the onset of osteoblast differentiation is used to further illustrate the proposed methodology.

Switching Regressions for Bayesian Changepoint Analysis in Material Indentation Experiments
by Shelten Yuen, Daniel Rudoy, Robert D. Howe, Patrick J. Wolfe

Keywords: Switching Regressions, Changepoint Detection, Gibbs Sampling, Material Indentation

Material indentation is a popular method for determining the mechanical properties of biomaterials. The basic premise of an indentation experiment is to physically displace the sample using an indentor that concurrently measures resistive force, in order to formulate a force-displacement curve. However, doing so requires estimating the initial contact event between the indentor and the sample--a statistical changepoint detection problem that has not been rigorously addressed in the biomaterials literature to date. Here we adopt a hierarchical Bayesian approach to contact point determination based on switching regressions, which generalizes an algorithm popular with practitioners and enables both hyperparameter estimation as well as uncertainty quantification. Results using several experimentally obtained silicone indentation data sets indicate that our approach outperforms existing techniques.

Detecting Positively Selected Sites From Amino Acid Sequences: An
by Zheng Ouyang and Jie Liang

Keywords: Bayesian Monte Carlo

Fixation of advantageous mutations is an important evolutionary force driving the accelerated protein diversification. However, the standard phylogenetic approach to infer positive selection is based on relative rate of nonsynonymous to synonymous substitutions, and requires the knowledge of DNA sequences, hence precludes its application to family of remotely related sequences where saturated substitutions occur. In this study, we develop a new method to detect positive selection directly from amino acid sequences by treating codon usage as hidden parameters. For a given amino acid sequence set and a phylogenetic tree, we use a reversible continuous time Markov process as our evolutionary model. This model has fewer parameters than normal amino acid evolutionary model, with only transition/transversion rate ratio, nonsynonymous/synonymous rate ratio (omega), and codon usage. Similar to earlier work, we assume that omega is a random variable with different probabilities to take a set of discrete values. Those with omega > 1 model sites under positive selection. We use the Bayesian Monte Carlo method to estimate model parameters, as it allows implementation of complex model of sequence evolution. Here unobserved DNA sequences are sampled from protein sequences based on distributions parametrized by codon usages, based on the fact that both protein sequences and the native protein-encoding DNA sequences have the same phylogenetic tree. The object is that sampled DNA sequences should fit the same phylogenetic tree as well as the native DNA sequences. Data set of beta-globin sequences from vertebrates is used to verify our model. We are able to detect all eight positive selection sites, which were originally reported using native nucleotide sequences. Our work shows that although nonsynonymous/synonymous rate ratio is defined at codon level, it can be used to detect selective pressures of amino acid sequences by our implicit codon-based model.

Battery-aware Adaptive Modulation Based on Large-scale MDP
by kai yang and xiaodong wang

Keywords: Battery management, adaptive modulation. Markov decision process, reinforcement learning, sparse sampling method

We treat the problem of designing the optimal transmission scheme that is adapted to the battery state, the channel and buffer conditions, and the incoming traffic rate. We assume that the battery states can be tracked at every time slot, so that the problem is formulated as a large-scale Markov decision process (MDP). An efficient sparse sampling method is employed to obtain a solution. Simulation results are provided to demonstrate that the proposed schemes can considerably increase the lifetime of the battery-powered wireless systems while satisfying QoS constraints.

Structured Database Monte Carlo: A New Strategy for Efficient Simulation
by Gang Zhao, Tarik Borogovac, Pirooz Vakili

Keywords:

Quite often we use Monte Carlo to "solve" many instances of a single problem that differ only in parameter values. We introduce a new strategy called Structured Database Monte Carlo (SDMC) that utilizes information obtained at one parameter to design effective variance reduction techniques at neighboring parameters. We induce a linear order (structure) on the underlying probability space (database) using sampled values at a nominal parameter. This structure is approximately maintained when the parameter is perturbed, enabling design of generic variance reduction algorithms based on, for example, stratification, control variate, and importance sampling methods for estimation in the neighborhood of the nominal parameter. Experimental results for examples in finance and physics are provided.

Understanding HSC Kinetics via MCMC
by Youyi Fong, Peter Guttorp, Janis Abkowitz

Keywords: Reversible Jump, MCMC, Hematopoiesis, stem cell, goodness of fit

Hematopoiesis is the development of blood cells from hematopoietic stem cells (HSC). A HSC could divide into two HSC or differentiate into a blood progenitor cell. We are interested in testing the hypothesis that HSC also divides asymmetrically into one HSC cell and one progenitor cell. Data were collected from a bone marrow transplantation experiments done on hybrid cats. We build a hierarchical model and fit it using reversible jump Metropolis-Hasting embedded in a Gibbs sampler. Testing is done based on the posterior distribution of the parameters and a Goodness-of-Fit statistic.

Deterministic Sequential Monte Carlo Algorithms for Motif Discovery
by Kuo-ching Liang, Xiaodong Wang, Dimitris Anastassiou

Keywords:

We propose a position weight matrix-based deterministic sequential Monte Carlo (DSMC)method to locate conserved motifs in a given set of sequences. We also propose another DSMC algorithm to align the motifs for cases where insertions/deletions occur in instances of the motifs, which cannot be satisfactorily done using other multiple alignment and motif discovery algorithms.

Adaptive MCMC for Bayesian variable selection
by Chunlin JI, Kai Mao, Scott C. Schmidler

Keywords: Markov chain Monte Carlo, Bayesian variable selection, Adaptation Scheme, Multimodality

We describe adaptive Markov chain Monte Carlo (MCMC) methods for sampling posterior distributions arising from Bayesian variable selection problems. Point mass mixture priors are commonly used in Bayesian variable selection problems in regression. However, for generalized linear and nonlinear models where the conditional densities cannot be obtained directly, the resulting mixture posterior may be difficult to sample using standard MCMC methods. In particular, random-walk Metropolis-Hastings chains and Metropolized independence chains can be poorly mixing due to multimodality. We provide a novel approach to this problem via an adaptation scheme. First, a generic adaptive MCMC method is introduced which automatically tunes the parameters of the proposal density during simulation and attempts to 'learn' the best parameter values. By using proposals from a family of mixture distributions, the chain is able to sample efficiently from the multimodal target distribution. Then the adaptive independence sampler for variable selection is proposed, using a proposal density obtained from a mixture of a point mass and Gaussian mixture distribution. Under the adaptation strategy, the mixing weight of the point mass component in the proposal adapts to approximate the posterior inclusion probability of its associated variable, while the Gaussian mixture distribution approximates the non-zero component of the coefficient's posterior distribution, enabling the efficient mixing between models with and without the variable included. The resulting sampling scheme performs parameter estimation and variable selection simultaneously. The convergence and ergodicity of these algorithms can be guaranteed with a careful design of the adaptation strategy. The algorithm is applied to several problems including a logistic regression model, a random field model from statistical biophysics, and a sparse kernel regression. A simulation study indicates that this adaptive independent sampler outperforms traditional MH algorithms.

Bounding the convergence rate of parallel tempering and conditions for rapid mixing
by Dawn Woodard, Scott Schmidler, Mark Huber

Keywords: tempering; mixing; convergence

Multimodal posterior distributions are commonly encountered in Bayesian statistics. However, one standard tool for sampling from a posterior distribution, namely Markov Chain Monte Carlo, can become stuck in local modes. Parallel tempering is a modification of MCMC that is often used in hopes of circumventing bottlenecks between modes. We obtain a general bound on the convergence rate of parallel tempering that reflects its ability to move between the modes of the distribution. This bound leads to a set of sufficient conditions for rapid mixing of parallel tempering. We use these conditions to show the rapid mixing of parallel tempering on a number of multimodal distributions, including a symmetric mixture of normal distributions in R^M and a weighted mixture of normal distributions in R^M with equal covariance matrices. We also illustrate the failure of these conditions on several multimodal distributions and prove that parallel tempering is torpidly mixing on these distributions.

Adaptive Watermark Decoding and Attack Parameters Estimation using MCMC techniques
by Dimitrios Vagias and Xiaodong Wang

Keywords: data hiding, additive attacks, amplitude scaling, desynchronization, Markov chain Monte Carlo (MCMC)

Currently, watermarking algorithms have become a real trend in the field of copyright protection. A watermark is certain imperceptive information embedded into a host signal according to a procedure which must achieve satisfatctory rate-distortion-robustness performance. In our study, we are tackling with the problem of reliably recovering the watermark in the case of additive independent random noise, amplitude scaling and desynchronization attacking scenarios. For this task we are employing conventional and adaptive MCMC techniques at the decoder both for watermark detection and channel identification purposes.

Gibbs Sampling for Bayesian Linear Models

Alicia A. Johnson and Galin L. Jones

 

Under certain regularity conditions, it is known that a Markov chain will converge to stationarity.  However, it's the rate of this convergence that holds considerable weight in the application of MCMC methods.  A Markov chain that converges to its stationary distribution at a geometric rate is said to be geometrically ergodic.  A major consequence of geometric ergodicity is the existence of a Markov chain central limit theorem and consistent estimates of the corresponding asymptotic variance.  These estimates provide guidance on how long to run the chain so that our estimates are sufficiently credible, a matter usually approached in an ad hoc fashion.  In practice, geometric ergodicity can be established by constructing drift and minorization conditions for the Markov chain.  In this paper, we use these techniques to establish geometric ergodicity for a block Gibbs sampler for a Bayesian hierarchical version of the random intercepts model. 

 

 

 

 

back to home