Jun S. Liu
Department of Statistics
Science Center 711
One Oxford Street, Cambridge, MA 02138
A wealth of biological sequence data and microarray expression data has emerged from the human genome project and functional genomics studies. In silico methods for understanding these data and for incorporating different sources of biological information and knowledge are becoming increasingly important. Because nature is inherently stochastic, much of this computational effort is, in essence, statistical data analysis and probabilistic modeling. The general goal of my research group is to design effective statistical models and computational strategies for understanding biological and genetic data. Currently we are interested in the following topics: (a) predicting gene regulatory binding motifs; (b) homology modeling and sequence-based protein analysis; (c) linkage disequilibrium studies; and (d) phylogenetic studies.
We have been exploring the utility of the statistical missing-data formulation and Gibbs sampling strategies for biological sequence analysis since 1993. Over the years, we have constructed a number of algorithms, including the Gibbs site sampler, the Gibbs motif sampler, PROBE, Bayes Aligner, and BioProspector. They are suitable for detecting subtle sequence relationships and weak repetitive patterns. In particular, the Gibbs motif sampler has been adopted by many other research groups and has become a standard tool for finding DNA regulatory binding motifs. Our future goals in computational gene regulation analysis are to develop strategies that efficiently combine sequence information, cross-species comparisons, and microarray analysis; to design new statistical models for eukaryotic gene regulation modules; and to investigate the use of Bayesian networks in understanding gene regulation.
Besides bioinformatics, we are also interested in general Monte Carlo methods for integration and optimization in complex systems arising in bioinformatics, finance, engineering, and statistics.
Qin, Z. and Liu, J.S. (2001). Multi-point Metropolis method with application to hybrid Monte Carlo. J. Comp. Phys., in press.
McCue, L.A. et al. (2001). Phylogenetic footprinting of transcription factor binding sites in proteobacterial genomes. Nucl. Acids Res. 29, 774-782.
Liu, J.S., Sabatti, C., Teng, J., Keats, B.J.B. and Risch, N. (2001). Bayesian analysis of haplotypes for linkage disequilibrium mapping. Genome Res., in press.
Page created and maintained by Xaq Pitkow