Publication Date



We believe that a wide range of physical processes conspire to shape the observed galaxy population but we remain unsure of their detailed interactions. The semi-analytic model (SAM) of galaxy formation uses multi-dimensional parameterizations of the physical processes of galaxy formation and provides a tool to constrain these underlying physical interactions. Because of the high dimensionality, the parametric problem of galaxy formation may be profitably tackled with a Bayesian-inference based approach, which allows one to constrain theory with data in a statistically rigorous way. In this paper, we develop a generalized SAM using the framework of Bayesian inference. We show that, with a parallel implementation of an advanced Markov-Chain Monte-Carlo algorithm, it is now possible to rigorously sample the posterior distribution of the high-dimensional parameter space of typical SAMs. As an example, we characterize galaxy formation in the current $\Lambda$CDM cosmology using stellar mass function of galaxies as observational constraints. We find that the posterior probability distribution is both topologically complex and degenerate in some important model parameters. It is common practice to reduce the SAM dimensionality by fixing various parameters. However, this can lead to biased inferences and to incorrect interpretations of data owing to this parameter covariance. This suggests that some conclusions obtained from early SAMs may not be reliable. Using synthetic data to mimic systematic errors in the stellar mass function, we demonstrate that an accurate observational error model is essential to meaningful inference.


This is a pre-published version which is collected from arXiv link.