{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Lab 4, Exercise 2: the stellar mass-size relation\n", "\n", "We have a sample of galaxies with measurements of their stellar mass $M_*$ and half-light radius $R_e$. Model the distribution of (log) stellar masses as a Gaussian and the distribution in $\\log{R_e}$ as a Gaussian with mean that scales with $M_*$:\n", "\n", "$P(\\log{M_*}|\\mu_*,\\sigma_*) \\sim \\mathcal{N}(\\mu_*, \\sigma_*^2)$\n", "\n", "$P(\\log{R_e}|\\log{M_*},\\mu_R, \\beta_R, \\sigma_R) \\sim \\mathcal{N}(\\mu_R + \\beta_R(\\log{M_*} - \\log{M_*^p}), \\sigma_R^2)$\n", "\n", "Assume Gaussian errors $\\epsilon_{*,i}$ on the log-stellar mass measurements and no errors on the sizes ($R_e$ are measured exactly). Infer the posterior probability distribution of the hyper-parameters $\\mu_*,\\sigma_*,\\mu_R,\\beta_R,\\sigma_R$ given the data. Draw samples from the posterior and find, for each hyper-parameter, the posterior median and 68% enclosed probability bounds.\n", "\n", "Make a scatter plot of the observed values of $\\log{R_e}$ as a function of $\\log{M_*}$. Overplot the function $\\log{R_e} = \\mu_R + \\beta_R (\\log{M_*} - \\log{M_*^p})$, with the values of $\\mu_R$ and $\\beta_R$ inferred from the MCMC (you can use the values that maximize the posterior probability, or the median $\\mu_R$ and $\\beta_R$ of the respective marginal posteriors).\n", "\n", "*(Optional):* make 2D contour plots for each pair of hyper-parameters." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Solution\n", "\n", "Let's simplify notations:\n", "\n", "$m_i \\equiv \\log{M_{*,i}}$ := true log-stellar mass of galaxy i\n", "\n", "$r_i \\equiv \\log{R_e}$ := true log-half-light radius of galaxy i (also \n", "\n", "$m_i^{(obs)} \\equiv \\log{M_{*,i}^{(obs)}}$ := observed log-stellar mass of galaxy i\n", "\n", "$m^p \\equiv \\log{M_*^p}$ := pivot stellar mass of the mass-size relation (to be assigned)\n", "\n", "As shown in class, the posterior probability distribution for this problem is\n", "\n", "$P(\\mu_*,\\sigma_*,\\mu_R,\\beta_R,\\sigma_R|d) \\propto \\dfrac{1}{|\\beta_R|^n} \\prod_i \\sqrt{\\dfrac{\\sigma_{eff*,i}^2}{\\sigma_*^2\\epsilon_{*,i}^2\\sigma_{R,i}^2/\\beta_R^2}} \\exp{\\left\\{-\\dfrac12\\left(\\dfrac{\\mu_*^2}{\\sigma_*^2} + \\dfrac{m_{(obs),i}^2}{\\epsilon_{*,i}^2} + \\dfrac{m_{R,i}^2}{\\sigma_{R,i}^2/\\beta_R^2} - \\dfrac{\\mu_{eff*,i}^2}{\\sigma_{eff*,i}^2}\\right)\\right\\}}$\n", "\n", "where\n", "\n", "$\\sigma_{eff*,i} \\equiv \\left(\\dfrac{1}{\\sigma_*^2} + \\dfrac{1}{\\epsilon_{*,i}^2} + \\dfrac{\\beta_R^2}{\\sigma_R^2}\\right)^{-1}$\n", "\n", "$\\mu_{eff*,i} \\equiv \\left(\\dfrac{\\mu_*}{\\sigma_*} + \\dfrac{m_{(obs),i}}{\\epsilon_{*,i}^2} + \\dfrac{m_{R,i}}{\\sigma_R^2/\\beta_R^2}\\right)\\sigma_{eff*,i}^2$\n", "\n", "$m_{R,i} \\equiv \\dfrac{r_i - \\mu_R}{\\beta_R} + m^p$\n", "\n", "Let us draw samples from this 5-dimensional distribution using the Metropolis algorithm.\n", "Let us use a 5-dimensional Gaussian jump distribution\n", "\n", "$$\n", "J_t(\\theta*,\\theta) = \\mathcal{N}\\left((\\theta), \\left(\\begin{array}{ccccc} \n", "s_1^2 & 0 & 0 & 0 & 0 \\\\ \n", "0 & s_2^2 & 0 & 0 & 0\\\\\n", "0 & 0 & s_3^2 & 0 & 0\\\\\n", "0 & 0 & 0 & s_4^2 & 0\\\\\n", "0 & 0 & 0 & 0 & s_5^2\n", "\\end{array}\\right)\\right)\n", "$$\n", "\n", "where $s_1,\\ldots,s_5$ are the step sizes in the direction of each parameter.\n", "\n", "Let's set $s_1 = s_2 = s_3 = s_4 = s_5 = 0.03$ for simplicity.\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "ename": "SyntaxError", "evalue": "Missing parentheses in call to 'print'. Did you mean print('sampling...')? (, line 37)", "output_type": "error", "traceback": [ "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m37\u001b[0m\n\u001b[0;31m print 'sampling...'\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m Missing parentheses in call to 'print'. Did you mean print('sampling...')?\n" ] } ], "source": [ "import numpy as np\n", "import pylab\n", "from scipy.stats import multivariate_normal\n", "\n", "\n", "# load the data\n", "f = open('SLACS_table.cat', 'r')\n", "mstar, mstar_err, size = np.loadtxt(f, usecols=(3, 4, 5), unpack=True)\n", "f.close()\n", "\n", "n = len(mstar) # number of objects\n", "\n", "mobs = mstar\n", "robs = np.log10(size)\n", "eps_star = mstar_err\n", "\n", "mpiv = np.median(mstar) # set the value of the pivot stellar mass to the median of the sample\n", "\n", "# define log-posterior probability\n", "def logp_func(theta):\n", " \n", " mu_star, sigma_star, mu_R, beta_R, sigma_R = theta\n", " \n", " sumlogp = -n*np.log(abs(beta_R))\n", " for i in range(n): # loops over the galaxies in the sample\n", " mR = (robs[i] - mu_R)/beta_R + mpiv\n", " sigma_star_eff = (1./sigma_star**2 + 1./eps_star[i]**2 + beta_R**2/sigma_R**2)**(-0.5)\n", " mu_star_eff = (mu_star/sigma_star**2 + mobs[i]/eps_star[i]**2 + mR*beta_R**2/sigma_R**2)*sigma_star_eff**2\n", " \n", " sumlogp += np.log(sigma_star_eff/(sigma_star*eps_star[i]*sigma_R/abs(beta_R)))\n", " sumlogp += -0.5*(mu_star**2/sigma_star**2 + mobs[i]**2/eps_star[i]**2 + mR**2*beta_R**2/sigma_R**2 - mu_star_eff**2/sigma_star_eff**2)\n", "\n", " return sumlogp\n", "\n", "# defines sampler: takes starting point, jump covariance matrix and number of steps, returns array of samples.\n", "def run_sampler(theta_0, jump_cov, nstep):\n", " print 'sampling...'\n", " theta_samp = np.zeros((len(theta_0), nstep))\n", " logp_samp = np.zeros(nstep)\n", " \n", " theta_samp[:, 0] = theta_0\n", " logp_samp[0] = logp_func(theta_0)\n", " \n", " naccept = 0\n", " \n", " for i in range(1, nstep):\n", " # proposes step from the jump distribution\n", " theta_old = theta_samp[:, i-1]\n", " theta_star = multivariate_normal.rvs(mean=theta_old, cov=jump_cov)\n", " \n", " logp_new = logp_func(theta_star)\n", " \n", " accept = False\n", " if logp_new > logp_samp[i-1]:\n", " accept = True\n", " else:\n", " r = np.exp(logp_new - logp_samp[i-1])\n", " x = np.random.rand(1)\n", " if x < r:\n", " accept = True\n", " else:\n", " accept = False\n", " \n", " if accept:\n", " naccept += 1\n", " theta_new = theta_star\n", " logp_samp[i] = logp_new\n", " else:\n", " theta_new = theta_old\n", " logp_samp[i] = logp_samp[i-1]\n", " \n", " theta_samp[:, i] = theta_new\n", " \n", " print 'acceptance rate: %3.2f'%(naccept/float(nstep))\n", " return (theta_samp, logp_samp)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# set starting point\n", "theta_0 = np.zeros(5)\n", "theta_0[0] = np.median(mstar)\n", "theta_0[1] = mstar.std()\n", "theta_0[2] = np.median(robs)\n", "theta_0[3] = 0.5\n", "theta_0[4] = 0.2\n", "\n", "# set covariance matrix of jump distribution\n", "jump_cov = 0.03**2 * np.diag(np.ones(5)) \n", "\n", "# run the MCMC\n", "theta_samp, logp_samp = run_sampler(theta_0, jump_cov, nsamp)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# visually inspect the chain\n", "\n", "pylab.plot(theta_samp[0, :])\n", "pylab.ylabel('$\\mu_*$')\n", "pylab.show()\n", "\n", "pylab.plot(theta_samp[1, :])\n", "pylab.ylabel('$\\sigma_*$')\n", "pylab.show()\n", "\n", "pylab.plot(theta_samp[2, :])\n", "pylab.ylabel('$\\mu_R$')\n", "pylab.show()\n", "\n", "pylab.plot(theta_samp[3, :])\n", "pylab.ylabel('$\\\\beta_R$')\n", "pylab.show()\n", "\n", "pylab.plot(theta_samp[4, :])\n", "pylab.ylabel('$\\sigma_R$')\n", "pylab.show()\n", "\n", "pylab.plot(logp_samp)\n", "pylab.ylabel('$\\log{(P)}$')\n", "pylab.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's run another chain, but update the covariance matrix of the jump distribution to set it equal to that of the sample. Let's also increase the number of steps." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "new_cov = np.cov(theta_samp)\n", "\n", "# let's initialize the new chain to the last value of the previous one\n", "theta_0 = theta_samp[:, -1]\n", "\n", "print 'starting point:\\n', theta_0\n", "\n", "nsamp = 50000\n", "\n", "# runs the MCMC\n", "theta_samp, logp_samp = run_sampler(theta_0, new_cov, nsamp)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# visually inspect the chain\n", "\n", "pylab.plot(theta_samp[0, :])\n", "pylab.ylabel('$\\mu_*$')\n", "pylab.show()\n", "\n", "pylab.plot(theta_samp[1, :])\n", "pylab.ylabel('$\\sigma_*$')\n", "pylab.show()\n", "\n", "pylab.plot(theta_samp[2, :])\n", "pylab.ylabel('$\\mu_R$')\n", "pylab.show()\n", "\n", "pylab.plot(theta_samp[3, :])\n", "pylab.ylabel('$\\\\beta_R$')\n", "pylab.show()\n", "\n", "pylab.plot(theta_samp[4, :])\n", "pylab.ylabel('$\\sigma_R$')\n", "pylab.show()\n", "\n", "pylab.plot(logp_samp)\n", "pylab.ylabel('$\\log{(P)}$')\n", "pylab.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This chain looks healthy. Now let's plot the observed mass-size relation with our best-fit measurement. For simplicity, we're going to fix $\\mu_R$ and $\\beta_R$ to the maximum of the posterior distribution." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# take the maximum of the posterior\n", "\n", "\n", "theta_mp = theta_samp[:, logp_samp.argmax()]\n", "\n", "mu_R_mp = theta_mp[2]\n", "beta_R_mp = theta_mp[3]\n", "\n", "pylab.scatter(mobs, robs)\n", "xlim = pylab.xlim()\n", "xs = np.linspace(xlim[0], xlim[1])\n", "pylab.plot(xs, mu_R_mp + beta_R_mp*(xs - mpiv), linestyle='--', color='k')\n", "pylab.xlabel('$\\log{M_*}$')\n", "pylab.ylabel('$\\log{R_e}$')\n", "pylab.show()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's plot the probability contours for each pair of hyper-parameters of the model.\n", "First let's write a function that creates a single 2D contour plot, then let's write a script that displays the plots in an organized way." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from scipy import ndimage\n", "\n", "\n", "def probcontour(xarr, yarr, color='b', smooth=2, bins=100):\n", " # create a 2d histogram\n", " H, xbins, ybins = pylab.histogram2d(xarr, yarr, bins=bins)\n", " \n", " # smooths the histogram\n", " H = ndimage.gaussian_filter(H, smooth)\n", " \n", " # create an array containing the number of points in each histogram bin, sorted.\n", " sortH = np.sort(H.flatten())\n", " \n", " # compute the cumulative sum\n", " cumH = sortH.cumsum()\n", " \n", " # find the level for which the integral (i.e. sum) over the bins with a higher number of samples is 68% of the total\n", " lvl68 = sortH[cumH>0.32*cumH[-1]][0]\n", " \n", " # same for 95%-ile level\n", " lvl95 = sortH[cumH>0.05*cumH[-1]][0]\n", " \n", " pylab.contourf(H.T, [lvl95, lvl68], colors=color, alpha=0.5, extent=(xbins[0], xbins[-1], ybins[0], ybins[-1]))\n", " pylab.contourf(H.T, [lvl68, 1e8], colors=color, alpha=0.9, extent=(xbins[0], xbins[-1], ybins[0], ybins[-1]))\n", "\n", "fig = pylab.figure()\n", "pylab.subplots_adjust(left=0.1, right=0.99, bottom=0.1, top=0.99)\n", "\n", "npars = 5\n", "labels = ['$\\mu_*$', '$\\sigma_*$', '$\\mu_R$', '$\\\\beta_R$', '$\\sigma_R$']\n", "\n", "for i in range(npars):\n", " ax = fig.add_subplot(npars, npars, (npars+1)*i + 1)\n", " pylab.hist(theta_samp[i, :], histtype='step', color='b')\n", " \n", "for j in range(1, npars):\n", " for i in range(j):\n", " ax = fig.add_subplot(npars, npars, npars*j+i+1)\n", " probcontour(theta_samp[i, :], theta_samp[j, :])\n", " \n", " if i == 0:\n", " ax.set_ylabel(labels[j])\n", " if j == npars-1:\n", " ax.set_xlabel(labels[i])\n", " \n", "pylab.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" } }, "nbformat": 4, "nbformat_minor": 2 }