{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Lab 3, Exercise 2: Stellar mass distribution of galaxy sample\n", "\n", "We have a sample of galaxies (strong lenses) with measurements of their (log) stellar mass $(m_1^{(obs)},...,m_n^{(obs)})$.\n", "Assuming a Gaussian model for the stellar mass distribution of the sample, what are the mean stellar mass $\\mu_*$ and the dispersion $\\sigma_*$ of the sample?\n", "1. Assume, for simplicity, that the stellar masses of individual galaxies are known exactly. Plot the 68% and 95% enclosed probability contours of the posterior probability distribution $P(\\mu_*,\\sigma_*|m_1^{(obs)}, ...,m_n^{(obs)})$\n", "2. Repeat point 1 relaxing the assumption of perfect stellar mass measurements and assuming Gaussian errors instead." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Solution\n", "\n", "### Part 1\n", "\n", "As shown in class, assuming a flat prior on both $\\mu_*$ and $\\sigma_*$, the posterior probability distribution in (\\mu_*,\\sigma_*), marginalized over the individual stellar masses, is proportional to the following:\n", "\n", "$P(\\mu_*,\\sigma_*|m_1^{(obs)}, ...,m_n^{(obs)}) \\propto \\dfrac{1}{\\sigma_*^n}\\exp{\\left\\{-\\dfrac{(S^2 + n(\\bar{y} - \\mu_*)^2)}{2\\sigma_*^2}\\right\\}}$\n", "\n", "A quick way to plot this 2D distribution is the following:\n", "1. Draw a value of $\\sigma_*$ from the marginal posterior distribution in $\\sigma_*$, $P(\\sigma_*|m_1^{(obs)}, ...,m_n^{(obs)}) = \\int d\\mu_* P(\\mu_*,\\sigma_*|m_1^{(obs)}, ...,m_n^{(obs)}) \\propto 1/\\sigma_*^{n-1}\\exp{\\{-S^2/(2\\sigma_*^2)\\}}$\n", "2. Draw a value of $\\mu_*$ from the posterior distribution in $\\mu_*$ conditional on the value of $\\sigma_*$, $P(\\mu_*|\\sigma_*,m_1^{(obs)}, ...,m_n^{(obs)}) \\propto 1/\\sigma_*^n \\exp{\\{-n(\\bar{y} - \\mu_*)^2/(2\\sigma_*^2)\\}}$\n", "3. Repeat for a large number of iterations (at least 1,000)\n", "4. Make a 2D histogram with the sample of $(\\mu_*,\\sigma_*)$ from above. Find the value $n_{68}$ for which bins with a higher number of points account for 68% of the points in the sample. Repeat for the 95%-ile level.\n", "5. Use the pylab function 'contour' to draw contour lines for the 2D histogram created in point 4" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pylab\n", "\n", "\n", "f = open('SLACS_table.cat', 'r')\n", "lmstar, lmstar_err = np.loadtxt(f, usecols=(3, 4), unpack=True)\n", "f.close()\n", "\n", "n = len(lmstar)\n", "\n", "mbar = lmstar.mean()\n", "S2 = ((lmstar - mbar)**2).sum()\n", "print S2\n", "\n", "sigma_arr = np.linspace(0.01, 0.5, 500)\n", "logp_sigma_arr = (1.-n)*np.log(sigma_arr) - 0.5*S2/sigma_arr**2\n", "logp_sigma_arr -= logp_sigma_arr.max()\n", "\n", "p_sigma_arr = np.exp(logp_sigma_arr)\n", "pylab.plot(sigma_arr, p_sigma_arr)\n", "pylab.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from scipy.interpolate import splrep, splint, splev\n", "from scipy.optimize import brentq\n", "\n", "\n", "nsamp = 10000\n", "\n", "p_sigma_spline = splrep(sigma_arr, p_sigma_arr)\n", "p_sigma_norm = splint(sigma_arr[0], sigma_arr[-1], p_sigma_spline)\n", "\n", "def sigma_cdf(sigma):\n", " return splint(sigma_arr[0], sigma, p_sigma_spline)/p_sigma_norm\n", "\n", "x_samp = np.random.rand(nsamp)\n", "sigma_samp = np.zeros(nsamp)\n", "mu_samp = np.zeros(nsamp)\n", "\n", "for i in range(nsamp):\n", " sigma_here = brentq(lambda sigma: sigma_cdf(sigma) - x_samp[i], sigma_arr[0], sigma_arr[-1])\n", " sigma_samp[i] = sigma_here\n", " mu_samp[i] = np.random.normal(mbar, sigma_here/(float(n)**0.5), 1)\n", " \n", "# calculate the 68% enclosed probability bounds, in sigma_* and mu_*\n", "sigma16, sigma84 = np.percentile(sigma_samp, 16.), np.percentile(sigma_samp, 84.)\n", "mu16, mu84 = np.percentile(mu_samp, 16.), np.percentile(mu_samp, 84.)\n", "\n", "# plot histograms of the sample of sigma_* and mu_*\n", "nbins = 50\n", "bins = np.linspace(sigma_arr[0], sigma_arr[-1], nbins)\n", "pylab.hist(sigma_samp, bins=bins)\n", "pylab.plot(sigma_arr, p_sigma_arr/p_sigma_norm*(bins[1] - bins[0])*nsamp)\n", "pylab.axvline(sigma16, linestyle='--', color='k')\n", "pylab.axvline(sigma84, linestyle='--', color='k')\n", "pylab.show()\n", "print '%4.3f %4.3f'%(sigma16, sigma84)\n", "\n", "pylab.hist(mu_samp)\n", "pylab.axvline(mu16, linestyle='--', color='k')\n", "pylab.axvline(mu84, linestyle='--', color='k')\n", "pylab.show()\n", "print '%4.3f %4.3f'%(mu16, mu84)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from scipy import ndimage\n", "\n", "\n", "# make a 2d histogram with the sample of (mu_*, sigma_*) generated earlier\n", "H, xbins, ybins = pylab.histogram2d(mu_samp, sigma_samp, bins=100)\n", "\n", "# smooth the histogram to make it look nicer\n", "H = ndimage.gaussian_filter(H, 2)\n", "\n", "# find the levels lvl68, lvl95 in the histogram H enclosing 68% and 95% of the total probability\n", "# we do this by 1) sorting the histogram from the lowest to highest value\n", "sortH = np.sort(H.flatten())\n", "\n", "# 2) taking the cumulative sum\n", "cumH = sortH.cumsum()\n", "\n", "# 3) finding the level at which the cumulative sum is 32% (complementary to 68%) of the total sum\n", "lvl68 = sortH[cumH>cumH.max()*0.32].min()\n", "lvl95 = sortH[cumH>cumH.max()*0.05].min()\n", "\n", "# plot contour levels lvl95 and lvl68 of the smoothed (transposed) histogram\n", "pylab.contour(H.T, [lvl95, lvl68], colors='k', extent=(xbins[0], xbins[-1], ybins[0], ybins[-1]))\n", "pylab.show()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Part 2\n", "\n", "In the general case in which the stellar masses are measured with error $\\epsilon_i$, the posterior probability distribution in $(\\mu_*,\\sigma_*)$, marginalized over the individual stellar masses, is\n", "\n", "$P(\\mu_*,\\sigma_*|m_1^{(obs)}, ...,m_n^{(obs)}) \\propto \\prod_i \\dfrac{1}{\\sqrt{\\sigma_*^2 + \\epsilon_i^2}} \\exp{\\left\\{-\\dfrac{(\\mu_* - m_i^{(obs)})^2}{2(\\sigma_*^2 + \\epsilon_i^2)}\\right\\}}$" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.13" } }, "nbformat": 4, "nbformat_minor": 2 }