{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Lab 3, Exercise 1: drawing samples from an arbitrary distribution\n", "\n", "Let $P(\\theta) \\propto \\dfrac{1}{\\theta^5}\\exp{\\left\\{-\\dfrac{1}{\\theta^2}\\right\\}}$ be an un-normalized probability distribution truncated between the interval $[0, 3]$. Write a code that draws random samples from it." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Solution\n", "\n", "One way of solving this problem is the following:\n", "- Define a function that computes the cumulative distribution function (CDF) of $P(\\theta)$. This can be done by interpolating $P(\\theta)$ with a spline on a grid in $\\theta$, then using `scipy.special.splint` to calculate the integral of $P(\\theta)$ between $a$ and a point $\\theta$:\n", "\n", "```python\n", "from scipy.interpolate import splrep, splint, splev\n", "\n", "a = 0.\n", "b = 3.\n", "\n", "def pfunc(theta):\n", " return theta**(-5)*np.exp(-1./theta**2)\n", " \n", "ngrid = 1001\n", "theta_grid = np.linspace(a, b, ngrid) # defines a grid in values of theta\n", "p_grid = pfunc(theta_grid) # pfunc is a function that returns the probability density P(\\theta)\n", "p_spline = splrep(theta_grid, p_grid) # creates a spline that interpolates P(\\theta) over the grid\n", "p_norm = splint(a, b, p_spline) # computes the integral of P(\\theta) over the full range, to normalize the CDF to 1\n", "def cdf(theta):\n", " return splint(a, theta, p_spline)/p_norm\n", "```\n", "\n", "- Draw samples $x$ from a uniform distribution between 0 and 1. Set $\\theta = \\mathrm{CDF}^{-1}(x)$:\n", "\n", "```python\n", "from scipy.optimize import brentq\n", "\n", "nsamp = 1000\n", "theta_samp = np.zeros(nsamp)\n", "x_samp = np.random.rand(nsamp)\n", "\n", "for i in range(nsamp):\n", " theta_samp[i] = brentq(lambda theta: cdf(theta) - x[i], a, b)\n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from scipy.interpolate import splrep, splint, splev\n", "import pylab\n", "\n", "# define the boundaries of the distribution\n", "a = 0.01 # avoid zero, the probability distribution is not defined\n", "b = 3.\n", "\n", "# define the probability density (un-normalized)\n", "def pfunc(theta):\n", " return theta**(-5)*np.exp(-1./theta**2)\n", "\n", "# prepare a grid of values of theta for interpolation\n", "ngrid = 1000\n", "theta_grid = np.linspace(a, b, ngrid)\n", "\n", "# evaluate P(theta) on the grid\n", "p_grid = pfunc(theta_grid)\n", "\n", "# create a spline that interpolates P(\\theta) over the grid\n", "p_spline = splrep(theta_grid, p_grid)\n", "\n", "# compute the integral of P(\\theta) over the full range, to normalize the CDF to 1\n", "p_norm = splint(a, b, p_spline)\n", "\n", "# define the CDF\n", "def cdf(theta):\n", " return splint(a, theta, p_spline)/p_norm\n", "\n", "# evaluates the CDF on the same grid of theta used for the interpolation, for illustration purposes only\n", "cdf_grid = np.zeros(ngrid)\n", "for i in range(ngrid):\n", " cdf_grid[i] = cdf(theta_grid[i])\n", " \n", "# plot the \n", "pylab.plot(theta_grid, pfunc(theta_grid))\n", "pylab.show()\n", "\n", "# plot the CDF\n", "pylab.plot(theta_grid, cdf_grid)\n", "pylab.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from scipy.optimize import brentq\n", "\n", "\n", "# draw random values of x from U(0, 1)\n", "nsamp = 10000\n", "x_samp = np.random.rand(nsamp)\n", "\n", "theta_samp = np.zeros(nsamp)\n", "for i in range(nsamp):\n", " theta_samp[i] = brentq(lambda theta: cdf(theta) - x_samp[i], a, b)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# plot the sample, together with the probability distribution\n", "\n", "\n", "nbins = 30\n", "bins = np.linspace(a, b, nbins)\n", "pylab.hist(theta_samp, bins=bins)\n", "pylab.plot(theta_grid, p_grid/p_norm*(bins[1] - bins[0])*nsamp)\n", "pylab.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.13" } }, "nbformat": 4, "nbformat_minor": 2 }