Project author: wittawatj

Project description:
AISTATS 2016. K2-ABC: Approximate Bayesian Computation with Kernel Embeddings.
Language: Matlab
Repository: git://github.com/wittawatj/k2abc.git
Created: 2016-03-12T16:48:11Z
Project page: https://github.com/wittawatj/k2abc

License: MIT License



K2-ABC

This repository contains a Matlab implementation of K2-ABC as described in

> Mijung Park, Wittawat Jitkrittum, Dino Sejdinovic. "K2-ABC: Approximate Bayesian Computation with Kernel Embeddings." AISTATS 2016.

See the paper here.

Demo script

  1. In Matlab, switch to the code/ folder with `cd code`.
  2. Run `startup` to include the necessary dependencies.
  3. Run `demo_k2abc_rf` to see a demo. The full code is at
    demo/demo_k2abc_rf.m.
    This code demonstrates how to use K2-ABC with random Fourier features as well as
    K2-ABC with full quadratic MMD. The problem we consider is a one-dimensional Gaussian
    likelihood. The goal is to infer the mean of the normal distribution. In this
    demo, we assume that the true mean is 3 and observe 200 points.
```matlab
% Set up a likelihood function (theta, n) -> data. Here n is the number of points
% to draw for each parameter theta. This function should return a d x n matrix
% in general.
likelihood_func = @(theta, n)randn(1, n) + theta;
% True mean is 3.
true_theta = 3;
% Set the number of observations to 200.
num_obs = 200;
% Number of random features.
nfeatures = 50;
% Generate the set of observations.
obs = likelihood_func(true_theta, num_obs);
% Options. All options are described in ssf_kernel_abc.
op = struct();
% A proposal distribution for drawing the latent variables of interest.
% func_handle : n -> (d' x n) where n is the number of samples to draw.
% Returns a d' x n matrix.
% Here we use a 0-mean Gaussian proposal with variance 8.
op.proposal_dist = @(n)randn(1, n)*sqrt(8);
op.likelihood_func = likelihood_func;
% List of ABC tolerances. Will try all of them one by one.
op.epsilon_list = logspace(-3, 0, 9);
% Sample size from the posterior.
op.num_latent_draws = 500;
% Number of pseudo data to draw, i.e., the data drawn from the likelihood
% function for each theta.
op.num_pseudo_data = 200;
% Set the Gaussian width using the median heuristic.
width2 = meddistance(obs)^2;
% The Gaussian kernel takes the squared width.
ker = KGaussian(width2);
% This option is not necessary for K2-ABC with full quadratic MMD.
% Only needed for random Fourier features.
op.feature_map = ker.getRandFeatureMap(nfeatures, 1);
% Run K2-ABC with random features.
% Rrf contains latent samples and their weights for each epsilon.
[Rrf, op] = k2abc_rf(obs, op);
% Run K2-ABC with full quadratic MMD.
[R, op] = k2abc(obs, op);
% Plot the results.
figure
cols = 3;
num_eps = length(op.epsilon_list);
for ei = 1:num_eps
    subplot(ceil(num_eps/cols), cols, ei);
    ep = op.epsilon_list(ei);
    % Plot the empirical distribution of the drawn latent variables.
    hold on
    %plot(Rrf.latent_samples(Ilin), Rrf.norm_weights(Ilin, ei), '-b');
    %plot(R.latent_samples(I), R.norm_weights(I, ei), '-r');
    stem(R.latent_samples, R.norm_weights(:, ei), 'r', 'Marker', 'none');
    stem(Rrf.latent_samples, Rrf.norm_weights(:, ei), 'b', 'Marker', 'none');
    set(gca, 'fontsize', 16);
    title(sprintf('ep = %.2g', ep));
    xlim([true_theta-3, true_theta+3]);
    ylim([0, 0.04]);
    grid on
    hold off
end
legend('K2ABC-quad', 'K2ABC-rf')
superTitle = sprintf('Approx. posterior. true theta = %.1f, ker = %s, likelihood = %s', true_theta, ...
    ker.shortSummary(), func2str(op.likelihood_func));
annotation('textbox', [0 0.9 1 0.1], ...
    'String', superTitle, ...
    'EdgeColor', 'none', ...
    'HorizontalAlignment', 'center', ...
    'FontSize', 16)
```

The script will show the following plot.

Inferred posteriors with different epsilons

In both K2-ABC with full quadratic MMD (K2ABC-quad) and K2-ABC with
random Fourier features (K2ABC-rf), the posterior samples concentrate around the true mean of 3.
We observe that smaller epsilons tend to yield posterior distributions with smaller variance.
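The role of epsilon follows from how K2-ABC weights each proposed parameter: theta_i receives weight proportional to exp(-MMD^2(observed, pseudo_i) / epsilon), so a smaller epsilon penalizes MMD mismatch more sharply and concentrates the weights. The following is a minimal sketch of this weighting step under the demo's setup, using a Gaussian kernel and the biased (V-statistic) quadratic MMD estimate for brevity; it is not the repository's actual k2abc implementation, and the variable names are illustrative:

```matlab
% Gaussian kernel matrix between two 1 x n samples, squared width w2.
gauss = @(x, y, w2) exp(-(x(:) - y(:)').^2 / (2*w2));
% Biased quadratic MMD^2 estimate between samples x and y.
mmd2 = @(x, y, w2) mean(mean(gauss(x, x, w2))) ...
    + mean(mean(gauss(y, y, w2))) - 2*mean(mean(gauss(x, y, w2)));

% Draw parameters from the proposal, simulate pseudo data for each,
% and weight by exp(-MMD^2 / epsilon).
epsilon = 0.1;
thetas = randn(1, 500)*sqrt(8);
weights = zeros(1, numel(thetas));
for i = 1:numel(thetas)
    pseudo = randn(1, 200) + thetas(i);
    weights(i) = exp(-mmd2(obs, pseudo, width2) / epsilon);
end
weights = weights / sum(weights);   % normalized posterior weights
```

Here `obs` and `width2` are the observations and kernel width from the demo above; shrinking `epsilon` sharpens the weights around parameters whose pseudo data best match the observations.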