How to quantify the confidence on a Bayesian posterior probability?
Consider a physical system that depends on a parameter $0\leq \phi <\infty$. I want to (i) find the probability that this parameter is smaller than a critical value, $\phi\leq \phi_c$, and (ii) quantify my confidence in the probability that I produce in (i) (see below). To this aim, I use a Bayesian approach. A priori, the probability distribution of this parameter is $p(\phi)$. I measure some observable, say $X$. By repeating this measurement $m$ times, I collect a dataset of outcomes of $X$, which I denote by the vector ${\bf x} = \{x_1,x_2,\dots,x_m\}$.
From Bayes' rule, I can update the probability distribution $p(\phi) \to p(\phi|{\bf x})$ as follows:
$$ p(\phi|{\bf x}) = \frac{p({\bf x}|\phi)\,p(\phi)}{\int d\phi~ p({\bf x}|\phi)\,p(\phi)}.$$
Thus, the answer to (i) is simply $p(\phi\leq\phi_c) = \int_0^{\phi_c} d\phi~ p(\phi|{\bf x})$.
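In practice, for a generic prior and likelihood this posterior probability can be evaluated numerically; below is a minimal grid-based sketch in R (the prior and lik functions are hypothetical placeholders, not part of the actual model):
# Approximate P(phi <= phi_c | x) on a grid; `prior` and `lik` are placeholder
# functions giving the prior density and the single-observation likelihood,
# e.g. prior = function(p) dgamma(p, 1, 1), lik = function(x, p) dpois(x, p).
posterior_prob <- function(x, phi_c, prior, lik, upper = 50, n = 10000) {
  phi <- seq(1e-8, upper, length.out = n)                  # grid over phi
  unnorm <- sapply(phi, function(p) prod(lik(x, p)) * prior(p))
  post <- unnorm / sum(unnorm)                             # normalized grid weights
  sum(post[phi <= phi_c])                                  # approximate integral up to phi_c
}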
It remains to answer (ii), i.e., to quantify how reliable our answer to (i) is. More precisely, I want to quantify the error made in estimating $p(\phi\leq\phi_c)$. Such an error should depend on the number of measurements $m$ and vanish as $m\to \infty$ (most likely it scales as $1/\sqrt{m}$).
I would appreciate any hint or references on quantifying the described error. I hope that I was clear enough about the problem; if not, please let me know.
Tags: bayesian, bayes-theorem, parameter-estimation, mean-square-error
asked Dec 18 '18 at 11:51, edited Dec 18 '18 at 13:24 – Moha
1 Answer
As you stated, using the data that you have, you can obtain a posterior distribution and calculate the integral
$$I = \int_0^{\phi_c} p(\phi \mid \boldsymbol{x})\, d\phi$$
explicitly. This does not come with an uncertainty estimate, but what we would like to know is: over the data sets that could have been drawn, what would our estimates of $I$ look like? Bootstrapping addresses exactly this. It involves repeatedly resampling the data, computing an estimate of $I$ under each resample, and then aggregating the estimates at the end. The spread of these estimates of $I$ quantifies our uncertainty in the estimate.
You can use the Bayesian bootstrap specifically (there is a good description of it here) to approximate $I$ in a Bayesian fashion.
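To make the resampling step concrete, here is a minimal hand-rolled sketch of the Bayesian bootstrap idea (a simplified stand-in for what the bayesboot call further down does; bayes_boot and its arguments are illustrative, not the package's internals):
# Bayesian bootstrap sketch: draw Dirichlet(1,...,1) weights over the observations,
# resample the data with those probabilities, and recompute the statistic each time.
bayes_boot <- function(x, statistic, R = 1000) {
  m <- length(x)
  replicate(R, {
    w <- rexp(m)                 # normalized iid Exp(1) draws ~ Dirichlet(1,...,1)
    w <- w / sum(w)
    xs <- sample(x, size = m, replace = TRUE, prob = w)
    statistic(xs)
  })
}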
I'll make a specific example here. Suppose that the prior for $\phi$ is a gamma distribution with shape $\alpha$ and rate $\beta$ (I'll use $\alpha = \beta = 1$ in my analysis), so
$$p(\phi) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\phi^{\alpha - 1}e^{-\beta\phi}, \qquad \phi > 0.$$
The likelihood will be Poisson with parameter $\phi$, so the likelihood of $\boldsymbol{x}$ given $\phi$ is
$$p(\boldsymbol{x}\mid\phi) = \frac{e^{-\phi m}\, \phi^{\sum_i x_i}}{\prod_i x_i !},$$
where each $x_i \in \{0, 1, 2, \dots\}$.
Hence the posterior distribution is again a gamma distribution:
$$p(\phi \mid \boldsymbol{x}) = \frac{(\beta + m)^{\alpha + \sum_i x_i}}{\Gamma(\alpha + \sum_i x_i)}\, \phi^{\alpha + \sum_i x_i - 1}e^{-(\beta + m)\phi}, \qquad \phi > 0,$$
and we can calculate $I$ from the gamma CDF (e.g. pgamma in R, or lookup tables).
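For completeness, the conjugacy can be checked directly by multiplying the prior and the likelihood and reading off the gamma kernel:
$$p(\phi \mid \boldsymbol{x}) \propto p(\boldsymbol{x}\mid\phi)\,p(\phi) \propto e^{-\phi m}\phi^{\sum_i x_i}\cdot \phi^{\alpha-1}e^{-\beta\phi} = \phi^{\alpha+\sum_i x_i-1}e^{-(\beta+m)\phi},$$
which is the kernel of a gamma distribution with shape $\alpha+\sum_i x_i$ and rate $\beta+m$.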
I then simulated this setup with $\phi = 0.95$ and $\phi_c = 1$ (so $I$ should get close to $1$ as we get more information about $\phi$). Running this in R with $m = 10$, I got an estimate of $I = 0.54$. Running $1{,}000$ resamples using the bayesboot package gave a bootstrap distribution of $I$ (histogram not reproduced here) that is heavier towards $1$ but still fairly spread out.
Using a sample of size $m = 100$, I got an estimate of $I = 0.32$, and the bootstrap distribution had not changed much.
With $m = 10{,}000$ data points, I got an estimate of $I = 0.96$, and the bootstrap distribution is tightly concentrated near $1$: now we are really certain that $I$ is close to $1$!
The estimates have the properties that you suggested they would: they depend on the sample size $m$, and the uncertainty decreases as $m$ gets larger. I don't think you will be able to find a general formula for how quickly the uncertainty decreases. For example, suppose that the data were generated from a uniform $U(0, \phi)$ distribution. Then if we ever observe some $\phi_c < x_i$, we know immediately that $P(\phi < \phi_c) = 0$, since it is impossible to have $\phi < \phi_c < x_i$ (each $x_i$ is at most $\phi$).
Here is the R code that I used to generate the plots:
# Install once, then load the Bayesian bootstrap package
install.packages("bayesboot")
library(bayesboot)

phi = 0.95          # true parameter used to simulate the data
priorShape = 1      # gamma prior shape (alpha)
priorRate = 1       # gamma prior rate (beta)

# Posterior probability P(phi <= 1 | x) under the gamma-Poisson model:
# the posterior is Gamma(alpha + sum(x), beta + m), so use the gamma CDF.
postProbLessThanOne = function(x) {
  postShape = priorShape + sum(x)
  postRate = priorRate + length(x)
  postProb = pgamma(1, shape = postShape, rate = postRate)
  return(postProb)
}

set.seed(1)

# m = 10
x10 = rpois(10, phi)
postProbLessThanOne(x10) # 0.54
bootstraps10 = bayesboot(data = x10, statistic = postProbLessThanOne, R = 1000,
                         R2 = 10, use.weights = FALSE, .progress = "text")
hist(bootstraps10$V1, main = "BB Distribution of I: m = 10",
     xlab = "Bootstrap Values of I")

# m = 100
x100 = rpois(100, phi)
postProbLessThanOne(x100) # 0.32
bootstraps100 = bayesboot(data = x100, statistic = postProbLessThanOne, R = 1000,
                          R2 = 100, use.weights = FALSE, .progress = "text")
hist(bootstraps100$V1, main = "BB Distribution of I: m = 100",
     xlab = "Bootstrap Values of I")

# m = 10,000
x10000 = rpois(10000, phi)
postProbLessThanOne(x10000) # 0.96
bootstraps10000 = bayesboot(data = x10000, statistic = postProbLessThanOne, R = 1000,
                            R2 = 10000, use.weights = FALSE, .progress = "text")
hist(bootstraps10000$V1, main = "BB Distribution of I: m = 10,000",
     xlab = "Bootstrap Values of I")
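If a single number is wanted for the uncertainty, the spread of the bootstrap replicates can be summarized directly; for instance (a sketch reusing the objects above):
# Standard error and a 95% percentile interval for I from the m = 10,000 run
sd(bootstraps10000$V1)
quantile(bootstraps10000$V1, probs = c(0.025, 0.975))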
answered Dec 19 '18 at 23:40, edited Dec 23 '18 at 11:54 – Alex
That makes a lot of sense, Alex. Two questions: 1) The Poisson likelihood sounds a bit strange to me. I expect that, for a given $\phi$, the integral over all $x$ is normalized to one; however, it seems to be the other way around, that is, for a given $x$ the integral over $\phi$ is normalized to one. 2) Regarding the scaling with $m$: if rather than the initial estimator $I$ we use another estimator $J = \frac{1}{m} \sum_{i=1}^m \int_0^{\phi_c} p(\phi|x_i)\, d\phi$, one can verify $\mathrm{Var}(J)\propto 1/m$. Whether $J$ is the best estimator, or another estimator such as $I$ can outperform it, remains unclear to me.
– Moha, Dec 23 '18 at 10:43
1) The Poisson likelihood does sum to $1$ for a given $\phi$. Consider the case where $m = 1$ to start with. Then the likelihood is $e^{-\phi}\phi^{x_1}/x_1!$, and if we sum over $x_1$ from $0$ to $\infty$ we get $1$ (because $e^{\phi} = \sum_{i = 0}^\infty \phi^i / i!$). For multiple independent observations $x_1, \dots, x_m$ we are just multiplying the probabilities together, so if we sum over all possible combinations of $(x_1, \dots, x_m)$, where each $x_i \geq 0$, we will get $1$.
– Alex, Dec 23 '18 at 11:29
2) The estimator $J$ will be over-influenced by the prior distribution. Each integral $\int_{0}^{\phi_c} p(\phi \mid x_i)\, d\phi$ is going to be greatly affected by the prior no matter how large $m$ is. As you say, the estimate will converge as $m$ gets larger, but it converges to the expected value of $I$ when $m = 1$.
– Alex, Dec 23 '18 at 11:35
Maybe the Poisson likelihood was confusing because I didn't define that the $x_i$ values had to be non-negative integers; I'll correct that in my answer.
– Alex, Dec 23 '18 at 11:53
I see your point. In fact I was just simulating the estimator $J$, and it turns out to be very bad. I'll stick with the initial $I$. Thanks a lot for your useful answer and comments.
– Moha, Dec 23 '18 at 12:21