Finding Significant Probability from Data
I've been recording the results of nearly 2000 card packs being opened. When you open a pack you are guaranteed a rare card, but sometimes you get a mythic card instead. I would like to find the probability of getting a mythic card from a pack. From my data, I have recorded 220 mythics out of 1874 packs, or approximately 11.74%. How do I know whether I have enough data points to conclude that this estimate is reasonably correct, or whether I need more data?
probability
It depends on what you consider reasonably correct: do you want to be correct within 10 percent, or perhaps 5%? Depending on the precision you are looking for, you will need a different number of data points.
– gd1035
Nov 23 at 19:30
I suppose 5% since I believe that is a fairly common value that is used.
– Mr.Rasputin
Nov 23 at 19:39
Use the Chebyshev inequality.
– John Douma
Nov 23 at 19:49
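For reference, the Chebyshev suggestion can be sketched as follows. For a sample proportion, Chebyshev's inequality gives P(|p_hat - p| >= eps) <= p(1-p)/(n eps^2), and since p(1-p) is at most 1/4 the bound holds without knowing p. The function names below are hypothetical, and the bound is known to be conservative, so treat this as a rough check rather than a tight answer.

```python
import math


def chebyshev_bound(n, eps, p=None):
    """Upper bound on P(|p_hat - p| >= eps) for n independent trials.

    If p is unknown, the worst case p * (1 - p) = 1/4 is used.
    """
    var_term = 0.25 if p is None else p * (1 - p)
    return min(1.0, var_term / (n * eps ** 2))


def chebyshev_sample_size(eps, delta, p=None):
    """Smallest n for which the Chebyshev bound is at most delta."""
    var_term = 0.25 if p is None else p * (1 - p)
    return math.ceil(var_term / (delta * eps ** 2))


# Numbers from the question: 1874 packs, precision 0.05.
print(chebyshev_bound(1874, 0.05))        # worst-case bound, a little above 0.05
print(chebyshev_sample_size(0.05, 0.05))  # worst-case n, roughly 2000
```

Under the worst-case assumption this asks for about 2000 packs to be within 0.05 with probability at least 95%, which is far more than a normal approximation would require.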
asked Nov 23 at 19:22
Mr.Rasputin
1 Answer
accepted
As a simple approach to get a reasonable estimate of the sample size needed you could use Cochran's formula:
$$N=\frac{Z^2\,p\,(1-p)}{\epsilon^2}$$
where $\epsilon$ is the desired level of precision, i.e. the margin of error (in your case, 0.05), $Z$ is the Z-score that corresponds to the desired confidence level (in your case, 95% confidence, so $Z=1.96$), and $p$ is an estimate of the proportion of the property being observed (mythic card, with estimated proportion 0.1174). Doing this calculation gives approximately $N=159$, well below the sample size you already have.
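The calculation above can be reproduced with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the second helper simply inverts the same formula to show what margin of error the 1874 recorded packs already give under the normal approximation.

```python
import math


def cochran_sample_size(p, margin, z=1.96):
    """Cochran's formula: N = Z^2 * p * (1 - p) / margin^2 (not rounded)."""
    return z ** 2 * p * (1 - p) / margin ** 2


def margin_of_error(p, n, z=1.96):
    """Cochran's formula solved for the margin, given n observations."""
    return z * math.sqrt(p * (1 - p) / n)


# Values from the question: 220 mythics in 1874 packs.
p_hat = 220 / 1874

print(cochran_sample_size(0.1174, 0.05))  # about 159, as in the answer
print(cochran_sample_size(0.1174, 0.01))  # a tighter margin needs far more packs
print(margin_of_error(p_hat, 1874))       # about 0.0146
```

Read this way, the existing data pin the mythic rate down to roughly 11.7% plus or minus 1.5 percentage points at 95% confidence, assuming packs are independent.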
Thank you, that's perfect and very helpful. I was also wondering, how do you calculate the Z-score? I was thinking of using the formula to check at smaller precision levels.
– Mr.Rasputin
Nov 27 at 20:25
You can have a look at the z-score tables (particularly the left-tail table, which lists cumulative probabilities). First, note that the standard normal distribution is symmetric around 0, so the 5% left over from the 95% confidence level is split equally between the two tails, 0.025 in each. (This 0.05 happens to equal your precision level, but it is a separate quantity.) Thus, in the table we need to look for the cumulative probability 0.975. You will find it in the row for 1.9 and the column for 0.06 (which supplies the second decimal place). Thus, the Z-score we are looking for is 1.96.
– DavidPM
Nov 27 at 22:05
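As an alternative to the table lookup, the same value can be computed with the inverse normal CDF. The sketch below uses `statistics.NormalDist` from the Python standard library (available from Python 3.8); the helper name `z_score` is hypothetical.

```python
from statistics import NormalDist


def z_score(confidence):
    """Two-sided Z-score for a confidence level such as 0.95."""
    alpha = 1 - confidence  # probability left outside the interval
    # Split alpha between both tails and invert the standard normal CDF.
    return NormalDist().inv_cdf(1 - alpha / 2)


print(z_score(0.95))  # close to 1.96
print(z_score(0.99))  # close to 2.576
```

Changing the confidence level changes $Z$; changing the precision changes $\epsilon$ in Cochran's formula and leaves $Z$ alone.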
answered Nov 23 at 23:07
DavidPM