Finding Significant Probability from Data

I've been recording the results of nearly 2000 card packs being opened. When you open a pack, you are guaranteed a rare card, but sometimes you get a mythic card instead. I would like to find the probability of getting a mythic card from a pack. In my data I have recorded 220 mythics out of 1874 packs, or approximately 11.74%. How do I know whether I have enough data points to conclude that this estimate is reasonably accurate, or whether I need more data?

  • It depends on what you consider to be reasonably correct: do you want to be correct to within 10 percent, or perhaps 5 percent? Depending on the precision you are looking for, you will need a different number of data points.
    – gd1035
    Nov 23 at 19:30

  • I suppose 5%, since I believe that is a fairly commonly used value.
    – Mr.Rasputin
    Nov 23 at 19:39

  • Use the Chebyshev inequality.
    – John Douma
    Nov 23 at 19:49
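As a rough sketch of the Chebyshev suggestion in the last comment: for the sample proportion $\hat p$ of $n$ independent packs, Chebyshev's inequality gives $P(|\hat p - p| \ge \epsilon) \le \frac{p(1-p)}{n\epsilon^2} \le \frac{1}{4n\epsilon^2}$. The function name `chebyshev_sample_size` and the failure probability `delta` are assumptions introduced here for illustration; the thread itself only fixes the precision at 5%.

```python
import math

def chebyshev_sample_size(eps, delta, p=None):
    """Smallest n for which the Chebyshev bound p(1-p)/(n*eps^2) is <= delta.

    If p is None, the worst case p(1-p) <= 1/4 is used (no estimate needed).
    """
    variance = 0.25 if p is None else p * (1 - p)
    raw = variance / (delta * eps ** 2)
    # Round first so floating-point noise cannot push an exact integer upward.
    return math.ceil(round(raw, 9))

# Hypothetical targets: within 0.05 of the true rate, failing at most 5% of the time.
print(chebyshev_sample_size(0.05, 0.05))            # worst case, no estimate of p
print(chebyshev_sample_size(0.05, 0.05, p=0.1174))  # plugging in the observed rate
```

Chebyshev makes no distributional assumption, so the sample size it asks for is considerably more conservative than one based on a normal approximation.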















probability

asked Nov 23 at 19:22
Mr.Rasputin

1 Answer
accepted

As a simple way to get a reasonable estimate of the required sample size, you could use Cochran's formula:
$$N=\frac{Z^2\,p\,(1-p)}{\epsilon^2}$$
where $\epsilon$ is the desired level of precision (in your case, 0.05), $Z$ is the Z-score that corresponds to the desired confidence level (in your case, 95% confidence, so $Z=1.96$), and $p$ is an estimate of the proportion of the property being observed (here a mythic card, with estimated proportion 0.1174). Doing this calculation gives approximately $N=159$, well below the sample size you already have.
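The calculation can be sketched in a few lines of Python. The function names `cochran_sample_size` and `margin_of_error` are illustrative choices, not part of the answer; the second function simply solves the same formula for $\epsilon$ to show what precision the 1874 recorded packs correspond to under this approximation.

```python
import math

def cochran_sample_size(p, eps, z=1.96):
    """Cochran's formula N = z^2 * p * (1 - p) / eps^2 (left unrounded)."""
    return z ** 2 * p * (1 - p) / eps ** 2

def margin_of_error(p, n, z=1.96):
    """Same formula solved for eps: eps = z * sqrt(p * (1 - p) / n)."""
    return z * math.sqrt(p * (1 - p) / n)

p_hat = 220 / 1874  # observed mythic rate, about 0.1174

print(cochran_sample_size(p_hat, 0.05))  # about 159 packs for a margin of 0.05
print(margin_of_error(p_hat, 1874))      # about 0.0146 with the packs already opened
```

Read this way, the existing sample pins the rate down to roughly 1.5 percentage points either side of 11.74% at 95% confidence, assuming independent packs and the normal approximation behind the formula.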






  • Thank you, that's perfect and very helpful. I was also wondering how you calculate the Z-score. I was thinking of using the formula to check smaller precision levels.
    – Mr.Rasputin
    Nov 27 at 20:25

  • You can look it up in a z-score table (specifically the left-tail, i.e. cumulative, table). First, note that the standard normal distribution is symmetric around 0 and that the significance level (in your case 1 − 0.95 = 0.05) is split equally between the two tails. Thus, in the table we need to look for the cumulative probability 0.975. You will find it in the row corresponding to the value 1.9 and the column corresponding to the value 0.06 (which supplies the second decimal). Thus, the Z-score we are looking for is 1.96.
    – DavidPM
    Nov 27 at 22:05
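The table lookup described in the comment above can also be reproduced numerically. This is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `z_score` is a hypothetical choice made for this example.

```python
from statistics import NormalDist

def z_score(confidence):
    """Two-sided z-score: the standard normal quantile at 1 - alpha/2."""
    alpha = 1 - confidence            # total probability left in the two tails
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_score(level), 3))
```

For a 95% confidence level this looks up the cumulative probability 0.975, matching the 1.96 read from the table.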













answered Nov 23 at 23:07
DavidPM