^ That Slate piece was just referenced in their write-up on the state of science today: http://www.slate.com/articles/health_and...roken.html
Statistical Significance
65 Replies, 15200 Views
(2017-10-06, 02:07 PM)jkmac Wrote: We've discussed that already. And like it or not, this seems to be a fact of life with these phenomena. They are not consistent, and people need to deal with it.
You are talking about true results and false results. There are two kinds of true results - true positives and true negatives. And there are two kinds of false results - false positives and false negatives. Mostly researchers are interested in finding true positives, although they should also be interested in finding true negatives (whole 'nother discussion). What you mention above is how we go about distinguishing between true positives and false positives and false negatives.
I excluded true negatives, because we are talking about whether a claim can be supported with significance testing, not whether it can be disproven.
The use of a p-value in order to determine whether a study is positive or negative is the same as using a diagnostic test to decide if a patient has a particular condition (conceptually and mathematically). So I'm going to explain it in those terms.
The p-value represents the specificity of the test. That is, it tells us how many false positives (and true negatives) we can expect when the patient doesn't have the condition (there is no psi). The p-value doesn't tell us about true positives. True positives are determined by the sensitivity of the test, which in the case of significance testing with p-values is something called "power". Sensitivity (power) tells us how many positive tests (significant studies) we can expect when the patient has the condition (there is psi). "Power" depends upon the size of the effect and the size of the study (and the sensitivity which we can ignore for now).
When we quote the p-value, we give no information about true positives and false negatives - the two things we are most interested in if we are trying to prove a claim or determine if a patient has a condition.
That is, the p-value doesn't tell us what we want to know - what is the probability that I have found the effect I am looking for, in this study, or what is the probability that this patient has the condition?
Medicine uses the Likelihood Ratio to include information about true positives and false negatives, which is a form of the Bayes' Factor discussed in the link provided earlier. If we do the same thing to the idea of significance testing, we discover that we falsely think a significant study supports the claim more often than the p-value suggests. That is, a p-value of 0.000001 doesn't mean that there is only a one in a million chance that the effect was due to chance - it could be one in three for all we know. That makes a huge difference when it comes to treating patients.
When it comes to psi, misclassifications in whether or not a study is a true-positive or a false-positive (and true-negative/false-negative) would give the appearance of inconsistency.
The situation you describe arises all the time in medical testing - the result of N+1 testing overturns the results of N testing. And this happens when the Likelihood Ratio of the N+1 diagnostic test is better than the LR of the N diagnostic test. With respect to the Bayes' Factor, this would reflect the results under a high Bayes' Factor supplanting the results under a low Bayes' Factor.
Linda
(2017-10-06, 04:13 PM)fls Wrote: You are talking about true results and false results. There are two kinds of true results - true positives and true negatives. And there are two kinds of false results - false positives and false negatives. Mostly researchers are interested in finding true positives, although they should also be interested in finding true negatives (whole 'nother discussion). What you mention above is how we go about distinguishing between true positives and false positives and false negatives.
I excluded true negatives, because we are talking about whether a claim can be supported with significance testing, not whether it can be disproven.
So here's the deal: I've already said that I'm not an expert in research. Given that, I can't help but think that almost everything you are saying, assuming it is true, is based on tests on a more deterministic system. I.e., the thing you are testing doesn't change behavior significantly, and thus it behaves in the same way the second time you test it. Given these limitations, it is certainly reasonable to think that a second test could shed light on a previous test's results.
I can't help but think, however, that your experience in medical testing is corrupting your expectations and assumptions in regard to testing this sort of system. I'm suggesting that how one must interpret psi test data might be quite different due to the nature of the thing. I can't make any pronouncements to that fact, it's just an instinct I have.
I wish we had someone here with deep experience collecting and analyzing this sort of test data (specifically psi) who could comment and set me (or you) straight. Until then, I'll probably need to keep my inexpert opinion/instinct on the matter to myself. (which will probably make lots of people happy)
(2017-10-06, 05:28 PM)jkmac Wrote: So here's the deal: I've already said that I'm not an expert in research. Given that, I can't help but think that almost everything you are saying, assuming it is true, is based on tests on a more deterministic system. I.e., the thing you are testing doesn't change behavior significantly, and thus it behaves in the same way the second time you test it. Given these limitations, it is certainly reasonable to think that a second test could shed light on a previous test's results.
Not really. The stuff we are testing in medicine acts much the same way psi acts.
Quote: I can't help but think, however, that your experience in medical testing is corrupting your expectations and assumptions in regard to testing this sort of system. I'm suggesting that how one must interpret psi test data might be quite different due to the nature of the thing. I can't make any pronouncements to that fact, it's just an instinct I have.
I have experience both in psi and in medicine. I suspect you don't have a realistic understanding of medicine (if you think that the thing we are testing doesn't change behavior significantly and behaves in the same way the second time you test it).
Quote: I wish we had someone here with deep experience collecting and analyzing this sort of test data (specifically psi) who could comment and set me (or you) straight.
We have had other people with expertise in collecting and analyzing this sort of test data come by on the previous Skeptic forums. Unfortunately, you would be unlikely to find them helpful, given that they weren't proponents and they agreed with me.
Linda
(2017-10-06, 05:28 PM)jkmac Wrote: So here's the deal: I've already said that I'm not an expert in research. Given that, I can't help but think that almost everything you are saying, assuming it is true, is based on tests on a more deterministic system. I.e., the thing you are testing doesn't change behavior significantly, and thus it behaves in the same way the second time you test it. Given these limitations, it is certainly reasonable to think that a second test could shed light on a previous test's results.
What you describe is known as a loophole. In other words, psi is too tricky to pin down, so let's just assume it's all true.
It often seems that psi researchers are quick to shout eureka when they should be a great deal more reserved in expressing positive conclusions. Here's a non-psi example of counting chickens before they've hatched. Start at 7 minutes specifically, or just watch the whole vid.
Are the Fundamental Constants Changing?
http://psiencequest.net/forums/thread-421.html
Quote: Three things the p value can't tell you about your hypothesis.
(2017-10-06, 11:37 PM)Steve001 Wrote: [quoted]
That's not true, but to be fair to Andrew Gelman, he didn't say that - it came from Eston Martz, who writes a blog for a software company that sells a statistics program.
(2017-10-06, 04:13 PM)fls Wrote: The p-value represents the specificity of the test. That is, it tells us how many false positives (and true negatives) we can expect when the patient doesn't have the condition (there is no psi). The p-value doesn't tell us about true positives. True positives are determined by the sensitivity of the test, which in the case of significance testing with p-values is something called "power". Sensitivity (power) tells us how many positive tests (significant studies) we can expect when the patient has the condition (there is psi). "Power" depends upon the size of the effect and the size of the study (and the sensitivity which we can ignore for now). When we quote the p-value, we give no information about true positives and false negatives - the two things we are most interested in if we are trying to prove a claim or determine if a patient has a condition. That is, the p-value doesn't tell us what we want to know - what is the probability that I have found the effect I am looking for, in this study, or what is the probability that this patient has the condition?
Chris, I'd be interested in your comments on the above. To me, it seems confused. Supposedly, "power" "tells us how many positive tests (significant studies) we can expect", but at the same time, p-values "give no information about true positives" - so how do we judge a study as "significant" in the absence of a p-value?
(2017-10-06, 06:45 PM)fls Wrote: We have had other people with expertise in collecting and analyzing this sort of test data come by on the previous Skeptic forums.
Unfortunately, you would be unlikely to find them helpful, given that they weren't proponents and they agreed with me.
I'm trying hard not to be rude right now, because if you look at my post, I pretty clearly state that either one of us might need to be set straight. Your wink doesn't change the fact that you are being passive aggressive, the way I see it.
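For anyone who wants to put numbers on the diagnostic-test analogy discussed earlier in the thread, here is a minimal Python sketch. It is not taken from any post here. It assumes the usual textbook framing, in which the significance threshold `alpha` (rather than an individual p-value) plays the role of the false-positive rate, i.e. 1 - specificity, and power plays the role of sensitivity. The function names and every input value (the prior, the power, the threshold) are hypothetical and chosen only for illustration.

```python
def positive_likelihood_ratio(power, alpha):
    """LR+ = sensitivity / (1 - specificity) = power / alpha."""
    return power / alpha


def posterior_probability(prior, power, alpha):
    """Probability that the effect is real, given a significant result.

    prior: assumed probability, before the study, that the effect is real
    power: P(significant | effect is real), the sensitivity
    alpha: P(significant | no effect), the false-positive rate
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Hypothetical inputs: a 10% prior, tested once with a well-powered
    # study and once with an underpowered one, both at alpha = 0.05.
    for power in (0.8, 0.2):
        lr = positive_likelihood_ratio(power, 0.05)
        post = posterior_probability(0.10, power, 0.05)
        print(f"power={power:.1f}  LR+={lr:.0f}  P(real | significant)={post:.2f}")
```

With these made-up inputs, a significant result from the well-powered study leaves the probability that the effect is real at roughly 0.64, and from the underpowered study at roughly 0.31, even though both used the same 0.05 threshold. That is the sense in which the threshold alone does not give the probability that a significant finding is a true positive; the result also depends on power and on the prior, which is an assumption in this sketch, not something the data supply.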