Interview with Dr. Henry Bauer - Part 1



Well, although you said previously that you didn't want to talk to me, we seem to be back in a situation where you can't stop talking to me. Frankly, it seems very strange.
I’d be interested in any direct quotes in which Linda was wrong, misleading or dishonest.
(This post was last modified: 2017-10-28, 05:31 AM by malf.)
(2017-10-28, 05:30 AM)malf Wrote: I’d be interested in any direct quotes in which Linda was wrong, misleading or dishonest.

OK. I had assumed that everyone who was interested had read more than enough about this already - and enough not to be misled by further nonsense from fls, but evidently not.

You'll remember that we were discussing, more than three weeks ago, whether small effect sizes presented a problem. I made the point that they shouldn't present a problem in themselves, so long as the statistical evidence was adequate. Specifically, I'd said this:
"I'm afraid I just don't see any rational basis for rejecting a phenomenon purely on the basis that the effect size is small, if the statistical evidence is adequate."
And then just to make sure there was no confusion about what I meant, I said this:
"if the statistical evidence is adequate, then the smallness of the effect is no reason to reject a phenomenon per se."

Then there was some other discussion of an article by Steven Novella. Later that day, fls jumped in to say that the p value was interpreted, or "usually taken", to mean something it didn't (which was misleading in itself, but only par for the course, I suppose), and then she said this:
"And this is where effect size becomes relevant, as smaller effects increase the likelihood that that positive results are false-positives, even in the setting of very low p-values (because the probability of producing a positive result when the alternative hypothesis is true is also very low)."

There's no doubt that statement is factually false as it stands, because it's the statistical power that influences the false-positive rate, not the effect size. If a study of a smaller effect size has adequate statistical power, then the false-positive rate will be no higher. And everyone - and perhaps one can say especially sceptics - agrees that experimental studies should always be adequately powered. Moreover, not only was it factually false, but it was especially misleading in the context of our discussion, where I had twice emphasised I was talking about a situation in which the statistical evidence for the effect was adequate - which would obviously imply the study was adequately powered.
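
To make that concrete, here's a minimal simulation sketch of the point (the sample sizes of 34 and 199 are just illustrative choices giving roughly 80% power for effect sizes of d = 0.5 and d = 0.2 respectively): when the power is matched, the false-positive rate is the same however small the effect.

Code:
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = 0.05

def rates(effect, n, reps=20000):
    # Simulate `reps` studies under the null (true mean 0) and under the
    # alternative (true mean = effect), SD = 1, two-sided one-sample t-test.
    nulls = rng.normal(0.0, 1.0, size=(reps, n))
    alts = rng.normal(effect, 1.0, size=(reps, n))
    p_null = stats.ttest_1samp(nulls, 0.0, axis=1).pvalue
    p_alt = stats.ttest_1samp(alts, 0.0, axis=1).pvalue
    return (p_null < alpha).mean(), (p_alt < alpha).mean()

# Sample sizes chosen so both designs have roughly 80% power.
for d, n in [(0.5, 34), (0.2, 199)]:
    fpr, power = rates(d, n)
    print(f"d = {d}: false-positive rate ~ {fpr:.3f}, power ~ {power:.3f}")

Both designs print a false-positive rate of about 0.05 - the significance level - despite the difference in effect size.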

I responded saying I didn't think what she'd said was correct, and the following day I added:
"After a bit more thought, I think the answer is that there's no consistent relationship between the effect size and the likelihood of a false positive, other things (notably the p value) being equal. I think it's possible to construct examples where the effect size is smaller and the likelihood of a false positive is bigger, but also examples where the effect size is smaller and the likelihood of a false positive is smaller. It will depend on the values of the parameters."

And the day after that, I posted some equations to demonstrate that it was the power, not the effect size, which affected the false positive rate.
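
In outline, the calculation was along these lines: if P0 and P1 are the prior probabilities of the null and the alternative, α is the significance level and 1 − β is the power, then

$$P(\text{null} \mid \text{significant}) \;=\; \frac{\alpha P_0}{\alpha P_0 + (1-\beta)\,P_1}$$

The effect size enters only through the power term 1 − β, so once the power is fixed (by adjusting the sample size), the rate at which significant results are false positives is unchanged.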

I see that fls now says "I came right out and said that the power will influence the false positive rate, well before you mathematically demonstrated and confirmed that the power influences the false positive rate."

But that is simply a lie. I have just checked her posts in the relevant period, and she mentions power only once, and that is in relation to the true positive rate (which is simply the definition of power).

I suppose the only consequence of my posting this is going to be more attempts at obfuscation from fls. If that happens, it's entirely malf's fault!  LOL
But seriously, I find it deeply weird that fls has insisted on bringing this up again weeks later, when there really shouldn't be any need for it to be discussed again.

If she is really so keen to acknowledge and correct her errors, as she said she was, the one she needs to correct is the error over the confirming/disconfirming p/not q paper that effectively wrecked this thread. The discussion following that was so convoluted that it really could mislead someone if they come across it in the future. The honest thing for her to do would be to edit her original post to include a small note pointing out that she made an error. But one gets the impression that doing that would almost kill her.
OK, Chris and malf, if we're going to do this, then let's be comprehensive and include references.

I'll start by adding references for Chris's previous post for each of the comments he cites. Note that the thread in which these comments were made is Principles of Curiosity:

Chris's comment "I'm afraid I just don't see any rational basis for rejecting a phenomenon purely on the basis that the effect size is small, if the statistical evidence is adequate." comes from post #36.

Chris's follow-up comment to clear up confusion, "if the statistical evidence is adequate, then the smallness of the effect is no reason to reject a phenomenon per se." comes from post #42.

Linda's misleading comment, "And this is where effect size becomes relevant, as smaller effects increase the likelihood that that positive results are false-positives, even in the setting of very low p-values (because the probability of producing a positive result when the alternative hypothesis is true is also very low)." comes from post #54.

Chris's comment, "After a bit more thought, I think the answer is that there's no consistent relationship between the effect size and the likelihood of a false positive, other things (notably the p value) being equal. I think it's possible to construct examples where the effect size is smaller and the likelihood of a false positive is bigger, but also examples where the effect size is smaller and the likelihood of a false positive is smaller. It will depend on the values of the parameters." comes from post #68.

Linda's comment, "I came right out and said that the power will influence the false positive rate, well before you mathematically demonstrated and confirmed that the power influences the false positive rate.", comes from post #124 in this thread.

I have to say that now that Chris has made clear the context in which Linda made her "small effects" comment, I'm more inclined to side with Chris on this issue and less inclined to side with Linda. I still think, though, that this one is a little more ambiguous than what's to come.

OK, so, referencing done, let's try now to be comprehensive about Linda's other (recent) false claims, in time order.

Firstly, in another thread, fairly recently, Linda made a false claim that p-values have no impact on whether a result is a true positive, and that only power does.

In the thread, Statistical Significance, in post #22, Linda wrote (emboldening in the original): "The p-value doesn't tell us about true positives. True positives are determined by the sensitivity of the test, which in the case of significance testing with p-values is something called "power"."

I responded saying that this post seemed confused, and asking Chris what he thought.

Chris disproved Linda's claim in post #40, where, backed by a formula, he wrote (emboldening mine): "both the significance level (in other words the p value) and the power - as well as the prior estimates P0 and P1 - influence the estimates of the likelihood of psi and no-psi."

He reiterated this in post #47: "Nor, in my view, do they [the formulae Chris posted --Laird] back up what fls said on this thread about the importance of the power and the unimportance of the p value. Clearly, they both play a role."
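
For a concrete feel for how both play a role, here's a small worked sketch using that relationship, with purely illustrative priors of P0 = P1 = 0.5 (numbers of my own choosing, not from either thread):

Code:
# Illustrative priors only: P0 = P1 = 0.5.
def p_false_positive(alpha, power, P0=0.5, P1=0.5):
    # P(null | significant) = alpha*P0 / (alpha*P0 + power*P1)
    return alpha * P0 / (alpha * P0 + power * P1)

for alpha, power in [(0.05, 0.8), (0.01, 0.8), (0.05, 0.3)]:
    print(f"alpha = {alpha}, power = {power}: "
          f"P(null | significant) = {p_false_positive(alpha, power):.3f}")

This prints 0.059, 0.012 and 0.143 respectively: tightening alpha lowers the false-positive rate, and so does raising the power. Clearly, both play a role.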

Linda's response to being (mathematically) proved wrong on this point? Did she admit her error, as she claims she is so willing to do? Nope. Total avoidance of the issue. Literally no comment.

Secondly, in this thread, Linda made a false or at least misleading claim that hypotheses can be classified as either "confirmatory" or "disconfirmatory":

In post #39, Linda wrote: "Falsification is about whether your hypothesis reflects what you would see if your idea is true or whether it reflects what you would see if your hypothesis is false. This is different than the distinction between an alternative and a null hypothesis."

I demonstrated to her in my post #44 that a so-called "confirming" hypothesis could equally be reframed as a "disconfirming" hypothesis, with no functional change to the experiment. Linda accepted this in her post #45: "I don't think they are functionally different."

I reiterated this point in my post #68: "But as I pointed out, this is purely a semantic distinction - any positive hypothesis can (I think) be reframed as a negative hypothesis, and vice versa, without any change to the way the experiment is conducted. For example, the positively-framed hypothesis "That when psi is present, a statistically significant effect will be observed" can be reworked into the negatively-framed hypothesis "That when psi is present, a statistically significant effect will not be observed". Nothing about the experiment would change.

You seem to agree with me about this."

Linda affirmed in her post #72 that (by now), yes, she did agree with me: "Of course any hypothesis can be framed as a positive and as a negative. What you described is the basic process of forming your alternative hypothesis and a null hypothesis. No disagreement here".

Any acknowledgement that this is a total contradiction of her original statement in post #39? Any admission of error, and of having been persuaded that she had made an error? Nope. None whatsoever.

Thirdly, as Chris reaffirmed in his immediately preceding post (#127), Linda committed a serious error by making a false distinction between tests of "p" and "not q" as described in a paper which she referenced.

In (again) post #39, Linda wrote: "You are attempting to confirm the idea when you choose to look at "p". Your result may be "q" (alternative hypothesis confirmed) or "not q" (null hypothesis confirmed).

You are attempting to falsify the idea when you choose to look at "not q". Your result may be "not p" (alternative hypothesis confirmed) or "p" (null hypothesis confirmed)."

In post #68, I pointed out that this was a bogus distinction between a test that "confirms" the "idea" and one that "falsifies" it:

Quote: However, as both the paper and logic dictate, whilst, yes, starting by looking at the consequent "not q" can potentially falsify the hypothesis (in the case that the antecedent turns out to be "p"), starting by looking at the antecedent "p" can also potentially falsify the hypothesis (in the case that the consequent turns out to be "not q").

Both tests can potentially falsify the hypothesis.
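
A toy way to see this (my own sketch, not from the paper): enumerate what the hidden side of each card can reveal under the rule "if p then q". Whichever side you start from, the rule is falsified exactly when you end up with a "p, not q" pair.

Code:
# Rule under test: "if p then q". A case falsifies it iff it is (p, not q).
def falsifies(p, q):
    return p and not q

# Start by looking at "p": the antecedent is known true and the hidden
# side reveals the consequent.
for q in (True, False):
    print("look at p, find", "q" if q else "not q", "->",
          "falsified" if falsifies(True, q) else "consistent")

# Start by looking at "not q": the consequent is known false and the
# hidden side reveals the antecedent.
for p in (True, False):
    print("look at not q, find", "p" if p else "not p", "->",
          "falsified" if falsifies(p, False) else "consistent")

Both starting points can produce the falsifying "p, not q" combination, which is exactly why the paper says both have to be examined.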

Chris reiterated this critique in his post #80: "Do you agree, now that it's been explained to you, that looking at "p" and looking at "not q" are exactly similar in their potential effects and consequences? That it's nonsensical to say that one is an attempt to confirm the "idea" and the other is an attempt to falsify it? That the whole point of the paper you referred to was that it was necessary to look at both "p" and "not q" in order to test the hypothesis? In short, that you missed the whole point of that paper when you tried to present looking at "p" and looking at "not q" as alternative tests? And that much of the ensuing confusion on this thread is of your own making?"

Linda's response to this direct invitation to admit to error, which she claims she is willing to do? We're still waiting for it...

So, malf, maybe that satisfies your request?
(This post was last modified: 2017-10-28, 09:32 AM by Laird.)
So the error I am guilty of is that Chris did not realize, at first, that I was referring to small effects' effect on power.

I'm not sure what the p and q thing is about. I only mentioned p and q in one post, in the context of the paper on the Wason card test. I'm not sure how I sowed confusion with that mention, especially given that one need only read the paper, if confused.

Linda
Wow, Linda. That's your response to being comprehensively called out on at least three and probably four instances of blatant error? As Chris would say: astonishing.
