That’s stupid, though. If you can explain 11% of the variance of some noisy phenomenon like cognitive and behavioral flexibility, that’s noteworthy. They tested both linear and quadratic terms, and the quadratic one predicted better, and is also an expression of a meaningful theoretical model, rather than just throwing higher polynomials at it for the fun of it. A quadratic here would also be consistent with some homogenizing mechanism at the two ends of the age distribution.
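The linear-vs-quadratic comparison can be sketched with made-up data. Everything here is hypothetical (the data-generating process, sample size, and noise level are invented for illustration, not taken from the study being discussed):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: flexibility peaks in mid-life, with lots of noise.
age = rng.uniform(20, 80, 300)
flexibility = -0.002 * (age - 50) ** 2 + rng.normal(0, 1.5, 300)

def r_squared(y, y_hat):
    """Proportion of variance explained by the fitted values."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1 - ss_res / ss_tot

# Fit nested polynomial models and compare fit quality.
lin = np.polynomial.Polynomial.fit(age, flexibility, 1)
quad = np.polynomial.Polynomial.fit(age, flexibility, 2)

r2_lin = r_squared(flexibility, lin(age))
r2_quad = r_squared(flexibility, quad(age))
```

In-sample, the quadratic can never do worse than the nested linear model; the real comparison in a study would use out-of-sample prediction or a penalized criterion like AIC.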
Yet it’s one single sample, and possibly not a great one. A few things could cause the shape we see: selecting for healthy participants excludes a lot more of the 65+ community than the younger ones, cohort effects like those born around the ’50s carrying higher lead exposure could deepen the dip, or… plenty of stuff. After some replications, sure, but even then… that’s 11%. Hell, I could probably fit an exponential with a negative exponent and be as accurate or better.
Sure, you could do some wild overfitting. But why? What substantive theoretical model would such a data model correspond to?
A more straightforward conclusion would be that age is far from the only predictor of flexibility etc., but it’s on the list nevertheless, and if you wanna rule out alternative explanations (or support them), you might have to go collect more observations that allow such arguments to be constructed.
Now this should be an xkcd
To be honest, I doubt Munroe wants to say “if the effect is smaller than you, personally, can spot in the scatterplot, disbelieve any and all conclusions drawn from the dataset”. He seems to be a bit more evenhanded than that, even though I wouldn’t be surprised if a sizable portion of his fans weren’t.
It’s kinda weird: scatterplot inspection is an extremely useful tool in principled data analysis, but being able to spot something by eye is neither sufficient nor necessary for it to be meaningful.
But also… an R^2 of .1 corresponds to a Cohen’s d of about 0.67. If this were a comparison of two groups, roughly three quarters of the control group would score below the average person in the experimental group. I suspect people (including me) are just bad at intuitions about this kinda thing and like to feel superior or something, and let loose some half-baked ideas about statistics. Which is a shame, because some of those ideas can become pretty, once fully baked.
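The conversion, and the “three quarters” figure (known as Cohen’s U3), can be checked in a few lines. This uses the standard equal-group-size, normality assumptions behind these benchmarks:

```python
import math

def d_from_r2(r2):
    """Convert R^2 to Cohen's d via d = 2r / sqrt(1 - r^2),
    the point-biserial relation (assumes equal group sizes)."""
    r = math.sqrt(r2)
    return 2 * r / math.sqrt(1 - r2)

def u3(d):
    """Cohen's U3: the fraction of the control group below the
    experimental-group mean, assuming normal distributions."""
    return 0.5 * (1 + math.erf(d / math.sqrt(2)))

d = d_from_r2(0.1)   # ~0.67
frac = u3(d)         # ~0.75, i.e. roughly three quarters
```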
Whether you’re right or wrong, starting your argument with “that’s stupid, though” is unlikely to convince many.
That’s stupid though. People should change their minds when better information is presented regardless of tone!
Maybe, yeah, but I kinda get annoyed at this kinda dismissiveness - it’s a type of vague anti-science or something like that. Like… Sure, overfitting is a potential issue, but the answer to that isn’t to never fit any curve when data is noisy, it is (among other things) to build solid theories and good tests thereof. A lot of interesting stuff, especially behavioral things, is noisy and you can’t expect to always have relationships that are simple enough to see.
You’re probably right. But also, I was annoyed, not trying to convince. Maybe not the best place to post from. :)
well it convinced me, but I’m stupid and already made up my mind that I wanted to see a reply like that
But I have eyes, and the curve they picked as best fit fits really poorly. It’s such a poor fit that it almost sits in a dead zone of the scattered points.
I dunno, the point cloud looks to me like some kinda symmetric upward curve. I’d’ve guessed maybe more like R^2=.2 or something in that range, though.
But also: This is noisy, it’s cool to see anything.
It’s a line fitted to a shotgun blast. R^2 = 0.11, LOL.
wtf is up with that confidence interval(?) though
It’s a 95% CI, presumably for the conditional (on age) population mean. It looks correct given the sample size and variance; what issue do you see with it?
To expand a little: you get a 95% CI by taking the expected value ± SE × 1.96. For a normal distribution, the SE is the sample SD divided by the square root of the sample size. So for a standard normal distribution, the SE for a sample size of 9 would be 1/3, and for a sample size of 100 it would be 1/10, etc. This is much tighter than the population distribution, but that’s because you’re estimating just the population mean, not anything else.
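That recipe, as a minimal sketch (using the normal critical value 1.96; for small samples a t-distribution quantile would be more appropriate):

```python
import math

def mean_ci_95(sample):
    """95% CI for the population mean: mean +/- 1.96 * SD / sqrt(n)."""
    n = len(sample)
    mean = sum(sample) / n
    # Sample variance with the n-1 (Bessel) correction.
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    se = math.sqrt(var) / math.sqrt(n)
    return mean - 1.96 * se, mean + 1.96 * se

lo, hi = mean_ci_95([1, 2, 3, 4, 5])  # mean 3, SE ~0.707
```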
Capturing structured variance in the data should then increase the precision of your estimate of the expected value, because you’re removing variance from the error term and adding it to the other parts of your model (cf. the term “analysis of variance”).
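A quick simulated illustration of that point (all the fake data here is, of course, made up): after regressing out a predictor, only residual variance is left, so the uncertainty around the conditional mean is tighter than around the plain mean.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(0, 10, n)
y = 2.0 * x + rng.normal(0, 3.0, n)  # structured part + noise

# SE of the unconditional mean uses the full sample SD...
se_raw = y.std(ddof=1) / np.sqrt(n)

# ...while after regressing out x, only residual variance remains.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)
se_resid = residuals.std(ddof=1) / np.sqrt(n)
```

Here `se_resid` comes out well below `se_raw`, which is exactly the "removing variance from the error term" effect.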