Two years too late, but I prepared some comments on this so-called "advanced analysis" of paralyze proc data, mainly concerning the statistical sophistry involved. (I really hope insights have been further developed since then.) Such are the perils of idleness. (I don't recommend that you continue reading further; you've been warned.) I address specific sections of the write-up (sections in boldface).
Introduction
The author claims that it is not desirable to maximize the duration of a paralyze effect. Instead, he (is it ever a she when bloviating about some B.S.?) seems to think that maximizing the number of processes (procs) per cast is the relevant goal. He cites two hypothetical situations where the durations are different yet the rate of procs per unit time is the same. He argues that the scenario with the shorter duration gives an opportunity to reapply a possibly stronger paralyze (higher rate of procs per unit time).
However, he proposes a model that assumes that MND, enfeebling magic skill, and a HQ staff have an effect (statistically significant or not) on both spell duration and the number of paralyze procs. So why not just model the rate of procs per unit time to begin with? The author argues we must "account for" (control for) the effect of duration (something we cannot directly control) so we can see how the controlled factors affect the number of procs directly within some varying time interval that is supposed to be under statistical control. But modeling the number of procs with duration controlled is just modeling the rate of procs per unit time by another name.
Finally, his "analysis" shows that the duration of the paralyze effect has the greatest effect on the number of procs (MND has an effect too), which he considers unfortunate. However, it goes without saying (but I'll say it anyway) that you cannot purposefully change duration without changing some combination of MND and enfeebling skill (not to mention any omitted variables that may affect duration). (In most practical situations MP-users don't cast without elemental staves.) So what, exactly, did you expect?
Preliminary Analysis
Note that the presence of the 10 missing observations affects the calculation of the correlation matrix. The missing observations are excluded from the subsequent path analysis.
Path Analysis
First off, I must acknowledge that I have never used path analysis for anything, so as I become more familiar with it I may revise my comments later.
The pair-wise "sample" correlations between the so-called exogenous variables here, MND, enfeebling skill, and HQ staff, are meaningless as the variables are not random. (What multicollinearity?) I don't even know why they are indicated on the diagram other than to follow some rote procedure rigidly.
"Clustered ordinary least-squares (OLS) regression" is an oxymoron. Generally speaking, using a robust least-squares method of estimation is a departure from what is ordinarily done. Furthermore, the justification for "clustered robust" LS estimation--that observations within each group (naked, enfeebling, MND, etc.) are not independent--is not valid. The author attributes lack of independence of observations within groups to the "experimental setup of this test," but there is absolutely nothing in the description of the "experimental setup" that suggests this should be so. Autocorrelation is not an issue. (Why would catoblepas build up resistance to paralyze anyway?) But even if it were, a "clustered robust" method cannot account for that. What he basically did was control for group effects twice, which is absolute nonsense and has no effect on his parameter point estimates anyway. (The coefficient of determination, R², is the same whether improperly accounting for nonexistent "clustering" or not.)
There is also the issue of not controlling for test subject (monster). Regardless of how large that effect may be, it is never discussed, while comparatively more frivolous concerns are. To wit, the author's irrelevant aside about Bayesian inference has nothing to do with his use of BIC here, and he is not really doing model selection anyway; BIC merely provides cover for arguing that MND may be a more "important" predictor of duration than the use of a HQ staff.
That cover is rather weak though since individual (non-simultaneous) interval estimates for the "standardized coefficients" are rather wide in the model that the author actually "chose":
MND: (.086, .404)
skill: (.017, .337)
staff: (.057, .377)
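As a rough check on how little these intervals separate the three predictors, the following Python sketch computes each width and the pairwise overlaps, using only the numbers listed above. Overlap is a crude guide here, not a formal test of the difference between two coefficients.

```python
# Interval estimates for the standardized coefficients, copied from the list above.
intervals = {
    "MND": (0.086, 0.404),
    "skill": (0.017, 0.337),
    "staff": (0.057, 0.377),
}


def overlap(a, b):
    """Return the range shared by two closed intervals, or None if they are disjoint."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


# Width of each interval, rounded to the precision of the inputs.
widths = {name: round(hi - lo, 3) for name, (lo, hi) in intervals.items()}

# Shared range for every pair of predictors.
names = list(intervals)
pairwise = {
    (p, q): overlap(intervals[p], intervals[q])
    for i, p in enumerate(names)
    for q in names[i + 1:]
}

print(widths)
print(pairwise)
```

Every pair shares most of its range, which is the sense in which the data give weak support to any ranking of the predictors.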
Now consider the second regression (modeling number of procs). Again, the author uses completely inappropriate clustered robust linear regression, which leads him to trump up enfeebling skill as highly significant. In reality, the enfeebling effect is barely significant at the 5% level, hardly convincing evidence of a real effect (if it exists, which I doubt). Moreover, something fishy could be going on with the last set of observations. If you omit those from the analysis, the enfeebling effect does not even approach significance. But the data are what they are.
Discussion
Again, the author fails to recognize the imprecision of his parameter estimates (standardized beta coefficients) despite curiously devoting time earlier to a frivolous comparison of two population correlations in Appendix A.
Today, it may be "commonly known" that MND does affect the accuracy of a MND-based magic spell in some way, but arguing that MND has a relatively stronger effect on paralyze duration (a measure of accuracy) than enfeebling skill on the basis of standardized effects is spurious because of the poor parameter estimates and because of the interpretation. Obviously, the main effects are not random variables, so their associated standard deviations don't have any particular meaning as they are just an artifact of experimental control.
Consider the interpretations in real-unit terms. From the first linear regression, the duration is estimated to increase by 6.38 seconds for every 22.8-point increase in MND (controlling for the other main effects). Similarly, the duration is estimated to increase by 4.93 seconds for every 14.4-point increase in enfeebling magic skill (controlling for the other main effects). Point for point, enfeebling magic skill is more effective than MND, and I don't know anyone who would argue for a comparison on anything other than a per-point basis.
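The per-point comparison is simple division. A minimal Python sketch, using only the four figures quoted above (the variable names are mine):

```python
# Figures quoted in the paragraph above (first regression; duration in seconds).
mnd_gain_sec, mnd_step = 6.38, 22.8        # seconds gained per 22.8-point MND increase
skill_gain_sec, skill_step = 4.93, 14.4    # seconds gained per 14.4-point skill increase

# Estimated seconds of duration gained per single point of each stat.
mnd_per_point = mnd_gain_sec / mnd_step
skill_per_point = skill_gain_sec / skill_step

# How many times more duration one point of skill buys than one point of MND.
ratio = skill_per_point / mnd_per_point

print(f"MND:   {mnd_per_point:.3f} s per point")
print(f"skill: {skill_per_point:.3f} s per point")
print(f"skill/MND ratio: {ratio:.2f}")
```

This gives about 0.28 seconds per point of MND against about 0.34 seconds per point of enfeebling skill. These are point estimates only, and the wide intervals noted earlier apply to them just as much.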
Certainly, there are distinct levels of resists, but there is no reason to believe that HQ staves have a privileged role in determining the distribution of partial resists any more than other factors that affect magic "hit rate," especially since magic accuracy bonuses for both NQ and HQ staves have been estimated.
As for unexplained variability in the number of procs, the author provides a laundry list of possible explanatory factors, none of which are as important as the ones under one's direct control. (Do you do anything only during specific moon phases?)
Reaction and criticism (not in the write-up)
These people had the temerity to broadcast this "analysis" on both Allakhazam and Killing Ifrit.
On Allakhazam, you typically had the usual sucking off. Not unexpectedly, a reasonable objection was raised about the relationship between duration and number of procs. It seems practical enough to consider an increase in duration (holding other factors constant) as increasing the number of procs that are observed. The exogenous factors (MND), on the other hand, actually affect the potency of paralyze (proc rate), also measured as the number of procs, but holding duration constant. But instead of recognizing this line of reasoning, these numbnuts hid behind numbers (and statistics) without even thinking about how to interpret effects and the implications of their "analysis." (This is actually all too common for all the so-called "mathematicians" on Allakhazam.)
On Killing Ifrit, there were a few somewhat naïve criticisms of the experimental design (all from the same poster). Yes, it would be nice to use more than two levels of each independent variable, but there is no compelling case for a nonlinear trend. Again, generating standardized effects for each predictor is a pointless exercise for these data (as discussed previously). A multi-factor ANOVA is superfluous as you can construct simultaneous confidence intervals for the parameter estimates from regression (in general). Sample size and power are brought up, but concern for "too much power" (with excessive sample sizes) is simply a trivial objection.
Alternative (not in the write-up)
I don't have any particular objection to path analysis per se. The low-hanging fruit is that the statistical procedures are questionable, the write-up is mired in irrelevant details, and the interpretations are awkward.
Let us return to the original motivation for the path "analysis." Modeling proc rate was criticized (false distinction between that and number of procs when controlling for duration) but the interpretations involved in path analysis concern proc rate anyway. (Potency must be a proc rate. This is beyond dispute.) So why not model the proc rate directly? (And if you care so much about modeling duration too, you can regress that on your favorite predictors. No one's stopping you.)
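One way to see why the distinction is false, assuming a log-linear rate model: if log(duration) enters the count model with its coefficient fixed at 1, the expected number of procs is just duration times the proc rate. The Python sketch below uses made-up coefficients and inputs purely for illustration. None of the numbers come from the data.

```python
import math

# Hypothetical values for illustration only; these are NOT fitted estimates.
INTERCEPT = -6.0   # log of the baseline procs-per-second rate (assumed)
BETA_MND = 0.008   # change in log rate per point of MND (assumed)


def proc_rate(mnd):
    """Procs per second under a log-linear rate model."""
    return math.exp(INTERCEPT + BETA_MND * mnd)


def expected_procs(mnd, duration):
    """Expected proc count when log(duration) is added with its coefficient fixed at 1."""
    return math.exp(INTERCEPT + BETA_MND * mnd + math.log(duration))


mnd = 100                       # hypothetical MND
short, longer = 60.0, 120.0     # hypothetical durations in seconds

for duration in (short, longer):
    count = expected_procs(mnd, duration)
    # The count scales with duration; count / duration recovers the same rate each time.
    print(duration, round(count, 4), round(count / duration, 6))
```

Doubling the duration doubles the expected count and leaves the rate untouched, so a statement about procs at fixed duration is a statement about proc rate.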
It seems natural enough to use Poisson regression to model proc rate, and I carried out this procedure in R (output below):
Call:
glm(formula = proc ~ MND + enfeebling + staff + iceday + offset(log(duration)),
    family = poisson, data = paralyze)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.2293  -0.8721  -0.0776   0.6353   3.0779

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -6.260517   1.120601  -5.587 2.31e-08 ***
MND          0.007941   0.002259   3.516 0.000439 ***
enfeebling   0.008038   0.003644   2.206 0.027404 *
staff        0.047913   0.107654   0.445 0.656271
iceday      -0.027394   0.114415  -0.239 0.810772
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 155.82  on 139  degrees of freedom
Residual deviance: 138.19  on 135  degrees of freedom
AIC: 510.53

Number of Fisher Scoring iterations: 5

The residual deviance (138.19 on 135 degrees of freedom) indicates that this model is an acceptable fit to the data. (Note: I facetiously specified an Iceday effect in the model.) Controlling for other factors, proc rate is estimated to increase by .797% for every one-point increase in MND, since exp(0.007941) ≈ 1.00797. Note that the z-values are similar to the t-values using OLS estimation.
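The figures in the R output can be checked by hand. The Python sketch below converts each coefficient to a percent change in proc rate, recomputes the z-values, and forms the drop in deviance from the null model. The helper names are mine, and treating staff and iceday as 0/1 indicators is my assumption. The chi-square tail formula used here is a closed form that holds only for even degrees of freedom.

```python
import math

# (estimate, standard error) pairs copied from the coefficient table above.
coefs = {
    "MND": (0.007941, 0.002259),
    "enfeebling": (0.008038, 0.003644),
    "staff": (0.047913, 0.107654),     # assumed to be a 0/1 indicator
    "iceday": (-0.027394, 0.114415),   # assumed to be a 0/1 indicator
}


def pct_change(beta):
    """Percent change in rate per one-unit increase, under a log link."""
    return (math.exp(beta) - 1) * 100


def z_value(estimate, std_error):
    """Wald z statistic."""
    return estimate / std_error


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form is valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


for name, (est, se) in coefs.items():
    print(f"{name:11s} {pct_change(est):+.3f}% per unit, z = {z_value(est, se):.3f}")

# Deviances and degrees of freedom from the output above.
null_dev, null_df = 155.82, 139
resid_dev, resid_df = 138.19, 135

lr_stat = null_dev - resid_dev            # drop in deviance from adding the predictors
lr_df = null_df - resid_df                # number of predictors added
lr_p = chi2_sf_even_df(lr_stat, lr_df)    # p-value for that drop
dispersion_ratio = resid_dev / resid_df   # values near 1 suggest little overdispersion

print(f"deviance drop = {lr_stat:.2f} on {lr_df} df, p = {lr_p:.5f}")
print(f"residual deviance / df = {dispersion_ratio:.3f}")
```

The MND line reproduces the .797% quoted above. The residual deviance is close to its degrees of freedom, which is the usual informal basis for calling the fit acceptable.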