The Unbearable Triteness of Preening: double attack


Tuesday, May 25, 2010

What's the proc rate for virtue weapons? How do you know?

One "line" of evidence: checking with Justice Sword

Taken at face value, the estimate 555/1000 indicates the "occasionally attacks twice" (OAT) rate of Justice Sword is significantly higher than 50% and could be considered 55% (source). But is the OAT property the same for all so-called "virtue weapons"?
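As a quick sanity check on "significantly higher than 50%," here is a minimal sketch of a normal-approximation test and Wald interval for 555 procs in 1000 trials. The function name and the choice of the Wald interval are mine, not necessarily what the linked source used.

```python
import math

def proportion_summary(successes, trials, null_p=0.5, z_crit=1.959964):
    """Wald interval and z statistic for a binomial proportion (normal approximation)."""
    p_hat = successes / trials
    se_hat = math.sqrt(p_hat * (1 - p_hat) / trials)
    ci = (p_hat - z_crit * se_hat, p_hat + z_crit * se_hat)
    # The z statistic uses the standard error under the null hypothesis.
    se_null = math.sqrt(null_p * (1 - null_p) / trials)
    z = (p_hat - null_p) / se_null
    return p_hat, ci, z

p_hat, ci, z = proportion_summary(555, 1000)
print(p_hat, ci, z)
```

Under these assumptions the interval sits entirely above 50% but still contains 55%, which is all the sentence above claims.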

Another line of evidence: checking with Fortitude Axe... and WAR

The rest of this post discusses how to estimate the OAT rate of Fortitude Axe in the presence of the double attack trait from WAR. But first, I needed a good idea about how Fortitude Axe OAT actually interacts with the DA trait. In the past, I blabbed a lot about how Fortitude Axe might interact with double attack, but my "conclusion" was based on very weak evidence. After collecting some more count data with kparser under 12% DA (source), which ruled out my previous weak hypotheses about the DA/OAT interaction, I got a better idea about how to explain these results (assuming kparser was working correctly...).

It appears (not exactly "proof") that OAT can process on both the normal hit (which is guaranteed to occur for a given attack round, if not actually land) as well as the hit from the possible DA proc (with a major caveat to be discussed soon). More specifically, the hit from the DA proc occurs independently of whether an OAT proc occurs. (Not probabilistically, of course, but "mechanistically.")

It is worth noting that conceptually the order of DA and OAT could easily be reversed, such that the hit from the OAT proc occurs independently of whether a DA proc occurs, but I will just say the resulting probability calculations are not supported by the data when OAT is mechanistically independent of DA.

Anyway, one way to show pictorially all "possible" outcomes where DA and OAT can interact is with the following "tree":

There are six hypothetically "distinct" outcomes, but it is very inconvenient to monitor the equipment menu for virtue stone expenditure. More important, though, is the fact that Fortitude Axe cannot quadruple attack, so the case of expending two virtue stones is impossible. (This makes sense, noting that triple attacks are impossible with zero DA rate.)

So what "happens" to this 2-virtue stone attack round that is impossible? It appears that even if a DA proc occurs, only one virtue stone can be expended anyway, so the "tree" simplifies further:

The resulting probability model of the number of hits in an attack round (ignoring the distinction between hits and misses) is specified as follows. Let X denote the number of hits in a given attack round, d the probability of a double attack proc, and π the probability of an OAT proc. Then,

Now that we have a reasonable probability model describing the interaction between double attack and the OAT property of Fortitude Axe ("reasonable" based on chi-square goodness of fit to the data given 12% DA rate and posited virtue weapon proc rates of 50% and 55%), we can now estimate the OAT rate. Proceeding with maximum likelihood estimation is not really necessary when an obvious unbiased estimator can be based on the observed number of single hits (denoted as X₁) in n attack rounds:

\[E(X_1) = n(1-d)(1-\pi)\]

It follows that the unbiased estimator is

\[\hat{\pi} = 1 - \frac{X_1}{n(1-d)}\]

with variance

\[\mbox{Var}(\hat{\pi}) = \frac{(1-\pi)\left[1-(1-d)(1-\pi)\right]}{n(1-d)}\]

Note that when d = 0, the variance reduces to that for the estimator for a simple binomial proportion (marginal in the context of the multinomial distribution). (Note to self: from simulation, this estimator is only very slightly less efficient, from an MSE standpoint, than the MLE, which I would bet is UMVUE even if an analytical expression for the MLE and the CRLB is annoying to obtain.)

The estimated OAT rate is 1 - 595/1425/.88 = .5255183 (that is, 1 minus the observed proportion of single-hit rounds, 595/1425, divided by 1 - d = .88), with corresponding 95% confidence interval (.4964218, .5546148). Given the specified probability model (which cannot be "proven" to be true at this time) and the data, it is not possible to conclude that the OAT rate of Fortitude Axe is either 50% or 55% (both are plausible given the confidence interval), unfortunately. But it should be possible to rule out one or the other with further data collection (with the hope that the probability model is correct), using the estimator specified above.
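The arithmetic above can be sketched in a few lines. This assumes a single-hit round happens only when neither DA nor OAT procs, so P(X = 1) = (1 - d)(1 - π), which is what the expression 1 - 595/1425/.88 implies. The function name and the plug-in variance (observed single-hit proportion in place of the true one) are my choices.

```python
import math

def oat_rate_estimate(single_hit_rounds, rounds, da_rate, z_crit=1.959964):
    """Estimate the OAT rate from the count of single-hit rounds.

    Assumes P(single hit) = (1 - da_rate) * (1 - oat_rate).
    """
    p1_hat = single_hit_rounds / rounds            # observed single-hit proportion
    pi_hat = 1 - p1_hat / (1 - da_rate)            # estimator of the OAT rate
    # Binomial variance of p1_hat, rescaled by 1 / (1 - da_rate)^2.
    var_hat = p1_hat * (1 - p1_hat) / (rounds * (1 - da_rate) ** 2)
    half_width = z_crit * math.sqrt(var_hat)
    return pi_hat, (pi_hat - half_width, pi_hat + half_width)

pi_hat, interval = oat_rate_estimate(595, 1425, 0.12)
print(pi_hat, interval)
```

With d = 0 the variance term collapses to the usual p(1 - p)/n for a binomial proportion.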

A third way: Faith Baghnakhs

Among all virtue weapons, it would be fastest to determine the OAT rate of Faith Baghnakhs by counting the number of triple attacks and quadruple attacks. It would be easier to do this on ninja because you wouldn't have to pay attention to kick attacks (because you want to use a parser instead of counting manually). If the OAT rate for Faith Baghnakhs can be shown to be 55%, that, along with the observed proc rate for Justice Sword, could be used as evidence for a common OAT rate of 55% across all virtue weapons.

Friday, October 2, 2009

The two-fold effect of double attack

When you speak of damage over time, what do you actually mean? The answer may reveal whether you hold a minor, yet "fundamental" misunderstanding of damage "mechanics" in this world of OCD fuck-headed douchebaggery.

First, I hope you understand that the only valid view of damage over time accounts for weapon skills, regardless of when and how you use them (spamming them or whatever). Like, weapon skills contribute to damage, and you're doing damage over some time interval. Duh! Since when do the ideal and the actual have to coincide, anyway?

Anyway, I also hope we can all agree there are factors such as accuracy, double attack, and haste that affect the frequency of attacks (that land) and, therefore, damage per unit time (an average), while factors such as attack (rating) and strength affect the potency of damage per hit.

But as I've mentioned in passing here and there, while accuracy and double attack can affect the frequency of auto-attacks and weapon skills, it should be obvious that they also affect the average damage of weapon skills, which haste does not affect. But this fact is often elided for the sake of convenience without any apparent recognition.

For example, given 85% hit rate, 4 accuracy will increase auto-attack and weapon skill frequency by 2.35% ideally. This doesn't mean a 2.35% increase in overall damage per unit time, which again should account for the contribution from weapon skills, because that figure doesn't account for the effect of accuracy on average WS damage (per use). Therefore, it is a slight underestimate of the "true" percent change.
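For reference, the 2.35% figure works out as follows, assuming the usual conversion of 2 accuracy per 1% hit rate (so 4 accuracy moves an 85% hit rate to 87%):

\[\frac{0.87}{0.85} - 1 \approx 0.0235\]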

Does it matter? Practically, not really, mainly because it's pretty inconvenient to account for the two-fold effect of accuracy and double attack. But there are some "interesting," counterintuitive (indulging the conceit that FFXI players have any intuitions about how anything actually works) consequences that I demonstrate in the following example.

Does double attack ever "beat" haste?

Sure, a specific amount X of double attack, given some initial level of double attack, can be more efficient than a specific amount Y of haste, given some initial level of haste. Why does this comparison ever come up, anyway? I can't think of any situation where haste and double attack are in direct competition. Whatever increases your "efficiency" without incurring ridiculous opportunity costs should be good enough.

A possible explanation is that players often are deluded into thinking they have "capped" accuracy and rapid TP gain is "sexy," which both haste and double attack affect. However, on a per-point basis, haste is plainly more efficient than double attack at increasing the rate of TP gain because haste directly lowers the time between attack rounds, which is fundamentally more efficient than tacking on an occasional extra attack per attack round.

Of course, I didn't mention the effect of double attack on average weapon skill damage, which, once considered, is enough for double attack to be more efficient than haste at increasing overall damage over time (not just TP gain) in specific situations.

How is this even possible? Fundamentally speaking, haste increases damage over time at an instantaneous rate of 100/(100 - H)², where H is the amount of haste (as an integer percentage), so the effect of total haste is relatively slow at the beginning but eventually ramps up rapidly as the amount of haste increases. This is why haste is the "gold standard" for increasing rate of damage.
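A short worked version of that rate, assuming haste simply scales the time per attack round by (100 - H)/100 and that everything else is held fixed. Here D(H) is my notation for damage over time relative to 0% haste:

\[D(H) = \frac{100}{100-H}, \qquad \frac{dD}{dH} = \frac{100}{(100-H)^2}\]

At H = 0 the slope is 0.01, or 1% more damage per point of haste. At H = 25 the total is D(25) = 100/75, about 1.33.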

But, since the rate of increase is relatively slow at the low end, that is the only "opportunity" for double attack and accuracy to be ever so slightly more efficient than haste, if all you really care about is damage efficiency.

A counter-intuitive example

The most convenient way to compare the efficiency of two competing options in the game, whether it be pieces of equipment, two types of "buffs," etc., is to determine the percent difference in "output" (damage over time, gil over time, whatever) since we are generally interested only in the relative difference and any factors that are fixed between any two options can be factored out and don't need to be "given" as a matter of convenience.

Where damage is concerned, this requires knowledge of the functional relationship between rate of damage and the factors of interest (holding all else fixed). Unfortunately, I don't see any way to derive an exact relationship between damage over time (including the weapon skill contribution) and double attack, so it is just easier for me to give a specific example illustrating where double attack is ideally more efficient than haste.

The example I give is based on the following conditions:

Suppose 106 "base" damage per auto-attack hit and 159 "base" damage per weapon skill hit with average pDIF of 1 in both phases
Six required hits to achieve 100 TP (6-hit setup), so five hits to 100 TP given sufficient TP return from the previous weapon skill
A 3-hit weapon skill used instantaneously after achieving 100 TP
95% hit rate. Therefore, the average number of landed hits (giving TP) per weapon skill is 3(.95) = 2.85
Starting from 0% double attack and 0% haste, it will be shown how damage per second varies with DA and haste, respectively

These conditions are sufficient to give an estimate of damage per second, ignoring the slight effect of having insufficient TP return from the previous WS to get to 100 TP in 5 hits thereafter, and possible variability in weapon skill damage based on TP, among other factors. But it's not the estimate of damage per second that is of interest, but how damage per second changes with either DA or haste.

After some boring spreadsheet calculations, which are boring and unnecessary to show, a plot illustrating the efficiency of double attack and haste is presented.
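For anyone who wants numbers instead of a plot, here is a minimal sketch of one way to set up the calculation under the conditions listed above. It is my reconstruction, not the original spreadsheet. Assumptions that are mine: a delay of 504 (it cancels in the ratios anyway), double attack getting two chances to proc inside the weapon skill (my reading of the 0.020 attacks-per-point figure quoted further down), and the function names themselves.

```python
def expected_rounds(hits_needed, da_rate, hit_rate):
    """Expected attack rounds until at least `hits_needed` hits have landed."""
    miss = 1 - hit_rate
    # Landed hits in one round: one swing, or two swings when DA procs.
    p0 = (1 - da_rate) * miss + da_rate * miss ** 2
    p1 = (1 - da_rate) * hit_rate + da_rate * 2 * hit_rate * miss
    p2 = da_rate * hit_rate ** 2
    rounds = [0.0]  # rounds[k] = expected rounds when k more hits are needed
    for k in range(1, hits_needed + 1):
        two_back = rounds[k - 2] if k >= 2 else 0.0
        rounds.append((1 + p1 * rounds[k - 1] + p2 * two_back) / (1 - p0))
    return rounds[hits_needed]

def damage_per_second(da_rate, haste, hit_rate=0.95, hits_to_ws=5, ws_swings=3,
                      aa_damage=106, ws_damage=159, delay=504):
    rounds = expected_rounds(hits_to_ws, da_rate, hit_rate)
    # Expected landed hits over the TP phase (rounds times hits per round).
    tp_hits = rounds * hit_rate * (1 + da_rate)
    # Assumption: DA can proc on two of the weapon skill swings.
    ws_hits = hit_rate * (ws_swings + 2 * da_rate)
    seconds = rounds * delay * (1 - haste) / 60
    return (tp_hits * aa_damage + ws_hits * ws_damage) / seconds

base = damage_per_second(0.0, 0.0)
for pct in (1, 5, 10, 25):
    da_gain = damage_per_second(pct / 100, 0.0) / base - 1
    haste_gain = damage_per_second(0.0, pct / 100) / base - 1
    print(pct, round(100 * da_gain, 2), round(100 * haste_gain, 2))
```

Under these assumptions, the first point of double attack is worth slightly more than the first point of haste, while by 25% haste has overtaken it.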

As expected, it takes a "while" for the effect of total haste to ramp up, but fundamentally damage per second must tend to infinity as total haste approaches 100%. In comparison, double attack is actually more efficient than haste initially because of the two-fold effect of double attack (on the rate of TP gain and the average WS damage) but the instantaneous rate of change increases very slowly. This makes sense because the increase in the average number of hits for any multi-hit weapon skill is 0.020 for every point of double attack. At the same time, the time to WS is decreased.

When determining percent changes with double attack, it should now be obvious that the relative efficacy of double attack depends on the damage per hit of the weapon skill as well as the number of required hits to 100 TP. The relative efficacy of haste is independent of damage per hit or expected number of hits in a weapon skill, though.

It follows that the efficiency of double attack is blunted if TP isn't spammed. As a check on my calculations, I dropped the weapon skill component of damage per unit time (which is wrong to do) and generated a different plot illustrating auto-attack rate of damage.

This should be a familiar result. All is right in the world. Going from 0% to 25% haste results in a 33% increase in overall damage per second, as illustrated here and in the previous figure. Going from 0% to 25% double attack results in a 25% increase in auto-attack damage per second, as illustrated in the last figure alone.

If for some reason you never use a weapon skill, double attack can never be more efficient than haste at increasing the overall rate of damage, which must account for average weapon skill damage lest you look like a dipshit. But when does that actually happen if you are OCD'ing about damage to begin with?

What do I take away from this?

Double attack can proc on multi-hit weapon skills. Therefore, double attack not only affects weapon skill frequency and the rate of auto-attack damage, but also average WS damage. Don't forget average weapon skill damage matters somewhat when evaluating the effect of additional double attack. (Doing the actual calculations is another issue, though.) The same goes for accuracy, too, although I didn't provide a specific example.

Thursday, September 10, 2009

Warrior's Charge for TP generation

A typical way to use Warrior's Charge

When I had the one perfunctory merit for Warrior's Charge—not like there was anything compelling in Group 2—I reserved it exclusively for weapon skills instead of TP gain. For an ability that can be used on demand when available, this is a typical application especially for "zerging," when you can't really ensure that the potential TP gain from the guaranteed double attack will let you squeeze out another WS in the 45 seconds of Mighty Strikes. But to be honest, I liked the big numbers for Steel Cyclone.

A more efficient way to use Warrior's Charge?

Of course, if you're doing some long-term activity, like meriting, it is theoretically more efficient (that word again...) in the "long run" to use Warrior's Charge in the auto-attack phase than for weapon skills. By "long run," I'm basically referring to TP-burning, where the benefit of the average increase in TP gain from Warrior's Charge actually and most ideally manifests in higher average WS frequency per unit time.

Would you actually want to waste 22 merits to reduce the recast time to 5 minutes, though? Perhaps it would help to quantify the effective increase in double attack rate from using Warrior's Charge every five minutes.

Expressing the effect of Warrior's Charge as a rate of double attack

First, a preliminary observation. While Warrior's Charge confers an absolute increase of one double attack per use, its relative contribution to long-run rate of damage decreases with increasing delay reduction.

For example, if there are few attack rounds in the time period between uses of Warrior's Charge, then the contribution of Warrior's Charge is relatively large. But if there are many attack rounds in that window, then the contribution of Warrior's Charge is relatively small.

With that in mind, it is necessary to make an explicit statement about the amount of delay reduction present before determining the effective double attack rate with ideal use of Warrior's Charge. I acknowledge that the effect of Warrior's Charge is "discrete," and not continually present, but the rate of double attack is just an expected value (average) anyway, a mathematical conceit, so it's natural to account for the effect of Warrior's Charge in a weighted average, which is the "effective" double attack rate in the long run (to be shown later).

As an example, suppose that you have 5/5 Warrior's Charge, for a five-minute recast. The average DA rate for Warrior's Charge, one double attack every five minutes, is equivalent to 2.8% DA given 504 delay, and 2.296% DA given 504 delay and 18% haste. These values are calculated as follows:

\[\frac{1\ \mbox{DA}}{5\ \mbox{min}}\cdot\frac{1\ \mbox{min}}{60\ \mbox{s}}\cdot\frac{1\ \mbox{s}}{60\ \mbox{delay}}\cdot\frac{504\ \mbox{delay}}{1\ \mbox{round}}\cdot100\% =2.8\%\ \frac{\mbox{DA}}{\mbox{round}}\]
\[\frac{1\ \mbox{DA}}{5\ \mbox{min}}\cdot\frac{1\ \mbox{min}}{60\ \mbox{s}}\cdot\frac{1\ \mbox{s}}{60\ \mbox{delay}}\cdot\frac{504(.82)\ \mbox{delay}}{1\ \mbox{round}}\cdot100\% =2.296\%\ \frac{\mbox{DA}}{\mbox{round}}\]
Again, the length of the interval between attacks, which is affected by delay reduction, determines the relative contribution of Warrior's Charge on a per-round basis.

That makes sense, but how does Warrior's Charge actually affect the effective double attack rate?

The effective double attack rate, which takes into account the contribution of Warrior's Charge, is not the sum of the base DA rate and the contrived DA rate from Warrior's Charge. It's a weighted average based on how often WC takes effect, ideally as often as possible, or once every five minutes. That may be a confusing statement, so I provide further corny explanation.

Approaching this question from a probabilistic point of view, the above rates can be treated as the unconditional "probabilities" that Warrior's Charge takes effect in one attack round. There is nothing probabilistic about how often Warrior's Charge is used, but I am using probability language for the sake of explaining how the effective double attack rate is calculated.

If you are familiar with the phrase "percent of the time" appended to a number, perhaps this explanation will actually make some sense. Probability statements are often colloquially expressed in terms of "percent of the time." Given 0% delay reduction, Warrior's Charge "takes effect 2.8 percent of the time." Given 18% delay reduction, Warrior's Charge "takes effect 2.296 percent of the time," and so on.

Given that Warrior's Charge has just been used, the "probability" of a double attack in the subsequent attack round is 1; otherwise, it is whatever your DA rate normally is. Therefore, the "effective" double attack rate is just a weighted average, an application of the law of total probability treating the DA rates as probabilities. As a probability statement, the effective double attack rate is

\[P(\mbox{DA}) = P(\mbox{DA} \mid \mbox{WC})P(\mbox{WC}) + P(\mbox{DA} \mid \overline{\mbox{WC}})P(\overline{\mbox{WC}})\]
Note that this expression is valid in extreme hypothetical cases. If your delay reduction approaches 100%, the relative effect of Warrior's Charge tends to 0. Therefore, your effective DA rate (which is a semi-probability, so to speak) cannot exceed 1. If you never use Warrior's Charge, then your effective DA rate is just your base DA rate.

To give an explicit example, suppose my DA rate from Warrior's Charge is 2.296% (shown above) and my base DA rate (before Warrior's Charge) is 19%. Then, my effective DA rate is

\[P(\mbox{DA}) = 1(.02296) + (.19)(1-.02296) = .2085976\]
This effective double attack rate, which accounts for the discrete contributions of Warrior's Charge, can then be used to estimate the average number of rounds to 100 TP or the average number of hits in a weapon skill.
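The two calculations in this section (the per-round rate for Warrior's Charge and the weighted average) fit in a short sketch. The function names are mine, and the conversion of 60 delay per second is the one used in the unit-conversion chain above.

```python
def wc_da_rate(recast_seconds, delay, delay_reduction=0.0):
    """Per-round 'rate' of one guaranteed double attack every `recast_seconds`."""
    seconds_per_round = delay * (1 - delay_reduction) / 60
    return seconds_per_round / recast_seconds

def effective_da_rate(base_da, wc_rate):
    """Law of total probability: DA is certain on a WC round, base rate otherwise."""
    return 1.0 * wc_rate + base_da * (1 - wc_rate)

rate_no_haste = wc_da_rate(300, 504)        # 5-minute recast, 504 delay
rate_haste = wc_da_rate(300, 504, 0.18)     # same, with 18% delay reduction
effective = effective_da_rate(0.19, rate_haste)
print(rate_no_haste, rate_haste, effective)
```

Pushing the delay reduction toward 100% sends the Warrior's Charge term toward 0, and the effective rate falls back to the base rate.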

Showing that Warrior's Charge is better (in the "long run") for TP gain than for weapon skills

Certainly, if you don't find occasion to use Warrior's Charge for meaningful TP gain, you might as well use it for weapon skills. No one ever said anything about not using one's discretion and judgment.

Still, it is easy to argue that you get more out of Warrior's Charge for TP spamming without doing arithmetic. In the auto-attack phase, on average the extra DA increases TP gain, leading to higher WS frequency, and also contributes to auto-attack damage. For weapon skills, the extra DA merely gives a slightly higher average TP return and slightly higher WS damage.

Numbers provide a nice summary, however, so I present the results of some number-crunching for TP spamming with

sufficient Store TP for a "6-hit" setup (5 hits to 100 TP given sufficient TP return from the previous weapon skill)
a 3-hit weapon skill (like Raging Rush or King's Justice)
18% delay reduction
5/5 Warrior's Charge (so 2.296% DA rate from WC)
19% base double attack rate (so effective DA rate of 20.86%)
95% hit rate
106 "base" damage for TP (average pDIF of 1)
159 "base" damage for WS (average pDIF of 1)

(All numbers in the following table are averages.)

Comparison of average damage per second for a "cycle" of auto-attack and WS damage (5 hits to 100 TP)

Use of Warrior's Charge	Rounds	TP Hits	WS Hits	Time (s)	DPS
5/5 Warrior's Charge only for TP gain	4.497	5.164	3.211	30.979	34.149
5/5 Warrior's Charge only for WS ("e-penis")	4.557	5.151	3.228	31.388	33.752
No Warrior's Charge	4.557	5.151	3.211	31.388	33.662
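The table can be reproduced to about the third decimal with the sketch below. It is my reconstruction, not the original spreadsheet: it assumes the expected number of rounds is computed as the expected time to land 5 hits, that double attack gets two chances to proc inside the 3-hit weapon skill, and that "Warrior's Charge for WS" means swapping the effective DA rate into one of those two chances. All function names are hypothetical.

```python
def expected_rounds(hits_needed, da_rate, hit_rate):
    """Expected attack rounds until at least `hits_needed` hits have landed."""
    miss = 1 - hit_rate
    p0 = (1 - da_rate) * miss + da_rate * miss ** 2
    p1 = (1 - da_rate) * hit_rate + da_rate * 2 * hit_rate * miss
    p2 = da_rate * hit_rate ** 2
    rounds = [0.0]  # rounds[k] = expected rounds when k more hits are needed
    for k in range(1, hits_needed + 1):
        two_back = rounds[k - 2] if k >= 2 else 0.0
        rounds.append((1 + p1 * rounds[k - 1] + p2 * two_back) / (1 - p0))
    return rounds[hits_needed]

def cycle(tp_da, ws_first_da, base_da=0.19, hit_rate=0.95, hits_to_ws=5,
          aa_damage=106, ws_damage=159, delay=504, delay_reduction=0.18):
    """Averages for one auto-attack plus weapon skill cycle."""
    rounds = expected_rounds(hits_to_ws, tp_da, hit_rate)
    tp_hits = rounds * hit_rate * (1 + tp_da)
    # Assumption: two DA chances in the WS, one of which may carry the WC bonus.
    ws_hits = hit_rate * (3 + ws_first_da + base_da)
    seconds = rounds * delay * (1 - delay_reduction) / 60
    dps = (tp_hits * aa_damage + ws_hits * ws_damage) / seconds
    return rounds, tp_hits, ws_hits, seconds, dps

EFFECTIVE_DA = 0.2085976  # 19% base DA plus 5/5 Warrior's Charge at 18% delay reduction

rows = {
    "WC for TP": cycle(EFFECTIVE_DA, 0.19),
    "WC for WS": cycle(0.19, EFFECTIVE_DA),
    "No WC": cycle(0.19, 0.19),
}
for label, values in rows.items():
    print(label, [round(v, 3) for v in values])
```

Small differences in the last displayed digit are expected, since the table values look truncated rather than rounded.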

As expected, the e-penis approach is slightly better than nothing, but worse than the "optimal" approach, provided you actually have opportunities to use Warrior's Charge for TP generation. 22 merit points into Warrior's Charge only gets you up to a 1.446% improvement in theoretical, long-run rate of damage, which itself is an inefficient use of merit points compared to other warrior-specific options such as Group 1 double attack rate.

Moreover, as shown earlier, Warrior's Charge becomes relatively less effective for increasing WS frequency the more delay reduction you have, as it provides only a static increase in TP gain while being unaffected by delay reduction.

By this point, it should be easy to accept using Warrior's Charge for increasing WS frequency where applicable, but let's return to the 45-second Mighty Strikes "zerg." Warrior's Charge is relatively less effective with increasing haste (which is desirable for zerging), but the key word is "relatively." In that small time frame, you should still be better off using Warrior's Charge for the extra TP to get another WS off than to tack on another hit to a weapon skill.

Saturday, July 25, 2009

Properties of virtue weapons like Fortitude Axe

Edit (Aug. 5): mixed up scenario labeling but the conclusion is the same.

A few days ago I forwarded these cheesy analyses of the relative "efficiency" of so-called virtue weapons (Fortitude Axe and Love Halberd), which I hope are not taken all that seriously even though I made some attempt to reconcile it with my own experiences or others'. My primary goal with these posts I made over the past week was to demonstrate yet another application of probability theory to simplify such cheesy "analysis" while being somewhat rigorous about it. At least it's better than presenting some ugly arithmetic and retarded hand-waving, as I explicitly stated the major assumptions involved. (But you would have to trust I am doing calculations correctly.)

I bring up the comparisons involving the virtue weapons to point out that I made the assumption that double attack can proc both on the main hit and virtue weapon proc, yielding the possibility of a round of 4 attacks. Actually, I have no basis for assuming such a thing other than flimsy hearsay. With this in mind, I set out to collect data to support the notion that a 4-attack round is possible with Fortitude Axe, and I obtained the following count data after 236 rounds. (I ran out of virtue stones.)

No. of hits	0	1	2	3	4
Counts	5	96	100	35	0
Est. proportion	.0212	.4068	.4237	.1483	.0000

It so happened that I didn't observe a 4-hit round with Fortitude Axe, but is this because it is rare or because it's impossible?

If we assume DA can proc independently on both the initial attack and the virtue weapon proc, the probability of a 4-hit round is .01796 (given 95% hit rate, 21% DA rate, and 50% virtue weapon proc rate) and the probability that zero 4-hit rounds occur in 236 attempts is .01388. Put another way, the probability that at least one 4-hit round occurs in 236 attempts is .98612.
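These figures are easy to check, assuming 95% hit rate, 21% DA rate, and 50% virtue weapon proc rate (the values used for the hypothetical distributions later in this post), and treating the 236 rounds as independent:

```python
HIT_RATE = 0.95
DA_RATE = 0.21
VIRTUE_PROC_RATE = 0.50
ROUNDS = 236

# Four attacks need the virtue proc plus a DA proc on both attacks; all four must land.
p_four_hits = VIRTUE_PROC_RATE * DA_RATE ** 2 * HIT_RATE ** 4
p_no_four_hit_round = (1 - p_four_hits) ** ROUNDS
p_at_least_one = 1 - p_no_four_hit_round

print(p_four_hits, p_no_four_hit_round, p_at_least_one)
```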

Thus, if this "mechanism" of interaction between DA and virtue weapons is true, I was unlucky not to see a 4-hit round. But, if it is wrong, not seeing a 4-hit round is exactly what I should expect.

How else would DA and virtue weapons interact such that a 4-attack round is not a possibility?

One scenario, which I call "A," is that there is exactly one DA proc possible and that it's independent of the virtue weapon proc. In this case, DA has only one chance to proc.

Another hypothesis is that whether DA procs on the virtue weapon depends on whether the DA has processed on the initial attack. I call this scenario "B." If DA has processed on the first attack, it will not process after the potential virtue weapon proc; otherwise, DA may process after the virtue weapon swing. In this scenario, there are up to two chances for DA to proc.

Both of these scenarios are not far-fetched, so the question of which one agrees more with the data depends on knowing the hypothetical distribution of number of attacks per round under each scenario given the rate of DA trait and overall hit rate. These distributions are determined for 95% hit rate, 21% DA rate, and 50% virtue weapon proc rate, as shown below.

No. of hits	0	1	2	3	4	Expected value
(A)	.0210	.4235	.4655	.0900	0	1.6245
(B)	.0208	.4162	.4018	.1611	0	1.7033025
(C)	.0208	.4161	.3991	.1460	.0180	1.72425

To summarize, scenario "A" allows DA exactly one opportunity to proc. This DA proc is independent of whether the virtue weapon procs.

Scenario "B" allows DA up to two opportunities to proc. If it procs on the first attack, it won't on the (potential) virtue weapon proc. If it doesn't proc on the first attack, it can on the virtue weapon proc.

Scenario "C" allows DA exactly two opportunities to proc (DA and virtue weapons "stack") as explained toward the beginning. The language to describe these may be confusing, but the associated probability distributions are a pretty convenient distillation. Checking the actual data against these hypothetical distributions can give us insight as to which scenario is most reasonable of the three. The "expected value" column shows the average number of attacks per round under each scenario. My initial impression is that (B) seems to be the most realistic.
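Since the language is admittedly confusing, here is a sketch that turns each scenario into a distribution of attacks per round and then thins it by the hit rate. The encoding of the scenarios is my reading of the descriptions above. In particular, for (B) I assume the virtue weapon can still proc when DA fires on the first attack, which is what reproduces the tabulated expected value.

```python
from math import comb

def attack_distributions(d, v):
    """P(number of attacks in a round) for DA rate d and virtue proc rate v."""
    a = {1: (1 - v) * (1 - d), 2: v * (1 - d) + (1 - v) * d, 3: v * d}
    b = {1: (1 - d) * (1 - v),
         2: d * (1 - v) + (1 - d) * v * (1 - d),
         3: d * v + (1 - d) * v * d}
    c = {1: (1 - v) * (1 - d),
         2: (1 - v) * d + v * (1 - d) ** 2,
         3: 2 * v * d * (1 - d),
         4: v * d ** 2}
    return {"A": a, "B": b, "C": c}

def hit_distribution(attack_probs, hit_rate):
    """Each attack lands independently with probability hit_rate (binomial thinning)."""
    dist = [0.0] * (max(attack_probs) + 1)
    for attacks, prob in attack_probs.items():
        for hits in range(attacks + 1):
            dist[hits] += (prob * comb(attacks, hits)
                           * hit_rate ** hits * (1 - hit_rate) ** (attacks - hits))
    return dist

scenarios = {name: hit_distribution(probs, 0.95)
             for name, probs in attack_distributions(0.21, 0.50).items()}
expected_hits = {name: sum(k * p for k, p in enumerate(dist))
                 for name, dist in scenarios.items()}

for name, dist in scenarios.items():
    print(name, [round(p, 4) for p in dist], round(expected_hits[name], 7))
```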

I already discussed scenario "C." The probability of observing zero 4-hit rounds in 236 attempts is .01388, so scenario "C" is unlikely.

Start with scenario "A." Instead of focusing, say, on comparing the observed proportion of 3-hit rounds to the hypothetical proportions under (A) and (B), it makes more sense to consider all of the data at hand. I can use Pearson's chi-square statistic to examine the "goodness of fit" of the associated probability distribution to the observed data. Under scenario "A," the approximate p-value is .02017, indicating that scenario "A" is not particularly likely given the data.

How about scenario "B" then? The associated p-value is .898 (approximately). Under the scenario that DA is permitted to proc up to two times, the probability of observing count data as "extreme" or more extreme than the data actually observed is about .898, an indication that this mechanism is very plausible.
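A rough sketch of the goodness-of-fit check, using the observed counts and the tabulated probabilities for (A) and (B). I am assuming 3 degrees of freedom (four hit categories, 0 through 3) and using the closed-form chi-square tail for exactly that case so no external library is needed. The original computation isn't shown, so the p-values from this sketch should land in the same neighbourhood as the quoted ones rather than match digit for digit.

```python
import math

OBSERVED = [5, 96, 100, 35]                      # rounds with 0, 1, 2, 3 hits
SCENARIO_A = [0.0210, 0.4235, 0.4655, 0.0900]    # from the table above
SCENARIO_B = [0.0208, 0.4162, 0.4018, 0.1611]

def pearson_chi_square(observed, probs):
    total = sum(observed)
    scale = sum(probs)  # renormalise, since the tabulated values are rounded
    expected = [total * p / scale for p in probs]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_tail_df3(x):
    """P(chi-square with 3 degrees of freedom > x); closed form valid for df = 3 only."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

stat_a = pearson_chi_square(OBSERVED, SCENARIO_A)
stat_b = pearson_chi_square(OBSERVED, SCENARIO_B)
p_a = chi_square_tail_df3(stat_a)
p_b = chi_square_tail_df3(stat_b)
print(stat_a, p_a)
print(stat_b, p_b)
```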

It bears reminding that in all of these hypothetical cases, I assumed that the virtue weapon proc rate was 50%. This is not necessarily a good assumption. For example, the Joyeuse proc rate is more like 45%, contradicting the long-held assumption that it is 50%.

After acknowledging that scenario "B" is the best way to explain my data, it may be useful to see how the probability distributions "shift" by changing the virtue weapon proc rate in increments of 5% in either "direction" of 50%.

Hypothetical probability distributions for the number of hits in a single round, assuming that DA has up to two opportunities to proc on a virtue weapon

Given a virtue weapon proc rate of...	0 hits	1 hit	2 hits	3 hits	p-value
40%	.0246	.4870	.3593	.1289	.08418
45%	.0227	.4516	.3806	.1450	.52411
50%	.0208	.4162	.4018	.1611	.898
55%	.0189	.3808	.4231	.1773	.66298
60%	.0170	.3453	.4443	.1934	.13281

It is easy to see that the probabilities under each column decrease or increase at a constant rate. It is also easy to see that while a virtue weapon proc rate of 50% is highly probable given the data at hand, there is insufficient statistical power to rule out a proc rate as low as 40% or as high as 60%.

Conclusion

Based on an observed sample of 236 attack rounds with Fortitude Axe, it appears that the double attack trait can proc either on the first attack or on the virtue weapon attack, but not both. If it procs on the first attack, it won't on the potential virtue weapon attack. If it doesn't proc on the first attack, it may proc on the potential virtue weapon attack.

The obvious implication is that claims of a four-attack (or four-hit) round with Fortitude Axe and other virtue weapons are highly suspect. If you think you observe a four-attack round with Fortitude Axe or another virtue weapon that cannot be explained by high attack speed, that observation must be considered in the context of the relative frequency of 0-, 1-, 2-, and 3-hit rounds that you probably didn't even bother to record. Idiot.

Also, the proc rate of a virtue weapon might be 50% but there was an insufficient sample size to "prove" it.

Sunday, July 19, 2009

Comparison of Love Halberd and Tomoe for samurai

I allocated way too much time this weekend to putzing around with spreadsheets, but let's just finish this off, shall we? Here's an example of doing a fairly simple comparison of a 7-hit Love Halberd with a 5-hit Tomoe, which is based on ideas presented in a prior comparison of weapons for warrior. In particular, I utilize the concepts of "expected number of rounds to clear 100 TP" and "expected number of hits to clear 100 TP" to make the arithmetic more tractable.

I didn't see any (good) hypothetical comparison of Tomoe 5-hit versus Love Halberd 7-hit for samurai (using Penta Thrust), so I thought I could do this really fast because I already set up the "black box" (this mess of a spreadsheet) to spit out an answer.

Calculating average time to 100 TP

Weapon	Average no. of rounds	Average no. of hits	Average time (s)
Love Halberd	3.959	6.488	26.131
Tomoe	3.774	4.123	30.197

This is the easiest step as the assumptions are reasonable if idealized, such as 95% hit rate, 15% double attack rate, and starting with some initial TP from the previous weapon skill.

Calculating average damage to 100 TP (including weapon skill damage)

Weapon	AA "base" damage	WS "base" damage	Average AA damage	Average WS damage	Total damage
Love Halberd	70	110	454.179	650.335	1104.515
Tomoe	96	136	395.890	822.614	1218.505

Again, there are more simple assumptions, like using Penta Thrust immediately after attaining 100 TP, using the same fSTR throughout, and assuming an average pDIF of 1. Using the expected values from the previous table, the average auto-attack and WS damage can be calculated.

Also, average WS damage is based on an average return of 5.035 hits.

Did I account for the effect of Meditate? Assuming Meditate recast is 150 seconds, we can assume all the TP goes to one WS and incorporate that damage into one cycle of AA and WS damage. For example, a "Meditate WS" is about 0.174 of a full WS in one cycle for Love Halberd, 0.201 for Tomoe, which makes sense as Meditate will benefit "slower-to-WS" weapons relatively more (Tomoe being slower).

Damage per second

Weapon	AA proportion of total damage	DPS	Relative efficiency
Love Halberd	.411	42.267	+4.75%
Tomoe	.324	40.351	---
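The three tables chain together as in the sketch below, which takes the average hits and times from the first table as given. It assumes the "Meditate WS" share is simply cycle time divided by the 150-second recast and that it scales WS damage by (1 + share), which matches the 0.174 and 0.201 figures above. The function and key names are mine.

```python
def cycle_summary(aa_base, ws_base, tp_hits, seconds,
                  ws_hits=5.035, meditate_recast=150):
    """Average damage for one cycle of auto-attacks plus Penta Thrust."""
    aa_damage = aa_base * tp_hits
    meditate_share = seconds / meditate_recast      # fraction of an extra WS per cycle
    ws_damage = ws_base * ws_hits * (1 + meditate_share)
    total = aa_damage + ws_damage
    return {
        "meditate_share": meditate_share,
        "aa_share": aa_damage / total,
        "dps": total / seconds,
    }

love_halberd = cycle_summary(70, 110, tp_hits=6.488, seconds=26.131)
tomoe = cycle_summary(96, 136, tp_hits=4.123, seconds=30.197)
relative = love_halberd["dps"] / tomoe["dps"] - 1

print(love_halberd)
print(tomoe)
print(round(100 * relative, 2))
```

Because the inputs are already rounded to three decimals, the last digit of each output can differ slightly from the tables.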

Time for a reality check. Is it really possible for Tomoe auto-attack damage to account for only about 33% of total damage? I would have to see some parser output to validate these calculations. If you ignore Meditate, the proportions increase to .451 and .366. I will update this post when I can track down some parser output.

Even accounting for Meditate, Love Halberd comes out ahead on paper by almost 5%. Whether that 5% is worth expending virtue stones in a merit party is another issue altogether. You can't really argue differences in hit rate (if you want hit rate to drop below 95%) since the only real difference would be whatever is used in the ammo slot. As for attack differences, who knows how DEX +7 would compare to attack +5 and whatever's in the ammo slot.

Of course, the major issue, at least to me, is whether DA really stacks with virtue weapons. I've been assuming it does. Even if it doesn't though, Love Halberd is still slightly more efficient.

Saturday, November 8, 2008

Aggressor and double attack merits

After meriting on greater colibri for a bit, I was wondering whether I would be "better off" had I merited double attack to level 5 instead of Aggressor recast. (Unsynchronized Berserk and Aggressor timers would be really annoying though.) This May 2007 discussion comparing Aggressor and double attack merits shows, despite the muddled presentation, a situation where fully merited double attack is more effective than fully merited Aggressor recast, since Aggressor supposedly provides an accuracy bonus of 25, which corresponds to only a 12.5% hit rate increase (on average). However, we might be interested in the magnitude of difference between the two Group 1 schemes, which is more difficult to quantify.

One approach is to calculate the average number of attack rounds to reach 100 TP for both 5 DA/0 Aggressor and 0 DA/5 Aggressor. (The number of attack rounds is independent of specific damage values.) Of course, the relative effectiveness of Aggressor is higher when your hit rate is lower, as is usually the case when targeting anything more difficult than greater colibri. Then it might be useful to compare max DA and max Aggressor for lower levels of a baseline hit rate.

Ultimately we want to know what the differences in long-run "damage over time" are, but first we can look at the average number of attack rounds, as that is an indirect measure of time. (Assume number of seconds per attack round is constant.) Unfortunately, an analytic expression of the average number of attack rounds to reach 100 TP is too annoying to derive primarily because the number of attack rounds needed to reach 100 TP depends on the TP return of the previous weapon skill, which is almost never zero for a multi-hit weapon skill with a decent hit rate. The number of hits to 100, given initial TP, seems basically to follow a Poisson process, but I'd rather not worry about cumbersome calculations. Therefore, I resorted to simulation to generate the following approximate values based on my warrior setup (varying the Group 1 merit configurations, obviously), given baseline hit rate and the use of a 3-hit weapon skill (Raging Rush or King's Justice):


Average number of attack rounds given baseline hit rate

       5/0   2/4   0/5
0.2   20.19 19.81 19.87
0.3   14.60 14.52 14.61
0.4   11.40 11.39 11.47
0.5    9.31  9.33  9.41
0.6    7.83  7.86  7.95
0.7    6.73  6.78  6.84
0.75   6.29  6.33  6.39
0.8    5.88  5.93  5.99
0.825  5.71  5.74  5.81

Here, the first column corresponds to baseline hit rate (before the Aggressor bonus), and the next three columns correspond to different Group 1 merit configurations:

"5/0": 5 double attack, 0 Aggressor
"2/4": 2 double attack, 4 Aggressor (mine)
"0/5": 0 double attack, 5 Aggressor

Then, we can obtain values representing "damage over time" in terms of hits per round, given the baseline (or nominal) hit rate:


Average number of hits per round given baseline hit rate

       5/0   2/4   0/5
0.2   0.336 0.341 0.340
0.3   0.458 0.460 0.456
0.4   0.580 0.579 0.573
0.5   0.702 0.698 0.691
0.6   0.819 0.818 0.811
0.7   0.946 0.935 0.925
0.75  1.006 0.996 0.983
0.8   1.069 1.054 1.041
0.825 1.097 1.085 1.071

The max DA configuration is already about even with max Aggressor at 30% baseline hit rate, and it really starts to pull away as the baseline hit rate increases (especially after the point where Aggressor does not provide the full accuracy bonus, past 82.5% hit rate), so to me there is scant justification for 5/5 Aggressor. This makes sense as fully merited Aggressor provides an average 1.5% hit rate increase over non-merited Aggressor, which pales in comparison to the increase in "damage over time" that can be conferred by 5 double attack in the presence of high levels of accuracy. This analysis doesn't account for multi-hit weapons such as Ridill and Joyeuse, but the relative differences between 5/0 and 0/5 should still favor 5 DA merits even though the gap may close. And of course, this post doesn't account for actual damage per hit, but DA and hit rate are "independent" of damage per hit anyway (hits/time × damage/hit = damage/time!) and it's not that much of a reach to estimate real "damage over time" by factoring in an average damage per hit.

I found it helpful to plot attack rounds vs. hit rate to illustrate that the average number of attack rounds to 100 TP levels off as hit rate increases:

Obviously the rate of change in the number of attack rounds to 100 TP is decreasing in magnitude (but is still negative) with hit rate. But the number of attack rounds is not a direct measure of damage over time. Damage over time is a ratio of, yes, damage over time. The number of attack rounds is a proxy for time, and is not a ratio.

The number of hits, given the number of attack rounds, on the other hand, is a measure of damage, so dividing the number of hits by the number of attack rounds gives a quantity that can stand in for "damage over time," as plotted below vs. nominal hit rate:

Of course, there is no reason to plot such a thing because intuitively the rate of change of hits/round must be constant (we're plotting hit rate vs. hit rate!), especially if you believe that 2 points of accuracy always corresponds to 1% hit rate between 20% hit rate and 95% hit rate. If you do, it's complete nonsense to speak of damage over time showing "diminishing returns" to hit rate. Hit rate leveling off with accuracy in some logistic fashion is another story though.

Wednesday, October 29, 2008

Double attack and weapon skills, part 2

So many "known" things about random mechanics in FFXI seem poorly substantiated due to a lack of data, bad methodology when data are collected, and poor or non-existent analysis and interpretation after the data collection. Then again, it's not as though you really need to know, say, how many hits per attack round you can expect from a Kraken Club. Even if you have one, such considerations are beside the point.

That said, it's almost delightful to see some real data (not some useless parse), and even better when there are some easily tested hypotheses that follow from the purpose of the data collection. This thread on double attack during WS generated some interesting speculation about how many times double attack can process based on the data gathered but ventured no further, and no one really provided and tested a model of how DA interacts with weapon skills, the closest being a proposal that Penta Thrust may receive up to 3 DA "checks" per WS.

This proposal followed from data collection on TP return (a measure of number of hits in a WS) for Penta Thrust, which is summarized as follows:

10% DA rate (warrior subjob)
95% hit rate (lv 73 dragoon vs. lv 47-54 diatryma)

196 total WS

3 hits: 3 (.015)
4 hits: 42 (.214)
5 hits: 120 (.612)
6 hits: 30 (.153)
7 hits: 1 (.005)

However, I am not interested in seeing whether a "3 DA check" model is a good fit to the data since it is "known" that double attack cannot proc more than twice on a WS. (I hope this is a correct assumption. Besides, it doesn't seem likely that people who love to jack off to WS damage, and make their obnoxious asses known on popular FFXI forums, wouldn't run their mouths about an 8-hit Penta Thrust. Sometimes the persistent absence of evidence is strong evidence--NOT PROOF--of absence.) Rather, I'm looking to clarify how exactly double attack can proc twice at most based on my previous post.

As a reminder, I proposed the following models for how DA might work with WS: (1) double attack can proc twice on specific hits of the WS (thought to be the first two hits per FFXIclopedia), and (2) double attack may proc a maximum of two times on a WS (not restricted to specific hits). Is it even possible to tell the difference between these two models for Penta Thrust, given only 10% DA rate?

Fortunately, the probability distributions under each model are fairly easy to calculate (the calculations for Penta Thrust are similar to those for 3-hit WS last time) and are summarized in the following graph:

The difference between the two is fairly stark, so it wouldn't take that much data to support one over the other, assuming either one is true. In particular, the difference between the two models is most pronounced for the 6-hit and 7-hit cases. A sample proportion of .153 for 6-hits is very unlikely under the "2 DA maximum" model, where the theoretical proportion is .262. The "DA 2 hits only" model seems a decent fit, so run with that.
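For readers who want the numbers behind that comparison, here is a minimal sketch in R. It assumes, as the models in this post do, that a DA check produces a landed extra hit with probability (hit rate × DA rate), and it treats the total as landed normal hits plus landed extra hits. The function name `ws_hit_pmf` and its arguments are my own labels.

```r
# Sketch: pmf of landed hits for a y-hit WS under the two DA models.
# p1 = hit rate, p2 = DA rate; a DA check yields a landed hit with prob p1 * p2.
ws_hit_pmf <- function(y, p1, p2, model = c("two_hits_only", "max_two")) {
  model <- match.arg(model)
  q <- p1 * p2
  normal <- dbinom(0:y, y, p1)              # landed normal hits: 0..y
  extra <- if (model == "two_hits_only") {
    dbinom(0:2, 2, q)                       # DA checked on two hits only
  } else {
    # DA checked on every hit, extra hits capped at two
    c(dbinom(0:1, y, q), pbinom(1, y, q, lower.tail = FALSE))
  }
  pmf <- numeric(y + 3)                     # totals 0..(y + 2)
  for (k in 0:y) {
    for (j in 0:2) {
      pmf[k + j + 1] <- pmf[k + j + 1] + normal[k + 1] * extra[j + 1]
    }
  }
  names(pmf) <- 0:(y + 2)
  pmf
}

# Penta Thrust sample: 196 WS, 95% hit rate, 10% DA
observed <- c("3" = 3, "4" = 42, "5" = 120, "6" = 30, "7" = 1) / 196
two_only <- ws_hit_pmf(5, 0.95, 0.10, "two_hits_only")
max_two  <- ws_hit_pmf(5, 0.95, 0.10, "max_two")

round(cbind(observed,
            two_only = two_only[names(observed)],
            max_two  = max_two[names(observed)]), 3)
```

The 6-hit row is where the two models separate most clearly against the sample proportion of .153.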

The FFXIclopedia article on DA was changed February 10 of this year to state that DA can activate on the first two hits only (instead of being able to proc twice at most and on any of the hits). Aside from the fact that one cannot distinguish between the 4 ways DA can proc consecutively on two hits in Penta Thrust (saying it procs on the first two hits is nothing more than a guess if you don't know how it's programmed), I wonder if that change was motivated by the evidence of sample data or if it was just a shot in the dark. At least I found some evidence for that.

If you are interested in playing around with the probabilities of the number of hits for your favorite multi-hit weapon skill, the following is some R code I wrote to generate them. You can change p1 (hit rate), p2 (double attack rate), and y (number of normal hits in the WS) to suit your particular situation. Some slight modification would have to be made to isolate the probability of the first hit occurring for the purposes of calculating average WS damage (where fTP isn't 1.0).


# p1 - hit rate
# p2 - double attack rate
#  y - number of normal hits in the weapon skill

p1 = .95
p2 = .15
y = 2

# Model 1: double attack can process on only two hits.
# Total hits i = k landed normal hits (out of y) + (i - k) landed DA hits (out of 2).

p_2x = rep(0,(y+3))
for (i in 0:(y+2)) {
  k = max(i-2,0):min(i,y)
  p_2x[i+1] = sum(dbinom(k,y,p1)*dbinom(i-k,2,p1*p2))
}

# Model 2: double attack may process a maximum of two times (on any hits).

p_max = rep(0,(y+3))
for (i in 0:(y+2)) {
  if (i < 2) {
    # fewer than two hits in total, so the cap of two DA hits cannot bind
    k = 0:min(i,y)
    p_max[i+1] = sum(dbinom(k,y,p1)*dbinom(i-k,y,p1*p2))
    next
  }

  # exactly two DA hits: the second DA success arrives within the y chances
  p_max[i+1] = dbinom(i-2,y,p1)*sum(dnbinom(0:(y-2),2,p1*p2))

  # zero or one DA hit
  if (i != (y+2)) {
    k = (i-1):i
    p_max[i+1] = p_max[i+1] + sum(dbinom(k,y,p1)*dbinom(i-k,y,p1*p2))
  }
}

# probability mass functions

round(p_2x,10)
round(p_max,10)

# expected number of hits

hit = seq(0,(y+2))
exp_hit_2x = sum(hit*p_2x)
exp_hit_max = sum(hit*p_max)

exp_hit_2x
exp_hit_max

Some checks: for a 2-hit WS, the two models are indistinguishable. As DA tends to 0%, the two models are indistinguishable in the limit. (The negative binomial distribution is degenerate when p2 = 0.) When hit rate is 100%, the number of hits can never be less than the number of normal hits.

Friday, October 24, 2008

Double attack and weapon skills

Previously, I estimated the average damage of both Raging Rush and King's Justice for my character on lv 82 greater colibri (link), but there was one major unmentioned assumption I made concerning how the double attack trait processes on weapon skills.

Suppose we accept the "conventional wisdom" that double attack can proc twice, at most, on a WS (I haven't seen any evidence that DA can proc more than twice). Under this assumption there are still two possibilities: (1) double attack can proc on only two specific hits of the WS (given 2 or more normal hits in the WS; these are usually thought to be the first two hits), and (2) double attack may proc on any hit, up to a maximum of two times per WS. Which one is it?

There is a subtle difference between the two "hypotheses." If DA can proc on any hit in a multi-hit weapon skill, there are more opportunities for DA to proc twice (when the number of normal hits in the WS is greater than 2) than there would be if DA is limited to proc on specific hits in the WS. Intuitively, if the number of normal hits in the WS is greater than 2, there will be, on average, more WS hits under the second hypothesis even in the presence of a cap to exclude 3+ DA procs.

If you aren't convinced, the following probability exercise will help. Suppose I'm looking at a 3-hit WS (examples: Raging Rush, King's Justice, Blade: Jin, Tachi: Rana) and I want to know the probability of seeing n hits (n = 1, 2, ..., 5) in one WS, given my DA level. Assume 95% hit rate.

Since DA procs are independent of normal hits (in the sense that normal hits must occur in a WS even if they miss), it's simple to calculate these probabilities when DA must proc on only two hits in the WS. Here, the second DA proc is assumed to be independent of the first DA proc, and vice versa. For the other case, the DA procs are dependent, so the calculations are less simple, but they can be done.

When the DA rate is 10%, the probability distributions for both cases are illustrated as follows:

People are more likely to notice 5-hit results than other results, but in either case the probability of observing a 5-hit is pretty low. However, under "2 DA maximum" there are more opportunities for DA to proc (even if there is a 2-DA cap). The expected number of hits is 3.04 for "DA two hits only," and 3.13 for "2 DA maximum."

If you increase your DA rate, the expected number of hits for a WS should always increase (you will see relatively more 4- and 5-hit WSes), and this is the case going from 10% DA to 19% DA:

The expected number of hits is 3.211 for "DA two hits only," and 3.39 for "2 DA maximum." Given 19% DA, it is now fairly easy to distinguish between the two hypotheses, and collecting enough sample data on n-hits of a 3-hit WS should provide evidence in favor of one or the other.

If you can manage to push your DA rate even higher (through merits or elsewhere; I myself have 2 DA merits), the difference between the two hypotheses becomes more stark. Consider when DA is 22%:

The expected number of hits is 3.268 for "DA two hits only," and 3.47 for "2 DA maximum."
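The expected values quoted for 10%, 19%, and 22% DA can be checked directly. This is a minimal sketch in R under the same assumptions as the text (3-hit WS, 95% hit rate, and a DA check producing a landed extra hit with probability hit rate × DA rate); the variable names are illustrative.

```r
# Expected landed hits for a 3-hit WS at 95% hit rate under each model.
y  <- 3
p1 <- 0.95
da <- c(0.10, 0.19, 0.22)
q  <- p1 * da                      # chance that a DA check yields a landed hit

# "DA two hits only": two independent DA checks
exp_two_only <- y * p1 + 2 * q

# "2 DA maximum": DA checked on all y hits, extra hits capped at two,
# so E[extra] = P(exactly 1) + 2 * P(2 or more)
exp_max_two <- y * p1 + dbinom(1, y, q) + 2 * pbinom(1, y, q, lower.tail = FALSE)

data.frame(da, exp_two_only, exp_max_two = round(exp_max_two, 2))
```

The gap between the two columns widens as DA rises, which is why a higher DA rate makes the hypotheses easier to tell apart.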

Which one do I believe to be the case? I don't have any stake in believing one over the other, but it was easier for me to assume that DA procs on two hits only (there are three ways this can happen for a 3-hit WS, but it doesn't matter in calculating the probabilities).

Thursday, October 2, 2008

Occasionally posts once

It's hard to tell what the consensus is about how frequently Ridill and Kraken Club process x number of hits. Two Japanese sources indicate either directly or indirectly that for Ridill the proportions for one, two, and three hits per attack round are .3, .5, and .2, respectively.

For Kraken Club, however, Studio Gobli gives the distribution of swings per attack round as "5:15:25:25:15:10:3:2". This corresponds to 3.82 expected hits per attack round. Another source (I don't know how "authoritative" it is) specifies 3.2 expected swings per attack round without giving any proportions.
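As a quick arithmetic check in R, reading the Studio Gobli ratio as weights for 1 through 8 swings (they happen to sum to 100) and using the .3/.5/.2 proportions quoted for Ridill:

```r
# Studio Gobli's ratio for 1 to 8 swings per round, read as weights
kc_weights <- c(5, 15, 25, 25, 15, 10, 3, 2)
kc_probs   <- kc_weights / sum(kc_weights)
kc_mean    <- sum(1:8 * kc_probs)

# Ridill: one, two, three hits at .3/.5/.2
ridill_mean <- sum(1:3 * c(0.3, 0.5, 0.2))

c(kraken_club = kc_mean, ridill = ridill_mean)
```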

Out of curiosity, I'd like to know how exactly these claims are justified. Did an SE representative give out this information, so it must be true? If these claims were justified empirically, where are the data?

But why should anyone really care about the distribution of number of hits for multihit weapons? Believe it or not, some freaks have been concerned that double attack traits (from the warrior job trait, equipment, Fighter's Roll, whatever) attenuate the number of triple attacks for Ridill (and number of attacks greater than 2 for K. Club), so it may be helpful to know whether this attenuation results in worse performance of Ridill (and other multi-hit weapons) in the presence of double attack than without DA. I myself am more interested in how to analyze any data collected in support or contradiction of a belief. This is for the sake of making conclusions that are marginally better than hand-waving about "margin of error" without even quantifying it.

Collecting data for Kraken Club from English-language sources appears to be a non-starter, but some data for Ridill is easily found. The talk page for Ridill on FFXIclopedia has some good data sets for the number of x hits (x = 1, 2, 3). This is assuming that FFXI's random number generator is sufficiently random (no reason to believe otherwise).

Apparently, the purpose of this data collection was to find evidence that DA affects Ridill's output. But how would DA affect Ridill's output? There were two claims implied by the inane discussion:

(1) Double attack trait processes on all attack rounds equally. This means that single attacks are "converted" to double attacks and triple attacks are "converted" to double attacks. (DA trait "overrides" the Ridill proc.) DA trait may also process when a double attack occurs, but there is no difference in result. As a result, the proportions of single and triple attacks are reduced by the same percentage.

(If the average number of hits/round is less than 2, the net result is a slight increase in Ridill output. If exactly 2, no change regardless of DA level. If greater than 2, a slight decrease in Ridill output.)

(2) DA trait "disproportionately" reduces the number of triple attacks compared to single attacks. Ridill nerfed!

The second claim is really a poorly formed and vague hypothesis; there is no suggestion as to how to express this hypothesis in numerical terms. In contrast, the first claim at least provides some basis for statistical inference because there is a specific claim of how DA interacts with Ridill.

Supposing that the multihit distribution of Ridill as stated previously is really true (a working assumption), then we can calculate Ridill's hit distribution in the presence of warrior's double attack job trait (10% DA) under the first claim:

single: .3(1-.1) = .27
double: .5 + .1(.3 + .2) = .55
triple: .2(1-.1) = .18
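The same calculation as a small R function, so that other DA levels can be plugged in. This is only a sketch of claim (1); `apply_da` is a name I made up, and the base distribution is the working assumption stated above.

```r
# Claim (1): with probability `da`, any round becomes a double attack.
# `base` is the assumed no-DA distribution over 1, 2, 3 hits.
apply_da <- function(base, da) {
  out <- base * (1 - da)   # rounds left untouched by DA
  out[2] <- out[2] + da    # rounds overridden into doubles
  out
}

base    <- c(single = 0.3, double = 0.5, triple = 0.2)
with_da <- apply_da(base, 0.10)
with_da
sum(1:3 * with_da)   # expected hits per round
```

Because the base average (1.9) is below 2, the expected hits per round rises slightly with DA under this claim, as noted in the parenthetical above.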

The very first data set on the talk page was collected using a WAR/NIN with no other DA from equipment or other sources. The sample proportions are

single: 276/1020 = 0.2705882
double: 541/1020 = 0.5303922
triple: 203/1020 = 0.1990196

At first blush, there seems to be no need to go through the motions of performing a statistical analysis. (Never mind that I saw the data before proposing a hypothesis...) Even though the usual logic of using some statistical hypothesis test doesn't really hold (not trying to assemble evidence against a "null" hypothesis, but rather trying to find corroborating evidence to support one), I use this example to illustrate a few approaches one might use to analyze the data.

One approach is to generate simultaneous confidence intervals (with some pre-specified confidence level) for the proportions of single, double, and triple attacks.

Formally speaking, these multihit distributions can be modeled using a multinomial distribution with educated guessing about the parameters (the proportions of x-hits). Given the data above, a set of approximate simultaneous CIs, using the approach of Goodman (1965), will give a range of probable values of the true proportions of Ridill's x-hits.

If I wanted to be (at least) 95% confident that all the confidence intervals contained the true proportions, then I obtain this set of CIs for the given data:

single: (0.23864, 0.30510)
double: (0.49292, 0.56753)
triple: (0.17081, 0.23059)

I think a family of (simultaneous) CIs is more useful than a CI for an individual proportion if only to get some sense of the "big picture" and limit your attention to "plausible" sets of multiple proportions. With the right techniques, your CIs won't be much wider than the individual CIs you would calculate the usual way. The downside is that there aren't any statistical packages that have built-in options to generate simultaneous intervals.
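For anyone who wants to reproduce the intervals above, here is a minimal R sketch of the Goodman (1965) intervals as I understand them: a quadratic (score-type) interval for each proportion, with the chi-square quantile adjusted for the number of categories. The function name `goodman_ci` is hypothetical.

```r
# Goodman (1965) simultaneous CIs for multinomial proportions.
# counts: observed counts per category; conf: joint confidence level.
goodman_ci <- function(counts, conf = 0.95) {
  n <- sum(counts)
  k <- length(counts)
  b <- qchisq(1 - (1 - conf) / k, df = 1)   # Bonferroni-adjusted quantile
  half <- sqrt(b * (b + 4 * counts * (n - counts) / n))
  cbind(lower = (b + 2 * counts - half) / (2 * (n + b)),
        upper = (b + 2 * counts + half) / (2 * (n + b)))
}

ridill_10 <- c(single = 276, double = 541, triple = 203)
ci_10 <- goodman_ci(ridill_10)
round(ci_10, 5)
```

Passing a different vector of counts gives the intervals for any other data set of single/double/triple attacks.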

Conclusion: The above CIs happen to cover the null parameters, so the proposed model seems like a good fit to the data, using the logic of a goodness-of-fit test ("accepting" a null hypothesis in the absence of contradictory data). ("Double attack trait processes on all attack rounds equally.")

Instead of dealing with confidence intervals for multiple proportions, you could focus your attention instead on confidence intervals for the sample mean (expected value) of the number of hits per attack round, which is a random variable just as the numbers of single/double/triple attacks are random variables (all of which depend on the sample size, hence the use of the sample mean).

Indeed, the mean number of hits per attack round is a linear function of the numbers of single/double/triple attacks, and we can use this observation to compute the variance of the sample mean, using the fact that the sum of the individual proportions must equal 1 (for any multinomial distribution).

Thus, for the "null" hypothesis we are currently considering, the expected value of the sample mean of hits/round is 1.91, and the variance of the sample mean is 0.0004332353. By the central limit theorem, the sampling distribution of the sample mean is approximately normal for sufficiently large n. We can use this fact to obtain confidence intervals for the (sample) expected value of number of hits/round.

Personally, I don't think I would bother employing this method. It might be easier to understand if only for the sake of debunking bullshit assertions that arise from point estimates of the expected value for a given sample size. (I'll point out a few of these assertions after I use this method for the previously considered data.) But you lose a sense of the "big picture" when you sacrifice detail for concision.

From the data above, it can be shown that the sample mean of hits/round is 1.928431. Since we already have an assumption about the expected value of the sample mean, we might as well use the population variance of the sample mean (0.0004332353) instead of fussing with a sample variance. (You could also argue that with a sample size of 1,020, who cares?) Then, a 95% confidence interval for the sample mean of hits/round is

1.928431 ± (1.959964)(0.02081431) or (1.888, 1.969)
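The pieces of that interval can be recomputed in a few lines of R. The sketch below uses the hypothesized .27/.55/.18 distribution for the variance and the first data set for the sample mean; the variable names are illustrative.

```r
# Hypothesized distribution with 10% DA, and the first data set
p0     <- c(0.27, 0.55, 0.18)
counts <- c(276, 541, 203)
n      <- sum(counts)
hits   <- 1:3

mu0      <- sum(hits * p0)                   # expected hits/round
var_mean <- (sum(hits^2 * p0) - mu0^2) / n   # variance of the sample mean
xbar     <- sum(hits * counts) / n           # observed hits/round

ci <- xbar + c(-1, 1) * qnorm(0.975) * sqrt(var_mean)
c(mu0 = mu0, var_mean = var_mean, xbar = xbar, lower = ci[1], upper = ci[2])
```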

Recall that the expected value of the sample mean is 1.91. There is no reason to believe that 1.928 is an "extreme" result, assuming that the true distribution of Ridill multi-hits with DA job trait is .27/.55/.18. This can be illustrated with a histogram of a simulated sampling distribution of hits/round (dotted vertical line denoting 1.928431 from the sample and red vertical lines denoting the bounds of the CI), overlaid with a graph of a normal distribution with mean 1.91 and variance 0.0004332353:

Note that the normal distribution and the simulated sampling distribution agree, as expected.
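A simulation along these lines is easy to rerun. The following R sketch is not the original simulation code; it assumes 10,000 replicate samples of 1,020 rounds each drawn from .27/.55/.18, with an arbitrary seed, and compares the simulated mean, standard deviation, and a few quantiles with the normal approximation.

```r
set.seed(1)   # arbitrary seed, only for reproducibility
p0   <- c(0.27, 0.55, 0.18)
n    <- 1020
sims <- rmultinom(10000, size = n, prob = p0)   # 3 x 10000 matrix of counts
sim_means <- colSums(sims * 1:3) / n            # hits/round per simulated sample

c(mean = mean(sim_means), sd = sd(sim_means))

# Empirical quantiles next to those of the normal approximation
probs <- c(0.025, 0.5, 0.975)
rbind(simulated = quantile(sim_means, probs),
      normal    = qnorm(probs, mean = 1.91, sd = sqrt(0.0004332353)))
```

Calling `hist(sim_means, freq = FALSE)` followed by `curve(dnorm(x, 1.91, sqrt(0.0004332353)), add = TRUE)` would draw the kind of overlay described above.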

Conclusion: Using the criterion of "average swings per attack round", the proposed model seems like a good fit to the data. ("Double attack trait processes on all attack rounds equally.")

So how does this apply to the discussion of Ridill on FFXIclopedia? To reiterate, from the first data set (Ridill multihits with WAR DA trait only), the estimated sample value was 1.928. Later on, there is a data set for Ridill multihits in the presence of WAR DA trait, Brutal Earring (assumed DA 5%), Warrior's Cuisses (1%), and Fighter's Calligae (1%), for a total of 17% DA. Does DA "nerf" Ridill or not going from 10% DA to 17% DA? (Whether or not it's really 17% DA, it's higher than 10%.)

Similar to what was shown earlier, it is easy to calculate an alternative distribution under 17% DA (null being 10% DA), assuming DA affects all x-hits equally:

single: .3(1-.17) = .249
double: .5 + .17(.3 + .2) = .585
triple: .2(1-.17) = .166

The sample proportions from the data are

single: 257/1022 = 0.2514677
double: 611/1022 = 0.5978474
triple: 154/1022 = 0.1506849

The sample mean of hits/round for Ridill is 1.899 given 17% DA, which is less than 1.928 given 10% DA.

I recall on BG someone drew the erroneous conclusion that additional DA (from equipment) has the effect of "nerfing" Ridill without accounting for random variability! But before evaluating this assertion, I want to finish up discussing whether the alternative hypothesis is a good fit to the data.

Is 1.899 an "extreme" result given the "alternative" hypothesis just specified? Under the alternative, the expected value of the sample mean of hits/round is 1.917, and the variance is 0.0003993258. We can then repeat the exercise of generating a graph, this time of a normal distribution with mean 1.917 and variance 0.0003993258, along with a simulated sampling distribution of the mean:

As you can see, 1.899 is not an extreme result under the above distribution. Furthermore, because the expected value of hits/round is 1.917 and the underlying (sampling) distribution is approximately normal, if you repeat this experiment many, many times, about half of the observed hits/round should fall below 1.917, and about half above.

But this wasn't the null distribution, or the point of the comparison. Even under the null distribution (first graph), 1.899 is not an extreme result. This shows that for sample sizes around 1,000 (1,000 is really large for any typical hypothesis testing that "really matters"), the effect of DA, if it really exists, is obscured by random error, at least under the assumptions I'm subscribing to.
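One way to put a number on "not an extreme result" is a z-score of the observed mean under each hypothesized distribution. This is a minimal R sketch, assuming the normal approximation and a sample size of 1,022 for both cases; `z_score` is my own helper.

```r
hits <- 1:3
n    <- 1022
xbar <- sum(hits * c(257, 611, 154)) / n   # observed hits/round, 17% DA data

# z-score of the observed mean under a hypothesized distribution p
z_score <- function(p) {
  mu <- sum(hits * p)
  se <- sqrt((sum(hits^2 * p) - mu^2) / n)
  (xbar - mu) / se
}

z_alt  <- z_score(c(0.249, 0.585, 0.166))  # 17% DA model
z_null <- z_score(c(0.27, 0.55, 0.18))     # 10% DA model

c(xbar = xbar, z_alt = z_alt, z_null = z_null)
2 * pnorm(-abs(c(z_alt, z_null)))          # two-sided p-values
```

Both z-scores are well inside ±1.96, which is the sense in which the average hits/round cannot separate the two DA levels at this sample size.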

If the Japanese sources are really correct, then there is no point in doing statistics. But if they are not correct, statistics probably won't help to reveal what seems to be a very slight effect from a change in DA (without using excessive sample sizes). Assuming that calculating average number of hits/round is valid, going from 10% DA to 17% DA is, in the long run, a .37% increase in hits/round.

Conclusion: using the "number of hits/round" criterion, the evidence doesn't show that a DA increase has a "statistically significant" effect, neither worse nor better. (Here, I wanted to find evidence against the null of "no change from 10% DA to ~17% DA.")

(If you used the method of obtaining simultaneous 95% confidence intervals instead, you would get (0.22043, 0.28528) for singles, (0.56068, 0.63392) for doubles, and (0.12585, 0.17942) for triples, each of which covers its corresponding parameter for the 17% DA case. Incidentally, they don't all cover the parameters under the 10% DA case. In fact, a chi-square goodness-of-fit test would "reject" at the 5% level the null hypothesis that the data are a random sample from the case where DA is 10%. Such are the perils of choosing appropriate statistics for inference.

Since the null hypothesis model is a not-so-good fit to the data, maybe you would favor the idea that DA improves the output of Ridill, however negligible.)
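The chi-square goodness-of-fit test mentioned in that aside can be run with base R's `chisq.test`. A minimal sketch, using the 17% DA data set against both hypothesized distributions:

```r
counts_17 <- c(single = 257, double = 611, triple = 154)

fit_10 <- chisq.test(counts_17, p = c(0.27, 0.55, 0.18))     # 10% DA model
fit_17 <- chisq.test(counts_17, p = c(0.249, 0.585, 0.166))  # 17% DA model

c(stat_10 = unname(fit_10$statistic), p_10 = fit_10$p.value,
  stat_17 = unname(fit_17$statistic), p_17 = fit_17$p.value)
```

With 2 degrees of freedom, the 10% DA model is rejected at the 5% level while the 17% DA model is not, even though the average hits/round could not tell them apart.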

At this point, you might be wondering what's the point of this post then, and I'm wondering that myself, too. The point is that when taking a random sample of data, remember the "random" part. An effect that you happen to observe in a one-shot sample could easily be ascribed to sampling error, and a goal of statistical inference is to rule out random variability as a possible explanation.

Finally, would random error explain what appears to be an increase in triple attacks in the presence of DA from equipment? Sure. (I already said it's possible earlier, but here is yet another illustrative example.) Consider the following data set (source: QCDN):

War/Drg + Askar Korazin & Brutal Earring:

Triples: 18.37%
Doubles: 59.77%
Singles: 21.86%
Total: 430 Rounds. 845 Swings. (1.97 Swings/Round)

I would have to assume out of 430 rounds, 79 triples, 257 doubles, and 94 singles occurred. If DA procs on all hits equally (17% here), then the hypothesized proportions of single/double/triple attacks are .249/.585/.166 respectively. Note that the sample size is 430. A 95% confidence interval for the number of swings/round is

1.965116 ± (1.959964)(0.03054195) or (1.905, 2.025)

This CI happens to cover what we assume is the true expected value (1.917). 1.965 swings/round is not so "extreme" a result if our assumptions are indeed true.
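For completeness, here is the same interval recomputed in R from the assumed counts. One observation from working through it: the standard error of 0.03054 quoted above appears to match the variance computed from the sample proportions rather than from the hypothesized .249/.585/.166, so the sketch shows both. The variable names are illustrative.

```r
counts <- c(single = 94, double = 257, triple = 79)
n      <- sum(counts)                      # 430 rounds
hits   <- 1:3

xbar  <- sum(hits * counts) / n            # swings/round
p_hat <- counts / n
se_sample <- sqrt((sum(hits^2 * p_hat) - xbar^2) / n)           # from sample proportions

p17 <- c(0.249, 0.585, 0.166)
se_model <- sqrt((sum(hits^2 * p17) - sum(hits * p17)^2) / n)   # from 17% DA model

ci <- xbar + c(-1, 1) * qnorm(0.975) * se_sample
c(xbar = xbar, se_sample = se_sample, se_model = se_model,
  lower = ci[1], upper = ci[2])
```

Either standard error gives an interval that covers 1.917, so the conclusion does not depend on which one is used.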