Wednesday, June 23, 2010

TP bonus of Fencer

(Edit: This is for WAR at level 75. I didn't consider the possibility of "increasing levels of mastery" bullshit.)

Saw this dumb shit so I thought I would act a dumb shit too by wasting my time figuring this out. (Figuring out the critical hit rate bonus did not waste that much time as I was sleeping while the data was being collected...)

One way to characterize the TP bonus of Fencer is to see how (and whether) the damage of the weapon skill Spirits Within varies with TP in the presence of Fencer and then compare the results to the damage-TP relationship of Spirits Within without Fencer. (Then you assume the findings can be generalized to all weapon skills and hope your observed damage with other weapon skills is consistent with the findings from Spirits Within testing.)

Some preliminary considerations

The problem is that the latter has not been fully characterized to account for flooring, so after retrieving Spirits Within damage observations between 100 and 300 TP without Fencer (using a Trainee Sword with store TP +5 for 6.7 TP and given 1000 current HP), I came up with a formula that matched the observations exactly.

Let D denote Spirits Within damage, H denote current HP and T denote current TP. Then Spirits Within damage appears to follow the piecewise function


This function describes the TP modifier (the fraction) increasing with TP in increments of 1/256 (other increments, such as 1/128 and 1/1024, result in calculated damage values that disagree with the set of actually observed damage values), so perhaps the TP modifiers at 100, 200, and 300 TP are better described as 32/256, 48/256, and 120/256, respectively. (Note: the inner bracket is there to ensure TP values are floored for the purposes of damage calculation, as TP values, while discrete, need not be integers.)

Now, with the same 1000 current HP and 6.7 TP, we can then observe how Fencer affects Spirits Within damage in terms of modifying base TP. We assume the TP bonus is additive and hope it is constant.

Actual TP bonus determination

The actual TP bonus (assuming it's additive and constant) was determined by a step-wise process of elimination by identifying "candidates" for the TP bonus as follows:

Step 1: For 100.5 TP, the predicted Spirits Within damage is 125. The observed damage is 148. The TP bonus could be 38, 39, 40, 41, 42, or 43. (At this point, 40 is the most plausible candidate as one would expect SE to make the TP bonus a multiple of 5 or 10.)

Step 2: For 120.6 TP, the predicted Spirits Within damage is 136. The observed damage is 160. The TP bonus could be 37, 38, 39, 40, 41, or 42, but only 38, 39, 40, 41, or 42 are consistent with both observations.

Step 3: For 147.4 TP, the predicted Spirits Within damage is 152. The observed damage is 175. The TP bonus could be 35, 36, 37, 38, 39, or 40, but only 38, 39, or 40 are consistent with all three observations.

Step 4: For 140.7 TP, the predicted Spirits Within damage is 148. The observed damage is 171. The TP bonus could be 35, ..., 41, but, again, only 38, 39, or 40 are consistent with all four observations.

Step 5: For 154.1 TP, the predicted Spirits Within damage is 156. The observed damage is 183. The TP bonus could be 40, 41, 42, 43, 44, or 45, but only 40 is consistent with all five observations. Assuming the TP bonus is additive and constant, Fencer adds +40 TP to the current TP for WS damage calculation.
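The elimination in Steps 1 to 5 can be sketched in R. The formula image is not reproduced in this text, so `sw_damage` below is only an assumed reconstruction: a TP modifier that moves linearly in 1/256 steps between the quoted 32/256, 48/256, and 120/256 points, with TP floored first. The 0 to 60 search range and all names are made up for illustration.

```r
# Assumed reconstruction of Spirits Within damage (not the original formula image).
# hp: current HP; tp: TP used for the damage calculation.
sw_damage <- function(hp, tp) {
  tp <- floor(pmin(tp, 300))               # floor TP and cap it at 300
  num <- ifelse(tp < 200,
                (16 * tp + 1600) %/% 100,  # 32/256 at 100 TP up to 48/256 at 200 TP
                (72 * tp - 9600) %/% 100)  # 48/256 at 200 TP up to 120/256 at 300 TP
  floor(hp * num / 256)
}

# Observations quoted in Steps 1-5 (1000 current HP, Fencer active).
tp_obs  <- c(100.5, 120.6, 147.4, 140.7, 154.1)
dmg_obs <- c(148, 160, 175, 171, 183)

bonuses <- 0:60  # hypothetical search range for an additive, constant bonus

# Candidate bonuses for each step, then the ones that survive every step.
per_step <- lapply(seq_along(tp_obs), function(i)
  bonuses[sw_damage(1000, tp_obs[i] + bonuses) == dmg_obs[i]])
surviving <- Reduce(intersect, per_step)

print(per_step)
print(surviving)
```

Under these assumptions the per-step candidate lists agree with the ones given in the steps, and only 40 survives all five observations.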

At this point, we should make sure adding 40 TP to the current TP allows us to predict correctly Spirits Within damage when the "net" TP exceeds 200 TP (so that damage is calculated based on the other part of the function).

For 167.5 TP, the predicted Spirits Within damage, based on 207.5 TP, is 207, which is also the observed value.

For 174.2 TP, the predicted Spirits Within damage, based on 214.2 TP, is 226, which is also the observed value.

For 180.9 TP, the predicted Spirits Within damage, based on 220.9 TP, is 246, which is also the observed value. (Note that you cannot floor the current TP to 180 and then add 40, which would give a predicted value of 242 based on 220 TP, which is wrong.) At this point, it seems reasonable to conclude that there is a 40 TP bonus from Fencer between 100 and 200 TP.

Now what about between 200 and 300 TP?

For 201.0 TP, the predicted Spirits Within damage, based on 241.0 TP, is 300, which is also the observed value.

For 227.8 TP, the predicted Spirits Within damage, based on 267.8 TP, is 375, which is also the observed value.

Finally, to make sure the actual TP for damage calculation is actually min(TP + 40, 300), for 300 TP, the predicted Spirits Within damage would be 578 given 340 TP, but the observed damage is 468, which is consistent with the 300 TP maximum.

Conclusion

Fencer gives a constant TP bonus of 40 TP for weapon skills independent of what the current TP is.

Tuesday, June 22, 2010

Critical hit rate bonus of Fencer

(Edit: now with information on Fencer with dual wield.)

Fencer is a new job trait from the June 21, 2010 version update that is available to the Warrior job at level 45 and the Beastmaster job at level 80. It has the following help description: "Increases rate of critical hits when wielding with the main hand only. Grants a TP bonus to weapon skills." The critical hit rate bonus was estimated using the following procedure.

Methods (brief)

An estimate of the critical hit rate bonus was obtained by auto-attacking overnight a level 69 Ul'hpemde, which has AGI 65 (source). WAR75/MNK01 was used. The following equipment was used to obtain STR 57, DEX 64, and an accuracy score of 276:

  • Trainee Knife (240 dagger skill)
  • Walahra Turban
  • Dusk Gloves
  • Snow Ring (STR -2)
  • Swift Belt (Accuracy +3)
  • Aurum Sabatons (DEX +3, accuracy +5)

STR 57 ensures 0 damage to any Ul'hpemde, and DEX 64 ensures (with 4/4 critical hit rate merits) a 9% critical hit rate before the effect of Fencer (source). kparser was used for automated data collection.

The level of the targeted Ul'hpemde was inferred by comparing the predicted hit rate for a level 69 Ul'hpemde (.92) against a point estimate of the hit rate of 5628/6133 = .9176 , with 95% confidence interval (.9105, .9244). The observed hit rate is consistent with the prediction.

Estimation of Fencer's effect with dual wield was also done with a Trainee Knife/Trainee's Needle combination, but the Ul'hpemde was level 68. (Critical hit rate is "directly" independent of level, but not AGI, which depends on level to some extent. But for both level 67 and level 68 Ul'hpemdes, the AGI is 65.) The following image summarizes the final base attribute values for this particular trial:


Results

Single wield: A point estimate of 802/5628 - .09 = .0525 was obtained for the critical hit rate bonus, with a 95% confidence interval (.0434, .0619).

Dual wield: A point estimate of 464/4983 - .09 = .0031 was obtained for the critical hit rate bonus, with a 95% confidence interval (-.0048, .0115).
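For anyone who wants to redo these interval estimates, here is a minimal R sketch. It assumes an exact (Clopper-Pearson) binomial interval from `binom.test()`, which is my guess at the method and not something stated above. `fencer_bonus` is a made-up helper name, and the 9% base rate is the one from the Methods section.

```r
# Exact 95% intervals via binom.test(); the choice of interval method is an assumption.
base_rate <- 0.09  # critical hit rate before Fencer (DEX 64 with 4/4 merits)

# Returns the estimated bonus and its interval after subtracting the base rate.
fencer_bonus <- function(crits, hits, base = base_rate) {
  bt <- binom.test(crits, hits)
  c(estimate = crits / hits - base,
    lower = bt$conf.int[1] - base,
    upper = bt$conf.int[2] - base)
}

single <- fencer_bonus(802, 5628)  # Trainee Knife only
dual   <- fencer_bonus(464, 4983)  # Trainee Knife + Trainee's Needle

print(round(rbind(single, dual), 4))
```

The results should land close to the intervals quoted above. Other interval methods (Wald, Wilson) differ only in the fourth decimal place at these sample sizes.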

Interpretation and conclusion

Since critical hit rate has statistically been shown to take only integer percent values, assuming that the bonus is additive, the critical hit rate bonus of Fencer is either 5% or 6% with 95% confidence.

For the dual-wield case, suppose there were a 5% bonus for the main hand and none for the off hand. The effective bonus would then be 2.5%. Yet the observed estimate is much less than 2.5%, which should be taken as evidence that Fencer has no effect when dual wielding.

Saturday, June 19, 2010

Weapon skill critical hit rate bonus: summary of evidence

(Edit #2: added information for Backhand Blow and Blade: Jin, and another source for Rampage.)

(Edit #1: added another source for Drakesbane.)

This is an attempt to summarize any evidence following attempts to determine the critical hit rate bonus at or around 100 TP (if any) for weapon skills whose "chance of critical varies with TP."

I am not aware of any (non-anecdotal) evidence for the following weapon skills: Ascetic's Fury, Vorpal Blade, Power Slash, Sturmwind, Keen Edge, Vorpal Scythe, Vorpal Thrust, Skewer, Blade: Rin, True Strike, Hexa Strike, Sniper Shot, Heavy Shot, Dulling Arrow, and Arching Arrow (15 weapon skills). That leaves only six: Backhand Blow, Evisceration, Rampage, Raging Rush, Drakesbane, and Blade: Jin.

For now, "convenient" determination of critical hit rate is possible only for the first hit. Most of the testing done concerns the first hit, and conclusions are based on the assumption that the bonus (where it exists) is additive.

Backhand Blow (hand-to-hand, 2 hits)

Source: dex/crit relation, WS crits, WS gorgets discussion (Blue Gartr forums)

Comparing the sample proportions 22/50 (.44) at 9% baseline critical rate and 37/50 (.74) at 30% baseline (with 6% from Destroyers), it is obvious that there is some kind of innate critical rate bonus for at least the first hit of Backhand Blow.

But with Backhand Blow TP varying between 100 and 120 TP, it seems likely that the critical rate was not fixed for each sample. The consequences of this on the allocation of Type I error and coverage probability of the corresponding interval estimate are explored for Blade: Jin bonus estimation (later in the post), as data for that was obtained by the same person, but for now I will just describe briefly how to go about estimating the bonus for Backhand Blow.

Assume that the innate bonus is additive and constant (meaning it's independent of whatever the baseline critical rate is). Also assume that the critical rate bonus from Destroyers (6%) increases the critical hit rate of Backhand Blow by an additional 6% (starting from 24%).

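Let X1 be the number of critical hits observed at 9% baseline, n1 the total number of hits observed at 9%, X2 the number of critical hits observed at 30% baseline, and n2 the total number of hits observed at 30%. A natural "pooled" estimator for Backhand Blow's critical hit rate bonus is

$$\hat{b} = \frac{X_1 + X_2 - (0.09\,n_1 + 0.30\,n_2)}{n_1 + n_2}$$

and its standard error is

$$\mathrm{SE}(\hat{b}) = \frac{\sqrt{n_1\,\hat{p}_1(1-\hat{p}_1) + n_2\,\hat{p}_2(1-\hat{p}_2)}}{n_1 + n_2}, \qquad \hat{p}_i = X_i / n_i$$
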
The sample proportion is .395 and a corresponding 95% confidence interval for the WS bonus is (30.32%, 48.68%).

Conclusion: there is a critical hit rate bonus for Backhand Blow at 100 TP. A bonus of 40% would be consistent with the given data.
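As a quick check of the Backhand Blow numbers, the "pooled" estimator can be written as a small R function. The form used here is the one in the simulation code at the end of this post, generalized to unequal sample sizes (an assumption on my part). `pooled_bonus` and its argument names are made up for illustration.

```r
# Pooled bonus estimate with a large-sample (normal) confidence interval.
# x1/n1: crits and hits at the first baseline; x2/n2: same at the second baseline.
pooled_bonus <- function(x1, n1, x2, n2, base1 = 0.09, base2 = 0.30, level = 0.95) {
  p1 <- x1 / n1
  p2 <- x2 / n2
  # observed crits minus the crits expected from the baselines alone, per hit
  est <- (x1 + x2 - (base1 * n1 + base2 * n2)) / (n1 + n2)
  se  <- sqrt(n1 * p1 * (1 - p1) + n2 * p2 * (1 - p2)) / (n1 + n2)
  z   <- qnorm(1 - (1 - level) / 2)
  c(estimate = est, lower = est - z * se, upper = est + z * se)
}

backhand <- pooled_bonus(22, 50, 37, 50)  # Backhand Blow samples
jin      <- pooled_bonus(3, 30, 8, 30)    # Blade: Jin samples (later in the post)

print(round(rbind(backhand, jin), 4))
```

With equal sample sizes this reduces to the expressions in the simulation code, and it gives .395 with (30.32%, 48.68%) for Backhand Blow.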

Evisceration (dagger, 5 hits)

Source: Evis crit rate testing (Allakhazam forums)

At ~100 TP and given 24% base critical hit rate, the pooled sample gives a sample proportion 248/696 = .3563. A 95% confidence interval for the critical hit rate bonus is (8.61%, 15.32%).

Conclusion: there is a critical hit rate bonus for Evisceration at 100 TP, with +10% being a possibility.

Rampage (axe, 5 hits)

Source (1): ランページとDEXの関係 (The relationship between Rampage and DEX)

There are two sets of estimates: one for DEX 68, and one for DEX 124, with Gigantobugard as the target mob in both cases. I'm not much interested in calculating base AGI and confirming that the Megalobugard's level range is 40-43, so I ignored the estimates for DEX 68. DEX 124 ensures a 24% base critical hit rate.

At 100 TP, the sample proportion of critical hits is 35/130 = .2692. A 95% confidence interval for the critical hit rate bonus is ( -4.48%, 11.40%). But suppose there actually is a 10% critical hit bonus. For a sample size of 130, the probability that the sample is sufficient to show a statistically significant bonus is about .7388 (power calculation).

At 200 TP, the sample proportion of critical hits is 68/150 = .4533. A 95% confidence interval for the critical hit rate bonus is (3.20%, 29.66%).

Source (2): dex/crit relation, WS crits, WS gorgets discussion (Blue Gartr forums)

I did say I wasn't interested in calculating a mob's AGI, but a Clipper's AGI is either 18 or 21 regardless of the levels reported on FFXIclopedia, and either AGI value doesn't affect the actual crit rate for the DEX 57 case, which is indeed 13%. (See this for details about critical hit rate as a function of your DEX - mob AGI.)

Using the same "pooled" estimator rationale I used for Backhand Blow (earlier in the post), the sample proportion for Rampage's crit bonus at 300 TP is .465 and a corresponding 95% confidence interval for the rate bonus is (31.80%, 61.20%). For the sake of completeness, estimates for the bonus at 100 TP and 200 TP are (-9.26%, 25.40%) and (3.20%, 48.80%), respectively.

Conclusion: if there is a critical hit rate bonus for Rampage at 100 TP, the known evidence is insufficient to show that, but if the bonus were 10%, for n = 130 the power to reject the null hypothesis of no bonus is fairly high (.7388). Given all the data, it is relatively unlikely that the bonus is 10%, but a smaller bonus cannot be ruled out with such small samples.

Unsurprisingly, there is a bonus at 200 TP and 300 TP.

Raging Rush (great axe, 3 hits)

Source (1): レイグラのクリティカル率について その1 (On Raging Rush's critical hit rate, part 1)

The sample proportion is 20/40 given the usual 24% base. The "control" data for base critical rate (which is a good idea to have, by the way), however, gives the sample proportion 44/130 = .3385, which is somewhat unusual, but I write that off as merely that, not as a sign of experimental error. This data alone gives the tentative impression that there is a bonus.

Source (2): RagingRush Critical rate test (Killing Ifrit forums)

The raw data (showing damage values) are in a spreadsheet, but you don't need to download it.

At 100 TP and given 24% base critical hit rate, the proportion of critical hits is 155/373 = .4155. A 95% confidence interval for the critical hit rate bonus is (12.50%, 22.74%). This is strong evidence that the critical hit rate bonus is not 10%. Possible candidates are 15% and 20%.

More interesting to me is that the damage for 1 TP return (20 occurrences) was also noted, providing an opportunity to determine whether a critical hit rate bonus also applies to off-hand hits (despite there being no way to tell the difference between a double attack hit and a regular off-hand hit). Assuming a 24% base critical hit rate, with 9 observed critical hits out of 20, the corresponding p-value is .03614, which suggests a critical hit rate bonus.
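The p-value for the 20 one-TP-return hits can be checked with an exact binomial test in R. That the quoted figure comes from the default two-sided `binom.test()` is an assumption on my part. The one-sided version is shown next to it for comparison.

```r
# Exact binomial test: 9 critical hits in 20 hits against a 24% base rate.
two_sided <- binom.test(9, 20, p = 0.24)$p.value
one_sided <- binom.test(9, 20, p = 0.24, alternative = "greater")$p.value

print(c(two_sided = two_sided, one_sided = one_sided))
```

The two-sided value should come out close to the quoted .03614. The one-sided value is a bit smaller (roughly .032), so the conclusion doesn't depend on which version is used.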

Conclusion: there is a critical hit rate bonus for Raging Rush at 100 TP, with +15% and +20% being possible candidates. The small sample for critical hits from off-hand hits suggests a critical hit rate bonus for off-hand hits of Raging Rush as well.

Drakesbane (polearm, 4 hits)

Source (1): drakesbane native crit% (FFXIclopedia forums)

The first sample is 38/100 and the second, 24/100 (given 106 TP).

38/100 is a fairly extreme observation given 24% base critical hit rate (if there were no bonus). On the other hand, 24/100 is not that extreme an observation given a 34% rate. Since there is no good reason to think the conditions changed between the two samples, pool the data and crank out an interval estimate for the rate bonus, which is (0.66%, 13.91%).

Source (2): 雲蒸竜変の検証 (Testing Drakesbane)

There are four samples: three for 100 TP and one for 300 TP.

For 100 TP, the sample proportions are 12/49, 15/45, and 15/41 (given 24% base critical hit rate). The pooled estimate is 42/135 = .3111 and a 95% confidence interval for the bonus is (-0.57%, 15.64%). While this interval covers 0, 0 is again close to the left endpoint (in the other case the 0 being on the "right" side based on expectations).

As for 300 TP, the sample proportion is 16/30 and a 95% confidence interval for the rate bonus is (10.32%, 47.66%), which rules out 50% (tentatively).

Conclusion: there is suggestive evidence for a critical hit rate bonus at 100 TP, with +5% and +10% being possible candidates. At 300 TP, a +50% bonus appears to be an "unlikely" possibility.

Blade: Jin (katana, 3 hits)

Source: dex/crit relation, WS crits, WS gorgets discussion (Blue Gartr forums)

The sampling was done in the same fashion as for Backhand Blow, with observed critical hit proportions 3/30 at 9% baseline crit rate and 8/30 at 30% baseline (with Senjuinrikio's 6% bonus) at 100 TP. Using the same estimator that I used for Backhand Blow, the "pooled" sample proportion for Blade: Jin's critical bonus is -0.01167, and a corresponding 95% confidence interval is (-10.73%, 8.39%).

Taking the confidence interval at face value, if there is a critical bonus for Blade: Jin at 100 TP, it is unlikely that it's 10% or higher, especially considering the "sloppy" manner in which the data was likely collected (with TP not being held fixed, the critical hit rate could have varied), which further supports that contention. If the bonus were 10%, obviously, the probability that a 95% confidence interval wouldn't cover 10% at the right endpoint of the interval would be near .025 (half the Type I error). The consequences of experimental "error" are explored in a simulation study described at the end of this post.

Conclusion: if there is a critical hit rate bonus for Blade: Jin at 100 TP, it is unlikely that the bonus is as high as 10%.

Simulation study: is a 10% critical hit rate bonus that unlikely for Blade: Jin?

Consider the following simulation study based on hypotheticals: if there actually were a 10% bonus at 100 TP, with a 1% increase for every 5 TP, then with TP varying between 100 and 119 TP, the critical rate bonus varies between 10% and 13%.

Given that "TP overflow" is inevitable with dual wield, and that extra hits occurring beyond TP were quite possible because data collection was reported to be boring, suppose that each of the critical rate bonuses between 10% and 13% (inclusive) is equally likely to be "chosen" for Blade: Jin.

The purpose of the study is to show how likely it is that the "pooled" large-sample confidence interval covers 10% given the above conditions.

A histogram of the simulated sampling distribution of the critical hit rate bonus shows that it's obviously not normal, with the mean (about 11.5%) higher than 10%, which is supposed to be the "actual" bonus at 100 TP for this simulation. (The shape of the large-sample approximation of the sampling distribution is traced with the solid curve.)


On the other hand, the margin of error for all simulated sample proportions is higher than 9.56%, the margin of error for the actual sample, about 97.7% of the time. (The mean margin of error is 11.19%.) Also, the "actual" (in the context of the simulation) Type I error is about .059, with about .040 allocated to the right tail (meaning there is a probability of .0402 that the null hypothesis of .10 is rejected because the estimate is higher than .10 based on the criterion of statistical significance) and about .019 allocated to the left tail (meaning the null is rejected with probability .019 because the observed estimate is significantly lower than .10). By comparison, the nominal left-tail error is .025.

Repeating this exercise under the condition that there is no bonus, the margin of error for all simulated sample proportions is higher than 9.56% only 58.0% of the time, and the probability that a confidence interval's right endpoint is higher than 8.39% is less than 0.1%.

If Blade: Jin's critical hit rate bonus at 100 TP were actually 10%, considering TP overflow and additional hits occurring beyond TP overflow, it would be very unlikely that a given 95% confidence interval would not cover 10%. The margin of error would also be very likely to be higher than 9.56%. Therefore, it is more plausible that its critical rate bonus is significantly less than 10%, if it even exists.

The following is some code for the simulation, but the inner loop should probably be expanded so that it finishes faster.

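# Simulated sampling distribution of the "pooled" bonus estimate, assuming a
# 10% bonus at 100 TP that rises to 13% with TP overflow (each equally likely).
n = 100000
ci.lower = numeric(n)
ci.upper = numeric(n)
p.pool = numeric(n)
for (i in 1:n) {
  X1 = 0  # critical hits at the 9% baseline (30 weapon skills)
  X2 = 0  # critical hits at the 30% baseline (30 weapon skills)

  for (j in 1:30) {
    # each weapon skill draws its own crit rate: baseline + bonus of 10-13%
    X1 = X1 + rbinom(1,1,sample(seq(.19,.22,by=.01),1))
    X2 = X2 + rbinom(1,1,sample(seq(.40,.43,by=.01),1))
  }

  p.pool[i] = (X1 + X2 - .39*30)/60  # pooled bonus estimate

  se = sqrt((X1/30*(1-X1/30) + X2/30*(1-X2/30))/120)
  ci.upper[i] = p.pool[i] + qnorm(.975)*se
  ci.lower[i] = p.pool[i] - qnorm(.975)*se
}

mean(p.pool)  # mean of the simulated estimates
me = (ci.upper - ci.lower)*.5
# share of simulated margins of error above the observed one (3/30 and 8/30)
mean(me>sqrt((3/30*(1-3/30)+8/30*(1-8/30))/120)*qnorm(.975))
mean(ci.upper<.10)  # left tail: interval lies entirely below 10%
mean(ci.lower>.10)  # right tail: interval lies entirely above 10%
mean(ci.upper<.10) + mean(ci.lower>.10)  # total Type I error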

Friday, June 18, 2010

Why Love Halberd is underrated... for dragoon

While I personally have yet to determine the virtue stone consumption rate for virtue weapons other than Fortitude Axe (so far, I'm assuming it's 55% across all virtue weapons given the limited evidence thus far), how exactly the normal double attack trait interacts with the virtue weapon's "occasionally attacks twice" (OAT) property seems to be described correctly. With a reasonable level of confidence, one can draw conclusions about how effective the other virtue weapons are compared to their "peers."

I can't say the likes of Hope Staff and Prudence Rod are worth discussing, but Love Halberd has some properties relevant for dragoon and samurai that seem to be misunderstood and even dismissed out of hand, the inconvenience of acquiring virtue stones notwithstanding. I go through them in order of importance and then compare Love Halberd to its competing options for DRG.

Is Love Halberd's delay undesirable?

Love Halberd has 396 delay, so with current quantities of Store TP available, it's possible and reasonable to achieve an "8-hit setup" with 23 Store TP (12.5/10.2 = 1.22549, which rounds up to 1.23).
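The Store TP arithmetic above can be written as a one-line R helper. It assumes Store TP acts as a percentage multiplier on base TP per hit (which is what the 12.5/10.2 ratio implies) and ignores any in-game truncation of TP per hit. `min_store_tp` is a made-up name.

```r
# Smallest whole-number Store TP (in %) so that `hits` landed hits reach 100 TP,
# given `base_tp` TP per hit before Store TP.
min_store_tp <- function(base_tp, hits) {
  needed_per_hit <- 100 / hits
  pmax(0, ceiling((needed_per_hit / base_tp - 1) * 100))
}

love_halberd <- min_store_tp(10.2, 8)  # 10.2 TP per hit at 396 delay, 8-hit setup
print(love_halberd)
```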

People act like this is a bad thing. But so what if it takes Love Halberd 8 hits to get to 100 TP? Noting how many hits it takes to get to 100 TP is trivial and irrelevant especially because of Love Halberd's OAT property. Instead, one should ask, how many attack rounds does it take for Love Halberd to get to 100 TP, given that 8 hits are required to get there?

It may help to show a graph illustrating, for both a virtue weapon (singly wielded) and a weapon without any multi-hit property (also singly wielded) but under 9% double attack rate and 95% hit rate, the relationship between the nominal number of hits to get to 100 TP and the "actual" (in a long-run, "missing the first hit of a WS 5% of the time," weapon skill-spamming context), average number of attack rounds it takes to get to 100 TP:


First, look for the average number of attack rounds it takes for a weapon without any multi-hit property to get to 100 TP in 6 hits. On the graph, the average number of attack rounds appears to be 5, and the actual value is 4.9526 rounds. This figure is reasonable: the first hit of a WS lands 95% of the time, so most of the time only 5 more hits are needed to get to 100 TP, and although the first hit misses 5% of the time, the 9% double attack rate pulls the average slightly below 5.

Now, look for the average number of attack rounds it takes for a virtue weapon to get to 100 TP in 8 hits. "Wait a second," you observe, "isn't the corresponding average number of rounds below 4.9526?" In fact, on average it takes a virtue weapon only 4.7305 rounds to get to 100 TP in 8 hits, so an 8-hit virtue weapon setup ideally has a higher weapon skill frequency than a 6-hit setup with a non-multi-hit weapon.

Is the average attack round argument unconvincing? Let's instead examine the probability distributions of the number of attack rounds it takes for a virtue weapon, a weapon without a multi-hit property, and, for comparison's sake, a "Trial of the Magians" OAT weapon (for dragoon, Bradamante) to get to 100 TP:


These probability distributions were obtained via Markov chain methods.

For a weapon without a multi-hit property, the probability of getting to 100 TP in 5 attack rounds is .580, and the probability for fewer than 5 attack rounds is higher than the probability for greater than 5 attack rounds, which is consistent with the average attack round figure of 4.9526.
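Here is a rough R sketch of the kind of Markov chain calculation involved, for the non-multi-hit weapon only. The model is my assumption, not necessarily the one actually used: each attack round is one swing plus a second swing on a double attack proc, every swing lands independently at the hit rate, and the first hit of the WS supplies one of the 6 hits when it lands. All names are made up for illustration.

```r
# Distribution of landed hits (0, 1, or 2) in one attack round.
round_hits <- function(hit = 0.95, da = 0.09) {
  c((1 - da) * (1 - hit) + da * (1 - hit)^2,     # 0 hits
    (1 - da) * hit + da * 2 * hit * (1 - hit),   # 1 hit
    da * hit^2)                                  # 2 hits
}

# P(exactly r rounds are needed to land `need` hits), r = 1..max_rounds.
# state[h + 1] is the probability of having h hits; the last entry is absorbing.
rounds_pmf <- function(need, q, max_rounds = 60) {
  state <- c(1, numeric(need))
  done_prev <- 0
  pmf <- numeric(max_rounds)
  for (r in seq_len(max_rounds)) {
    nxt <- numeric(need + 1)
    nxt[need + 1] <- state[need + 1]             # already finished stays finished
    for (h in 0:(need - 1)) {
      for (k in 0:2) {
        to <- min(h + k, need)
        nxt[to + 1] <- nxt[to + 1] + state[h + 1] * q[k + 1]
      }
    }
    state <- nxt
    pmf[r] <- state[need + 1] - done_prev        # newly finished in round r
    done_prev <- state[need + 1]
  }
  pmf
}

q <- round_hits()
# 6-hit setup: 5 more hits if the WS's first hit landed (95%), otherwise 6.
pmf <- 0.95 * rounds_pmf(5, q) + 0.05 * rounds_pmf(6, q)
avg_rounds <- sum(seq_along(pmf) * pmf)

print(round(pmf[1:8], 4))
print(avg_rounds)
```

Under these assumptions the average and the 5-round probability agree with the 4.9526 and .580 figures above. The virtue weapon case would need the OAT and double attack interaction added to `round_hits`, which I have not attempted here.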

In comparison, while the probability of getting to 100 TP in 5 attack rounds is lower for a virtue weapon (.403), the higher probability of getting to 100 TP in 4 attack rounds (.373) contributes to the average number of attack rounds to get to 100 TP being lower (4.7305).

And for the sake of comparison, it takes about 3.783 rounds for a Magian OAT weapon to get to 100 TP in 6 hits. This breaks down such that, most of the time, there is a high probability that a Magian OAT weapon takes either 3 or 4 attack rounds to get to 100 TP.

Note that for all three types of weapons, the probability that it takes 7 or more attack rounds to get to 100 TP is, at most, about .028 (for both the virtue weapon and the non-multi-hit weapon), which underscores the fact that, at least given 95% hit rate, it's not like the virtue weapon "needs" 7 or more attack rounds to get to 100 TP with any significant probability just because 8 landed hits are required to generate 100 TP.

In short, delay for virtue weapons, and the corresponding nominal number of hits it takes to get to 100 TP, is relatively unimportant because of the OAT property. In the case of the 8-hit Love Halberd setup, this property results in a lower average number of attack rounds to get to 100 TP than that for a 6-hit setup for a weapon without a multi-hit property (assuming a 55% virtue stone consumption rate).

Is the Love Halberd's base damage rating too low?

Love Halberd's 60 base damage is only 4 lower than Fortitude Axe's 64, which has 504 delay, so I'd say dragoons and samurai are relatively "spoiled" with access to a weapon with such high attack frequency and low delay.

Also, with a low base damage, the relative damage gap between Love Halberd and a higher-damage weapon decreases with additional fSTR.

Does Love Halberd's DEX +7 matter?

This is relatively unimportant, but with DEX +8 generally guaranteeing a 1% increase in critical hit rate when the target's AGI is not obscenely higher than your DEX, one can expect, effectively, a +1% critical hit bonus most of the time with DEX +7, which is not bad. DEX +7 is also a nice amount of DEX in the weapon slot that could help to ramp up one's critical hit rate if the opportunity presents itself (yeah, yeah, Greater Colibri...).

At least you can say it counters the loss of any attack (or accuracy) bonus associated with equipment for the ammo slot, Smart Grenade, Tiphia Sting, or whatever it is that DRG uses.

An additional +5 or +6 accuracy, if actually realized from the DEX bonus, is nothing to ignore, either.

Finally, a comparison of polearm options

All the features of Love Halberd described culminate such that Love Halberd is better than "conventional wisdom" allegedly holds.

Earlier, I did a write-up of how to model (approximately) the effect of Jump on damage rate as a preliminary step to doing a comparison of polearms that accounts for the increased WS frequency that Jumps provide. As usual, this comparison is done in terms of a long-run, WS-spamming, Jump-spamming situation so that one gets a decent idea of the relationship among the weapons in terms of maximum potential.

The weapons to be compared are
  • Valkyrie's Fork (6 hits to 100 TP)
  • Bradamante (with 75 base damage and 6 hits to 100 TP)
  • Love Halberd (8 hits to 100 TP).

Some of the conditions I specified are
  • fSTR 6 (+5 for Drakesbane)
  • 42 additional WS "base" damage from the STR 50% modifier
  • 95% hit rate
  • 0% Zanshin rate
  • base double attack rate of 9%
  • ATK/DEF ratio of 1.5 and base critical hit rate of 9%, corresponding to an (approximate) average pDIF of 1.599 across all weapons (the critical hit rate bonus of Love Halberd treated as though it offsets the use of virtue stones at the expense of any attack bonus from the ammo slot)

Also, for Drakesbane, I am assuming a critical hit rate bonus of +10% and basing WS damage on 100 TP (ignoring excess TP effects, if they even exist). For Jumps (when accounted for), I treat the damage of Jumps as equivalent to normal hits (yet another simplification).

Let's start with a high quantity of haste, say, 64%, which accounts for Hasso (10%), double March (20%), Haste spell (15%), and haste from equipment (19%), which would relatively favor Valkyrie's Fork, a weapon with fundamentally lower WS frequency than the others, because of weapon skill delay (2 seconds).

Without accounting for the effect of Jumps, the summary of relevant numbers comes out as follows:

Weapon            Avg. TP dmg   Avg. WS dmg   Time per WS   Dmg/sec   TP:WS dmg
Valkyrie's Fork   832.01        1041.54       16.29 s       114.98    444:556
Bradamante        701.52        894.93        13.78 s       115.83    439:561
Love Halberd      793.79        789.77        13.23 s       119.61    501:499

These figures are merely a point of comparison to the more "realistic" figures that account for the effect of Jumps. But first, as an aside, I have to point out that the OAT effect of virtue weapons doesn't proc on Jumps and discuss the major implication for using Jumps with Love Halberd.

In general, Jumps can be considered an attack round that occurs "on demand." Moreover, Jumps generally delay the start of the following attack round by 2 seconds (a consequence of job ability or weapon skill delay in general), so Jumps, in effect, help to decrease the time between weapon skills except when the time between auto-attack rounds falls below 2 seconds. This is the primary effect of Jumps as slight increases in Jump damage per hit compared to auto-attack damage per hit are minor in comparison.

But since Jumps with Love Halberd are effectively normal attack rounds, they do not generate TP (on average) as much as auto-attack rounds. Therefore, there is a critical value of haste after which jumping with Love Halberd is unproductive.

Given the above conditions, Love Halberd averages about 1.579 landed hits per attack round, and "normal" jumps average exactly .95*1.09 = 1.0355 landed hits per "attack round" or 0.51775 landed hits per second (if spammed, so this is the upper limit for Jumps). It follows that it's counterproductive to jump with Love Halberd (in a long-run sense, not in a "need damage on demand" sense) when haste is above 53% (an approximate critical value). Therefore, for the following table, the effect of Jumps is considered only for Valkyrie's Fork and Bradamante:

Weapon            Avg. TP dmg   Avg. WS dmg   Time per WS   Dmg/sec   TP:WS dmg
Valkyrie's Fork   832.01        1041.54       16.00 s       117.08    444:556
Bradamante        701.52        894.93        13.51 s       118.13    439:561
Love Halberd      793.79        789.77        13.23 s       119.61    501:499
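The critical haste value of about 53% quoted before this table can be reproduced with a few lines of R. The sketch assumes that 60 delay equals 1 second and that haste shortens the time between attack rounds proportionally. The 1.579 hits per round and the 2-second Jump delay are taken from the text, and the variable names are made up.

```r
lh_hits_per_round <- 1.579            # Love Halberd landed hits per attack round
jump_hits_per_sec <- 0.95 * 1.09 / 2  # spammed Jumps: 1.0355 hits every 2 seconds
lh_round_sec      <- 396 / 60         # seconds per attack round at 0% haste

# Auto-attacks out-pace Jumps once
# lh_hits_per_round / (lh_round_sec * (1 - haste)) > jump_hits_per_sec.
critical_haste <- 1 - lh_hits_per_round / (jump_hits_per_sec * lh_round_sec)

print(critical_haste)
```

This comes out just under 54%, in line with the approximate critical value given above.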

As stated previously, the primary effect of Jumps is to decrease the time per weapon skill. Given 64% haste, the effective increase in damage per second is at most around 2%. (At lower levels of haste, the contribution of Jumps to increasing the rate of damage is higher.) Even when Jumps are accounted for, Love Halberd is still slightly better than either Valkyrie's Fork or Bradamante. (The TP:WS damage ratios are my usual check on how well the calculations represent what is observed in the game, but I have no idea if these are typical ratios.)

Certainly, virtue stone consumption is a strike against Love Halberd for everyday, humdrum situations, and it's possible Bradamante can be further augmented after future updates, but can Bradamante be enhanced to the point where formerly top-end polearms (like Valkyrie's Fork) are completely outclassed after accounting for human "inefficiency"? It remains to be seen, but now let's consider the viability of these weapons in a zerg-like situation with 80% haste:

Weapon            Avg. TP dmg   Avg. WS dmg   Time per WS   Dmg/sec   TP:WS dmg
Valkyrie's Fork   832.01        1041.54       9.94 s        188.47    444:556
Bradamante        701.52        894.93        8.55 s        186.80    439:561
Love Halberd      793.79        789.77        8.24 s        192.08    501:499

As discussed in a previous post, the benefit of increasing haste is higher for weapons with lower WS frequency than weapons with higher frequency, a consequence of weapon skill delay. Unsurprisingly, Bradamante falls behind Valkyrie's Fork, yet Love Halberd still has a slight advantage over Valkyrie's Fork even at maximum haste, lending actual credence to the use of Love Halberd for high-haste zergs (and discrediting the idea of using Bradamante for such, at least when compared to Valkyrie's Fork).

Conclusions

Love Halberd's delay in conjunction with its OAT property can give it a weapon skill frequency higher than that of weapons without any multi-hit property. For example, an 8-hit Love Halberd setup has a higher WS frequency than a 6-hit setup for a polearm without any multi-hit property. This, along with its relatively high base damage (for a multi-hit weapon) and DEX +7, makes it a "peer" to the likes of Bradamante, the latest fashionable polearm. At 80% haste, Bradamante is a relatively poor weapon compared to Love Halberd.

INT affecting Drain accuracy, continued

This is a followup to an earlier post showing that INT affects the accuracy of Drain.

Having found a way to suppress my base INT even further, I increased the difference between the two INT "treatments" to 60 (54 INT and 114 INT). As the resist rate for 121 INT could have been floored (presumably at 5%), maybe a 7-INT decrease would result in some increase in resist rate. The following dot plots summarize the distribution of the observed data visually:


The criterion I used last time to determine a difference between two cases, the number of Drain values set "far enough" apart from the bulk of the data (over the total number of samples in each case), doesn't quite work this time, in part because it is kind of difficult to define the "bulk" of the data for the 54-INT case. (It appears that there are more resists in general by an alternative criterion of number of Drain values below 144, though, and more for the 54-INT case than the 114-INT case.)

Instead, I probably am better off relying on the two-sample t-test to demonstrate statistical significance. The sample means for 54 INT and 114 INT are 154.28 and 191.33, respectively, and the 95% confidence interval for the difference of means is (12.213, 61.898), so the evidence, taken together (including that in the last post), provides strong support for the contention that INT affects Drain accuracy (provided that all the assumptions I stated last time hold, and why wouldn't they?), specifically that increasing INT increases its accuracy (in the form of fewer resists).
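
A quick sanity check on the reported interval, using only the numbers quoted above: the midpoint of the confidence interval should equal the difference of the sample means, and the half-width implies a standard error. The sample sizes and degrees of freedom are not given here, so the sketch substitutes the normal critical value for the t quantile; that substitution is an assumption and makes the implied standard error approximate.

```python
from statistics import NormalDist

mean_low_int, mean_high_int = 154.28, 191.33   # sample means from the post
ci_low, ci_high = 12.213, 61.898               # reported 95% CI for the difference

diff = mean_high_int - mean_low_int
midpoint = (ci_low + ci_high) / 2
half_width = (ci_high - ci_low) / 2

# Assumption: normal critical value (about 1.96) in place of the t quantile,
# because the degrees of freedom are not stated in the post.
z = NormalDist().inv_cdf(0.975)
implied_se = half_width / z

print(round(diff, 2), round(midpoint, 4), round(implied_se, 2))
```

The midpoint and the difference of means agree to within rounding, and the interval excludes zero, which is what "statistically significant at the 5% level" amounts to here.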

The most obvious practical implication of this finding is that the "conventional wisdom" of prioritizing dark skill above magic accuracy above recast reduction for Drain (and Aspir by close analogy) should incorporate INT, arguably before recast reduction. Where the benefit of reducing recast timers for Drain and Aspir is not fully realized (more often than you think) and the resist rate is suspect, the opportunity cost of recast reduction is usually additional INT, and it might not be a cost worth incurring depending on the actual trade-off.

Thursday, June 17, 2010

You think these new abilities and traits for WAR actually matter?

When was the last time there was a good version update teaser? I can't even remember since I'm usually apathetic toward version updates in general, but the latest teaser concerning "Adjustments of the Job Persuasion!" has provided a lot of fodder for blabbing and speculation, and here are my two Zimbabwean cents on the warrior job abilities and traits.

Restraint (level 77)

Ability delay: 10 minutes
Effect duration: 5 minutes
Description: "Enhances your weapon skill power with each normal attack you land, but prevents you from dealing critical hits."

Would you use this for zerging when you have Mighty Strikes? No. I can't imagine a situation where the WS bonus would exceed the substantial benefit of 100% critical hit rate.

Consider the implications of 0% critical hit rate for the purposes of great axe WS spam without MS. Then, consider the implications of 0% critical hit rate for Raging Rush (doubtful that the loss of critical hits wouldn't apply to the WS), leaving King's Justice as the only choice for WS spam. Will the trade-off be worth it? I doubt it. If anything, I wouldn't be surprised if the bonus applied only to the first hit of any WS, and I doubt you can cancel Restraint and keep the WS bonus.

The loss of critical hits and actual WS bonus notwithstanding, this naturally favors one-handed weapons relative to two-handed weapons in theory, except that dual-wielding generally sucks relative to using great axe (or polearm when applicable) and one-handed weapon skills not named Rampage generally suck, too.

Critical Attack Bonus (level 78)

(Also Thief level 78, Dancer level 80)
Description: "Improves power of critical hits."

If this is like a permanent Brave Grip (estimated 3.5% increase in critical hit damage), it's better than nothing. Technically it's "good," but it's not going to change anything about how to "play" warrior.

Of course this bonus has to be introduced alongside Restraint, which is likely to be a spurious JA.

Fencer (level 45)

(Also Beastmaster level 80, Ranger level 80)
Description: "Increases rate of critical hits when wielding with the main hand only. Grants a TP bonus to weapon skills."

The description and the jobs for which this is available seem to imply that the rate increase applies to one-handed weapons only without dual-wield. Now why would anyone want to do that? Tanking!?

It's way more relevant for jobs subbing WAR at level 90 and beyond. But watch the rate increase be like 1%.

Shield Defense Bonus (level 80)

(Also Paladin level 77)
Description: "Reduces damage taken when blocking an attack with a shield."

Joke.

Does INT affect Drain accuracy?

(Correction: 06/18/2010. I meant /SCH instead of /DRK toward the end. I've gone mental...)

A while ago, I asserted that INT "seems likely" to affect the accuracy of Aspir and, by implied analogy, Drain, but I had absolutely nothing on which to base this assertion. Not quite as baseless is assuming that, since then, there has been absolutely no evidence presented anywhere to support or refute that assertion.

Some problems with getting data to show whether INT affects Drain accuracy

Why do I assume that? I'm not trying to be a hater and talk shit, as ignorance about this is not on the level of, say, ignorance about the party-based, hidden latent effects of curry food items.

At least where examining the effect of INT on Drain accuracy is concerned, one problem is that if you're in a situation where you have good reason to believe Drain accuracy isn't "capped," you wouldn't want your HP to be low enough to allow you to obtain the actual quantity of HP taken with Drain. (Low-level beetles and worms are not acceptable targets for examining Drain accuracy with level 75 jobs, and EM+ worms are not easily accessible... yet.)

Another problem, related to the first, is that the distribution of Drain still isn't known today, and "censored" values of HP drained don't help to provide insight into that. ("Censoring" is one way to describe the fact that Drain values reported in chat logs are based on maximum HP; any HP restored beyond your maximum HP does not count in the final chat log figure, so at best you only know at least how much you drained, not its actual value or whether your Drain was resisted.)

These are some of the problems that hamper data collection.

A way to avoid these problems?

If only there were a stationary target that didn't fight back, that could allow you to suppress your HP safely, for which Drain accuracy has the possibility not to be capped at level 75, and for which you could gather Drain data without interference from other players...

Zvahl Fortalices definitely satisfy the first condition, as they do not move. They also satisfy the second condition, as two of the fortalices deeper into Castle Zvahl Baileys (S) do not have any mobs wandering nearby, including Dark and Ice Elementals. Zvahl Fortalices definitely seemed like a promising candidate for Drain testing, so I actually set out to get some data to determine if INT has some accuracy effect.

That left the third and fourth conditions. Of course, I had no idea if it would even be possible for Drain accuracy not to be capped, but that would be part of the data collection anyway, with the hope that my Drain accuracy could be decreased enough to raise the corresponding resist rate above the (assumed) 5% resist rate floor. As for people doing skill-ups, I don't really begrudge them trying to maximize their skill-up opportunities, as this method of skill-up is liable to be "nerfed" come the June 21 version update.

Goals of data collection, some assumptions, and results

My way of determining whether INT has an effect on Drain accuracy is based on a simple two-sample comparison of the occurrence of resists, one sample based on "low" INT (71 in my case), and the other based on "high" INT (121). Again, this was based on the hope that my resist rate would not be floored (at 5%) for the low-INT case. This, in turn, is based on the assumption that the resist rate is floored at 5% and rises with decreasing magic accuracy. If the data shows the resist rate being above 5% for low INT, I conclude my Drain accuracy isn't capped for low INT. (That alone would not show that INT has an effect on accuracy; I would need the second sample under high INT as well.)

But what is considered a resist? Similar to the Aspir data collection I cited previously, it would be necessary to get some sense of the distribution of unresisted Drain values, with any low Drain values set "far enough" apart from the bulk of the data considered occurrences of a resist. This is the main assumption concerning the interpretation of the data (but a reasonable one).

Now, what about the other assumptions? I merely state some of them here because I simply was not interested in testing them, and I didn't collect enough data to test these assumptions anyway.
  • No differences among Zvahl Fortalices that could affect the results. This is a catch-all assumption concerning possible differences in magic evasion, INT, etc., but I don't think they exist (otherwise, fuck you, SE). If it could be shown that two Fortalices have two different base INT values (even if only a +1 INT difference), you would have to wonder about other possible confounders like level difference as well (can't assume these are level 75, etc.).
  • Even if there is a bonus to Drain on Fortalices, similar to a MAB bonus for elemental magic, it should still be possible to tell the difference between a resist and a non-resist. Bio II initial damage shows there is a MAB bonus, but even if there is a similar bonus for Drain (not MAB-related, of course), it shouldn't affect one's ability to distinguish between resists and non-resists.
  • The equipment bonuses (or penalties) aside from +50 INT for the "high INT" case have no effect on the accuracy of Drain. Now, obviously, I didn't put on equipment with dark magic skill or magic accuracy (or use a Dark Staff or Pluto's Staff), leaving only base attribute bonuses and penalties. Now, if you think MND and CHR actually have an effect on Drain accuracy, I'd like to hear the justification. If there are hidden accuracy effects on my equipment, that could be a problem, though.
  • Dark weather and Darksday have no effect on the accuracy of Drain. This is not really an assumption, as I didn't collect any data during Darksday or under Dark weather, but I just mention it anyway as they are potential confounders.
Now, the results. First, dot plots of the results as an initial visual impression, under low INT (top dot plot) and under high INT (below), suggest a minimum and maximum non-resisted Drain given 269 dark magic skill:


Before jumping into a discussion of the maximum and minimum Drain values, based solely on the criterion of a resist I described earlier (low values of Drain set "far enough" apart from the bulk of the data), there are 10/65 resists under low INT and 2/63 under high INT, so this data appears to provide good evidence that increasing INT increases the accuracy of Drain, especially if you think that for the 121-INT case, the resist rate was floored at 5%. I see no reason to be pedantic and report a confidence interval or p-value.
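
For anyone who does want the pedantic version, the two resist counts above are enough for a Fisher exact test. This is not something done in the post; it is a minimal sketch that takes the "far enough apart" criterion at face value and treats each cast as an independent trial, both of which are assumptions.

```python
from math import comb

def hypergeom_pmf(k, row1, row2, col1):
    """P(k of the col1 'resist' outcomes fall in row 1) with all margins fixed."""
    return comb(col1, k) * comb(row1 + row2 - col1, row1 - k) / comb(row1 + row2, row1)

def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Probability of seeing at least `a` in the top-left cell if the row
    label (low vs. high INT) had no bearing on the column (resist or not).
    """
    row1, row2, col1 = a + b, c + d, a + c
    k_max = min(row1, col1)
    return sum(hypergeom_pmf(k, row1, row2, col1) for k in range(a, k_max + 1))

# Counts from the post: 10 resists in 65 casts (low INT), 2 in 63 (high INT).
p_value = fisher_one_sided(10, 55, 2, 61)
print(f"one-sided p = {p_value:.4f}")
```

The one-sided p-value comes out at roughly 0.017, in line with the informal reading that the difference in resist counts is unlikely to be chance.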

Under both low INT and high INT, the Drain maximum (unresisted) was 288 under 269 dark magic skill. It has been said that the maximum Drain and Aspir are 300 and 100, respectively, without any potency-enhancing gear (anecdotal discussion on BG), so if it can be shown that the Drain maximum is 288 under 269 skill for other mobs, you have to wonder how Drain potency actually scales with dark magic skill.

The location of the unresisted Drain minimum is less straightforward. One possibility is that it could be at 144 HP, which would be exactly half of the unresisted Drain maximum. It would be interesting if this relationship between maximum and minimum actually holds for all levels of dark magic skill (with other potency-enhancing factors presumably serving only to affect scale). One way to check this would be with /SCH as a subjob.

And what of the relationship between the HP value of a resist and a non-resist? I actually got 29 HP under the low-INT case, and it's difficult to describe this relationship with small samples. But small samples are enough to reach the major conclusions.

Conclusions

Based on the criterion that low values of Drain set "far enough" apart from the bulk of the observed data should be considered resists, additional INT appears to increase the accuracy of Drain. Ideally, the data collection should be repeated in an attempt to replicate this result.

Data collection could be performed using /SCH as a subjob. +50 INT (if it could be achieved) should still be able to manifest in the form of increased accuracy (provided INT does have an effect), and further exploration of the relationship between dark magic skill and unresisted Drain maximum could be done, along with that between (unresisted) Drain maximum and minimum.

Monday, June 14, 2010

Probability distributions associated with WS spam

(Correction: 06/15/2010. I thought Sekkanoki lasted one minute but after reading up on it, it lasts either for one minute or until the next weapon skill, whichever comes first. So, my discussion of the consequences of TP overflow elimination now refers to a hypothetical "Sekkanoki 2.0," which would reduce the TP cost of all weapon skills to 100 TP.)

Who cares about "TP overflow"?

"TP overflow" seems to be the de rigueur term referring to any landed hits that don't contribute to spamming weapons every time 100+ TP is accumulated. TP overflow is inevitable when more than one landed hit per attack round is possible, so it's not like anyone can do much about it except attempt to minimize it by spamming WS. This absolutely does not mean it is harder to "cope" with TP overflow using a multi-hit weapon (when weapon delay is the same as a non-multi-hit alternative). Rather, slack effort means squandering the benefit of the more rapid TP gain of the multi-hit weapon.

So why care about TP overflow? One argument is that it should be "accounted for" when doing item comparisons pertaining to damage efficiency, possibly to be more accurate.

Consider, for example, Soboro Sukehiro, which is considered to average 1.9 attacks per attack round, with the probability of two attacks being .5 and that for three, .2. Given 100% hit rate and 0% DA rate, it takes 3.46553 attack rounds, on average, to be able to execute a weapon skill in six hits, with the actual average number of hits being 6.584507 (note that 6.584507/3.46553 = 1.9 attacks per round), so almost 9% of the hits occur in excess of the target number of hits.
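
The two averages above can be reproduced with a short recursion on the number of hits landed so far. This is a sketch under the stated conditions (100% hit rate, 0% DA); P(1 hit) = .3 is inferred as the remainder of the quoted probabilities, and the function and constant names are made up for illustration.

```python
from functools import lru_cache

# Hits landed per attack round for Soboro Sukehiro at 100% hit rate, 0% DA:
# P(2) = .5 and P(3) = .2 are from the post; P(1) = .3 is the remainder.
HITS_PER_ROUND = {1: 0.3, 2: 0.5, 3: 0.2}
TARGET_HITS = 6

@lru_cache(maxsize=None)
def expected_rounds(hits_so_far=0):
    """Average number of further rounds until TARGET_HITS hits have landed."""
    if hits_so_far >= TARGET_HITS:
        return 0.0
    return 1.0 + sum(p * expected_rounds(hits_so_far + n)
                     for n, p in HITS_PER_ROUND.items())

avg_hits_per_round = sum(n * p for n, p in HITS_PER_ROUND.items())
avg_rounds = expected_rounds()
# Wald's identity: expected total hits = expected rounds x mean hits per round.
avg_hits = avg_rounds * avg_hits_per_round

print(round(avg_rounds, 5), round(avg_hits, 6))  # 3.46553 6.584507
```

The excess fraction is then (6.584507 - 6) / 6.584507, which is the "almost 9%" figure.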

What if somehow there was a way to allocate the TP from those excess hits toward additional weapon skills? Well, Samurai has a level 60 job ability called Sekkanoki, which limits the cost of the next weapon skill to 100 TP. This seems analogous to job abilities like Elemental Seal or Divine Seal, which last for 1 minute or until a spell is used, whichever comes first. But what if Sekkanoki limited the cost of all weapon skills to 100 TP while active, say, one minute? This would effectively cause a re-allocation of TP toward future weapon skills. Let's call this "Sekkanoki 2.0."

If one were under the effect of "Sekkanoki 2.0" over a very long time interval, effectively all of the TP would go toward weapon skills, and so the average number of hits approaches 6. Since the average number of attacks per round is 1.9, then the average number of attack rounds approaches 3.157894737, which seems like a fairly significant reduction in average attack rounds until you realize that the concomitant "loss" of TP damage that results from TP overflow (which is eliminated under Sekkanoki 2.0 over an infinite period of time), along with the slight loss of WS damage, offsets the benefit of increased WS frequency. (Also, the proposed Sekkanoki 2.0 lasts for 1 minute out of 5, which means that some TP overflow is inevitable for finite time periods, so it's not like Sekkanoki 2.0 has this tremendous effect.) So, the argument about accounting for TP overflow is a bit overblown (not that you shouldn't, however).

So why care about TP overflow? Since there is no Sekkanoki 2.0, which itself would be a limited tool, you can't do anything about it, so why worry about it? Maybe it's more about players wanting to appear to be "clever" about a not-very-subtle consequence of multi-hit weapons, like asserting that the probability of TP overflow for a given WS is high. (One could easily retort that for Soboro, the fraction of excess hits over total hits would be around 9%.)

But, you know, I'm all about meaningless stuff, so let's finally get into how to define the probability distribution of excess hits (that contribute to TP overflow) associated with WS spam (this would be the same as the probability distribution of the number of hits you end up with under the condition that you spam weapon skills).

Excess hits contributing to TP overflow and the corresponding probability distribution

Let E denote the number of hits in excess of those that contribute to the 100+ TP (in six hits) required to spam a WS. Let's continue with the example of Soboro. For any given attack round, the probability of n landed hits is πn, where n = 0, 1, 2, 3. These probabilities are straightforward to calculate. Not as straightforward to calculate is the probability mass function for E. An extremely tedious approach is to list all the possible combinations of attack rounds that result in 6 or more hits—the possibilities being 6, 7, or 8, which correspond to E = 0, 1, and 2, respectively. This approach requires knowing what to count (all the possible ways to get E = 0, 1, and 2), how to count (combinatorics), and knowing the closed-form expression for the sum of an infinite series, as the possibility of missing hits with non-100% hit rate means there are an infinite number of possible outcomes. (For a given combination of attack rounds leading to 100 TP, there could possibly be zero attack rounds that yield zero landed hits, one attack round that yields zero landed hits, two attack rounds that yield zero landed hits, and so on. These attack rounds are independent of those that yield hits.)

After spending more time than I care to admit, I obtained the p.m.f. of E, which is


This expression is quite unsightly, and rather useless. Not only is it useless merely because knowing the probability of TP overflow is useless, it also is useless because it refers only to the case where 6 hits are required to attain 100 TP. It requires no imagination to see that an expression for a dual-wield situation would be ghastly. It also is useless because you don't even need knowledge of this p.m.f. to obtain the average number of hits in the process of getting to 100 TP (as I have shown repeatedly in the past). But there it is...

Again, using the Soboro example, P(E = 0) = 0.522579, P(E = 1) = 0.370335, and P(E = 2) = 0.107086, and thank goodness the probabilities sum to 1. The probability of "TP overflow" for a given WS with Soboro is almost 50%... not that you can really do anything about it. The correct response is, "who gives a shit?"
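
The three probabilities quoted above can be checked numerically without the closed-form p.m.f. The sketch below walks forward over hit totals, again assuming 100% hit rate and 0% DA with P(1 hit) = .3 inferred as the remainder; the function name is hypothetical.

```python
HITS_PER_ROUND = {1: 0.3, 2: 0.5, 3: 0.2}  # Soboro, 100% hit rate, 0% DA
TARGET_HITS = 6

def excess_hit_pmf(hits_per_round, target):
    """Return {E: probability}, where E = final hit total minus target.

    Assumes every round lands at least one hit. Fully missed rounds only
    delay the outcome, so they can be dropped by renormalizing beforehand.
    """
    reach = {0: 1.0}   # reach[h] = probability of ever sitting at exactly h hits
    final = {}         # final[e] = probability of finishing with e excess hits
    for h in range(target):              # unfinished totals, in increasing order
        p_here = reach.get(h, 0.0)
        for n, p in hits_per_round.items():
            total = h + n
            if total >= target:
                final[total - target] = final.get(total - target, 0.0) + p_here * p
            else:
                reach[total] = reach.get(total, 0.0) + p_here * p
    return final

pmf = excess_hit_pmf(HITS_PER_ROUND, TARGET_HITS)
for excess in sorted(pmf):
    print(excess, round(pmf[excess], 6))
```

This gives 0.522579, 0.370335, and 0.107086 for E = 0, 1, and 2, matching the figures from the formula.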

Even worse: the probability distribution of the number of attack rounds

Let R denote the (total) number of attack rounds that results in 100 TP. Again, with the Soboro example, R = 2, 3, 4, ..., and there is not much hope for an elegant formula for the probability distribution, because to obtain such a formula "by hand," one needs again to enumerate all the possible outcomes associated with each event. I only got as far as R = 3 before I quit.


Again, using the Soboro example, P(R = 2) = .04, and P(R = 3) = .519. This is consistent with the average number of attack rounds being ~3, but if you already had the average number of attack rounds, why do you need the corresponding probability distribution? Useless!

A better approach for calculating these probability distributions: Markov chains

Perhaps I'll discuss this in a future entry. Aside from the fact that knowing the above probability distributions is quite useless (average weapon skill TP, average number of rounds, and average number of hits, among other things, are all easily obtained without any knowledge of these probability distributions), the Markov chain approach to obtaining these is much faster and far superior when no symbolic formulas are required. The interpretation of Markov chain output and manipulation is also much easier than it is with formulas for a specific case. It is also the only realistic way where dual-wielding is concerned, as you would have to be crazy even to consider deriving closed-form expressions for the probability distributions for that situation. It is so easy to make a mistake with a binomial or multinomial coefficient here or there that I have to admit I didn't obtain the above expressions entirely "by hand," but with the help of Mathematica, which is quite handy for dealing with symbolic math.
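
A minimal sketch of what the Markov chain approach could look like, pending that future entry: the state is the running hit total, each attack round moves probability mass forward, and reaching the target is absorbing. This is an illustration under the same Soboro assumptions as before (100% hit rate, 0% DA), not the author's own implementation, and the function name is made up.

```python
def rounds_pmf(hits_per_round, target, max_rounds):
    """P(R = r) for r = 1..max_rounds via a Markov chain on the hit total.

    States 0..target-1 are transient hit totals; reaching `target` or more
    is absorbing. `hits_per_round` may include 0 (a fully missed round).
    """
    state = [0.0] * target
    state[0] = 1.0                       # start at 0 hits right after a WS
    pmf = {}
    for r in range(1, max_rounds + 1):
        nxt = [0.0] * target
        absorbed = 0.0
        for h, p_here in enumerate(state):
            for n, p in hits_per_round.items():
                if h + n >= target:
                    absorbed += p_here * p
                else:
                    nxt[h + n] += p_here * p
        pmf[r] = absorbed                # mass absorbed during round r
        state = nxt
    return pmf

soboro = {1: 0.3, 2: 0.5, 3: 0.2}        # 100% hit rate, 0% DA, as in the post
pmf = rounds_pmf(soboro, target=6, max_rounds=6)
print({r: round(p, 4) for r, p in pmf.items()})
```

With these inputs the chain returns P(R = 2) = .04 and P(R = 3) = .519, and the mean of the distribution is the same 3.46553 rounds obtained earlier. Handling misses only requires adding a 0 key to the per-round distribution and raising `max_rounds`.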

Friday, June 11, 2010

How do you account for the effect of Jump?

Modeling the effect of Jump on damage rate isn't too bad provided that you invoke the following major simplifications: let both Jump and High Jump have the same amount of merit upgrades, and treat TP from Jumps as accumulating toward a weapon skill independently of TP from auto-attack. In this way, we can estimate the proportion of attack rounds that Jumps contribute to the average number of attack rounds required to accumulate 100 TP (for a weapon skill). We need this proportion to estimate the time savings from using Jumps that contribute to increasing WS frequency.

Suppose that there are 5 merits both in Jump and High Jump. This means that in a 150-second time frame, two Jumps and one High Jump can occur, for a total of three jumps. Also, do not (yet) assume that attack rounds from Jumps are equivalent to those from auto-attack in terms of multi-hit "capability" (from double attack, multi-hit weapons, etc.). It is then possible to obtain a general expression for the denominator required to obtain the respective proportions of attack rounds that auto-attack, Jump, and High Jump contribute to the average number of attack rounds to 100 TP:


The implied units for this denominator are rounds per WS (with spamming of TP after 100 TP is achieved).

The first term (factors specific to it denoted with the subscript 1) in the expression accounts for how many weapon skills from auto-attack can occur in 150 seconds when accounting for a weapon skill delay of two seconds. T1 denotes the time per attack round at 0% haste, and H denotes the haste level as an integer. E[R] in general denotes the average number of attack rounds to 100 TP, and usually, E[R1] = E[R2] = E[R3] except in the case of virtue weapons, apparently (the only reason the equality wouldn't hold is that virtue weapons apparently do not work with Jumps).

The second term accounts for how many weapon skills from Jump (two Jumps in 150 seconds, remember) can occur in the previously specified 150-second time frame (necessarily a fraction), and the third term accounts for how many weapon skills from High Jump can occur in 150 seconds.

With this denominator expression, it should then be obvious how to obtain the actual proportions of attack rounds that each of auto-attack, Jump, and High Jump contribute to the average number of attack rounds to 100 TP. For example, the proportion of attack rounds that High Jump contributes to the average number of attack rounds to 100 TP is


These proportions can then be used to obtain an estimate of the adjusted average of the number of attack rounds to 100 TP accounting for Jump effects (this is a weighted average). Of course, if E[R1] = E[R2] = E[R3] = E[R], then the weighted average simplifies to E[R].

However, the adjusted average of attack rounds cannot be multiplied by a simple "time per attack round" conversion factor to get the average time to 100 TP. Recall that TP from Jumps is treated as independent of TP from auto-attack as a simplifying assumption. Instead, the aforementioned proportions must be used to obtain a weighted average of the time "per cycle" of 100 TP generated, with 2E[R2] seconds for Jump and 2E[R3] seconds for High Jump (ignoring stacking of Jump and High Jump; the units of E[R] are attack rounds "per cycle" of 100 TP generated) and E[R1]T1(100-H)/100 seconds for auto-attack.

Mechanistically, we should already recognize before doing modeling that the dominant effect of Jumps is to increase WS frequency by reducing the time required to generate 100 TP, except when T1(100-H)/100 < 2 seconds. With modeling, it is possible to estimate the reduction (both absolute and relative) in average time to generate 100 TP from Jumps. From modeling, it is also possible to account for differences in damage between auto-attack hits, Jump, and High Jump (you don't use haste equipment for Jumps, right?), but this effect is slight compared to the effect on WS frequency and will not be accounted for in future posts.

A real great katana comparison

(Correction: 06/13/2010. Additional comments are in italicized red. Incorrect statements are crossed out.)

Earlier, I blabbed about the consequences of delay associated with the use of weapon skills in terms of modeling damage output mathematically, but did not justify how much delay should be specified because I didn't know how much the following attack round (after a WS) is delayed. Fortunately, I came across this presentation of results and discussion quantifying the amount of delay that is effectively added to the attack round following the use of a job ability (or weapon skill). The results of "stacking" job abilities aside (read for yourself), it is obvious that a two-second delay for the use of a weapon skill must be accounted for, at the minimum, when attempting to model theoretical damage output. (Using other job abilities while engaging an enemy would also have an effect on damage output, but the use of weapon skills, if spammed, is the dominant factor contributing to job ability delay. Consequently, many of my previous posts, which ignored this delay, likely have led to incorrect conclusions.)

For now, though, I think it would be instructive to show how much a weapon skill delay of two seconds obviously hampers the modeling of damage output. But I don't want to waste my time doing the "before" analysis, so I base my "after" analysis on the conditions set forth in this comparison of great katanas (covering Hagun, Soboro Sukehiro, Kurodachi, and Radennotachi). There are some problems with it, especially with the implied use of /DRG (low DA rates but not accounting for the effect of Jumps, wut). Therefore, I do not merely reuse the computed figures given but provide my own in some cases. In any case, it may help to review that comparison and mine side by side as I wish not to waste my time rehashing said conditions.

Calculating WS frequency: Zanshin is relevant for main job SAM?

The effect of Zanshin on weapon skill frequency is something I had not considered in my previous posts, and I am kind of surprised the activation rate is apparently rather high for samurai as the main job. Recall that in the October 19, 2006 version update, "the hit rate of the extra attack [was] increased." Moreover, there is very good evidence the Zanshin activation rate can be considered 45% for main job and 25% for subjob, with the hit rate bonus the result of +35 accuracy (source). Unfortunately, it is more difficult to furnish evidence as to how Zanshin interacts with double attack for auto-attack purposes, but it seems likely that Zanshin has a lower "priority" than double attack (if double attack processes, Zanshin doesn't, and if it doesn't, Zanshin can), so I'll just run with that. This means that accounting for Zanshin doesn't really matter all that much for multi-hit weapons, but since I do it for Hagun and Radennotachi, I might as well do it for the other two.

To start off with my "after" analysis (remember I want to show the effect of weapon skill delay not previously considered on an analysis that incorrectly ignores it), going back to the "before" analysis I cited previously, I should first point out that pDIF is apparently ignored in favor of a bogus assumption of a "baseline" 35:65 ratio of melee damage to WS damage for Hagun.

Since we are talking about theoretical damage output, it is nonsense to assume such a ratio. The baseline assumption is bogus, not that 35:65 may be observed in practice. If 35:65 is observed, surely average auto-attack damage and average WS damage are also observed (from parser output)! Use those values instead to back-calculate an "average" pDIF for both auto-attack and WS damage that should be fixed across all great katanas. The differences in WS frequency and weapon base damage will then account for the differences in the ratio of melee damage to WS damage, holding pDIF constant.

Anyway, I will return to the pDIF issue later. After accounting for the effect of Zanshin, I obtain the following averages for attack rounds from WS use to 100 TP, auto-attack hits in the process of getting to 100 TP after WS use, and the "effective" hit rate (landed hits per attack round), which encompasses the effects of accuracy, double attack, and Zanshin.

Weapon | Average no. of rounds | Average no. of hits | Effective hit rate
Hagun | 5.05945 | 5.10742 | 1.00948
Soboro Sukehiro | 3.11051 | 5.58467 | 1.79542
Kurodachi | 3.99375 | 5.32642 | 1.33369
Radennotachi | 5.05945 | 5.10742 | 1.00948

I am aware of the apparent absence of Brutal Earring (5% DA) for Soboro (but why use a Pole Grip then, implied with the stated 2% DA?), replaced by a mysterious source of accuracy +5, and accounted for those differences. I gave the benefit of the doubt, so to speak, with Soboro (94% hit rate after accuracy +5), even though it could easily be argued that, across all merit mobs encountered, the average hit rate could actually be closer to 93.5%.

My effective hit rate figures agree with the previous analysis more or less, but I do not compute effective hit rate directly. Instead, I compute it, as a kind of check on my calculations, after computing the average number of attack rounds and average number of hits (example: 5.10742/5.05945 = 1.00948) to make sure I didn't make any errors calculating the average number of attack rounds.
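
The check described above is just a division per weapon. A small sketch using the averages from the table (names are illustrative only):

```python
# (average rounds, average hits) per weapon, copied from the table above.
AVERAGES = {
    "Hagun": (5.05945, 5.10742),
    "Soboro Sukehiro": (3.11051, 5.58467),
    "Kurodachi": (3.99375, 5.32642),
    "Radennotachi": (5.05945, 5.10742),
}

def effective_hit_rate(avg_rounds, avg_hits):
    """Landed hits per attack round, averaged over a WS cycle."""
    return avg_hits / avg_rounds

for weapon, (rounds, hits) in AVERAGES.items():
    print(f"{weapon}: {effective_hit_rate(rounds, hits):.5f}")
```

If a ratio came out far from the directly computed effective hit rate, that would point to an error in the average number of rounds or hits.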

As always, the average number of rounds can be converted to the average time to accumulate 100 TP, but now the time between weapon skills must also account for the two-second weapon skill delay discussed previously. (This will be done at the end of the post.)

Accounting for average TP for the use of Tachi: Gekko

The previous analysis assumes maximum fSTR for each of the weapons (16, 12, 15, and 17 for Hagun, Soboro, Kurodachi, and Radennotachi, respectively), which would appear to be reasonable given the implied high STR modifier bonus used for Tachi: Gekko (152*.75*.83 = 94.62, which is close to the given 94). As mentioned previously, pDIF is completely ignored, but based on the attack bonus of Tachi: Gekko, it is reasonable to assume an average pDIF of 2.3 (based on a symmetric pDIF distribution between 1.9 and 2.7).

The only thing left is calculating the fTP bonus of the first hit for Tachi: Gekko, which requires calculation of average TP for each weapon when a one-hit weapon skill is used, accounting for double attack. This, in turn, requires knowledge of the probability distribution of TP return from a one-hit WS and the corresponding TP values, which is the same regardless of weapon.

This would seem straightforward except for the observation of 2-TP return with one-hit weapon skills (source), which would suggest that for weapon skills, Zanshin can occur on the first hit independent of the double attack (Zanshin still can't occur for the double attack hit, presumably). The presence of Zanshin effectively "reallocates" the probability of missing the first hit (and losing the full TP return of 16.7), which is 5% most likely, so ignoring the Zanshin effect for a one-hit weapon skill results in negligible error for TP return (but not necessarily WS damage).

Weapon            Average TP per WS (my calculation)   fTP bonus of 1st hit (with Gorget effect)
Hagun             101.27806                            1.9829879
Soboro Sukehiro   109.24816                            1.6914005
Kurodachi         104.93533                            1.6779229
Radennotachi      101.27806                            1.6664939

Note that average TP shouldn't be truncated because these averages are themselves based on the actual truncated TP figures to begin with (assumed 16.7 TP per main WS hit and auto-attack hits and 1.4 TP for off-hand WS hit).

Accounting for average Tachi: Gekko damage: ignore Zanshin?

Given 91% hit rate for any double attack hits (7% DA rate) for Tachi: Gekko (95% otherwise), the average number of hits per weapon skill is .95 + (.91)(.07) = 1.0137. Accounting for the 45% Zanshin rate, this average rises to 1.035075, of which .95 still corresponds to the first hit (which receives the fTP bonus), so 0.085075 of the hits in the average WS have an fTP = 1. The effect of Zanshin is, therefore, like adding roughly 2.35% DA (0.021375/.91), which, for the purposes of Tachi: Gekko, constitutes approximately a 1.1-1.3% increase in average WS damage. (This is given the conditions stated in the "before" analysis.) Whether or not this is accounted for (I will account for it), the effect of Zanshin very slightly "favors" weapons with worse WS "secondary" hit damage (compared to other factors), so it can be ignored for convenience.
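The hit-count arithmetic above can be sketched in a few lines. This is only my reading of the figures: the names are made up for illustration, and I am assuming the Zanshin re-attack lands at the same 95% rate as the first hit, which is what reproduces the 1.035075 figure.

```python
# Average hits per Tachi: Gekko use, following the figures in the text.
FIRST_HIT_RATE = 0.95   # hit rate of the first (fTP-boosted) hit
DA_HIT_RATE = 0.91      # hit rate of a double attack hit
DA_RATE = 0.07          # double attack rate
ZANSHIN_RATE = 0.45     # Zanshin rate


def average_hits(zanshin: bool) -> float:
    """Expected number of landed hits in a one-hit weapon skill."""
    hits = FIRST_HIT_RATE + DA_HIT_RATE * DA_RATE
    if zanshin:
        # Assumption: Zanshin triggers only on a missed first hit and the
        # re-attack lands at the first-hit rate, without the fTP bonus.
        miss = 1.0 - FIRST_HIT_RATE
        hits += miss * ZANSHIN_RATE * FIRST_HIT_RATE
    return hits


base = average_hits(False)
with_zanshin = average_hits(True)

# Hits that only get fTP = 1 (everything except the first hit).
secondary_hits = with_zanshin - FIRST_HIT_RATE

# Extra DA rate that would add the same number of hits as Zanshin does.
equivalent_da = (with_zanshin - base) / DA_HIT_RATE

print(f"hits without Zanshin: {base:.6f}")
print(f"hits with Zanshin:    {with_zanshin:.6f}")
print(f"secondary hits:       {secondary_hits:.6f}")
print(f"equivalent DA:        {equivalent_da:.4%}")
```

Expressing the Zanshin contribution as an equivalent DA rate just makes it easier to compare against the DA you already have.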

A "fatal" flaw: consequences of the effect of haste with weapon skill delay

Because weapon skill delay, which is a fixed value (consider it two seconds), exists, the relative benefit of haste (or other forms of delay reduction) is higher for weapons with lower weapon-skill frequency compared to weapons with higher weapon-skill frequency. It follows that a weapon with higher weapon skill frequency CAN actually be "worse," on average, than a weapon with lower weapon skill frequency depending on the level of haste!

One way to think of this is to consider an arbitrary time frame during which weapon skills occur. The time associated with the WS frequency might be reduced with haste, but there is always an absolute weapon skill delay tacked on. Even if haste goes to 100% (meaning the time associated with WS frequency goes to 0) and you still decide to use WS for some reason, the sum of the absolute weapon skill delay for the weapon with higher WS frequency will be higher than or equal to that for the weapon with lower WS frequency (WS frequency is rendered irrelevant if it takes zero time to build TP toward a WS), so the weapon with higher WS damage wins out in terms of WS damage output.

A "practical" consequence is that for "zerging" situations where maximum haste is involved, low-damage, multi-hit weapons (on average) can be worse than standard weapons. Similarly, multi-hit weapons may not be that good for meriting situations.

The "fatal flaw" with the "before" analysis is the unstated assumption that the haste level doesn't matter across weapons, so that the "pecking order" of great katanas always holds. Because weapon skill delay is not accounted for, the analysis does not hew to what is experienced in practice.

Repeat the analysis instead with ~65% haste (Hasso, Haste spell, double March, 20% equipment haste) along with the weapon skill delay of two seconds. The following figures are the result of a "per weapon skill" perspective, using average auto-attack pDIF 1.15 and average WS pDIF of 2.3. (Overwhelm 5/5 also used.)

Weapon            Avg. TP dmg   Avg. WS dmg   Time per WS   Dmg/sec   TP:WS dmg
Hagun             510.99763     910.7053192   15.281 s      93.04     36:64
Soboro Sukehiro   333.96349     619.4514917   10.165 s      93.79     35:65
Kurodachi         490.03068     744.1553438   12.810 s      96.35     397:603
Radennotachi      593.22713     835.6201688   15.281 s      93.50     415:585

Given 65% haste, relative to Hagun, Kurodachi is about (96.3474/93.0370 - 1)100% = 3.56% more efficient, and Soboro, about (93.7931/93.0370 - 1)100% = 0.81% more efficient. Radennotachi is about 0.5% more efficient. This jibes with the observation that Soboro is not really any better than Hagun in a typical merit situation.
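For anyone who wants to check the rate-of-damage and relative-efficiency figures, here is a small sketch that recomputes them from the 65% haste table. I am reading "Avg. TP dmg" as the average damage dealt while building TP for one weapon skill; the dictionary and function names are just illustrative.

```python
# (avg TP-phase damage, avg WS damage, time per WS in seconds) at ~65% haste,
# copied from the 65% haste table.
WEAPONS_65 = {
    "Hagun": (510.99763, 910.7053192, 15.281),
    "Soboro Sukehiro": (333.96349, 619.4514917, 10.165),
    "Kurodachi": (490.03068, 744.1553438, 12.810),
    "Radennotachi": (593.22713, 835.6201688, 15.281),
}


def damage_per_second(tp_dmg: float, ws_dmg: float, seconds: float) -> float:
    """Total damage of one TP-building phase plus one WS, per second."""
    return (tp_dmg + ws_dmg) / seconds


dps = {name: damage_per_second(*row) for name, row in WEAPONS_65.items()}

# Percent difference in damage per second relative to Hagun.
relative = {
    name: (rate / dps["Hagun"] - 1.0) * 100.0 for name, rate in dps.items()
}

for name in WEAPONS_65:
    print(f"{name:16s} {dps[name]:7.2f} dmg/sec  {relative[name]:+.2f}% vs Hagun")
```

Small differences in the last digit against the quoted figures come from the table's time per WS being rounded to three decimals.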

Now, what happens given 80% haste?

Weapon            Avg. TP dmg   Avg. WS dmg   Time per WS   Dmg/sec   TP:WS dmg
Hagun             510.99763     910.7053192   9.589 s       148.26    36:64
Soboro Sukehiro   333.96349     619.4514917   6.666 s       143.03    35:65
Kurodachi         490.03068     744.1553438   8.177 s       150.93    397:603
Radennotachi      593.22713     835.6201688   9.589 s       149.01    415:585

Obviously, the damage figures (other than rate of damage) shouldn't change with haste. As they are fixed, changes in relative efficiency (relative to 65% haste) come only from changes in time per WS. The extra 15% haste benefits Hagun relatively more than it does Soboro because of the fixed two-second weapon skill delay. The result here shows that Hagun is more efficient than Soboro in a max-haste situation when spamming WS, which is what you should be doing. For Hagun, 910 damage, on average, in exchange for 2 seconds is better than 511 damage, on average, over 7.589 seconds.
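The way the fixed delay flips the ordering can be sketched as follows. The rescaling rule is my assumption: only the TP-building part of the cycle (time per WS minus the two-second delay) shrinks in proportion to (1 - haste). It does reproduce the 80% haste times in the table from the 65% ones. The names are illustrative.

```python
WS_DELAY = 2.0  # fixed weapon skill delay in seconds, as assumed in the text


def time_per_ws(time_at_ref: float, haste: float, ref_haste: float = 0.65) -> float:
    """Rescale a time per WS measured at ref_haste to another haste level.

    Assumption: only the TP-building part scales with (1 - haste);
    the weapon skill delay stays fixed.
    """
    build_time = time_at_ref - WS_DELAY          # haste-affected part at ref_haste
    unhasted = build_time / (1.0 - ref_haste)    # same part with no haste at all
    return unhasted * (1.0 - haste) + WS_DELAY


# (total damage per WS cycle, time per WS at 65% haste), from the tables.
CYCLES = {
    "Hagun": (510.99763 + 910.7053192, 15.281),
    "Soboro Sukehiro": (333.96349 + 619.4514917, 10.165),
}

for haste in (0.65, 0.80):
    for name, (damage, t65) in CYCLES.items():
        t = time_per_ws(t65, haste)
        print(f"haste {haste:.0%}  {name:16s} {t:6.3f} s  {damage / t:7.2f} dmg/sec")
```

Plugging in other haste values shows where Hagun overtakes Soboro under these assumptions.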

(Correction: 06/13/2010) Incidentally, given 9% DA (the stated condition), Kurodachi is still better than Hagun even with maximum haste, so it is simply better, barring situations where WS damage is the predominant form of damage and WS frequency is irrelevant. As DA increases, however, Hagun eventually becomes better than Kurodachi. This should make sense (but even I overlooked this...) because the "full" benefit of a DA increase is not realized with multi-hit weapons such as Kurodachi, and definitely not with Soboro Sukehiro.

Conclusion

Weapon skill delay, which exists and can be considered to be two seconds, should be considered when doing a theoretical comparison of things related to doing damage.

A major consequence of weapon skill delay is that, as haste increases, weapons with lower WS frequency benefit relatively more than weapons with higher WS frequency. This affects the "correct" choice of weapon for situations where high levels of haste are achieved. For example, even though Soboro Sukehiro may be better than Hagun at low levels of haste, it is inferior at high levels of haste (on average, since there is some inherent variability of WS frequency associated with multi-hit weapons).

(Correction: 06/13/2010) However, it can be shown that Kurodachi is superior to Hagun when WS frequency is a relevant factor (e.g., not relying only on Meditate to generate TP). "Actually better" in theory, however, is contingent on how much base DA is present.

The effects of Zanshin on WS frequency, WS damage (fTP bonus and Zanshin hits), and TP return can be quantified. While the effects of Zanshin given low hit rate were not discussed, the effect of Zanshin can "safely" be ignored for relative comparisons given high hit rates.