The Unbearable Triteness of Preening

Wednesday, June 23, 2010

TP bonus of Fencer

(Edit: This is for WAR at level 75. I didn't consider the possibility of "increasing levels of mastery" bullshit.)

Saw this dumb shit so I thought I would act a dumb shit too by wasting my time figuring this out. (Figuring out the critical hit rate bonus did not waste that much time as I was sleeping while the data was being collected...)

One way to characterize the TP bonus of Fencer is to see how (and whether) the damage of the weapon skill Spirits Within varies with TP in the presence of Fencer and then compare the results to the damage-TP relationship of Spirits Within without Fencer. (Then you assume the findings can be generalized to all weapon skills and hope your observed damage with other weapon skills is consistent with the findings from Spirits Within testing.)

Some preliminary considerations

The problem is that the latter has not been fully characterized to account for flooring, so after retrieving Spirits Within damage observations between 100 and 300 TP without Fencer (using a Trainee Sword with store TP +5 for 6.7 TP and given 1000 current HP), I came up with a formula that matched the observations exactly.

Let D denote Spirits Within damage, H denote current HP and T denote current TP. Then Spirits Within damage appears to follow the piecewise function

This function describes the TP modifier (the fraction) increasing with TP in increments of 1/256 (other increments, such as 1/128 and 1/1024, result in calculated damage values that disagree with the set of actually observed damage values), so perhaps the TP modifiers at 100, 200, and 300 TP are better described as 32/256, 48/256, and 120/256, respectively. (Note: the inner bracket is there to ensure TP values are floored for the purposes of damage calculation, as TP values, while discrete, need not be integers.)

Now, with the same 1000 current HP and 6.7 TP, we can then observe how Fencer affects Spirits Within damage in terms of modifying base TP. We assume the TP bonus is additive and hope it is constant.
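As a rough check on the function described above, here is a minimal Python sketch. Its exact form is an assumption: the TP modifier numerator (in 1/256 units) is taken to rise linearly from 32 at 100 TP to 48 at 200 TP and to 120 at 300 TP, the inner floor is applied to that numerator, and TP is carried in tenths to avoid floating-point trouble. The names `tp_numerator` and `spirits_within` are illustrative, and the optional additive bonus mirrors the assumption just stated.

```python
def tp_numerator(tp_tenths):
    """TP modifier numerator in 1/256 units (assumed piecewise-linear form)."""
    if tp_tenths <= 2000:
        # 32/256 at 100 TP rising to 48/256 at 200 TP
        return 32 + 16 * (tp_tenths - 1000) // 1000
    # 48/256 at 200 TP rising to 120/256 at 300 TP
    return 48 + 72 * (tp_tenths - 2000) // 1000


def spirits_within(hp, tp_tenths, bonus_tenths=0, cap_tenths=3000):
    """Predicted damage; TP plus any additive bonus is capped at 300 TP."""
    effective = min(tp_tenths + bonus_tenths, cap_tenths)
    return hp * tp_numerator(effective) // 256


# No bonus, 1000 HP, at 100.5, 120.6, 147.4, 140.7, 154.1 and 300 TP
for tp in (1005, 1206, 1474, 1407, 1541, 3000):
    print(tp / 10, spirits_within(1000, tp))
```

This prints 125, 136, 152, 148, 156 and 468 for the six TP values. With `bonus_tenths=400` the same function gives 246 at 180.9 TP, whereas truncating TP to a whole number before the lookup would give 242.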

Actual TP bonus determination

The actual TP bonus (assuming it's additive and constant) was determined by a step-wise process of elimination by identifying "candidates" for the TP bonus as follows:

Step 1: For 100.5 TP, the predicted Spirits Within damage is 125. The observed damage is 148. The TP bonus could be 38, 39, 40, 41, 42, or 43. (At this point, 40 is the most plausible candidate as one would expect SE to make the TP bonus a multiple of 5 or 10.)

Step 2: For 120.6 TP, the predicted Spirits Within damage is 136. The observed damage is 160. The TP bonus could be 37, 38, 39, 40, 41, or 42, but only 38, 39, 40, 41, or 42 are consistent with both observations.

Step 3: For 147.4 TP, the predicted Spirits Within damage is 152. The observed damage is 175. The TP bonus could be 35, 36, 37, 38, 39, or 40, but only 38, 39, or 40 are consistent with all three observations.

Step 4: For 140.7 TP, the predicted Spirits Within damage is 148. The observed damage is 171. The TP bonus could be 35, ..., 41, but, again, only 38, 39, or 40 are consistent with all four observations.

Step 5: For 154.1 TP, the predicted Spirits Within damage is 156. The observed damage is 183. The TP bonus could be 40, 41, 42, 43, 44, or 45, but only 40 is consistent with all five observations. Assuming the TP bonus is additive and constant, Fencer adds +40 TP to the current TP for WS damage calculation.
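The elimination above is just a running intersection of candidate sets. A minimal sketch, taking the candidate ranges exactly as listed in the five steps (the variable names are illustrative):

```python
# Candidate TP bonuses consistent with each observation, as listed in Steps 1-5
steps = [
    range(38, 44),  # Step 1: 100.5 TP, observed 148
    range(37, 43),  # Step 2: 120.6 TP, observed 160
    range(35, 41),  # Step 3: 147.4 TP, observed 175
    range(35, 42),  # Step 4: 140.7 TP, observed 171
    range(40, 46),  # Step 5: 154.1 TP, observed 183
]

survivors = set(steps[0])
history = [sorted(survivors)]
for candidates in steps[1:]:
    survivors &= set(candidates)  # keep only bonuses consistent with every step so far
    history.append(sorted(survivors))

for number, remaining in enumerate(history, start=1):
    print(f"after step {number}: {remaining}")
```

Only 40 survives all five steps, and the order in which the observations are processed does not matter.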

At this point, we should make sure adding 40 TP to the current TP allows us to predict correctly Spirits Within damage when the "net" TP exceeds 200 TP (so that damage is calculated based on the other part of the function).

For 167.5 TP, the predicted Spirits Within damage, based on 207.5 TP, is 207, which is also the observed value.

For 174.2 TP, the predicted Spirits Within damage, based on 214.2 TP, is 226, which is also the observed value.

For 180.9 TP, the predicted Spirits Within damage, based on 220.9 TP, is 246, which is also the observed value. (Note that you cannot floor the current TP to 180 and then add 40, which would give a predicted value of 242 based on 220 TP, which is wrong.) At this point, it seems reasonable to conclude that there is a 40 TP bonus from Fencer between 100 and 200 TP.

Now what about between 200 and 300 TP?

For 201.0 TP, the predicted Spirits Within damage, based on 241.0 TP, is 300, which is also the observed value.

For 227.8 TP, the predicted Spirits Within damage, based on 267.8 TP, is 375, which is also the observed value.

Finally, to make sure the actual TP for damage calculation is actually min(TP + 40, 300): at 300 TP, the predicted Spirits Within damage would be 578 if 340 TP were used, but the observed damage is 468, which is consistent with the 300 TP maximum.

Conclusion

Fencer gives a constant TP bonus of 40 TP for weapon skills independent of what the current TP is.

Tuesday, June 22, 2010

Critical hit rate bonus of Fencer

(Edit: now with information on Fencer with dual wield.)

Fencer is a new job trait from the June 21, 2010 version update that is available to the Warrior job at level 45 and the Beastmaster job at level 80. It has the following help description: "Increases rate of critical hits when wielding with the main hand only. Grants a TP bonus to weapon skills." The critical hit rate bonus was estimated using the following procedure.

Methods (brief)

An estimate of the critical hit rate bonus was obtained by auto-attacking overnight a level 69 Ul'hpemde, which has AGI 65 (source). WAR75/MNK01 was used. The following equipment was used to obtain STR 57, DEX 64, and an accuracy score of 276:

Trainee Knife (240 dagger skill)
Walahra Turban
Dusk Gloves
Snow Ring (STR -2)
Swift Belt (Accuracy +3)
Aurum Sabatons (DEX +3, accuracy +5)

STR 57 ensures 0 damage to any Ul'hpemde, and DEX 64 ensures (with 4/4 critical hit rate merits) a 9% critical hit rate before the effect of Fencer (source). kparser was used for automated data collection.

The level of the targeted Ul'hpemde was inferred by comparing the predicted hit rate for a level 69 Ul'hpemde (.92) against a point estimate of the hit rate of 5628/6133 = .9176, with 95% confidence interval (.9105, .9244). The observed hit rate is consistent with the prediction.

Estimation of Fencer's effect with dual wield was also done with a Trainee Knife/Trainee's Needle combination, but the Ul'hpemde was level 68. (Critical hit rate is "directly" independent of level, but not AGI, which depends on level to some extent. But for both level 67 and level 68 Ul'hpemdes, the AGI is 65.) The following image summarizes the final base attribute values for this particular trial:

Results

Single wield: A point estimate of 802/5628 - .09 = .0525 was obtained for the critical hit rate bonus, with a 95% confidence interval (.0434, .0619).

Dual wield: A point estimate of 464/4983 - .09 = .0031 was obtained for the critical hit rate bonus, with a 95% confidence interval (-.0048, .0115).

Interpretation and conclusion

Since critical hit rate has statistically been shown to take only integer percent values, assuming that the bonus is additive, the critical hit rate bonus of Fencer is either 5% or 6% with 95% confidence.

For the dual-wield case, suppose there were a 5% bonus for the main hand and none for the off hand. The effective bonus would then be 2.5%. Yet the observed estimate is much less than 2.5%, which should be taken as evidence that Fencer has no effect when dual wielding.
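The point estimates and the integer-percent reasoning can be checked with a short sketch. The post does not say which interval method was used, so the normal-approximation interval below is only a stand-in; it agrees with the quoted bounds to about three decimal places. The function name `bonus_interval` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value


def bonus_interval(crits, hits, base_rate):
    """Point estimate and normal-approximation 95% CI for (crit rate - base rate)."""
    p = crits / hits
    margin = Z * sqrt(p * (1 - p) / hits)
    estimate = p - base_rate
    return estimate, estimate - margin, estimate + margin


single = bonus_interval(802, 5628, 0.09)
dual = bonus_interval(464, 4983, 0.09)

# Whole-percent bonuses that fall inside the single-wield interval
whole_percents = [k for k in range(0, 101) if single[1] <= k / 100 <= single[2]]

print("single wield:", [round(v, 4) for v in single], whole_percents)
print("dual wield:  ", [round(v, 4) for v in dual])
```

Under this approximation only 5% and 6% fall inside the single-wield interval, and the hypothetical 2.5% dual-wield bonus lies above the upper end of the dual-wield interval.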

Saturday, June 19, 2010

Weapon skill critical hit rate bonus: summary of evidence

(Edit #2: added information for Backhand Blow and Blade: Jin, and another source for Rampage.)

(Edit #1: added another source for Drakesbane.)

This is an attempt to summarize any evidence following attempts to determine the critical hit rate bonus at or around 100 TP (if any) for weapon skills whose "chance of critical varies with TP."

I am not aware of any (non-anecdotal) evidence for the following weapon skills: Ascetic's Fury, Vorpal Blade, Power Slash, Sturmwind, Keen Edge, Vorpal Scythe, Vorpal Thrust, Skewer, Blade: Rin, True Strike, Hexa Strike, Sniper Shot, Heavy Shot, Dulling Arrow, and Arching Arrow (17 weapon skills). That leaves only six: Backhand Blow, Evisceration, Rampage, Raging Rush, Drakesbane, and Blade: Jin.

For now, "convenient" determination of critical hit rate is possible only for the first hit. Most of the testing done concerns the first hit, and conclusions are based on the assumption that the bonus (where it exists) is additive.

Backhand Blow (hand-to-hand, 2 hits)

Source: dex/crit relation, WS crits, WS gorgets discussion (Blue Gartr forums)

Comparing the sample proportions 22/50 (.44) at 9% baseline critical rate and 37/50 (.74) at 30% baseline (with 6% from Destroyers), it is obvious that there is some kind of innate critical rate bonus for at least the first hit of Backhand Blow.

But with Backhand Blow TP varying between 100 and 120 TP, it seems likely that the critical rate was not fixed for each sample. The consequences of this on the allocation of Type I error and coverage probability of the corresponding interval estimate are explored for Blade: Jin bonus estimation (later in the post), as data for that was obtained by the same person, but for now I will just describe briefly how to go about estimating the bonus for Backhand Blow.

Assume that the innate bonus is additive and constant (meaning it's independent of whatever the baseline critical rate is). Also assume that the critical rate bonus from Destroyers (6%) increases the critical hit rate of Backhand Blow by an additional 6% (starting from 24%).

Let X₁ be the number of critical hits observed at 9% baseline, n₁ the total number of hits observed at 9%, X₂ the number of critical hits observed at 30% baseline, and n₂ the total number of hits observed at 30%. A natural "pooled" estimator for Backhand Blow's critical hit rate bonus is

\[ \hat{b} = \frac{X_1 + X_2 - (0.09\,n_1 + 0.30\,n_2)}{n_1 + n_2} \]

and its standard error is

\[ \mathrm{SE}(\hat{b}) = \frac{\sqrt{n_1\,\hat{p}_1(1-\hat{p}_1) + n_2\,\hat{p}_2(1-\hat{p}_2)}}{n_1 + n_2}, \qquad \hat{p}_i = X_i/n_i. \]

The pooled estimate is .395 and a corresponding 95% confidence interval for the WS bonus is (30.32%, 48.68%).
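For concreteness, here is a small sketch of the pooled calculation. The form is inferred from the quoted numbers and from the simulation code at the end of this post: the estimate is the total number of critical hits minus the number expected from the two baselines alone, divided by the total number of hits, with a large-sample normal interval. The function name `pooled_bonus` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def pooled_bonus(x1, n1, base1, x2, n2, base2, level=0.95):
    """Pooled additive-bonus estimate and large-sample CI from two baselines."""
    n = n1 + n2
    estimate = (x1 + x2 - base1 * n1 - base2 * n2) / n
    p1, p2 = x1 / n1, x2 / n2
    # Var(X_i) is estimated by n_i * p_i * (1 - p_i); the baselines are constants
    se = sqrt(n1 * p1 * (1 - p1) + n2 * p2 * (1 - p2)) / n
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate, estimate - z * se, estimate + z * se


# Backhand Blow: 22/50 at a 9% baseline, 37/50 at a 30% baseline
est, low, high = pooled_bonus(22, 50, 0.09, 37, 50, 0.30)
print(f"{est:.3f} ({low:.2%}, {high:.2%})")
```

The same function applied to the Blade: Jin counts later in this post (3/30 and 8/30) gives an estimate of about -0.0117 with a margin of error of about 9.56%.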

Conclusion: there is a critical hit rate bonus for Backhand Blow at 100 TP. A bonus of 40% would be consistent with the given data.

Evisceration (dagger, 5 hits)

Source: Evis crit rate testing (Allakhazam forums)

At ~100 TP and given 24% base critical hit rate, the pooled sample gives a sample proportion 248/696 = .3563. A 95% confidence interval for the critical hit rate bonus is (8.61%, 15.32%).

Conclusion: there is a critical hit rate bonus for Evisceration at 100 TP, with +10% being a possibility.

Rampage (axe, 5 hits)

Source (1): ランページとDEXの関係

There are two sets of estimates: one for DEX 68, and one for DEX 124, with Gigantobugard as the target mob in both cases. I'm not much interested in calculating base AGI and confirming that the Megalobugard's level range is 40-43, so I ignored the estimates for DEX 68. DEX 124 ensures a 24% base critical hit rate.

At 100 TP, the sample proportion of critical hits is 35/130 = .2692. A 95% confidence interval for the critical hit rate bonus is ( -4.48%, 11.40%). But suppose there actually is a 10% critical hit bonus. For a sample size of 130, the probability that the sample is sufficient to show a statistically significant bonus is about .7388 (power calculation).
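The post does not say how the power figure was computed. Assuming a two-sided, normal-approximation test of the null rate (24%) against the alternative (34%), a sketch like the following lands on essentially the same number; `power_two_sided` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sided(p0, p1, n, alpha=0.05):
    """Normal-approximation power of a two-sided one-sample proportion test."""
    norm = NormalDist()
    z = norm.inv_cdf(1 - alpha / 2)
    se0 = sqrt(p0 * (1 - p0) / n)  # standard error under the null rate
    se1 = sqrt(p1 * (1 - p1) / n)  # standard error under the alternative rate
    upper = p0 + z * se0           # rejection thresholds for the sample proportion
    lower = p0 - z * se0
    return (1 - norm.cdf((upper - p1) / se1)) + norm.cdf((lower - p1) / se1)


# 24% base rate versus 34% (a +10% bonus), n = 130
power = power_two_sided(0.24, 0.34, 130)
print(round(power, 4))
```

The result is roughly .739, in line with the .7388 quoted above; nearly all of it comes from the upper rejection region.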

At 200 TP, the sample proportion of critical hits is 68/150 = .4533. A 95% confidence interval for the critical hit rate bonus is (13.20%, 29.66%).

Source (2): dex/crit relation, WS crits, WS gorgets discussion (Blue Gartr forums)

I did say I wasn't interested in calculating a mob's AGI, but a Clipper's AGI is either 18 or 21 regardless of the levels reported on FFXIclopedia, and either AGI value doesn't affect the actual crit rate for the DEX 57 case, which is indeed 13%. (See this for details about critical hit rate as a function of your DEX - mob AGI.)

Using the same "pooled" estimator rationale I used for Backhand Blow (earlier in the post), the sample proportion for Rampage's crit bonus at 300 TP is .465 and a corresponding 95% confidence interval for the rate bonus is (31.80%, 61.20%). For the sake of completeness, estimates for the bonus at 100 TP and 200 TP are (-9.26%, 25.40%) and (3.20%, 48.80%), respectively.

Conclusion: if there is a critical hit rate bonus for Rampage at 100 TP, the known evidence is insufficient to show that, but if the bonus were 10%, for n = 130 the power to reject the null hypothesis of no bonus is fairly high (.7388). Given all the data, it is relatively unlikely that the bonus is 10%, but a smaller bonus cannot be ruled out with such small samples.

Unsurprisingly, there is a bonus at 200 TP and 300 TP.

Raging Rush (great axe, 3 hits)

Source (1): レイグラのクリティカル率について　その１

The sample proportion is 20/40 given the usual 24% base. The "control" data for base critical rate (which is a good idea to have by the way), however, gives the sample proportion 44/130 = .3384, which is somewhat unusual, but I write that off merely as that, not a sign of dubious experimental error. This data alone gives the tentative impression that there is a bonus.

Source (2): RagingRush Critical rate test (Killing Ifrit forums)

The raw data (showing damage values) are in a spreadsheet, but you don't need to download it.

At 100 TP and given 24% base critical hit rate, the proportion of critical hits is 155/373 = .4155. A 95% confidence interval for the critical hit rate bonus is (12.50%, 22.74%). This is strong evidence that the critical hit rate bonus is not 10%. Possible candidates are 15% and 20%.

More interesting to me is that the damage for 1 TP return (20 occurrences) was also noted, providing an opportunity to determine whether a critical hit rate bonus also applies to off-hand hits (despite there being no way to tell the difference between a double attack hit and a regular off-hand hit). Assuming a 24% base critical hit rate, with 9 observed critical hits out of 20, the corresponding p-value is .03614, which suggests a critical hit rate bonus.
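The post does not name the test behind this p-value. An exact two-sided binomial test of the kind R's binom.test performs (summing the probabilities of every outcome no more likely than the observed one) reproduces the quoted figure, so the sketch below assumes that is what was used. The function names are illustrative.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_p_value(x, n, p):
    """Exact binomial test: sum the probabilities of all outcomes
    that are no more likely than the observed count."""
    observed = binom_pmf(x, n, p)
    tolerance = 1 + 1e-7  # guards against floating-point ties
    return sum(
        binom_pmf(k, n, p)
        for k in range(n + 1)
        if binom_pmf(k, n, p) <= observed * tolerance
    )


# 9 critical hits out of 20, against a 24% base rate
p_value = two_sided_p_value(9, 20, 0.24)
print(round(p_value, 5))
```

Most of the p-value is the upper tail (9 or more critical hits); the only contribution from the lower tail is the outcome of zero critical hits.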

Conclusion: there is a critical hit rate bonus for Raging Rush at 100 TP, with +15% and +20% being possible candidates. The small sample for critical hits from off-hand hits suggests a critical hit rate bonus for off-hand hits of Raging Rush as well.

Drakesbane (polearm, 4 hits)

Source (1): drakesbane native crit% (FFXIclopedia forums)

The first sample is 38/100 and the second, 24/100 (given 106 TP).

38/100 is a fairly extreme observation given 24% base critical hit rate (if there were no bonus). On the other hand, 24/100 is not that extreme an observation given a 34% rate. Since there is no good reason to think the conditions changed between the two samples, pool the data and crank out an interval estimate for the rate bonus, which is (0.66%, 13.91%).

Source (2): 雲蒸竜変の検証

There are four samples: three for 100 TP and one for 300 TP.

For 100 TP, the sample proportions are 12/49, 15/45, and 15/41 (given 24% base critical hit rate). The pooled estimate is 42/135 = .3111 and a 95% confidence interval for the bonus is (-0.57%, 15.64%). While this interval covers 0, 0 is again close to the left endpoint (in the other case the 0 being on the "right" side based on expectations).

As for 300 TP, the sample proportion is 16/30 and a 95% confidence interval for the rate bonus is (10.32%, 47.66%), which rules out 50% (tentatively).

Conclusion: there is suggestive evidence for a critical hit rate bonus at 100 TP, with +5% and +10% being possible candidates. At 300 TP, a +50% bonus appears to be an "unlikely" possibility.

Blade: Jin (katana, 3 hits)

Source: dex/crit relation, WS crits, WS gorgets discussion (Blue Gartr forums)

The sampling was done in the same fashion as for Backhand Blow, with observed critical hit proportions 3/30 at 9% baseline crit rate and 8/30 at 30% baseline (with Senjuinrikio's 6% bonus) at 100 TP. Using the same estimator that I used for Backhand Blow, the "pooled" sample proportion for Blade: Jin's critical bonus is -0.01167, and a corresponding 95% confidence interval is (-10.73%, 8.39%).

Taking the confidence interval at face value, if there is a critical bonus for Blade: Jin at 100 TP, it is unlikely that it's 10% or higher, especially considering the "sloppy" manner in which the data was likely collected (with TP not being held fixed, the critical hit rate could have varied), which further supports that contention. If the bonus were 10%, obviously, the probability that a 95% confidence interval wouldn't cover 10% at the right endpoint of the interval would be near .025 (half the Type I error). The consequences of experimental "error" are explored in a simulation study described at the end of this post.

Conclusion: if there is a critical hit rate bonus for Blade: Jin at 100 TP, it is unlikely that the bonus is as high as 10%.

Simulation study: is a 10% critical hit rate bonus that unlikely for Blade: Jin?

Consider the following simulation study based on hypotheticals: if there actually were a 10% bonus at 100 TP, with a 1% increase for every 5 TP, then with TP varying between 100 and 119 TP, the critical rate bonus varies between 10% and 13%.

Given that "TP overflow" is inevitable with dual wield, and that extra hits occurring beyond TP were quite possible because data collection was reported to be boring, suppose that each of the critical rate bonuses between 10% and 13% (inclusive) is equally likely to be "chosen" for Blade: Jin.

The purpose of the study is to show how likely it is that the "pooled" large-sample confidence interval covers 10% given the above conditions.

A histogram of the simulated sampling distribution of the critical hit rate bonus shows that it's obviously not normal, with the mean (about 11.5%) higher than 10%, which is supposed to be the "actual" bonus at 100 TP for this simulation. (The shape of the large-sample approximation of the sampling distribution is traced with the solid curve.)

On the other hand, the margin of error for all simulated sample proportions is higher than 9.56%, the margin of error for the actual sample, about 97.7% of the time. (The mean margin of error is 11.19%.) Also, the "actual" (in the context of the simulation) Type I error is about .059, with about .040 allocated to the right tail (meaning there is a probability of .0402 that the null hypothesis of .10 is rejected because the estimate is higher than .10 based on the criterion of statistical significance) and about .019 allocated to the left tail (meaning the null is rejected with probability .019 because the observed estimate is significantly lower than .10). By comparison, the nominal left-tail error is .025.

Repeating this exercise under the condition that there is no bonus, the margin of error for all simulated sample proportions is higher than 9.56% only 58.0% of the time, and the probability that a confidence interval's right endpoint is higher than 8.39% is less than 0.1%.

If Blade: Jin's critical hit rate bonus at 100 TP were actually 10%, considering TP overflow and additional hits occurring beyond TP overflow, it would be very unlikely that a given 95% confidence interval would not cover 10%. The margin of error would also be very likely to be higher than 9.56%. Therefore, it is more plausible that its critical rate bonus is significantly less than 10%, if it even exists.

The following is some code for the simulation, but the inner loop should probably be expanded so that it finishes faster.


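# Simulate the "pooled" bonus estimator when the true bonus drifts between 10% and 13%
n = 100000                      # number of simulated experiments
ci.lower = numeric(n)
ci.upper = numeric(n)
p.pool = numeric(n)
for (i in 1:n) {
  X1 = 0   # critical hits at the 9% baseline
  X2 = 0   # critical hits at the 30% baseline

  # each of the 30 weapon skills draws its own rate (baseline plus a 10% to 13% bonus)
  for (j in 1:30) {
    X1 = X1 + rbinom(1,1,sample(seq(.19,.22,by=.01),1))
    X2 = X2 + rbinom(1,1,sample(seq(.40,.43,by=.01),1))
  }

  # pooled bonus estimate: subtract the expected baseline crits (.09*30 + .30*30)
  p.pool[i] = (X1 + X2 - .39*30)/60

  # large-sample 95% confidence interval
  se = sqrt((X1/30*(1-X1/30) + X2/30*(1-X2/30))/120)
  ci.upper[i] = p.pool[i] + qnorm(.975)*se
  ci.lower[i] = p.pool[i] - qnorm(.975)*se
}

mean(p.pool)                    # mean simulated bonus estimate
me = (ci.upper - ci.lower)*.5   # simulated margins of error
# share of simulated margins of error above that of the actual sample (9.56%)
mean(me>sqrt((3/30*(1-3/30)+8/30*(1-8/30))/120)*qnorm(.975))
mean(ci.upper<.10)              # left-tail rejection rate (estimate significantly below .10)
mean(ci.lower>.10)              # right-tail rejection rate (estimate significantly above .10)
mean(ci.upper<.10) + mean(ci.lower>.10)   # total Type I error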

Friday, June 18, 2010

Why Love Halberd is underrated... for dragoon

While I personally have yet to determine the virtue stone consumption rate for virtue weapons other than Fortitude Axe (so far, I'm assuming it's 55% across all virtue weapons given the limited evidence thus far), how exactly the normal double attack trait interacts with the virtue weapon's "occasionally attacks twice" (OAT) property seems to be described correctly. With a reasonable level of confidence, one can draw conclusions about how effective the other virtue weapons are compared to their "peers."

I can't say the likes of Hope Staff and Prudence Rod are worth discussing, but Love Halberd has some properties relevant for dragoon and samurai that seem to be misunderstood and even dismissed out of hand, the inconvenience of acquiring virtue stones notwithstanding. I go through them in order of importance and then compare Love Halberd to its competing options for DRG.

Is Love Halberd's delay undesirable?

Love Halberd has 396 delay, so with current quantities of Store TP available, it's possible and reasonable to achieve an "8-hit setup" with 23 Store TP (12.5/10.2 = 1.22549, which rounds up to 1.23).
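The Store TP arithmetic can be sketched as follows. The 10.2 TP per hit comes from the text; truncating each hit's TP to one decimal place is an assumption made for this sketch, and TP is kept in tenths so the arithmetic stays exact. The names are illustrative.

```python
BASE_TP_TENTHS = 102  # 10.2 TP per hit for Love Halberd, from the text
HITS = 8


def tp_after_hits(store_tp, hits=HITS, base_tenths=BASE_TP_TENTHS):
    """TP (in tenths) after `hits` landed hits, truncating each hit to one decimal (assumed)."""
    per_hit = base_tenths * (100 + store_tp) // 100
    return hits * per_hit


# Smallest Store TP that reaches 100.0 TP in 8 hits
needed = next(s for s in range(0, 101) if tp_after_hits(s) >= 1000)
print(needed, tp_after_hits(needed) / 10)
```

With 22 Store TP each hit would be worth 12.4 TP, leaving eight hits at 99.2 TP, which is why 23 is the threshold.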

People act like this is a bad thing. But so what if it takes Love Halberd 8 hits to get to 100 TP? Noting how many hits it takes to get to 100 TP is trivial and irrelevant especially because of Love Halberd's OAT property. Instead, one should ask, how many attack rounds does it take for Love Halberd to get to 100 TP, given that 8 hits are required to get there?

It may help to show a graph illustrating, for both a virtue weapon (singly wielded) and a weapon without any multi-hit property (also singly wielded) but under 9% double attack rate and 95% hit rate, the relationship between the nominal number of hits to get to 100 TP and the "actual" (in a long-run, "missing the first hit of a WS 5% of the time," weapon skill-spamming context), average number of attack rounds it takes to get to 100 TP:

First, look for the average number of attack rounds it takes for a weapon without any multi-hit property to get to 100 TP in 6 hits. On the graph, the average number of attack rounds appears to be 5, and the actual value is 4.9526 rounds. This figure is reasonable because even though 5% of the time, the first hit of a WS misses (most of the time it takes 5 hits to get to 100 TP), the 9% double attack rate results in the average value falling slightly below 5.

Now, look for the average number of attack rounds it takes for a virtue weapon to get to 100 TP in 8 hits. "Wait a second," you observe, "isn't the corresponding average number of rounds below 4.9526?" In fact, on average it takes a virtue weapon only 4.7305 rounds to get to 100 TP in 8 hits, so an 8-hit virtue weapon setup ideally has a higher weapon skill frequency than a 6-hit setup with a non-multi-hit weapon.

Is the average attack round argument unconvincing? Let's instead examine the probability distributions of the number of attack rounds it takes for a virtue weapon, a weapon without a multi-hit property, and, for comparison's sake, a "Trial of the Magians" OAT weapon (for dragoon, Bradamante) to get to 100 TP:

These probability distributions were obtained via Markov chain methods.

For a weapon without a multi-hit property, the probability of getting to 100 TP in 5 attack rounds is .580, and the probability for fewer than 5 attack rounds is higher than the probability for greater than 5 attack rounds, which is consistent with the average attack round figure of 4.9526.
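The exact Markov chain setup is not shown in this post, so the following is only a sketch of one way to build it for the plain (non-multi-hit) weapon. The state is the number of landed hits still needed, each attack round lands 0, 1, or 2 hits (95% hit rate, 9% double attack), and the weapon skill's first hit is assumed to supply one hit's worth of TP when it lands. Under those assumptions the sketch gives an average of about 4.9526 rounds and a probability of about .580 for exactly 5 rounds.

```python
def rounds_distribution(hits_needed, hit_probs, max_rounds=60):
    """P(exactly r rounds) to reach `hits_needed` landed hits.

    `hit_probs[k]` is the chance of landing k hits in one attack round.
    The chain's state is the number of hits still needed (0 is absorbing).
    """
    state = {hits_needed: 1.0}
    finished = []
    for _ in range(max_rounds):
        nxt, done = {}, 0.0
        for need, prob in state.items():
            for landed, p in enumerate(hit_probs):
                left = need - landed
                if left <= 0:
                    done += prob * p
                else:
                    nxt[left] = nxt.get(left, 0.0) + prob * p
        finished.append(done)
        state = nxt
    return finished  # finished[r - 1] = P(exactly r rounds)


HIT, DA = 0.95, 0.09
# Landed hits per round for a plain weapon: one swing, or two on a double attack
per_round = [
    (1 - DA) * (1 - HIT) + DA * (1 - HIT) ** 2,
    (1 - DA) * HIT + DA * 2 * HIT * (1 - HIT),
    DA * HIT ** 2,
]

# 6-hit setup: 5 more hits are needed if the WS's first hit landed, 6 if it missed
dist5 = rounds_distribution(5, per_round)
dist6 = rounds_distribution(6, per_round)
dist = [HIT * a + (1 - HIT) * b for a, b in zip(dist5, dist6)]

mean_rounds = sum(r * p for r, p in enumerate(dist, start=1))
print(round(mean_rounds, 4), round(dist[4], 3))
```

Other weapons can be explored by swapping in a different `per_round` list, though the per-round hit distributions for virtue and Magian weapons depend on how their multi-hit procs interact with double attack and are not modeled here.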

In comparison, while the probability of getting to 100 TP in 5 attack rounds is lower for a virtue weapon (.403), the higher probability of getting to 100 TP in 4 attack rounds (.373) contributes to the average number of attack rounds to get to 100 TP being lower (4.7305).

And for the sake of comparison, it takes about 3.783 rounds on average for a Magian OAT weapon to get to 100 TP in 6 hits. This breaks down such that a Magian OAT weapon most likely takes either 3 or 4 attack rounds to get to 100 TP.

Note that for all three types of weapons, the probability that it takes 7 or more attack rounds to get to 100 TP is, at most, about .028 (for both the virtue weapon and the non-multi-hit weapon), which underscores the fact that, at least given 95% hit rate, it's not like the virtue weapon "needs" 7 or more attack rounds to get to 100 TP with any significant probability just because 8 landed hits are required to generate 100 TP.

In short, delay for virtue weapons, and the corresponding nominal number of hits it takes to get to 100 TP, is relatively unimportant because of the OAT property. In the case of the 8-hit Love Halberd setup, this property results in a lower average number of attack rounds to get to 100 TP than that for a 6-hit setup for a weapon without a multi-hit property (assuming a 55% virtue stone consumption rate).

Is the Love Halberd's base damage rating too low?

Love Halberd's 60 base damage is only 4 lower than Fortitude Axe's 64, which has 504 delay, so I'd say dragoons and samurai are relatively "spoiled" with access to a weapon with such high attack frequency and low delay.

Also, with a low base damage, the relative damage gap between Love Halberd and a higher-damage weapon decreases with additional fSTR.

Does Love Halberd's DEX +7 matter?

This is relatively unimportant, but with DEX +8 generally guaranteeing a 1% increase in critical hit rate when the target's AGI is not obscenely higher than your DEX, one can expect, effectively, a +1% critical hit bonus most of the time with DEX +7, which is not bad. DEX +7 is also a nice amount of DEX in the weapon slot that could help to ramp up one's critical hit rate if the opportunity presents itself (yeah, yeah, Greater Colibri...).

At least you can say it counters the loss of any attack (or accuracy) bonus associated with equipment for the ammo slot, Smart Grenade, Tiphia Sting, or whatever it is that DRG uses.

An additional +5 or +6 accuracy, if actually realized from the DEX bonus, is nothing to ignore, either.

Finally, a comparison of polearm options

Taken together, the features described above make Love Halberd better than "conventional wisdom" allegedly holds.

Earlier, I did a write-up of how to model (approximately) the effect of Jump on damage rate as a preliminary step to doing a comparison of polearms that accounts for the increased WS frequency that Jumps provide. As usual, this comparison is done in terms of a long-run, WS-spamming, Jump-spamming situation so that one gets a decent idea of the relationship among the weapons in terms of maximum potential.

The weapons to be compared are

Valkyrie's Fork (6 hits to 100 TP)
Bradamante (with 75 base damage and 6 hits to 100 TP)
Love Halberd (8 hits to 100 TP).

Some of the conditions I specified are

fSTR 6 (+5 for Drakesbane)
42 additional WS "base" damage from the STR 50% modifier
95% hit rate
0% Zanshin rate
base double attack rate of 9%
ATK/DEF ratio of 1.5 and base critical hit rate of 9%, corresponding to an (approximate) average pDIF of 1.599 across all weapons (the critical hit rate bonus of Love Halberd treated as though it offsets the use of virtue stones at the expense of any attack bonus from the ammo slot)

Also, for Drakesbane, I am assuming a critical hit rate bonus of +10% and basing WS damage on 100 TP (ignoring excess TP effects, if they even exist). For Jumps (when accounted for), I treat the damage of Jumps as equivalent to normal hits (yet another simplification).

Let's start with a high quantity of haste, say, 64%, which accounts for Hasso (10%), double March (20%), Haste spell (15%), and haste from equipment (19%), which would relatively favor Valkyrie's Fork, a weapon with fundamentally lower WS frequency than the others, because of weapon skill delay (2 seconds).

Without accounting for the effect of Jumps, the summary of relevant numbers comes out as follows:

Weapon	Avg. TP dmg	Avg. WS dmg	Time per WS	Dmg/sec	TP:WS dmg
Valkyrie's Fork	832.01	1041.54	16.29 s	114.98	444:556
Bradamante	701.52	894.93	13.78 s	115.83	439:561
Love Halberd	793.79	789.77	13.23 s	119.61	501:499
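The columns of this table appear to be related in a simple way, which the sketch below checks: damage per second is roughly the sum of the two damage columns divided by the time per WS, and the TP:WS ratio is the TP share of that sum per 1000 damage. This reading of the columns is an inference from the numbers rather than something stated in the post, and the small gaps in damage per second presumably come from rounding in the displayed times.

```python
rows = {
    # weapon: (avg TP dmg, avg WS dmg, seconds per WS, listed dmg/sec)
    "Valkyrie's Fork": (832.01, 1041.54, 16.29, 114.98),
    "Bradamante": (701.52, 894.93, 13.78, 115.83),
    "Love Halberd": (793.79, 789.77, 13.23, 119.61),
}

summary = {}
for weapon, (tp_dmg, ws_dmg, seconds, listed) in rows.items():
    cycle = tp_dmg + ws_dmg                   # damage per WS cycle
    rate = cycle / seconds                    # damage per second
    tp_share = round(1000 * tp_dmg / cycle)   # TP share per 1000 damage
    summary[weapon] = (rate, tp_share)
    print(f"{weapon}: {rate:.2f}/s (listed {listed}), {tp_share}:{1000 - tp_share}")
```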

These figures are merely a point of comparison to the more "realistic" figures that account for the effect of Jumps. But first, as an aside, I have to point out that the OAT effect of virtue weapons doesn't proc on Jumps and discuss the major implication for using Jumps with Love Halberd.

In general, Jumps can be considered an attack round that occurs "on demand." Moreover, Jumps generally delay the start of the following attack round by 2 seconds (a consequence of job ability or weapon skill delay in general), so Jumps, in effect, help to decrease the time between weapon skills except when the time between auto-attack rounds falls below 2 seconds. This is the primary effect of Jumps as slight increases in Jump damage per hit compared to auto-attack damage per hit are minor in comparison.

But since Jumps with Love Halberd are effectively normal attack rounds, they do not generate as much TP (on average) as auto-attack rounds. Therefore, there is a critical value of haste after which jumping with Love Halberd is unproductive.

Given the above conditions, Love Halberd averages about 1.579 landed hits per attack round, and "normal" jumps average exactly .95*1.09 = 1.0355 landed hits per "attack round" or 0.51775 landed hits per second (if spammed, so this is the upper limit for Jumps). It follows that it's counterproductive to jump with Love Halberd (in a long-run sense, not in a "need damage on demand" sense) when haste is above 53% (an approximate critical value). Therefore, for the following table, the effect of Jumps is considered only for Valkyrie's Fork and Bradamante:
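One way to arrive at a figure near the quoted critical value is to find the haste at which auto-attack rounds land hits as fast as spammed Jumps. The sketch assumes that 60 delay corresponds to 1 second (so 396 delay is 6.6 seconds per round before haste) and that haste scales the time between rounds by (1 - haste); neither is stated in the post, and the names are illustrative.

```python
DELAY = 396                  # Love Halberd delay, from the text
HITS_PER_ROUND = 1.579       # average landed hits per auto-attack round, from the text
JUMP_HITS_PER_SEC = 0.51775  # 1.0355 landed hits per Jump, one Jump every 2 seconds

SECONDS_PER_ROUND = DELAY / 60  # assumption: 60 delay corresponds to 1 second


def auto_attack_hits_per_sec(haste):
    """Landed hits per second from auto-attacks at a given haste fraction."""
    return HITS_PER_ROUND / (SECONDS_PER_ROUND * (1 - haste))


# Haste at which auto-attack rounds land hits as fast as spammed Jumps
critical_haste = 1 - HITS_PER_ROUND / (SECONDS_PER_ROUND * JUMP_HITS_PER_SEC)
print(round(critical_haste, 3))
```

This comes out a little under 54%, so at 64% haste auto-attack rounds already out-pace Jumps in landed hits per second.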

Weapon	Avg. TP dmg	Avg. WS dmg	Time per WS	Dmg/sec	TP:WS dmg
Valkyrie's Fork	832.01	1041.54	16.00 s	117.08	444:556
Bradamante	701.52	894.93	13.51 s	118.13	439:561
Love Halberd	793.79	789.77	13.23 s	119.61	501:499

As stated previously, the primary effect of Jumps is to decrease the time per weapon skill. Given 64% haste, the effective increase in damage per second is at most around 2%. (At lower levels of haste, the contribution of Jumps to increasing the rate of damage is higher.) Even when Jumps are accounted for, Love Halberd is still slightly better than either Valkyrie's Fork or Bradamante. (The TP:WS damage ratios are my usual check on how well the calculations represent what is observed in the game, but I have no idea if these are typical ratios.)

Certainly, virtue stone consumption is a strike against Love Halberd for everyday, humdrum situations, and it's possible Bradamante can be further augmented after future updates, but can Bradamante be enhanced to the point where formerly top-end polearms (like Valkyrie's Fork) are completely outclassed after accounting for human "inefficiency"? It remains to be seen, but now let's consider the viability of these weapons in a zerg-like situation with 80% haste:

Weapon	Avg. TP dmg	Avg. WS dmg	Time per WS	Dmg/sec	TP:WS dmg
Valkyrie's Fork	832.01	1041.54	9.94 s	188.47	444:556
Bradamante	701.52	894.93	8.55 s	186.80	439:561
Love Halberd	793.79	789.77	8.24 s	192.08	501:499

As discussed in a previous post, the benefit of increasing haste is higher for weapons with lower WS frequency than weapons with higher frequency, a consequence of weapon skill delay. Unsurprisingly, Bradamante falls behind Valkyrie's Fork, yet Love Halberd still has a slight advantage over Valkyrie's Fork even at maximum haste, lending actual credence to the use of Love Halberd for high-haste zergs (and discrediting the idea of using Bradamante for such, at least when compared to Valkyrie's Fork).

Conclusions

Love Halberd's delay in conjunction with its OAT property can give it a weapon skill frequency higher than weapons without any multi-hit property. For example, an 8-hit Love Halberd setup has a higher WS frequency than a 6-hit setup for a polearm without any multi-hit property. This, along with its relatively high base damage (for a multi-hit weapon) and DEX +7, makes it a "peer" to the likes of Bradamante, the latest fashionable polearm. At 80% haste, Bradamante is a relatively poor weapon compared to Love Halberd.

INT affecting Drain accuracy, continued

This is a followup to an earlier post showing that INT affects the accuracy of Drain.

Having found a way to suppress my base INT even further, I increased the difference between the two INT "treatments" to 60 (54 INT and 114 INT). As the resist rate for 121 INT could have been floored (presumably at 5%), maybe a 7-INT decrease would result in some increase in resist rate. The following dot plots summarize the distribution of the observed data visually:

The criterion I used last time to determine a difference between two cases, the number of Drain values set "far enough" apart from the bulk of the data (over the total number of samples in each case), doesn't quite work this time, in part because it is kind of difficult to define the "bulk" of the data for the 54-INT case. (It appears that there are more resists in general by an alternative criterion of number of Drain values below 144, though, and more for the 54-INT case than the 114-INT case.)

Instead, I probably am better off relying on the two-sample t-test to demonstrate statistical significance. The sample means for 54 INT and 114 INT are 154.28 and 191.33, respectively, and the 95% confidence interval for the difference of means is (12.213, 61.898), so the evidence, taken together (including that in the last post), provides strong support for the contention that INT affects Drain accuracy (provided that all the assumptions I stated last time hold, and why wouldn't they?), specifically that increasing INT increases its accuracy (in the form of fewer resists).
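
As a quick sanity check on the quoted figures (not a re-run of the t-test, since the raw samples are not listed here), the sketch below confirms that the reported interval is centered on the difference of the two sample means and excludes zero. The variable names are made up for illustration.

```python
# Figures quoted in the post.
mean_low_int = 154.28    # sample mean Drain at 54 INT
mean_high_int = 191.33   # sample mean Drain at 114 INT
ci_low, ci_high = 12.213, 61.898  # 95% CI for the difference of means

diff = mean_high_int - mean_low_int
midpoint = (ci_low + ci_high) / 2
half_width = (ci_high - ci_low) / 2

# A t-interval is symmetric about the point estimate, so the midpoint should
# agree with the difference of means up to rounding of the reported means.
centered = abs(diff - midpoint) < 0.01
# An interval lying entirely above zero means significance at the 5% level.
excludes_zero = ci_low > 0

print(round(diff, 2), round(midpoint, 4), round(half_width, 4), centered, excludes_zero)
```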

The most obvious practical implication of this finding is that the "conventional wisdom" of prioritizing dark skill above magic accuracy above recast reduction for Drain (and Aspir by close analogy) should incorporate INT, arguably before recast reduction. Where the benefit of reducing recast timers for Drain and Aspir is not fully realized (more often than you think) and the resist rate is suspect, the opportunity cost of recast reduction is usually additional INT, and it might not be a cost worth incurring depending on the actual trade-off.

Thursday, June 17, 2010

You think these new abilities and traits for WAR actually matter?

When was the last time there was a good version update teaser? I can't even remember since I'm usually apathetic toward version updates in general, but the latest teaser concerning "Adjustments of the Job Persuasion!" has provided a lot of fodder for blabbing and speculation, and here are my two Zimbabwean cents on the warrior job abilities and traits.

Restraint (level 77)

Ability delay: 10 minutes
Effect duration: 5 minutes
Description: "Enhances your weapon skill power with each normal attack you land, but prevents you from dealing critical hits."

Would you use this for zerging when you have Mighty Strikes? No. I can't imagine a situation where the WS bonus would exceed the substantial benefit of 100% critical hit rate.

Consider the implications of 0% critical hit rate for the purposes of great axe WS spam without MS. Then, consider the implications of 0% critical hit rate for Raging Rush (doubtful that the loss of critical hits wouldn't apply to the WS), leaving King's Justice as the only choice for WS spam. Will the trade-off be worth it? I doubt it. If anything, I wouldn't be surprised if the bonus applied only to the first hit of any WS, and I doubt you can cancel Restraint and keep the WS bonus.

The loss of critical hits and actual WS bonus notwithstanding, this naturally favors one-handed weapons relative to two-handed weapons in theory, except that dual-wielding generally sucks relative to using great axe (or polearm when applicable) and one-handed weapon skills not named Rampage generally suck, too.

Critical Attack Bonus (level 78)

(Also Thief level 78, Dancer level 80)
Description: "Improves power of critical hits."

If this is like a permanent Brave Grip (estimated 3.5% increase in critical hit damage), it's better than nothing. Technically it's "good," but it's not going to change anything about how to "play" warrior.

Of course this bonus has to be introduced alongside Restraint, which is likely to be a spurious JA.

Fencer (level 45)

(Also Beastmaster level 80, Ranger level 80)
Description: "Increases rate of critical hits when wielding with the main hand only. Grants a TP bonus to weapon skills."

The description and jobs for which this is available seem to imply that the rate increase applies to one-handed weapons only without dual-wield. Now why would anyone want to do that? Tanking!?

It's way more relevant for jobs subbing WAR at level 90 and beyond. But watch the rate increase be like 1%.

Shield Defense Bonus (level 80)

(Also Paladin level 77)
Description: "Reduces damage taken when blocking an attack with a shield."

Joke.

Does INT affect Drain accuracy?

(Correction: 06/18/2010. I meant /SCH instead of /DRK toward the end. I've gone mental...)

A while ago, I asserted that INT "seems likely" to affect the accuracy of Aspir and, by implied analogy, Drain, but I had absolutely nothing on which to base this assertion. Not quite as baseless is assuming that, since then, there has been absolutely no evidence presented anywhere to support or refute that assertion.

Some problems with getting data to show whether INT affects Drain accuracy

Why do I assume that? I'm not trying to be a hater and talk shit, as ignorance about this is not on the level of, say, ignorance about the party-based, hidden latent effects of curry food items.

At least where examining the effect of INT on Drain accuracy is concerned, one problem is that if you're in a situation where you have good reason to believe Drain accuracy isn't "capped," you wouldn't want your HP to be low enough to allow you to obtain the actual quantity of HP taken with Drain. (Low-level beetles and worms are not acceptable targets for examining Drain accuracy with level 75 jobs, and EM+ worms are not easily accessible... yet.)

Another problem, related to the first, is that the distribution of Drain still isn't known today, and "censored" values of HP drained don't help to provide insight into that. ("Censoring" is one way to describe the fact that Drain values reported in chat logs are based on maximum HP; any HP restored beyond your maximum HP does not count in the final chat log figure, so at best you only know at least how much you drained, not its actual value or whether your Drain was resisted.)
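
The censoring described above can be sketched as follows. The function names and the HP figures in the usage example are hypothetical; only the rule itself (the log never reports more than your missing HP) comes from the description above.

```python
def reported_drain(actual_drain, current_hp, max_hp):
    # HP restored cannot exceed missing HP, so the chat log value is capped.
    missing_hp = max_hp - current_hp
    return min(actual_drain, missing_hp)

def possibly_censored(reported, current_hp, max_hp):
    # If the reported value equals the missing HP, the true Drain value is
    # only known to be at least that much (it may or may not be larger).
    return reported == max_hp - current_hp

# Hypothetical example: 900/1000 HP, true Drain of 288.
logged = reported_drain(288, 900, 1000)
print(logged, possibly_censored(logged, 900, 1000))
```

This is why HP has to be suppressed well below maximum before casting: only then is the logged value guaranteed to be the actual Drain value.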

These are some of the problems that hamper data collection.

A way to avoid these problems?

If only there were a stationary target that didn't fight back, that could allow you to suppress your HP safely, for which Drain accuracy has the possibility not to be capped at level 75, and for which you could gather Drain data without interference from other players...

Zvahl Fortalices definitely satisfy the first condition, as they do not move. They also satisfy the second condition, as two of the fortalices deeper into Castle Zvahl Baileys (S) do not have any mobs wandering nearby, including Dark and Ice Elementals. Zvahl Fortalices definitely seemed like a promising candidate for Drain testing, so I actually set out to get some data to determine if INT has some accuracy effect.

That left the third and fourth conditions. Of course, I had no idea if it would even be possible for Drain accuracy not to be capped, but that would be part of the data collection anyway, with the hope that my Drain accuracy could be decreased enough to raise the corresponding resist rate above the (assumed) 5% resist rate floor. As for people doing skill-ups, I don't really begrudge them trying to maximize their skill-up opportunities, as this method of skill-up is liable to be "nerfed" come the June 21 version update.

Goals of data collection, some assumptions, and results

My way of determining whether INT has an effect on Drain accuracy is based on a simple two-sample comparison of the occurrence of resists, one sample based on "low" INT (71 in my case), and the other based on "high" INT (121). Again, this was based on the hope that my resist rate would not be floored (at 5%) for the low-INT case. This, in turn, is based on the assumption that the resist rate is floored at 5% and rises with decreasing magic accuracy. If the data shows the resist rate being above 5% for low INT, I conclude my Drain accuracy isn't capped for low INT. (That alone would not show that INT has an effect on accuracy; I would need the second sample under high INT as well.)

But what is considered a resist? Similar to the Aspir data collection I cited previously, it would be necessary to get some sense of the distribution of unresisted Drain values, with any low Drain values set "far enough" apart from the bulk of the data considered occurrences of a resist. This is the main assumption concerning the interpretation of the data (but a reasonable one).

Now, what about the other assumptions? I merely state some of them here because I simply was not interested in testing them, and I didn't collect enough data to test these assumptions anyway.

No differences among Zvahl Fortalices that could affect the results. This is a catch-all assumption concerning possible differences in magic evasion, INT, etc., but I don't think they exist (otherwise, fuck you, SE). If it could be shown that two Fortalices have two different base INT values (even if only a +1 INT difference), you would have to wonder about other possible confounders like level difference as well (can't assume these are level 75, etc.).
Even if there is a bonus to Drain on Fortalices, similar to a MAB bonus for elemental magic, it should still be possible to tell the difference between a resist and a non-resist. Bio II initial damage shows there is a MAB bonus, but even if there is a similar bonus for Drain (not MAB-related, of course), it shouldn't affect one's ability to distinguish between resists and non-resists.
The equipment bonuses (or penalties) aside from +50 INT for the "high INT" case have no effect on the accuracy of Drain. Now, obviously, I didn't put on equipment with dark magic skill or magic accuracy (or use a Dark Staff or Pluto's Staff), leaving only base attribute bonuses and penalties. Now, if you think MND and CHR actually have an effect on Drain accuracy, I'd like to hear the justification. If there are hidden accuracy effects on my equipment, that could be a problem, though.
Dark weather and Darksday have no effect on the accuracy of Drain. This is not really an assumption, as I didn't collect any data during Darksday or under Dark weather, but I just mention it anyway as they are potential confounders.

Now, the results. First, dot plots of the results as an initial visual impression, under low INT (top dot plot) and under high INT (below), suggest a minimum and maximum non-resisted Drain given 269 dark magic skill:

Before jumping into a discussion of the maximum and minimum Drain values, based solely on the criterion of a resist I described earlier (low values of Drain set "far enough" apart from the bulk of the data), there are 10/65 resists under low INT and 2/63 under high INT, so this data appears to provide good evidence that increasing INT increases the accuracy of Drain, especially if you think that for the 121-INT case, the resist rate was floored at 5%. I see no reason to be pedantic and report a confidence interval or p-value.
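
For anyone who does want the pedantry, here is a minimal sketch that turns the two counts into resist rates and a one-sided p-value. Fisher's exact test is a choice made for this example, not something used in the original analysis, and the function names are made up.

```python
from math import comb

def resist_rate(resists, casts):
    return resists / casts

def fisher_one_sided(a, n1, b, n2):
    # P(at least `a` of the a+b total resists land in the first group),
    # under the null that resists are spread at random over all n1+n2 casts
    # (a hypergeometric tail probability).
    total_resists = a + b
    denom = comb(n1 + n2, total_resists)
    upper = min(total_resists, n1)
    tail = sum(
        comb(n1, k) * comb(n2, total_resists - k)
        for k in range(a, upper + 1)
    )
    return tail / denom

low_int_rate = resist_rate(10, 65)    # 71 INT
high_int_rate = resist_rate(2, 63)    # 121 INT
p_value = fisher_one_sided(10, 65, 2, 63)
print(round(low_int_rate, 3), round(high_int_rate, 3), p_value)
```

The low-INT rate sits well above the assumed 5% floor while the high-INT rate sits below it, which is the comparison the argument rests on.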

Under both low INT and high INT, the Drain maximum (unresisted) was 288 under 269 dark magic skill. It has been said that the maximum Drain and Aspir are 300 and 100, respectively, without any potency-enhancing gear (anecdotal discussion on BG), so if it can be shown that the Drain maximum is 288 under 269 skill for other mobs, you have to wonder how Drain potency actually scales with dark magic skill.

The location of the unresisted Drain minimum is less straightforward. One possibility is that it could be at 144 HP, which would be exactly half of the unresisted Drain maximum. It would be interesting if this relationship between maximum and minimum actually holds for all levels of dark magic skill (with other potency-enhancing factors presumably serving only to affect scale). One way to check this would be with /SCH as a subjob.

And what of the relationship between the HP value of a resist and a non-resist? I actually got 29 HP under the low-INT case, and it's difficult to describe this relationship with small samples. But small samples are enough to reach the major conclusions.
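
The conjectured minimum and the resist criterion it implies can be written down as a short sketch. The 288 and 29 figures are from the data above; the half-of-maximum rule is only the possibility raised here, not an established formula, and the function names are made up.

```python
DRAIN_MAX = 288              # observed unresisted maximum at 269 dark magic skill
DRAIN_MIN = DRAIN_MAX // 2   # conjectured unresisted minimum

def looks_resisted(drain_value, drain_min=DRAIN_MIN):
    # Under the conjecture, anything below the minimum is treated as a resist.
    return drain_value < drain_min

def fraction_of_max(drain_value, drain_max=DRAIN_MAX):
    # How a given Drain value compares to the unresisted maximum.
    return drain_value / drain_max

print(DRAIN_MIN, looks_resisted(29), round(fraction_of_max(29), 3))
```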

Conclusions

Based on the criterion that low values of Drain set "far enough" apart from the bulk of the observed data should be considered resists, additional INT appears to increase the accuracy of Drain. Ideally, the data collection should be repeated in an attempt to replicate this result.

Data collection could be performed using /SCH as a subjob. +50 INT (if it could be achieved) should still be able to manifest in the form of increased accuracy (provided INT does have an effect), and further exploration of the relationship between dark magic skill and unresisted Drain maximum could be done, along with that between (unresisted) Drain maximum and minimum.