Tuesday, May 25, 2010

What's the proc rate for virtue weapons? How do you know?

One "line" of evidence: checking with Justice Sword

Taken at face value, the estimate 555/1000 indicates the "occasionally attacks twice" (OAT) rate of Justice Sword is significantly higher than 50% and could be considered 55% (source). But is the OAT property the same for all so-called "virtue weapons"?
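As a rough check on the "significantly higher than 50%" claim, here is a minimal sketch of a normal-approximation (Wald) interval for the 555/1000 count. The function name `proportion_ci` and the choice of a Wald interval are my own assumptions; the linked source may have used a different interval.

```python
import math

def proportion_ci(successes, trials, z=1.96):
    """Wald (normal-approximation) interval for a binomial proportion."""
    p_hat = successes / trials
    se = math.sqrt(p_hat * (1 - p_hat) / trials)
    return p_hat, (p_hat - z * se, p_hat + z * se)

# 555 double-attack rounds observed in 1000 attack rounds with Justice Sword.
p_hat, (lo, hi) = proportion_ci(555, 1000)
print(f"estimate {p_hat:.3f}, approx. 95% CI ({lo:.3f}, {hi:.3f})")
```

If the lower limit sits above .50 while .55 stays inside the interval, the data are at odds with a 50% rate but compatible with 55%.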

Another line of evidence: checking with Fortitude Axe... and WAR

The rest of this post discusses how to estimate the OAT rate of Fortitude Axe in the presence of the double attack trait from WAR. But first, I needed a good idea about how Fortitude Axe OAT actually interacts with the DA trait. In the past, I blabbed a lot about how Fortitude Axe might interact with double attack, but my "conclusion" was based on very weak evidence. After collecting some more count data with kparser under 12% DA (source), which ruled out my previous weak hypotheses about the DA/OAT interaction, I got a better idea about how to explain these results (assuming kparser was working correctly...).

It appears (not exactly "proof") that OAT can process on both the normal hit (which is guaranteed to occur for a given attack round, if not actually land) as well as the hit from the possible DA proc (with a major caveat to be discussed soon). More specifically, the hit from the DA proc occurs independently of whether an OAT proc occurs. (Not probabilistically, of course, but "mechanistically.")

Conceptually, the order of DA and OAT could easily be reversed, so that the hit from the OAT proc occurs independently of whether a DA proc occurs. I will just say that the probability calculations resulting from that reversed ordering (OAT mechanistically independent of DA) are not supported by the data.

Anyway, one way to show pictorially all "possible" outcomes where DA and OAT can interact is with the following "tree":


There are six hypothetically "distinct" outcomes, but it is very inconvenient to monitor the equipment menu for virtue stone expenditure. More important, though, is the fact that Fortitude Axe cannot quadruple attack, so the case of expending two virtue stones is impossible. (This makes sense, noting that triple attacks are impossible with zero DA rate.)


So what "happens" to this 2-virtue stone attack round that is impossible? It appears that even if a DA proc occurs, only one virtue stone can be expended anyway, so the "tree" simplifies further:


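The resulting probability model of the number of hits in an attack round (ignoring the distinction between hits and misses) is specified as follows. Let X denote the number of hits in a given attack round, d the probability of a double attack proc, and π the probability of an OAT proc. Then a single hit requires that neither a DA proc nor an OAT proc occurs,

$$P(X = 1) = (1 - d)(1 - \pi),$$

with the remaining probability split between X = 2 and X = 3 (a quadruple attack being impossible).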

Now that we have a reasonable probability model describing the interaction between double attack and the OAT property of Fortitude Axe ("reasonable" based on chi-square goodness of fit to the data given 12% DA rate and posited virtue weapon proc rates of 50% and 55%), we can now estimate the OAT rate. Proceeding with maximum likelihood estimation is not really necessary when an obvious unbiased estimator can be based on the observed number of single hits (denoted as X1) in n attack rounds, whose expected value is

$$E[X_1] = n(1 - d)(1 - \pi).$$

It follows that the unbiased estimator is

$$\hat{\pi} = 1 - \frac{X_1}{n(1 - d)}$$

with variance

$$\mathrm{Var}(\hat{\pi}) = \frac{(1 - \pi)\,[1 - (1 - d)(1 - \pi)]}{n(1 - d)}.$$

Note that when d = 0, the variance reduces to that for the estimator for a simple binomial proportion (marginal in the context of the multinomial distribution). (Note to self: from simulation, this estimator is only very slightly less efficient, from an MSE standpoint, than the MLE, which I would bet is UMVUE even if an analytical expression for the MLE and the CRLB is annoying to obtain.)

The estimated OAT rate is 1 - 595/1425/.88 = .5255183 (595 single-hit rounds out of 1425 attack rounds, with d = .12), with corresponding 95% confidence interval (.4964218, .5546148). Given the specified probability model (which cannot be "proven" to be true at this time) and the data, it is unfortunately not possible to conclude that the OAT rate of Fortitude Axe is either 50% or 55% (both are plausible given the confidence interval). But it should be possible to rule out one or the other with further data collection (with the hope that the probability model is correct), using the estimator specified above.
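The arithmetic above can be reproduced with a short sketch. It assumes the single-hit probability is (1 - d)(1 - π), plugs the observed single-hit proportion into the variance, and uses a normal-approximation interval; the function name `oat_rate_estimate` is hypothetical.

```python
import math
from statistics import NormalDist

def oat_rate_estimate(single_hit_rounds, rounds, da_rate, level=0.95):
    """Estimate the OAT rate from the count of single-hit attack rounds.

    Assumes P(single hit) = (1 - da_rate) * (1 - oat_rate).
    """
    p1 = single_hit_rounds / rounds            # observed single-hit proportion
    pi_hat = 1 - p1 / (1 - da_rate)            # unbiased estimator of the OAT rate
    # Plug-in standard error: sd of the binomial proportion, rescaled by 1/(1 - d).
    se = math.sqrt(p1 * (1 - p1) / rounds) / (1 - da_rate)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return pi_hat, (pi_hat - z * se, pi_hat + z * se)

pi_hat, (lo, hi) = oat_rate_estimate(595, 1425, 0.12)
print(f"{pi_hat:.7f} ({lo:.7f}, {hi:.7f})")
```

Both .50 and .55 fall inside the resulting interval, which is why neither rate can be ruled out yet.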

A third way: Faith Baghnakhs

Among all virtue weapons, it would be fastest to determine the OAT rate of Faith Baghnakhs by counting the number of triple attacks and quadruple attacks. It would be easier to do this on ninja because you wouldn't have to pay attention to kick attacks (because you want to use a parser instead of counting manually). If the OAT rate for Faith Baghnakhs can be shown to be 55%, that, along with the observed proc rate for Justice Sword, could be used as evidence for a common OAT rate of 55% across all virtue weapons.

Monday, May 17, 2010

A hierarchy of great axes?

This is a rehash of a previous post comparing Bonesplitter and the good Luchtaine, two "Magian" great axes, to that old standby Perdu Voulge and to Fortitude Axe, the presumptive weapon of choice for Campaign (even though Waltz recast ends up being the rate-limiting factor for curing yourself). New evidence, both for Fortitude Axe (see first relevant BG post and second relevant BG post for details that I won't go over here) and for Luchtaine (to be discussed later, perhaps), shows that I underrated Fortitude slightly and overrated Luchtaine significantly.

In particular, evidence indicates Luchtaine behaves similarly to Joyeuse such that regular DA and Magian OAT are "directionally" exclusive, which is different from mutually exclusive. Suppose that the DA rate were 20% and the OAT rate 40%. Then mutually exclusive would mean P(OAT) = .40, P(DA) = .20, and P(OAT and DA) = 0. On the other hand, directionally exclusive would mean that either P(OAT | not DA) = .40 and P(DA) = .20, or P(DA | not OAT) = .20 and P(OAT) = .40. Consequently, given a 20% DA rate and a 40% OAT rate, the effective DA rate would be .20 + .80*.40 = .40 + .60*.20 = .52.
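To make the distinction concrete, here is a small sketch of the two calculations. The function names are hypothetical; the only inputs are the 20% DA and 40% OAT rates from the example above.

```python
def directionally_exclusive_rate(da, oat):
    """Chance of an extra attack when the second proc is checked only if the first fails.

    Returns the result for both orderings (DA checked first, OAT checked first).
    """
    da_first = da + (1 - da) * oat
    oat_first = oat + (1 - oat) * da
    return da_first, oat_first

def mutually_exclusive_rate(da, oat):
    """Chance of an extra attack when DA and OAT can never occur together."""
    return da + oat

print(directionally_exclusive_rate(0.20, 0.40))
print(mutually_exclusive_rate(0.20, 0.40))
```

Either ordering of the directional case gives the same overall rate, which is why count data on the number of hits per round alone cannot tell the two orderings apart.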

Also, I decided to repeat the previous analysis using Raging Rush. Even if Raging Rush's three base hits (in other words, those not arising from double attack) are the only ones that have a chance to be critical hits, it's still generally better than King's Justice. One consequence: because RR's STR modifier is lower than KJ's, the relative difference in damage between a Perdu RR and Fortitude RR is more than that between a Perdu KJ and Fortitude KJ, so the relative difference between Perdu and Fortitude "overall" would be less with RR than KJ "all other things being equal."

I find it is worth including Rune Chopper in the discussion, too, along with Hephaestus with STR +4 and attack +15, as a basis for my pontificating about what kind of effort is warranted to get "good enough" (not the best). Since I normally have 19% haste, I will use that as the haste baseline before Rune Chopper, so the haste bonus of Rune Chopper is not fully realized. On the other hand, I will also consider having Rune Chopper with only 1 MP refresh such that the latent is active one out of every two rounds (as it appears to be anyway). I will also consider the situation of having a "typical" double March (~20% haste with March +2 instrument and 8/8 merits in both wind and singing skill), Haste spell (~15%), and Hasso (~10%).

I will also account for the time delay between the initiation of a weapon skill and the start of the next attack round, as it apparently is fundamental to the game and not associated with human reaction time or laziness (not that I really noticed or cared), kind of like how the delay associated with Curing Waltz screws up Drain Samba actually working properly (something that is easy to notice and that I find very annoying). This could also be considered the time that must elapse after executing a weapon skill before the following auto-attack round starts, or "WS delay" for short. It's worth considering because this delay is unavoidable, but since I don't know what the so-called WS delay actually is, I will do this comparison for 0, 1, 2, 3, and 4 second delays.

Finally, I find it really unnecessary to go into excruciating detail about what goes in the calculations, so I will just report something I call "relative efficiency" ratios relative to the baseline of Perdu Voulge, which are merely ratios of damage rates. In the end, one should focus only on the gross differences, rather than whether something is really 2.15% more as opposed to 2.2% more, for example.

Relative efficiency of great axes relative to Perdu Voulge (in terms of damage rate)

Weapon                                | No WS delay | 1s delay | 2s delay | 3s delay | 4s delay
Rune Chopper (latent active always)   | 1.097       | 1.082    | 1.070    | 1.059    | 1.049
Fortitude Axe                         | 1.037       | 1.008    | 0.984    | 0.964    | 0.947
Hephaestus (6 hits to 100 TP)         | 1.032       | 1.029    | 1.026    | 1.023    | 1.021
Bonesplitter                          | 1.017       | 1.017    | 1.017    | 1.017    | 1.017
Perdu Voulge                          | 1           | 1        | 1        | 1        | 1
Rune Chopper (latent 1/2 active)      | 0.997       | 0.991    | 0.986    | 0.981    | 0.977
Luchtaine                             | 0.985       | 0.972    | 0.961    | 0.952    | 0.944
Hephaestus (7 hits to 100 TP)         | 0.953       | 0.961    | 0.968    | 0.974    | 0.980

Again, these ratios are based on 19% equipment haste before Rune Chopper, along with double March (~20% march), Hasso, and Haste spell.

The ratios under the hypothetical situation with zero WS delay can represent the "intrinsic" relative efficiency of weapons that have a higher WS frequency than Perdu Voulge (notice that Bonesplitter has the same relative efficiency regardless of WS delay because it has the same WS frequency as Perdu), but intrinsic doesn't mean actual or true. The higher the WS delay, the more disproportionately affected are weapons with higher WS frequency compared to Perdu.
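The effect of WS delay on these ratios can be illustrated with a toy model. This is only a sketch under my own assumptions: one "cycle" is a fixed number of auto-attack rounds followed by one weapon skill, and the WS delay is simply added to the cycle time. Every number in it is made up for illustration and is not game data or the calculation behind the table.

```python
def damage_rate(round_damage, rounds_per_ws, ws_damage, round_time, ws_delay):
    """Average damage per second over one auto-attack-plus-WS cycle (toy model)."""
    cycle_damage = rounds_per_ws * round_damage + ws_damage
    cycle_time = rounds_per_ws * round_time + ws_delay
    return cycle_damage / cycle_time

# Hypothetical weapons; none of these figures come from the game.
BASELINE = dict(round_damage=100.0, rounds_per_ws=5, ws_damage=600.0, round_time=2.0)
SAME_FREQ = dict(round_damage=101.7, rounds_per_ws=5, ws_damage=610.2, round_time=2.0)
FAST_WS = dict(round_damage=90.0, rounds_per_ws=4, ws_damage=560.0, round_time=2.0)

def relative_efficiency(weapon, ws_delay, baseline=BASELINE):
    """Ratio of damage rates against the baseline at the same WS delay."""
    return (damage_rate(**weapon, ws_delay=ws_delay)
            / damage_rate(**baseline, ws_delay=ws_delay))

for delay in range(5):
    print(delay,
          round(relative_efficiency(SAME_FREQ, delay), 3),
          round(relative_efficiency(FAST_WS, delay), 3))
```

In this toy model the weapon with the same WS frequency keeps a constant ratio at every delay, while the weapon that reaches its weapon skill in fewer rounds starts ahead and falls behind as the delay grows, which is the qualitative pattern in the table.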

(Note that the concept of WS delay can be generalized to job abilities that interrupt or postpone attack rounds, but I did not account for that here.)

Implications for Fortitude Axe: wonder why Fortitude Axe doesn't actually appear to be better than Perdu Voulge in practice? WS delay could explain it. In particular, if you plan to use Fortitude Axe in a maximum haste situation (~80% haste), spamming weapon skills might be relatively counter-productive (for Fortitude compared to Perdu) because of WS delay, but WS frequency is pretty much the only benefit of using Fortitude Axe (aside from TP gain without using WS), so why not just use a high-damage great axe (Perdu or even Berserker's Axe)? Using Fortitude Axe for a zerg basically means having hope that you get more hits per round in a small time frame compared to the long-run average, e.g., stringing together several 3-attack rounds. Having 80% haste is usually the decisive factor in a max-haste zerg because you probably have max attack and accuracy as well. Maybe if you had a BLM land Choke and got some STR etudes... things you could do to compensate for the low base damage.

Implications for Rune Chopper: On the other hand, Rune Chopper with latent always active (if you somehow manage to achieve this; not a trivial thing) is still substantially better than Perdu even with significant WS delay (4 seconds), and this is under the situation where the 9% haste bonus isn't fully realized (it could be if you switched out other haste equipment to increase other damage-related factors), albeit under the double March/Hasso/Haste spell situation. On the other hand, Rune Chopper with only 1 MP refresh is rather pointless. If you had an Ares Cuirass lying around and RDM accommodating you, it would be good.

Implications for Luchtaine and other Magian great axes: SE really needs to allow Luchtaine to attack 3 times or even 4 times in the future or increase the base damage dramatically. At least there is hope that SE might do this later, whereas with Fortitude Axe, SE will never allow 4 attacks per round. You don't get much out of the others (as the final forms currently are) considering the time investment required, compared to spending IS on a Perdu Voulge. Hephaestus 6-hit is not terribly reasonable because of the 29 store TP requirement alone.

Didn't I say something about a hierarchy? Stick with Perdu Voulge in general...

Tuesday, May 11, 2010

How to check if a Teiwaz has superior accuracy to Terra's Staff

Obviously, I am talking about any Teiwaz with elemental affinity: magic accuracy +3, not specifically the earth-aligned Teiwaz.

It is thought that earth affinity: magic accuracy +1 is equivalent to the accuracy bonus of Earth Staff, and earth affinity: magic accuracy +2 equivalent to the accuracy bonus of Terra's Staff. If you ever read this blog, you would know that elemental NQ staves are considered to have +20 magic accuracy for the specified element, and HQ staves, +30 magic accuracy. It is postulated that elemental affinity: magic accuracy +3 corresponds to +40 magic accuracy.

Without discussing the evidence underlying the following experiment to check whether the earth Teiwaz is superior to Terra's Staff, I will describe a superiority "trial" involving relatively few casts.

Location: Alzadaal Undersea Ruins (Nyzul Isle Staging Point)
Target monster: Level 78 Qiqirn Poulterer (ranger)
Spell to cast: Stone (I)
How many casts: 100.
What to count: number of non-resisted Stone I, number of half-resisted Stone I, number of quarter-resisted Stone I, number of eighth-resisted Stone I (should be easy to identify from the magnitude of damage)

How to identify the level 78 Qiqirn Poulterer: one way to check you have found the correct level Qiqirn is to set your accuracy score to 263 and use the "check" function to find the right Qiqirn. One way to achieve this is to equip a weapon type for which you have 230 combat skill (example: BLM with max club skill). 230 combat skill corresponds to 227 accuracy. Suppose you also have 62 DEX. For a one-handed weapon (club), this means you have +31 accuracy. Then equip +5 accuracy worth of equipment (example: Chivalrous Chain) to achieve a total accuracy score of 263. Level 77 and 76 Poulterers will check "low evasion," while the level 78 Poulterer will give no evasion message. Incidentally, this implies the level 78 Poulterer has at least 293 total evasion. You can confirm the level after killing the Poulterer by noting EXP yield (200 base EXP for level 78, 230 given 15% Sanction bonus).
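The accuracy arithmetic in that paragraph can be sketched as follows. The conversion rules are assumptions chosen to reproduce the figures in this post (230 skill giving 227 accuracy, and 62 DEX giving +31 on a one-handed weapon): one accuracy per skill point up to 200, 0.9 per point above that, and half a point of accuracy per DEX. The function names are hypothetical.

```python
def accuracy_from_skill(skill):
    """Assumed rule: 1 accuracy per skill point up to 200, then 0.9 per point."""
    if skill <= 200:
        return skill
    return 200 + (skill - 200) * 9 // 10   # integer math avoids float rounding

def accuracy_score(skill, dex, gear_accuracy):
    """Total accuracy for a one-handed weapon, assuming DEX counts for half."""
    return accuracy_from_skill(skill) + dex // 2 + gear_accuracy

# BLM with 230 club skill, 62 DEX, and +5 accuracy from equipment.
print(accuracy_score(skill=230, dex=62, gear_accuracy=5))
```

If your skill or DEX differs, adjust the equipment bonus until the total comes out to 263 before checking the Qiqirn.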

Total magic accuracy for this experiment: it is known reasonably well (I will not cite evidence at this time) that having 65 INT, 290 elemental magic skill, +5 magic accuracy (from equipment), and no elemental staff corresponds to having about 55% magic accuracy rate for the Stone I spell. (The level 78 Qiqirn Poulterer has 65 INT.) If your elemental magic skill is higher or lower (say 292), make the appropriate adjustments to INT and/or magic accuracy. Here, +/-1 INT is considered +/-1% magic accuracy rate (up to a point), and +/-1 magic accuracy (from equipment) is considered +/-1% magic accuracy rate, too.

Given the above, equipping a HQ staff like Terra's brings your magic accuracy up to ~85%. A "quickie" trial I ran gave 57/71 non-resisted Stone I, strong evidence of uncapped magic accuracy rate. As postulated previously, equipping a Teiwaz with earth affinity: magic accuracy +3 could bring your magic accuracy up to ~95% (the maximum rate).
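As a quick sketch of the rule of thumb used here, the following treats each point of staff magic accuracy as one percentage point on top of the ~55% baseline and applies the 95% maximum. The function name and the integer-percent representation are my own; the +40 for the Teiwaz is the postulated value, not a confirmed one.

```python
def magic_hit_rate_percent(base_percent, staff_bonus=0, cap_percent=95):
    """Magic accuracy rate in whole percent, capped at the maximum rate."""
    return min(base_percent + staff_bonus, cap_percent)

BASE = 55  # ~55% with 65 INT, 290 skill, +5 magic accuracy, no staff

print(magic_hit_rate_percent(BASE))                   # no elemental staff
print(magic_hit_rate_percent(BASE, staff_bonus=30))   # HQ staff (Terra's)
print(magic_hit_rate_percent(BASE, staff_bonus=40))   # postulated Teiwaz +3
```

Because +40 lands exactly on the cap under these conditions, a bonus larger than +40 would produce the same rate.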

Why 100 casts of Stone I on a level 78 Qiqirn Poulterer? 100 is an arbitrary figure as I am too lazy to do a power calculation, but since we know a priori that Terra's Staff doesn't even give a capped magic accuracy rate (and it shouldn't since I said it would be ~85%), it will be very easy to show that, if the Teiwaz earth affinity: magic accuracy +3 really has +40 magic accuracy, the observed data will indicate a capped magic accuracy rate. Of course, if the accuracy bonus were higher than +40, this test wouldn't be able to show that, but the in-game constraints described by the current magic accuracy "model" and a desire for a "minimal" sample size (the further away from 50%, the smaller the standard error) led to the above experimental conditions.
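In place of a proper power calculation, here is a minimal sketch of how well 100 casts separate an 85% rate from a 95% rate, using exact binomial tail probabilities. The cutoff of 92 unresisted casts is an arbitrary illustration, not a recommended decision rule, and the function names are hypothetical.

```python
import math

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def prob_at_least(k, n, p):
    """Probability of k or more successes in n independent trials."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

def standard_error(p, n):
    """Standard error of an observed proportion with true rate p."""
    return math.sqrt(p * (1 - p) / n)

CASTS = 100
CUTOFF = 92  # illustrative threshold for "looks capped"

for rate in (0.85, 0.95):
    print(rate,
          round(standard_error(rate, CASTS), 4),
          round(prob_at_least(CUTOFF, CASTS, rate), 4))
```

The standard error at 95% is smaller than at 85%, which is the "further away from 50%" point made above, and a count at or above the cutoff is far more likely under the capped rate than under the Terra's Staff rate.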

Considerations: avoid Qiqirn Goldsmith links. To minimize damage from ranged attacks, it is preferable to be RDM. If not, have someone spam heal you while you hammer out 100 casts in short order. It really doesn't take that long.

Questions and desired clarifications about experimental conditions may be fielded in the comments, if anyone actually gives a shit.

Credit to pchan on BG for previous work on Qiqirn Poulterers that allows for a fairly straightforward and not-onerous experiment.