The Unbearable Triteness of Preening: virtue weapons


Sunday, July 4, 2010

Magian weapons: multi-attack rate estimation

First, I will recap what is "known" about the "occasionally attacks twice" (OAT) rate of Magian weapons, and then present an estimation of the probability distribution of attacks for "occasionally attacks 2-3 times" (OA2-3T) Magian weapons.

Is there a universal "occasionally attacks twice" rate for Magian weapons?

Appealing to Occam's razor, one could assert that the lack of duplicate entries for "occasionally attacks twice" (OAT) in the .DATs means that all the Magian weapons with that trait have the same OAT rate. While I do not know whether that assertion is actually true, at least I can look at various pieces of published evidence to see if this notion of a universal rate has any traction.

The track record of English-language FFXI sources is dismal: there is exactly one forum post (that I am aware of) that presents any data that can be used to estimate the OAT rate. The "general consensus" is that it's 40%. (By comparison, the Joyeuse rate is 45%.) As far as I know, this is the only evidence that English-language users cite or allude to when making claims about the Magian OAT rate, which is pathetic, but in line with the natural incuriosity of the FFXI sheep.

Another data set that someone shared with me, concerning the OAT rate of a Magian great axe (Luchtaine) with a 19% base double attack rate, showed 344 double attacks out of 689 total attack rounds. Based on these counts, an interval estimate of the Magian OAT rate (given 95% confidence) is (.3357284, .4279119), which is also consistent with the idea of a 40% OAT rate. (Reasoning leading to the OAT rate estimation is similar to that for virtue weapons I discussed previously.)
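The interval above can be reproduced with a short R sketch. It assumes the same reasoning as for virtue weapons: a round consists of a single attack only when neither the double attack nor the OAT proc occurs, so the expected proportion of single-attack rounds is (1 - d)(1 - π). The function name `oat.ci` is hypothetical, and the interval is a plain normal approximation.

```r
# Sketch: estimate an OAT rate from the number of multi-attack rounds.
# Assumes P(single attack) = (1 - d) * (1 - pi), with d the double attack rate.
oat.ci <- function(multi, n, d, conf = 0.95) {
  x1 <- n - multi                         # rounds with exactly one attack
  q.hat <- x1 / n                         # observed proportion of single-attack rounds
  pi.hat <- 1 - q.hat / (1 - d)           # point estimate of the OAT rate
  se <- sqrt(q.hat * (1 - q.hat) / n) / (1 - d)
  z <- qnorm(1 - (1 - conf) / 2)
  c(estimate = pi.hat, lower = pi.hat - z * se, upper = pi.hat + z * se)
}

# Luchtaine data: 344 multi-attack rounds out of 689, 19% double attack
luchtaine <- oat.ci(multi = 344, n = 689, d = 0.19)
print(round(luchtaine, 7))
```

The lower and upper bounds should agree with the interval quoted above, with 40% falling inside it.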

Among Japanese sources, there is more data but an annoying lack of statistical consistency, if this one blog post is to be taken as a summary of all pieces of evidence regarding the Magian OAT rate. They can be grouped into two categories: evidence consistent with a 40% rate and evidence consistent with a rate higher than 40% but lower than 50% ("statistically significantly," what a gauche phrase). The foolish conclusion that the OAT rate is 43.75%, based on an idiotic pooling of the data, has no traction.

The attack distribution of "occasionally attacks 2-3 times" Magian weapons

Given the above discussion of the lack of reliable information on the Magian OAT rate, the prospect of getting reliable data concerning the attack distribution of Magian weapons that "occasionally attack 2-3 times" (OA2-3T) appears poor. In fact, there is one set of count data (source) for Magian OA2-3T hand-to-hand with MNK (whatever its actual name is for the weapon, I don't give a fuck) that can shed light on the matter, but it can only do so provided that the attack distribution associated with OA2-3T is the same for all Magian weapons and the data are actually credible. The counts are as follows (302 total):

2 attacks: 57
3 attacks: 98
4 attacks: 81
5 attacks: 48
6 attacks: 16
7 attacks: 2

In order to obtain estimates of the attack distribution probabilities for Magian OA2-3T, a probability model needs to be specified, with estimation then based on that model.

Let H denote the number of attacks in a given attack round. Let π_n denote the probability of n = 1, 2, 3 attacks by a single hand in an attack round, with π_1 + π_2 + π_3 = 1, and let k denote the probability of a kick attack in an attack round. Provided that the number of attacks of one hand, the number of attacks of the other hand, and the number of kick attacks (all in a given attack round) are mutually independent, the probability mass function of H is

\[
\begin{aligned}
P(H = 2) &= \pi_1^2 (1 - k) \\
P(H = 3) &= 2\pi_1\pi_2 (1 - k) + \pi_1^2 k \\
P(H = 4) &= (2\pi_1\pi_3 + \pi_2^2)(1 - k) + 2\pi_1\pi_2 k \\
P(H = 5) &= 2\pi_2\pi_3 (1 - k) + (2\pi_1\pi_3 + \pi_2^2) k \\
P(H = 6) &= \pi_3^2 (1 - k) + 2\pi_2\pi_3 k \\
P(H = 7) &= \pi_3^2 k
\end{aligned}
\]

and 0 otherwise.

Aside: "Why do you care about kicks?" is a valid question. The answer is that the data were collected with a parser. Just as WAR cannot have a 0% double attack rate, MNK cannot have a 0% kick attack rate, and kparser cannot make the distinction between a kick and a punch (nor should we expect that kind of distinction to be made). Surely, a person can tell the difference, but why would you expect anyone to count manually when a parser is available? The occurrence of kicks does not provide any useful information about the attack distribution of an OA2-3T weapon (but can help validate the probability model), so all kicks do is introduce undesirable variability to the proceedings, but you can't do anything about it (other than get the data using PUP).

With the above data and probability model, maximum likelihood estimation can proceed. Of immediate concern is whether to assume that the kick attack rate, given 5/5 Kick Attack merits, is actually 17.5%. (Of course, I could let the kick attack rate be yet another parameter to estimate, but estimating four parameters with a sample size of 302 is not really that helpful.) People who play monk are generally fucking retarded, but I'll just use that rate. Using numerical methods, a set of point estimates and 95% simultaneous confidence intervals (Bonferroni, too lazy to care about other methods) is generated:

         p.hat  ci.lower  ci.upper
[1,] 0.4795746 0.4160971 0.5430521
[2,] 0.3377920 0.2492500 0.4263339
[3,] 0.1826334 0.1283014 0.2369655

Assuming the 17.5% kick attack rate is valid (the weakest assumption by far in my view, to go along with all the other assumptions upon which the analysis is based), the probability distribution of attacks for Magian OA2-3T is obviously not the same as that for the likes of Ridill, Mercurial Kris, and Soboro Sukehiro. The alleged 30:50:20 ratio for 1-3 attacks obviously does not agree with the data (and the corresponding estimates). Given the data, the multi-attack probability (including 2 and 3 attacks) could be 1/2, partitioning to 3/10 for two attacks and 1/5 for three attacks.

To put it another way, the ratio of 1-3 attacks could be 50:30:20 for Magian "occasionally attacks 2-3 times" weapons (generalizing from hand-to-hand to all weapons), and that's what I'll stand by until other data persuasively rejects that working hypothesis.
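As a rough check on the two candidate ratios, the observed counts can be compared against the attack distribution each ratio implies. This is only a sketch under the same assumptions as above (independent hands, 17.5% kick attack rate); the helper name `oa23.pmf` is hypothetical.

```r
# Sketch: probabilities of 2..7 attacks per round for an OA2-3T hand-to-hand weapon,
# given per-hand probabilities p = c(p1, p2, p3) and kick attack rate k.
oa23.pmf <- function(p, k) {
  both <- c(p[1]^2, 2*p[1]*p[2], 2*p[1]*p[3] + p[2]^2, 2*p[2]*p[3], p[3]^2)
  c(both, 0) * (1 - k) + c(0, both) * k   # no kick, then kick (one extra attack)
}

counts <- c(57, 98, 81, 48, 16, 2)        # rounds with 2, 3, ..., 7 attacks
k <- 0.175                                # assumed kick attack rate

# The 7-attack cell is sparse, so chisq.test() warns about the approximation.
fit.503020 <- suppressWarnings(chisq.test(counts, p = oa23.pmf(c(.5, .3, .2), k)))
fit.305020 <- suppressWarnings(chisq.test(counts, p = oa23.pmf(c(.3, .5, .2), k)))
print(c(p.503020 = fit.503020$p.value, p.305020 = fit.305020$p.value))
```

A large p-value for 50:30:20 and a tiny one for 30:50:20 would be in line with the conclusion above, though the chi-square approximation is shaky with so few 7-attack rounds.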

Addendum: numerical maximum likelihood estimation

Analytical MLE for the above case is a complete waste of time if it is even possible, so I tapped out an R script for the purposes of numerical estimation.

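# Negative log-likelihood of the OA2-3T model.
#   p: c(p1, p2), per-hand probabilities of 1 and 2 attacks (p3 = 1 - p1 - p2)
#   X: counts of attack rounds with 2, 3, ..., 7 attacks
#   k: kick attack rate
ll <- function(p, X, k) {
  X2 <- X[1]; X3 <- X[2]; X4 <- X[3]; X5 <- X[4]; X6 <- X[5]; X7 <- X[6]
  p1 <- p[1]; p2 <- p[2]; p3 <- 1 - p1 - p2

  -(X2*log(p1*p1*(1-k)) +
    X3*log(2*p1*p2*(1-k) + p1*p1*k) +
    X4*log((2*p1*p3 + p2*p2)*(1-k) + 2*p1*p2*k) +
    X5*log(2*p2*p3*(1-k) + (2*p1*p3 + p2*p2)*k) +
    X6*log(p3*p3*(1-k) + 2*p2*p3*k) +
    X7*log(p3*p3*k))
}

counts <- c(57, 98, 81, 48, 16, 2)
est <- optim(c(.05, .05), ll, X = counts, k = .175, hessian = TRUE,
             control = list(reltol = 1E-40))
# Inverse of the Hessian of the negative log-likelihood:
# the estimated covariance matrix of (p1.hat, p2.hat).
covmat <- solve(est$hessian)
p.hat <- c(est$par, 1 - sum(est$par))

# p3.hat = 1 - p1.hat - p2.hat, so its variance is Var(p1) + Var(p2) + 2*Cov(p1, p2)
se <- c(sqrt(diag(covmat)), sqrt(sum(diag(covmat)) + 2*covmat[1,2]))
# Bonferroni-adjusted 95% simultaneous intervals for the three probabilities
ci.lower <- p.hat - qnorm(1 - .05/(2*3))*se
ci.upper <- p.hat + qnorm(1 - .05/(2*3))*se
cbind(p.hat, ci.lower, ci.upper)

# Chi-square goodness of fit of the fitted model to the observed counts
k <- .175
fitted <- c(p.hat[1]*p.hat[1], 2*p.hat[1]*p.hat[2], 2*p.hat[1]*p.hat[3] + p.hat[2]*p.hat[2],
            2*p.hat[2]*p.hat[3], p.hat[3]*p.hat[3], 0)*(1-k) +
          c(0, p.hat[1]*p.hat[1], 2*p.hat[1]*p.hat[2], 2*p.hat[1]*p.hat[3] + p.hat[2]*p.hat[2],
            2*p.hat[2]*p.hat[3], p.hat[3]*p.hat[3])*k
chisq.test(counts, p = fitted)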

Friday, June 18, 2010

Why Love Halberd is underrated... for dragoon

While I personally have yet to determine the virtue stone consumption rate for virtue weapons other than Fortitude Axe (so far, I'm assuming it's 55% across all virtue weapons given the limited evidence thus far), how exactly the normal double attack trait interacts with the virtue weapon's "occasionally attacks twice" (OAT) property seems to be described correctly. With a reasonable level of confidence, one can draw conclusions about how effective the other virtue weapons are compared to their "peers."

I can't say the likes of Hope Staff and Prudence Rod are worth discussing, but Love Halberd has some properties relevant for dragoon and samurai that seem to be misunderstood and even dismissed out of hand, the inconvenience of acquiring virtue stones notwithstanding. I go through them in order of importance and then compare Love Halberd to its competing options for DRG.

Is Love Halberd's delay undesirable?

Love Halberd has 396 delay, so with current quantities of Store TP available, it's possible and reasonable to achieve an "8-hit setup" with 23 Store TP (12.5/10.2 = 1.22549, which rounds up to 1.23).
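The Store TP arithmetic can be spelled out in a couple of lines of R. This is a minimal sketch that takes the 10.2 base TP per hit as given and ignores any in-game rounding of TP values; the variable names are only illustrative.

```r
# Sketch: Store TP needed for an n-hit setup, taking the base TP per hit as given.
base.tp <- 10.2                       # TP per hit at 396 delay (figure used above)
hits <- 8
needed <- 100 / hits                  # TP required per hit: 12.5
multiplier <- needed / base.tp        # about 1.22549
store.tp <- ceiling((multiplier - 1) * 100)
print(store.tp)
```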

People act like this is a bad thing. But so what if it takes Love Halberd 8 hits to get to 100 TP? Noting how many hits it takes to get to 100 TP is trivial and irrelevant especially because of Love Halberd's OAT property. Instead, one should ask, how many attack rounds does it take for Love Halberd to get to 100 TP, given that 8 hits are required to get there?

It may help to show a graph illustrating, for both a virtue weapon (singly wielded) and a weapon without any multi-hit property (also singly wielded) but under 9% double attack rate and 95% hit rate, the relationship between the nominal number of hits to get to 100 TP and the "actual" (in a long-run, "missing the first hit of a WS 5% of the time," weapon skill-spamming context), average number of attack rounds it takes to get to 100 TP:

First, look for the average number of attack rounds it takes for a weapon without any multi-hit property to get to 100 TP in 6 hits. On the graph, the average number of attack rounds appears to be 5, and the actual value is 4.9526 rounds. This figure is reasonable: the first hit of a WS lands 95% of the time and returns TP, so most of the time only 5 more hits are needed to get to 100 TP, and even though that first hit misses 5% of the time, the 9% double attack rate pulls the average slightly below 5.

Now, look for the average number of attack rounds it takes for a virtue weapon to get to 100 TP in 8 hits. "Wait a second," you observe, "isn't the corresponding average number of rounds below 4.9526?" In fact, on average it takes a virtue weapon only 4.7305 rounds to get to 100 TP in 8 hits, so an 8-hit virtue weapon setup ideally has a higher weapon skill frequency than a 6-hit setup with a non-multi-hit weapon.

Is the average attack round argument unconvincing? Let's instead examine the probability distributions of the number of attack rounds it takes for a virtue weapon, a weapon without a multi-hit property, and, for comparison's sake, a "Trial of the Magians" OAT weapon (for dragoon, Bradamante) to get to 100 TP:

These probability distributions were obtained via Markov chain methods.
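A minimal sketch of such a Markov chain calculation in R follows. The state is the running count of landed hits, absorbing once the required number is reached. The assumptions are mine to state loudly: the first hit of the WS lands 95% of the time and counts as one hit toward the next 100 TP, and the virtue weapon's swings per round follow the DA/OAT interaction model worked out for Fortitude Axe with a 55% OAT rate. The function names are hypothetical.

```r
# Distribution of landed hits in one round.
# swings: probabilities of 1, 2, 3, ... swings in a round; acc: hit rate.
landed.dist <- function(swings, acc) {
  out <- numeric(length(swings) + 1)            # P(0 landed), P(1 landed), ...
  for (s in seq_along(swings)) {
    out[1:(s + 1)] <- out[1:(s + 1)] + swings[s] * dbinom(0:s, s, acc)
  }
  out
}

# P(T = 1), ..., P(T = max.rounds), where T = rounds needed to land `need` hits.
rounds.dist <- function(land, need, max.rounds = 60) {
  state <- c(1, numeric(need))                  # start with 0 landed hits
  cdf <- numeric(max.rounds)
  for (r in 1:max.rounds) {
    new <- numeric(need + 1)
    for (h in 0:(need - 1)) for (j in seq_along(land) - 1) {
      to <- min(h + j, need)                    # cap at the absorbing state
      new[to + 1] <- new[to + 1] + state[h + 1] * land[j + 1]
    }
    new[need + 1] <- new[need + 1] + state[need + 1]
    state <- new
    cdf[r] <- state[need + 1]
  }
  diff(c(0, cdf))
}

# Mix over whether the first hit of the WS landed (one fewer hit needed) or missed.
ws.rounds <- function(swings, hits, acc = 0.95) {
  land <- landed.dist(swings, acc)
  pmf <- acc * rounds.dist(land, hits - 1) + (1 - acc) * rounds.dist(land, hits)
  list(pmf = pmf, mean = sum(seq_along(pmf) * pmf))
}

d <- 0.09; p <- 0.55                            # double attack rate, assumed OAT rate
plain  <- ws.rounds(c(1 - d, d), hits = 6)
virtue <- ws.rounds(c((1-d)*(1-p), (1-d)*p + d*(1-p)^2, d*p*(2-p)), hits = 8)
print(round(c(plain = plain$mean, virtue = virtue$mean), 4))
print(round(rbind(plain = plain$pmf[3:7], virtue = virtue$pmf[3:7]), 3))
```

Under these assumptions the two averages come out at about 4.95 and 4.73 rounds, in line with the figures quoted earlier. The Magian OAT case is left out because its interaction with double attack is not specified here.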

For a weapon without a multi-hit property, the probability of getting to 100 TP in 5 attack rounds is .580, and the probability for fewer than 5 attack rounds is higher than the probability for greater than 5 attack rounds, which is consistent with the average attack round figure of 4.9526.

In comparison, while the probability of getting to 100 TP in 5 attack rounds is lower for a virtue weapon (.403), the higher probability of getting to 100 TP in 4 attack rounds (.373) contributes to the average number of attack rounds to get to 100 TP being lower (4.7305).

And for the sake of comparison, it takes about 3.783 rounds on average for a Magian OAT weapon to get to 100 TP in 6 hits. Most of the time, a Magian OAT weapon takes either 3 or 4 attack rounds to get to 100 TP.

Note that for all three types of weapons, the probability that it takes 7 or more attack rounds to get to 100 TP is, at most, about .028 (for both the virtue weapon and the non-multi-hit weapon), which underscores the fact that, at least given 95% hit rate, it's not like the virtue weapon "needs" 7 or more attack rounds to get to 100 TP with any significant probability just because 8 landed hits are required to generate 100 TP.

In short, delay for virtue weapons, and the corresponding nominal number of hits it takes to get to 100 TP, is relatively unimportant because of the OAT property. In the case of the 8-hit Love Halberd setup, this property results in a lower average number of attack rounds to get to 100 TP than that for a 6-hit setup for a weapon without a multi-hit property (assuming a 55% virtue stone consumption rate).

Is the Love Halberd's base damage rating too low?

Love Halberd's 60 base damage is only 4 lower than Fortitude Axe's 64, which has 504 delay, so I'd say dragoons and samurai are relatively "spoiled" with access to a weapon with such high attack frequency and low delay.

Also, with a low base damage, the relative damage gap between Love Halberd and a higher-damage weapon decreases with additional fSTR.

Does Love Halberd's DEX +7 matter?

This is relatively unimportant, but with DEX +8 generally guaranteeing a 1% increase in critical hit rate when the target's AGI is not obscenely higher than your DEX, one can expect, effectively, a +1% critical hit bonus most of the time with DEX +7, which is not bad. DEX +7 is also a nice amount of DEX in the weapon slot that could help to ramp up one's critical hit rate if the opportunity presents itself (yeah, yeah, Greater Colibri...).

At least you can say it counters the loss of any attack (or accuracy) bonus associated with equipment for the ammo slot, Smart Grenade, Tiphia Sting, or whatever it is that DRG uses.

An additional +5 or +6 accuracy, if actually realized from the DEX bonus, is nothing to ignore, either.

Finally, a comparison of polearm options

All the features of Love Halberd described culminate such that Love Halberd is better than "conventional wisdom" allegedly holds.

Earlier, I did a write-up of how to model (approximately) the effect of Jump on damage rate as a preliminary step to doing a comparison of polearms that accounts for the increased WS frequency that Jumps provide. As usual, this comparison is done in terms of a long-run, WS-spamming, Jump-spamming situation so that one gets a decent idea of the relationship among the weapons in terms of maximum potential.

The weapons to be compared are

Valkyrie's Fork (6 hits to 100 TP)
Bradamante (with 75 base damage and 6 hits to 100 TP)
Love Halberd (8 hits to 100 TP).

Some of the conditions I specified are

fSTR 6 (+5 for Drakesbane)
42 additional WS "base" damage from the STR 50% modifier
95% hit rate
0% Zanshin rate
base double attack rate of 9%
ATK/DEF ratio of 1.5 and base critical hit rate of 9%, corresponding to an (approximate) average pDIF of 1.599 across all weapons (the critical hit rate bonus of Love Halberd treated as though it offsets the use of virtue stones at the expense of any attack bonus from the ammo slot)

Also, for Drakesbane, I am assuming a critical hit rate bonus of +10% and basing WS damage on 100 TP (ignoring excess TP effects, if they even exist). For Jumps (when accounted for), I treat the damage of Jumps as equivalent to normal hits (yet another simplification).

Let's start with a high quantity of haste, say, 64%, which accounts for Hasso (10%), double March (20%), Haste spell (15%), and haste from equipment (19%), which would relatively favor Valkyrie's Fork, a weapon with fundamentally lower WS frequency than the others, because of weapon skill delay (2 seconds).

Without accounting for the effect of Jumps, the summary of relevant numbers comes out as follows:

Weapon	Avg. TP dmg	Avg. WS dmg	Time per WS	Dmg/sec	TP:WS dmg
Valkyrie's Fork	832.01	1041.54	16.29 s	114.98	444:556
Bradamante	701.52	894.93	13.78 s	115.83	439:561
Love Halberd	793.79	789.77	13.23 s	119.61	501:499

These figures are merely a point of comparison to the more "realistic" figures that account for the effect of Jumps. But first, as an aside, I have to point out that the OAT effect of virtue weapons doesn't proc on Jumps and discuss the major implication for using Jumps with Love Halberd.

In general, Jumps can be considered an attack round that occurs "on demand." Moreover, Jumps generally delay the start of the following attack round by 2 seconds (a consequence of job ability or weapon skill delay in general), so Jumps, in effect, help to decrease the time between weapon skills except when the time between auto-attack rounds falls below 2 seconds. This is the primary effect of Jumps as slight increases in Jump damage per hit compared to auto-attack damage per hit are minor in comparison.

But since Jumps with Love Halberd are effectively attack rounds without the OAT proc, they generate less TP on average than auto-attack rounds. Therefore, there is a critical value of haste after which jumping with Love Halberd is unproductive.

Given the above conditions, Love Halberd averages about 1.579 landed hits per attack round, and "normal" jumps average exactly .95*1.09 = 1.0355 landed hits per "attack round" or 0.51775 landed hits per second (if spammed, so this is the upper limit for Jumps). It follows that it's counterproductive to jump with Love Halberd (in a long-run sense, not in a "need damage on demand" sense) when haste is above 53% (an approximate critical value). Therefore, for the following table, the effect of Jumps is considered only for Valkyrie's Fork and Bradamante:
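The critical haste value can be sketched in R as the point where auto-attack rounds land as many hits per second as spammed Jumps. The assumptions here are the DA/OAT interaction model from the Fortitude Axe discussion with a 55% OAT rate, and 60 delay corresponding to 1 second, so that 396 delay is a 6.6-second round at 0% haste.

```r
# Sketch: haste above which spamming Jumps lands fewer hits per second
# than Love Halberd's auto-attack rounds.
d <- 0.09; p <- 0.55; acc <- 0.95
swings <- c((1-d)*(1-p), (1-d)*p + d*(1-p)^2, d*p*(2-p))  # P(1, 2, 3 swings), assumed model
lh.hits <- acc * sum(1:3 * swings)       # landed hits per auto-attack round
jump.rate <- acc * (1 + d) / 2           # landed hits per second, one Jump every 2 s
round.time <- 396 / 60                   # seconds per round at 0% haste
# Auto-attack hits per second = lh.hits / (round.time * (1 - haste)); solve for equality.
crit.haste <- 1 - (lh.hits / jump.rate) / round.time
print(round(c(lh.hits = lh.hits, jump.rate = jump.rate, crit.haste = crit.haste), 4))
```

The result lands a little under 54%, consistent with the approximate 53% cutoff quoted above.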

Weapon	Avg. TP dmg	Avg. WS dmg	Time per WS	Dmg/sec	TP:WS dmg
Valkyrie's Fork	832.01	1041.54	16.00 s	117.08	444:556
Bradamante	701.52	894.93	13.51 s	118.13	439:561
Love Halberd	793.79	789.77	13.23 s	119.61	501:499

As stated previously, the primary effect of Jumps is to decrease the time per weapon skill. Given 64% haste, the effective increase in damage per second is at most around 2%. (At lower levels of haste, the contribution of Jumps to increasing the rate of damage is higher.) Even when Jumps are accounted for, Love Halberd is still slightly better than either Valkyrie's Fork or Bradamante. (The TP:WS damage ratios are my usual check on how well the calculations represent what is observed in the game, but I have no idea if these are typical ratios.)

Certainly, virtue stone consumption is a strike against Love Halberd for everyday, humdrum situations, and it's possible Bradamante can be further augmented after future updates, but can Bradamante be enhanced to the point where formerly top-end polearms (like Valkyrie's Fork) are completely outclassed after accounting for human "inefficiency"? It remains to be seen, but now let's consider the viability of these weapons in a zerg-like situation with 80% haste:

Weapon	Avg. TP dmg	Avg. WS dmg	Time per WS	Dmg/sec	TP:WS dmg
Valkyrie's Fork	832.01	1041.54	9.94 s	188.47	444:556
Bradamante	701.52	894.93	8.55 s	186.80	439:561
Love Halberd	793.79	789.77	8.24 s	192.08	501:499

As discussed in a previous post, the benefit of increasing haste is higher for weapons with lower WS frequency than weapons with higher frequency, a consequence of weapon skill delay. Unsurprisingly, Bradamante falls behind Valkyrie's Fork, yet Love Halberd still has a slight advantage over Valkyrie's Fork even at maximum haste, lending actual credence to the use of Love Halberd for high-haste zergs (and discrediting the idea of using Bradamante for such, at least when compared to Valkyrie's Fork).

Conclusions

Love Halberd's delay in conjunction with its OAT property can give it a weapon skill frequency higher than that of weapons without any multi-hit property. For example, an 8-hit Love Halberd setup has a higher WS frequency than a 6-hit setup for a polearm without any multi-hit property. This, along with its relatively high base damage (for a multi-hit weapon) and DEX +7, makes it a "peer" to the likes of Bradamante, the latest fashionable polearm. At 80% haste, Bradamante is a relatively poor weapon compared to Love Halberd.

Tuesday, May 25, 2010

What's the proc rate for virtue weapons? How do you know?

One "line" of evidence: checking with Justice Sword

Taken at face value, the estimate 555/1000 indicates the "occasionally attacks twice" (OAT) rate of Justice Sword is significantly higher than 50% and could be considered 55% (source). But is the OAT property the same for all so-called "virtue weapons"?
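For what it's worth, the claim is easy to check with an exact binomial test in R. This sketch treats the 1000 attack rounds as independent trials with a constant proc rate, which is an assumption rather than something the source data can confirm.

```r
# Sketch: is 555 procs in 1000 attack rounds compatible with a 50% or a 55% rate?
test.50 <- binom.test(555, 1000, p = 0.50)
test.55 <- binom.test(555, 1000, p = 0.55)
print(test.50$conf.int)                   # exact 95% interval for the proc rate
print(c(p.50 = test.50$p.value, p.55 = test.55$p.value))
```

A small p-value against 50% together with a large one against 55% is what "significantly higher than 50% and could be considered 55%" amounts to.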

Another line of evidence: checking with Fortitude Axe... and WAR

The rest of this post discusses how to estimate the OAT rate of Fortitude Axe in the presence of the double attack trait from WAR. But first, I needed a good idea about how Fortitude Axe OAT actually interacts with the DA trait. In the past, I blabbed a lot about how Fortitude Axe might interact with double attack, but my "conclusion" was based on very weak evidence. After collecting some more count data with kparser under 12% DA (source), which ruled out my previous weak hypotheses about the DA/OAT interaction, I got a better idea about how to explain these results (assuming kparser was working correctly...).

It appears (not exactly "proof") that OAT can proc on both the normal hit (which is guaranteed to occur for a given attack round, if not actually land) as well as the hit from the possible DA proc (with a major caveat to be discussed soon). More specifically, the hit from the DA proc occurs independently of whether an OAT proc occurs. (Not probabilistically, of course, but "mechanistically.")

It is worth noting that conceptually the order of DA and OAT could easily be reversed, such that the hit from the OAT proc occurs independently of whether a DA proc occurs, but I will just say the resulting probability calculations are not supported by the data when OAT is mechanistically independent of DA.

Anyway, one way to show pictorially all "possible" outcomes where DA and OAT can interact is with the following "tree":

There are six hypothetically "distinct" outcomes, but it is very inconvenient to monitor the equipment menu for virtue stone expenditure. More important, though, is the fact that Fortitude Axe cannot quadruple attack, so the case of expending two virtue stones is impossible. (This makes sense, noting that triple attacks are impossible with zero DA rate.)

So what "happens" to this 2-virtue stone attack round that is impossible? It appears that even if a DA proc occurs, only one virtue stone can be expended anyway, so the "tree" simplifies further:

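The resulting probability model of the number of hits in an attack round (ignoring the distinction between hits and misses) is specified as follows. Let X denote the number of hits in a given attack round, d the probability of a double attack proc, and π the probability of an OAT proc. Then,

\[
\begin{aligned}
P(X = 1) &= (1 - d)(1 - \pi) \\
P(X = 2) &= (1 - d)\pi + d(1 - \pi)^2 \\
P(X = 3) &= d\,\pi(2 - \pi)
\end{aligned}
\]

and 0 otherwise.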

Now that we have a reasonable probability model describing the interaction between double attack and the OAT property of Fortitude Axe ("reasonable" based on chi-square goodness of fit to the data given 12% DA rate and posited virtue weapon proc rates of 50% and 55%), we can now estimate the OAT rate. Proceeding with maximum likelihood estimation is not really necessary when an obvious unbiased estimator can be based off the observed number of single hits (denoted as X₁) in n attack rounds:

\[ E[X_1] = n(1 - d)(1 - \pi). \]

It follows that the unbiased estimator is

\[ \hat{\pi} = 1 - \frac{X_1}{n(1 - d)}, \]

with variance

\[ \mathrm{Var}(\hat{\pi}) = \frac{(1 - \pi)\,[1 - (1 - d)(1 - \pi)]}{n(1 - d)}. \]

Note that when d = 0, the variance reduces to that for the estimator for a simple binomial proportion (marginal in the context of the multinomial distribution). (Note to self: from simulation, this estimator is only very slightly less efficient, from an MSE standpoint, than the MLE, which I would bet is UMVUE even if an analytical expression for the MLE and the CRLB is annoying to obtain.)

The estimated OAT rate, based on 595 single-hit rounds out of 1425 attack rounds under a 12% DA rate, is 1 - 595/1425/.88 = .5255183, with corresponding 95% confidence interval (.4964218, .5546148). Given the specified probability model (which cannot be "proven" to be true at this time) and the data, it is not possible to conclude that the OAT rate of Fortitude Axe is either 50% or 55% (both are plausible given the confidence interval), unfortunately. But it should be possible to rule out one or the other with further data collection (with the hope that the probability model is correct), using the estimator specified above.
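The behavior of the single-hit estimator can be checked by simulation. The sketch below assumes hit-count probabilities of (1 - d)(1 - π) for one hit, (1 - d)π + d(1 - π)² for two, and dπ(2 - π) for three, with a true OAT rate of 55% picked purely for illustration; only the single-hit count actually enters the estimator.

```r
# Sketch: simulate the single-hit estimator under the assumed DA/OAT model.
set.seed(1)
d <- 0.12; p <- 0.55; n <- 1425; reps <- 20000
probs <- c((1-d)*(1-p), (1-d)*p + d*(1-p)^2, d*p*(2-p))  # P(1, 2, 3 hits per round)
sims <- rmultinom(reps, size = n, prob = probs)          # 3 x reps matrix of counts
pi.hat <- 1 - sims[1, ] / (n * (1 - d))                  # estimator from single-hit counts
theory.var <- (1 - p) * (1 - (1 - d) * (1 - p)) / (n * (1 - d))
print(c(mean = mean(pi.hat), var = var(pi.hat), theory.var = theory.var))
```

The simulated mean should sit very close to the true rate and the simulated variance close to the theoretical one, which is all that unbiasedness and the variance expression claim.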

A third way: Faith Baghnakhs

Among all virtue weapons, it would be fastest to determine the OAT rate of Faith Baghnakhs by counting the number of triple attacks and quadruple attacks. It would be easier to do this on ninja because you wouldn't have to pay attention to kick attacks (because you want to use a parser instead of counting manually). If the OAT rate for Faith Baghnakhs can be shown to be 55%, that, along with the observed proc rate for Justice Sword, could be used as evidence for a common OAT rate of 55% across all virtue weapons.