The Unbearable Triteness of Preening: critical hit rate


Saturday, July 24, 2010

The "new" melee pDIF

This post is definitely not for those who neither understand nor care about what melee "pDIF" is all about and why it can be of interest, so I find no point in making some sort of "for dummies" kind of introduction and will just jump into the results.

First, a reference to the "new" melee pDIF should be seen as a sarcastic gesture, as there have likely been no wholesale changes to pDIF after the August 2007 version update that brought the gameplay-altering "two-handed weapon adjustment." Therefore, the following results are assumed to reflect the actual changes to pDIF made in August 2007.

Data and results

The guy who plays Masamunai (currently of Cerberus) provided this spreadsheet of data, having tabulated the observed damage values for various ratios of attack to defense (without level correction), using both one-handed and two-handed weapons, on level 63-65 Lesser Colibri and then "standardizing" them to approximate observed pDIF values (acknowledging estimation error associated with in-game truncation of values). There are more details concerning the raw data and he provided his own analysis, but I prefer to do my own analysis so you don't necessarily have to review the spreadsheet yourself.

The following is an image attempting to plot 67,123 of the observed pDIF data values (almost all of the data) to show primarily how the minimum, maximum, and (most important to me) mean pDIF for both critical and non-critical ("normal") hits vary with the ratio of attack to defense:

It is somewhat difficult to plot 67,123 data values cleanly and elegantly with limited resolution, so I exploited transparency of data points, resulting in narrow "bands" that vary in opacity from top to bottom, an attempt to illustrate roughly the relative "density" of observed values. Each band represents the entirety of the data collected for a given attack/defense ratio. Another interpretation is that each band represents the observed conditional distribution of pDIF for a given attack/defense ratio.

The bands for critical pDIF are generally less "dense" or less opaque than those for normal pDIF, reflecting the fact that there are many more data points for normal pDIF (55,956 versus 11,127). Also, the bands are generally most translucent at the endpoints, reflecting the fact that the observed data at the extremes of each conditional pDIF distribution (for a given attack/defense ratio) occur relatively less frequently, which is consistent with the idea that pDIF is now a function of two uniform random variables (either the sum or the product), which follows a trapezoidal(-like) distribution. (But I will not be discussing probability distributions today.)

Aside from the plotting of the data values, regression lines for the mean pDIF (controlling for attack/defense ratio) were also plotted (lines based on ordinary least squares, which is justifiable as there are a lot of data points involved for each level of attack/defense considered). Regression was done in an informal piecewise fashion, as there are specific ranges of attack/defense ratio where the variance of pDIF is obviously not constant, specifically for three cases:

where there is a critical pDIF upper limit imposed (3.15 when attack/defense is approximately greater than 1.65)
where there is a normal pDIF lower limit imposed (1.00 when attack/defense is between 1.25 and 1.5), and
where the mode of normal pDIF is 1.00 and the mode does not occur at the left endpoint of the pDIF distribution (when attack/defense is less than 1.25). It should be noticed that it is impossible to discern the mode of pDIF (conditional on a given attack/defense ratio) based on the above graph. One would have to consult the original source as cited above.

I hope that will suffice as an explanation for the elements of the graph.

Interpretations and conclusions

These are a few of the things one could take away from the graph above.

Aside from the maximum attack/defense ratio attainable, there appear to be no differences in pDIF between one-handed weapons and two-handed weapons. I have incorrectly thought otherwise in the past, but I assumed people who cared about this knew what they were talking about. Obviously not.

While there is no data for two-handed weapons below 1.398 attack/defense ratio, I would invoke model parsimony and assert there is no good reason to expect differences at lower values of attack/defense. Although it is not shown above (and cannot be shown above cleanly), 2.00 is the maximum attack/defense ratio for one-handed weapons, and 2.25 is the maximum attack/defense ratio for two-handed weapons. Support for these maxima can be found in the spreadsheet.

The ceiling on critical hit pDIF first occurs near 1.65 attack/defense. Moreover, the value of the ceiling, 3.15, is the modal (most frequently occurring) pDIF for attack/defense ratios above 1.65.

Mean pDIF, as a function of attack/defense, does NOT increase at the same rate for critical hits as for normal hits for a given value of attack/defense. A consequence of this is there is no pat way to relate normal pDIF to critical pDIF, like critical pDIF = normal pDIF + 1. To see what I mean, refer to this blog entry (JP), particularly the first image, to get a sense of how pDIF was incorrectly perceived more than a year after the August 2007 version update (a mish-mash of the critical hit pDIF ceiling of 3.15, increased attack/defense ratio maximum, and the old pDIF model).

Irrelevant considerations

This is a matter of personal preference, but I consider the so-called "secondary randomizer" an irrelevant red herring. pDIF as a product of two uniform random variables or sum of two uniform random variables, so what? (But I will note that the slope of mean pDIF without the second random factor does not change if the factor is added and does change if the factor is multiplied.) I just know it's there and I can explain what can cause it, but it is not very important for estimating mean pDIF, which is why I even made this post in the first place.

I also do not care about exactness of any pDIF model. Approximately true is fine with me as far as modeling rates of damage is concerned. (There are other factors that, when completely ignored or incorrectly computed, cause much more error than mere sampling error based on 60,000+ samples.)

Formulas for mean pDIF as a function of attack/defense ratio and whether the weapon is one-handed or two-handed

These formulas are based on the regression estimates. (You may have noticed discontinuities in the piecewise mean pDIF functions suggested in the graph, but I do not care that much about fudging the estimates to eliminate that.) For normal-hit pDIF, which I have denoted as M_Normal, the estimated mean of M_Normal, as a function of H, the number of hands required to wield a weapon (H = 1, 2), and R, the attack/defense ratio (level-corrected or otherwise), is

For critical-hit pDIF, the functional relationship between that and H and R is
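Reading the coefficients off the regression output reported below and rounding to three decimals, the two fitted mean functions can be written roughly as follows. The symbol $M_{\text{Critical}}$ and the capped ratio $R^{*}$ are notation assumed here for illustration. The cap encodes the maximum ratios of 2.00 (H = 1) and 2.25 (H = 2) mentioned earlier, and which piece applies exactly at a breakpoint is also an assumption.

```latex
R^{*} = \min(R,\; 1.75 + 0.25\,H)

\widehat{E}\left[M_{\text{Normal}} \mid H, R\right] =
\begin{cases}
 0.225 + 0.783\,R^{*}, & R^{*} < 1.25 \\
 0.073 + 0.902\,R^{*}, & 1.25 \le R^{*} < 1.5 \\
-0.294 + 1.162\,R^{*}, & R^{*} \ge 1.5
\end{cases}

\widehat{E}\left[M_{\text{Critical}} \mid H, R\right] =
\begin{cases}
 0.941 + 1.073\,R^{*}, & R^{*} < 1.65 \\
 1.597 + 0.686\,R^{*}, & R^{*} \ge 1.65
\end{cases}
```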

The following is the output of the regression procedure. I only include this to show that there is no reason to expect that the coefficient of determination be high, mainly because there is inherent variability of pDIF.

***Regression for normal pDIF, ATK/DEF < 1.25****

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 0.225403   0.010107    22.3   <2e-16 ***
ratio       0.782699   0.009748    80.3   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.1621 on 12257 degrees of freedom
Multiple R-squared: 0.3447,     Adjusted R-squared: 0.3446 
F-statistic:  6447 on 1 and 12257 DF,  p-value: < 2.2e-16 



***Regression for normal pDIF, 1.25 < ATK/DEF < 1.5****

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.07274    0.04129   1.762   0.0781 .  
ratio        0.90232    0.02969  30.390   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.2369 on 14254 degrees of freedom
Multiple R-squared: 0.06085,    Adjusted R-squared: 0.06078 
F-statistic: 923.6 on 1 and 14254 DF,  p-value: < 2.2e-16 



***Regression for normal pDIF, ATK/DEF > 1.5****

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.294339   0.016866  -17.45   <2e-16 ***
ratio        1.162306   0.009566  121.50   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.2518 on 29479 degrees of freedom
Multiple R-squared: 0.3337,     Adjusted R-squared: 0.3336 
F-statistic: 1.476e+04 on 1 and 29479 DF,  p-value: < 2.2e-16



***Regression for critical pDIF, ATK/DEF < 1.65****

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.94139    0.01739   54.14   <2e-16 ***
ratio2       1.07335    0.01273   84.29   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.232 on 6745 degrees of freedom
Multiple R-squared: 0.513,      Adjusted R-squared: 0.5129 
F-statistic:  7106 on 1 and 6745 DF,  p-value: < 2.2e-16 



***Regression for critical pDIF, ATK/DEF > 1.65****

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.59673    0.04287   37.25   <2e-16 ***
highratio    0.68607    0.02344   29.27   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.1905 on 4377 degrees of freedom
Multiple R-squared: 0.1637,     Adjusted R-squared: 0.1635 
F-statistic: 856.8 on 1 and 4377 DF,  p-value: < 2.2e-16
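
For anyone who wants to plug numbers in, the fits above can be wrapped in a couple of R functions. This is only a sketch: the function names, the `max_ratio` helper, and the handling of values sitting exactly on a breakpoint are my own assumptions, while the coefficients are copied from the output above and the caps (2.00 one-handed, 2.25 two-handed) come from the discussion earlier in the post.

```r
# Hypothetical helper: maximum attack/defense ratio by number of hands
max_ratio <- function(hands) ifelse(hands == 2, 2.25, 2.00)

# Estimated mean normal-hit pDIF (piecewise OLS fits from the output above)
mean_pdif_normal <- function(ratio, hands = 1) {
  r <- pmin(ratio, max_ratio(hands))          # apply the ratio cap
  ifelse(r < 1.25, 0.225403 + 0.782699 * r,
  ifelse(r < 1.5,  0.07274  + 0.90232  * r,
                  -0.294339 + 1.162306 * r))
}

# Estimated mean critical-hit pDIF (piecewise OLS fits from the output above)
mean_pdif_critical <- function(ratio, hands = 1) {
  r <- pmin(ratio, max_ratio(hands))          # apply the ratio cap
  ifelse(r < 1.65, 0.94139 + 1.07335 * r,
                   1.59673 + 0.68607 * r)
}

# Gap between critical and normal mean pDIF at a low and a high ratio
gap <- mean_pdif_critical(c(1, 2)) - mean_pdif_normal(c(1, 2))
print(gap)
```

The printed gap is not the same at both ratios, which is the point made earlier about there being no pat "critical pDIF = normal pDIF + 1" rule.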

Tuesday, June 22, 2010

Critical hit rate bonus of Fencer

(Edit: now with information on Fencer with dual wield.)

Fencer is a new job trait from the June 21, 2010 version update that is available to the Warrior job at level 45 and the Beastmaster job at level 80. It has the following help description: "Increases rate of critical hits when wielding with the main hand only. Grants a TP bonus to weapon skills." The critical hit rate bonus was estimated using the following procedure.

Methods (brief)

An estimate of the critical hit rate bonus was obtained by auto-attacking overnight a level 69 Ul'hpemde, which has AGI 65 (source). WAR75/MNK01 was used. The following equipment was used to obtain STR 57, DEX 64, and an accuracy score of 276:

Trainee Knife (240 dagger skill)
Walahra Turban
Dusk Gloves
Snow Ring (STR -2)
Swift Belt (Accuracy +3)
Aurum Sabatons (DEX +3, accuracy +5)

STR 57 ensures 0 damage to any Ul'hpemde, and DEX 64 ensures (with 4/4 critical hit rate merits) a 9% critical hit rate before the effect of Fencer (source). kparser was used for automated data collection.

The level of the targeted Ul'hpemde was inferred by comparing the predicted hit rate for a level 69 Ul'hpemde (.92) against a point estimate of the hit rate of 5628/6133 = .9176, with 95% confidence interval (.9105, .9244). The observed hit rate is consistent with the prediction.

Estimation of Fencer's effect with dual wield was also done with a Trainee Knife/Trainee's Needle combination, but the Ul'hpemde was level 68. (Critical hit rate is "directly" independent of level, but not AGI, which depends on level to some extent. But for both level 67 and level 68 Ul'hpemdes, the AGI is 65.) The following image summarizes the final base attribute values for this particular trial:

Results

Single wield: A point estimate of 802/5628 - .09 = .0525 was obtained for the critical hit rate bonus, with a 95% confidence interval (.0434, .0619).

Dual wield: A point estimate of 464/4983 - .09 = .0031 was obtained for the critical hit rate bonus, with a 95% confidence interval (-.0048, .0115).
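
The two estimates can be recomputed in a few lines of R. This sketch assumes the intervals are exact (Clopper-Pearson) binomial intervals, which is what `binom.test` in base R reports, shifted down by the 9% baseline; `bonus_ci` is a hypothetical helper name, not something from the original analysis.

```r
# Hypothetical helper: point estimate and exact interval for an additive bonus
bonus_ci <- function(crits, hits, base, conf = 0.95) {
  ci <- binom.test(crits, hits, conf.level = conf)$conf.int
  list(estimate = crits / hits - base,        # observed rate minus baseline
       interval = as.numeric(ci) - base)      # shift the interval by the baseline
}

single <- bonus_ci(802, 5628, base = 0.09)    # single wield
dual   <- bonus_ci(464, 4983, base = 0.09)    # dual wield
print(single)
print(dual)
```

If the assumption about the interval type is right, the output should land close to the figures quoted above; the dual-wield interval straddles zero while the single-wield one does not.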

Interpretation and conclusion

Since critical hit rate has statistically been shown to take only integer percent values, assuming that the bonus is additive, the critical hit rate bonus of Fencer is either 5% or 6% with 95% confidence.

For the dual-wield case, suppose there were a 5% bonus for the main hand and none for the off hand. The effective bonus would then be 2.5%. Yet the observed estimate is much less than 2.5%, which should be taken as evidence that Fencer has no effect when dual wielding.

Saturday, June 19, 2010

Weapon skill critical hit rate bonus: summary of evidence

(Edit #2: added information for Backhand Blow and Blade: Jin, and another source for Rampage.)

(Edit #1: added another source for Drakesbane.)

This is an attempt to summarize any evidence following attempts to determine the critical hit rate bonus at or around 100 TP (if any) for weapon skills whose "chance of critical varies with TP."

I am not aware of any (non-anecdotal) evidence for the following weapon skills: Ascetic's Fury, Vorpal Blade, Power Slash, Sturmwind, Keen Edge, Vorpal Scythe, Vorpal Thrust, Skewer, Blade: Rin, True Strike, Hexa Strike, Sniper Shot, Heavy Shot, Dulling Arrow, and Arching Arrow (15 weapon skills). That leaves only six: Backhand Blow, Evisceration, Rampage, Raging Rush, Drakesbane, and Blade: Jin.

For now, "convenient" determination of critical hit rate is possible only for the first hit. Most of the testing done concerns the first hit, and conclusions are based on the assumption that the bonus (where it exists) is additive.

Backhand Blow (hand-to-hand, 2 hits)

Source: dex/crit relation, WS crits, WS gorgets discussion (Blue Gartr forums)

Comparing the sample proportions 22/50 (.44) at 9% baseline critical rate and 37/50 (.74) at 30% baseline (with 6% from Destroyers), it is obvious that there is some kind of innate critical rate bonus for at least the first hit of Backhand Blow.

But with Backhand Blow TP varying between 100 and 120 TP, it seems likely that the critical rate was not fixed for each sample. The consequences of this on the allocation of Type I error and coverage probability of the corresponding interval estimate are explored for Blade: Jin bonus estimation (later in the post), as data for that was obtained by the same person, but for now I will just describe briefly how to go about estimating the bonus for Backhand Blow.

Assume that the innate bonus is additive and constant (meaning it's independent of whatever the baseline critical rate is). Also assume that the critical rate bonus from Destroyers (6%) increases the critical hit rate of Backhand Blow by an additional 6% (starting from 24%).

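Let X₁ be the number of critical hits observed at 9% baseline, n₁ the total number of hits observed at 9%, X₂ the number of critical hits observed at 30% baseline, and n₂ the total number of hits observed at 30%. A natural "pooled" estimator for Backhand Blow's critical hit rate bonus is

$$\hat{b} = \frac{X_1 + X_2 - (0.09\,n_1 + 0.30\,n_2)}{n_1 + n_2}$$

and its standard error is

$$\widehat{\mathrm{SE}}(\hat{b}) = \frac{\sqrt{n_1\,\hat{p}_1(1-\hat{p}_1) + n_2\,\hat{p}_2(1-\hat{p}_2)}}{n_1 + n_2}, \qquad \hat{p}_i = X_i / n_i$$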

The sample proportion is .395 and a corresponding 95% confidence interval for the WS bonus is (30.32%, 48.68%).

Conclusion: there is a critical hit rate bonus for Backhand Blow at 100 TP. A bonus of 40% would be consistent with the given data.
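
A minimal R sketch of the pooled estimate for Backhand Blow follows. It takes the total critical hits minus the number expected from the two baselines, divides by the total number of hits, and builds a large-sample (normal) interval from the two sample proportions. The function name `pooled_bonus` and its argument names are hypothetical.

```r
# Hypothetical helper: pooled additive-bonus estimate from two samples
# taken at different known baseline critical hit rates
pooled_bonus <- function(x1, n1, base1, x2, n2, base2, conf = 0.95) {
  p1 <- x1 / n1
  p2 <- x2 / n2
  est <- (x1 + x2 - (base1 * n1 + base2 * n2)) / (n1 + n2)
  se  <- sqrt(n1 * p1 * (1 - p1) + n2 * p2 * (1 - p2)) / (n1 + n2)
  z   <- qnorm(1 - (1 - conf) / 2)            # normal quantile for the interval
  c(estimate = est, lower = est - z * se, upper = est + z * se)
}

# Backhand Blow: 22/50 at 9% baseline, 37/50 at 30% baseline
bb <- pooled_bonus(22, 50, 0.09, 37, 50, 0.30)
print(round(bb, 4))   # estimate 0.395, lower 0.3032, upper 0.4868
```

The same call with 3/30 and 8/30 gives the Blade: Jin figures discussed later in the post.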

Evisceration (dagger, 5 hits)

Source: Evis crit rate testing (Allakhazam forums)

At ~100 TP and given 24% base critical hit rate, the pooled sample gives a sample proportion 248/696 = .3563. A 95% confidence interval for the critical hit rate bonus is (8.61%, 15.32%).

Conclusion: there is a critical hit rate bonus for Evisceration at 100 TP, with +10% being a possibility.

Rampage (axe, 5 hits)

Source (1): ランページとDEXの関係

There are two sets of estimates: one for DEX 68, and one for DEX 124, with Gigantobugard as the target mob in both cases. I'm not much interested in calculating base AGI and confirming that the Megalobugard's level range is 40-43, so I ignored the estimates for DEX 68. DEX 124 ensures a 24% base critical hit rate.

At 100 TP, the sample proportion of critical hits is 35/130 = .2692. A 95% confidence interval for the critical hit rate bonus is (-4.48%, 11.40%). But suppose there actually is a 10% critical hit bonus. For a sample size of 130, the probability that the sample is sufficient to show a statistically significant bonus is about .7388 (power calculation).

At 200 TP, the sample proportion of critical hits is 68/150 = .4533. A 95% confidence interval for the critical hit rate bonus is (3.20%, 29.66%).

Source (2): dex/crit relation, WS crits, WS gorgets discussion (Blue Gartr forums)

I did say I wasn't interested in calculating a mob's AGI, but a Clipper's AGI is either 18 or 21 regardless of the levels reported on FFXIclopedia, and either AGI value doesn't affect the actual crit rate for the DEX 57 case, which is indeed 13%. (See this for details about critical hit rate as a function of your DEX - mob AGI.)

Using the same "pooled" estimator rationale I used for Backhand Blow (earlier in the post), the sample proportion for Rampage's crit bonus at 300 TP is .465 and a corresponding 95% confidence interval for the rate bonus is (31.80%, 61.20%). For the sake of completeness, estimates for the bonus at 100 TP and 200 TP are (-9.26%, 25.40%) and (3.20%, 48.80%), respectively.

Conclusion: if there is a critical hit rate bonus for Rampage at 100 TP, the known evidence is insufficient to show that, but if the bonus were 10%, for n = 130 the power to reject the null hypothesis of no bonus is fairly high (.7388). Given all the data, it is relatively unlikely that the bonus is 10%, but a smaller bonus cannot be ruled out with such small samples.

Unsurprisingly, there is a bonus at 200 TP and 300 TP.

Raging Rush (great axe, 3 hits)

Source (1): レイグラのクリティカル率について　その１

The sample proportion is 20/40 given the usual 24% base. The "control" data for base critical rate (which is a good idea to have, by the way), however, gives the sample proportion 44/130 = .3385, which is somewhat unusual, but I write that off as just that, not as a sign of experimental error. This data alone gives the tentative impression that there is a bonus.

Source (2): RagingRush Critical rate test (Killing Ifrit forums)

The raw data (showing damage values) are in a spreadsheet, but you don't need to download it.

At 100 TP and given 24% base critical hit rate, the proportion of critical hits is 155/373 = .4155. A 95% confidence interval for the critical hit rate bonus is (12.50%, 22.74%). This is strong evidence that the critical hit rate bonus is not 10%. Possible candidates are 15% and 20%.

More interesting to me is that the damage for 1 TP return (20 occurrences) was also noted, providing an opportunity to determine whether a critical hit rate bonus also applies to off-hand hits (despite there being no way to tell the difference between a double attack hit and a regular off-hand hit). Assuming a 24% base critical hit rate, with 9 observed critical hits out of 20, the corresponding p-value is .03614, which suggests a critical hit rate bonus.

Conclusion: there is a critical hit rate bonus for Raging Rush at 100 TP, with +15% and +20% being possible candidates. The small sample for critical hits from off-hand hits suggests a critical hit rate bonus for off-hand hits of Raging Rush as well.
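
The off-hand p-value and the first-hit interval for Raging Rush can be checked with an exact binomial test. This sketch assumes the two-sided test that `binom.test` performs by default and an exact (Clopper-Pearson) interval; the variable names are mine.

```r
# Off-hand hits: 9 critical hits out of 20, tested against a 24% base rate
off_hand <- binom.test(9, 20, p = 0.24)
print(off_hand$p.value)

# First hit at 100 TP: 155 critical hits out of 373
first_hit <- binom.test(155, 373)
bonus_interval <- as.numeric(first_hit$conf.int) - 0.24   # subtract the 24% base
print(bonus_interval)
```

Under those assumptions the p-value should come out close to the .03614 quoted above, and the interval for the bonus should exclude 10%.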

Drakesbane (polearm, 4 hits)

Source (1): drakesbane native crit% (FFXIclopedia forums)

The first sample is 38/100 and the second, 24/100 (given 106 TP).

38/100 is a fairly extreme observation given 24% base critical hit rate (if there were no bonus). On the other hand, 24/100 is not that extreme an observation given a 34% rate. Since there is no good reason to think the conditions changed between the two samples, pool the data and crank out an interval estimate for the rate bonus, which is (0.66%, 13.91%).

Source (2): 雲蒸竜変の検証

There are four samples: three for 100 TP and one for 300 TP.

For 100 TP, the sample proportions are 12/49, 15/45, and 15/41 (given 24% base critical hit rate). The pooled estimate is 42/135 = .3111 and a 95% confidence interval for the bonus is (-0.57%, 15.64%). While this interval covers 0, 0 is again close to the left endpoint (in the other case the 0 being on the "right" side based on expectations).

As for 300 TP, the sample proportion is 16/30 and a 95% confidence interval for the rate bonus is (10.32%, 47.66%), which rules out 50% (tentatively).

Conclusion: there is suggestive evidence for a critical hit rate bonus at 100 TP, with +5% and +10% being possible candidates. At 300 TP, a +50% bonus appears to be an "unlikely" possibility.

Blade: Jin (katana, 3 hits)

Source: dex/crit relation, WS crits, WS gorgets discussion (Blue Gartr forums)

The sampling was done in the same fashion as for Backhand Blow, with observed critical hit proportions 3/30 at 9% baseline crit rate and 8/30 at 30% baseline (with Senjuinrikio's 6% bonus) at 100 TP. Using the same estimator that I used for Backhand Blow, the "pooled" sample proportion for Blade: Jin's critical bonus is -0.01167, and a corresponding 95% confidence interval is (-10.73%, 8.39%).

Taking the confidence interval at face value, if there is a critical bonus for Blade: Jin at 100 TP, it is unlikely that it's 10% or higher, especially considering the "sloppy" manner in which the data was likely collected (with TP not being held fixed, the critical hit rate could have varied), which further supports that contention. If the bonus were 10%, obviously, the probability that a 95% confidence interval wouldn't cover 10% at the right endpoint of the interval would be near .025 (half the Type I error). The consequences of experimental "error" are explored in a simulation study described at the end of this post.

Conclusion: if there is a critical hit rate bonus for Blade: Jin at 100 TP, it is unlikely that the bonus is as high as 10%.

Simulation study: is a 10% critical hit rate bonus that unlikely for Blade: Jin?

Consider the following simulation study based on hypotheticals: if there actually were a 10% bonus at 100 TP, with a 1% increase for every 5 TP, then with TP varying between 100 and 119 TP, the critical rate bonus varies between 10% and 13%.

Given that "TP overflow" is inevitable with dual wield, and that extra hits occurring beyond TP were quite possible because data collection was reported to be boring, suppose that each of the bonus values between 10% and 13% (inclusive) is equally likely to be "chosen" for Blade: Jin.

The purpose of the study is to show how likely it is that the "pooled" large-sample confidence interval covers 10% given the above conditions.

A histogram of the simulated sampling distribution of the critical hit rate bonus shows that it's obviously not normal, with the mean (about 11.5%) higher than 10%, which is supposed to be the "actual" bonus at 100 TP for this simulation. (The shape of the large-sample approximation of the sampling distribution is traced with the solid curve.)

On the other hand, the margin of error for all simulated sample proportions is higher than 9.56%, the margin of error for the actual sample, about 97.7% of the time. (The mean margin of error is 11.19%.) Also, the "actual" (in the context of the simulation) Type I error is about .059, with about .040 allocated to the right tail (meaning there is a probability of .0402 that the null hypothesis of .10 is rejected because the estimate is higher than .10 based on the criterion of statistical significance) and about .019 allocated to the left tail (meaning the null is rejected with probability .019 because the observed estimate is significantly lower than .10). By comparison, the nominal left-tail error is .025.

Repeating this exercise under the condition that there is no bonus, the margin of error for all simulated sample proportions is higher than 9.56% only 58.0% of the time, and the probability that a confidence interval's right endpoint is higher than 8.39% is less than 0.1%.

If Blade: Jin's critical hit rate bonus at 100 TP were actually 10%, considering TP overflow and additional hits occurring beyond TP overflow, it would be very unlikely that a given 95% confidence interval would not cover 10%. The margin of error would also be very likely to be higher than 9.56%. Therefore, it is more plausible that its critical rate bonus is significantly less than 10%, if it even exists.

The following is some code for the simulation, but the inner loop should probably be expanded so that it finishes faster.


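n = 100000                  # number of simulated experiments
ci.lower = numeric(n)
ci.upper = numeric(n)
p.pool = numeric(n)
for (i in 1:n) {
  X1 = 0                    # critical hits at the 9% baseline
  X2 = 0                    # critical hits at the 30% baseline

  # 30 weapon skills per sample; each one draws its own bonus from 10% to 13%
  for (j in 1:30) {
    X1 = X1 + rbinom(1,1,sample(seq(.19,.22,by=.01),1))
    X2 = X2 + rbinom(1,1,sample(seq(.40,.43,by=.01),1))
  }

  # pooled estimate of the bonus (.39*30 = expected crits from both baselines)
  p.pool[i] = (X1 + X2 - .39*30)/60

  # large-sample 95% confidence interval
  ci.upper[i] = p.pool[i] + qnorm(.975)*sqrt((X1/30*(1-X1/30) + X2/30*(1-X2/30))/120)
  ci.lower[i] = p.pool[i] - qnorm(.975)*sqrt((X1/30*(1-X1/30) + X2/30*(1-X2/30))/120)
}

mean(p.pool)
me = (ci.upper - ci.lower)*.5
# how often the simulated margin of error exceeds that of the actual sample
mean(me>sqrt((3/30*(1-3/30)+8/30*(1-8/30))/120)*qnorm(.975))
mean(ci.upper<.10)          # interval lies entirely below .10
mean(ci.lower>.10)          # interval lies entirely above .10
mean(ci.upper<.10) + mean(ci.lower>.10)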
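
A faster variant is sketched below, under the same model. Because each of the 30 swings draws its rate independently and uniformly from four values, each swing is marginally a Bernoulli trial at the average rate (.205 or .415), so the counts can be drawn directly as binomials without any loop. The seed and the variable names `p1`, `p2`, and `me.actual` are my own additions.

```r
set.seed(1)                               # arbitrary seed, for reproducibility only
n  <- 100000
p1 <- mean(seq(.19, .22, by = .01))       # average rate at the 9% baseline
p2 <- mean(seq(.40, .43, by = .01))       # average rate at the 30% baseline

X1 <- rbinom(n, 30, p1)                   # crits out of 30 at the 9% baseline
X2 <- rbinom(n, 30, p2)                   # crits out of 30 at the 30% baseline

p.pool   <- (X1 + X2 - .39 * 30) / 60     # pooled bonus estimate
me       <- qnorm(.975) * sqrt((X1/30 * (1 - X1/30) + X2/30 * (1 - X2/30)) / 120)
ci.lower <- p.pool - me
ci.upper <- p.pool + me

# margin of error of the actual sample (3/30 and 8/30)
me.actual <- qnorm(.975) * sqrt((3/30 * (1 - 3/30) + 8/30 * (1 - 8/30)) / 120)

print(mean(p.pool))                       # mean simulated bonus
print(mean(me > me.actual))               # share of margins wider than the actual one
print(mean(ci.upper < .10))               # interval entirely below .10
print(mean(ci.lower > .10))               # interval entirely above .10
```

The results will differ slightly from run to run (and from the looped version) because of sampling noise, but they should agree with the figures discussed above to within a fraction of a percentage point.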

Thursday, December 4, 2008

A half-year in parses

December 11: I now have the time to add some comments for all the parser output I posted last week.

Treasure and Tribulations BCNM, 1st attempt (July 11)

Melee Damage
Player            Melee Dmg   Hit/Miss  M.Low/Hi    M.Avg
NIN/WAR                 470      38/85      4/18    11.62


Spell Damage
Player                 Spell Dmg   Spell %  #Spells  S.Low/Hi     S.Avg
NIN/WAR                      914   64.55 %       29      4/44     31.52
- Doton: Ni                  164   17.94 %        4     40/44     41.00
- Huton: Ni                  140   15.32 %        4     20/40     35.00
- Hyoton: Ni                 200   21.88 %        6     20/40     33.33
- Katon: Ni                  110   12.04 %        5     10/40     22.00
- Raiton: Ni                 196   21.44 %        5     36/40     39.20
- Suiton: Ni                 104   11.38 %        5      4/40     20.80

Comments: it certainly is more palatable to fight a mimic (Small Box) straight up rather than hope you pick the right treasure chest. Comments on FFXIclopedia recommend sushi "except if you have really good gear," but melee accuracy against this mimic was a joke. I felt better off using the "wheel" lest the fight take 25 minutes.

Treasure and Tribulations BCNM, 2nd attempt (July 12)

Melee Damage
Player            Melee Dmg   Hit/Miss  M.Low/Hi    M.Avg
NIN/WAR                 214      20/78      5/13     8.47


Spell Damage
Player                 Spell Dmg   Spell %  #Spells  S.Low/Hi     S.Avg
NIN/WAR                     1008   80.45 %       36      4/44     28.00
- Doton: Ni                  115   11.41 %        5      5/40     23.00
- Huton: Ni                  190   18.85 %        6     10/40     31.67
- Hyoton: Ni                 220   21.83 %        7     20/40     31.43
- Katon: Ni                  164   16.27 %        7      4/40     23.43
- Raiton: Ni                 145   14.38 %        6      5/40     24.17
- Suiton: Ni                 174   17.26 %        5     10/44     34.80

Comments: more of the same (Small Box again), mainly to corroborate the hideous evasion of these mimics. I am curious whether there is any difference in hit rate targeting the larger boxes instead.

Evasion vs. Water Leaper (August 1)

Attacks Against:
Player           Total   Avoided   Avoid %
NIN/THF            253       247   97.63 %


Standard Defenses
Player           M.Evade  M.Evade %   Shadow  Shadow %   Parry  Parry %
NIN/THF              148    58.73 %       93   93.94 %       6   5.77 %

Comments: I trot out the thief support job to maximize my evasion. (I've seen "Evasion Bonus II" job trait from thief to be both +22 and +23 total.) This may be indispensable for something like Fenrir (I may try soloing it again now that Reraise effects can't be dispelled) but for mundane things not so much. Trading 12 or 13 evasion for all the abilities available to DNC37 (dancer also gets an Evasion Bonus trait) seems like a no-brainer for menial tasks, if I can ever be bothered to finish leveling it.

Evasion vs. Goblin Slaughtermen, Temenos - Northern Tower (August 8)

Attacks Against:
Player           Total   Avoided   Avoid %
NIN/THF            241       234   97.10 %


Standard Defenses
Player           M.Evade  M.Evade %   Shadow  Shadow %   Parry  Parry %
NIN/THF              155    65.13 %       71   91.03 %       8   9.64 %

Comments: Ninja soloing for AF+1 in Temenos seems "common" enough for those who have the patience and adequate equipment. I've tended to err toward mixing both haste and evasion if only to speed up the process just a little, so even without maximum evasion, one can still evade a fair amount of attacks. (At least I assume that was the case for this, one of my last Temenos runs.) Sadly, in the past I have actually timed out mainly because of mediocre DD output, but it doesn't really matter to me whether I finish in 20 minutes or 28 minutes.

Enfeebling Despot (October 10)

BLM/RDM
Debuff      # Times   # Successful   # No Effect   % Successful
Bind             26             21             0        80.77 %
Gravity           8              8             0       100.00 %
Poison II        12             11             0        91.67 %

RDM
Debuff      # Times   # Successful   # No Effect   % Successful
Bind              3              2             0        66.67 %
Gravity           6              6             0       100.00 %

Comments: I had such extraordinary success (by my standards) binding Despot that I feel this is an anomaly. I am pretty sure my enfeebling magic skill a few months ago was 269, which isn't good for BLM. Although binding isn't necessary for soloing Despot as a black mage (yes, I didn't solo it here), it can give you a little slack.

Pahluwan Khazagand effect on crit rate (October 16)

Melee Damage
Player            Hit/Miss   M.Avg  #Crit     Crit%
WAR/NIN             459/39  143.43     40    8.71 %
SAM/WAR (Askar)    581/184  140.84     54    9.29 %
MNK72/WAR36       1270/283   52.96    148   11.65 %

Total Experience : 19012
Number of Fights : 100
Start Time       : 10:06:51 AM
End Time         : 11:07:50 AM
Party Duration   : 1:00:58
Total Fight Time : 1:35:08
Avg Time/Fight   : 36.59 seconds
Avg Fight Length : 57.08 seconds
XP/Fight         : 190.12
XP/Minute        : 311.77
XP/Hour          : 18706.50

Comments: I am no fan of the "Mamool Ja north" merit camp, but whatever it takes. I even included the experience summary to show that the exp rate was great (by my standards). Also, I noticed that the monk was wearing the Pahluwan body piece. I have seen bandied about the claim that the crit bonus on Pahluwan is "broken," and I believe this nonsense originated from this idiotic post from 2006. Such fuckers don't realize that the margin of error associated with the sample crit rate in question, even for 718 total hits, will be fairly wide. For example, a 95% Clopper-Pearson interval for the crit rate with Pahluwan body is (0.1366717, 0.1920368), so I wouldn't be talking shit about how the body makes the crit rate "worse."

Going back to the parser output, it seems to confirm the notion that critical hit rate is minimized at 5% (9% with 4 merits). This is to be expected without a sufficient amount of dexterity at this camp. If the monk's base crit rate before equipment was indeed 9% (assuming the monk had all the merits), then there is strong evidence that Pahluwan body does have an effect (trivial conclusion since the item description explicitly states there is one). As for the magnitude of the effect, a 95% CI for the crit rate bonus is (0.00939805, 0.04546965), so I am 95% confident that the true bonus is somewhere in that interval. So much for crit rate being "broken" (not that the effect isn't weak).

Enfeebling Aura Statues

≥ 82 (October 23)

BLM/RDM
Debuff      # Times   # Successful   # No Effect   % Successful
Bind             92             52             0        56.52 %
Dispel            1              0             1         0.00 %
Gravity         234            184             1        78.63 %
Sleep            34             22             0        64.71 %
Sleep II        120             94             0        78.33 %
Sleepga II        1              1             0       100.00 %
Stun             40             39             0        97.50 %

≥ 25 (October 24)

BLM/RDM
Debuff      # Times   # Successful   # No Effect   % Successful
Bind             11              8             0        72.73 %
Gravity          80             62             0        77.50 %
Sleep            11              7             0        63.64 %
Sleep II         26             24             0        92.31 %
Stun             22             22             0       100.00 %

≥ 9 (November 13)

BLM/RDM
Debuff      # Times   # Successful   # No Effect   % Successful
Bind              6              4             0        66.67 %
Gravity          25             20             0        80.00 %
Sleep II          5              5             0       100.00 %
Stun              8              8             0       100.00 %

Comments: Now that I have some working hypothesis on the relationship between magic skill and magic "hit rate" (again, to make a distinction between a "lack of resist" rate and the magic accuracy attribute), I am going to put it to the test against Aura Statues once I reach 289 enfeebling magic skill. (Merciful Cape is absolutely out of the question as I am not that masochistic; Enfeebling Torque is overpriced and obtaining Wizard's Coat +1 is contingent on luck getting the materials.) Oddly, the resist rate estimates seem consistent only for Gravity. I'll have to look into it. Again, I wouldn't be surprised if a level correction plays some role.

Direct magic damage to Genbu (October 26)

Bio II
    3:   17
   35:    1
Burst II
 1067:    1
Thundaga III
  532:    1
Thunder IV
   73:    1
   99:    3
  199:    1
  398:    3
  795:    1
  798:   12

Comments: I seemed to have pretty good success damaging Genbu this time.

Dancer (lv14-15) EXP/hour (November 7)

Total Experience : 5845
Number of Fights : 82
Start Time       : 3:07:05 PM
End Time         : 4:17:42 PM
Party Duration   : 1:10:37
Total Fight Time : 1:43:29
Avg Time/Fight   : 51.68 seconds
Avg Fight Length : 75.73 seconds
XP/Fight         : 71.28
XP/Minute        : 82.76
XP/Hour          : 4965.81

Mob Listing
Mob                        Base XP   Number   Avg Fight Time
Akbaba                         ---        1             0.00
Canyon Crawler                  80        1            35.00
Canyon Rarab                    60        2            24.50
Canyon Rarab                    65        5            29.01
Canyon Rarab                    70        9            40.56
Canyon Rarab                    75        4            49.29
Goblin Digger                   80        1          1:33.09
Goblin Thug                     60        1         37:07.29
Goblin Thug                     65        1            35.00
Goblin Tinkerer                 80        1            54.01
Goblin Tinkerer                 90        1          1:00.04
Killer Bee                      70        6            36.41
Killer Bee                      75        5            38.22
Killer Bee                      80        4          2:37.06
Pygmaioi                        65        3            34.68
Pygmaioi                        70        3            50.68
Pygmaioi                        75        7            45.59
Pygmaioi                        80        2          3:35.61
Strolling Sapling               65        8            33.02
Strolling Sapling               70       10            42.63
Strolling Sapling               75        1             6.00
Yagudo Acolyte                  60        2            19.51
Yagudo Persecutor               90        2            42.54
Yagudo Piper                    90        1          1:01.01
Yagudo Scribe                   60        1            13.01
Yagudo Scribe                   65        1            10.00

Comments: I've become progressively less patient with leveling subjobs even though the last few have been easy to solo (against goblin pets), from samurai to dark knight to red mage and, now, dancer. I just don't see myself leveling another job as my playing time wanes, especially considering no job other than dancer will let me spend 70 minutes mowing down every EP nonstop for almost 5k exp/hr.

Monday, October 20, 2008

The relationship between DEX and critical hit rate

My previous post somehow got over 40 "click-throughs" on TTTO, perhaps because its authoritative title, "King's Justice versus Raging Rush," promised a decisive comparison yet its conclusions were slightly less touchy-feely than eyeballing. (I was actually looking for some feedback, but I guess it wasn't meant to be.) In that vein, I also offer this bait-and-switch regarding the relationship between DEX and critical hit rate.

I would not care about such things if not for the prospect of obtaining Byakko's Haidate one day; with its 15 DEX, surely there must be some obvious increase in critical hit rate, right?

In fact, for some reason or another 15 DEX was once "thought" always to increase critical hit rate by a paltry 1-2% despite the reality of sampling error. (I've always wondered how people arrived at such conclusions by sampling. Even if you collected data through a parse, if you had a sample of 2500 hits, the margin of error associated with your crit rate estimate would be as much as 2%.) This conventional "wisdom" was then debunked around March 2007 with a discussion of the DEX/crit relation motivated by the observation that lots of DEX sent crit rates soaring up to some maximum. Coincidentally or not, around that time there was a parallel discussion on Allakhazam about the same topic.
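The 2% figure is the worst-case margin of error for a proportion under the usual normal approximation. A minimal sketch, assuming a 95% level (z = 1.96) and a simple random sample of independent hits; the function name is made up for illustration:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the normal-approximation (Wald) interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)

# Worst case (p = 0.5) for a 2500-hit sample.
print(round(margin_of_error(2500), 4))          # 0.0196
# If the true crit rate is nearer 10%, the margin shrinks but stays above 1%.
print(round(margin_of_error(2500, p=0.10), 4))  # 0.0118
```

Either way, the margin is about the same size as the 1-2% effect being claimed, so a sample of that size cannot pin the effect down.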

Sure, these people didn't bother to control for mob AGI, and it now appears evident that your DEX relative to your target's AGI is a factor in the critical hit rate determination. The AGI of Robber Crabs, a test subject in the Alla thread, apparently is either 39 or 42, and the AGI of Tavnazian Sheep and Miner Bees, targets in the BG thread, probably varies too. But despite the lack of control it was obvious that piling on enough DEX will increase your critical hit rate markedly at some point.

Unfortunately, this conclusion is couched in the lazy terminology of "tiers." Some examples are

(1) "Stack enough DEX to break some critical rate tier, where each point of DEX you add within that tier has a larger effect."

(2) "Any large amounts of DEX before a critical rate tier will not have a major effect on critical hit rate."

Implicit in such statements is that if you don't break a "tier," it isn't worth trying to pile on DEX. In turn, considering that "tiers" in crafting refer to discontinuous jumps in HQ rate, it isn't surprising that a "tier" in terms of crit rate is also thought of as a sudden, discontinuous jump at some critical level of DEX. But the evidence provided in the above threads doesn't really point to such a discontinuous phenomenon.

First, consider the results from the BG thread. Amazingly, the point estimates were given as approximations based on sample sizes of about 300 (really, that lazy not to record the exact sample sizes?), but that isn't that big a deal. These point estimates are themselves random variables with corresponding distributions, so it is helpful to visualize confidence intervals for the true values of these crit rates for given levels of DEX, and I created a graph to help with that:

The 95% confidence intervals are represented by black bars with the point estimates centered within the CIs. I also marked what are thought to be the minimum and maximum crit rates for DEX only with gray lines, 9% minimum and 24% maximum with 4/4 critical hit rate merits (who doesn't have those?). Critical hit rate bonuses from equipment are not subject to the caps.

The data corresponding to "low" and "high" DEX on this graph conform to the minimum and maximum crit rates. (At least there is no reason to believe otherwise.) At some point, though, crit rate increases with DEX in seemingly a linear fashion, which could awkwardly be described as a "tier," I suppose. This evokes a parallel with overall hit rate versus accuracy, with a minimum of 20% and a maximum of 95% and hit rate thought to vary linearly with accuracy in between. So if crit rate does increase (linearly) within a certain range of DEX, it is worth adding DEX within this interval all other things being equal. Sure, I guess you are within a "tier" when this happens, but where's the evidence for a discontinuous jump to reach this "tier"?

Furthermore, there is hardly any evidence for the plural tiers.

I've also graphed the first set of data from Allakhazam (first post), which is similar to the BG one:

Interestingly, here the crit rate estimates increase over a 15-DEX range, even more evidence against the idea of a discontinuous jump.

Finally, in the Alla discussion data from the Robber Crabs was pooled. Pooled data generally poses statistical hazards (for one, we're assuming the exact same experimental conditions for each person involved, but you figure there's gotta be some idiot to fuck it up or some other factor... like the fact that the AGI of Robber Crabs varies!), but let's just run with this. I created a graph of 95% CIs for the pooled data as follows:

Even in violating statistical assumptions (independence) it is obvious there is no discontinuous jump in crit rate to be seen that cannot be attributed to sampling error. And even with the fundamental shadiness of this experiment (not controlling AGI), I even had the cheerful temerity to do least-squares linear regression (which itself is inappropriate for a variety of reasons) on the data points for which over 1000 samples were collected, in the DEX region where crit rate seems to increase linearly. For me it's enough to know that there is an obvious increase in crit rate; it doesn't matter what the exact increase will be for 1 additional DEX.

Also, the region is fairly narrow (10-15 DEX) for Robber Crabs, which would explain why people observe a sudden jump when adding DEX, as there is the view that adding DEX for the purposes of increasing crit rate should be an all-or-nothing thing (never mind the reality that the tradeoffs you make to stack DEX make such an attempt impractical).

It isn't necessarily true that the results from Robber Crabs can be generalized to other mobs. But if this phenomenon is real and can be generalized, then you may not have to go for an all-or-nothing attempt to increase crit rates with DEX, either in an auto-attack or WS phase, as long as your DEX is within the region where DEX is considered helpful.

For Robber Crabs, this region appears to be between 77 and 92 DEX. The higher level Robber Crabs in Kuftal Tunnel have 42 AGI, which jibes with the idea that your crit rate is capped when your DEX is 50 higher than your target's AGI.

The "transition region" clearly doesn't start when your DEX is equal to your target's AGI, but where should it start? The statement in the previous paragraph implies that it could start at about 35 DEX above your target's AGI, but this is a troublesome statement to make given that the crit rates consistently appear to be above 9% (the minimum) before 77 DEX. One possible explanation is that crit rate sits at its minimum when (DEX - AGI) is less than or equal to 0, and rises very slowly as (DEX - AGI) goes from 0 to around 35. This could be why it's difficult to see any improvement in crit rates from adding DEX on your usual merit mobs, which all have AGI above 67.
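The simplest reading of these numbers can be written as a clamped linear function. This is only a sketch under stated assumptions, not the game's formula: it assumes the 9% floor and 24% cap (with 4/4 merits), a linear rise between 35 and 50 DEX above the target's AGI, and it leaves out the slow rise between 0 and 35 speculated about above. The function name and parameter names are made up for illustration.

```python
def crit_rate(dex, agi, floor=0.09, cap=0.24, start=35, end=50):
    """Hypothetical piecewise-linear crit rate as a function of DEX - AGI.

    floor/cap: assumed min and max crit rate from DEX (with 4/4 merits).
    start/end: assumed DEX - AGI values where the linear region begins and ends.
    """
    delta = dex - agi
    if delta <= start:
        return floor
    if delta >= end:
        return cap
    # Linear interpolation inside the transition region.
    return floor + (cap - floor) * (delta - start) / (end - start)

# Robber Crab with 42 AGI: the transition region runs from 77 to 92 DEX.
for dex in (70, 77, 85, 92, 100):
    print(dex, round(crit_rate(dex, 42), 3))
```

Under these assumptions the slope works out to (0.24 - 0.09) / (50 - 35), or 1 percentage point of crit rate per point of DEX inside the region, and nothing at all outside it.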

I admit I didn't break any new ground, but I thought it might be fun to show my take on this.