The Unbearable Triteness of Preening: magic accuracy


Friday, June 18, 2010

INT affecting Drain accuracy, continued

This is a followup to an earlier post showing that INT affects the accuracy of Drain.

Having found a way to suppress my base INT even further, I increased the difference between the two INT "treatments" to 60 (54 INT and 114 INT). As the resist rate for 121 INT could have been floored (presumably at 5%), maybe a 7-INT decrease would result in some increase in resist rate. The following dot plots summarize the distribution of the observed data visually:

The criterion I used last time to determine a difference between two cases, the number of Drain values set "far enough" apart from the bulk of the data (over the total number of samples in each case), doesn't quite work this time, in part because it is kind of difficult to define the "bulk" of the data for the 54-INT case. (It appears that there are more resists in general by an alternative criterion of number of Drain values below 144, though, and more for the 54-INT case than the 114-INT case.)

Instead, I probably am better off relying on the two-sample t-test to demonstrate statistical significance. The sample means for 54 INT and 114 INT are 154.28 and 191.33, respectively, and the 95% confidence interval for the difference of means is (12.213, 61.898), so the evidence, taken together (including that in the last post), provides strong support for the contention that INT affects Drain accuracy (provided that all the assumptions I stated last time hold, and why wouldn't they?), specifically that increasing INT increases its accuracy (in the form of fewer resists).
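A quick consistency check on the figures quoted above: for a symmetric interval, the midpoint should match the difference of the two sample means. This is only a sketch in Python; the variable names are mine and nothing here reproduces the original t-test, since the raw data and sample sizes are not given.

```python
# Figures quoted in the post.
mean_low_int = 154.28    # sample mean of Drain at 54 INT
mean_high_int = 191.33   # sample mean of Drain at 114 INT
ci_lower, ci_upper = 12.213, 61.898  # 95% CI for the difference of means

diff = mean_high_int - mean_low_int
midpoint = (ci_lower + ci_upper) / 2
half_width = (ci_upper - ci_lower) / 2

print(f"difference of means: {diff:.2f}")
print(f"CI midpoint:         {midpoint:.2f}")
print(f"CI half-width:       {half_width:.2f}")
print("interval excludes 0:", ci_lower > 0)
```

The difference and the midpoint agree to rounding, and the interval sits entirely above zero, which is what the significance claim rests on.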

The most obvious practical implication of this finding is that the "conventional wisdom" that holds prioritizing dark skill above magic accuracy above recast reduction for Drain (and Aspir by close analogy) should incorporate INT, arguably before recast reduction. Where the benefit of reducing recast timers for Drain and Aspir is not fully realized (more often than you think) and the resist rate is suspect, the opportunity cost of recast reduction is usually additional INT, and it might not be a cost worth incurring depending on the actual trade-off.

Thursday, June 17, 2010

Does INT affect Drain accuracy?

(Correction: 06/18/2010. I meant /SCH instead of /DRK toward the end. I've gone mental...)

A while ago, I asserted that INT "seems likely" to affect the accuracy of Aspir and, by implied analogy, Drain, but I had absolutely nothing on which to base this assertion. Not quite as baseless is assuming that, since then, there has been absolutely no evidence presented anywhere to support or refute that assertion.

Some problems with getting data to show whether INT affects Drain accuracy

Why do I assume that? I'm not trying to be a hater and talk shit, as ignorance about this is not on the level of, say, ignorance about the party-based, hidden latent effects of curry food items.

At least where examining the effect of INT on Drain accuracy is concerned, one problem is that if you're in a situation where you have good reason to believe Drain accuracy isn't "capped," you wouldn't want your HP to be low enough to allow you to obtain the actual quantity of HP taken with Drain. (Low-level beetles and worms are not acceptable targets for examining Drain accuracy with level 75 jobs, and EM+ worms are not easily accessible... yet.)

Another problem, related to the first, is that the distribution of Drain still isn't known today, and "censored" values of HP drained don't help to provide insight into that. ("Censoring" is one way to describe the fact that Drain values reported in chat logs are based on maximum HP; any HP restored beyond your maximum HP does not count in the final chat log figure, so at best you only know at least how much you drained, not its actual value or whether your Drain was resisted.)

These are some of the problems that hamper data collection.

A way to avoid these problems?

If only there were a stationary target that didn't fight back, that could allow you to suppress your HP safely, for which Drain accuracy has the possibility not to be capped at level 75, and for which you could gather Drain data without interference from other players...

Zvahl Fortalices definitely satisfy the first condition, as they do not move. They also satisfy the second condition, as two of the fortalices deeper into Castle Zvahl Baileys (S) do not have any mobs wandering nearby, including Dark and Ice Elementals. Zvahl Fortalices definitely seemed like a promising candidate for Drain testing, so I actually set out to get some data to determine if INT has some accuracy effect.

That left the third and fourth conditions. Of course, I had no idea if it would even be possible for Drain accuracy not to be capped, but that would be part of the data collection anyway, with the hope that my Drain accuracy could be decreased enough to raise the corresponding resist rate above the (assumed) 5% resist rate floor. As for people doing skill-ups, I don't really begrudge them trying to maximize their skill-up opportunities, as this method of skill-up is liable to be "nerfed" come the June 21 version update.

Goals of data collection, some assumptions, and results

My way of determining whether INT has an effect on Drain accuracy is based on a simple two-sample comparison of the occurrence of resists, one sample based on "low" INT (71 in my case), and the other based on "high" INT (121). Again, this was based on the hope that my resist rate would not be floored (at 5%) for the low-INT case. This, in turn, is based on the assumption that the resist rate is floored at 5% and rises with decreasing magic accuracy. If the data shows the resist rate being above 5% for low INT, I conclude my Drain accuracy isn't capped for low INT. (That alone would not show that INT has an effect on accuracy; I would need the second sample under high INT as well.)

But what is considered a resist? Similar to the Aspir data collection I cited previously, it would be necessary to get some sense of the distribution of unresisted Drain values, with any low Drain values set "far enough" apart from the bulk of the data considered occurrences of a resist. This is the main assumption concerning the interpretation of the data (but a reasonable one).

Now, what about the other assumptions? I merely state some of them here because I simply was not interested in testing them, and I didn't collect enough data to test these assumptions anyway.

No differences among Zvahl Fortalices that could affect the results. This is a catch-all assumption concerning possible differences in magic evasion, INT, etc., but I don't think they exist (otherwise, fuck you, SE). If it could be shown that two Fortalices have two different base INT values (even if only a +1 INT difference), you would have to wonder about other possible confounders like level difference as well (can't assume these are level 75, etc.).
Even if there is a bonus to Drain on Fortalices, similar to a MAB bonus for elemental magic, it should still be possible to tell the difference between a resist and a non-resist. Bio II initial damage shows there is a MAB bonus, but even if there is a similar bonus for Drain (not MAB-related, of course), it shouldn't affect one's ability to distinguish between resists and non-resists.
The equipment bonuses (or penalties) aside from +50 INT for the "high INT" case have no effect on the accuracy of Drain. Now, obviously, I didn't put on equipment with dark magic skill or magic accuracy (or use a Dark Staff or Pluto's Staff), leaving only base attribute bonuses and penalties. Now, if you think MND and CHR actually have an effect on Drain accuracy, I'd like to hear the justification. If there are hidden accuracy effects on my equipment, that could be a problem, though.
Dark weather and Darksday have no effect on the accuracy of Drain. This is not really an assumption, as I didn't collect any data during Darksday or under Dark weather, but I just mention it anyway as they are potential confounders.

Now, the results. First, dot plots of the results as an initial visual impression, under low INT (top dot plot) and under high INT (below), suggest a minimum and maximum non-resisted Drain given 269 dark magic skill:

Before jumping into a discussion of the maximum and minimum Drain values, based solely on the criterion of a resist I described earlier (low values of Drain set "far enough" apart from the bulk of the data), there are 10/65 resists under low INT and 2/63 under high INT, so this data appears to provide good evidence that increasing INT increases the accuracy of Drain, especially if you think that for the 121-INT case, the resist rate was floored at 5%. I see no reason to be pedantic and report a confidence interval or p-value.
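For anyone who does want a p-value for 10/65 versus 2/63, here is a minimal sketch of a one-sided Fisher's exact test using only the Python standard library. The function name and argument names are mine; the test conditions on the total number of resists and asks how likely it is to see at least 10 of them land in the low-INT sample if INT made no difference.

```python
from math import comb

def fisher_one_sided(resists_a, n_a, resists_b, n_b):
    """Upper-tail hypergeometric probability: P(at least resists_a resists
    fall in group A) when both groups share the same resist rate."""
    total = n_a + n_b
    total_resists = resists_a + resists_b
    denom = comb(total, total_resists)
    upper = min(n_a, total_resists)
    tail = sum(
        comb(n_a, k) * comb(n_b, total_resists - k)
        for k in range(resists_a, upper + 1)
    )
    return tail / denom

# Low INT (71): 10 resists in 65 casts. High INT (121): 2 resists in 63 casts.
p_value = fisher_one_sided(10, 65, 2, 63)
print(f"one-sided p-value: {p_value:.4f}")
```

The result comes out under the conventional .05 threshold, in line with the informal reading of the counts. It says nothing about whether the "far enough apart" criterion for a resist is itself correct.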

Under both low INT and high INT, the Drain maximum (unresisted) was 288 under 269 dark magic skill. It has been said that the maximum Drain and Aspir are 300 and 100, respectively, without any potency-enhancing gear (anecdotal discussion on BG), so if it can be shown that the Drain maximum is 288 under 269 skill for other mobs, you have to wonder how Drain potency actually scales with dark magic skill.

The location of the unresisted Drain minimum is less straightforward. One possibility is that it could be at 144 HP, which would be exactly half of the unresisted Drain maximum. It would be interesting if this relationship between maximum and minimum actually holds for all levels of dark magic skill (with other potency-enhancing factors presumably serving only to affect scale). One way to check this would be with /SCH as a subjob.

And what of the relationship between the HP value of a resist and a non-resist? I actually got 29 HP under the low-INT case, and it's difficult to describe this relationship with small samples. But small samples are enough to reach the major conclusions.

Conclusions

Based on the criterion that low values of Drain set "far enough" apart from the bulk of the observed data should be considered resists, additional INT appears to increase the accuracy of Drain. Ideally, the data collection should be repeated in an attempt to replicate this result.

Data collection could be performed using /SCH as a subjob. +50 INT (if it could be achieved) should still be able to manifest in the form of increased accuracy (provided INT does have effect), and further exploration of the relationship between dark magic skill and unresisted Drain maximum could be done, along with that between (unresisted) Drain maximum and minimum.

Tuesday, May 11, 2010

How to check if a Teiwaz has superior accuracy to Terra's Staff

Obviously, I am talking about the Teiwaz with elemental affinity: magic accuracy +3, not so much that I am talking about the earth-aligned Teiwaz.

It is thought that earth affinity: magic accuracy +1 is equivalent to the accuracy bonus of Earth Staff, and earth affinity: magic accuracy +2 equivalent to the accuracy bonus of Terra's Staff. If you ever read this blog, you would know that elemental NQ staves are considered to have +20 magic accuracy for the specified element, and HQ staves, +30 magic accuracy. It is postulated that elemental affinity: magic accuracy +3 corresponds to +40 magic accuracy.

Without discussing the evidence underlying the following experiment to check whether the earth Teiwaz is superior to Terra's Staff, I will describe a superiority "trial" involving relatively few casts.

Location: Alzadaal Undersea Ruins (Nyzul Isle Staging Point)
Target monster: Level 78 Qiqirn Poulterer (ranger)
Spell to cast: Stone (I)
How many casts: 100.
What to count: number of non-resisted Stone I, number of half-resisted Stone I, number of quarter-resisted Stone I, number of eighth-resisted Stone I (should be easy to identify from the magnitude of damage)

How to identify the level 78 Qiqirn Poulterer: one way to check you have found the correct level Qiqirn is to set your accuracy score to 263 and use the "check" function to find the right Qiqirn. One way to achieve this is to equip a weapon type for which you have 230 combat skill (example: BLM with max club skill). 230 combat skill corresponds to 227 accuracy. Suppose you also have 62 DEX. For a one-handed weapon (club), this means you have +31 accuracy. Then equip +5 accuracy worth of equipment (example: Chivalrous Chain) to achieve a total accuracy score of 263. Level 77 and 76 Poulterers will check "low evasion," while the level 78 Poulterer will give no evasion message. Incidentally, this implies the level 78 Poulterer has at least 293 total evasion. You can confirm the level after killing the Poulterer by noting EXP yield (200 base EXP for level 78, 230 given 15% Sanction bonus).
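The accuracy arithmetic above can be sketched as follows. The conversion rules are assumptions chosen to reproduce the numbers in the paragraph (0.9 accuracy per skill point above 200, 0.5 accuracy per DEX for a one-handed weapon, and a 30-point gap before the "low evasion" message appears); the function names are hypothetical.

```python
def accuracy_from_skill(skill):
    # Assumption: 1 accuracy per skill point up to 200, then 0.9 per point
    # (floored). Integer math avoids floating-point rounding.
    if skill <= 200:
        return skill
    return 200 + (skill - 200) * 9 // 10

def accuracy_from_dex(dex):
    # Assumption: one-handed weapons gain 0.5 accuracy per DEX (floored).
    return dex // 2

LOW_EVASION_GAP = 30  # assumption inferred from 293 - 263 in the text

total_accuracy = accuracy_from_skill(230) + accuracy_from_dex(62) + 5
evasion_lower_bound = total_accuracy + LOW_EVASION_GAP
exp_with_sanction = 200 * 115 // 100  # 200 base EXP plus 15% Sanction bonus

print("total accuracy:", total_accuracy)
print("implied evasion of at least:", evasion_lower_bound)
print("EXP with Sanction:", exp_with_sanction)
```

If your skill or DEX differs, plug in your own values and adjust the accuracy gear until the total lands on 263.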

Total magic accuracy for this experiment: it is known reasonably well (I will not cite evidence at this time) that having 65 INT, 290 elemental magic skill, +5 magic accuracy (from equipment), and no elemental staff corresponds to having about 55% magic accuracy rate for the Stone I spell. (The level 78 Qiqirn Poulterer has 65 INT.) If your elemental magic skill is higher or lower (say 292), make the appropriate adjustments to INT and/or magic accuracy. Here, +/-1 INT is considered +/-1% magic accuracy rate (up to a point), and +/-1 magic accuracy (from equipment) is considered +/-1% magic accuracy rate, too.

Given the above, equipping a HQ staff like Terra's brings your magic accuracy up to ~85%. A "quickie" trial I ran gave 57/71 non-resisted Stone I, strong evidence of uncapped magic accuracy rate. As postulated previously, equipping a Teiwaz with earth affinity: magic accuracy +3 could bring your magic accuracy up to ~95% (the maximum rate).
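The magic accuracy bookkeeping for this setup can be written as a small helper. This is a sketch of the additive model described above, not a confirmed game formula: the 55% baseline, the 1%-per-point adjustments, the staff bonuses, and the 95% cap are all taken from the text, and the +40 for affinity +3 is the postulate being tested. The function name is hypothetical.

```python
MACC_CAP = 95  # percent, the maximum rate mentioned in the text

# Staff bonuses in magic accuracy points (the last one is only postulated).
STAFF_BONUS = {"none": 0, "NQ staff": 20, "HQ staff": 30, "affinity +3": 40}

def hit_rate(staff="none", int_delta=0, macc_delta=0, baseline=55):
    """Expected unresisted rate (%) for Stone I on the level 78 Poulterer,
    treating +/-1 INT and +/-1 magic accuracy as +/-1% each."""
    raw = baseline + int_delta + macc_delta + STAFF_BONUS[staff]
    return min(MACC_CAP, raw)

for staff in STAFF_BONUS:
    print(f"{staff:12s} {hit_rate(staff)}%")

observed = 57 / 71  # the "quickie" trial with Terra's Staff
print(f"observed with HQ staff: {observed:.1%}")
```

The observed rate from the quick trial sits a little under the model's 85% for an HQ staff and well short of the cap, which is the property the experiment relies on.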

Why 100 casts of Stone I on a level 78 Qiqirn Poulterer? 100 is an arbitrary figure as I am too lazy to do a power calculation, but since we know a priori that Terra's Staff doesn't even give a capped magic accuracy rate (and it shouldn't since I said it would be ~85%), it will be very easy to show that, if the Teiwaz earth affinity: magic accuracy +3 really has +40 magic accuracy, the observed data will indicate a capped magic accuracy rate. Of course, if the accuracy bonus were higher than +40, this test wouldn't be able to show that, but the in-game constraints described by the current magic accuracy "model" and a desire for a "minimal" sample size (the further away from 50%, the smaller the standard error) led to the above experimental conditions.
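To put rough numbers on the standard error remark, here is a minimal sketch comparing 100 casts at the two rates in question. The threshold of 92 unresisted casts is purely hypothetical, chosen for illustration; it is not a decision rule from the post, and this is not a substitute for a proper power calculation.

```python
from math import comb, sqrt

def standard_error(p, n):
    """Standard error of an observed proportion from n Bernoulli trials."""
    return sqrt(p * (1 - p) / n)

def prob_at_least(k, n, p):
    """Binomial upper tail: P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

N_CASTS = 100
for rate in (0.50, 0.85, 0.95):
    print(f"rate {rate:.2f}: standard error {standard_error(rate, N_CASTS):.4f}")

THRESHOLD = 92  # hypothetical cutoff for "looks capped"
for rate in (0.85, 0.95):
    print(f"P(at least {THRESHOLD}/100 unresisted | rate {rate:.2f}) = "
          f"{prob_at_least(THRESHOLD, N_CASTS, rate):.3f}")
```

The standard error shrinks as the true rate moves away from 50%, so a given number of casts separates 85% from 95% more sharply than it would separate, say, 50% from 60%.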

Considerations: avoid Qiqirn Goldsmith links. To minimize damage from ranged attacks, it is preferable to be RDM. If not, have someone spam heal you while you hammer out 100 casts in short order. It really doesn't take that long.

Questions and desired clarifications about experimental conditions may be fielded in the comments, if anyone actually gives a shit.

Credit to pchan on BG for previous work on Qiqirn Poulterers that allows for a fairly straightforward and not-onerous experiment.

Tuesday, April 21, 2009

One more time

I am fairly amused that the conclusions from lodeguy's magic accuracy experimentation and my data analysis have been used to support the shibboleth of "320 skill/120 INT" for direct-magic damage (just browsing FFXI forums periodically). Maybe "shibboleth" is too strong a pejorative, since at least this rule of thumb acknowledges that INT contributes to overall magic accuracy (even though this acknowledgment seemed to be supported mainly with anecdotes and collective experience rather than formal data collection).

Should we really care about attaining 120 INT?

As you may recall, lodeguy gave us data that suggest (informally) a critical point for ΔINT (caster's INT minus target's INT) that "connects" two distinct regimes of rate of change of overall magic accuracy with respect to INT. To summarize, before ΔINT +10, the rate of change is estimated to be 1% per 1 INT (actually a little less from statistical significance testing), and between ΔINT +10 and ΔINT +30, 0.5% per 1 INT. I only emphasize this range because there is no data to show what might happen beyond ΔINT +30. (Moreover, there was no data to suggest, as far as I can recall, the effect of INT below 50% overall m. acc. But, realistically speaking, no one is ever going to investigate these issues. This is the best we will ever get, probably.)
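The two regimes can be summarized as a piecewise function. This is a sketch of the summary above and nothing more: it uses the rounded 1% and 0.5% rates, measures the change relative to ΔINT 0, ignores the accuracy cap, and refuses to extrapolate past ΔINT +30 because there is no data there. The function name is mine.

```python
def macc_change_from_dint(dint):
    """Change in overall magic accuracy (percentage points) relative to
    ΔINT = 0, using 1% per INT up to +10 and 0.5% per INT from +10 to +30."""
    if dint > 30:
        raise ValueError("no data beyond ΔINT +30")
    if dint <= 10:
        return 1.0 * dint
    return 10.0 + 0.5 * (dint - 10)

for dint in (0, 5, 10, 20, 30):
    print(f"ΔINT {dint:+d}: {macc_change_from_dint(dint):+.1f}%")
```

Under this sketch, the 20 INT between ΔINT +10 and +30 buys as much accuracy as the first 10 INT did, which is why the location of the critical point matters for gear choices.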

With that in mind, it might be interesting to get some sense of whether 120 INT is generally suitable in "endgame" to reach the second ΔINT range with the slower rate of change. To do this, one must compare 120 INT to the INT of various "endgame" mobs.

Regrettably, information about mob INT from English-language sources is either poorly documented (sequestered in obscure FFXI forum posts) or almost non-existent (seriously, does anyone give a fuck about anything other than Ebony Puddings?), and this annoyed me to the point that I attempted to calculate the INT (as well as magic defense bonus, or MDB, and reduction of magic damage taken, or MDT-) of various mobs that I faced over the past few months to get a sense of whether I was surpassing ΔINT +10 most of the time. As I said in the last post, magic damage is deterministic (level of resist is random), so it should be fairly straightforward to calculate mob INT in many cases. Of course, I could have made calculation errors or overlooked level variability for specific mobs. I will leave it to others to verify or refute my calculations.

There isn't much variety in what I do in FFXI, though. All I have is data for mobs in NW Apollyon and those for various ZNMs. First, NW Apollyon:

Monster	INT	MDB	MDT-
Bardha	75	0	0
Pluto	82	0	0
Mountain Buffalo	60	0	0
Apollyon Scavenger	62	0	0
Gorynich	72	0	0
Kronprinz Behemoth	74	0	0
Kaiser Behemoth	???	???	???

As you can see, most of the "normal" mobs have low INT so that ΔINT +10 is easily cleared. As for Kaiser Behemoth, I didn't gather enough information, but I am pretty sure it possesses some combination of MDB and MDT- traits. I also collected similar data on some ZNMs I fought several months ago:

Monster	INT	MDB	MDT-
Lil' Apkallu	60	0	1/4
Verdelet	115	0	0
Experimental Lamia	89	0	1/8
Mahjlaef the Paintorn	112	0	1/4
Cheese Hoarder Gigiroon	81	0	0
Vulpangue	78	0.20	0
Dea	62	0	0
Iriz Ima	70	0	0
Gotoh Zha the Redolent	92	0.28	1/8
Tinnin	85	0.20	0
Achamoth	65	0.16	0

Here, MDB is reported in terms of amount above 1.00. MDT- is reported in terms of fractional reduction of magic damage.

Other than Verdelet (an imp) and Mahjlaef the Paintorn (a soulflayer), all of the ZNMs have INT such that ΔINT is well above +10. Therefore, from the standpoint of optimizing overall magic accuracy (given what we know), it seems practical to exchange INT in excess of ΔINT +10 for elemental magic skill or magic accuracy. In particular, this could be useful for Tinnin, which seems to have higher magic resistance than the "lower-tier" ZNMs (probably a result of level difference) despite having "only" 85 INT.
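The comparison against 120 INT is simple enough to tabulate. A minimal sketch, using the ZNM INT estimates from the table above (which carry whatever calculation errors the table does) and a caster at exactly 120 INT:

```python
CASTER_INT = 120  # the rule-of-thumb figure under discussion

# ZNM INT estimates copied from the table above.
znm_int = {
    "Lil' Apkallu": 60,
    "Verdelet": 115,
    "Experimental Lamia": 89,
    "Mahjlaef the Paintorn": 112,
    "Cheese Hoarder Gigiroon": 81,
    "Vulpangue": 78,
    "Dea": 62,
    "Iriz Ima": 70,
    "Gotoh Zha the Redolent": 92,
    "Tinnin": 85,
    "Achamoth": 65,
}

delta_int = {name: CASTER_INT - mob_int for name, mob_int in znm_int.items()}
at_or_below_10 = [name for name, d in delta_int.items() if d <= 10]

for name, d in delta_int.items():
    print(f"{name:26s} ΔINT {d:+d}")
print("Not clear of ΔINT +10:", at_or_below_10)
```

Lowering CASTER_INT shows how much INT could be traded away against each target before dropping back under ΔINT +10.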

Moreover, there could be some patterns to mob INT despite the limited information available. Beastmen and other "sentient" mob types (particularly soulflayers and imps) could have higher INT in general than other types. Magic users have higher INT in general than non-magic users (I will treat this as self-evident).

But concerning the main question, it appears, at least for most ZNMs that are worth nuking and mobs in NW Apollyon, that ΔINT +10 is surpassed most of the time. If you happen to get close to 120 INT incidentally, that's great, but not necessarily at the expense of possible improvements to magic skill/magic accuracy. For example, Dea has only 62 INT, but it is still prone to resisting Thunder IV (compared to Blizzard IV). Therefore, it would be appropriate to use Sorcerer's Petasos instead of Demon Helm +1 for the sake of improving accuracy.

None of these mobs even have INT above 120, so it's not like you would get much of an improvement to resist rates whoring INT (such that ΔINT +10 is satisfied) compared to whoring magic skill/accuracy (all things being equal).

So what about beastmen "kings" and HNMs? Bahamut ("The Wyrmking Descends") is reported to have 115 INT (from Studio Gobli, if you can actually find the documentation). (Bahamut is sentient, right? Check.) Also Jormungand is reported to have 120 INT (also from Studio Gobli). (Perhaps the example of Jormungand motivated the 120 INT figure?) Other than that, I have no other information.

Anyone can calculate mob INT, but...

... magic defense bonus (MDB) and reduction in magic damage taken (MDT-) can get in the way of calculating INT. These factors may play a role in determining overall magic damage for things like Sarameya and Tyger. Without knowing MDB and MDT- and considering the incessant flooring involved in these calculations, it is somewhat difficult to arrive at a unique set of MDB/MDT-/INT that allows you to calculate magic damage exactly without using formal optimization methods, and I am not interested in doing that.

However, this post offers some very useful facts to determine what exactly a mob's potential MDB or MDT- is. In particular,

1000 Needles is not affected by MDB.
Quick Draw is not affected by MDT-.
Damage calculations for both are independent of mob INT.

Unfortunately, I don't have access to blue mage or corsair, but these tools would be very useful if I had access to them. Practically speaking, it doesn't seem particularly appropriate to do this kind of testing during "serious" events (how seriously do you take Proto-Ultima?), but your mileage may vary (enough with the cliches!).

Tuesday, January 27, 2009

Tears of a clown

So that this post is not a complete waste of time, unlike the vast majority of hand-wringing and gloating cluttering TTTO over the past week post-banning (as opposed to the usual gloating and preening about acquired equipment and "accomplishments"), here are some more results from lodeguy's experimentation that I did not address previously but that are still interesting (they just didn't require any statistical techniques to analyze).

For direct-magic damage, does weakness to a certain element guarantee half-resists at worst?

Lodeguy demonstrated that there is a case where elemental weakness--specifically a Fire Elemental's weakness to water--guarantees half-resists for direct-damage magic at worst. He never observed anything worse than a half-resist casting Water magic over 1,000 times on a Fire Elemental. Remember that the usual distribution of resists is easily predicted (well, if you have some estimate of an "unresisted" rate to begin with), and that the proportion of quarter or "full" (1/8) resists when effective magic accuracy (no resist) is .30 is expected to be almost .50.

This observation probably does not apply to high-level Notorious Monsters with "known" elemental weaknesses. If it were true, there would be little incentive to maximize magic accuracy for such targets. I think players would generally accept the tradeoff of having few unresisted nukes in exchange for guaranteeing at least half-damage. But perhaps there is more to this phenomenon than what has been observed.

Again, I do not read Japanese so there may be something important in the discussion that I overlooked.

What is the maximum effective magic accuracy?

It is obviously 95% for direct-magic damage. Again, under the usual distribution of resists, the proportion of half resists when effective magic accuracy is 95% is .0475; for quarter resists, .002375. Note the rare quarter-resist event. A "full" resist with magic accuracy capped is even rarer, with hypothetical probability .000125, or basically a 1-in-8,000 event.
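The "usual distribution of resists" behind these figures can be sketched as repeated checks at the same rate: each failed check drops the spell one resist tier, up to three failures. That reading is an assumption on my part, but it reproduces the proportions quoted in this post. The function name is hypothetical.

```python
def resist_distribution(p):
    """Proportions of each resist tier when the unresisted rate is p and each
    tier is a further independent check at the same rate."""
    q = 1 - p
    return {
        "unresisted": p,
        "half": q * p,
        "quarter": q * q * p,
        "eighth": q ** 3,  # the "full" resist
    }

for rate in (0.95, 0.30):
    dist = resist_distribution(rate)
    print(f"effective magic accuracy {rate:.2f}:")
    for tier, prob in dist.items():
        print(f"  {tier:10s} {prob:.6f}")
    print(f"  quarter or worse: {dist['quarter'] + dist['eighth']:.4f}")
```

At .95 this gives the half, quarter, and full-resist proportions listed above; at .30 the quarter and full resists together come to just under half of all casts.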

Now that that's taken care of, some idle thoughts on the duping-related bans.

Tears of a clown - my, my, how the ressentiment flies

It is obvious to me that the point of any carte blanche clause in a "terms of service" you may observe is so that the service provider has an easy out to rid itself of undesirables. I do not really care that SE reserves a prerogative to regulate its own product through banning of accounts.

Yet although SE certainly does not need to justify any bans or suspensions it metes out, the mere appearance of uneven application of "punishment" (regardless of whether punishment actually was unevenly applied) makes FFXI look even more of a joke than it already is.

That this exploit remained in place for over 18 months is enough of a joke. Never mind SE's execrable neglect of widespread RMT activity (considerably more pervasive than anything to do with Salvage) left unabated for years in a MMORPG whose conditions were extremely conducive to RMT activity (despite SE reserving carte blanche to terminate accounts), the "nerf" to Pandemonium Warden and Absolute Virtue in response to negative publicity, etc., etc.

And, anyone who thinks anyone at SE wasn't cognizant of the duping exploit at any time during the 18 or so months before the lead-up to patch-and-ban is an idiot. There is always some "snitch" who would report such a thing. I say snitch in the vein of someone who does the right thing mostly out of impure motives, like "seeing bitches get their comeuppance."

Face it, no one gives a fuck about the "integrity" of a consumer product/service like a MMORPG except immature 39-year-olds who deign to waste all their time fulminating about a trivial thing.

Of course, this doesn't stop idiots with no sense of proportion from riding SE's dick any chance they get. That SE should even pursue damages in court for duping in a video game to keep players "honest" is simply a farcical notion. Acting like SE is some poor besieged entity in a game rife with whining entitled players--most of whom, at the end of the day, despite crying about "intolerable" low drop rates in an endgame activity they voluntarily entered into, still waste their money on FFXI--is also a joke. SE and the "player community," you deserve one another.

At the end of the day, SE can merely point to your monthly credit card statements, cheater or not, and simply say, "monthly fee." Money is the prime mover, although SE sometimes seems not to act like it is.

Thursday, January 8, 2009

Mr. Decay

This post will be a potpourri of topics.

Chocobo racing - Crystal Stakes results

As of this post I've raced my (good) chocobo (SS/B/B/B) 112 times in the Crystal Stakes (C1) and obtained the following results:

1st: 52
2nd: 40
3rd: 16
4-8: 4

Total: 112

I haven't seen much information on results with other chocobo configurations, except from this one forum post (chocobo attributes unknown):

1st: 27
2nd: 12
3rd: 10
4-8: 7

Total: 56

Now, I have no idea how often this other chocobo faced competing PC chocobos, but seeing another chocobo's results helps to provide some more perspective.

Is B receptivity a good hedge if it means placing 2nd relatively more often than a chocobo with lower discernment, "all things being equal" (which never happens, but let's just finish this filler post)? On one hand, I'd rather place 1st more often at the expense of placing 3rd or lower more often, as in the long run the return could be better. (One way to think of it is that it's better to place 1st and 3rd in two races than 2nd both times.)

On the other hand, I don't like farming chocobucks.

Sure, you can calculate expected gains and losses of chocobucks per race, but I won't do it because it won't motivate me in any way to raise another chocobo.

Magic accuracy - do weather and day have an effect?

Earlier (see previous post), I observed that the (effective) magic accuracy of Paralyze seemed not to be (statistically) significantly affected when Paralyze was cast during Firesday and Iceday. I tried to search for more data sets, but I didn't find anything meaningful.

Lodeguy himself seems to have said neither day nor weather has an effect (too lazy to find the actual quote), but, really, since his goal was to measure changes in (effective) magic accuracy, whether or not there is a day/weather effect (that he didn't control for and is not practical to control for) doesn't matter all that much considering the effect, if it exists, processes only 1/3 of the time. (I haven't verified this myself though.)

Anyway, I guess I could operate under the assumption that weather and day do affect resist rates. But are the effects of day and weather on accuracy (if they exist) the same in magnitude as the effects of day and weather on damage?

If you wanted to test this assumption and you have a scholar, you could see whether single weather and day combined drastically increase the accuracy of nukes of the same element. (You could also check for the reduction in accuracy of nukes of the opposite element.)

Laziness dictates that I should do a basic statistical power calculation to obtain the number of Bernoulli trials needed to observe that a possible 20% increase in effective magic accuracy is statistically significant, given a Type I error of 5%:

Computed N Total

Actual Power    N Total
       0.801        166

This conservative (but one-sided) power calculation (details omitted) indicates I need a total of 332 samples (166 for the trials without the effect of weather and day, and 166 for the trials with the effect) to observe statistical significance (using Fisher's exact test) with a probability of .8. And this probability assumes that this 20% increase (or reduction depending on your approach) is real.
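For a rough feel of how such a sample size arises, here is a sketch using the textbook normal approximation for comparing two proportions, one-sided. It is not the Fisher's exact calculation described above (which is more conservative), and the baseline rate of .55 is my assumption since the post does not state one, so it should not be expected to reproduce the 166 figure. The function name is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.8):
    """Approximate per-group sample size to detect p1 versus p2 with a
    one-sided two-proportion z-test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    return ceil((numerator / (p2 - p1)) ** 2)

# Assumed baseline of .55 without weather/day, .75 with the 20% increase.
n = n_per_group(0.55, 0.75)
print("approximate casts per group:", n)
print("approximate casts in total: ", 2 * n)
```

Trying other baselines between .50 and .75 shows how sensitive the required number of casts is to where the unboosted rate actually sits.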

But, I would have to make sure my effective magic accuracy, without the effect of weather and day, is somewhere above 50% and less than 75%. Based on lodeguy's data, one could figure this out for Earth Elementals (...) or, better, a Qiqirn ranger.

If weather has an effect on magic accuracy, why does Klimaform exist?

The English description of Klimaform states that the ability "[i]ncreases the magic accuracy for spells of the same element as the current weather." This statement does not really imply an existing accuracy bonus from weather before Klimaform, nor does it really imply no weather bonus before Klimaform.

Magic accuracy - revisiting data sets other than lodeguy's

A long time ago I looked at this data set and then just glossed over it while talking about lodeguy's results. But "intellectual honesty" compels me to attempt to explain the results of this other data set.

Actually I do not recall all the experimental details, but I "hope" the Ebony Puddings targeted were at the infamous Mount Zhayolm experience "camp." There, Ebony Puddings have a level of 79 or 80. (Incidentally, I noticed that these flans provide an experience point bonus of 5%, which I could not find corroboration for on FFXIclopedia.) Then that makes the observed data more "plausible."

First, it would be pretty obnoxious to say that the effect of magic accuracy increases with skill level without even acknowledging the imprecision of the estimates. If you are going to claim that, then you have to claim that one point of magic accuracy input gives an effective magic accuracy increase well above 1%, as shown below, using the nuke data from "Test III" and "Test IV" together (without the INT observations):

                            Analysis Of Parameter Estimates

                              Standard     Wald 95% Confidence       Chi-
Parameter    DF    Estimate       Error           Limits            Square    Pr > ChiSq

Intercept     1     -3.2974      0.6462     -4.5639     -2.0309      26.04        <.0001
skill         1      0.0143      0.0022      0.0099      0.0186      40.56        <.0001
macc          1      0.0179      0.0027      0.0126      0.0232      43.27        <.0001

Not only can you not argue that macc is "better" than skill, you also cannot really say with a straight face that 1 point of magic accuracy input increases effective magic accuracy by some value well above 1%. That is just ridiculous on its face.

One possible explanation for the data is that the level 79 and level 80 Ebony Puddings were not targeted in roughly equal proportions; in the worst-case scenario, puddings of one level were inadvertently targeted exclusively for "Test III," and puddings of the other level were used exclusively for "Test IV." Since lodeguy provided some evidence of a level difference penalty (or bonus), we should be wary of such a phenomenon when collecting data.

For this data and experimental setting, a potential consequence of severe imbalance in the relative proportions of level 79 and level 80 Ebony Puddings targeted is a "distortion" of the true sampling distributions associated with the "skill" and "macc" effects, "true" meaning that the distributions should have a mean of 0.01.

The concept can be illustrated through simulation. This is not a "proof" of anything, just a whimsical example. Suppose that the difference in level penalty between a level 79 and level 80 Ebony Pudding is 10% magic accuracy. Then, using the worst-case scenario I described above, I can generate approximate sampling distributions (with many, many assumptions) for the slopes associated with the main effects.

The most important assumption for this simulation is that 1 point of skill equals 1% effective magic accuracy, and 1 point of magic accuracy input equals 1% magic accuracy output (regardless of whether this is true in reality, which I think it is).

For elemental magic skill, the approximate sampling distribution has a mean of 0.0154 (not 0.01) and a standard deviation of .00245, which is close to the standard error from the actual data.

For magic accuracy input, the approximate sampling distribution has a mean of about 0.0133 (not 0.01) and standard deviation of about .00334. The standard error from the actual data is not close to .00334, but the concept still shows the "plausibility" of the data. Moral of the story: failing to control for real effects may have deleterious consequences.

As for the apparent (lack of) effect of INT below 50% effective magic accuracy ("Test II"), if 30 INT really corresponds to a 15% magic accuracy bonus (assuming any bonuses are cut in half because of the hit rate penalty), observing no improvement (or worse) is virtually guaranteed not to happen. At this point, I would just keep this result in mind but take it with a grain of salt.

Here's the R code I used to generate the above graphs:

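n <- 10000  # number of simulated experiments

# Storage for the slope estimates and their standard errors
skill2 <- rep(0, n)
macc2 <- rep(0, n)
skill_se <- rep(0, n)
macc_se <- rep(0, n)

# Design: six conditions of 100 casts each (skill level and magic accuracy input)
skill <- c(rep(274,100), rep(274,100), rep(287,100), rep(284,100), rep(284,100), rep(295,100))
macc  <- c(rep(0,100), rep(13,100), rep(0,100), rep(0,100), rep(11,100), rep(0,100))

for (i in 1:n) {
  # Simulated outcomes; the last three conditions include the assumed 10% level difference
  success <- c(rbinom(100,1,.59), rbinom(100,1,.72), rbinom(100,1,.72),
               rbinom(100,1,.79), rbinom(100,1,.90), rbinom(100,1,.90))
  trials <- data.frame(success, skill, macc)
  # Linear probability model: binomial response with an identity link
  model <- glm(success ~ skill + macc, family = binomial(link = "identity"), data = trials)
  est <- coef(summary(model))
  skill2[i] <- est[2, 1]
  skill_se[i] <- est[2, 2]
  macc2[i] <- est[3, 1]
  macc_se[i] <- est[3, 2]
}

# Histograms of the simulated slopes (dev.new() replaces the Windows-only win.graph())
dev.new(width = 6, height = 4.5, pointsize = 12)
hist(skill2, freq = FALSE)

dev.new(width = 6, height = 4.5, pointsize = 12)
hist(macc2, freq = FALSE)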

Tuesday, December 30, 2008

Give me data or give me...

I will consider this my last post on the topic of magic accuracy/magic hit rates. Again, most of the credit should go to lodeguy for all of his time-consuming experiments and insights, much of which I elided in the interest of saving (my) time. I don't fancy myself a gatekeeper of knowledge (anyone who knows Japanese may want to review his posts in their entirety, since I cannot read Japanese for the most part) but two years on it's about time someone English-speaking talked about this stuff in some detail. Yet I thought writing about what someone else already figured out would be fairly straightforward...

You may review my first four posts on this topic:

On magic resist rates (Dec 17)
More on magic resist rates (Dec 18)
Describing "magic hit rate" symbolically (Dec 19)
Even more on magic resist rates (Dec 29)

First off, I'd like to address my use of the terminology "magic hit rate" to make a distinction between "effective magic accuracy" (output) and "magic accuracy" the attribute (input). I am not a terribly big fan of the term "magic hit rate" that I've been using the last few weeks, especially since in the Japanese language, the term 魔法命中率 could be translated as either "magic hit rate" or "magic accuracy," so there is no distinction between "hit rate" and "accuracy" using the Japanese term. Instead of using "magic hit rate" as "short-hand" for "the probability of landing an unresisted magic spell," I could have used one of the following:

effective magic accuracy
rate of landing magic unresisted
resist rate (as 1 minus the probability of landing an unresisted magic spell)

The first term seems the best since 1 point of magic accuracy input does not always yield 1 point of magic accuracy output, as shown previously, so I will use that term from here on. lodeguy himself used the term ヒット (hit) for an unresisted spell, so that is one reason I just adopted the terminology "magic hit rate" to start.

The second term is more awkward than "magic hit rate" and the third necessitates the use of negative language ("reducing resist rate") and making a distinction between different levels of resists (so that "resist rate" is understood as a catch-all for all types of resists). So that was supposed to be some kind of excuse for me using the term "magic hit rate" throughout.

Anyway, I mainly want to address whether a 1-point increase in elemental magic skill is equivalent to a 0.9% increase in effective magic accuracy above the 200 magic skill level.

This claim has endured as long as it has partly because of the intuitive appeal inherent in the notion that magic accuracy is supposed to be analogous to melee accuracy. Perhaps it was to talk up the value of pure magic accuracy as opposed to specific magic skill. (All things being equal, which rarely occurs, magic accuracy does have appeal as a general attribute, a catch-all for all types of magic.)

Yet with no way to verify easily what one's effective magic accuracy is, there was no convenient way to refute or confirm that claim (among many, many other claims). But lodeguy did all the inconvenient work for you, and it was sitting under my nose. And I can provide some additional cover for lodeguy.

First, let's re-examine one of lodeguy's data sets.

Casting Water magic (103 INT, Neptune's Staff) on a level 78 Earth Elemental at various levels of elemental magic skill, he obtained the following results (11,934 trials):

Skill	No resist	1/2 resist	1/4 resist	1/8 resist
235	1960 (.532)	967 (.262)	409 (.111)	348 (.094)
240	1294 (.582)	563 (.253)	229 (.103)	136 (.061)
250	1390 (.694)	407 (.203)	142 (.071)	65 (.032)
262	1585 (.821)	287 (.149)	43 (.022)	15 (.008)
270	1858 (.887)	204 (.097)	30 (.014)	2 (.001)

When I looked at this data set, it was to establish the increase in effective magic accuracy, above 50% effective magic accuracy, for every 1-point increase in elemental magic skill. For some inexplicable reason, I used only the bottom three rows of the above table when fitting the linear probability model, obtaining the following results:

           Criteria For Assessing Goodness Of Fit

Criterion                 DF           Value        Value/DF

Deviance                   1          1.1684          1.1684
Scaled Deviance            1          1.1684          1.1684
Pearson Chi-Square         1          1.1610          1.1610
Scaled Pearson X2          1          1.1610          1.1610
Log Likelihood                    -2878.9070


Algorithm converged.


                      Analysis Of Parameter Estimates

                         Standard     Wald 95% Confidence       Chi-
Parameter    DF    Estimate       Error           Limits            Square    Pr > ChiSq

Intercept     1     -1.7053      0.1604     -2.0197     -1.3909     113.03        <.0001
skill         1      0.0096      0.0006      0.0084      0.0108     249.35        <.0001

What I should've done instead was use all the data, in which case I obtain the following results:

           Criteria For Assessing Goodness Of Fit

Criterion                 DF           Value        Value/DF

Deviance                   3          2.4983          0.8328
Scaled Deviance            3          2.4983          0.8328
Pearson Chi-Square         3          2.4871          0.8290
Scaled Pearson X2          3          2.4871          0.8290
Log Likelihood                    -6935.4539


Algorithm converged.


                      Analysis Of Parameter Estimates

                         Standard     Wald 95% Confidence       Chi-
Parameter    DF    Estimate       Error           Limits            Square    Pr > ChiSq

Intercept     1     -1.8719      0.0683     -2.0056     -1.7381     751.90        <.0001
skill         1      0.0102      0.0003      0.0097      0.0108    1458.65        <.0001

While the first 95% confidence interval covered .009, the last 95% confidence interval, which was generated considering all the data at hand, does not cover .009 (effective magic accuracy increase of 0.9% for every one-point increase in elemental magic skill), so it's a pretty safe bet that above 200 elemental magic skill, 1 point of elemental magic skill is equivalent to effective magic accuracy higher than 0.9%.
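For anyone who wants to check the full-data fit without SAS, here is a rough Python sketch. It is not the procedure that produced the tables above: it assumes the linear probability model can be fit by iteratively reweighted least squares on the five sample proportions, and the function name `fit_linear_probability` is made up for illustration.

```python
import math

# Skill level -> (unresisted casts, total casts), from lodeguy's table above.
data = {
    235: (1960, 3684),
    240: (1294, 2222),
    250: (1390, 2004),
    262: (1585, 1930),
    270: (1858, 2094),
}

def fit_linear_probability(data, iterations=25):
    """Fit P(no resist) = intercept + slope * skill by reweighted least squares."""
    xs = list(data)
    props = [hits / n for hits, n in data.values()]
    sizes = [n for _, n in data.values()]
    fitted = props[:]  # start from the observed proportions
    for _ in range(iterations):
        # Binomial weights n / (p * (1 - p)), evaluated at the current fit.
        w = [n / (p * (1 - p)) for n, p in zip(sizes, fitted)]
        sw = sum(w)
        xbar = sum(wi * x for wi, x in zip(w, xs)) / sw
        ybar = sum(wi * y for wi, y in zip(w, props)) / sw
        sxx = sum(wi * (x - xbar) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - xbar) * (y - ybar) for wi, x, y in zip(w, xs, props))
        slope = sxy / sxx
        intercept = ybar - slope * xbar
        fitted = [intercept + slope * x for x in xs]
    return intercept, slope, 1 / math.sqrt(sxx)  # last value: slope std. error

intercept, slope, slope_se = fit_linear_probability(data)
print(f"intercept={intercept:.4f} slope={slope:.4f} se={slope_se:.4f}")
```

The slope and its standard error should land close to the table's 0.0102 and 0.0003, so the lower confidence limit still clears .009.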

To visualize how good the model fit is, here's some graph-junk for you:

Conclusion: There is scant reason to believe that 1 point of elemental magic skill above the 200 level yields only a 0.9% increase in effective magic accuracy (unless lodeguy and I were extremely unlucky). You might as well treat it as a 1% increase! Perhaps this is not the case for other types of magic skill, and perhaps there is some funny business above the 300 level, but at least here is some conclusive evidence for the range of elemental magic skill considered.

The following is supposed to be the extra "cover" for lodeguy's results (they can stand on their own though), and again is mainly for my own amusement.

Finally, using the exact same sample-size allocation and levels of elemental magic skill that lodeguy used for this particular experiment, I can generate approximate sampling distributions for the mean change in effective magic accuracy (magic hit rate) per one-point increase in elemental magic skill.

First off, I generated a sampling distribution assuming +0.9% effective magic accuracy per +1 elemental magic skill. Here, I assumed that the effective accuracy at 240 elemental magic skill was exactly 53%, but it is the changes in elemental magic skill that really matter:

(The approximate normal distribution is drawn with a red curve, and the histogram uses the data generated from simulation.)

As you can see, under the assumption of +0.9% effective magic accuracy per +1 elemental magic skill, observing (as a point estimate) an increase in effective magic accuracy of 1% or greater for any one experiment (given 11,934 trials...) is pretty rare (in the right tail). If you treat this assumption as a straw man to knock down (otherwise known as the null hypothesis), you will knock down the straw man (reject the null) with an approximate probability of .93 (given Type I error of .05) if the real (not estimated) accuracy increase is 1% for every 1-point increase in elemental skill. Of course, if the real increase is just 0.9%, the null will be rejected "only 5% of the time" (the Type I error of .05 that was fixed in advance of frequentist inference).
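The power figure can be roughly cross-checked without simulation. The sketch below is a normal-approximation shortcut, not the simulation described above. It assumes a two-sided test and borrows the rounded standard error of 0.0003 from the full-data table, so it should only be expected to land in the same neighbourhood as .93, not to match it. The function name `approx_power` is made up for illustration.

```python
from statistics import NormalDist

def approx_power(true_slope, null_slope, std_error, alpha=0.05):
    """Two-sided power of a z-test on the slope, by normal approximation."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    shift = (true_slope - null_slope) / std_error
    # Probability that the test statistic falls in either rejection region.
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)

# Assumed standard error: the rounded 0.0003 from the full-data fit.
power = approx_power(true_slope=0.010, null_slope=0.009, std_error=0.0003)
size = approx_power(true_slope=0.009, null_slope=0.009, std_error=0.0003)

print(f"power when the real slope is 0.010: {power:.3f}")
print(f"rejection rate when the null is true: {size:.3f}")
```

When the true slope equals the null slope, the function returns the Type I error rate, which is a quick sanity check on the formula.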

It may also be interesting to see what an approximate sampling distribution for the mean change in effective magic accuracy, assuming +1.0% effective magic accuracy per +1 elemental magic skill, would look like:

If the assumption (+1.0% effective m.acc per +1 elemental skill) is actually true, then observing an increase in effective magic accuracy of 0.9% (or less) should be extremely rare (see left tail), given 11,000+ samples.

That's a wrap for this topic.

Monday, December 29, 2008

Even more on magic resist rates

(Edit - Dec. 30: Some further thoughts on the enspell experiment.)

In the past week, I discussed over several posts a Japanese player's extensive exploration of magic resist rates, specifically changes in "magic hit rate" ("lack of resist" rate) with each of several controllable factors (use of elemental staves, elemental magic skill, INT, and magic accuracy) for nukes alone. You may review these posts under the "magic resist analysis" tag.

I would've left it at that, but I didn't realize until now that there was a "reaction" on BG forums generated by my discussion of lodeguy's data. In particular, there are a few data sets that I would like to go over as they may help focus further investigation.

You may skip to the summary if you like.

Alkalurops vs. HQ elemental staff

As a competitor to elemental staves, Alkalurops seems to be maligned from an accuracy standpoint because it's assumed that its "effective" accuracy (comprised of contributions from INT/MND/CHR +10 and magic accuracy +20) is worse than that of a HQ elemental staff, usually because it is assumed that staves provide a multiplicative accuracy bonus (in the absence of any real evidence). (Obviously, for nukes Alkalurops is inferior to HQ staves merely from a damage standpoint.) But, if you've read any of my recent posts, you should now be comfortable asserting that INT does make an important contribution to reducing resist rates, at least when it comes to nukes.

Consider the results of the following experiment comparing the accuracy of Alkalurops to that of Terra's Staff (check the forum post for details as I am not interested in rehashing experimental conditions):

Condition	No resist	Some resist
No staff	296 (.296)	704 (.704)
Terra's Staff	354 (.443)	446 (.558)
Alkalurops	355 (.444)	445 (.556)

At the risk of committing a Type II error, this result is not all that surprising given what we've inferred thus far about magic hit rate bonuses from elemental staves, magic accuracy, and INT.
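As a quick check on how close the two staves are, a pooled two-proportion z-test can be run on the counts in the table. This is a minimal sketch in Python using a normal approximation. The helper name `two_proportion_z` is made up for illustration, and the test is my choice of method, not something from the forum post.

```python
import math
from statistics import NormalDist

def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """Pooled two-sample z-test for equal proportions; returns (z, two-sided p)."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Unresisted casts and total casts for each condition, from the table above.
z_staves, p_staves = two_proportion_z(355, 800, 354, 800)   # Alkalurops vs. Terra's Staff
z_staff, p_staff = two_proportion_z(354, 800, 296, 1000)    # Terra's Staff vs. no staff

print(f"Alkalurops vs. Terra's Staff: z={z_staves:.3f}, p={p_staves:.3f}")
print(f"Terra's Staff vs. no staff:   z={z_staff:.3f}, p={p_staff:.2g}")
```

The staff-versus-staff comparison gives a z statistic near zero, while either staff against no staff is far out in the tail. Failing to find a difference is not proof of equality, hence the Type II caveat.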

The Terra's Staff (HQ) seems to be providing a constant 15% increase to the "success" (no resist) rate, which could be considered an elemental magic accuracy bonus of +30 cut in half (+15) since the initial and final magic hit rates are both under 50%.

A possible explanation for the Alkalurops is that it seems to be providing an "effective" magic accuracy bonus of +30, with contributions from its magic accuracy attribute (magic accuracy +20) and from its INT attribute (INT +10 corresponding to a constant increase of 10% hit rate since ΔINT is below +10). This effective accuracy bonus is cut in half since the initial and final magic hit rates are both under 50%.

Not only does this example suggest that the accuracy effect of added INT (like magic skill, magic accuracy, and staff accuracy bonuses) is attenuated below 50% magic hit rate, it also suggests that Alkalurops is a strong enfeebling staff and an acceptable replacement for a whole family of HQ elemental staves (again, when it comes to enfeebling). When it comes to enfeebles, it would not be that great a leap to conclude that, based on results from nuking, INT/MND/CHR must provide some accuracy bonus in addition to a potency bonus (where applicable). The next example shows that additional MND does reduce the "complete resist" rate of Paralyze.

Experiments with Paralyze

This next data set comes from "FFXI Hunter's Bible Version II" and is the result of 8,000 casts of Paralyze on level 84 Aura Statues (were they sure the level was fixed?) under varying conditions. The event of "success" was defined as anything that wasn't a complete resist (both non-resists and partial resists). A summary of point and interval estimates for this data set follows:

                  Analysis Of Parameter Estimates

                          Standard  Wald 95% Confidence
Parameter        Estimate     Error        Limits         Pr > ChiSq

Intercept         -1.9919    0.4107  -2.7969     -1.1870      <.0001
skill              0.0064    0.0013   0.0038      0.0090      <.0001
macc               0.0073    0.0013   0.0046      0.0099      <.0001
mnd                0.0075    0.0013   0.0048      0.0101      <.0001
staff      HQ      0.2070    0.0207   0.1665      0.2475      <.0001
staff      NQ      0.1710    0.0205   0.1308      0.2112      <.0001
day        Ice     0.0160    0.0192  -0.0216      0.0536      0.4048
day        Fire   -0.0210    0.0187  -0.0576      0.0156      0.2610

I know it is kind of trivial to give such a summary (the result of fitting the saturated linear probability model) when you can inspect the data directly and see that MND does affect the accuracy of Paralyze, but it does conveniently summarize the precision of these point estimates in red.

It is important to note that the given point estimates are not appropriate to describe the actual changes in magic hit rate (no-resist rate) because they also encompass partial resists. These estimates should therefore be higher than the "real" no-resist rates.

The other important observation is that the effects of Iceday and Firesday, respectively, on the accuracy of Paralyze are not even close to being statistically significant (at the 5%, 10%, 15%, and 20% levels), if they even exist at all. The relevant p-values are in red. It is not mentioned whether an obi was used.

The following is a convoluted discussion on whether there are discrete levels of Paralyze resists and how they may be observed indirectly. You can skip this part since it's mainly for my own amusement.

We know (or should know from experience) that levels of resists for Sleep, Poison, and the "elemental enfeebling" line of spells seem to have 3 distinct levels of resists (full duration, half-duration, and complete resist). Is it appropriate to conclude that enfeebling spells with a "continuous" range of durations, like Paralyze (only Paralyze?), also have discrete levels of resistance?

If the point estimates above are really the result of Paralyze resists following a multinomial distribution, it may help to illustrate what these point estimates might really be... estimating. For example, if many enfeebling spells follow a multinomial distribution, using the exact same logic of conditional probability that has been validated for nukes (not to say that this is easily verifiable for enfeebles, because it would take forever to do so; therefore, I cannot say whether this assumption is valid at the moment), then the multinomial proportions may vary with some level of "input" (INT, magic skill, magic accuracy, whatever) as follows:

Here, I describe a situation where there are only 3 levels of resists (no resist, half-resist, and full resist). This image roughly describes how p, the probability of no resist, and p(1 - p), the probability of a half-resist, may vary with input level when p is below 50%.

Suppose that the quantity p + p(1 - p) (illustrated in red) is what the point estimates above are actually trying to describe. Specifically, the point estimates would then be estimates of the slope (rate of change) of the curve in red. They seem "plausible" enough considering the precision (or imprecision) of these estimates compared to the theoretical rate of change of this curve as illustrated below:

So, it may really be the case that there are discrete levels of Paralyze resists. It's just that they seem difficult to observe directly, and may be observed indirectly by obtaining sample proportions of no resists and partial resists summed together (the sum being what is easily observed).

Obviously, the theoretical slope of p + p(1 - p) is not constant. If this applies to enfeebles such as Sleep, et al., then any experiments using Sleep and the like must account for this. (Not that anyone does or would.)
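To put numbers on that, note that p + p(1 - p) is the same as 1 - (1 - p)^2, so its slope is 2(1 - p) times the slope of p itself. The Python sketch below assumes p rises by 0.005 per point of input (1% per point, halved below 50%); that figure is an assumption carried over from the nuke results, not something measured for Paralyze. The function names are made up for illustration.

```python
def success_rate(p, partial_levels=1):
    """P(anything but a full resist) when each resist check passes with probability p."""
    return 1 - (1 - p) ** (partial_levels + 1)

def success_slope(p, dp_dinput, partial_levels=1):
    """Rate of change of success_rate per point of input, by the chain rule."""
    k = partial_levels + 1
    return k * (1 - p) ** (k - 1) * dp_dinput

# Three resist levels (no resist, half resist, full resist) -> one partial level.
# dp_dinput = 0.005 is an assumed value, not an estimate from the Paralyze data.
for p in (0.10, 0.25, 0.36, 0.50):
    print(f"p={p:.2f}  success={success_rate(p):.3f}  "
          f"slope={success_slope(p, 0.005):.4f}")
```

Under that assumption the three-level slope runs from 0.01 at p = 0 down to 0.005 at p = 0.5, a range that brackets the Paralyze estimates of 0.0064 to 0.0075. Setting `partial_levels=2` gives the four-level case.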

What if there are four levels of resists for Paralyze (and other enfeebling magic spells)? I also generated graphs for this case:

And for the rates of change:

Obviously, these apply to nukes. You can use lodeguy's data to verify that these trends apply to elemental magic skill and INT.

Using enspells to estimate changes in "magic hit rate" with magic accuracy

This approach seems to be a clever way to accumulate a large number of "trials" somewhat easily because it takes advantage of auto-attack and there are discrete levels of resists that are easily observed (if not automatically tallied with a parser). So, you can possibly deduce (estimate) a target's magic evasion after controlling for your own magic accuracy "score" and perhaps level correction/penalty.

Note that since the author has ice accuracy merits, the recorded magic accuracy levels reflect that. (I missed that initially.) But throwing caution to the wind, simple analysis of the above data (modeling the "full" enspell rate, or no-resist rate) cranks out the following results:

           Criteria For Assessing Goodness Of Fit

Criterion                 DF           Value        Value/DF

Deviance                   2          0.0042          0.0021
Scaled Deviance            2          0.0042          0.0021
Pearson Chi-Square         2          0.0042          0.0021
Scaled Pearson X2          2          0.0042          0.0021
Log Likelihood                    -1678.4884


Algorithm converged.


                                Analysis Of Parameter Estimates

                                       Standard     Wald 95% Confidence       Chi-
Parameter             DF    Estimate       Error           Limits            Square    Pr > ChiSq

Intercept              1     -0.8225      0.2504     -1.3133     -0.3316      10.79        0.0010
skill                  1      0.0048      0.0009      0.0030      0.0066      28.05        <.0001
macc                   1      0.0031      0.0051     -0.0069      0.0132       0.37        0.5428
element      earth     1      0.0950      0.0239      0.0481      0.1419      15.77        <.0001
element      ice       1      0.1418      0.0721      0.0004      0.2831       3.86        0.0493

If I entered the data correctly (rather, if the author bothered to record his data correctly), the goodness-of-fit statistics make the data appear very suspicious (p-value of .9979022 for the deviance statistic). If the estimates can be trusted, it appears that a one-point increase in enhancing magic skill increases the probability of a "full" enspell by almost .005.
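The p-value quoted above is easy to reproduce, because a chi-square variable with 2 degrees of freedom has the closed-form upper-tail probability exp(-x/2). A minimal Python sketch (the function name is made up for illustration, and it only handles the 2-df case):

```python
import math

def chi2_sf_2df(x):
    """Upper-tail probability of a chi-square variable with exactly 2 degrees of freedom."""
    return math.exp(-x / 2)

# Deviance of 0.0042 on 2 degrees of freedom, from the goodness-of-fit table above.
p_value = chi2_sf_2df(0.0042)
print(f"{p_value:.7f}")
```

A p-value this close to 1 means the observed proportions sit almost exactly on the fitted line, closer than sampling noise would normally allow.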

The difference in magic accuracy levels is small (only 5) so it will be hard to quantify the effect of magic accuracy precisely without increasing sample sizes.

Perhaps the accuracy of enspells is handled in a fundamentally different way than other types of magic are. (We see that the damage of enspells may be increased by adding enhancing magic skill, whereas the damage of nukes is not directly affected by increases in elemental magic skill.) It would then be a waste of time to reconcile these results to lodeguy's. On the other hand, it is much easier to investigate enspells for the reasons cited earlier.

Summary

Some "salient" observations:

In light of previous observations concerning the effects of INT and magic accuracy on magic hit rate ("lack of resist" rate), some evidence suggests that Alkalurops may be comparable to HQ elemental staves in terms of effective magic accuracy.
MND affects the accuracy of Paralyze.
Neither Firesday nor Iceday seems to affect the accuracy of Paralyze.
Know what you are measuring when investigating magic resist rates with enfeebles. Just because an enfeebling spell doesn't resist completely doesn't mean it wasn't a partial resist.
Investigation of enspells can lead to further insights about magic resist rates.

Friday, December 19, 2008

Describing "magic hit rate" symbolically

It has been almost two years since Taruface 4B (my term of endearment because he didn't provide his character name, or you can call him "Lodeguy" from his blog address) began his extensive, brute-force investigation of "magic hit rate," and it seems pretty strange to me that his results haven't really gotten much traction in the FFXI "community" (at least the English-speaking contingent) in the intervening 24 months. Just a thought.

In fact, in the past few months I have heard several English-speaking players allude to the graphs that seem to have originated from this player's blog, but no one ever called these players out on their assertions (they are not trivial ones), and I myself wasn't able to find these graphs until recently. (I just waded through Google results until I found his blog.)

Anyway, I don't feel like rehashing my recent posts about magic hit rate, which are more or less summarized in the previous post. I did get it twisted in the last post about how the game calculates magic hit rate after bonuses, but after considering all the analysis of the data in its entirety, I propose a simple equation to model magic hit rate, which is comprised of the following factors and discards completely the idea of a check against "base" magic hit rate (which was poorly defined by me and is a pretty awkward concept in hindsight) to determine magic hit rate bonuses:

A - magic accuracy score with contributions from magic skill, magic accuracy, staff bonus (elemental magic accuracy), and INT (and other factors)

E - magic evasion score with contributions from INT and elemental resistance (and other factors)

L - a penalty (or bonus) due to a level difference between caster and target.

Then, magic hit rate, represented by the Greek letter π (as a probability), could be modeled as

where X = 0.5 if the quantity in parentheses is negative, and X = 1 if the quantity in parentheses is positive. Yes, you could say this model describes a "check" against that quantity in parentheses.

Also, π seems to have an upper bound of .95 (95%) (I do not feel like showing this at the moment) and the lower bound could be anywhere between 0 and .20 (20%) if we are drawing an analogy to melee hit rate. If I had to hazard a guess, I would place the lower bound at 5% for the sake of symmetry.

Such a relation would account for the observation that the bonuses from elemental staves, magic skill, and magic accuracy all seem to be halved below 50% hit rate. It would also account for my observation regarding the staff bonus, from an initial magic hit rate below 50% to a final magic hit rate above 50%. (This is where I got it twisted to begin with.)
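One way to read this description is as a piecewise-linear function centered on 50%. The Python sketch below is a guess at a form consistent with the text, not a transcription of the equation. The division by 100, the sign convention for the level penalty, and the 5% floor are all assumptions, and `magic_hit_rate` is a made-up name.

```python
def magic_hit_rate(accuracy, evasion, level_penalty=0.0, floor=0.05, cap=0.95):
    """Hypothetical piecewise-linear hit rate; the exact form is an assumption."""
    margin = accuracy - evasion - level_penalty  # the quantity being "checked"
    x = 0.5 if margin < 0 else 1.0               # contributions halved below 50%
    rate = 0.5 + x * margin / 100.0              # assumes 1 point = 1% above 50%
    return min(cap, max(floor, rate))            # assumed 5% floor, 95% cap

# A +30 accuracy bonus applied entirely below 50% is worth only 15%.
print(magic_hit_rate(0, 40.8), magic_hit_rate(30, 40.8))

# The same +30 bonus crossing the 50% mark: 40% becomes 60%, not 55% or 70%.
print(magic_hit_rate(0, 20), magic_hit_rate(30, 20))
```

The second pair is the interesting one. A bonus that straddles 50% is halved only for the part spent below the 50% mark, which is what a check against the sign of the margin implies.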

A distinct factor to account for the specific contribution of INT, conditional on ΔINT, is also warranted given the data, but I would like to see more regarding INT bonuses below 50% hit rate.

There is also the possibility that, akin to melee accuracy and combat skill, the accuracy (hit rate) contribution from magic skill is 0.9% (instead of 1%) above 200 magic skill, so that pure magic accuracy may be more effective than magic skill (within certain ranges of magic skill), but showing that such a difference is statistically significant (assuming it exists) is difficult.

Another thought: for things that are resistant to magic of a particular element (or to a particular spell, like Slow), the "magic evasion" score could completely dwarf the "magic accuracy" score so that it's futile to pile on magic accuracy.

Again, this is just a working model, but it is one based on data, and not idle speculation. That's pretty much all I have for now, and I don't want to be even more pedantic and talk about the everyday implications of all of this if this model indeed holds, so I'll just leave it at that.

Thursday, December 18, 2008

More on magic resist rates

(Edit - Dec. 30: First image was fixed.)

(Edit - 7:00 PM PST: I wrote the last section in a muddle and it makes no sense. It was amended.)

(Edit - 5:00 AM PST: Summary added.)

This post is a continuation of my discussion of extensive data that a Japanese blogger collected for the purposes of investigating the relationship between "magic hit rate"--defined as a "lack of resist" rate for the purposes of my discussion unless otherwise stated--and each of several factors that are known to affect the accuracy of magic spells.

So far, I have gone over the possible relationship between magic hit rate and elemental staves and the relationship between hit rate and elemental magic skill. You may view the "tentative" conclusions so far. (I say "tentative" because I will be the first to acknowledge the limited scope of the binary regression models that are the basis for making any conclusions.) I will continue to focus exclusively on the accuracy of direct-damage magic ("nukes") as opposed to other magic types (but I may get around to discussing enfeebling magic later).

The importance of checking for linearity

First off, I just want to make a few comments regarding the (apparently) piecewise-linear relationship (which is plausible because it fits the data well, even if the author's procedure was more of an ad hoc one... not sure) between magic hit rate and elemental magic skill that the blogger described.

Yes, in the past I have said it may be feasible to estimate changes in "magic hit rate" with elemental magic skill by choosing two levels of elemental magic skill that are very far apart (and hope that your magic hit rate isn't capped before your higher level), and then perform some "regression" procedure, which is basically drawing a line through the two observed rates (sample proportions). If you fix the number of observations you will set out to collect, allocating your number of observations equally between the two levels will be the most efficient way to detect an effect (a statistical power rationale). Obviously, though, you can't even check for the linearity assumption (hence the term linear regression) since a line through two observed values is a perfect fit, and if the trend is not linear overall, the validity of your point estimate is highly suspect.

As an example, I return to this data set (experimental conditions: Water magic on a lv78 Earth Elemental, using 78 INT and varying levels of elemental magic skill with a Neptune's Staff):

Skill	No resist	1/2 resist	1/4 resist	1/8 resist
230	1233 (.380)	768 (.237)	499 (.154)	746 (.230)
240	832 (.434)	469 (.245)	245 (.128)	369 (.193)
250	1536 (.476)	826 (.256)	399 (.124)	468 (.145)
262	1000 (.598)	373 (.223)	163 (.097)	137 (.082)
270	1780 (.667)	600 (.225)	188 (.070)	99 (.037)

Performing a regression on this data yields the following results:

           Criteria For Assessing Goodness Of Fit

Criterion                 DF           Value        Value/DF

Deviance                   3         19.8838          6.6279
Pearson Chi-Square         3         19.8790          6.6263

         Analysis Of Parameter Estimates

                     Standard  Wald 95% Confidence
Parameter  Estimate     Error        Limits

Intercept   -1.2832    0.0719  -1.4242     -1.1422
Skill        0.0072    0.0003   0.0066      0.0077

As I said previously, obviously this model, which assumes a linear relationship between hit rate and skill over the entire range of skill, is a poor fit to the data. The Japanese blogger was aware of this and proposed piecewise linearity. I suspect a failure to check for lack of fit is behind the estimated hit rate increases described on wiki.ffo.jp for 1 point of elemental magic skill (.064), 1 point of INT/MND/CHR attribute (.074), and 1 point of magic accuracy (.072), although there is no source cited.
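A crude way to see the lack of fit without any model is to compute the slope between each pair of adjacent skill levels. This Python sketch uses only the counts in the table above. It is a rough diagnostic of my own choosing and does not reproduce the blogger's piecewise regression.

```python
# (skill, unresisted casts, total casts) for each row of the table above.
data = [
    (230, 1233, 3246),
    (240, 832, 1915),
    (250, 1536, 3229),
    (262, 1000, 1673),
    (270, 1780, 2667),
]

# Sample proportion of unresisted casts at each skill level.
props = [(skill, hits / n) for skill, hits, n in data]

# Change in hit rate per point of skill between adjacent levels.
slopes = [
    (s0, s1, (p1 - p0) / (s1 - s0))
    for (s0, p0), (s1, p1) in zip(props, props[1:])
]

for s0, s1, slope in slopes:
    print(f"{s0}-{s1}: {slope:.4f} per point of skill")
```

The segments where the hit rate stays under 50% show slopes of roughly half the size of those above it, which a single straight line cannot capture. The individual segment slopes are noisy, so this only hints at where a breakpoint might be.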

For your convenience, I have furnished a graph plotting the observed magic hit rates (sample proportions) versus elemental magic skill for the above data set, and plotted the linear probability model fit to show poorness of fit. I also drew 95% (exact) confidence intervals for the point estimates:

I did include a loglinear model fit mainly for my own amusement (not as bad a fit), but there is no reason to think "the dev team" would really use some kind of explicit loglinear relationship (much less some general logistic one) for anything in FFXI. So I lean toward piecewise linearity because that would be simple to implement, I think.

I must admit, however, that I'd like to see any trends beyond 270 elemental magic skill and below 230 elemental skill, but this thought only comes about because the blogger was perceptive enough to propose piecewise linearity. Is there even enough elemental skill equipment out there to test across a range of elemental skill broader than a 40-point range?

That said, we can proceed to examine the relationship between magic hit rate and INT, keeping in mind the perils of assuming "global" linearity.

Magic hit rate versus INT

Refer to the original post for specifics. (There is more commentary, but the table doesn't make any sense to me.) The author endeavored to examine the relationship between magic hit rate and INT for a wide range of ΔINT (his INT minus the target's INT). Did he suspect "global" nonlinearity with ΔINT to begin with, or did these results lead him to suspect a similar trend with elemental magic skill? You'll have to ask him.

Level 78 Earth Elementals appear to have 73 INT. He used some Water nuke with a Neptune's Staff with 262 elemental magic skill. The data is summarized as follows:

ΔINT	No resist	1/2 resist	1/4 resist	1/8 resist
-20	957 (.545)	439 (.250)	183 (.104)	176 (.100)
-15	1000 (.598)	373 (.223)	163 (.097)	137 (.082)
-10	637 (.653)	230 (.236)	73 (.075)	36 (.037)
-5	870 (.678)	270 (.210)	101 (.078)	42 (.033)
0	886 (.721)	242 (.197)	71 (.058)	30 (.024)
+10	1585 (.821)	287 (.149)	43 (.022)	15 (.008)
+20	884 (.854)	129 (.125)	19 (.018)	3 (.003)
+30	1387 (.909)	127 (.083)	9 (.006)	3 (.002)

I don't really feel like replicating graphs that the original author already created, so I will just show you the one he created plotting the data (with 68% confidence intervals) and his piecewise linear regression:

Perhaps the presence of the piecewise regression model fit influences your perception of the trend. Still, it seems that INT is somewhat less effective at high levels of ΔINT.

Thus, it is pretty obvious that assuming "global" linearity would yield a poor fit to the data, so the piecewise linear regression (whether the cutoff point is intuited or rigorously chosen) approach seems reasonable in order to estimate precisely the effect of a 1-point change in INT on magic hit rate (depending on the range of ΔINT).

And whether you use ordinary least-squares regression (which assumes a normally distributed response, which a marginally binomial proportion is not) or a maximum likelihood method (GLM), the point and interval estimates of the slopes are pretty close anyway. The following uses maximum likelihood estimation for ΔINT between -20 and 10:

           Criteria For Assessing Goodness Of Fit

Criterion                 DF           Value        Value/DF

Deviance                   4          1.6641          0.4160
Pearson Chi-Square         4          1.6626          0.4157

         Analysis Of Parameter Estimates

                     Standard  Wald 95% Confidence
Parameter  Estimate     Error        Limits

Intercept    0.7291    0.0051   0.7190      0.7391
dint         0.0090    0.0004   0.0082      0.0098

It seems that between ΔINT -20 and ΔINT 10, a 1-point increase in INT is expected to result in a 0.9% increase in magic hit rate. (Aside: I dislike expressing changes in proportions--hit rate is a proportion--as percentages because they are often interpreted as increases by a factor of (1+[percent]/100), which is not what I mean. So that is why I usually lean toward expression of rates in decimal form... not that it really helps understanding all that much.)
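For anyone who wants to check the fit without the original software, here is a minimal sketch. It assumes the output above came from a binomial model with an identity link (a linear probability model fit by maximum likelihood), which is my reading of the tables rather than something stated outright. The function and variable names are made up for illustration, and the fitting loop is plain iteratively reweighted least squares, not necessarily the routine that produced the output above.

```python
import math

# (dINT, unresisted casts, total casts), taken from the dINT table above
DATA = [(-20, 957, 1755), (-15, 1000, 1673), (-10, 637, 976), (-5, 870, 1283),
        (0, 886, 1229), (10, 1585, 1930), (20, 884, 1035), (30, 1387, 1526)]


def _weighted_line(xs, ps, ws):
    """Weighted least-squares line through (x, p); returns a, b, sum(w), xbar, sxx."""
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    pbar = sum(w * p for w, p in zip(ws, ps)) / sw
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    sxp = sum(w * (x - xbar) * (p - pbar) for w, x, p in zip(ws, xs, ps))
    b = sxp / sxx
    return pbar - b * xbar, b, sw, xbar, sxx


def fit_linear_probability(rows, iterations=50):
    """Fit hit rate = a + b * dINT (binomial response, identity link).

    With an identity link, each Fisher scoring step is a weighted
    least-squares fit of the observed proportions using weights
    n / (mu * (1 - mu)), where mu is the current fitted hit rate.
    Returns (a, b, se_a, se_b, pearson_chi_square).
    """
    xs = [float(x) for x, _, _ in rows]
    ns = [float(n) for _, _, n in rows]
    ps = [k / n for _, k, n in rows]
    mus = list(ps)  # start from the observed proportions
    for _ in range(iterations):
        ws = [n / (m * (1 - m)) for n, m in zip(ns, mus)]
        a, b, _, _, _ = _weighted_line(xs, ps, ws)
        # keep fitted probabilities strictly inside (0, 1)
        mus = [min(max(a + b * x, 1e-9), 1 - 1e-9) for x in xs]
    # standard errors from the information matrix at the final fit
    ws = [n / (m * (1 - m)) for n, m in zip(ns, mus)]
    _, _, sw, xbar, sxx = _weighted_line(xs, ps, ws)
    se_b = math.sqrt(1 / sxx)
    se_a = math.sqrt(1 / sw + xbar ** 2 / sxx)
    pearson = sum(n * (p - m) ** 2 / (m * (1 - m))
                  for n, p, m in zip(ns, ps, mus))
    return a, b, se_a, se_b, pearson


low_range = [row for row in DATA if row[0] <= 10]
a, b, se_a, se_b, chi2 = fit_linear_probability(low_range)
print(f"dINT -20..10: intercept {a:.4f} (SE {se_a:.4f}), slope {b:.4f} (SE {se_b:.4f})")
print(f"Pearson chi-square / DF: {chi2 / (len(low_range) - 2):.4f}")

# Forcing a single line through all eight rows, for comparison
_, _, _, _, global_chi2 = fit_linear_probability(DATA)
print(f"all rows, Pearson chi-square / DF: {global_chi2 / (len(DATA) - 2):.4f}")
```

If the identity-link assumption is right, the first two lines of output should land close to the tabulated values, and the last line gives a number to attach to the claim that "global" linearity fits poorly. Passing only the rows with ΔINT of 10 or more should correspond to the second fit further down.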

Is .009 (0.9%) significantly different (statistically) from .01 (1%)? The 95% confidence interval bounding the true rate of change in magic hit rate happens to exclude .01 (1%), so yes. But of course, it's possible that a Type I error has occurred.

Finally, considering the range of ΔINT between 10 and 30:

           Criteria For Assessing Goodness Of Fit

Criterion                 DF           Value        Value/DF

Deviance                   1          0.8059          0.8059
Pearson Chi-Square         1          0.8156          0.8156

         Analysis Of Parameter Estimates

                     Standard  Wald 95% Confidence
Parameter  Estimate     Error        Limits

Intercept    0.7741    0.0133   0.7482      0.8001
dint         0.0044    0.0006   0.0033      0.0056

It seems that between ΔINT 10 and ΔINT 30, a 1-point increase in INT is expected to result in a .0044 increase in magic hit rate. But we cannot distinguish between .0044 and .005 given the 95% confidence interval, which covers both.
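The two "can we tell it apart from a round number" checks are just Wald intervals, estimate plus or minus 1.96 standard errors. A small sketch using the rounded estimates and standard errors from the two tables (the helper names are mine):

```python
def wald_interval(estimate, std_error, z=1.96):
    """Two-sided Wald interval: estimate +/- z * standard error."""
    half_width = z * std_error
    return estimate - half_width, estimate + half_width


def covers(interval, value):
    """True if value lies inside the closed interval."""
    lower, upper = interval
    return lower <= value <= upper


low_slope = wald_interval(0.0090, 0.0004)   # dINT -20 to 10
high_slope = wald_interval(0.0044, 0.0006)  # dINT 10 to 30

print(covers(low_slope, 0.01))    # False: .01 is excluded
print(covers(high_slope, 0.005))  # True: .005 cannot be ruled out
```

Because the inputs are rounded to four decimals, the limits computed this way differ slightly in the last digit from the tabulated ones, but the conclusions are the same.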

Observations: I am interested in what happens to the effect of INT below 50% magic hit rate. This could be achieved by removing the Neptune's Staff and repeating the experiment. (Have fun collecting 11,000 observations!) But this data set (not mine) does not show any effect of INT+30 (is ΔINT after INT+30 still below 0 for an Elvaan mage versus an Ebony Pudding?) at 242 elemental magic skill. Is this because the "base" magic hit rate (whatever it is) is well below 50%?

Conclusion: Above 50% "base" magic hit rate (whatever it is), it appears that below ΔINT 10, 1 point of INT gives about a .01 increase (or .009 if you are a stickler for statistical significance) in magic hit rate for direct-magic damage, and above ΔINT 10, a .005 increase in magic hit rate.
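To make the piecewise model concrete, here is a sketch that turns the two fitted segments into a single predictor. It uses the fitted coefficients as reported (not the rounded .01 and .005), and it only describes this particular setup: 262 elemental magic skill, Neptune's Staff, level 78 Earth Elementals. The function name and the choice to assign ΔINT 10 to the lower segment are my own.

```python
def predicted_hit_rate(dint):
    """Piecewise linear estimate of magic hit rate for this data set only.

    Coefficients are the fitted values reported above; the breakpoint at
    dINT 10 is where the two regressions were split.
    """
    if not -20 <= dint <= 30:
        raise ValueError("no data outside dINT -20..30")
    if dint <= 10:
        return 0.7291 + 0.0090 * dint
    return 0.7741 + 0.0044 * dint


for dint in (-20, 0, 10, 20, 30):
    print(dint, round(predicted_hit_rate(dint), 4))
```

The two segments disagree by only about .001 at ΔINT 10, so the breakpoint looks reasonably placed even though it was not estimated formally.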

I hope this result can be generalized to any kind of mob and also to MND and CHR.

Magic hit rate versus magic accuracy

I don't see an in-depth examination of the effect of magic accuracy (from equipment). Early on, he seemed to have been trying to get a feel for things (see the original post). I just see the following data pertaining specifically to magic accuracy:

For level 75 Qiqirn Archaeologists (Aydeewa Subterrane), using Stone magic, 82 INT, and 230 elemental magic skill, and no elemental staff (1,365 observations):

Condition	No resist	1/2 resist	1/4 resist	1/8 resist
baseline	379 (.420)	194 (.215)	133 (.147)	196 (.217)
+10 m. acc	205 (.443)	124 (.268)	58 (.125)	76 (.164)

For level 75 Steelshells (The Boyahda Tree), using Stone magic, 82 INT, and 230 elemental magic skill, and no elemental staff (1,142 observations):

Condition	No resist	1/2 resist	1/4 resist	1/8 resist
baseline	580 (.744)	141 (.181)	49 (.063)	10 (.013)
+10 m. acc	303 (.837)	49 (.135)	8 (.022)	2 (.006)

Since Qiqirn are resistant to earth magic, there is a huge discrepancy in the magic hit rate of Stone between the two sets of trials.

For the Qiqirn trial, the magic accuracy effect is not statistically significant, but that may just be a consequence of "small" sample sizes (poor statistical power to detect an effect size so small):

                   Analysis Of Parameter Estimates

                     Standard  Wald 95% Confidence    Chi-
Parameter  Estimate     Error        Limits         Square  Pr > ChiSq

Intercept    0.4202    0.0164   0.3880      0.4524  653.65      <.0001
macc         0.0023    0.0028  -0.0033      0.0078    0.64      0.4254
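To put a rough number on "poor statistical power," here is a sketch of the approximate power of a two-sided Wald test with these sample sizes. It assumes, purely for illustration, that the true effect is .005 per point of magic accuracy (so .05 for +10), and it uses the normal approximation; the function names are mine.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def power_two_proportions(p1, n1, p2, n2, z_crit=1.96):
    """Approximate power of a two-sided Wald test of p2 - p1 = 0."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    shift = (p2 - p1) / se
    return normal_cdf(shift - z_crit) + normal_cdf(-shift - z_crit)


baseline = 379 / 902                  # observed Qiqirn hit rate, no m. acc
assumed_effect = 10 * 0.005           # hypothetical: .005 per point, +10 m. acc
power = power_two_proportions(baseline, 902, baseline + assumed_effect, 463)
print(f"approximate power: {power:.2f}")
```

Under that assumption the power comes out below one half, so failing to reach significance here says little about whether the effect exists.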

For the Steelshell trial, the magic accuracy effect is highly statistically significant, but the interval estimate is rather wide:

                    Analysis Of Parameter Estimates

                     Standard  Wald 95% Confidence     Chi-
Parameter  Estimate     Error        Limits          Square  Pr > ChiSq

Intercept    0.7436    0.0156   0.7129      0.7742  2262.00      <.0001
macc         0.0093    0.0025   0.0045      0.0142    14.05      0.0002

Still, in light of what we know about the relationship between magic hit rate and each of the factors that have been investigated well above 50% magic hit rate and well below 50% magic hit rate (elemental staff and elemental magic skill), it seems reasonable to infer that 1 point of magic accuracy is equivalent to about 0.5% magic hit rate below 50% "base" magic hit rate and about 1.0% magic hit rate, at best, above 50% "base" magic hit rate. (The confidence interval is duly noted, but common sense dictates that the 1-point magic accuracy bonus is 1% hit rate at best.)

What the heck is "base" magic hit rate, and what evidence supports such an idea?

I am speculating that "base" magic hit rate is the result of a calculation that compares your "magic accuracy" score before equipment and buffs (debuffs) to a mob's "magic evasion" score, which may include elemental resistance factors.

So far, the main purpose of making a distinction between a "base" magic hit rate and magic hit rate bonuses from equipment (and possibly buffs/debuffs) is that the bonuses from staves, elemental magic skill, and magic accuracy (and probably INT) are conditional on magic hit rate, based on the extensive data provided. And how do you go about determining the bonuses from equipment if the bonuses from equipment determine the "base" hit rate?

My initial thought was that if a "base" hit rate is below 50%, then any bonuses from equipment will be as I described previously, even if the actual hit rate ends up being above 50%. Again, speculation.

As far as evidence goes, here is a data set that contradicts what I just wrote. Yet another post from our highly esteemed Japanese blogger illustrates the magic hit rate bonus from using a staff that is the same element as that of the magic being used. The data are summarized as follows:

For level 75 Qiqirn Archaeologists (Aydeewa Subterrane), using Stone magic, 82 INT, and 230 elemental magic skill, with and without an elemental staff (1,307 observations):

Staff	No resist	1/2 resist	1/4 resist	1/8 resist
None	379 (.420)	194 (.215)	133 (.147)	196 (.217)
Terra's Staff	262 (.647)	81 (.200)	29 (.072)	33 (.081)

Note that the interval estimate of the magic hit rate without staff (not shown) does not cover .50, so I am 95% confident the real hit rate is below .50. Furthermore, the interval estimate of the magic hit rate with Terra's Staff (also not shown) does not cover .50 either, so I am 95% confident that the magic hit rate with Terra's Staff is well above .50.
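Since the intervals are not shown, here is a sketch of how they could be computed from the counts in the table. I am assuming exact (Clopper-Pearson) intervals; the bounds are found by bisection on the binomial CDF, and the function names are mine.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summing the pmf in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1)
                   - math.lgamma(n - i + 1) + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(total, 1.0)


def _solve_decreasing(f, target):
    """Find p in (0, 1) with f(p) = target, for f decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided confidence interval for a binomial proportion."""
    lower = 0.0 if k == 0 else _solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if k == n else _solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


for label, hits, casts in (("no staff", 379, 902), ("Terra's Staff", 262, 405)):
    lower, upper = clopper_pearson(hits, casts)
    print(f"{label}: {hits / casts:.3f} ({lower:.3f}, {upper:.3f})")
```

The first interval should sit entirely below .50 and the second entirely above it, which is all the argument needs.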

Previously it was shown that a 95% confidence interval for the HQ staff effect "well" below 50% hit rate was (.1359, .1665). Here, the point estimate for the staff bonus appears to be .227, but how precise is this estimate?

            Analysis Of Parameter Estimates

                           Standard  Wald 95% Confidence
Parameter        Estimate     Error        Limits

Intercept          0.4202    0.0164   0.3880      0.4524
staff      HQ      0.2267    0.0289   0.1701      0.2833
staff      None    0.0000    0.0000   0.0000      0.0000

What's going on? First, one set of data showed that for magic hit rates above 50%, an HQ staff seemed to confer (what is thought to be) a constant 30% magic hit rate bonus (estimated). Then, another set of data showed that for magic hit rates below 50%, an HQ staff seemed to confer (what is thought to be) a constant 15% magic hit rate bonus (estimated). But the above 95% CI covers neither .15 nor .30.
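With a single two-level factor, the staff estimate is simply the difference between two sample proportions, so the table can be checked by hand. A sketch using the counts above and the usual unpooled standard error (the function name is mine):

```python
import math


def difference_of_proportions(k1, n1, k2, n2, z=1.96):
    """Estimate p2 - p1 with an unpooled standard error and Wald interval."""
    p1, p2 = k1 / n1, k2 / n2
    diff = p2 - p1
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, se, (diff - z * se, diff + z * se)


# no staff: 379 of 902 unresisted; Terra's Staff: 262 of 405 unresisted
bonus, se, (lower, upper) = difference_of_proportions(379, 902, 262, 405)
print(f"staff bonus {bonus:.4f}, SE {se:.4f}, 95% CI ({lower:.4f}, {upper:.4f})")

for candidate in (0.15, 0.30):
    print(candidate, "covered" if lower <= candidate <= upper else "not covered")
```

This should agree with the tabulated estimate, standard error, and limits to about four decimals, and it reports both .15 and .30 as not covered.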

So this data seems to undermine the idea of the "base" hit rate check I speculated about, unless a transition below 50% magic hit rate to above 50% magic hit rate (and vice versa) is handled by the game in a way that is difficult to observe. (Well, I have gone delirious at this point, so let me revisit this later.)

Summary

The conclusions inferred from the data so far (see my last post as well for a summary) rest on a few ideas and concessions that really warrant further examination:

There are two distinct "regimes" of magic hit rate before any bonuses from equipment (and probably buffs/debuffs and food, etc.) that determine the magnitude of the accuracy bonuses from elemental magic skill, elemental staves, and magic accuracy (all from equipment).

One region is below 50% magic hit rate
The other region is above 50% magic hit rate

We are assuming piecewise linearity to model the existence of the above phenomenon. Otherwise, some nonlinear relation (e.g. logistic) will result in more complex interpretations.
I acknowledge that only direct-magic damage ("nukes") was investigated. "Further examination" here means that we should look at other types of magic (enfeebling) to see if the conclusions for direct-magic damage can be generalized.
I concede that weather/day effects may confound the results. But these effects do not proc 100% of the time for the magic damage calculation, so if they also apply to magic accuracy, they probably do not proc 100% of the time either (without obis). The effect may also be weak and hard to detect, if it even exists at all; if this is the case, it is not a serious confounding threat. (I don't see any data to corroborate this though. You can perform some regression diagnostics to check for omitted explanatory variables, too.)

However, even if the model described above is not exactly as SE designed magic accuracy/magic hit rate to work, it still is a model that seems to approximate well the "reality" of the situation (for nukes). It's not like I have a vested interest in promoting this view of magic hit rate bonuses. It is well within the realm of possibility that the data provide only a limited view of the whole situation.

That said, given the data so far, it appears (and I do emphasize that these are estimates) that:

If the initial and final magic hit rates are both below 50%, then

An HQ staff of the correct element gives a constant increase of 15% magic hit rate
An NQ staff of the correct element gives a constant increase of 10% magic hit rate
1 point of elemental magic skill gives a constant increase of 0.5% magic hit rate
1 point of magic accuracy gives a constant increase of 0.5% magic hit rate (caveat being the evidence is not that strong)

If the initial and final magic hit rates are both above 50%, then

An HQ staff of the correct element gives a constant increase of 30% magic hit rate
An NQ staff of the correct element gives a constant increase of 20% magic hit rate
1 point of elemental magic skill gives a constant increase of 1% magic hit rate
1 point of magic accuracy gives a constant increase of 1% magic hit rate (caveat being the evidence is not that strong)
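The two lists above amount to a lookup table, which can be sketched as follows. Everything here is an estimate from the data discussed, not a known game formula; the names are mine, and the function deliberately refuses to answer when the initial and final hit rates fall on opposite sides of 50%, since the estimates above do not cover that case.

```python
# Estimated bonuses by regime (hit rate as a proportion, not a percentage)
BONUS = {
    "below": {"hq_staff": 0.15, "nq_staff": 0.10, "skill": 0.005, "macc": 0.005},
    "above": {"hq_staff": 0.30, "nq_staff": 0.20, "skill": 0.01, "macc": 0.01},
}


def estimated_hit_rate(initial, staff=None, skill=0, macc=0):
    """Apply the estimated equipment bonuses to an initial magic hit rate.

    staff is None, "nq", or "hq". An initial rate of exactly 0.5 is lumped
    in with "above", which is an arbitrary choice. No cap is applied.
    """
    regime = "below" if initial < 0.5 else "above"
    table = BONUS[regime]
    bonus = table["skill"] * skill + table["macc"] * macc
    if staff is not None:
        bonus += table[staff + "_staff"]
    final = initial + bonus
    if (final < 0.5) != (initial < 0.5):
        raise ValueError("hit rate crosses 50%; the estimates do not cover this")
    return final


print(round(estimated_hit_rate(0.30, staff="nq"), 3))   # 0.4
print(round(estimated_hit_rate(0.60, skill=10), 3))     # 0.7
```

The Qiqirn staff data is exactly the kind of case that raises the error: it starts near .42 and ends near .65, and the observed bonus matched neither column.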

"Open question"

If the initial and final magic hit rates are both above 50%, then

Between ΔINT -20 and ΔINT 10, 1 point of INT gives a constant increase of 1% magic hit rate
Between ΔINT 10 and ΔINT 30, 1 point of INT gives a constant increase of 0.5% magic hit rate

There is no information for initial and final magic hit rates both below 50% magic hit rate

"Open question"

level correction/penalty