Statistical significance of hero win rates in General Discussion
Dire Wolf

    At what point do you have enough games on a hero to say you have a good or a bad win rate with that hero? As an example, I'm 9-10 on Luna, or 47%. 47% as an overall win rate isn't very impressive, but I could go on a 2-game win streak and be at 52%. It wouldn't be statistically strange for this to happen, as 19 games is a tiny sample. Same thing for my Viper, on which I'm 10-8 for 55%. I could easily lose a couple and drop. I just don't really see my Viper as being that much worse than my Luna play in these small examples.

    Even drastically low win rates can change extremely fast given how small some samples are. Like my shadow shaman. I never thought I was particularly bad at him or support in general yet I started my career 1-6 on him for an abysmal 14% win rate. Now I've won my last two games with him and it's more than doubled to 33%. If I simply go 2-1 over my next 3 games he'll start sniffing 50% territory.

    Anyway, this mainly comes from topics where posters are asking how to win more in solo queue, and a lot have been suggesting you play your strongest heroes and referencing win rates with those heroes. But I'd argue that in such small samples those win rates don't necessarily tell you enough. I don't think there's any statistically significant difference between 40% and 60% under 20 games. That's a few games won or lost either way, and there could be a ton of factors why. Over hundreds of games this stuff becomes more apparent, but for a lot of players like me who play many heroes there just doesn't seem to be enough info to draw conclusions like this. I think looking at farm and KDA is much more valid as a gauge.
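
One way to put numbers on the small-sample point above is a confidence interval around each win rate. The sketch below uses the Wilson score interval at roughly 95% (z = 1.96) and assumes games are independent with a fixed win probability, which real matchmaking only approximates. The function name `wilson_interval` is made up for this illustration; the records are the ones quoted in the post.

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Approximate 95% Wilson score interval for a win rate (z = 1.96)."""
    p = wins / games
    denom = 1 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return center - half, center + half

# Records quoted in the post: Luna 9-10, Viper 10-8, Shadow Shaman 3-6.
for hero, wins, games in [("Luna", 9, 19), ("Viper", 10, 18), ("Shadow Shaman", 3, 9)]:
    low, high = wilson_interval(wins, games)
    print(f"{hero}: {wins}/{games} -> {low:.0%} to {high:.0%}")
```

Under these assumptions the Luna and Viper intervals are each roughly 40 percentage points wide, overlap heavily, and both contain 50%, which is consistent with the claim that the two records are not distinguishable yet.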

    Vandal

      I'm a firm believer that Fisher statistics are almost always irrelevant. They seek to place an interpretation of credibility on data without knowing anything about it - the task is impossible. When you learn about "confidence intervals" and "p-values" (tools of "significance"), you're learning about how to make shit up that makes no sense. In my scientific work, I am careful to avoid ever using either one of these.

      Here's the point: Find me one textbook that derives statistical significance, so I can actually observe the assumptions for myself and logic my way to the conclusion with the author. There aren't any since there are too many assumptions and no logic. Authors of statistics tend to just give you the definitions and speak loosely about how their asshole feels tight right now.

      whoji

        I think you can get a contingency table and apply G-test or Fisher's Exact Test to get a p-value. Hypergeometric test might do also.

        say you have 20 games with rubick with 15 wins and 5 losses. assuming 50/50 win/loss, you'd expect 10 wins and 10 losses.
        so the table (wins, losses) will be
        expected: 10 10
        observed: 15 5

        apply a FET and the p-value will be 0.071

        try this
        http://www.physics.csbsju.edu/stats/fisher.form.html

        -------------------
        also, probably you dont want to do any statistics when the sample size is small, like less than 20. just saying.
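
For comparison, here is a minimal sketch of a different test on the same 15-5 record: an exact binomial test against a fixed 50% baseline, using only the standard library. It is not the contingency-table approach described in the comment (it treats 50% as a known value rather than as a second sample of 20 games), so it is not expected to reproduce the 0.071 figure. The helper name `upper_tail` is hypothetical.

```python
from math import comb

def upper_tail(wins, games, p0=0.5):
    """P(X >= wins) when X ~ Binomial(games, p0)."""
    return sum(comb(games, k) * p0**k * (1 - p0)**(games - k)
               for k in range(wins, games + 1))

# Chance of 15 or more wins in 20 games if every game were a coin flip.
one_sided = upper_tail(15, 20)
# Doubling gives the two-sided value here because p0 = 0.5 is symmetric.
two_sided = min(1.0, 2 * one_sided)
print(f"one-sided p = {one_sided:.4f}, two-sided p = {two_sided:.4f}")
# one-sided is 21700 / 2**20, about 0.021; two-sided is about 0.041
```

The assumption doing the work is that each game is an independent 50/50 trial, which is exactly the kind of assumption other posters in the thread are questioning.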

        Vandal

          Don't listen to whoji. He's just using a bunch of esoteric terminology devised to disguise the fact that the Fisher statistics he is promoting are founded on unworkable assumptions about which he probably doesn't even know.

          Dipshit

            10,000 games.

            whoji

              ^^ "esoteric terminology"? lmao. that's college freshman level stuff. maybe even high school level nowadays in russia, india, china, korea

              Vandal

                "Yeah, what you want to do is use the delta-Gormogov table to find the right q-state. Afterward, you will run the he-she-wee-wee test to figure out nothing with confidence 90%."

                Vandal

                  Whoji, perhaps you seem to be confused about the complaints I place against Fisher statistics. I know the words, I know the tests, and I know how to apply them. But they are a gilded exterior to a worthless set of results.

                  whoji

                    lol. vandal you are cute. delta-Gormogov table ftw

                    Alyosha

                      Every time I play OD my team does terrible
                      Ive watched pros play him, I just get unlucky

                      Dire Wolf

                        Ok so you all agree looking at individual win rates on your heroes is a pretty bad way to judge your performance on said hero?

                        one and half gun

                          are you guys trying to summon relentless?

                          let me help

                          3000 mmr
                          3000 mmr
                          3000 mmr

                          Vandal

                            @there is no try

                            Yes. Here is a simple reason for it: You may play a hero very well, but DotA is more than how well you play a hero given a setting - it's in part about creating the setting in which you play. In other words, it's about choosing your heroes intelligently. You may be one of the most godly spectres in the game, but if you choose that every single game regardless of your team's picks and the opponent's picks, you will lose a whole lot.

                            About the only thing I am willing to say is an OK conjecture is that win rate with lots of games (done mostly in solo queue) demonstrates:
                            1.) You pick those heroes at the right time (which may not be a function of "skill" - Pretend you have a high win rate on Doom. Maybe you only choose Doom when you want to hardcore counter a specific setup you see developing on the other team. In other words, your Doom play is about as good as all your other play, but your metaknowledge about doom is superior)
                            2.) You are at a higher skill with that hero than your "recent" skill (e.g. like your last 200 games)

                            It's really usually a combo of these two things. In fact, 1 kind of leaks over into 2 when you think about it.

                            whoji

                              @there is no try

                              No, not if you have enough matches and normalize by the hero's average winrate.

                              for example your top hero, lifestealer: winrate is 73.91% over 46 games, that is 34W-12L
                              the average winrate for lifestealer is 46.64%, so your expected W-L in 46 games would be 21W-25L
                              so the p-value is 0.004

                              Nice. Dotabuff should totally implement this.
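
The normalization idea above can be sketched by testing the record against the hero's average win rate instead of against 50%. This is one possible version, assuming an exact one-sided binomial test and independent games; it is not necessarily the calculation behind the 0.004 in the comment, and `upper_tail` is a hypothetical helper name. The 46 games, 34 wins, and 46.64% average are taken from the comment.

```python
from math import comb

def upper_tail(wins, games, p0):
    """P(X >= wins) when X ~ Binomial(games, p0)."""
    return sum(comb(games, k) * p0**k * (1 - p0)**(games - k)
               for k in range(wins, games + 1))

games, wins = 46, 34      # Lifestealer record from the comment
hero_avg = 0.4664         # average Lifestealer win rate quoted in the comment

expected_wins = games * hero_avg          # baseline number of wins for an average player
p_value = upper_tail(wins, games, hero_avg)

print(f"expected wins at the hero average: {expected_wins:.1f}")
print(f"P(at least {wins} wins in {games} games) = {p_value:.6f}")
```

A small value here only says the record is hard to explain with the hero's average win rate alone. It does not separate hero skill from picking the hero at the right time, which is the distinction Vandal raises.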

                              Relentless

                                Statistics must be understood in their correct context. No elaborate mathematical understanding is required.

                                40% winrate at 10 games is not much different from 60% winrate at 10 games. Everyone can see that.

                                However 40% winrate at 100 games is a sign that you play this hero below the skill level of your average hero, while 60% winrate at 100 games clearly indicates that you play this hero above the skill level of your average hero.
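
The 10-game versus 100-game contrast can be made concrete by asking how many standard errors each win rate sits from a baseline. The sketch below assumes a 50% baseline and independent games; the comment's actual baseline is "your average hero", so the 50% figure and the function name `z_score` are illustrative assumptions.

```python
import math

def z_score(win_rate, games, baseline=0.5):
    """Number of standard errors between an observed win rate and a baseline."""
    standard_error = math.sqrt(baseline * (1 - baseline) / games)
    return (win_rate - baseline) / standard_error

for games in (10, 100):
    for rate in (0.40, 0.60):
        print(f"{rate:.0%} over {games} games: z = {z_score(rate, games):+.2f}")
```

At 10 games both rates sit well under one standard error from 50%, while at 100 games they are about two standard errors away, which is around the usual threshold for calling a difference noticeable.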

                                If you keep playing your best hero exclusively it will eventually dip closer to 50% winrate because your MMR will increase. If you random a hero every game and have tons of games on all heroes, then you would have a good idea of your relative skill on each hero. But almost no one does that, my account is perhaps close... about 35% of my games I got a random hero. So all heroes have less than 4% of my games and only 12 heroes have over 2%.

                                Because my games are very well distributed across heroes, the winrates give a fairly accurate idea of my relative skill with each hero. But to compare between people you really need to see the difficulty of each game for each hero. Winning with a hero at 2k MMR is totally different than at 3k, 4k, 5k, 6k ... so if you can't judge the difficulty of the games played, you can't compare very well between players.

                                Alation

                                  DAT ANTIMAGE SOLO QUEUE WINRATE. 3000MMR EZ

                                  yiran

                                    Well, my Windranger winrate is pretty bad, but I actually play pretty well with her after losing so much. Same with Puck (my winrate on Puck is awful on my old account, with like 1.4 KDA or something). So I don't think it is a very accurate measure of your current skill (because most of those lost Windranger games were me not being a good (ah screw it) Windrunner and not shackling properly and not building mek).

                                    Blegh.

                                    Relentless

                                      Also some heroes function wonderfully when in a 5 stack, but totally suck while solo queuing... Crystal Maiden for instance. CM really can't make bad players on your team get a lot out of your aura. So it's pretty much wasted.

                                      KotL is also dramatically more powerful if players have the skill to do something with the extra mana provided to them. If they use it to sit around at full mana waiting for someone else to make a play... then KotL becomes food and may as well go farm a radiance in the jungle.

                                      Tree is extremely powerful IF your team does not feed. If they position well, living armor can make you win every lane and let your gankers dive towers for kills. But if your team sucks, living armor is not enough to keep them from feeding. If tree casts guise on heroes that are too dumb to use it and instantly walk away from the trees, breaking the invis... then the spell is worthless. If tree can't use leech seed as a free mek to push a lane of creeps with the team... that spell is also pretty worthless. And Tree's ult does zero dmg... if the team does not make plays during that tiny 3 to 4.5 sec window, tree may as well not have an ult. He is extremely dependent on teamwork.

                                      The point is that some heroes play much better in stacks than not in stacks. So you would have to consider what type of game situations they were played in to get relevant statistics. It's very easy in many mathematical models to ignore reality... it is rather common for statistical models to ignore reality. In pure math there are no assumptions; the rules are whatever you define them to be. But when you try to describe a real situation, it is important to remember that math is generally only poorly applicable. Even basic mathematical concepts like addition, subtraction, and counting do not always correctly describe how real systems work. You have to check what assumptions are made first, or the math is just nonsense.

                                      One of the common errors in statistical analysis is to assume that the larger a sample becomes, the closer the results will be to the typical assumptions of probabilistic models. But this is not true. In real systems large samples often change the character of the sample... behavior can be radically different. While rolling a die 1000 times may get you closer to expectations than rolling it 100 times.... rolling a die 1,000,000 times in reality will not produce closer to the "expected" odds for each side. In fact the opposite will happen: after many rolls the die wears, and its behavior diverges father and father from the "expected" behavior. That fact of real dice behavior is well known and is why in Las Vegas you always get fresh dice, not old ones.

                                      In dota playing 100 games on a hero is completely different than playing 10 games, because obviously you will be much better on game 50 than you were on game 1. No mathematical understanding is required to see that the average winrate for 100 games on a hero is unlikely to tell you how good you are at that hero right now. You may have totally failed the first 10 games distorting the average. Maybe, but maybe not... You have to look into the details to find out.
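
The learning-curve point above can be illustrated by comparing a career win rate with a recent-window win rate. The match history below is made up purely for illustration (2 wins in the first 10 games, then a steady 3 wins in every 5), and the 20-game window is an arbitrary choice, not something from the thread.

```python
def win_rate(results):
    """Fraction of wins in a list of booleans (True = win)."""
    return sum(results) / len(results)

# Hypothetical history, oldest game first: a rough first 10 games,
# followed by 90 games played at a steady 60% clip.
history = [True] * 2 + [False] * 8 + ([True] * 3 + [False] * 2) * 18

overall = win_rate(history)        # all 100 games
first_ten = win_rate(history[:10]) # the early games that drag the average down
recent = win_rate(history[-20:])   # last 20 games only

print(f"overall: {overall:.0%}, first 10: {first_ten:.0%}, last 20: {recent:.0%}")
```

The trade-off is the one already raised in the thread: a shorter window tracks current skill better but is a smaller, noisier sample.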

                                      All that said I still think the best way to see if someone is good at a hero is to watch a replay of them playing that hero in a losing game. See what they can do in a difficult situation.

                                      Alation

                                        Relentless^ why did you forget the r's in 'father and father'? Nice explanation though

                                        Relentless

                                          I'm just bad at typing. Also my keyboard sticks from eating to much while using the PC and failing to clean it.

                                          Alation

                                            ^ Well it doesnt take away from the meaning of text, that's for sure.