lda1

Liam :D made this awesome Ludum Dare Personal Voting Analyzer. You go to the play and rate page while logged in, copy everything in the table showing your voting scores for all the things you rated, and then paste it into a box on the voting analyzer page. The tool then takes the data and pops out a bunch of charts and graphs about your votes.

I am super happy that this tool was made. For several LDs, now, I’ve been wondering how good my internal milestones for each rating are, and although my rating is balanced in some areas, it turns out that it’s pretty heavily skewed in others. I’ve had the feeling my judging was skewed for a while now, and this tool has given me the data I need to know how badly skewed it is and in which areas so that I can adjust how I vote for next time. The reason I care is that I feel like 3 stars should represent the average and that if I’m scoring right, I should have a nice bell curve across all my votes.
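For anyone who wants to check the "3 stars is average, bell curve around it" idea by hand, here is a rough sketch of how it could be put into numbers. The votes in `sample` are made up for illustration and are not my actual ratings, and `summarize` is just a name I picked; none of this comes from Liam's tool.

```python
from collections import Counter
from statistics import mean, pstdev

def summarize(ratings):
    """Return per-star counts, the mean, and the skewness of 1-5 star ratings."""
    counts = Counter(ratings)
    # Make sure every star value shows up, even if nothing got that rating.
    histogram = {star: counts.get(star, 0) for star in range(1, 6)}
    mu = mean(ratings)
    sigma = pstdev(ratings)
    # Population skewness: the average cubed z-score. Zero means symmetric.
    if sigma == 0:
        skew = 0.0
    else:
        skew = sum(((r - mu) / sigma) ** 3 for r in ratings) / len(ratings)
    return histogram, mu, skew

# Hypothetical votes, NOT real data from my voting table.
sample = [1, 2, 2, 3, 3, 3, 3, 4, 4, 5]
histogram, mu, skew = summarize(sample)
print(histogram, mu, skew)
```

One caveat on wording: when I say a graph is "skewed to the right" in this post, I mean the votes pile up toward 4 and 5 stars. In the statistical sense that pile-up has its tail on the low side, so it would show up here as a negative skewness value and a mean above 3.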

The pie chart above shows how many of the things I rated were Jam, Compo, or Other games, where Other are games on which I commented without rating anything because the game was unplayable. As you can see, about half the games I rated were Jam and half were Compo. This is the only pie chart the tool shows, and aside from telling me that I have good sample sizes for both Compo and Jam games, I find it less useful than the line graphs.

lda2

I’m pretty happy with my distribution of Overall ratings. It’s not exactly a bell curve on the right hand side, but it definitely looks like a nice mountain. I give a game its Overall rating last, taking into account my ratings in all the other categories. The slight skew to the right seen here is influenced by the heavy right-hand skews in a couple of other categories.

lda3

Innovation is one of the categories where my skew is the worst. When rating, it’s very hard to get a 1 out of me on Innovation; the game basically has to be a straight-up clone of something. Since I want 3 stars to be average, I set 3 stars as a standard example of a game genre. To get a 5-star rating, a game has to be unlike anything I’ve seen before. 2 and 4 stars have thus been based on how different the game feels from standard. I think that to rectify this, I’ll have to set 2 stars as standard genre. This will give me more room to distinguish the games that do better than that.

One thing I noticed while doing my voting during this LD is that my Innovation and Theme ratings tend to rise and fall together, though if you look at the Theme graph below, you’ll see that it has much sharper ups and downs.

lda4

This one is slightly skewed to the left for Compo, but it’s a very pretty curve for Jam. I’m okay with this one. For Fun, 3 stars means the game was pretty fun and I’m glad I played it this one time in order to rate it. 1 star means I didn’t really enjoy the game at all, and 2 stars is somewhere in between: it wasn’t terrible, but I could have done without playing it. 4 stars means it was really fun to play this one time, and 5 stars means I am seriously considering keeping the game to play more later.

Given that rating system, it makes sense that Compo games would have a different skew. It’s harder to make a game by yourself with no outside assets in two days than it is to make a game on a team with outside assets allowed in three days. One could argue that I should go easier on Compo games, and in some of the other categories I do, but not this one. These two curves are visually dissimilar, but the number of games that got 3 stars or higher out of me in both divisions is about the same.
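The "3 stars or higher" comparison is easy to redo outside the tool. Below is a minimal sketch, assuming each division's ratings are already in a plain list. The `compo` and `jam` lists are invented for the example, and `share_at_or_above` is a hypothetical helper of mine, not part of the analyzer.

```python
def share_at_or_above(ratings, threshold=3):
    """Fraction of ratings that are at least `threshold` stars."""
    if not ratings:
        return 0.0
    return sum(1 for r in ratings if r >= threshold) / len(ratings)

# Hypothetical Fun votes per division, NOT my real numbers.
compo = [1, 2, 3, 3, 3, 4, 2, 3, 5, 3]
jam = [2, 2, 3, 3, 4, 4, 3, 1, 3, 4]

print("Compo:", share_at_or_above(compo))
print("Jam:  ", share_at_or_above(jam))
```

Using a fraction instead of a raw count keeps the comparison fair if one division has more rated games than the other.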

lda5

Holy strange skew, Batman.

Every time a theme is announced, there’s generally an obvious and easy to implement way of interpreting the theme. For An Unconventional Weapon, for example, you can just take any regular whacky/slicy or shooty game genre and just replace the weapon with something strange. I always set that obvious implementation as my 3-star rating, give anything that doesn’t seem to fit the theme at all 1 star, and anything that blows my mind 5 stars. The 2- and 4-star ratings, then, are just… tweens.

I think that to rectify this, I’ll need to set the obvious implementation as 2 stars. This will give me more room (as with Innovation) for variations on anything that does better than that. So to use An Unconventional Weapon as an example again: a straight up regular platformer where the weapon difference is strictly a difference in visuals (a pickle instead of a sword, but it still acts like a sword) and maybe sound would get 2 stars from me; a game that is mostly a regular member of its genre but the unconventional weapon’s behavior is slightly different would get a 3; and then games that went out of their way to be offbeat would get 4 or 5 depending on how far out they were.

I have yet to analyze previous LDs’ results, but I find myself wondering if this Innovation graph will be considerably different from those of previous LDs. About three games into rating, I came across a game which used a weapon that appeared to be normal but acted entirely un-normal. It was an idea my team and I hadn’t considered and a perfect fit for the word “unconventional”, so it sort of set the bar for what 5 stars in Theme was gonna mean to me this time around.

lda6

My graphics ratings are also skewed to the right. I’ve always set 3 as being good enough to convey all details necessary for gameplay, but as with innovation and theme, I think I’ll need to move that down to 2 stars to give myself more room for variety in doing better. Things that I rated as 2 stars in Graphics this time would then fall to 1 star.

lda7

I feel pretty much the same way about my voting for Audio as I do about voting for Graphics. The only 1 star rating I’ve given out here was for a game whose audio simply grated on my nerves, but I think having 1 star be insufficient audio (or nerve grating) and setting the barely acceptable bar at 2 stars will help here. I suspect my Jam curve will still have a dip at 2 in the future, but many Jam games have a dedicated audio person, so that makes sense.

lda8

I’m not sure what to do about this cowboy hat. I voted on humor for far fewer games than I rated in total. Humor is dependent on so many things that it’s hard to quantify, and it doesn’t help that a lot of things most people find funny just aren’t amusing to me. On the other hand, just doing 1 star or 5 depending on whether I found it funny or not wouldn’t work, either. So… yeah. Cowboy hat.

lda9

This one is skewed to the right, but not horribly so. Nonetheless, I think setting “all right” at 2 stars and taking advantage of more options to the right of that would benefit here, as well.

Conclusion

There are other graphs at the bottom of the tool, showing how strong or weak the correlations are between pairs of individual categories, but as long as my overall category graphs are so skewed, I don’t think they’re very useful to me. As such, I’m not going to cover them here.
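For reference, a correlation between two categories can be measured with a Pearson coefficient, which runs from -1 (one goes up when the other goes down) through 0 (no linear relationship) to 1 (they rise and fall together). I don't know exactly how the tool computes its graphs, so treat this as a generic sketch: `pearson` is my own hypothetical helper, and the two lists are made-up ratings for the same five imaginary games.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of ratings."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of co-deviation and squared deviation from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a list never varies")
    return cov / sqrt(var_x * var_y)

# Hypothetical per-game ratings, NOT real data.
innovation = [2, 3, 3, 4, 5]
theme = [1, 3, 2, 4, 5]
print(pearson(innovation, theme))
```

A skewed category squeezes most games into a narrow band of stars, which is part of why I don't put much weight on these numbers yet.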

It’s really nice to have these graphs, though, because they’ve helped me visualize how my mental image of what each star rating means translates into how I actually vote. I am not going to go back and rerate the 115+ games I’ve voted on this time, but I will be looking back at these graphs for next time.


2 Responses to “Looking at my rating habits with the LD Personal Voting Analyzer”

  1. Liam :D says:

    Thank you for using the tool and giving it a shout-out 😀

    The bottom 8×8 grid of graphs has a tendency to break for many people, so it’s sort of surprising it worked for you.

    I’ll expand the tool a little bit after the competition is over to include a report on how much your votes differed from the average each game received. You will, however, need to save your voting data ahead of time, as I believe it becomes inaccessible after the voting period ends. If you’d like to use that, simply do the same copy-paste process, except this time save it into a txt file instead of the textarea.
