I want to share something with you guys and get your feedback. Feel free to agree or disagree. But before I continue, I just want to say that I’ve participated in 8 Ludum Dares so far, and they’re what I look forward to. I love them. Ok, now I’ll give my little spiel.

After making our games within the 48/72-hour time limit, we get to check out and rate other people’s games. When rating somebody’s game, we can give it up to 5 stars in each of 8 categories: Innovation, Fun, Theme, Graphics, Audio, Humor, Mood, and most importantly, Overall. I’m going to talk more about the Overall category in a minute. Let me just talk about something else first:

As many of you may have noticed, probably for a while now, your results don’t always seem very honest. The results you get may be surprising, in a bad way or in a good way: you’re either pretty disappointed or really happy. This is because you didn’t get a ton of votes (roughly 20-70, which is what a lot of people end up getting). Think about it: there were about 2,800 other games out there. Do you think that 50 votes, from a field of 2,800 entries, will give you a really honest evaluation? You shouldn’t, because unfortunately, that’s not the case.

If 5 people rate your game and they all give you a 5 on Fun, the average would be a 5/5. If 10 people rate your game and 8 of them give you a 5 on Fun, and the other 2 give you a 4, the average would be a 4.8/5. Now the game with an average of 5/5 is ranked higher than the game with an average of 4.8/5. But the game with an average of 4.8/5 should be ranked higher, because it had similar scores and more people played it. Now I’m pretty sure that in the end the categories aren’t ranked based on just the number of stars given, but still, I just wanted to give you something to think about. The rankings aren’t all that honest. I noticed that many of the Top Ranked games only had about 30-70 votes. You’ll see that the games that got the most votes weren’t up there in the Top 100. They did, however, have more real, honest evaluations, while the others with 30-70 votes were just lucky enough to get a handful of good ratings, which gave them higher rankings. The Top 100 are great games, no doubt, but are they the best out of the 2,800? We don’t really know for sure.
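
To make the example above concrete, here’s a small Python sketch. The plain average is what I described; `damped_mean` is just one common way to take the number of votes into account (it pulls games with few votes toward an assumed site-wide average). It’s not how Ludum Dare actually ranks games, and `prior_mean` and `prior_weight` are made-up values purely for illustration.

```python
def mean(votes):
    """Plain average of a list of star ratings."""
    return sum(votes) / len(votes)


def damped_mean(votes, prior_mean=3.0, prior_weight=10):
    """Average that behaves as if every game started with `prior_weight`
    imaginary votes of `prior_mean` stars (both values are assumptions)."""
    return (prior_mean * prior_weight + sum(votes)) / (prior_weight + len(votes))


game_a = [5] * 5            # 5 people, all gave 5 on Fun
game_b = [5] * 8 + [4] * 2  # 10 people: eight 5s and two 4s

for name, votes in (("A", game_a), ("B", game_b)):
    print(name, len(votes), round(mean(votes), 2), round(damped_mean(votes), 2))
```

With these assumed values, the 10-vote game comes out at 3.9 and the 5-vote game at about 3.67, so the order flips compared to the plain averages of 4.8 and 5.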

Ok, now I’m going to talk about this Overall category. The site says that your game is ranked overall based on your Overall category ratings. Do you think that we should be allowed to rate the Overall category? The Overall category should be based on the other categories rounded up. The site should give us the Overall Rating, not us. Here’s just one reason why: I’ve seen many people give a game great ratings, like 4-5 stars on every category, and then they would give the Overall category a 3. Umm… What? Shouldn’t you base the Overall category on the other categories? If you gave all the other categories 5 stars, then why would you give the Overall category 4 stars? The Average Overall Rating would be a 5, so give it a 5. But since people don’t always do that, let the Robots do the math and give us the Overall Rating, not the Humans.
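
As a rough sketch of what letting the Robots do the math could look like: this assumes the site would simply take a plain average of the seven other categories. The function name and the plain-average formula are my own assumptions, not anything the site actually does.

```python
# The seven categories other than Overall, as listed in the post.
CATEGORIES = ("Innovation", "Fun", "Theme", "Graphics", "Audio", "Humor", "Mood")


def derived_overall(ratings):
    """Overall score computed as the plain average of the other seven
    categories (a hypothetical formula, for illustration only)."""
    return sum(ratings[c] for c in CATEGORIES) / len(CATEGORIES)


all_fives = {c: 5 for c in CATEGORIES}
mixed = {"Innovation": 5, "Fun": 4, "Theme": 5, "Graphics": 4,
         "Audio": 5, "Humor": 4, "Mood": 5}

print(derived_overall(all_fives))        # all 5s average to 5
print(round(derived_overall(mixed), 2))  # a mix of 4s and 5s lands in between
```

Under that assumption, a voter who hands out only 4s and 5s could never end up with an Overall of 3.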

I’m not entirely sure how Mike (Founder of Ludum Dare; Support him on Patreon!) can make the evaluations more honest, because you can’t just have 2,800 people play all 2,800 games; that’s just ridiculous. But it’s just something to think about.

Give me your thoughts in the comments please.

Thanks for listening!

  1. cynicalmonkey says:

    While I can understand your frustration with the overall score I think there are some factors here you are overlooking.

    1) It would create a whole new system to game
    With an automated score, the easiest way around it is not to compete. You would get more and more games that only have 1 or 2 categories listed, prompting posts just like this one, with people talking about why a game with 4 in each category should rank higher than one that got 5 in fun but was only entered in one category.

    2) You would need to remove humour altogether
    The only way round 1) is to make entry into all categories mandatory for the compo, in which case humour puts anyone not wanting to make a game that can score well in humour at an instant disadvantage.

    3) A game can be more than a sum of its parts
    Some things don’t take away from the overall experience despite not scoring highly. You could make a fantastic game that’s fun, has great sound and hits the theme, but isn’t funny and is played out with red and blue cubes. In the current system it could capture people’s attention enough to get 5 stars overall, but in an aggregated system it would only score 4.

    4) You are removing the human element
    These games are made by people, and everyone is different in how they score. Removing the overall element is like telling people that only their opinions on the other categories matter and the algorithm will take care of anything else. It doesn’t matter if some people score differently than others; what’s important is that they are consistent in how they score games across the board.

    5) There’s more to a game than those categories
    They are a guideline for a game, but not the only areas a game can be successful in: puzzling, emotionally resonant, poignant, challenging, thought-provoking, nostalgic, etc. These don’t fit neatly into one of the categories but can be a huge addition to a game’s overall appeal. They shouldn’t have their own categories, because who wants to score a game in 200 categories? But they are things a person can take into account when scoring overall, and a formula wouldn’t.

    6) It’s just a game jam, dude
    Lighten up

    • Tuism says:

      What you said was my first reaction to this. There are many instances where a game might score highly on technical aspects, but just not great overall, in my personal opinion. And that’s what these ratings are, personal opinions.

      Being able to game the overall score by switching all aspects of ratings off (I’ll make a graphical masterpiece that doesn’t actually do anything. And it’d be 5/5 every rating) is just silly.

      To get around that, now you gotta mandate “your game must have at least 5 categories turned on” or something. Which creates even more problems.

      So no… I think it’s fine as it is now.

  2. Os_Reboot says:

    I feel obligated to reply (because I’ll admit I’ve had similar thoughts cross my mind).

    The Overall category is an interesting feature. I’ve seen charts and graphs showing that users tend to bias Overall towards Graphics or Audio when handing out scores. _In theory_ you could tilt your game towards those categories and do well, however I feel Ludum Dare has taken itself more lightly than this. Gaming the system doesn’t really fit the casual feel that Ludum Dare seems to have accumulated (at least from my perspective). You could go on and on about ways to “break” the rankings, but I’m not sure how well that meshes with the Ludum Dare I’ve come to know.

    Anyways, if people feel things need to be corrected, then the website transition is probably the best time to do it. I trust Mike’s judgement on that; he’ll decide what’s best for the community (we love you Mike <3).

    (IMO) It's a tricky situation; the balance as it stands probably shouldn't be toyed with.

  3. Jezzamon says:

    I like the overall category, while I don’t want to do it, I’d prefer getting rid of every other category rather than getting rid of overall.

    I’d say the other categories exist to promote games that excelled in one element and might not otherwise get seen if the only category was overall.

    The overall category isn’t meant to be a culmination of the other categories. A really good game overall can have a very low humour, fun, or innovation category. Only people can make this judgement.

  4. caryoscelus says:

    On the first part (about ratings not being honest due to lack of votes): there’s a minimum of 20 votes to be shown in the results, and if we assume those 20+ votes came from random people (which might often not be true, though), the result should not be far from accurate. The problem isn’t the number of votes per se, but the way they are gained. E.g. someone makes a promotion post; people who liked the premise and screenshots play the game, enjoy it and rate it high, while those uninterested don’t play it, and thus the ranking gets higher than it should be.

    On the second part: overall is in fact the only category which could be applied accurately to any game. Replacing it with an autogenerated one would make results depend only on succeeding in as many categories as possible, not on succeeding in making a great game. And to get precise results, you’d need to add about a hundred more categories to rate (and weight them according to their importance, which will still be subjective) so that every entry is judged equally. And that is quite absurd, especially for a game jam.

  5. Jezzamon says:

    I think the system with votes works pretty well actually. The only way to do really well is to make a really good game. You don’t need everyone in the contest to rate your game, that’s not the point. The point of only allowing other entrants to vote is that it makes it more fair.

    20 votes is actually good enough to get an accurate rating for your game. If your game is bad, or even just average, it’s impossible to get a really high rating just by being lucky. You have to be consistently lucky with 20 people. There will be a bit of noise, in that it’s pretty random whether a game ends up 10th or 11th, but it doesn’t have a massive effect.

    I’m sorry if you feel like your game didn’t get a good enough rating. It’s a big improvement on some of your earlier games, but I think the rating is fair.

  6. LTyrosine says:

    About the first part on honest evaluation, a simple rule to improve rating balance 1000%: disable the choice of the order in which people rate games. Instead, the LD site would present one game (per player), and it would stay there until the player rates it or clicks on something like “I won’t rate this game”. Then LD would pick another game (based on coolness maybe, which is now how many games the creator actually rated) and so on. Of course all games could still be browsed, viewed and played; only the rating system would be fixed this way.

  7. xWarZonex says:

    I would like it if the site averaged your scores overall, but I feel like the overall category is where you rate the overall feel of the game, not the overall average.

  8. HolyBlackCat says:

    First of all, nice tags lol. 😀

    I understand your frustration with the overall score, but I can’t agree that it must be just an average of other categories.

    There are some categories that are not on the site, but a player keeps them in mind when he chooses how many stars to put on `overall`. For example, `UI usability`, `technical execution`, `lack of glitches`, `amount of effort`. It would make no sense to add them to rating system (we don’t need that many categories), but almost everyone takes them into account when deciding what to vote for `overall`.

    Here is an example for you. Imagine the following game:
    A random game by a random dude. It’s great in every category, so one would rate 5 stars for mood, graphics, audio, fun, humor, innovation and theme.
    But it also has fixed-size 400×400 window which is immovable, non-skippable long cut scenes, jokes about your mom in every dialog, requires latest drivers, latest MSVC runtime libraries, MSYS environment, latest .NET framework, OpenGL 4.5 and DirectX 11 and all these things need to be installed manually.
    The average of all categories would be 5, but no one would put five stars `overall` for such a game.

    The example is ‘a bit’ exaggerated, but I think you’ll understand it.

    • HolyBlackCat says:

      Also, another example:

      Imagine a decent game that has 3 on each category except `overall`.
      But it’s written in C, has configurable controls, adjustable resolution, works on Win/Mac/Linux/Android/iOS/NES and has proper build scripts for every one of these platforms. This one would deserve at least 4 stars `overall`.

    • HolyBlackCat says:

      Oops, typo.
      * …It wouldn’t make no sense to add them to rating system…

  9. Liam :D says:

    If you would like to put some data behind your claims, I encourage you to take a look at my post and the data scrape, provided here: http://ludumdare.com/compo/2016/01/10/how-many-people-actually-got-ratings/

    You can also use that to experiment and try to determine a good “overall” formula. The data is available for all Ludum Dares since LD15 in both HTML table format (Easily copyable to Excel, Calc or Sheets) and in JSON format (Easily usable with custom programs)

  10. PoV says:

    Ludum Dare has a lot of categories. When it comes to categories, we’ve learned over the years that the only categories that truly work are opinions. As an example, we and other big events like the IGF have tried having a “Technical” category, but we just couldn’t make it work. It’s hard enough for games to reach the minimum number of votes required to get a score. There’s no way we can find and vet enough qualified people to gauge how “Technical” thousands of entries are. And at the end of the day, a “Technical” game isn’t necessarily a better game. It’s just not a good metric.

    However, everyone has an opinion on how good a game is (i.e. the Overall category). A game may have simple audio versus one with a professionally produced soundtrack, but the complete experience even with those simple sounds could be something really special.

    I like how cynicalmonkey puts it:

    a game can be more than a sum of its parts

    There are some little things we can do to the scoring to give better averages, but for the most part, the results will look similar. A couple entries may change places, but top games will still be top games.

    I don’t think how we do it is bad. I think what will make the biggest improvement is getting more people involved in the voting. Making that process better, more rewarding.

  11. ChuiGum says:

    Oh wow, I didn’t expect this many replies already, good!

    You all have very valid points that I didn’t keep in mind (my bad). I understand what you guys are saying. Maybe the Overall Score can still be the other categories rounded up, but you exclude the categories you don’t want to be rated. E.g.: you have a game that’s awesome but not funny at all, so you exclude Humor.

    I only brought this conversation up so we all could discuss and maybe work things out, I wasn’t trying to complain or anything. It’s just hard to show my tone through text, with verbal communication it’s easier to see. But like I said, it’s just a nice subject to talk about.

    So maybe my whole Take Out the Overall Category is unreasonable. But I do feel that we should find a better way of ranking games, right? Again, I don’t know how we do this but something should change. It wouldn’t be a bug fix, it would just be a minor improvement.

    And yes it is “just a game jam,” but it’s the Largest Game Jam in the world and is only getting Bigger (Which is awesome by the way).

    And again, thanks for replying, I really appreciate it!

    • YinYin says:

      All we need is more voting. Or more precisely: more visibility (reward) for voting on games with few votes. The system is already geared like that, but it can be improved. Perhaps not just on that default voting page, but also by periodically tweeting out entries with high coolness but low votes, or featuring them right on the front page, etc. (I’m sure that’s all somewhere on Mike’s todo/wishlist).
      Right now it’s just too easy to float on already existing popularity or pushing the game to as many streamers/youtubers as possible without ever voting yourself. Games already thriving on such momentum shouldn’t be so easily found via this page by sorting for most reviewed. (though this does seem to be a very important thing to learn imo – how to use social media effectively to spread your game without feeling bad about it :p)

      And as already mentioned often enough, games can be far more or less than the sum of their parts (each category may be very well done, but they just don’t match well enough; or low quality assets can still be brought together as an overall excellent game).

      • ChuiGum says:

        Yeah I agree that Games can be far more or less than the sum of their parts.

        And yes, we just need more people to see more games that are out there. Your idea of Tweeting Entries sounds like a good idea too!

      • sathorn says:

        I agree that we need more voting, but not by making games with few votes more visible (which the current system already does). Instead, playing and rating games should be more pleasurable and rewarding, to motivate people to vote more.

        At the beginning of the voting period you get recommended all kinds of games but after some time has passed and some votes have been cast the system heavily favors games with low votes over games with high coolness. If we assume that games with low coolness are generally worse because the developer doesn’t even bother playing other games we get the following problems:

        1. It recommends you more shitty games which are less fun to play and reduces motivation to play more games.
        2. Once you have some votes, your reward for voting (getting votes and comments yourself) lessens, and you need to vote on multiple games to get a single vote yourself. Additionally, when your game gets into the featured list, it takes some time for people to play it and vote on it to push it out again, so you always get your votes in little chunks. This can result in you getting a few votes and then having to rate like 10 games to get featured (rewarded) again. It would be a lot better if it were: rate 1 or 2 games, then get 1 rating.

        Possible solutions:

        1. Be more brutal to games with low coolness. If they don’t want to vote it’s not our problem to get them to 20 votes.
        2. Make higher rated games more visible in a reasonable way. Currently only the very best games get featured outside of the system, but pretty-good-but-not-the-best games are not. I think these games deserve to be played more.
        3. Make it an additional requirement to vote for 20 games to get a rating.
        4. Increase that number to 25 or something.

        • YinYin says:

          Actually I gotta clarify a bit: we don’t really need more quantity of votes, but rather better distribution. We already have enough.

          Basically all it takes is showing a reviewer how many votes a game already has and whether that’s already enough for a score, and reducing the impact on visibility/coolness when voting on games that have already passed the required threshold. If someone actively seeks out the 50 most played games, filters them down to the ~25 interesting-looking ones and reviews only those, that’s not helping anyone, but it currently raises coolness far enough to guarantee getting enough back.
          Seeking out the good entries and playing only those is easy enough already imo and kind of leisure time :p

          And you are right, the reward bit is pretty important. I’d say something like
          “you’ve gained X coolness and will appear more often on the front page!”
          “voting on games that have already passed their threshold won’t increase your coolness”
          “congratulations, your game has received enough votes to get a score!”
          etc. would do (similar to the little messages we already have, except with more detail before/after rating a game – to push members towards rating games that need it).

          Assuming low coolness generally equals bad games may hold true, though certainly not for the “doesn’t even bother” reason. There are a ton of top scoring games that have 0 or below recommended coolness.
          And then there’s also people who went on vacation during most of the voting period and simply couldn’t get enough together in the remaining time. I don’t think putting a requirement on that is a good idea.

