Building for Bad Matchmaking

    If you have ever played a competitive online game, you have probably found yourself in matches where one side is clearly much better than the other. What was matchmaking thinking!?

    The matchmaker is the obvious place to assign blame and look for a solution. But while a well-tuned matchmaking algorithm certainly does make a difference, in reality even the best system is still constrained by many practical limitations. Other considerations such as ping, small player pools, and queue times frequently limit a system’s ability to be picky about the matches it puts together. Games with multiplayer teams and/or post-matchmaking factors (such as a draft phase) can further complicate the matchmaker’s job. For a more detailed look at matchmaking, see this excellent post by an Awesomenauts dev.

    The truth is that bad match-ups are going to happen, but they can still be fun games. A player’s perception of skill disparity is heavily filtered by a game’s mechanical structure. Certain game mechanics and dynamics will accentuate or mitigate this sense of disparity, and thus each game will have its own “tolerance range” within which match-ups remain fun.  My goal in this post is to examine the root frustrations of uneven matches and propose some techniques for avoiding them.

    To begin, we need to examine what makes a bad matchup unfun. The easy explanation is that players dislike losing, but that’s a little too simple. The degree and nature of a loss are really what make the difference. It’s not so bad to lose a closely fought match where the moment-to-moment gameplay was fun and you felt like you and your opponent(s) were close to evenly matched. What sucks is when your opposition is noticeably more skilled, to the point where basic gameplay is no longer fun and you don’t feel like you have any hope of winning the match.

    Let’s get into the details of how this works. First, skill disparities can stress or invalidate core game mechanics and prevent players from engaging with the game as intended. Take basketball for example. The height difference between two opposing players can only be so large before it completely removes some of the game’s foundational tensions. Past a certain height disparity, the taller player will be able to pass and shoot with impunity, and the game will probably cease to be fun or interesting for either side.

    The ideal game rewards players for practicing and mastering its mechanics, but also minimizes or caps the advantages derived from superior skill. A game’s tolerance limit is essentially set by the maximum skill disparity achievable before the stronger side outright crushes the underdog. Strong players should win by eking out minor edges here and there, rather than simply shutting down weaker players.

    It’s also a good idea to have one or more mechanisms that push gameplay towards a competitive equilibrium. When one side outplays the other, the winning side should further its progress towards victory (e.g. score points, damage the enemy core, etc...) while the losing side should be given some type of advantage to help bring them into competitive parity with the winning side. This way, the core gameplay loop will reach a balanced state that is fun and challenging for both sides, but the side that is playing better will still hold the lead in the overarching race to final victory.

    Correspondingly, try to avoid mechanics that further strengthen the winning side, since these will push match balance away from competitive parity and towards gameplay breakdown. It should be obvious why snowball dynamics are problematic, but it’s worth mentioning because this kind of positive feedback loop shows up in many successful competitive games (e.g. League of Legends, Starcraft, etc...). I’d argue that this has largely been to the detriment of those games at most tiers of play. I see these mechanisms as vestiges of their genres’ less competitive roots, and I think it behooves designers to think critically before replicating them in future generations of competitive games.

    As I’m writing this post, I’ve been tempted to go into several detailed asides about specific implementations. I think I’ll save those for one or more future posts, so to illustrate my propositions succinctly I’ll just stick with another basketball example.

    First, imagine a scenario where teams get taller and faster based on their point lead. The team that gains the lead at the match’s outset is probably the better team to begin with, and their advantage will continue to be compounded as their lead increases and the game becomes less and less competitive.

    What we want is the opposite: the trailing team is strengthened based on how far behind they are. The team that is inherently better will pull ahead, but eventually the buffs to the losing team will be sufficient to put the two teams on even footing, causing gameplay to become tense and challenging for both teams. The losing team will never fall too far behind on score, but in order to overtake the other team they will still need to legitimately outplay them.

    Before I conclude this post, I want to be clear that equalizing mechanics have their own costs and pitfalls. Here are some things to keep in mind:

  • Allow players to earn a meaningful lead. Handicaps should kick in when one side begins to fall too far behind. If catch-up mechanics scale up too quickly, it renders the early and mid-game irrelevant.
  • Make sure that handicaps are reduced promptly as the trailing side catches up. If handicaps persist, it can produce unfair and counter-intuitive situations where players are encouraged to intentionally fall behind in order to gain an advantage in the late game.
  • Keep in mind that your game isn’t required to support matches between players with extremely disparate skill levels. It just needs to cover the range of match-ups commonly produced by the matchmaker. Tune the system for the standard deviation in skill level.
  • If a match is simply too imbalanced to be fun, by all means end it early. Figure out what the breaking point is, then put mechanisms in place to end the game quickly when that limit is surpassed.
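
    The first two points above can be sketched as a handicap curve with a dead zone and a cap. This is only an illustration: the function name, the thresholds, and the linear ramp are all assumptions, not values taken from any particular game.

```python
def handicap(score_gap, dead_zone=10.0, full_gap=30.0, max_bonus=0.25):
    """Return the buff (as a fraction) granted to the trailing side.

    score_gap -- leader's score minus the trailing side's score (hypothetical units)
    dead_zone -- leads up to this size earn no handicap, so a lead stays meaningful
    full_gap  -- lead at which the handicap reaches its cap
    max_bonus -- upper bound on the buff
    """
    if score_gap <= dead_zone:
        return 0.0
    # Linear ramp from 0 at the edge of the dead zone to 1 at full_gap.
    ramp = (score_gap - dead_zone) / (full_gap - dead_zone)
    return max_bonus * min(ramp, 1.0)


for gap in (5, 10, 20, 30, 50):
    print(gap, handicap(gap))
```

    Because the buff is recomputed from the current gap rather than accumulated, it shrinks automatically as the trailing side catches up, which avoids rewarding players for falling behind on purpose.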

    And that pretty much sums up my thoughts on the matter. I really enjoy taking an evolutionary perspective on game analysis; peeling back the layers of contemporary games and genres to try and separate the load-bearing elements from those that are simply vestiges of their forerunners. Competitive gaming is going through a period of growth and transformation at the moment, and it will be very interesting to watch adaptations emerge over the next few years.

Pen-And-Paper Dabblings

I was cleaning up the files on my PC the other day, and I ran across my old folder of tabletop RPG dabblings. I pretty much learned to read off of a set of 3rd edition D&D manuals, and I’ve been tinkering with my own pen-and-paper systems since middle school. These efforts later culminated in college as a winter study project.

    The goal of the project was to craft a system that was better adapted to a market dominated by video games. Digital games do most of the rules management and bookkeeping for players, so newer generations of gamers are less tolerant of the rules overhead demanded by traditional pen-and-paper games. Video games have pen-and-paper beat in the action, PvP, and simulation departments. Pen-and-paper games should play to their remaining strengths by emphasizing aspects like communal storytelling and creative flexibility.

    The emphasis of my system is still in the traditional adventure/combat vein, though its core is flexible enough to handle plenty of other genres. I won’t go into too much detail on the system, but it did have a few features that are worth mentioning.

 

Roll Under

 

    I wanted a simple conflict resolution mechanic that minimized the number of variables and modifiers involved. To succeed on a skill test, players need to roll a result that is equal to or lower than their character’s rating in that skill. NPCs can also oppose players with their own skill rolls. If both sides succeed, the side with the higher die result breaks the tie, but the loser’s result might still temper their success.
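
    As a rough sketch, an opposed roll-under test could be resolved like this. The die size, the function names, and the handling of exact ties and double failures are assumptions made for the example; the description above does not pin them down.

```python
import random


def succeeds(roll, rating):
    """A roll-under test passes when the die result is at or below the rating."""
    return roll <= rating


def opposed(player_roll, player_rating, npc_roll, npc_rating):
    """Resolve an opposed test from two die results that were already rolled."""
    p_ok = succeeds(player_roll, player_rating)
    n_ok = succeeds(npc_roll, npc_rating)
    if p_ok and not n_ok:
        return "player"
    if n_ok and not p_ok:
        return "npc"
    if not p_ok and not n_ok:
        return "nobody"  # assumption: a double failure has no winner
    if player_roll == npc_roll:
        return "tie"     # assumption: equal results stay tied
    # Both succeeded: the higher die result wins.
    return "player" if player_roll > npc_roll else "npc"


def roll_die(sides=20, rng=random):
    """Roll one die; a d20 is assumed here purely for illustration."""
    return rng.randint(1, sides)


print(opposed(roll_die(), 12, roll_die(), 10))
```

    Note that a high rating helps twice: it makes success more likely, and it allows higher successful rolls, which win ties.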

    One thing that I really like about this system is that the mechanical feel of battles shifts as the skill level of the combatants increases. It’s hard to explain briefly, but basically battles become harder fought the higher the skill level, with longer periods of parity as both sides succeed more consistently and significant swings of fortune each time one side fails a roll. This setup requires a narrower range of skill ratings, so it functions best in a system where characters’ stats scale up only slightly over the course of the campaign.

 

Default Skill Level

 

    Characters only have individual ratings for skills in which they specialize (and maybe a few skills they are really bad at). For all other rolls they use a default skill threshold that affords about a 50% chance of success. This captures the important details of a character and relieves players from the task of managing a large table of skills, most of which are not important to their character.

 

Zone System

 

    Instead of a rigid grid system, I wanted a more flexible and intuitive way of organizing characters in space. In this system each room or distinct area is a zone. On their turn, characters can move to reposition themselves within their current zone (“I move out of melee reach of the ogre and hide behind the stack of crates in the corner”), or transition to an adjacent zone through a choke point like a doorway or ladder. Rather than worry about precise reach of weapons and abilities, ranges can be described qualitatively: melee for physically adjacent targets, short range for targets in the same zone or adjacent zones, and long range for targets in distant zones.
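
    One way to model this is as a small graph of zones, with range bands derived from the number of zone transitions between attacker and target. The zone names and the `adjacent_in_melee` flag below are hypothetical; positioning inside a zone is left to the table and only passed in as a yes/no answer.

```python
from collections import deque

# Hypothetical map: each zone lists the zones reachable through a choke point.
ZONES = {
    "hall": ["armory", "stairs"],
    "armory": ["hall"],
    "stairs": ["hall", "tower"],
    "tower": ["stairs"],
}


def zone_distance(start, goal, zones=ZONES):
    """Number of zone transitions between two zones, via breadth-first search."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        zone, dist = queue.popleft()
        if zone == goal:
            return dist
        for nxt in zones[zone]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # the two zones are not connected


def range_band(attacker_zone, target_zone, adjacent_in_melee=False):
    """Classify a target as melee, short, or long range."""
    dist = zone_distance(attacker_zone, target_zone)
    if dist is None:
        return "out of reach"
    if dist == 0 and adjacent_in_melee:
        return "melee"
    if dist <= 1:
        return "short"
    return "long"


print(range_band("hall", "armory"))   # same or adjacent zone
print(range_band("armory", "tower"))  # distant zone
```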

    The zone system turns out to be very good at handling unconventional situations. My favorite usage was an Attack on Titan one-shot adventure I wrote up during testing. I defined the airspace around each titan as its own zone, which made it easy to track the motions of player characters zipping around with 3D maneuvering gear.

 

    I haven’t worked much on my system since the college project. This is partly because I’ve been busy with other things, but actually the main reason is that I found a published system that does an excellent job of accomplishing the goals I described above.

    I’m talking about Dungeon World by Sage LaTorra and Adam Koebel. The rules are simple and incredibly easy for new players to pick up. Each class has its own character sheet, which consists of a set of multiple-choice options that players fill out to create their characters. Dice rolls add only ability modifiers, and the result is compared against a constant scale (6 or lower for failure, 7-9 for mixed success, and 10 or higher for unconditional success). The system largely leaves it to the players to interpret the outcomes of rolls, and I’ve found this really helps groups get into the swing of cooperative storytelling.

    Dungeon World made me realize that my own thinking on pen-and-paper design was still too rooted in traditions of strategic wargaming and simulation. As someone who loves strategy games I really love designing those types of systems, but in many ways they get in the way of the surviving strong points of pen-and-paper roleplaying.

 

    Another great resource to have when designing a pen-and-paper game—and truthfully any RPG—is Whitson John Kirk the Third’s scholarly project Design Patterns of Successful Role-Playing Games, which is publicly available here.

 

 

Procedural Puzzles Generated Genetically

       A few years ago I designed some logic-maze puzzles around the mechanisms of color coded doors and keys (see Perspective). I was very interested in the possibility of procedurally generating this kind of puzzle, but at the time I was missing some of the skills and knowledge to effectively tackle the problem, so I shelved the idea.

       I started thinking about procedural puzzle generation again last year after attending an excellent talk on the subject given by Ty Taylor, the designer of Tumblestone (here is a video of essentially the same talk). For Tumblestone, Ty was able to quantify the qualities he was looking for in his puzzles, which allowed him to cull over 99% of his pool through automated screening.

       After the talk it occurred to me that a genetic algorithm is a pretty natural extension of this automated curation system. Instead of culling puzzles for minor defects, the genetic approach applies mutations to imperfect-but-promising puzzles in search of increasingly fit configurations.

       As such, the genetic approach needs to be used a bit differently from other search methods. Because its population of puzzles converges towards an optimal solution, a single simulation isn’t going to yield more than one puzzle. For most applications you can probably just run the simulation over and over with different seeds and produce a collection of distinct levels. However, it could be the case that the convergent forces are quite strong, and consecutive simulations will produce puzzles that are very similar.

       If this is the case, you can still generate a healthy diversity of levels by running simulations with differing combinations of genetic operators and fitness criteria. Sure, this is a little more work, but it’s actually a very appropriate approach to creating content for a puzzle game, since you need puzzles with widely differing properties to fulfill various roles: simple puzzles to teach the basics, intermediate puzzles to introduce new mechanics, advanced puzzles to challenge the player, etc…

       Let’s apply the genetic approach to my logic mazes. Before we get into the specific details, I want to define two concepts, the state graph and the mechanical model. These actually pertain to any puzzle which asks the player to transmute the puzzle into its finished state by performing a series of steps or making a set of choices: Tumblestone, the Rubik’s Cube, Sudoku, etc...

       The fundamental similarity between all of these puzzles is that, in their most abstract form, they are nothing more than mazes; webs of possible states that the player navigates from start to finish. This is the state graph.

       These puzzles can also be described by much simpler mechanical models that summarize a puzzle as a set of rules or a logical framework. For example, a Rubik’s Cube has a state graph with quintillions of nodes, but it can also be expressed (and physically embodied) as a very elegant set of mechanisms that conform to intuitive spatial logic.

       The mechanical model of my logic maze is also a graph, though a much smaller one than the state graph it produces. The nodes represent rooms that the player’s avatar can occupy, and the edges represent doors to other nodes which can only be traversed under certain conditions. Note that the model only cares about conditional doors. If two rooms are connected by a passage that is always open, they are functionally one node on the model.

       Rooms also contain interactive elements like levers and keyholes which the player can manipulate to open and close specific doors in the level. The puzzle I crafted for Perspective featured a few one-time keys for specific doors, a switch that alternated opening/closing two doors (the rotating walkway), and lots of color coded doors (there was a keyhole on each side of the door for the matching key, but that meant you couldn’t take the key with you when you walked through). Beyond these, there are plenty of logical systems we could use as puzzle mechanics: one-way doors, doors that require a key but allow you to carry it through, spatial logic like moving platforms and sliding blocks, the list is endless… it’s just a question of what we program in.

Perspective Ground Floor.png

       To generate a level, we begin with a population of seed mechanical models. The simplest seed is just a single node that contains the start marker, the end door, and a switch or key to open the end door. Each generation, we will perform genetic operations on the models in the population. Genetic algorithms have two common types of operation: crossover and mutation. Crossover is a bit more tricky for puzzles, so for now we will just work with mutation.

       We can mutate models by adding, removing, rearranging, or reconfiguring our elements (nodes, doors, interactables). It also doesn’t hurt to use more sophisticated mutation operations to decrease the rate of dud levels, though we want to make sure we aren’t overly constraining the shapes that can be reached via mutation. In this case, we should mostly operate on the models by splitting/merging existing nodes because with these operations there is a better chance that the model will retain/improve its functionality.
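
       The overall loop can be sketched as follows. This is a generic mutation-only search with simple elitist selection, not a worked-out puzzle generator: the `evolve` signature is an assumption, and `toy_mutate` and `toy_fitness` are stand-ins that operate on a tuple of integers so the example runs on its own. A real version would split/merge nodes of the mechanical model and score the resulting state graph instead.

```python
import random


def evolve(seed, mutate, fitness, generations=50, pop_size=20, keep=5, rng=None):
    """Mutation-only genetic search with elitist truncation selection."""
    rng = rng or random.Random()
    population = [seed] * pop_size
    for _ in range(generations):
        # Keep the fittest few unchanged, then refill with mutated copies of them.
        survivors = sorted(population, key=fitness, reverse=True)[:keep]
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(pop_size - keep)]
        population = survivors + children
    return max(population, key=fitness)


def toy_mutate(model, rng):
    """Stand-in mutation: add, remove, or change one element of the tuple."""
    model = list(model)
    op = rng.choice(["add", "remove", "change"])
    if op == "add" or not model:
        model.insert(rng.randrange(len(model) + 1), rng.randrange(4))
    elif op == "remove":
        model.pop(rng.randrange(len(model)))
    else:
        model[rng.randrange(len(model))] = rng.randrange(4)
    return tuple(model)


def toy_fitness(model):
    """Stand-in fitness: reward variety, penalise distance from 6 elements."""
    return len(set(model)) - abs(len(model) - 6)


best = evolve((0,), toy_mutate, toy_fitness, rng=random.Random(1))
print(best, toy_fitness(best))
```

       Because the survivors are carried over unchanged, the best score in the population never decreases from one generation to the next. Running the loop again with a different seed for `rng` is the "run it over and over" strategy mentioned earlier.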

       We also need to define our fitness criteria. A lot of information we want to assess comes from the puzzle’s state graph, so the first step in assigning a fitness score is to explore the state graph (begin at the start marker and perform a breadth-first search of reachable states). Here are some of the things we will want to consider:

  • Solvable: Is there a path through the state graph from start to finish? If not, the level should be immediately axed. Even if there is a route from the start to the solution, there could still be areas of the graph that don’t have a route to the solution. We could just allow players to restart the level if they get stuck, but if we don’t want this to happen we need to perform a breadth-first search of inbound edges from the end-node (can always solve), or the start-node (can always backtrack).

  • The Shortest Path: The shortest path is the solution to the puzzle, and it tells us a lot about the puzzle’s quality. Its length is a good indication of the complexity and challenge of the puzzle, though in this case I want to discount state changes where the avatar simply walks to another room and focus only on how many times the player manipulates the logical systems in the level. We also want to make sure that each node on the path has many alternative edges so that players are not railroaded into the solution.

  • Layout Considerations: We want these levels to be pretty compact, so it makes sense to impose upper limits on the number of positional nodes and edges. Depending on how we plan to lay the level out for players, there will be other things to select against; for example, neighborhoods of position nodes with lots of connections to each other can make things difficult for grid layouts.

  • Simulated Playtesting: Probably the best assessment method if done right. Essentially we ask the program to attempt the puzzle some number of times using heuristics such as a human might use. This can give us more accurate information on how interesting/challenging the puzzle actually is for a human. I haven’t thought out the exact details of how this would work for these logic mazes. I’m currently looking into some interesting literature on a genetic puzzle generation algorithm that accounts for implicit mental steps, so expect more thoughts on this subject in the future.
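
       The solvability checks in the first bullet can be sketched with two breadth-first searches: one forward from the start to build the state graph, and one over inbound edges from the goal states to find every state that can still reach the solution. The tiny level below (rooms, one carry-through key, a one-way drop into a pit) is entirely hypothetical and only exists to exercise the code.

```python
from collections import deque

# Hypothetical level: a key lying in room A, and doors as (destination, key needed).
KEYS = {"A": "red"}
DOORS = {
    "A": [("B", None)],
    "B": [("A", None), ("PIT", None), ("END", "red")],
    "PIT": [],  # one-way drop: no way back out
    "END": [],
}


def neighbors(state):
    """Yield every state reachable in one step from (room, keys held)."""
    room, keys = state
    if room in KEYS and KEYS[room] not in keys:
        yield (room, keys | {KEYS[room]})  # pick up the key
    for dest, need in DOORS[room]:
        if need is None or need in keys:
            yield (dest, keys)             # walk through a door


def explore(start, neighbors):
    """Breadth-first search of the state graph; returns {state: [next states]}."""
    graph = {}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state in graph:
            continue
        graph[state] = list(neighbors(state))
        queue.extend(graph[state])
    return graph


def can_reach(graph, targets):
    """States that can reach any target, found by searching inbound edges."""
    inbound = {s: [] for s in graph}
    for s, outs in graph.items():
        for t in outs:
            inbound[t].append(s)
    seen = set(targets)
    queue = deque(targets)
    while queue:
        for prev in inbound[queue.popleft()]:
            if prev not in seen:
                seen.add(prev)
                queue.append(prev)
    return seen


start = ("A", frozenset())
graph = explore(start, neighbors)
goals = [s for s in graph if s[0] == "END"]
solvable = bool(goals)
stuck = set(graph) - can_reach(graph, goals)
print(solvable, sorted(s[0] for s in stuck))
```

       Here the level is solvable, but the pit states show up in `stuck`, so a generator that wants "can always solve" levels would reject or penalise it. Passing `[start]` as the targets instead gives the "can always backtrack" variant.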

       These are just some of the metrics to look into. In fact I've touched on a lot of topics that deserve much more extensive exploration than I can fit into a blog post. The main observation to take from all this is that puzzles are all pretty similar in the abstract, and a lot of the fitness criteria we have discussed are broadly applicable without much adjustment. I’m intrigued by the concept of a very flexible program that can use this abstract approach to generate and analyze many different types of puzzles on request. I have the skeleton of this program floating around in my head, but I should probably rein myself in and build a simple logic maze generator first...

 

Mobile UI Musings, Part One

Recently I was kicking around some ideas for a digital card game, and it led to some reflections on UI design for touch screens. Back when I was putting together the layout for Orbit, I had my phone’s 4” screen in mind, and my primary concern was making as much space as possible to display the planet. This resulted in the game’s current landscape layout, with the planet in the center, buttons to either side, and purely informational elements pushed to the top and bottom of the screen. Players are intended to hold the phone with both hands and input with their thumbs.

Players can grip a smaller phone in both hands and comfortably access the whole screen with their thumbs.

The problem with the “two-thumb” grip is that it doesn’t scale well. Between my thumbs I can comfortably cover the screen of my iPhone 5s, but someone with smaller hands or a larger screen (and screens are larger these days) would probably have trouble reaching the centerline.

It’s unsurprising then that most mobile games are designed for input with just the index finger on a free hand, since this allows for comfortable access to the center of a screen regardless of its size. The important thing to keep in mind when designing single-finger layouts is that you want to maintain the compactness of the input area on larger screens.

Hearthstone’s tablet interface is a great example of this principle. The main input area occupies only a fraction of the screen. The layout and gravity of the game pieces ensure that most touch actions are short, upwards-dragging gestures that can be performed with minimal hand motion. The end-turn button is expressly placed at the periphery to avoid players pressing it accidentally.

Built with tablets in mind, Hearthstone makes input comfortable with just an index finger by confining the touch area to a vertical column.

Am I planning to rework Orbit’s UI? I will probably keep it as is for 4” screens, but for larger screens it definitely makes sense to rearrange the UI. Since I don’t actually want to scale the planet’s screen space up proportionally for larger screens (keeping the input area compact), I should have room to line up the build buttons along the bottom of the screen where they will be easiest to access via index finger. It might even make sense to switch to a portrait view. I am definitely looking forward to doing some experiments.

GP Atlanta 2016

Last month I played in the main event of Grand Prix Atlanta. I had played plenty of Pre-releases and FNMs before, but this was my first large-scale tournament. I consider myself to be a pretty good limited player, and I thought if I opened a good pool I would have a reasonable shot at making day two. As it turned out, I opened an excellent pool and was one of five players who finished day one undefeated!

The best part of going 9-0 on day one was that it started me near the top of the pairing ladder on day two, and I got to draft with and play against a bunch of fantastic pro and semi-pro players. I was definitely a little outgunned (I ended the day 1-5), but I had a ton of fun. Prior to the main event, I had actually only drafted Kaladesh twice (talk about unprepared for pod one), and I learned a lot about the set's limited environment by drafting and talking with such skilled players. I am also a bit tickled to have a little footnote in the coverage archives.

Me at the end of day one. Drained from a ten-hour tournament day, but very pleased with myself.

Lasso Selection for RTS Games

I recently read an excellent post by The Ulf arguing for the abolishment of box selection in favor of better unit selection tools. I liked that he created a circular brush selection demo as proof of concept, and I decided to go ahead and put together a little experiment for the selection method I am most interested in: lasso selection.

This is just a quick experiment, not a functional demo like Ulf's. As you can see, the lasso is a pretty quick and intuitive way of grabbing groups of units, especially specific pieces of formations. The selection is a little rough right now, but with some tweaks I think it will be possible to make precise selections near-instantaneously.
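
For reference, one common way to decide which units fall inside a lasso is to treat the drawn path as a closed polygon and run a ray-casting (even-odd) point-in-polygon test on each unit's position. The sketch below is not taken from the experiment described above; the function names and the sample data are made up for illustration.

```python
def inside_lasso(point, lasso):
    """Ray-casting (even-odd) test: is `point` inside the closed polygon `lasso`?"""
    x, y = point
    inside = False
    for i in range(len(lasso)):
        x1, y1 = lasso[i]
        x2, y2 = lasso[(i + 1) % len(lasso)]  # wrap around to close the loop
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that line
            cross_x = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < cross_x:
                inside = not inside
    return inside


def select_units(units, lasso):
    """Return the names of all units whose position lies inside the lasso."""
    return [name for name, pos in units.items() if inside_lasso(pos, lasso)]


lasso = [(0, 0), (4, 0), (4, 4), (0, 4)]           # points sampled along the drag
units = {"a": (1, 1), "b": (5, 1), "c": (3, 3.5)}  # hypothetical unit positions
print(select_units(units, lasso))
```

The test costs one pass over the lasso points per unit, so sampling the mouse path sparsely keeps selection cheap even with many units on screen.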