How our users exploited concurrency and how we fixed it

A story of a game exploit

Once upon a time I developed a somewhat popular web game called Forumwarz. At its peak, we were serving about 6 million dynamic requests a day off a single quad-core server.

Forumwarz limits how many turns a player can take in a day. We designed it this way so that the competitive aspect of the game wasn’t simply a contest of who had the most time available to play. Players had to choose their targets wisely.

A side effect of having limited turns was that we had a good idea of the maximum score a player could earn in a day. However, a few weeks after the game started to become popular, we noticed some people were earning many times more than they should have been able to. Somehow they had figured out a way to exploit the game and it was giving them a huge advantage on our leaderboards!

I reviewed the history of a couple of the offending players. The curious thing I discovered was that they weren’t taking any more turns than the others; they were just being rewarded much more.

There was only one spot in the codebase that rewarded players, and it looked something like this:

# Keep everything in a database transaction
Player.transaction do

  # Find the player's current goal
  goal = player.goal

  # Make sure we don't reward goals that have already been completed
  unless goal.completed?
    goal.update_column :completed, true
    player.increment!(:score, goal.score)
  end

end

After much headdesking, I eventually discovered that the above code is not safe under concurrency.

On your local machine in Rails’ development mode, where requests are typically processed one at a time, the code will appear to work properly. This is why the exploit never came up while I was testing the game. However, if you run it in a concurrent environment (such as our quad-core server), players can reward themselves multiple times by making many requests in a very short period of time.

This happened to be what our players were doing. They discovered that if they submitted their final blow many times quickly, they’d be rewarded more than once.

Why does the code fail?

If you step through the above code with multiple processes in mind, the error is fairly obvious. Let’s say two requests come in to your web server at the same time. Both will reach this line:

goal = player.goal

which will trigger a database query that looks like this:

SELECT goals.* FROM goals WHERE goals.player_id = 1234

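At this point, the database will return the same values for both queries, as the update statement hasn’t happened yet. Wrapping the code in a transaction doesn’t help here: at the default isolation levels of common databases, both transactions are free to read the row before either one writes to it. This means that when we reach this line: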

unless goal.completed?

Both of them will pass! The player score will be updated twice.
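
To make the interleaving concrete, here is a minimal sketch in plain Ruby with no Rails or database involved. The hash and its keys are hypothetical stand-ins for the goals and players rows, and the two snapshots play the role of the two SELECT results:

```ruby
# Toy in-memory stand-in for the goals and players tables (hypothetical data).
db = { goal_completed: false, goal_score: 100, player_score: 0 }

# Step 1: requests A and B both run the SELECT before either one writes.
snapshots = { a: db[:goal_completed], b: db[:goal_completed] }

# Step 2: each request checks its own stale snapshot, then writes.
rewards_granted = 0
snapshots.each_value do |completed|
  next if completed                     # unless goal.completed?
  db[:goal_completed] = true            # goal.update_column :completed, true
  db[:player_score] += db[:goal_score]  # player.increment!(:score, goal.score)
  rewards_granted += 1
end

puts "rewards granted: #{rewards_granted}, score: #{db[:player_score]}"
```

Run as a script, this reports two rewards and a score of 200 for a goal worth 100 points, because neither request ever sees the other's write.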

The Solution

After I discovered the cause of the exploit, I was a little worried. I’ll be the first to admit I am wet behind the ears when it comes to multithreading and concurrency. I was afraid that I’d have to resort to semaphores or some other kind of exotic programming construct.

Fortunately, I soon found an easy solution: let the database handle the concurrency. Much smarter developers than me have put thousands of hours of work into databases to make sure they hold up under concurrent situations such as these. All I’d have to do is leverage their hard work.

Here’s the solution I came up with:

Player.transaction do

  # Look up the goal so we know how many points it is worth
  goal = player.goal

  # Update completed attribute to true, but only when it's currently false
  row_count = Goal.where(player_id: player.id, completed: false).update_all(completed: true)

  # Update the player score only if completed changed in the database
  if row_count == 1
    player.increment!(:score, goal.score)
  end

end

The key to the above solution is that your RDBMS will return a count of how many rows it changes when you execute an UPDATE. Only one request will receive a row count of 1 back. All others will receive 0 and will execute nothing. It just works!
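
If you want to see the row-count behavior without a database, here is a rough sketch in plain Ruby. It assumes the database applies competing UPDATEs to the same row one at a time, which a Mutex imitates here. The FakeGoalsTable class and its complete_if_pending method are made up for illustration and are not part of Rails:

```ruby
# Toy model of a single goals row. The Mutex stands in for the row lock a
# database holds while it runs an UPDATE. All names here are hypothetical.
class FakeGoalsTable
  def initialize
    @lock = Mutex.new
    @completed = false
  end

  # Mimics: UPDATE goals SET completed = true WHERE completed = false
  # Returns the number of rows changed (0 or 1).
  def complete_if_pending
    @lock.synchronize do
      return 0 if @completed
      @completed = true
      1
    end
  end
end

goals = FakeGoalsTable.new
score = 0
score_lock = Mutex.new

# Twenty "requests" all try to claim the reward at once.
row_counts = Array.new(20) do
  Thread.new do
    rows = goals.complete_if_pending
    score_lock.synchronize { score += 100 } if rows == 1
    rows
  end
end.map(&:value)

puts "requests that changed the row: #{row_counts.sum}, score: #{score}"
```

However the threads are scheduled, exactly one of them gets a row count of 1, so the score ends up at 100 rather than 2000.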

I highly recommend using the above approach any time you want to trigger a secondary effect following a successful update. It’s easy to write and your application will be a lot safer.