By Reto.masterdisaster17, Server Programmer
So, it’s another happy day in the land of the Backend Programming Tyrannosaurus. I’ve made peace with the sheep; we were just too different, I suppose, but I’ll miss the wool.
Anyhow, let’s talk about my other project. Y’know, that game thing? We have servers! In fact, we have lots of servers that do different stuff, and I thought you might like to know what the hell that stuff is. Let’s have a look at a pretty picture, shall we?
What we’ve got here is… failure to communicate a server architecture. It’s designed to keep you, the players, happy. And when are you happy? You’re happy (or we hope you’re happy) when:
- The servers don’t crash (So you can kill stuff all the time)
- The servers respond rapidly (So you don’t have to wait too long to kill stuff)
- The servers reliably save data (So it doesn’t forget that you killed stuff)
Note that there is no dedicated happiness server; that’s a function of the whole thing. But what does all this stuff actually do? Let’s look at moving a single assault team as an example.
- You see that the hated enemy is moving towards ArschKickenBurg in southern Germany, and decide that your armor assault team, the P4NZ3RCL0WnZ, wants to join the fight. You select your assault team and drag the movement arrow to ArschKickenBurg.
- The client sends a movement command to the clienthandler it is connected to.
- The clienthandler receives the command. It computes the travel time to ArschKickenBurg from Canne d’Wuparsa, where you started, and sets up a timer to notify itself when you arrive.
- The clienthandler sends the movement command (timer data, changes to destination, etc.) to the RamStorageNode_War1 node. (Movement is War-specific data, so all the data goes to the War1 branch; if player data were involved, some of it would be split off and sent to the RamStorageNode_Global node.)
- RamStorageNode_War1 checks that your action is possible, i.e. that you aren’t simultaneously trying to move the same assault team somewhere else (you might be connected from your mobile phone as well as your PC), and that some other circumstance hasn’t invalidated your move before you even made it. If everything is OK, it writes the data to the in-memory database (RAM node, get it?) and starts notifying a lot of listeners.
- One listener is the DiskStorageNode_War1 node. This node makes sure that the data stored in the in-memory database is also saved to persistent storage, so that the data doesn’t disappear if we have to restart the server (for example when deploying new versions).
- The other listeners are the clienthandlers. Since this is the War1 RAM node, it only notifies clienthandlers 1 to 4.
- The clienthandlers notify their connected clients that ArschKickenBurg is about to get a visit from the P4NZ3RCL0WnZ. This includes your OWN client; your client won’t show the move happening until every other client is also being notified.
- You lean back and get ready to kill stuff.
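For the code-minded, here is a rough Python sketch of the steps above. It is not our actual server code: the class names, the `speed` and `distance` parameters, and the "one move per team" rule are all assumptions made up for illustration. It only shows the shape of the flow: the clienthandler computes an arrival time, the RAM node validates and stores the move, and then every listener (disk storage and clienthandlers alike) gets notified.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class MoveCommand:
    team: str
    origin: str
    destination: str
    arrival_time: float  # computed by the clienthandler before forwarding


class RamStorageNode:
    """Hypothetical RAM node: validate a write, store it, notify listeners."""

    def __init__(self) -> None:
        self.moves: Dict[str, MoveCommand] = {}  # team -> move in progress
        self.listeners: List[Callable[[MoveCommand], None]] = []

    def subscribe(self, listener: Callable[[MoveCommand], None]) -> None:
        self.listeners.append(listener)

    def submit(self, cmd: MoveCommand) -> bool:
        # Assumed rule: a team that is already moving cannot be moved again,
        # e.g. when the same player is also connected from a phone.
        if cmd.team in self.moves:
            return False
        self.moves[cmd.team] = cmd
        for notify in self.listeners:
            notify(cmd)
        return True


class ClientHandler:
    """Hypothetical clienthandler: forwards commands and relays changes."""

    def __init__(self, name: str, ram_node: RamStorageNode, speed: float = 10.0) -> None:
        self.name = name
        self.ram_node = ram_node
        self.speed = speed  # assumed travel speed, distance units per second
        self.seen: List[MoveCommand] = []
        ram_node.subscribe(self.on_change)

    def handle_move(self, team: str, origin: str, destination: str,
                    distance: float, now: float) -> bool:
        arrival = now + distance / self.speed
        return self.ram_node.submit(MoveCommand(team, origin, destination, arrival))

    def on_change(self, cmd: MoveCommand) -> None:
        # Stand-in for pushing the change to every connected client.
        self.seen.append(cmd)
```

Note that the clienthandler that sent the command only learns about the move through `on_change`, the same way every other clienthandler does, which mirrors the "your own client waits too" behaviour described in the list.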
This architecture provides reasonably fast writes and blazing fast reads. When you request data from the system, your client only has to wait for the clienthandler’s local in-memory database. When you write, your client only has to wait for the changes to be written to the RAM node, because disk storage is just queued up and happens when it happens. (The observant reader will note that in the most extreme failure scenario, where a RAM node and a disk storage node fail simultaneously, we could lose any data still queued. Happily, the disk storage is pretty fast, so the queue is never that long.)
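The "write to RAM now, write to disk whenever" idea is usually called write-behind. Here is a minimal Python sketch of it, assuming a single background worker thread and using plain dictionaries as stand-ins for both the in-memory database and the disk. The class and method names are invented for this example and say nothing about how our nodes are actually built.

```python
import queue
import threading


class WriteBehindStore:
    """Hypothetical store: writes land in RAM at once, disk catches up later."""

    def __init__(self) -> None:
        self.ram = {}
        self.disk = {}  # stand-in for persistent storage
        self._pending = queue.Queue()
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def write(self, key, value) -> None:
        # The caller only waits for the in-memory update and the enqueue.
        self.ram[key] = value
        self._pending.put((key, value))

    def read(self, key):
        # Reads never touch the disk.
        return self.ram[key]

    def _drain(self) -> None:
        # Background worker: persist queued writes in the order they arrived.
        while True:
            key, value = self._pending.get()
            self.disk[key] = value
            self._pending.task_done()

    def flush(self) -> None:
        # Block until everything queued so far has been persisted.
        self._pending.join()
```

Anything still sitting in `_pending` when both the RAM copy and the worker die is exactly the data that the extreme failure scenario would lose.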
The design is completely modular. Every node you see on the diagram can run on its own machine, on the same machine in one shared process, or on the same machine in its own process. This means that I can load up the entire thing on my workstation for testing, or I can deploy it to live, where each node has its own dedicated hardware for maximum performance and reliability. When running on separate machines like that, each node is completely independent of the others; if a node fails for some reason, the other nodes will simply wait for it to come back online. In some cases, you might not even notice it happening.
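One common way to get that kind of modularity is to have nodes talk to each other only through a small messaging layer, so the node code doesn’t care where its peers live. The Python sketch below is an assumption about how this could look, not a description of our code: `InProcessBus`, `Node` and the outbox-and-retry behaviour are all hypothetical. It shows the "everything in one process" case; a networked bus with the same `send` method could be swapped in for live.

```python
from typing import Callable, Dict, List, Tuple

Handler = Callable[[dict], None]


class InProcessBus:
    """Hypothetical transport: delivers messages by direct function call."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, node_name: str, handler: Handler) -> None:
        self._handlers[node_name] = handler

    def send(self, node_name: str, message: dict) -> bool:
        handler = self._handlers.get(node_name)
        if handler is None:
            return False  # target is offline; the sender keeps the message
        handler(message)
        return True


class Node:
    """Hypothetical node that waits for offline peers instead of failing."""

    def __init__(self, name: str, bus: InProcessBus) -> None:
        self.name = name
        self.bus = bus
        self.inbox: List[dict] = []
        self.outbox: List[Tuple[str, dict]] = []  # undelivered messages
        bus.register(name, self.inbox.append)

    def send(self, target: str, message: dict) -> None:
        self.outbox.append((target, message))
        self.retry()

    def retry(self) -> None:
        # Try every waiting message again; keep the ones that still fail.
        self.outbox = [(target, message) for target, message in self.outbox
                       if not self.bus.send(target, message)]
```

In this sketch a message to a missing node just sits in the sender’s outbox until a later `retry()` succeeds, which is one simple way to get the "wait for it to come back online" behaviour.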
See? Simple, really. I’ve left out a lot of important details: the Log service that lets all the nodes report status, performance and warnings to a central log; the statistics DB that keeps track of your history; the CronJobs that do all the things that are not player actions; the game manager that launches action games. There’s stuff in here for a second blog post, so that might happen :)