I've been promising to write a technical post on Minecraft for a while, but never really got around to doing so. I'm on a tiny airplane now, though, with nowhere to run, so here we go!
One of the most complex parts of Minecraft is the terrain generation. When I changed the game over from being just single zones of a map to an infinite map, the terrain generation got a whole lot more complicated, as terrain needs to get generated on the fly as the player explores, and it has to be the same no matter what direction the player approaches it from.
1) How infinite is it?
First of all, let me clarify some things about the “infinite” maps: They're not infinite, but there's no hard limit either. It'll just get buggier and buggier the further out you are. Terrain is generated, saved and loaded, and (kind of) rendered in chunks of 16*16*128 blocks. These chunks have an offset value that is a 32-bit integer roughly in the range negative two billion to positive two billion. If you go outside that range (about 25% of the distance from where you are now to the sun), loading and saving chunks will start overwriting old chunks. At a 16th of that distance, things that use integers for block positions, such as using items and pathfinding, will start overflowing and acting weird, since a block position is the chunk offset times 16.
Those are the two “hard” limits.
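To make the second limit concrete, here is a minimal Java sketch of how a block position computed from a chunk offset wraps around in 32-bit arithmetic. The class and method names are hypothetical and not taken from the game's source; only the chunk width of 16 comes from the text above.

```java
// Hypothetical sketch: names are illustrative, not Minecraft's actual code.
final class ChunkLimits {
    static final int CHUNK_SIZE = 16; // blocks per chunk along each horizontal axis

    // Block coordinate in 32-bit int arithmetic; wraps silently on overflow.
    static int blockXInt(int chunkX) {
        return chunkX * CHUNK_SIZE;
    }

    // Same computation widened to 64 bits, which cannot overflow for any int chunk offset.
    static long blockXLong(int chunkX) {
        return (long) chunkX * CHUNK_SIZE;
    }

    public static void main(String[] args) {
        int lastSafeChunk = Integer.MAX_VALUE / CHUNK_SIZE;  // 134217727
        System.out.println(blockXInt(lastSafeChunk));        // 2147483632
        System.out.println(blockXInt(lastSafeChunk + 1));    // -2147483648 (wrapped)
        System.out.println(blockXLong(lastSafeChunk + 1));   // 2147483648
    }
}
```

The wrap happens one chunk past 134,217,727, which is a 16th of the range of the chunk offset itself.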
Most other things, like the terrain generation seeds and entity locations, use 64-bit doubles for locations, and they do much subtler things. For example, at extreme distances, the player may move slower than near the centre of the world, due to rounding errors (the position has a huge magnitude and the movement delta a tiny one, so the low bits of the delta get cut off when the two are added). The terrain generator can also start generating weird structures, such as huge blocks of solid material, but I haven't seen this lately nor examined exactly what behaviour causes it to happen. One major problem at long distances is that the physics starts bugging out, so the player can randomly fall into ground blocks or get stuck while walking along a wall.
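The rounding effect on movement can be shown with a small Java sketch. The names are hypothetical, and the distance of 1e16 is deliberately exaggerated (far beyond anything reachable in the game) so that the loss is total rather than partial.

```java
// Hypothetical sketch of precision loss when adding a small delta to a large double.
final class PrecisionDemo {
    // How far the position actually moves when delta is added in double arithmetic.
    static double actualStep(double position, double delta) {
        return (position + delta) - position;
    }

    public static void main(String[] args) {
        double delta = 0.1; // assumed per-tick movement, for illustration only

        // Near the origin the full step survives.
        System.out.println(actualStep(0.0, delta));   // 0.1

        // Math.ulp gives the spacing between adjacent doubles at a given magnitude.
        System.out.println(Math.ulp(1e16));           // 2.0

        // A step smaller than half that spacing is rounded away entirely.
        System.out.println(actualStep(1e16, delta));  // 0.0
    }
}
```

At less extreme distances the step is not lost outright but is rounded to the nearest representable position, which is one way a player could end up moving slightly slower or faster than intended.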
Many of these problems can be solved by changing the math into a local model centered around the player so the numbers all have vaguely the same magnitude. For rendering, Minecraft already uses local coordinates within the block and offsets the block position relative to the player to give the impression of the player moving. This is mostly due to OpenGL using 32-bit floats for positions, but also because the rounding errors are extremely visible when displayed on a screen.
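A rough Java sketch of why player-relative coordinates help when positions end up as 32-bit floats. The names are hypothetical; the only idea taken from the text is subtracting the player position before narrowing to float.

```java
// Hypothetical sketch: world-space versus player-relative coordinates as floats.
final class LocalCoords {
    // Naive: narrow the world position straight to float; precision drops far from the origin.
    static float naive(double worldX) {
        return (float) worldX;
    }

    // Player-relative: subtract in double first, then narrow the small difference.
    static float relative(double worldX, double playerX) {
        return (float) (worldX - playerX);
    }

    public static void main(String[] args) {
        double playerX = 16_777_216.0;   // 2^24, where adjacent floats are 2 apart
        double vertexX = playerX + 0.5;  // a vertex half a block from the player

        // Both values collapse to the same float, so the half-block offset disappears.
        System.out.println(naive(vertexX) - naive(playerX));  // 0.0

        // Subtracting first keeps the offset exactly.
        System.out.println(relative(vertexX, playerX));       // 0.5
    }
}
```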
We’re probably not going to fix these bugs until it becomes common for players to experience them while playing legitimately. My gut feeling is that nobody ever has so far, and nobody will. Walking that far will take a very long time. Besides, the bugs add mystery and charisma to the Far Lands.
2) Isn’t that terrain shape pretty awesome?
In the very earliest version of Minecraft, I used a 2D Perlin noise heightmap to set the shape of the world. Or, rather, I used quite a few of them. One for overall elevation, one for terrain roughness, and one for local detail. For each column of blocks, the height was (elevation + (roughness*detail))*64+64. Both elevation and roughness were smooth, large scale noises, and detail was a more intricate one. This method had the great advantage of being very fast as there are just 16*16*(noiseNum) samples per chunk to generate, but the disadvantage of being rather dull. Specifically, there's no way for this method to generate any overhangs.
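The height formula above can be sketched in Java as follows. The Noise2D interface stands in for whatever noise implementation is used and is an assumption, as are the rounding with floor and the clamp to the 0..127 block range; only the formula itself comes from the text.

```java
// Hypothetical sketch of the 2D heightmap formula; the noise sources are placeholders.
final class HeightmapSketch {
    // Stand-in for a 2D noise function (assumed to return values around -1..1).
    interface Noise2D {
        double sample(double x, double z);
    }

    static final int WORLD_HEIGHT = 128; // chunk height from the text

    static int columnHeight(Noise2D elevation, Noise2D roughness, Noise2D detail, int x, int z) {
        double e = elevation.sample(x, z);
        double r = roughness.sample(x, z);
        double d = detail.sample(x, z);

        // Formula from the text: (elevation + (roughness * detail)) * 64 + 64
        double h = (e + r * d) * 64 + 64;

        // Assumption: round down and clamp to the valid block range.
        int height = (int) Math.floor(h);
        return Math.max(0, Math.min(WORLD_HEIGHT - 1, height));
    }
}
```

Because each column yields a single height, every column is solid below that height and empty above it, which is why this approach cannot produce overhangs.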
So I switched the system over into a similar system based off 3D Perlin noise. Instead of sampling the “ground height”, I treated the noise value as the “density”, where anything lower than 0 would be air, and anything higher than or equal to 0 would be ground. To make sure the bottom layer is solid and the top isn't, I just add the height (offset by the water level) to the sampled result.
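A minimal Java sketch of the density idea. The exact formula is not given in the text, so the details here are assumptions: the Noise3D interface is a placeholder, the water level of 64 and the bias scale of 1/32 are made-up values, and the bias is written as (water level - y) so that low blocks are pushed towards solid and high blocks towards air.

```java
// Hypothetical sketch of a 3D density test; constants and sign convention are assumptions.
final class DensitySketch {
    // Stand-in for a 3D noise function (assumed to return values in -1..1).
    interface Noise3D {
        double sample(double x, double y, double z);
    }

    static final int WATER_LEVEL = 64;           // assumed water level
    static final double HEIGHT_BIAS = 1.0 / 32;  // assumed strength of the vertical bias

    // Noise value plus a height term that is positive below the water level
    // and negative above it.
    static double density(Noise3D noise, int x, int y, int z) {
        return noise.sample(x, y, z) + (WATER_LEVEL - y) * HEIGHT_BIAS;
    }

    // Rule from the text: lower than 0 is air, 0 or higher is ground.
    static boolean isSolid(Noise3D noise, int x, int y, int z) {
        return density(noise, x, y, z) >= 0;
    }
}
```

With these assumed constants the bias is +2 at y = 0 and about -1.97 at y = 127, so noise limited to -1..1 can never make the bottom layer air or the top layer solid.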
Unfortunately, I immediately ran into both performance issues and playability issues. Performance issues because of the huge amount of sampling that needed to be done, and playability issues because there were no flat areas or smooth hills. The solution to both problems turned out to be just sampling at a lower resolution (scaled 8x along the horizontals, 4x along the vertical) and doing a linear interpolation. Suddenly, the game had flat areas, smooth hills, and also most single floating blocks were gone.
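The low-resolution sampling can be sketched as a trilinear interpolation over a coarse grid. This is an illustrative Java sketch, not the game's code: the CoarseNoise interface and all names are hypothetical, and only the 8x horizontal and 4x vertical scales come from the text.

```java
// Hypothetical sketch: density sampled on a coarse grid and linearly interpolated.
final class InterpolatedDensity {
    // Stand-in for the expensive density function, evaluated only at coarse grid points.
    interface CoarseNoise {
        double sample(int cx, int cy, int cz);
    }

    static final int H_SCALE = 8; // blocks between samples horizontally
    static final int V_SCALE = 4; // blocks between samples vertically

    static double lerp(double a, double b, double t) {
        return a + (b - a) * t;
    }

    static double density(CoarseNoise noise, int x, int y, int z) {
        // Coarse cell containing the block (floorDiv keeps negative coordinates consistent).
        int cx = Math.floorDiv(x, H_SCALE);
        int cy = Math.floorDiv(y, V_SCALE);
        int cz = Math.floorDiv(z, H_SCALE);

        // Fractional position of the block inside that cell, in 0..1.
        double tx = Math.floorMod(x, H_SCALE) / (double) H_SCALE;
        double ty = Math.floorMod(y, V_SCALE) / (double) V_SCALE;
        double tz = Math.floorMod(z, H_SCALE) / (double) H_SCALE;

        // Interpolate along x on the four cell edges, then along y, then along z.
        double x00 = lerp(noise.sample(cx, cy, cz), noise.sample(cx + 1, cy, cz), tx);
        double x10 = lerp(noise.sample(cx, cy + 1, cz), noise.sample(cx + 1, cy + 1, cz), tx);
        double x01 = lerp(noise.sample(cx, cy, cz + 1), noise.sample(cx + 1, cy, cz + 1), tx);
        double x11 = lerp(noise.sample(cx, cy + 1, cz + 1), noise.sample(cx + 1, cy + 1, cz + 1), tx);
        double y0 = lerp(x00, x10, ty);
        double y1 = lerp(x01, x11, ty);
        return lerp(y0, y1, tz);
    }
}
```

For brevity this version re-samples the eight corners for every block. A practical version would compute the coarse grid once per chunk: assuming samples on both edges of the chunk, that is 3*3*33 = 297 samples instead of 16*16*128 = 32768.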
The exact formula I use is a bit involved (and secret!), but it evolved slowly over time as I worked on the game. It still uses the 2D elevation and noisiness maps, though.
STILL TO COME, ON TERRAIN GENERATION:
Caves and Large Features
Trees, Lakes, and Small Features
Now I'll prepare for landing so I can switch flights!