Never reinvent the wheel

I thought I'd share some of the behind-the-scenes decisions we made when creating the Desura download system.

Posted by lodle on Jul 21st, 2011

When you purchase a game or download a free mod from Desura (our digital distribution app), we want that download to come at max speed and with 100% reliability. On paper that sounds easy; in practice it is a challenge without using Akamai or another high-priced CDN with many edge locations. To add to that challenge, we pursue the Google model of using commodity servers and accepting that they will fail, so our system needs to work around such failures automatically.
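To make the "work around failures" idea concrete, here is a minimal sketch of client-side failover across mirrors. It is not the Desura implementation: the mirror names, the `fetch` callback and the function name are all hypothetical, and a real client would also want retries, backoff and resumable transfers.

```python
from typing import Callable, Iterable


def download_with_failover(mirrors: Iterable[str],
                           fetch: Callable[[str], bytes]) -> bytes:
    """Try each mirror in turn and return the first successful result."""
    errors = []
    for mirror in mirrors:
        try:
            return fetch(mirror)
        except OSError as exc:  # covers refused connections, timeouts, resets
            errors.append((mirror, exc))
    raise ConnectionError(f"all mirrors failed: {errors}")


def fake_fetch(mirror: str) -> bytes:
    """Stand-in for a real network fetch (hypothetical): one mirror is down."""
    if mirror == "mirror-a.example":
        raise TimeoutError("mirror down")
    return b"payload from " + mirror.encode()


result = download_with_failover(["mirror-a.example", "mirror-b.example"],
                                fake_fetch)
```

The point of the sketch is only that a dead server costs the user one failed attempt instead of a failed download.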

After significant work and 3 failed attempts, I believe we have finally built a robust system and we've learnt along the way that you should never rebuild the wheel. Here's what we did:


First a custom protocol

In C++, we built an entirely custom transfer protocol from scratch to work with the Desura file format, which only downloads the parts of a file we need. This system works via a single-socket, command-and-receive approach: the client asks for a file, an offset and a download size, and the server reads the file and regurgitates the data back to the client.

  • PRO: Super fast, zero CPU
  • CON: Worked fine internally when we only had the 4 team members using 6 servers. But as Desura started growing during the public beta, leaks started to form, sockets wouldn't time out, and many processes hung in a zombie state.
  • CON: Sockets are very hard to debug: on our test machines everything always works, and issues only arise in production with tons of clients.
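For readers who want to picture the command-and-receive exchange, here is a rough sketch in Python rather than the original C++. The wire layout (`HEADER`), the function names and the 5-second timeout are assumptions made for illustration, not the real Desura protocol. It does show the one thing our first server lacked: an explicit socket timeout, so a stalled client cannot hang a handler forever.

```python
import os
import socket
import struct
import tempfile
import threading

# Hypothetical request header: path length, offset, size (network byte order).
HEADER = struct.Struct("!HQI")


def recv_exact(sock, n):
    """Read exactly n bytes or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        buf += chunk
    return bytes(buf)


def serve_one(sock):
    """Answer one request: read (path, offset, size), send the bytes back."""
    sock.settimeout(5.0)  # a stalled client raises instead of hanging forever
    path_len, offset, size = HEADER.unpack(recv_exact(sock, HEADER.size))
    path = recv_exact(sock, path_len).decode("utf-8")
    # A real server must validate the path; never open arbitrary client input.
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(size)
    sock.sendall(struct.pack("!I", len(data)) + data)


def request_range(sock, path, offset, size):
    """Client side: ask for `size` bytes of `path` starting at `offset`."""
    sock.settimeout(5.0)
    raw = path.encode("utf-8")
    sock.sendall(HEADER.pack(len(raw), offset, size) + raw)
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def demo():
    """Run one request over a local socket pair against a temporary file."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"0123456789abcdef")
        path = f.name
    client, server = socket.socketpair()
    worker = threading.Thread(target=serve_one, args=(server,))
    worker.start()
    try:
        return request_range(client, path, 4, 6)
    finally:
        worker.join()
        client.close()
        server.close()
        os.remove(path)
```

Even this toy version needs framing, short-read handling and timeouts, which is a fair hint of how much a production server has to get right.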

Next, we used a buzzword: NODE.js

So I started to investigate other options that could provide the back-end server. About two weeks earlier, node.js was released, and it seemed to scream at us as just the solution we were looking for. Node.js provided the libraries and function calls we needed; we just had to build our custom transfer protocol in this language. In production it seemed much more reliable and stable than the old server program, so we deployed it to all mirrors and called it a day.

  • PRO: By using a tested codebase and not just OS calls, it worked well, was easy to deploy, portable and didn't need compiling.
  • CON: We were trying to make node.js do something it wasn't meant to do, and again memory leaks crept in once many users started connecting.
My frustration levels after the download servers stopped working again

Third time around, we learned from our mistake

Finally our server admin (Greg) suggested that we should just use an FTP server. After all, the File Transfer Protocol has been around for 40 years and there are plenty of tried and tested server/client libraries to use. So we built our custom Desura file format to work using an FTP-like system, and since then we haven't had a problem.

  • PRO: FTP is built to send files, and it has been used and optimized for exactly this by millions of people. So it is reliable, fast, scalable and easy!
  • CON: None
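As an illustration of how close stock FTP already gets to "give me this part of the file", here is a hedged sketch using Python's standard `ftplib`. Plain FTP has a `REST` command that sets the starting offset of a transfer but no way to request a length, so this sketch simply stops reading once it has enough bytes. It is not the Desura extension; the function name is hypothetical, and servers differ in how they react to the early close.

```python
import ftplib


def fetch_range(ftp, path, offset, size, blocksize=8192):
    """Read up to `size` bytes of `path` starting at `offset` via REST + RETR."""
    ftp.voidcmd("TYPE I")  # binary mode, so offsets are counted in bytes
    conn = ftp.transfercmd(f"RETR {path}", rest=offset)
    chunks, remaining = [], size
    try:
        while remaining > 0:
            data = conn.recv(min(blocksize, remaining))
            if not data:  # end of file reached before `size` bytes
                break
            chunks.append(data)
            remaining -= len(data)
    finally:
        conn.close()
    try:
        # The server may report an aborted transfer because we closed early.
        ftp.voidresp()
    except ftplib.all_errors:
        pass
    return b"".join(chunks)


# Usage against a real server (host and file name are placeholders):
#   ftp = ftplib.FTP("mirror.example")
#   ftp.login()
#   part = fetch_range(ftp, "game.bin", 1024, 4096)
```

All of the connection handling, reply parsing and data-channel setup comes from a library that has been in use for years, which is the whole argument of this section.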

... and so this brings me to the point of this article. Never REBUILD THE WHEEL, especially when working with complex code (e.g. sockets) that needs to be 100% dependable from client to server. Building everything from scratch is nice, as it gives you control and the ability to optimize. However, unless you have unlimited time, when libraries already exist (especially tried and tested ones), learn from our lesson and use them from the beginning.

Do you reinvent the wheel?

Comments
deadrawkstar
deadrawkstar Jul 21 2011, 3:21am says:

I'm reinventing the mod news :D

+9 votes     reply to comment
jjawinte
jjawinte Jul 21 2011, 4:18am says:

So you've got a few new grey hairs and your cat's run away - you'll recover and things will keep ticking along one way or another. Secondarily, look how much knowledge you've gained.

Thankfully, some people have that insatiable need to constantly reinvent everything and for that we have made great strides in most all technological areas (as well as creating many profitable pharmaceutical stocks). Moving a little too fast these days for me, but it's simply "the nature of the beast", as they say.

+5 votes     reply to comment
(voythas)
(voythas) Jul 21 2011, 4:23am says:

How about P2P?

0 votes     reply to comment
Matt_Bak3r
Matt_Bak3r Jul 21 2011, 7:40am replied:

Maybe as a backup system, but IMO not as the main one. For Desura it's not a good idea; it needs to be reliable.

+2 votes     reply to comment
Kissaki
Kissaki Jul 24 2011, 9:25am replied:

P2P is an architectural idea. How is that not reliable?
Torrent for example is very reliable. Even more so than FTP (AFAIK).
With checksums for parts and the whole file you can make sure to keep integrity.

+2 votes     reply to comment
LordIheanacho
LordIheanacho Jul 25 2011, 10:22am replied:

Sometimes I beg to differ about torrents. Some of them always seem to take forever to complete. On the other hand, it might just be my internet connection struggling to download such files.

+1 vote     reply to comment
Protektor
Protektor Dec 15 2011, 4:55pm replied:

P2P isn't a bad option; it is exactly how World of Warcraft does their updates, to spread the load around and not slam the central server. With torrents the same thing is possible. Look at what Vuze does. They host their own torrent server that feeds out content as if it were just another P2P client running Vuze. So you get fast speeds and the help of other users hosting content to reduce the load when possible.

So using P2P technology to spread the required download bandwidth around is very possible, and it is done frequently.

+1 vote     reply to comment
masternerdguy
masternerdguy Jul 21 2011, 3:09pm replied:

also p2p is being prohibited by ISPs because of the misconception that the only real use is for piracy.

+2 votes     reply to comment
Kissaki
Kissaki Jul 24 2011, 9:24am replied:

That’s a problem with ISPs, not the protocol.
Having bugged and/or raging users is great for changing that.
Only if awareness of the good use of P2P rises will the ISPs change their policy.

+3 votes     reply to comment
Protektor
Protektor Dec 15 2011, 4:56pm replied:

That is not correct. ISPs do not block P2P technology; if they did, it would break World of Warcraft updates and several other games that use P2P technology to deliver updates to massive numbers of people.

+1 vote     reply to comment
Herr_Alien
Herr_Alien Jul 21 2011, 4:35am says:

I try not to.
For this one app that I worked on, we use HTTP for a couple of things. Now, even though HTTP is a simple protocol, I didn't want to bother implementing it. So, for this app, we went for libcurl, a handful of lines of C code and we were done.
Free time is such a precious commodity for me that I MUST reuse as much as possible. While I'd love to implement everything in my projects (even just for sake of learning), in all cases the shortened development time from re-using external libraries is worth it.

I mean, when Marconi invented the radio, he used already existing components: antennas from Popov, metal powder detectors from Branly and so on. That doesn't make him a lesser man, but rather somebody who had a broader view than the others.

+4 votes     reply to comment
Metalspy
Metalspy Jul 21 2011, 8:36am says:

When we're talking about code, yes I do 'reinvent the wheel'. I'm currently still in the first learning phase so to speak, and I believe that I will learn a lot more if I analyze existing programs/parts of programs and try to write them myself. That way I know exactly what I'm working towards, while also gaining experience with how to structure my programs.

Of course I don't plan on doing this my whole (hopefully in the future professional) coding career long, because that would be a waste of time. But for learning purposes it's in my opinion a very valuable thing to do.

+4 votes     reply to comment
Dragonlord
Dragonlord Jul 21 2011, 9:01am says:

I'm used to reinventing the wheel all the time. It's often the only way to get what you need ;)

+3 votes     reply to comment
jjawinte
jjawinte Jul 22 2011, 10:43am replied:

Excellent point !

+1 vote     reply to comment
TheHappyFriar
TheHappyFriar Jul 22 2011, 11:25am replied:

Reinvent or design a different one? :) Doing something already done from scratch is a great way to learn how it works though. Knowing why it's doing what it's doing allows you to figure out what's wrong easier.

+2 votes     reply to comment
Dragonlord
Dragonlord Jul 22 2011, 1:02pm replied:

Mostly doing what is not done "like that" so far. By re-inventing I mean doing it again to fix a shortcoming or problem of the original design, like what I mentioned in my last news post. Otherwise you could go all the way down to programming languages, but I even wrote one of my own... rofl.

+2 votes     reply to comment
altercuca
altercuca Jul 21 2011, 9:10am says:

Awesome article Mark, thanks for sharing the experience.

+2 votes     reply to comment
TheHappyFriar
TheHappyFriar Jul 21 2011, 4:56pm says:

I've been using FTP since I first LAN'ed. Easiest way to transfer files in the pre-DVDRW/USB drive days: just set up a server on machine X & log in via Windows ftp on machine Y. Piece. Of. Pie.

I've also set up some simple backup routines using FTP. Works wonders!

+2 votes     reply to comment
Katana_
Katana_ Jul 22 2011, 11:53am replied:

You're missing out. Take a look at rsync and scp - much, MUCH more secure solutions and just as effective, if not more.

+2 votes     reply to comment
Dragonlord
Dragonlord Jul 22 2011, 1:06pm replied:

But also more difficult to implement fully. FTP is a rather quick and dirty solution.

+2 votes     reply to comment
BluishGreenPro
BluishGreenPro Jul 21 2011, 10:04pm says:

I only reinvent "my own wheel" because I am still learning so much. I've scrapped the AI for one enemy in my game about 5 times now, and I might even do it again!
It really depends on the context... if there is an established system that works really well, then I guess it is best to go with that, but I really like being able to take credit for everything, which I cannot do if I'm building off of or using someone else's code.

+3 votes     reply to comment
Ark_
Ark_ Jul 22 2011, 10:31am says:

(meant as a reply to BluishGreenPro):
If you're using OpenGL or DirectX, or well, even just a compiler, you can't take credit for everything :P
Just a matter of where you draw the line.

+2 votes     reply to comment
TheHappyFriar
TheHappyFriar Jul 22 2011, 11:23am replied:

Ever see a company say "Thanks to company XXXXX who made the great compiler!"?

Nope, I never have either. :p (if you hit "reply to comment" for that person it will start a tree off their response).

+2 votes     reply to comment
Ark_
Ark_ Jul 22 2011, 1:05pm replied:

Oh I know about the reply button, I just forgot to hit it that time :)

That's what I meant by drawing the line. Pretty much everything is based on something else; it's just how far down you are willing to acknowledge.

But I'm pretty sure if one of the people who worked on the compiler saw a company say that, it would make them feel happy.

+2 votes     reply to comment
Kissaki
Kissaki Jul 24 2011, 9:31am says:

Can FTP do in-file-parts as well?

+2 votes     reply to comment
lodle
lodle Jul 26 2011, 10:26am replied:

extended it to do so

+1 vote     reply to comment
JSHuiting
JSHuiting Jul 24 2011, 11:38am says:

The problem with a P2P system is that yes, it can be reliable, as long as there are people seeding the file that you want to download. If not, then no download for you. Also, many times people have added damaging files to their downloaded torrents, which were then downloaded again by other people. So yes, if you want viruses and excessively slow downloads, you should go for P2P.

@ Kissaki.

+2 votes     reply to comment
q68txy
q68txy Sep 20 2011, 12:48am says:

1) If you used BitTorrent you would provide an HTTP seed to support it. This alleviates all the unreliability claims.

2) If you're using ftp you're not reinventing the transfer but you are reinventing delta coding when you want reliable transfer of large files and updates that don't involve re-downloading the whole file.

3) HTTP is a good choice for serving files on the web especially when combined with bittorrent or something like zsync dot moria dot org dot uk

Bittorrent also provides a guarantee of cryptographic integrity which gives you both security and file validation without any extra work.

So yeah, don't reinvent the wheel, but FTP seems like a really odd choice.

+1 vote     reply to comment
Syllopsium
Syllopsium Dec 18 2011, 8:28pm says:

Custom FTP-like protocol? You are fools. Port 62003 is blocked by default in any half decent firewall. A custom 'FTP-like' protocol doesn't work with proxies specifically designed to handle real FTP - proxies that work with every other FTP server, except yours.

WebDAV, rsync, scp or heck, HTTP/HTTPS with a couple of custom attributes is better than the mess that you've created.

Now I have to muck around trying to figure out how to push this through my OpenBSD firewall, which will probably involve writing custom code. I can expect no help in this, because when I report it to the OpenBSD community they will a) laugh at me a bit and b) rightly call the Desura developers idiots, because you've tried to overload a protocol. It's also not easily possible to debug this without looking up exactly what's custom about the protocol, whereas existing tools could be used if a well-known protocol had been chosen.

FTP is a deeply horrid protocol that was not designed for today's Internet, reacts extremely badly with firewalls and is lacking in features compared to more modern protocols. Anyone looking to use it in a new system needs their head examining.

Never re-invent the wheel - absolutely correct. Another valuable maxim is not to create something that looks mostly like a wheel but is actually a hexagon.

Stop wasting my time with something that should just work and use an existing protocol. No, I'm not going to consider using a less secure firewall that works with this broken protocol.

+2 votes     reply to comment
Syllopsium
Syllopsium Dec 18 2011, 8:43pm says:

Further examination of this reveals fixing it is as simple as a new forwarding rule from port 62003 to the ftp proxy.

Nevertheless my points still stand. Use a pre-existing protocol, on a standard port. Your requirements are not unique, and your aim should be to make supporting a not particularly mainstream service easier for the users and sysadmins, rather than letting developers fart around with unnecessary custom protocols.

+1 vote     reply to comment
psyq123
psyq123 Jul 2 2012, 9:13am says:

Yes, the decision about port 62003 is odd. Why not put it on port 21? Most networks I'm in won't let any Desura traffic through.

+2 votes     reply to comment
Saturn
Saturn Apr 29 2013, 7:26am says:

Yep, GREAT, I can't download any of my purchases because of this. I don't have the option to just add a firewall rule to the router.

+2 votes     reply to comment