Principles of a distributed system

written on Thursday, May 16, 2013

As what we are building at PythonAnywhere becomes more complex, the interactions between systems also start to multiply.

The dependency graph starts to look tangled, and working out which components depend on each other during the deployment process can be painful and error-prone.
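To make the deployment-ordering problem concrete, here is a minimal sketch that derives a safe start-up order from a dependency graph using Python's standard `graphlib` module (Python 3.9+). The component names and their dependencies are hypothetical, made up for illustration; they are not a description of the real PythonAnywhere setup.

```python
from graphlib import TopologicalSorter

# Hypothetical components: each key maps to the set of components
# that must already be running before it can be deployed.
dependencies = {
    "web": {"database", "cache"},
    "worker": {"database", "queue"},
    "cache": set(),
    "queue": set(),
    "database": set(),
}


def deployment_order(deps):
    """Return the components in an order where dependencies come first.

    Raises graphlib.CycleError if the graph contains a circular dependency.
    """
    return list(TopologicalSorter(deps).static_order())


order = deployment_order(dependencies)
print(order)
```

Something like this only works while the whole graph is known up front and kept up to date by hand, which is exactly the part that becomes painful as the system grows.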

So I have been thinking a lot about what a perfect, fault-tolerant, distributed system might look like.

Most of these problems are generally solved by operating your own DNS service, so that machines are addressed by name rather than by hard-coded IP addresses and you can load balance. But this is still not as elegant as the idea of a complex service that self-assembles out of simple components, especially if you can also scale just by bringing up new servers that provide whichever service is under the heaviest load.
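As a rough illustration of the self-assembling idea, the sketch below shows an in-memory registry where instances announce themselves under a service name and clients look up a live instance by name instead of by hard-coded address. Everything here is an assumption for the sake of the example: the `ServiceRegistry` class, its `ttl` parameter, the round-robin selection and the addresses are all hypothetical, and a real system would need the registry itself to be distributed and fault tolerant.

```python
import time


class ServiceRegistry:
    """Toy registry: instances register by service name and expire
    if they do not send a heartbeat within `ttl` seconds."""

    def __init__(self, ttl=30.0, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._instances = {}  # service name -> {address: last heartbeat time}
        self._counters = {}   # service name -> number of lookups so far

    def register(self, service, address):
        """Announce (or re-announce) an instance of a service."""
        self._instances.setdefault(service, {})[address] = self._clock()

    # A heartbeat is just a repeated registration that refreshes the timestamp.
    heartbeat = register

    def lookup(self, service):
        """Return the address of a live instance, rotating round-robin."""
        now = self._clock()
        live = sorted(
            address
            for address, seen in self._instances.get(service, {}).items()
            if now - seen <= self._ttl
        )
        if not live:
            raise LookupError(f"no live instances of {service!r}")
        count = self._counters.get(service, 0)
        self._counters[service] = count + 1
        return live[count % len(live)]


if __name__ == "__main__":
    registry = ServiceRegistry(ttl=30.0)
    registry.register("web", "10.0.0.1:80")  # hypothetical addresses
    registry.register("web", "10.0.0.2:80")
    print(registry.lookup("web"))
    print(registry.lookup("web"))
```

With this shape, scaling the service under heaviest load would just mean starting another server that calls `register`, and a dead server drops out on its own once its heartbeats stop.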

Maybe it's worthwhile investigating UPnP or Zeroconf, though neither really assists with service advertisement out of the box.

This entry was tagged design, fault tolerance and self healing