Erlang hot code loading not widely used?

I just saw this 2012 video from LinuxConf.au about Erlang in production.

There is a part of the video where Bernard says that no big Erlang projects use hot code loading apart from Ericsson, because it is very difficult to guarantee that everything will work. It's at around minute 29.

Is this true? Are there tools now that help you test hot code loading or make it easier?

+3




2 answers


This is not true. Every Erlang user takes advantage of hot code loading in one way or another - be it for development, testing, troubleshooting, one-off fixes, or full-scale deployments. It is one of the main advantages of Erlang, and rather unique to it.

For example, WhatsApp, one of the largest users of Erlang, relies on hot code loading for almost all code pushes.



I've personally worked with hot code loading in settings where every change was well understood, and the loading was often done by the same person who made the change. It works really well, and good engineers have no problem with it. Speaking of tools, loading modules one by one from the Erlang shell with l(...) or all at once with l() (see here) works fine. Some people prefer release-based tools like relx.
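For readers who have not tried it, reloading from the shell looks roughly like this (the module name `counter` is made up for illustration):

```erlang
%% Shell sketch: reload a module that has been edited on disk.
1> c(counter).               %% recompile and load the new version
{ok,counter}
2> l(counter).               %% or load an already-compiled .beam
{module,counter}
3> code:soft_purge(counter). %% drop the old version, but only if no process still runs it
true
```

Processes calling `counter` with fully qualified calls pick up the new code on their next call, without being restarted.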

Others, such as Ericsson, use enterprise-style deployments with hot code loading, after rigorous testing of well-defined releases and patches. The goal here is to upgrade in place, without needing spare capacity or special procedures for draining and shifting load. Operationally this can be simpler and more efficient than restarting, but the testing can be more expensive.
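Release-based upgrades of this kind are driven by OTP appup files, which list the instructions to go from one version to another. A hypothetical sketch (application, module, and version names are made up):

```erlang
%% myapp.appup - hypothetical instructions for upgrading 1.0 -> 1.1
%% and downgrading 1.1 -> 1.0.
{"1.1",
 [{"1.0", [{load_module, my_worker},
           {update, my_server, {advanced, []}}]}],  %% {advanced, _} triggers code_change/3
 [{"1.0", [{load_module, my_worker},
           {update, my_server, {advanced, []}}]}]}.
```

Tools like systools and release_handler turn such files into relup scripts and apply them to a running node; this is the mechanism behind the rigorously tested releases mentioned above.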

+11




It is difficult to know whether this is widely or rarely used. There are many Erlang systems out there today. However, I can think of reasons both for and against using it, since I have been working with both kinds of setup for quite some time.

In favor of using:

  • Obviously, it is useful for a fast feedback loop during development. I always develop with an open shell and with functions that load code automatically as soon as it compiles.
  • In the rare case that you need to deploy a monolithic application with high-availability requirements, it is basically the only option.

The main reason not to use it, as the presentation says, is that it is difficult, even once you understand exactly how it works (and that is not the hardest part).

This, in my opinion, is not just a tooling problem. Rather, you take on a lot of extra complexity because your code is now part of the mutable system state. You are basically running a long-lived system whose behavior changes over time, so on top of the problems you already have, you add these:



  • You are no longer sure that restarting the system will not change its behavior in some fundamental way. Because of this, you probably have to make sure that whatever code you load is also written to disk.
  • Changing the way your modules interact (e.g. by loading new code) is very difficult unless: a) you never break compatibility, b) you somehow define the order in which modules should change, or c) you accept that the worst that can happen is a few crashes due to undefined functions, function or case clause errors, etc., and hope for the best (the real worst case is when new and old modules interact in unexpected ways before you've finished loading all the new ones, and you end up running logic that should be impossible).
  • You will almost certainly end up killing some process that is still running old code when new code is loaded at some point. Maybe your supervisors will restart it, maybe not. Either way, this can be very confusing and difficult to debug.
  • As the presentation says, it is very difficult to verify (if not impossible).
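The process-killing issue above comes from how the BEAM handles code versions: at most two versions of a module can be loaded, and a process still executing the old one is killed when a third load purges it. Only fully qualified calls jump to the new version, as in this minimal sketch (the module name `ticker` is made up):

```erlang
-module(ticker).
-export([start/0, loop/1]).

start() -> spawn(ticker, loop, [0]).

loop(N) ->
    receive
        tick -> ticker:loop(N + 1)  %% fully qualified call: switches to new code when loaded
    after 1000 ->
        ticker:loop(N)              %% a plain local call here would pin the process to old code
    end.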

Etc.

On top of that, you end up with a long-running server holding long-lived state, which is far from ideal.

So my advice is always that if you can get away with a distributed application and rolling updates, you should. This option is much easier to manage and, in my experience, works better overall.

+1








