Lookery Dev Blog

"Stream of geeky consciousness..." @electromute


Authors
@ckelly @danmil
@dcancel @eliast


Fri, Jan 16th

OTP Release Mgmt — Worth the Complexity?

Sort of an open question — anyone out there use the full, somewhat forbiddingly complex system for Erlang OTP releases?  I finally think I at least understand it (thanks in no small part to Mitchell Hashimoto), but now that I do, I’m really not sure it’s worth it.

It seems like you put in place a bunch of nigglingly specific little config files, and then write a bunch of detailed specs for how code changes from one version to another.  And that lets you do some nice upgrade/downgrade stuff (via further nigglingly specific “appup” files).

However, for our case, I think it maybe boils down to, either:

  1. The code change is very simple, and I can make it happen via the `code:load_file()` and `code:purge()` functions without restarting the server.
  2. The code change is complex, and I’ll just restart the server outright.

I mean, sure, no nine 9’s of uptime then, but I’m just not sure we need that (and if we do, we might be able to get it via some haproxy games instead).

The OTP system seems to me to take over a bit of what the OS does in Unix-land.  In some ways, that’s good (the os-process-like Erlang processes are much better than shared-memory threads), but in other ways, now that I’m trying to manage a deployed server, it’s just, some things are a bit odd.

Anyone have any experience to share?

-Dan M

Wed, Jan 14th

Very Nice OTP Examples/Tutorial

Meant to add this to the last post.  One Mitchell Hashimoto has written up a very nice series of tutorials on how to use the various bits of OTP (gen_server, gen_fsm, etc), which is a very, very nice addition to the dry, example-free OTP design principles.

His site is:

http://spawnlink.com/

And the series of tutorials starts with this one:

http://spawnlink.com/articles/introduction-to-the-open-telecom-platform/

Very much worth checking out…

-Dan M


Back in Erlang Today…

…and, I have to say, for writing async/parallel code, it’s just awesome.  I sometimes think I’m drinking the shiny new technology-flavored kool-aid when I talk about Erlang (which kool-aid I desperately try to avoid, personally), but then someone gives me a task like:

Write a chunk of code to load balance HTTP requests between two identical cache servers, failing over to one if the other goes down.

And as I write it in Erlang, it’s just such a perfect fit.

Oh, and for any regular readers of this blog, I actually caught myself today saying “Hey, records are working really well right here.”  That’s right, records.  I know.  After all the abuse I’ve heaped on them.  Built-in dicts are still very much missing from the language, don’t get me wrong, but records work pretty well for holding state inside gen_server loops.

-Dan M

Fri, Jan 9th

The Absolute Most Useful Thing I’ve Done At My Job So Far

I think the most satisfying thing about being a programmer is when you manage to take something which is difficult, or boring, or stressful, and you get the computer to do it for you.  I don’t know about you, but for me, well, pretty much nothing makes me happier.  (Side note: I think this is why I adore test-driven development.  It lets me be lazy, without having that fear that someone later on will say “Um, Dan, why does your program make horribly obvious and embarrassing error X?”  I don’t like having to think about that stuff all the time.  With tests, I don’t have to.)

Which brings me to today’s post.  I spend a good bit of time hovering over a bunch of Hadoop Map/Reduce jobs.  They form a directed, acyclic graph (natch), and we run them continually, on various repeating intervals (every hour, every four hours, etc).

Elias did the original setup, and when I came on, I spent a while working on the code which launches each job: checking to see if its antecedents are ready, getting the output into the proper place in the HDFS tree, etc.

My single best day was when I realized that, with just a couple of easy Python functions, I could get that code to create an image of the dependency graph (via the magic of graphviz).  And then, by checking to see how old the data in HDFS was, I could find late jobs, and color them yellow/red appropriately.  Here’s an example from earlier today:

[Image: Scheduler Graphs]

I can’t tell you how happy this makes me.  One of our jobs (“audience_metrics”) had, in fact, died because of some odd strptime formatting issue (found that in the log, knew exactly where to look).  Without the graph, I’m honestly not sure when I would have noticed this.  Days at least.  Likely it would have happened when one of our publishers asked us “Hey, how come my visitor counts haven’t updated in forever?”  Which is the kind of thing I hate.
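For the curious, here’s roughly what the idea looks like.  This is a minimal sketch, not our actual scheduler code: the job names (other than audience_metrics), the `DEPENDENCIES` and `INTERVALS` tables, and the yellow/red thresholds (one interval late, two intervals late) are all made up for illustration, and the "last updated" times are passed in by hand rather than read from HDFS.  It just builds the DOT text that graphviz knows how to draw.

```python
from datetime import datetime, timedelta

# Hypothetical job graph: each job maps to the jobs it depends on.
DEPENDENCIES = {
    "raw_logs": [],
    "sessions": ["raw_logs"],
    "audience_metrics": ["sessions"],
}

# Hypothetical repeat interval for each job.
INTERVALS = {
    "raw_logs": timedelta(hours=1),
    "sessions": timedelta(hours=1),
    "audience_metrics": timedelta(hours=4),
}


def job_color(age, interval):
    """Pick a fill color from how stale a job's output is.

    The thresholds are an assumption: yellow once the data is older
    than one interval, red once it is older than two.
    """
    if age > 2 * interval:
        return "red"
    if age > interval:
        return "yellow"
    return "white"


def to_dot(dependencies, intervals, last_updated, now):
    """Return DOT source for the dependency graph, colored by data age."""
    lines = ["digraph jobs {", "  node [style=filled];"]
    # One node per job, filled according to how late its output is.
    for job in sorted(dependencies):
        color = job_color(now - last_updated[job], intervals[job])
        lines.append('  "%s" [fillcolor=%s];' % (job, color))
    # One edge from each upstream job to the job that consumes it.
    for job in sorted(dependencies):
        for upstream in sorted(dependencies[job]):
            lines.append('  "%s" -> "%s";' % (upstream, job))
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    now = datetime.now()
    # In real life these timestamps would come from checking the data in HDFS.
    last_updated = {
        "raw_logs": now - timedelta(minutes=30),
        "sessions": now - timedelta(minutes=90),
        "audience_metrics": now - timedelta(hours=9),
    }
    print(to_dot(DEPENDENCIES, INTERVALS, last_updated, now))
```

Save the output to a file (say, jobs.dot) and run `dot -Tpng jobs.dot -o jobs.png` to get the picture.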

Anyways, thank you graphviz.  And yay for monitoring graphs.

-Dan M


Great Error Message

Tailing some log files, forgot to enter the filename, just entered ‘tail -f’, got this response:

tail: warning: following standard input indefinitely is ineffective

Love it.  Dunno why.  Just love it.

-Dan M
