Lookery Dev Blog

"Stream of geeky consciousness..." @electromute


Authors
@ckelly, @danmil, @dcancel, @eliast

Fri, Jan 9th

The Absolute Most Useful Thing I’ve Done At My Job So Far

I think the most satisfying thing about being a programmer is when you manage to take something which is difficult, or boring, or stressful, and you get the computer to do it for you. I don’t know about you, but for me, well, pretty much nothing makes me happier. (Side note: I think this is why I adore test-driven development. It lets me be lazy, without having that fear that someone later on will say “Um, Dan, why does your program make horribly obvious and embarrassing error X?” I don’t like having to think about that stuff all the time. With tests, I don’t have to.)

Which brings me to today’s post. I spend a good bit of time hovering over a bunch of Hadoop Map/Reduce jobs. They form a directed acyclic graph (natch), and we run them continually, on various repeating intervals (every hour, every four hours, etc.).

Elias did the original setup, and when I came on, I spent a while working on the code which launches each job: checking to see if its antecedents are ready, getting the output into the proper place in the HDFS tree, etc.
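To make the "are its antecedents ready?" check concrete, here is a minimal sketch of how such a test could look. This is not the real scheduler code: the `JOBS` table, the job names other than audience_metrics, the helper names, and the rule that output counts as ready while it is younger than one run interval are all assumptions for illustration. In practice the timestamps would come from HDFS; here they are simply passed in as a dict.

```python
from datetime import datetime, timedelta

# Hypothetical job table: name -> (run interval, upstream jobs).
JOBS = {
    "raw_logs": (timedelta(hours=1), []),
    "sessions": (timedelta(hours=1), ["raw_logs"]),
    "audience_metrics": (timedelta(hours=4), ["sessions"]),
}

def is_fresh(job, last_output, now):
    """Assumed rule: output is fresh if it is no older than one run interval."""
    interval, _ = JOBS[job]
    finished = last_output.get(job)
    return finished is not None and now - finished <= interval

def ready_to_launch(job, last_output, now):
    """Launch only when the job is due and every antecedent is fresh."""
    _, upstream = JOBS[job]
    due = not is_fresh(job, last_output, now)
    return due and all(is_fresh(dep, last_output, now) for dep in upstream)
```

Here `last_output` maps each job name to the time its latest output landed, so a job with no entry is treated as never having run.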

My single best day was when I realized that, with just a couple of easy Python functions, I could get that code to create an image of the dependency graph (via the magic of Graphviz). And then, by checking to see how old the data in HDFS was, I could find late jobs and color them yellow or red as appropriate. Here’s an example from earlier today:

[Image: Scheduler Graphs]

I can’t tell you how happy this makes me. One of our jobs (“audience_metrics”) had, in fact, died because of some odd strptime formatting issue (I found that in the log, and knew exactly where to look). Without the graph, I’m honestly not sure when I would have noticed this. Days, at least. Likely it would have happened when one of our publishers asked us, “Hey, how come my visitor counts haven’t updated in forever?” Which is the kind of thing I hate.
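For anyone who wants to try the same trick, here is a rough sketch of the idea: emit Graphviz DOT text for the dependency graph and fill each node according to how stale its output is. It is an illustration, not the actual code. The `JOBS` table, the job names other than audience_metrics, the function names, and the thresholds (yellow once output is more than one interval old, red past two intervals or when there is no output at all) are all assumptions.

```python
from datetime import datetime, timedelta

# Hypothetical dependency table: name -> (expected interval, upstream jobs).
JOBS = {
    "raw_logs": (timedelta(hours=1), []),
    "sessions": (timedelta(hours=1), ["raw_logs"]),
    "audience_metrics": (timedelta(hours=4), ["sessions"]),
}

def lateness_color(age, interval):
    """Assumed thresholds: red past two intervals (or no output), yellow past one."""
    if age is None or age > 2 * interval:
        return "red"
    if age > interval:
        return "yellow"
    return "white"

def to_dot(jobs, last_output, now):
    """Build DOT source with one filled node per job and one edge per dependency."""
    lines = ["digraph scheduler {", "  node [style=filled];"]
    for name, (interval, upstream) in sorted(jobs.items()):
        finished = last_output.get(name)
        age = None if finished is None else now - finished
        lines.append('  "%s" [fillcolor=%s];' % (name, lateness_color(age, interval)))
        for dep in upstream:
            lines.append('  "%s" -> "%s";' % (dep, name))
    lines.append("}")
    return "\n".join(lines)
```

Writing the returned string to a file such as scheduler.dot and running `dot -Tpng scheduler.dot -o scheduler.png` would render it as an image.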

Anyways, thank you, Graphviz. And yay for monitoring graphs.

-Dan M
