Replacing Cron & Building Scalable Data Pipelines at Airbnb

About Harrison Shoff

Understanding and analyzing user behavior is crucial for us at Airbnb. Our analytics team depends on complex data pipelines, so we have a Data Infrastructure team that builds the systems and tools to support them. In the past, we used Cron to manage these workflows, but we quickly realized that sleep statements are not enough to manage complex dependency hierarchies.

In this talk we’ll discuss how we build data pipelines. We use Chronos, a tool we built in-house: a distributed system for scheduling data pipelines that depends on Mesos. It lets us build data pipelines that are easier to manage and debug.

Internal tools are just as important as user-facing products. We care about how a tool works and how it feels, which is why Chronos doesn’t stop at the command line. We built a web interface on top of Chronos that abstracts away the complexity of building and managing distributed data pipelines. We’ll talk about how design, front-end, and back-end engineering come together to build products like this, and share our experience building it as a Scala project on top of Mesos using Backbone and Dropwizard.
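
To make the dependency problem concrete, here is a minimal sketch of ordering jobs by their declared parents instead of spacing them out with sleep statements. It is an illustration only, not Chronos code or its API: the job names and the `execution_order` helper are hypothetical, and the ordering relies on Python's standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter, CycleError


def execution_order(dependencies):
    """Return job names ordered so that every job comes after its parents.

    `dependencies` maps each job name to the set of jobs it waits on.
    Raises graphlib.CycleError if the jobs depend on each other in a loop.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical pipeline: two extracts feed a join, which feeds a report.
pipeline = {
    "extract_events": set(),
    "extract_listings": set(),
    "join_sessions": {"extract_events", "extract_listings"},
    "daily_report": {"join_sessions"},
}

order = execution_order(pipeline)
print(order)
```

With explicit parents, a job becomes eligible as soon as its inputs have finished, however long they took, rather than after a guessed delay. A circular dependency is also reported up front as a `CycleError` instead of surfacing as a job that silently reads stale data.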

