The Journey to Elixir in Production

(At Scale)

Brandon Richey (@diamondgfx)

Director of Engineering at Teladoc

Who Is Teladoc?

Teladoc is the first and largest provider of telehealth medical visits in the United States.

We have over 12.6 million members

And we've done over 3000 consultations in a single day!

We have offices located in Purchase, NY; New York City, NY; and Lewisville, TX!

And we run Elixir in Production!

What Got Us Interested?

One of our coworkers started talking about this young language that boasted impressive benchmarks!

And even better, its syntax was very similar to Ruby's, so it was very easy to pick up, while also having a very smart design and feel to it!

At the time, we were investigating Rust and Go as possible ways to supplement our platform and bolster our performance.

At Teladoc, we don't design for our current size, we try to design for 10x or 100x growth.

And What Kept Us Interested?

Phoenix!

Now we had a great (and familiar) web framework to build out parts of our application stack!

And even better: it was insanely fast by default.

It took care of the harder parts of designing resilient and massively concurrent systems and allowed us to focus on building our product instead!

And Our Reactions to Our First Prototype?

Wow. Just wow.

The next question we had to answer about this amazing new technology?

How can we get this into production?

As luck would have it, the next project would require something just like this. But we still had to solve a different problem about this great new technology:

How can we introduce it at Teladoc?

Introducing Elixir to the Workplace

My enthusiasm for Elixir and Phoenix was palpable at this point.

I was already annoying co-workers with my tales of massive scaling and sub-millisecond response times.

I wanted to do more with the language but couldn't very well change the entire tech stack in a day!

So How Do You Get a New Tech Into Your Stack?

To get support for a technology, you can't just tell tales of how great it is.

You have to do two things: demonstrate and educate.

I started off with a small demonstration, knowing that education would come as a free side effect. (The good kind of side effects!)

Our First Demo

Since we were a Rails shop and many of the developers had experience with Rails dating back to the beta days, I decided to start off with the scaffold blog demo.

This was a great way to showcase to other developers the syntax of Elixir and some of the language's features.

Even better, it dispelled some immediate fears and reactions from people who had never heard of the technology: statements like "it will be too hard to get started in."

Everyone's Favorite Features?

So we did our demo, and everyone really enjoyed it! For some people it was a refreshing change to check out something new and get their brains working on a new set of problems. But what did people like the most?

  • The Pipe Operator
  • Pattern Matching
  • Guard Clauses
  • Seeing response times measured in microseconds instead of milliseconds!
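For anyone who hasn't seen the first three of these, here is a minimal sketch in plain Elixir. The module and function names (`Demo.Visits`, `classify/1`, `summarize/1`) and the sample data are hypothetical, made up for illustration; they are not taken from the demo or from any Teladoc code.

```elixir
defmodule Demo.Visits do
  # Pattern matching selects a function head by the shape of the argument;
  # the guard clauses (the `when` parts) narrow the match further.
  def classify({:ok, minutes}) when is_integer(minutes) and minutes > 30, do: :long
  def classify({:ok, minutes}) when is_integer(minutes) and minutes >= 0, do: :short
  def classify({:error, _reason}), do: :failed

  # The pipe operator passes the result of each step as the first
  # argument of the next call.
  def summarize(results) do
    results
    |> Enum.map(&classify/1)
    |> Enum.frequencies()
  end
end

# Hypothetical input: three successful results and one failure.
summary = Demo.Visits.summarize([{:ok, 12}, {:ok, 45}, {:error, :timeout}, {:ok, 5}])
IO.inspect(summary)
```

With this input, `summary` is a map counting two `:short` results, one `:long`, and one `:failed`. An argument that matches none of the heads (say `{:ok, -1}`) raises a `FunctionClauseError` rather than quietly falling through.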

What About Immutability?

I expected immutability to be the biggest stumbling block for most of our development team.

Turns out, since most of us had been supporting various legacy Rails code bases, many had grown tired of trying to determine the cause of errors due to insanely-specific edge cases.

The promise of having fewer wibbly-wobbly timey-wimey state mutability bugs to track down was a major winner.
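As a small illustration of what immutability means in practice, here is a sketch using a made-up `visit` map (an assumption for the example, not code from the talk). "Updating" a value returns a new value and leaves the original alone:

```elixir
# Hypothetical data, for illustration only.
visit = %{status: :scheduled, minutes: 0}

# Map.put/3 returns a new map; the map bound to `visit` is left untouched.
completed = Map.put(visit, :status, :completed)

IO.inspect(visit.status)      # prints :scheduled
IO.inspect(completed.status)  # prints :completed
```

Any code still holding `visit` keeps seeing the value it was given, which is why a whole class of "who changed this?" bugs does not come up.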

But Demonstrations Alone Aren't Enough

By this point, I had laid the groundwork for interest among those on our team who had not already heard of Elixir.

But on its own, a neat demo only gets people interested; it doesn't necessarily generate enthusiasm.

In addition, to pull a project into a new technology, you need to persuade management.

Talking to Management

Working with a new language or framework is always very risky. In this case, I strongly felt that the rewards far outweighed the risks!

One of the biggest risks is skill.

Hiring Ruby developers isn't easy, but it's certainly easier than hiring for something with an even smaller following, right? Then we realized we were approaching the problem in the wrong way.

What if, instead, you trained your staff on a new language and made people comfortable supporting it?

Education Fosters Adoption

Elixir and Phoenix are amazing products, but you still need to train developers on them. To get people interested, we needed people to also be educated!

We started a few lunch-and-learn topics on Elixir and Phoenix to start building up even more developer enthusiasm.

As a bonus, with training available for a new technology, you no longer need to hire specifically for that language. Now, we didn't have to worry about not finding developers with the right keywords in their résumés. Instead, we could focus on hiring good developers and training them.

Mixing demonstration with education

Now that the base of training was in place, we also needed to make people comfortable with the development workflow.

We held a 1.5-hour interactive development session: building a chat service with Phoenix.

The session went from nothing to a fully working chat service in Phoenix, using only the base tools provided by Phoenix; no extra magic. And it was GREAT!

The Result?

We succeeded in two immediately apparent areas:

We demonstrated to the development team that building a new Phoenix project was simple and familiar, but added a ton of extra benefits.

And we demonstrated to management that building something in Phoenix was just as productive as building something in Rails.

One Hidden Benefit

For some developers, this was their first introduction to an entirely different methodology of web application design: real-time instead of request/response.

The First Project

Eventually, it came time to build out our first project in Elixir.

I had just returned from ElixirConf 2015 and was itching to build something in it, but I knew I needed to find the right project.

And within weeks, we were approached about improving an internal monitoring solution.

The Requirements

The requirements for the project were clear:

The system had to be able to capture information from some of our largest applications.

The system had to be insanely resilient and error-free.

The system had to not introduce new errors or needless complexity into our existing infrastructure.

And it had to be built on a very short deadline.

And how did it go?

Beautifully.

Even to this day, we have had no problems that were specific to the Elixir/Phoenix code base.

Every issue we've had so far (and there have been few at that) has been related to hardware, network, or the Ruby code supporting communication to the Elixir app.

Integrating An Elixir App

So now we had our Elixir application all built and ready to go! But our existing tech stack and infrastructure were all focused around Ruby and not much else!

We needed to figure out how to deploy this new Elixir application into our existing ecosystem.

And given the goals of our application design, we needed to understand the gotchas of a totally new base technology.

Some of which we found out as we went along.

Initial Challenges

Figuring out Exrm/Conform

For example, one gotcha: all of our development machines are MacBook Pros, but we'd be deploying to a CentOS box.

We couldn't build a release on a dev box and push it up.

We didn't have Elixir and its requirements on any of our production boxes.

First Pass At A Solution

We created a release VM that had all of the necessary packages and requirements.

Releases would be built on that VM and then pushed to a deployment box instead.

The end result was inelegant and manual (but working).

Other Challenges

Did you know that a default Exrm release will create a maximum of five rotating log files? And that they will cap out at 100KB per file?

I didn't!

But I certainly understood it after checking the log files on the first day!

Other Challenges

We use New Relic for monitoring! Sadly, it doesn't support Elixir, so APM was out.

We also needed to implement error notifications, but we used an older version of an exception reporting tool that wasn't supported in any of the Elixir libraries.

We also faced one more interesting challenge...

Our Phoenix app could take in more and more connections without breaking a sweat, so much so that our other apps couldn't break it (to the point that we thought connections were being dropped).

Sometimes, you build a product expecting some amount of failure and expect to iterate and improve it, ironing out those mistakes bit by bit.

With the Elixir side of this application, I didn't have to!

Current Snapshot of Performance

Average CPU Utilization: 0.05%-0.10%

Average RAM usage: 500MB-600MB

This level of performance is what we see during peak times.

This application takes in data sent from multiple load-balanced Rails servers and inserts it into a database.

At any given time this application is servicing requests from nearly 20 different applications all feeding directly in!

And it doesn't even break a sweat.

Other Results

We've actually discussed scaling back the VM because we're not even using a small fraction of its resources.

There is nothing better than telling infrastructure that your app is using far fewer resources than estimated, and that you'd be okay with reclaiming those resources for other, larger, more memory/CPU-intensive applications!

Where Is Teladoc with Elixir Now?

The team? There's practically a waiting list of people who want to work on Elixir projects.

When presented with the opportunity to learn something new or build on a new platform, a lot of developers will jump at it.

And when people see how Elixir changes and improves their development workflows, it further pushes them towards adoption!

The management? The excitement about developing true realtime applications and the massive scaling potential have made Elixir a very viable option.

Plus, all of the benefits that come with an immutable functional language make long-term support and maintenance a very easy question to answer, something that is not always a given in some other technologies.

The future? We have at least three projects in the wings that are going to be Elixir-based, and options for plenty of other applications!

What does the future hold?

Teladoc is only going to keep growing from here, and we are not only excited about this prospect, but we're ready for it.

And as we keep growing, our performance needs are also only going to keep growing.

And that just means we're going to need more Elixir!

Any questions?

Empire City Elixir Conference Talk: Teladoc's Journey to Elixir in Production (At Scale)

By Brandon Richey
