Dash apps on Amazon ECS: deploy faster, better, easier


Dash is a wonderful visualisation framework for creating interactive, enterprise-level data experiences. It builds on the Plotly data visualisation framework (itself an opinionated wrapper over D3.js). You can create fairly complex dashboards and data exploration apps with relatively little effort. There is hardly any need to mess about in JavaScript or CSS, and the framework is very cogent and Pythonic. The best part? The entire Dash app we build is effectively a Flask app. Consequently, we can Dockerise it and deploy it on Amazon’s Elastic Container Service (ECS) with surprising ease. This guide will take you through that, step by step.

The Dash app we’ll be building

Our Dash app in operation.
An example view of our finished application.

In this tutorial, we’ll mainly be concerned with the deployment of a rather trivial little application that allows you to explore the West African Ebola time series dataset on the UN’s Humanitarian Data Exchange (HDX) site.

The purpose is to display a number of key indicators, such as cumulative case counts, deaths and suspected case counts by country and over time. Since this is a fairly interesting multi-dimensional data set (country, time and case status, i.e. suspected/confirmed/deceased), it’s well suited to give a brief demonstration of what Dash can do – even if it is far from being a particularly sophisticated Dash app. You can find the whole app in this GitHub repo. This is not a tutorial on how to create a Dash app – for that, I suggest you begin here, and make your way through the tutorial. Trust me – it’s a wonderful world of creating beautiful visuals. Rather, this tutorial is about deploying your complete app in a superbly efficient way.

The deployment scheme: Dash is a Flask app that interacts with Gunicorn via WSGI, which then serves content up over HTTP/1.1.
The overall scheme of request processing in our application: clients make calls, gunicorn obtains these via WSGI from Flask and returns them as HTTP responses to the clients.
In more sophisticated implementations, a reverse proxy like Nginx would sit in front of gunicorn. Since gunicorn, as launched here with its default single synchronous worker, handles only one request at a time, this setup is clearly unsuitable for more serious use cases. A reverse proxy would handle load balancing, concurrent request handling, anonymisation, compression and a boatload of other nifty features.

The Dash application – effectively, a Flask app – will communicate with clients using gunicorn, a WSGI HTTP server application. In simpler terms, that means it mediates between HTTP requests and Python method calls. Its main purpose is to serve information exposed by a Python process that implements the WSGI protocol, such as a Flask app. Every time the user, say, selects a different country on the dropdown, an HTTP/1.1 request is sent. gunicorn translates the incoming HTTP/1.1 request to WSGI method calls intelligible to Flask.
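To make the WSGI hand-off a little more concrete, here is a minimal sketch of what a WSGI callable looks like and how a server invokes it. It is not the code of the Ebola app: `application` and `call_like_a_server` are illustrative names, and a real Dash app exposes its Flask server in place of the hand-written function.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A bare WSGI callable: the kind of object a WSGI server calls per request."""
    path = environ.get("PATH_INFO", "/")
    body = f"You requested {path}".encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_like_a_server(app, path):
    """Roughly what a WSGI server does: build environ, call the app, collect the body."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the keys the WSGI spec requires
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(response_headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call_like_a_server(application, "/ebola"))
```

In production, gunicorn plays the role of `call_like_a_server`, building `environ` from the real HTTP request instead of from test defaults.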

Step by step

Our CONOP [1] is basically this:

  • We will first design the Dash app. This has been done for you for the purposes of this exercise (see the repo referred to above), but this works perfectly fine with any other Dash application, as long as there’s a wsgi.py file that calls the main function of your main Flask/Dash app.
  • Then, we Dockerise our Dash app by writing a Dockerfile. This describes the environment in which our code will run, and the command that will start it.
  • We build a Docker image on our computer. If the Dockerfile is the recipe, building the image is baking the cookies.
  • We upload it to Amazon ECR, a Docker container registry.
  • From there, we launch a server using Amazon ECS, where we will deploy our Docker image.
The architectural sketch: our Dash app's Docker image is uploaded to ECR, and from then on deployed onto ECS.
The architectural sketch of our operations: we create a Docker image on our development box, upload it to Amazon ECR and deploy it to Amazon ECS, which will provision a Fargate service. This is what our users will interact with when they interact with the website.

While this all seems complicated, it’s actually quite easy in practice. Let’s get cracking!

Deploying our Dash app on ECS

Step 1: Dockerising the Dash app

The advantage of Docker is that it provides a deterministic environment. We can pre-determine what that environment will look like, and we can rely on it looking like that every single time. The Dockerfile’s job is to ensure this, acting as a recipe of sorts. Let’s build the Dockerfile of our project!

FROM ubuntu:18.10

Every Dockerfile starts with a FROM statement, which tells Docker what image to start from (called the base image) [2]. Docker, to shamelessly steal from Sir Isaac Newton, is all about standing on the shoulders of giants. Instead of reinventing the wheel, we start from a Docker image that’s as close to our needs as possible. In this case, this is a pre-installed Ubuntu 18.10. You can find images to build on at Docker Hub, the central repository of all things public Docker.

LABEL maintainer="Chris von Csefalvay <chris@chrisvoncsefalvay.com>"

This part is optional, but LABELs allow you to attach metadata to your image [3]. MAINTAINER used to be a statement of its own, but has since been deprecated, and labels can now be anything. It is a good idea and common courtesy to have your name and e-mail address included in the Dockerfile. That way, users of your work can get in touch with you.

# Refresh the package index and install in a single layer, so a cached index is never reused
RUN apt-get update && apt-get install -y python3 python3-dev python3-pip

Using apt-get, the package manager Ubuntu uses, we install Python as well as its development headers and pip, the Python package manager.

COPY requirements.txt /tmp/requirements.txt
RUN pip3 install -r /tmp/requirements.txt

The COPY statement copies a file from the local file system, relative to the Dockerfile itself, to the specified destination in the image. The RUN statement then executes the command from that context (i.e. from within the image) and installs all the packages listed in requirements.txt. These are the Python packages our Dash application needs in order to run.

Next, we copy the files in the current folder into the /app folder within the image, and switch the working directory (on the image) to that folder – this contains our Dash app itself:

COPY ./ /app
WORKDIR /app

Finally, we provide the command that launches gunicorn, binds it to 0.0.0.0:80 (to accept connections on port 80 from any network interface), and points it at the WSGI application defined in wsgi.py (the module is given without its .py extension, and gunicorn looks for an object named application in it unless told otherwise):

CMD gunicorn --bind 0.0.0.0:80 wsgi
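The way a bare module name such as `wsgi` turns into a callable can be sketched in a few lines of Python. This is not gunicorn’s own loader, only an illustration of the `module:callable` convention, with the assumption that a missing `:callable` part falls back to the name `application`. The helper name `load_wsgi_app` is made up for this example, and the demonstration uses the standard library’s `wsgiref.simple_server:demo_app` rather than the Dash app.

```python
import importlib


def load_wsgi_app(spec, default_name="application"):
    """Resolve a 'module:callable' string into the object it names.

    Assumption: when the ':callable' part is omitted, the server looks
    for a module-level object called 'application'.
    """
    module_name, _, attr = spec.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attr or default_name)


if __name__ == "__main__":
    # demo_app is a tiny WSGI app that ships with the standard library
    app = load_wsgi_app("wsgiref.simple_server:demo_app")
    print(callable(app))
```

For a Dash project, this implies that wsgi.py would typically bind the Flask server behind the Dash object (its `server` attribute) to a variable named `application`.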

With that, we have constructed the Dockerfile that will create the Docker image we want out of our files (you can check the whole file here). Next, we’ll need to build the image from the Dockerfile.

Step 2: Building the Docker image

The Dockerfile is only the recipe – it’s time to bake the cookies! For this, it is paramount that we navigate in the file system to the directory where the Dockerfile is, because the source paths in the COPY statements (e.g. requirements.txt and the app’s own files) are resolved relative to that directory. If they are correct, you can simply build the image using the following command:

docker build -t ebov-dash .

You might want to get yourself a cup of coffee – this will take a few minutes.
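Before kicking off a build that takes several minutes, it can be worth checking that the build context actually contains what the Dockerfile expects. The following is a small sketch of such a check; the list of file names is an assumption based on this walkthrough, and `missing_build_files` is a hypothetical helper, not part of Docker.

```python
from pathlib import Path

# Assumed file names, taken from the walkthrough; adjust for your project.
REQUIRED_FILES = ("Dockerfile", "requirements.txt", "wsgi.py")


def missing_build_files(context_dir="."):
    """Return the required files that are absent from the build context."""
    root = Path(context_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_build_files(".")
    if missing:
        print("Missing from build context:", ", ".join(missing))
    else:
        print("Build context looks complete; run: docker build -t ebov-dash .")
```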

Building our Dash app's Docker image.
This is what your docker build looks like… in time lapse.

Step 3: Setting up ECR

Navigate first to the Amazon ECR panel, and if this is your first repo, select Get started. You first have to choose a name for the container repository. Your container repository also has a unique URI that derives from your region and your AWS account ID (obscured):

Creating the Amazon ECR repo.
Create the repository by providing a unique repository name – in this case, ebov-dash.

Note: Depending on region, your URI’s region part (in the above example, us-west-2) may be different.
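For orientation, the repository URI follows a predictable pattern, which the sketch below assembles and takes apart. The account ID 123456789012 is a placeholder, and the function names are invented for this example.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Assemble an ECR image URI from its parts."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def ecr_region(uri):
    """Extract the region from an ECR URI of the form
    <account>.dkr.ecr.<region>.amazonaws.com/<repository>[:tag]."""
    host = uri.split("/", 1)[0]
    return host.split(".")[3]


if __name__ == "__main__":
    uri = ecr_image_uri("123456789012", "us-west-2", "ebov-dash")
    print(uri)
    print(ecr_region(uri))
```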

Next, select your newly created ebov-dash repo on your list of repositories, and note that it is looking quite empty. Click on the grey ‘View push commands’ button in the right upper corner to get some useful directions:

Push commands for your repository. Since we’ve already built our image, we can skip Step 2.

Note: Again, depending on region, your URI’s region part (in the above example, us-west-2) in these instructions may be different.

Warning: In some rare cases, especially if you’re using root credentials (which you should NEVER, EVER do), you may encounter an error message at the 4th (push) command saying no basic auth credentials. This may be because you do not have credentials set up – in that case, obtain them in the way described here, then run aws configure to add them. However, if you’re pretty sure you’ve added your AWS credentials and all is well, yet you cannot push to the repository, try the following in lieu of the first command:

eval $(aws ecr get-login --no-include-email --region us-west-2 | sed 's|https://||')

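In some rare cases, the registry URI in the generated login command carries an https:// prefix, which prevents authentication. The above command strips it using sed’s substitution function. Note that get-login exists only in version 1 of the AWS CLI; version 2 replaced it with aws ecr get-login-password, whose output is piped to docker login --password-stdin.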
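If sed’s syntax looks cryptic, the same transformation can be sketched in Python. The login command string below is a made-up sample with a placeholder token and account ID, not real output.

```python
def strip_https(login_command):
    """Remove the first 'https://' from a command string,
    mirroring sed 's|https://||' on a single line."""
    return login_command.replace("https://", "", 1)


if __name__ == "__main__":
    # Illustrative sample only; the token and account ID are placeholders.
    sample = (
        "docker login -u AWS -p TOKEN "
        "https://123456789012.dkr.ecr.us-west-2.amazonaws.com"
    )
    print(strip_https(sample))
```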

Once the Docker client completes the push (another coffee time – this may, depending on your network speed, take anything up to 15-20 minutes!), you should see your wonderful new image in your ECR images:

Pushing to the ECR repo.
That’s the image we just pushed! Our job here is done, though for convenience’s sake, copy the image URI, as you’ll need it in the next step.

Step 4: Deploying to ECS

Amazon ECS is a streamlined, simplified tool to help you run images stored on ECR as containers. ECS supports two ways of deploying images (known as ‘launch types’ in AWS lingo):

  • EC2, in which case AWS spins up a regular EC2 instance for us. This is basically no different from having spun up an EC2 instance, installed Docker and deployed the code. EC2 allows for more granular control and server-level customisation. For this reason, if you are running safety-critical or regulated services, this should be your choice.
  • Fargate, which is a serverless container runner: think of it like Lambda for containers! Fargate limits configuration options to resources (CPU and memory), networking and IAM (Identity and Access Management) policies. In exchange, it is often cheaper for small workloads like ours, and a lot easier to set up.

In this case, we’ll be using a Fargate deployment. This is because we do not need granular server configuration for our Dash app. To get started with ECS, we will need to first define a container, then a task, then a service and finally a cluster. This may be somewhat counterintuitive, so let’s take them step by step.

Container definition

Creating the ECS container definition.
The bare-bones ECS container settings. There’s more under the Advanced Settings tab. Looooots more.
Building our ECS container definition.
Our container definition. Pretty barebones, and leaves much to the imagination (or, rather, later configuration) – especially memory and CPU. Don’t worry – we’ll deal with that soon enough.

A container definition is essentially a description for the underlying system. It describes how the system is supposed to run your container, including its requirements. ECS comes with a few pre-defined container definitions, such as nginx, but for now, click on custom, which will let us configure our very own container to accommodate our Dash app. There’s a stunning volume of advanced settings, from health checks to various other settings, but for the basics, you only need to set two things: a container name – in our case, ebov-dash – and the URI to the image, including the tag (the part after the colon in the Docker image URI) – you can copy this from the ECR screen. In this case, that means we’re leaving settings regarding CPU, memory and any GPUs for later.

Note: This may not always be your best choice. Occasionally, you want to force a memory-to-compute ratio. You might try to pretend you are picking an EC2 instance – would you pick a memory- or compute-heavy instance?
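Behind the console form, a container definition is just a small structured document. The sketch below builds one as a Python dictionary and prints it as JSON; the key names (`name`, `image`, `essential`, `portMappings`) follow the ECS task definition format as I understand it, the account ID is a placeholder, and `container_definition` is a hypothetical helper.

```python
import json


def container_definition(name, image, container_port=80):
    """Build a bare-bones container definition: a name, an image URI,
    and the port the containerised app listens on."""
    return {
        "name": name,
        "image": image,
        "essential": True,
        "portMappings": [{"containerPort": container_port, "protocol": "tcp"}],
    }


if __name__ == "__main__":
    definition = container_definition(
        "ebov-dash",
        "123456789012.dkr.ecr.us-west-2.amazonaws.com/ebov-dash:latest",
    )
    print(json.dumps(definition, indent=2))
```

The port is 80 here because that is what gunicorn binds to in the Dockerfile above.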

Once we have completed our container configuration, it’s time to move on to the task definition.

Task definition

Building our ECS task definition.
The task definition leaves relatively few options – in this case, only task memory and vCPU units.

A task definition is like a box your container needs to fit in. If your container were an actual physical container, the task definition would specify its size and dimensions. To fit on a regular container cargo ship, for instance, a container has to be a regulation (ISO) 20ft container. This is the role the task definition fulfils. In this case, we only need to set the task memory and vCPU units. ECS expresses CPU in units of 1/1024th of a vCPU and task memory in megabytes, although on Fargate only certain combinations of the two are accepted.
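To get a feel for the units, here is a rough sketch that converts CPU units to vCPUs and checks a CPU/memory pair against a table of Fargate sizes. The table is an assumption based on the sizes AWS has published for Fargate; larger sizes may have been added since, so check the current documentation before relying on it.

```python
# CPU units -> allowed task memory values in MiB.
# Assumed from AWS's published Fargate size table; verify before use.
FARGATE_SIZES = {
    256: (512, 1024, 2048),
    512: tuple(range(1024, 4096 + 1, 1024)),
    1024: tuple(range(2048, 8192 + 1, 1024)),
    2048: tuple(range(4096, 16384 + 1, 1024)),
    4096: tuple(range(8192, 30720 + 1, 1024)),
}


def vcpus(cpu_units):
    """Convert ECS CPU units to vCPUs (1024 units make one vCPU)."""
    return cpu_units / 1024


def is_valid_fargate_size(cpu_units, memory_mib):
    """Check a CPU/memory pair against the assumed size table."""
    return memory_mib in FARGATE_SIZES.get(cpu_units, ())


if __name__ == "__main__":
    print(vcpus(256))
    print(is_valid_fargate_size(256, 512))
    print(is_valid_fargate_size(256, 4096))
```

For a small dashboard like ours, the smallest pairing is a sensible starting point; it can be raised in a later task definition revision.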

Service definition

Now, we just need to define the ECS service definition.
Configure how many instances of your service will be run, what IPs will be admitted to it and, optionally, Elastic Load Balancing.

Next, we need to define our service. This determines the number of tasks that ECS runs concurrently. Among other things, it also allows rapid IP restriction to a single CIDR range. This is useful if you want to keep your development product limited to client or in-house access. Note, however, that AWS also creates a security group by default, where you can create more granular settings. Finally, if your networking was set to Bridge rather than awsvpc, you can directly attach a load balancer here (otherwise, you would have to do that separately, following the instructions here).
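If CIDR notation is unfamiliar, the standard library makes it easy to check which addresses a given range would admit. This is only a sketch of the arithmetic, not of how the security group itself is evaluated; the addresses come from ranges reserved for documentation, and `is_admitted` is an illustrative name.

```python
import ipaddress


def is_admitted(client_ip, allowed_cidr):
    """Return True if client_ip falls inside the allowed CIDR range."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(allowed_cidr)


if __name__ == "__main__":
    office_range = "203.0.113.0/24"  # placeholder range
    print(is_admitted("203.0.113.7", office_range))
    print(is_admitted("198.51.100.1", office_range))
    print(is_admitted("198.51.100.1", "0.0.0.0/0"))  # the open-to-all range
```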

Cluster definition

Finally, all we need to do is to define a cluster. AWS takes care of most of the cluster management. Therefore, all we need to do is to name our cluster, and press Next.

The ECS cluster definition is pretty straightforward at this point.
Cluster definition: quite literally a piece of cake.

Deployment time!

Once we have confirmed everything on the confirmation screen, it’s go time. AWS will present us with a live overview of the creation process:

Deploying our ECS image.
A third of the way there…

This is another good opportunity for coffee, as this might take a few minutes.

Confirming Dash app deployment

Once deployment is complete, we can look at our ECS cluster. Opening the Tasks tab, we can look at the only running task by clicking on its identifier. This will then show the currently assigned public IP. And hey presto, there’s our application!

Our ECS task is running!
Selecting the task shows the current network settings, including the public IP. Yours will, obviously, be different. Note this is an automatically assigned public IP rather than a static one, and it can change whenever the task is replaced. If you want your app to have a static IP, you will have to attach an Elastic IP to the Elastic Network Interface (ENI) whose ID is displayed above.

Updating your Dash deployment on ECS

Your application has been humming away peacefully. Then one day, you decide to add a few features, fix bugs or add some more visualisations to your Dash app. How do you update your deployment?

Updating our Dash image by entering the new ECR link.
Updating the container definition with a new version by editing the image version tag. Note this must correspond to the new image’s tag in ECR.
  1. First, build your Docker image as described above, and push it to ECR. You may continue to use the :latest tag, in which case it will simply be overwritten, or give it a more imaginative name. Overall, I find the use of :latest somewhat of a bad idea, even though it has become a convention in ECS where only one version per image is being kept. In practice, I often have image tags correspond to major development branches: master (production grade), develop (beta) and feature-xxxx, where xxxx is a feature we’re testing at the moment.
  2. Once that is done, create a new task definition by selecting the Task definitions menu point on the left-side menu in ECS, select your task definition, and click the blue Create new revision button.
  3. This brings you to a new task revision configuration. You might need to scroll down a bit to find the container definition. Click on the container name, and replace the old version’s tag with the new one.
  4. Press Update to save the container settings.
  5. Finally, press Create to commit this new revision of the task definition.

Having created a new task definition revision (same name, but this time ending in :2), all we need to do now is redeploy. In your ebov cluster, select the ebov-dash service and click Update. Then select the recently created :2 revision of the task definition family, and follow the instructions. Sit back and enjoy having updated your entire Docker-based deployment of your Dash app, all with two command line instructions and five clicks.
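The two pieces of bookkeeping in this update, swapping the image tag and moving to the next task definition revision, boil down to simple string handling. The sketch below illustrates both. AWS assigns revision numbers itself, so `next_revision` only predicts the name you should expect to see, and the family name `ebov-dash` and the account ID are placeholders.

```python
def with_new_tag(image_uri, new_tag):
    """Swap the tag (the part after the last colon) of an ECR image URI.
    Assumes the URI has no port number, which holds for ECR URIs."""
    name, sep, _old_tag = image_uri.rpartition(":")
    if not sep:  # no tag present, so append one
        name = image_uri
    return f"{name}:{new_tag}"


def next_revision(task_definition):
    """Predict the next revision name, e.g. 'ebov-dash:1' -> 'ebov-dash:2'."""
    family, _, revision = task_definition.rpartition(":")
    return f"{family}:{int(revision) + 1}"


if __name__ == "__main__":
    old = "123456789012.dkr.ecr.us-west-2.amazonaws.com/ebov-dash:latest"
    print(with_new_tag(old, "develop"))
    print(next_revision("ebov-dash:1"))
```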

Deploying the new service definition.

Conclusion

Reading this guide, deploying your Dash app over ECS may appear anything but trivial. However, once you’ve done it a few times, the containerise-define-deploy workflow will become almost second nature, and you’ll have deployed applications that will rarely, if ever, need tending to.


References

1. Concept of Operation
2. See Dockerfile reference on FROM.
3. See Dockerfile reference on metadata.
Chris von Csefalvay
Deep learning and computer vision researcher by day, clinical computational epidemiologist by night, constantly sleep-deprived husband and dad to the world's most adorable Golden Retriever puppy. Educated at Oxford and Cardiff, I have been working with data science teams for the last decade to improve their operations, fix their processes and make great teams perform even better.
