API overview

On this page we give a brief overview of our deployment setup and explain the reasons behind our choices.

Note

Deployment setups differ from problem to problem, and our proposed solution may not work for a particular use case. We do not intend to impose a particular deployment solution.

The system we want to build has to satisfy the following requirements:

  • We want to develop a service that lets users interact with our model. We use FastAPI to quickly build a reliable solution.

  • We will have a concurrent setup (many users may interact with the APIs at the same time), so we need a way to schedule tasks in an asynchronous and distributed environment. Celery offers us the tools to satisfy these requirements.

  • Celery needs a message broker to receive and deliver task messages: instead of implementing a dedicated piece of code, we use RabbitMQ as our message broker.

  • This gives us an asynchronous environment, in which we need a way to cache the results. The NoSQL database Redis lets us build a fast and powerful caching service.

  • Finally, we need a tool to store structured data, such as user and location data. PostgreSQL is a modern relational database that does this job.

Basic deployment schema

[Figure: basic deployment schema (API.png)]

In our setup:

  • The user creates a request through a standard interface provided by FastAPI. The request is then processed and sent to the message broker, RabbitMQ.

  • RabbitMQ receives the message, processes it, and forwards it to Celery, which determines what has to be executed in a dedicated worker and with which parameters.

  • The task is executed in a worker. If instructed, the worker can also access the database to fetch more data or store partial results.

  • Once the task is finished, Celery retrieves the result and stores it in the cache system, Redis.

  • FastAPI pulls the result from the cache and shows it to the user.

Note

Our deployment was built as an advanced toy example, to show how many components are needed and how they should interact when deploying a system. Things like security, hardware scalability, extensive testing, CI, etc. are not considered here, but they are necessary for a real-world application.

FastAPI

FastAPI is an open-source Python web framework for building web APIs.

With FastAPI we define how users interact with our application. The “URLs” used by the users are called routes.

We implemented routes for:

  • submitting an inference task,

  • checking the status of a task,

  • retrieving the predictions,

  • choosing the preferred place among the predicted ones,

  • training a new model,

  • getting statistics for monitoring the system.

More specifically, we implemented the following API routes:

Route                           Description
/inference/start                Start a new inference task with the current model
/inference/status/{task_id}    Retrieve the state of a started inference task
/inference/results/{task_id}   Get the model's predictions once the inference task is finished
/inference/select/              Select the best place among the top-ranked places
/train/start                    Train a new model
/content/{value}                Get debugging content from the database
/metrics                        Get monitoring statistics
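
To make the flow concrete, here is a sketch of how a client could use these routes. It assumes the service listens on http://localhost:8000 and uses hypothetical payload fields (the actual fields are those of the requests.UserData schema shown below); the task_id and status keys come from the TaskStatus response also shown below.

import time

import httpx  # any HTTP client works; httpx is used here as an example

BASE_URL = 'http://localhost:8000'  # assumed host and port

# Hypothetical user payload: the real fields are defined by requests.UserData.
user_data = {'age': 30, 'latitude': 47.37, 'longitude': 8.54}

# Submit a new inference task and read back its id.
task = httpx.post(f'{BASE_URL}/inference/start', json=user_data).json()
task_id = task['task_id']

# Poll the status route until Celery reports a final state.
while True:
    status = httpx.get(f'{BASE_URL}/inference/status/{task_id}').json()
    if status['status'] in ('SUCCESS', 'FAILURE'):
        break
    time.sleep(1)

# Retrieve the predictions once the task has finished.
results = httpx.get(f'{BASE_URL}/inference/results/{task_id}').json()
print(results)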

As an example of how the routes are implemented, the following snippet shows the code for the route /inference/start. This route consumes the data of a user and schedules an inference task on Celery.


# Imports used by this snippet; requests, crud, inference and get_db are
# project modules (request schemas, database helpers, the Celery task and
# the database-session dependency).
from celery.result import AsyncResult
from fastapi import Depends
from sqlalchemy.orm import Session


@api.post('/inference/start')
async def schedule_inference(user_data: requests.UserData, db: Session = Depends(get_db)):
    """This is the endpoint used to schedule an inference."""
    # Log the request for monitoring and tracking purposes.
    crud.create_event(db, 'inference_start')
    # Store the received user data, then schedule the task on Celery.
    ud = crud.create_user_data(db, user_data.dict())
    task: AsyncResult = inference.delay(ud.user_id)

    # Track the new task in the database and report its id and status.
    db_inf = crud.create_inference(db, task.task_id, task.status)
    return requests.TaskStatus(
        task_id=db_inf.task_id,
        status=db_inf.status,
        type='inference'
    )

In words, this route first creates a log event, for monitoring and tracking requirements, then stores the received data in the database and creates an inference task. The task takes only the freshly created user_id as a parameter, since it will collect all the required data from the database during its execution. Once the inference is started, the resulting AsyncResult object is stored in the database, once again for tracking purposes.
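
The status route can be implemented along the same lines. The following is only a sketch using the same project modules as above (api, crud, requests, get_db, inference): the crud.update_inference helper is hypothetical, while Task.AsyncResult is the standard Celery handle for a scheduled task.

from celery.result import AsyncResult


@api.get('/inference/status/{task_id}')
async def inference_status(task_id: str, db: Session = Depends(get_db)):
    """Return the current state of a previously scheduled inference task."""
    # Ask Celery (which reads the Redis backend) for the task state.
    task: AsyncResult = inference.AsyncResult(task_id)

    # Hypothetical helper: keep the tracking record in the database in sync.
    db_inf = crud.update_inference(db, task_id, task.status)

    return requests.TaskStatus(
        task_id=db_inf.task_id,
        status=db_inf.status,
        type='inference'
    )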

A description of every route is written in API Routes.

Celery

Celery is a distributed task queue that processes messages and provides the tools for task scheduling.

We use Celery to perform computationally intensive tasks in an asynchronous way. It is very important to consider that just providing a web application to the user is not enough: if the user starts a heavy task, like our inference, it would be quite annoying to have the website frozen until the task ends.
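
Concretely, scheduling a task on Celery returns immediately: the heavy computation runs in a separate worker process, and the caller only holds a handle (an AsyncResult) to the future result. A minimal sketch, reusing the inference task from the route above:

# Synchronous call (what we want to avoid): the request would block
# until the inference finishes.
# predictions = run_inference(user_id)

# Asynchronous call with Celery: .delay() returns an AsyncResult handle
# at once, while the computation runs in a worker.
handle = inference.delay(user_id)
print(handle.task_id, handle.status)  # e.g. PENDING right after scheduling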

Celery needs a startup file and a configuration file. Writing them can be quite challenging for complex systems. In our example we kept them simple and configured just the bare minimum to make everything work as intended.

Deploying a system that can scale up with the computational load is mandatory, and Celery's distributed workers make this possible. We implemented these two tasks in Celery:

  • model training,

  • inference on the trained model.

The implementation of these two tasks is described in Tasks.
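
To give an idea of what a task looks like, here is a minimal sketch of the inference task. The decorator comes from the worker object created in celery.py (shown below); crud.get_user_data and models.load_current_model are hypothetical stand-ins for the project's data-access and model-loading code.

from database import crud         # module listed in the Celery imports below
from worker import models        # module listed in the Celery imports below
from worker.celery import worker  # the Celery instance created in celery.py


@worker.task(name='inference')
def inference(user_id: int):
    """Run the current model on the data of the given user."""
    # The task receives only the user_id: everything else is collected
    # from the database inside the task itself.
    data = crud.get_user_data(user_id)    # hypothetical DB helper
    model = models.load_current_model()   # hypothetical model loader

    # The return value must be JSON-serializable (see celeryconfig.py):
    # Celery stores it in the Redis backend once the task completes.
    return model.predict(data)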

Configuration

Celery is configured by using two Python files:

  • celery.py

  • celeryconfig.py

In the first file we create a Python object, worker, pointing to a Celery instance. This object represents the Celery application and works as the entry point for every operation with Celery. The Celery object is instantiated with a name, a reference to a result backend (in our case Redis), and a reference to a message broker (in our case RabbitMQ). Once the object is created, we load our custom configuration file using config_from_object and then, at line 17, we start Celery itself.

celery.py
 1"""Celery configuration script from environment variables and config file."""
 2from celery import Celery
 3
 4import os
 5
 6
 7worker = Celery(
 8    'tasks',
 9    backend=os.getenv('CELERY_BACKEND_URL'),
10    broker=os.getenv('CELERY_BROKER_URL'),
11)
12
13worker.config_from_object('worker.celeryconfig')
14
15
16if __name__ == '__main__':
17    worker.start()

The environment variables used in this file have to be set at the system level (as system environment variables or, in our case, as environment variables in the docker-compose file).

The values we used are:

CELERY_BROKER_URL=pyamqp://rabbitmq/
CELERY_BACKEND_URL=redis://redis/

In the file celeryconfig.py we configure Celery so that it can load, and start on request, the tasks we want to run asynchronously.

celeryconfig.py
result_expires=3600

# Accept only JSON-serialized content, ignore everything else
accept_content=['json']
task_serializer='json'
result_serializer='json'

timezone='Europe/Zurich'
enable_utc=True

imports=(
    'worker.tasks',
    'worker.models',
    'database',
    'database.crud',
)

include=[
    'worker.tasks.inference',
    'worker.tasks.train',
]

Redis

Redis is a NoSQL in-memory data store that we use as a cache to temporarily store results from the asynchronous tasks.

In our deployment setup this NoSQL database is used only by FastAPI and Celery to exchange data once a task has been completed. Since both Celery and FastAPI have very good compatibility with Redis, the solution is basically plug & play and does not need any configuration.
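
As a small illustration, FastAPI never talks to Redis directly: it asks Celery for the result, and Celery reads it from the Redis backend. A sketch, assuming the worker application from celery.py:

from celery.result import AsyncResult

from worker.celery import worker  # the app configured with backend=redis://redis/


def get_result(task_id: str):
    """Fetch a task result from the Redis backend, if it is ready."""
    res = AsyncResult(task_id, app=worker)
    if res.ready():              # True once the worker has stored the result
        return res.get(timeout=1.0)
    return None                  # still running: the caller should poll again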

RabbitMQ

RabbitMQ is a message broker that we use as middleware to send messages from FastAPI to Celery.

A message broker is usually necessary when two modules need a translation layer to communicate correctly. RabbitMQ is a well-supported solution which provides a web-based graphical interface, useful for monitoring the communication between services.

RabbitMQ offers a lot of customization, but we run it in a very basic setup for our toy example, so we can use the default settings.
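
For reference, broker URLs follow Celery's standard scheme transport://userid:password@hostname:port/virtual_host. With RabbitMQ's defaults (user guest, password guest, port 5672, virtual host /), our short URL expands to the equivalent full form:

pyamqp://rabbitmq/  ==  pyamqp://guest:guest@rabbitmq:5672//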