FastAPI Best Practices. Opinionated list of best practices and… | by Yerassyl Zhanymkanov | Aug, 2022


Opinionated list of best practices and conventions we developed after 1.5 years in production at our startup.

A DALL-E-generated image with a hipster sitting in front of wide vertical screens

Although FastAPI is a great framework with fantastic documentation, it’s not obvious to beginners how to build larger projects with it.

Over the last 1.5 years in production, we have made good and bad decisions that dramatically affected our developer experience. Some of them are worth sharing.

Contents

  1. Project Structure. Consistent & Predictable
  2. Use Pydantic extensively for data validation
  3. Use dependencies for data validation vs DB
  4. Decouple & Reuse dependencies. Dependency calls are cached
  5. Don’t make your routes async, if you have only blocking I/O operations
  6. Migrations. Alembic
  7. BackgroundTasks > asyncio.create_task
  8. Be careful with dynamic pydantic fields
  9. Save files in chunks
  10. If you must use sync SDK, then run it in a thread pool.

This article contains only a portion of the guidelines we followed, so feel free to look up the original GitHub repository with the full list of detailed best practices, which has already gained some positive feedback (#1 hot post for a day in r/Python, and 250 stars within the first week on GitHub).

There are many ways to structure the project, but the best structure is a structure that is consistent, straightforward, and has no surprises.

  • If looking at the project structure doesn’t give you an idea of what the project is about, then the structure might be unclear.
  • If you have to open packages to understand what modules are located in them, then your structure is unclear.
  • If the frequency and location of the files feel random, then your project structure is bad.
  • If looking at the module’s location and its name doesn’t give you an idea of what’s inside, then your structure is very bad.

The structure presented by Sebastián Ramírez, which separates files by their type (e.g. api, crud, models, schemas), is good for microservices or projects with fewer scopes, but we couldn’t fit it into our monolith with lots of domains and modules.

A structure that I found more scalable and evolvable is inspired by Netflix’s Dispatch, with a few small modifications.

fastapi-project
├── alembic/
├── src
│ ├── auth
│ │ ├── router.py
│ │ ├── schemas.py # pydantic models
│ │ ├── models.py # db models
│ │ ├── dependencies.py
│ │ ├── config.py # local configs
│ │ ├── constants.py
│ │ ├── exceptions.py
│ │ ├── service.py
│ │ └── utils.py
│ ├── aws
│ │ ├── client.py # client model for external service communication
│ │ ├── schemas.py
│ │ ├── config.py
│ │ ├── constants.py
│ │ ├── exceptions.py
│ │ └── utils.py
│ ├── posts
│ │ ├── router.py
│ │ ├── schemas.py
│ │ ├── models.py
│ │ ├── dependencies.py
│ │ ├── constants.py
│ │ ├── exceptions.py
│ │ ├── service.py
│ │ └── utils.py
│ ├── config.py # global configs
│ ├── models.py # global models
│ ├── exceptions.py # global exceptions
│ ├── pagination.py # global module e.g. pagination
│ ├── database.py # db connection related stuff
│ └── main.py
├── tests/
│ ├── auth
│ ├── aws
│ └── posts
├── templates/
│ └── index.html
├── requirements
│ ├── base.txt
│ ├── dev.txt
│ └── prod.txt
├── .env
├── .gitignore
├── logging.ini
└── alembic.ini

When a package requires services or dependencies from other packages, import them with an explicit module name.

from src.auth import constants as auth_constants
from src.notifications import service as notification_service
from src.posts.constants import ErrorCode as PostsErrorCode

Pydantic has a rich set of features to validate and transform data.

In addition to regular features like required & non-required fields with default values, Pydantic has built-in comprehensive data processing tools like regex, enums for limited allowed options, length validation, email validation, etc.
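
A minimal sketch of a few of these features follows. The model and field names (UserBase, MusicBand, first_name, age) are made up for illustration, and the snippet sticks to options that, to our knowledge, behave the same in Pydantic v1 and v2.

```python
from enum import Enum
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class MusicBand(str, Enum):
    # Enum restricts the field to a fixed set of allowed options.
    AEROSMITH = "AEROSMITH"
    QUEEN = "QUEEN"
    ACDC = "AC/DC"


class UserBase(BaseModel):
    # Required field with length validation.
    first_name: str = Field(min_length=1, max_length=128)
    # Required field with a numeric lower bound.
    age: int = Field(ge=18)
    # Non-required field with a default value.
    favorite_band: Optional[MusicBand] = None


if __name__ == "__main__":
    print(UserBase(first_name="Ann", age=30, favorite_band="QUEEN"))
    try:
        UserBase(first_name="", age=30)  # violates min_length
    except ValidationError as exc:
        print(exc)
```

In a FastAPI route, declaring a parameter of type UserBase is enough for the request body to be validated this way before the route body runs.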

Pydantic can only validate the values of client input. Use dependencies to validate data against database constraints like email already exists, user not found, etc.

As a bonus, when several routes share one dependency (for example, one that validates a post_id), you don’t need to write separate tests for that validation in each route.

Dependencies can be reused multiple times without being recalculated: FastAPI caches a dependency’s result within a request’s scope by default. For example, if a dependency calls the service get_post_by_id, the database is queried only on the first call within that request, not every time the dependency is used.

Knowing this, we can easily decouple dependencies into multiple smaller functions that operate on a smaller domain and are easier to reuse in other routes. For example, suppose a single request uses the parse_jwt_data dependency in three places:

  1. valid_owned_post
  2. valid_active_creator
  3. get_user_post

Even so, parse_jwt_data is called only once, in the very first call.

Under the hood, FastAPI can effectively handle both async and sync I/O operations.

  • FastAPI runs sync routes in the threadpool and blocking I/O operations won’t stop the event loop from executing the tasks.
  • Otherwise, if the route is defined async then it’s called regularly via await and FastAPI trusts you to do only non-blocking I/O operations.

The caveat is that if you break that trust and execute blocking operations within async routes, the event loop will not be able to run the next tasks until the blocking operation is done.
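
The effect can be observed outside FastAPI with plain asyncio. This is only a sketch: heartbeat and measure are hypothetical helpers, time.sleep stands in for a blocking call, and asyncio.to_thread requires Python 3.9 or newer.

```python
import asyncio
import time


async def heartbeat(ticks: list) -> None:
    # Records a timestamp every 10 ms for as long as the loop lets it run.
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def measure(block_the_loop: bool) -> int:
    """Return how many heartbeat ticks happen during a 0.2 s 'I/O' call."""
    ticks: list = []
    task = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # give the heartbeat a chance to start
    before = len(ticks)
    if block_the_loop:
        time.sleep(0.2)  # blocking call made directly on the event loop
    else:
        await asyncio.to_thread(time.sleep, 0.2)  # same call in a worker thread
    after = len(ticks)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return after - before


if __name__ == "__main__":
    print("ticks while blocking the loop:", asyncio.run(measure(True)))
    print("ticks while using a thread:", asyncio.run(measure(False)))
```

measure(True) reports no heartbeat ticks during the blocking call, while measure(False) reports several, because the loop keeps running while the worker thread waits.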

The second caveat is that operations that are non-blocking awaitables or are sent to the thread pool must be I/O-intensive tasks (e.g. open file, db call, external API call).

  • Awaiting CPU-intensive tasks (e.g. heavy calculations, data processing, video transcoding) is worthless since the CPU has to work to finish the tasks, while I/O operations are external and the server does nothing while waiting for those operations to finish, thus it can go to the next tasks.
  • Running CPU-intensive tasks in other threads also isn’t effective, because of the GIL. In short, the GIL allows only one thread to execute Python code at a time, which makes threads useless for CPU tasks.
  • If you want to optimize CPU-intensive tasks you should send them to workers in another process.
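
A small standard-library sketch of why threads do help for I/O-bound work: the waits overlap because a sleeping (or waiting) thread releases the GIL. fake_io_call and run_concurrently are hypothetical names, and time.sleep stands in for a real network or disk call.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def fake_io_call(delay: float) -> float:
    # Stand-in for a blocking I/O call; the thread just waits.
    time.sleep(delay)
    return delay


def run_concurrently(delays: List[float]) -> Tuple[List[float], float]:
    """Run all calls in a thread pool and return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        results = list(pool.map(fake_io_call, delays))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_concurrently([0.2, 0.2, 0.2, 0.2])
    print(f"{len(results)} calls took {elapsed:.2f}s in total")
```

The four waits take roughly as long as one, not four. The same pool would give no such speed-up for pure Python number crunching, which is why CPU-bound work belongs in separate processes (for example a task queue worker or concurrent.futures.ProcessPoolExecutor).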

  1. Migrations must be static and revertible. If your migrations depend on dynamically generated data, then make sure the only thing that is dynamic is the data itself, not its structure.
  2. Generate migrations with descriptive names & slugs. Slug is required and should explain the changes.
  3. Set a human-readable file template for new migrations. We use the *date*_*slug*.py pattern, e.g. 2022-08-24_post_content_idx.py

# alembic.ini
file_template = %%(year)d-%%(month).2d-%%(day).2d_%%(slug)s
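
To see how this template turns into a file name, the sketch below reads the same setting with Python's configparser (which un-escapes the doubled percent signs) and fills it in with %-formatting. It only mimics the naming step under that assumption and does not call Alembic itself; the date and slug values are taken from the example file name above.

```python
import configparser

INI_TEXT = """
[alembic]
file_template = %%(year)d-%%(month).2d-%%(day).2d_%%(slug)s
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# "%%" in the ini file becomes a single "%" once the value is read.
template = parser.get("alembic", "file_template")

file_name = template % {
    "year": 2022,
    "month": 8,
    "day": 24,
    "slug": "post_content_idx",
} + ".py"

print(template)   # %(year)d-%(month).2d-%(day).2d_%(slug)s
print(file_name)  # 2022-08-24_post_content_idx.py
```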

BackgroundTasks can effectively run both blocking and non-blocking I/O operations the same way FastAPI handles routes (sync functions are run in a thread pool, while async ones are awaited later).

  • Don’t lie to the worker and don’t mark blocking I/O operations as async.
  • Don’t use it for heavy CPU-intensive tasks.

Don’t hope your clients will send small files: read and write uploads in chunks instead of loading the whole file into memory.
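
A standard-library sketch of the chunked copy. save_in_chunks and DEFAULT_CHUNK_SIZE are illustrative names; in a FastAPI route the source would typically be an UploadFile, whose read(size) method is async and would be awaited in the same kind of loop.

```python
import io
import os
import tempfile
from typing import BinaryIO

DEFAULT_CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; tune to your workload


def save_in_chunks(
    source: BinaryIO,
    destination_path: str,
    chunk_size: int = DEFAULT_CHUNK_SIZE,
) -> int:
    """Copy source to destination_path chunk by chunk; return bytes written."""
    written = 0
    with open(destination_path, "wb") as destination:
        # read() returns b"" at end of file, which ends the loop.
        while chunk := source.read(chunk_size):
            destination.write(chunk)
            written += len(chunk)
    return written


if __name__ == "__main__":
    payload = io.BytesIO(b"x" * (3 * DEFAULT_CHUNK_SIZE + 10))
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "upload.bin")
        print(save_in_chunks(payload, path), "bytes saved")
```

Memory use stays bounded by the chunk size regardless of how large the upload is. Since this version uses blocking file I/O, it belongs in a sync route or a thread pool rather than directly inside an async route.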

If you have a pydantic field that can accept a union of types, be sure the validator explicitly knows the difference between those types.

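Not Terrible Solutions:

  1. Order field types properly: from the strictest ones to the loosest ones.
  2. Validate that the input has only valid fields. Pydantic swallows the ValueError raised by one union member and tries the next one; a validation error is raised only if no member accepts the input.
  3. Use Pydantic’s Smart Union (v1.9+) if the fields are simple. It’s a good solution if the fields are simple like int or bool, but it doesn’t work for complex fields like classes. Without Smart Union, Pydantic tries the union members from left to right and keeps the first one that validates, coercing the value if needed; with Smart Union, it first looks for a member whose type matches the input exactly.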
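
The sketch below illustrates the three options under Pydantic v1 semantics, which is what this section describes. All model names are made up for illustration. The try/except import is only there so the snippet also runs where Pydantic v2 is installed, since v2 ships the old API under pydantic.v1 and handles unions differently by default.

```python
from typing import Optional, Union

try:  # Pydantic v2 and late 1.10.x releases expose the v1 API here
    from pydantic.v1 import BaseModel, Extra
except ImportError:  # older Pydantic v1 releases
    from pydantic import BaseModel, Extra


class Article(BaseModel):
    text: Optional[str] = None
    extra: Optional[str] = None


class Video(BaseModel):
    video_id: int
    text: Optional[str] = None
    extra: Optional[str] = None


class LoosePost(BaseModel):
    # Problem: Article is tried first and, with only optional fields,
    # it accepts any dict, including one meant to be a Video.
    content: Union[Article, Video]


class OrderedPost(BaseModel):
    # Solution 1: put the strictest type first.
    content: Union[Video, Article]


class StrictArticle(Article):
    class Config:
        extra = Extra.forbid  # Solution 2: unknown fields are an error


class StrictVideo(Video):
    class Config:
        extra = Extra.forbid


class StrictPost(BaseModel):
    content: Union[StrictArticle, StrictVideo]


class PlainValue(BaseModel):
    value: Union[str, int]  # without Smart Union: str is tried first


class SmartValue(BaseModel):
    value: Union[str, int]

    class Config:
        smart_union = True  # Solution 3: prefer the exact type match


payload = {"video_id": 1, "text": "text"}

if __name__ == "__main__":
    print(type(LoosePost(content=payload).content).__name__)    # Article
    print(type(OrderedPost(content=payload).content).__name__)  # Video
    print(type(StrictPost(content=payload).content).__name__)   # StrictVideo
    print(repr(PlainValue(value=1).value))                      # '1'
    print(repr(SmartValue(value=1).value))                      # 1
```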

If you must use an SDK to interact with external services, and it’s not async, then make the HTTP calls in an external worker thread.

For a simple example, we could use the well-known run_in_threadpool from Starlette.


