Getting Your Python API SaaS-Ready: Four essential features you can’t miss.

11 min read · Jun 23, 2023

The shift to Software as a Service (SaaS) business models has created a robust market for businesses and solopreneurs to deliver value, improve customer service, and generate continuous revenue streams. This shift has made APIs, the backbone of any digital interaction, more critical than ever. Unfortunately, not all APIs are ready to transition to a full-fledged SaaS model. They must be strengthened with certain core features to make this leap successfully.

If you have a Python API that you’re considering turning into a SaaS offering, keep reading. I will walk you through four core features to add to your API to ensure it’s SaaS-ready. We’ll delve into topics like controlling access with API keys and client secrets, logging errors and events, versioning, and making your API bulletproof against client request abuse.

These features will not only bolster the performance and reliability of your API but also enhance its security and usability, making it a top-tier choice for your clients. Whether you’re a seasoned API developer or just getting started in the SaaS world, this guide aims to provide valuable insights and actionable advice to elevate your API to the next level.

Throughout this article, I will use Python and FastAPI for all the technical aspects of building a robust API. I love Python’s power and simplicity, especially when combined with a fast, modern web framework like FastAPI.

The following are the four cross-cutting features I consider indispensable to transforming your API into a SaaS:

Structured error and event logging.
Securing the API with API key support.
Versioning.
Rate Limiting: Safeguarding your API from potential abuse.

So, let’s embark on this journey to transform our Python API into a robust SaaS offering!

---

Logging errors and events: How to enhance the observability of our API.

A critical factor in maintaining the health and usability of your API as a SaaS product is the ability to log errors and events effectively. Logs provide valuable insights into the system, enabling us to monitor its behavior, debug issues, and understand user interaction patterns.

FastAPI has built-in support for handling and logging exceptions. However, integrating it with third-party services like Loggly, Datadog, or Sentry can further streamline logging processes, providing comprehensive insights and real-time alerts.

Regarding exception handlers, FastAPI offers a powerful mechanism to catch exceptions and perform necessary operations — like logging — before returning the error response to the client. Exception handlers are the perfect place to log errors as they provide a centralized point for dealing with all exceptions, reducing redundancy and improving maintainability.

Let’s see an example of how we set up an exception handler in FastAPI and log errors using Sentry:

from fastapi import FastAPI, HTTPException, Request
from fastapi.exception_handlers import http_exception_handler
from sentry_sdk import init as initialize_sentry, capture_exception

initialize_sentry(dsn="YOUR_SENTRY_DSN")

app = FastAPI()

@app.exception_handler(HTTPException)
async def sentry_http_exception_handler(request: Request, exc: HTTPException):
    # Capture the exception for Sentry
    capture_exception(exc)
    # Delegate to FastAPI's default HTTPException handler for the response
    return await http_exception_handler(request, exc)

In the code snippet above, we’re initializing Sentry with our DSN. Then, we’re using FastAPI’s `exception_handler` decorator to catch HTTP exceptions. In the exception handler, we capture the exception using Sentry’s `capture_exception` function, which logs the error to Sentry before returning the default error response.

Remember, logging isn’t limited to errors and exceptions. It’s also valuable to log application events, such as inbound requests, system state changes, and user behavior. These logs can help us understand how users interact with our API, identify potential issues, plan future improvements, and even use that information to train AI models.

FastAPI’s middleware feature can be an effective way to log requests and responses. A middleware component gets called with every request before it is processed by any specific path operation and before exception handlers.

from fastapi import FastAPI, Request
from sentry_sdk import add_breadcrumb

app = FastAPI()

@app.middleware("http")
async def add_sentry_breadcrumbs(request: Request, call_next):
    # Add a breadcrumb for Sentry with the request details
    add_breadcrumb(
        category="request",
        data={
            "path": request.url.path,
            "method": request.method,
        },
    )
    # Continue processing the request
    return await call_next(request)

In the example above, we use Sentry’s `add_breadcrumb` function in a FastAPI middleware to log each request’s path and method as breadcrumbs in Sentry. This way, we can track the sequence of operations leading to an error.
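The Sentry examples above cover errors and breadcrumbs. For the structured event logs mentioned at the start of this section, here is a minimal sketch that uses only Python’s standard `logging` module to emit one JSON object per event. The `JsonFormatter` class, the `build_logger` helper, the `api.events` logger name, and the `context` field are names I made up for illustration; they are not part of FastAPI or Sentry.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object (illustrative sketch)."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Values passed as extra={"context": {...}} become record attributes.
        context = getattr(record, "context", None)
        if context:
            payload["context"] = context
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name: str = "api.events") -> logging.Logger:
    """Return a logger that writes JSON lines to stderr (hypothetical name)."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


logger = build_logger()
logger.info(
    "request received",
    extra={"context": {"path": "/api/v1/search/nlp", "method": "POST"}},
)
```

Because every event is a single JSON line, log aggregators can usually index fields such as `path` or `method` without custom parsing rules.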

Securing our API with API Keys.

API key handling is a pivotal part of any SaaS API. It enables user authentication and ensures only authorized clients can access your services. With API keys, you can track and control how the API is used, prevent abuse, and even offer tiered access levels based on different clients’ needs.

While FastAPI does not provide built-in mechanisms for API key authentication, it offers the flexibility to integrate easily with any authentication system. With its rapid data access capabilities, Redis can serve as a robust and scalable solution for API key storage and validation.

The API key handling functionality is implemented in the following file called apikey_support.py:

from typing import Dict, Optional

from fastapi import Depends, FastAPI, HTTPException, Request, status
from fastapi.security import APIKeyHeader
from redis import ConnectionPool, Redis


# Initialize Redis connection pool
pool = ConnectionPool(host='localhost', port=6379, db=0)  # Use your Redis server details

api_key_header = APIKeyHeader(name="X-API-KEY", auto_error=False)

# FastAPI App instantiation
app = FastAPI()

def get_redis_client() -> Redis:
    return Redis(connection_pool=pool)

async def validate_api_key(request: Request,
                           api_key: Optional[str] = Depends(api_key_header)) -> Dict[str, str]:
    # With auto_error=False a missing header arrives as None, so skip the
    # lookup in that case; an empty result means the key is unknown
    tenant = get_redis_client().hgetall(api_key) if api_key else None

    if not tenant:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid API Key",
        )
    # Redis returns bytes; decode field names and values to str
    tenant = {k.decode(): v.decode() for k, v in tenant.items()}
    request.state.tenant = tenant
    return tenant

Let’s break down the most important parts of the previous code:

1. Initializing Redis connection: The `ConnectionPool` class is used to manage the Redis connection pool. This way, the connection can be reused instead of opened and closed for every Redis command execution, significantly increasing performance.

2. Setting up API Key Header: FastAPI’s `APIKeyHeader` class defines the name of the header from which the API key is extracted, in this case “X-API-KEY”. The `auto_error` parameter is set to `False`, which means FastAPI won’t automatically generate an error when the header is missing; the dependency receives `None` instead, so our own code decides how to respond.

3. API Key Validation: The `validate_api_key` function is an asynchronous function that is designed to validate the provided API key. This function will be used as a dependency in your path operations. It fetches the `api_key` from the header (using `Depends(api_key_header)`) and retrieves the corresponding data from Redis. If the API key does not exist in Redis, an HTTP 401 Unauthorized error is raised with the message “Invalid API Key”. If the API key is valid, it converts the returned tenant data (which is in bytes) into a string and saves it to the `request.state` object. The `request.state` object is an easy way to share data between your route handlers and middleware during a single request.

The tenant has the following structure:

{'name': 'tenant1', 'client_secret': 'your_client_secret', 'rate_limit': 'Client rate limit'}
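To show how such a tenant record could come into existence, here is a small sketch of issuing an API key and reading the record back. It uses a plain dictionary as a stand-in for Redis so that it runs without a server. The `issue_api_key` and `lookup_tenant` helpers, the `FakeRedis` alias, and the "10/minute" value are assumptions for illustration, not code from the project.

```python
import secrets
from typing import Dict, Optional

# In-memory stand-in for Redis: API key -> hash with byte fields, mirroring
# what hgetall returns when responses are not decoded.
FakeRedis = Dict[str, Dict[bytes, bytes]]


def issue_api_key(store: FakeRedis, name: str, rate_limit: str) -> str:
    """Create a random API key and store the tenant record under it."""
    api_key = secrets.token_urlsafe(32)
    store[api_key] = {
        b"name": name.encode(),
        b"client_secret": secrets.token_urlsafe(32).encode(),
        b"rate_limit": rate_limit.encode(),
    }
    return api_key


def lookup_tenant(store: FakeRedis, api_key: Optional[str]) -> Optional[Dict[str, str]]:
    """Return the decoded tenant record, or None for a missing or unknown key."""
    if not api_key:
        return None
    raw = store.get(api_key)
    if not raw:
        return None
    return {k.decode(): v.decode() for k, v in raw.items()}


store: FakeRedis = {}
key = issue_api_key(store, "tenant1", "10/minute")  # assumed limit format
print(lookup_tenant(store, key)["name"])  # tenant1
```

With a real Redis server, the same record could be written with redis-py’s `hset(api_key, mapping=...)`, which is the shape `hgetall` reads back.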

Now, let’s see how we can enforce API keys in the application’s endpoints.

from fastapi import APIRouter, Depends, Form, Request
from typing import Dict, Optional

from infrastructure.security.apikey_support import validate_api_key

router = APIRouter(
    prefix="/api/v1/search",
    tags=["search"],
)


@router.post("/nlp")
async def sem_search(request: Request,
                     query: str = Form(),
                     top_k: Optional[int] = Form(5),
                     validated: Dict[str, str] = Depends(validate_api_key)):

  # validate_api_key has already stored the tenant on request.state
  tenant = request.state.tenant

Setting up API key validation at the endpoint level, as in the previous code snippet, is not a good practice because each new endpoint has to remember to declare the dependency. A better approach is to set it up at the app level so every request is validated before any endpoint gets called.

import logging

from fastapi import Depends, FastAPI
from fastapi.middleware import Middleware
from fastapi.middleware.cors import CORSMiddleware

from sentry_sdk import init as initialize_sentry

from infrastructure.security.apikey_support import validate_api_key
from interface.endpoints import search


middleware = [
    Middleware(
        CORSMiddleware,
        allow_origins=["*"],
        allow_credentials=True,
        allow_methods=["*"],
        allow_headers=["*"],
    )
]


initialize_sentry(
    dsn="your dsn"
)

logger = logging.getLogger(__name__)

# Every request must pass validate_api_key before reaching an endpoint
app = FastAPI(middleware=middleware,
              dependencies=[Depends(validate_api_key)])

# Assumption: the search module exposes the APIRouter shown earlier as `router`
app.include_router(search.router)

Versioning: Ensuring smooth transitions and backward compatibility.

APIs are not static. As an API evolves to add features, fix bugs, or improve performance, there will come a time when you’ll need to introduce breaking changes. Versioning is a strategy that allows us to handle these changes smoothly, ensuring backward compatibility and preventing disruptions for our existing users.

There are several API versioning strategies; here, I’ll talk about the three most common ones: URI versioning, request header versioning, and parameter versioning. FastAPI doesn’t enforce any particular versioning strategy. It is up to you to choose the one that best fits your needs.

Here’s a brief look at each of these methods:

1. URI versioning: This is the most straightforward approach, where the API version is included in the URI (usually as a path parameter). It’s simple and easy to understand, but it does mean that different versions of the API are essentially different endpoints.

2. Request header versioning: The API version is sent as an HTTP header. This technique keeps the URI clean but can be slightly more complicated, as users must remember to include the correct header.

3. Parameter versioning: The version is sent as a query parameter in the request. Like header versioning, this keeps the path clean. It’s also easier to use than header versioning, as it doesn’t require setting headers.

Let’s illustrate these strategies with a simple FastAPI app:

URI versioning

from fastapi import FastAPI

app = FastAPI()

# URI Versioning
@app.get("/v1/api/resource")
async def read_resource_v1():
    # Handle request for version 1 (placeholder response)
    return {"version": "v1"}

@app.get("/v2/api/resource")
async def read_resource_v2():
    # Handle request for version 2 (placeholder response)
    return {"version": "v2"}

Request header versioning

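from typing import Optional

from fastapi import FastAPI, Header, HTTPException

app = FastAPI()

# Request Header Versioning
# FastAPI reads this parameter from the "header-version" HTTP header
@app.get("/api/resource")
async def read_resource(header_version: Optional[str] = Header(None)):
    if header_version == "v1":
        # Handle request for version 1 (placeholder response)
        return {"version": "v1"}
    elif header_version == "v2":
        # Handle request for version 2 (placeholder response)
        return {"version": "v2"}
    else:
        raise HTTPException(status_code=400, detail="Invalid or missing version")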

Parameter versioning

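from fastapi import FastAPI, HTTPException

app = FastAPI()

# Parameter Versioning
# `version` is not part of the path, so FastAPI reads it from the query
# string, e.g. GET /api/resource?version=v1
@app.get("/api/resource")
async def read_resource(version: str):
    if version == "v1":
        # Handle request for version 1 (placeholder response)
        return {"version": "v1"}
    elif version == "v2":
        # Handle request for version 2 (placeholder response)
        return {"version": "v2"}
    else:
        raise HTTPException(status_code=400, detail="Invalid version")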

In these examples, we provide different versions of the same resource. Clients can specify the version they want in the URI, in an HTTP header, or as a query parameter. By giving your clients the flexibility to choose when to adopt new versions, you can avoid disruptions and provide a better user experience.
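As the number of versions grows, the if/elif chains can be replaced by a lookup table. The following framework-free sketch shows one possible way to resolve the requested version and dispatch to the matching handler. The handler bodies, the header-before-query precedence, and the fallback to "v1" are my own assumptions for illustration; you may prefer to reject requests that don’t state a version, as the header example does.

```python
from typing import Callable, Dict, Optional


def read_resource_v1() -> dict:
    # Placeholder payload for version 1
    return {"version": "v1", "items": []}


def read_resource_v2() -> dict:
    # Placeholder payload for version 2 (imagine a breaking change in shape)
    return {"version": "v2", "data": {"items": []}}


# Registry of supported versions; adding a version means adding one entry.
HANDLERS: Dict[str, Callable[[], dict]] = {
    "v1": read_resource_v1,
    "v2": read_resource_v2,
}
DEFAULT_VERSION = "v1"  # assumed policy: unversioned requests get the oldest version


class UnsupportedVersion(ValueError):
    """Raised when a client asks for a version that is not registered."""


def resolve_version(header_value: Optional[str] = None,
                    query_value: Optional[str] = None) -> str:
    """Pick the version: header first, then query parameter, then the default."""
    requested = header_value or query_value or DEFAULT_VERSION
    if requested not in HANDLERS:
        raise UnsupportedVersion(f"Unsupported API version: {requested!r}")
    return requested


def dispatch(header_value: Optional[str] = None,
             query_value: Optional[str] = None) -> dict:
    """Call the handler registered for the resolved version."""
    return HANDLERS[resolve_version(header_value, query_value)]()


print(dispatch(query_value="v2")["version"])  # v2
```

In a FastAPI endpoint, `UnsupportedVersion` would typically be translated into the same 400 response used in the examples above.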

Rate Limiting: Safeguarding our API from potential abuse.

Rate limiting is a crucial API component, especially when evolving to a SaaS model. This feature protects your API from potential abuse by limiting how many requests a client can make within a specific period. It mitigates risks of Denial-of-Service (DoS) attacks and ensures that all clients receive a fair share of the service.

Rate limiting also helps keep our budget under control when our services consume costly third-party APIs like OpenAI GPT-4 or GPT-3.5-turbo.

FastAPI doesn’t come with built-in rate limiting; however, we can readily integrate it with other open-source tools to accomplish this task. One such tool is SlowApi, a rate-limiting library. We can couple SlowApi with Redis to get fast, scalable storage for the rate-limiting state.

The following example illustrates how we can set up rate limiting with FastAPI, SlowApi, and Redis:

Let’s start by defining the SlowApi limiter configuration in a file called rate_limiter.py:

from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.util import get_remote_address
from slowapi.errors import RateLimitExceeded

limiter = Limiter(key_func=get_remote_address, storage_uri="redis://localhost:6379")

In this example, we initialize our Limiter with the get_remote_address key function (to distinguish clients based on their IP address) and a Redis URI for storage.

Let’s see how straightforward it is to add rate-limiting support to our endpoints.

import logging

from fastapi import APIRouter, Depends, HTTPException, Request, Form
from typing import Optional

from infrastructure.security.apikey_support import validate_api_key
from infrastructure.quota.rate_limiter import limiter

logger = logging.getLogger(__name__)

router = APIRouter(
    prefix="/api/v1/search",
    tags=["resume"],
    responses={404: {"description": "Resume Not found"}}
)

@router.post("/nlp")
@limiter.limit("10/minute")
async def sem_search(request:Request, 
                     query: str = Form(), 
                     top_k: Optional[int] = Form(5)):

  tenant = request.state.tenant
  logger.info(f"search request from tenant {tenant} , query {query}")
  return "ok"

In the previous code snippet, we attach the Limiter to the endpoint at /api/v1/search/nlp to accept a maximum of 10 requests per minute. If a client exceeds this limit, SlowApi raises RateLimitExceeded; with _rate_limit_exceeded_handler registered as the app’s handler for that exception, the client receives a 429 Too Many Requests response with the following body:

{"error":"Rate limit exceeded: 10 per 1 minute"}
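To make the "10/minute" expression concrete, the following is a simplified, in-memory fixed-window counter written in plain Python. It is only a sketch of the general idea, not SlowApi’s actual implementation, which keeps its counters in the configured storage (Redis in our case). The `FixedWindowLimiter` class and its injectable `clock` parameter are hypothetical names.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Allow at most `limit` hits per `window_seconds` for each client key."""

    def __init__(self, limit: int, window_seconds: int,
                 clock: Callable[[], float] = time.time) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the example can control time
        # client key -> (window index, hits counted in that window)
        self._counters: Dict[str, Tuple[int, int]] = {}

    def hit(self, key: str) -> bool:
        """Record one request; return False when the caller should get a 429."""
        window = int(self.clock() // self.window_seconds)
        seen_window, count = self._counters.get(key, (window, 0))
        if seen_window != window:
            count = 0  # a new window started, so the counter resets
        if count >= self.limit:
            self._counters[key] = (window, count)
            return False
        self._counters[key] = (window, count + 1)
        return True


# Simulate 11 requests from one client inside the same minute.
now = [0.0]
demo = FixedWindowLimiter(limit=10, window_seconds=60, clock=lambda: now[0])
results = [demo.hit("tenant1") for _ in range(11)]
print(results.count(True))   # 10
print(results[-1])           # False

# One minute later the window has rolled over and requests pass again.
now[0] = 60.0
print(demo.hit("tenant1"))   # True
```

A fixed window is the simplest strategy to reason about; its known weakness is that a client can send a burst at the end of one window and another at the start of the next.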

The key_func parameter is a callable that SlowApi uses to distinguish between different clients. In our configuration, it is the get_remote_address function, which distinguishes clients based on their IP address. However, an IP address is a poor key when we want to set different rate-limiting quotas based on the client id.

We can set up dynamic rate limits by passing a callable to the `@limiter.limit` decorator instead of a string. SlowApi calls this function when it evaluates the limit, and it must return a string with the rate limit expression.

Unfortunately, SlowApi doesn’t provide an easy way to determine and set the rate limit expression based on the current client id. We have to do a little trick: subclass the Limiter class and override its `_check_request_limit` method. This is our newer version of the rate_limiter.py file.

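from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.util import get_remote_address
from slowapi.errors import RateLimitExceeded

from fastapi import Request
from typing import Optional, Callable, Any


class DynamicLimiter(Limiter):
    # NOTE: _check_request_limit is a private SlowApi method, so this
    # override may break when the library is upgraded.
    def _check_request_limit(
        self,
        request: Request,
        endpoint_func: Optional[Callable[..., Any]],
        in_middleware: bool = True,
    ) -> None:
        # Set by validate_api_key earlier in the request
        tenant = request.state.tenant

        def get_rate_limit() -> str:
            return tenant["rate_limit"]

        # Wrap a throwaway function so the tenant's limit expression is
        # evaluated instead of the one declared on the endpoint.
        # Caveat (worth verifying against your SlowApi version): the decorator
        # runs on every request, so registrations may accumulate over time.
        @limiter.limit(get_rate_limit)
        def get_expression(request):
            return endpoint_func

        return super()._check_request_limit(request, get_expression, in_middleware)


def get_user_id(request: Request) -> str:
    # key_func: identify the client by tenant name rather than by IP address
    return request.state.tenant['name']


limiter = DynamicLimiter(key_func=get_user_id,
                         storage_uri="redis://localhost:6379")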

So whatever limit expression we set in the controllers, our custom DynamicLimiter replaces it with the tenant’s rate_limit value.

@router.post("/nlp")
@limiter.limit('')
async def sem_search(request:Request, 
                     query: str = Form(), 
                     top_k: Optional[int] = Form(5)):

  tenant = request.state.tenant
  print(f"search request from tenant {tenant} , query {query}")
  return "ok"

This setup allows you to apply different rate limits to different users, which can be useful if you want to offer different service levels to different types of users (e.g., free users vs. premium users).
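Since the limit expression now comes from tenant data, it is worth validating it before it reaches the limiter. The sketch below parses the simple "N/unit" form into a request count and a window length, and falls back to a default when a tenant has no `rate_limit` field. The `parse_rate` and `rate_for_tenant` helpers and the default value are assumptions for illustration; SlowApi accepts richer expressions than this sketch handles.

```python
from typing import Dict, Tuple

# Window length in seconds for each supported unit (simple forms only).
_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}
DEFAULT_RATE = "10/minute"  # assumed fallback, matching the earlier example


def parse_rate(expression: str) -> Tuple[int, int]:
    """Turn 'N/unit' (for example '10/minute') into (max_requests, window_seconds)."""
    amount, _, unit = expression.strip().partition("/")
    amount = amount.strip()
    unit = unit.strip().lower()
    if not amount.isdigit() or unit not in _UNIT_SECONDS:
        raise ValueError(f"Unsupported rate expression: {expression!r}")
    return int(amount), _UNIT_SECONDS[unit]


def rate_for_tenant(tenant: Dict[str, str]) -> Tuple[int, int]:
    """Resolve a tenant's quota, using the default when none is configured."""
    return parse_rate(tenant.get("rate_limit") or DEFAULT_RATE)


free = {"name": "free-tenant"}                               # no explicit quota
premium = {"name": "premium-tenant", "rate_limit": "1000/hour"}

print(rate_for_tenant(free))     # (10, 60)
print(rate_for_tenant(premium))  # (1000, 3600)
```

Running this check when a tenant is created or updated catches a malformed quota early, instead of at request time.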

Conclusion

To finish, building a robust and efficient Python API for your SaaS application is more than just developing functionalities that fulfill business requirements. It also includes ensuring your API is secure, scalable, and maintainable, incorporating features like error handling and reporting, rate limiting, API key handling, versioning, and logging.

With FastAPI as the backbone of our application, we’ve seen how these elements enhance the overall functionality. I hope this article will help you implement these core features in your FastAPI applications, paving the way for a reliable, efficient, and successful SaaS offering.

To facilitate your journey and give a hands-on experience, all the code snippets discussed throughout this article are assembled into a comprehensive GitHub project. I invite you to explore the project repository to get a closer look at the implementation details.

Happy coding!

We have reached the end of the post. Feel free to DM me if you want to know more details or want my help in developing a prototype. Please add your comments if you have any questions.

Thanks for reading!

Stay tuned for more content about GPT-3, NLP, system design, and AI in general. I’m the CTO of an engineering services company called Klever; you can visit our page and follow us on LinkedIn too.