Micro Services Classroom Series – 12/Dec/2021


  • Caching means storing data in a temporary space (a cache) so that it can be retrieved faster.
  • This temporary space can be application memory, server hard disk space, or something else.
  • The whole purpose of caching is to lighten the workload by avoiding heavy processing when the same data is queried again.
  • For example, we can cache the recipes from popular authors that users query frequently.
  • For server-level caching, the cache is usually stored on the same web server as the application, but it can also be stored on a separate server such as Redis (Remote Dictionary Server) or Memcached (a high-performance distributed memory cache).


  • We can install Flask-Caching, which is a Flask extension package.
  • This package allows us to implement caching functionality easily.
  • Refer Here for the official documentation.
  • You can think of a cache as a dictionary object that contains key-value pairs. The key identifies the resource to cache, whereas the value stores the actual data to be cached.
  • The process is better illustrated through the following image.
  • Install Flask-Caching: pip install Flask-Caching, then run pip freeze > requirements.txt to record the dependency.
  • Refer Here for the changeset containing changes to implement Simple Caching
  • Note: In production-grade systems, we would use Memcached or Redis for storing the cache.
  • Clear the cache when the data is updated, so that stale data is not served.
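
The dictionary view of a cache, and the need to clear an entry when its data changes, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not how Flask-Caching is implemented: the names SimpleTTLCache, RECIPES, get_recipe and update_recipe are hypothetical, and the in-memory RECIPES dictionary stands in for a real database.

```python
import time


class SimpleTTLCache:
    """Dictionary-backed cache whose entries expire after a timeout (hypothetical helper)."""

    def __init__(self, default_timeout=60, clock=time.monotonic):
        self._store = {}  # key -> (expiry_time, value)
        self._default_timeout = default_timeout
        self._clock = clock

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # The entry is stale: drop it and report a cache miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value, timeout=None):
        ttl = self._default_timeout if timeout is None else timeout
        self._store[key] = (self._clock() + ttl, value)

    def delete(self, key):
        self._store.pop(key, None)


# Assumption: a tiny in-memory "database" used only for illustration.
RECIPES = {1: {"name": "Pasta", "author": "popular_author"}}
cache = SimpleTTLCache(default_timeout=60)
stats = {"db_queries": 0}  # counts how often the "database" is actually hit


def get_recipe(recipe_id):
    key = f"recipe:{recipe_id}"  # the key identifies the cached resource
    cached = cache.get(key)
    if cached is not None:
        return cached
    stats["db_queries"] += 1  # cache miss: run the expensive query
    recipe = dict(RECIPES[recipe_id])
    cache.set(key, recipe)  # the value is the actual data
    return recipe


def update_recipe(recipe_id, **changes):
    RECIPES[recipe_id].update(changes)
    cache.delete(f"recipe:{recipe_id}")  # clear the stale entry on update
```

Calling get_recipe(1) twice queries the "database" only once; after update_recipe(1, name="Lasagne") the next read misses the cache and picks up the new data.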

API Rate Limiting

  • When we provide an API service, we need to ensure fair usage so that system resources serve all users effectively and fairly.
  • We do this by applying restrictions on the small number of high-traffic users.
  • The way to do this is to set a limit per user. For example, we can limit the number of requests per user to no more than 100 per second.
  • HTTP Headers and Response Codes
    • We can use HTTP headers to convey rate limit information.
    • The following HTTP headers tell us the number of requests allowed, the remaining quota, and when the limit will be reset:
      • X-RateLimit-Limit: Shows the rate limit of this API endpoint
      • X-RateLimit-Remaining: Shows the number of remaining requests allowed before the next reset
      • X-RateLimit-Reset: When the rate limit will be reset (in UTC epoch time)
      • Retry-After: The number of seconds before the next reset
    • When the user violates the rate limit, the API returns the HTTP status code 429 Too Many Requests, with an error message in the response body such as:
        "errors": "Too Many Requests"
  • To implement this in Flask, we have the Flask-Limiter extension package.
  • Install this package: pip install Flask-Limiter
  • Refer Here for the usage.
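  • To rate limit an API endpoint, specify the limit as a string in the following format:
[count] [per|/] [n (optional)] [second|minute|hour|day|month|year]

For example: 100 per minute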
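
To make the notation concrete, here is a rough sketch of how such a string could be turned into a request count and a window length in seconds. It is not Flask-Limiter's own parser: parse_limit is a hypothetical helper, and month and year are left out of this sketch because their length in seconds varies.

```python
import re

# Seconds per supported unit (month and year are omitted in this sketch).
_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}

# count, then "per" or "/", then an optional multiple, then the unit.
_LIMIT_RE = re.compile(
    r"^\s*(?P<count>\d+)\s*(?:per|/)\s*(?P<n>\d+)?\s*"
    r"(?P<unit>second|minute|hour|day)s?\s*$",
    re.IGNORECASE,
)


def parse_limit(text):
    """Return (count, window_seconds) for strings such as '100 per minute'."""
    match = _LIMIT_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognised rate limit: {text!r}")
    count = int(match.group("count"))
    multiple = int(match.group("n") or 1)  # the optional n defaults to 1
    unit = match.group("unit").lower()
    return count, multiple * _UNIT_SECONDS[unit]
```

For instance, parse_limit("100 per minute") and parse_limit("100/minute") both describe 100 requests in a 60-second window, while "10 per 2 hours" describes 10 requests in a 7200-second window.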
  • Refer Here for basic rate limiting
  • Next Steps:
    • Best Practices for rate limiting
    • Open API/Swagger documentation
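
The per-user limit, the rate limit headers, and the 429 response described in this section can be sketched without any framework. This is a simplified illustration that assumes a fixed-window counting strategy kept in process memory; FixedWindowLimiter and its check method are hypothetical names, and a library such as Flask-Limiter handles storage and strategies differently.

```python
import math
import time


class FixedWindowLimiter:
    """Per-user fixed-window request counter (hypothetical sketch)."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self._clock = clock
        self._counters = {}  # user -> (window_index, request_count)

    def check(self, user):
        """Return (status_code, headers) for one request made by `user`."""
        now = self._clock()
        window_index = int(now // self.window)
        stored_index, count = self._counters.get(user, (window_index, 0))
        if stored_index != window_index:
            count = 0  # a new window has started, so the quota resets
        reset_at = (window_index + 1) * self.window  # epoch time of next reset

        allowed = count < self.limit
        if allowed:
            count += 1
        self._counters[user] = (window_index, count)

        headers = {
            "X-RateLimit-Limit": str(self.limit),
            "X-RateLimit-Remaining": str(self.limit - count),
            "X-RateLimit-Reset": str(reset_at),
        }
        if not allowed:
            # Tell the client how many seconds to wait before retrying.
            headers["Retry-After"] = str(math.ceil(reset_at - now))
            return 429, headers
        return 200, headers
```

A web handler would call check with the user's identifier on each request, copy the returned headers onto the response, and return the "Too Many Requests" error body when the status is 429.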
