
Easy caching in Python Twisted applications.

20 Aug 2014 · CPOL · 4 min read
Introduction to txcaching library.

Introduction

Most backend web developers have faced the need to cache their data. While solutions like memcached have straightforward interfaces and cover many common tasks, using them in a large project can be non-trivial, especially when a developer has to keep track of many keys and values. If the architect was skilled enough, the project's structure may let developers add caching relatively easily. But what if your project is not so well structured, was never designed with caching in mind, and you need to add it quickly without reshaping the architecture? With Twisted this is even harder because of its asynchronous nature.

In this article I want to introduce txcaching, a library that makes working with memcached in Twisted applications much easier. It is based on twisted.protocols.memcache. In many cases, just a few lines of code will let you cache your method calls or services.

Development of the library has only just started, so advice on how to improve it is very welcome.

The GitHub page of the project is here; this article largely repeats its README file.

You may find the API reference of the library here.

Brief description of the library

The main module of the library, txcaching.cache, provides three decorators to cache calls to various kinds of functions:

  • cache.cache - caches the output of a function. The function may be either blocking or asynchronous; after decoration it will be asynchronous either way. This may seem strange, because decorators don't usually change a function's behaviour like that. However, if you need to cache a function's calls, that function is usually an important part of your architecture and a potential bottleneck, so adding a blocking call to an external server there is a bad idea. The decorator may be used with methods as well as with plain functions, but with a method you must pass the name of the class in the class_name argument.
  • cache.cache_sync_render_GET - caches the output of a render_GET method of a Resource subclass. The method must return a string, not the server.NOT_DONE_YET constant.
  • cache.cache_async_render_GET - caches the output of a render_GET method of a Resource subclass. The method must return the server.NOT_DONE_YET constant.

All the decorators above use the cached function's arguments to generate keys on the memcached server, so you don't have to keep track of the keys yourself.
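The exact key scheme is internal to the library, but the idea can be sketched in a few lines of plain Python. The helper name make_key and the md5 scheme below are illustrative assumptions, not txcaching's actual implementation:

```python
import hashlib

def make_key(func_name, args, kwargs, class_name=None):
    """Build a deterministic cache key from a call signature.

    Illustrative sketch only -- txcaching's real key scheme may differ.
    """
    raw = "|".join([class_name or "", func_name,
                    repr(args), repr(sorted(kwargs.items()))])
    # memcached keys must be short and free of whitespace,
    # so hash the raw signature instead of using it directly
    return hashlib.md5(raw.encode("utf-8")).hexdigest()

# The same call signature always maps to the same key...
k1 = make_key("get", ("alice",), {}, class_name="DB")
k2 = make_key("get", ("alice",), {}, class_name="DB")
# ...while different arguments map to a different key.
k3 = make_key("get", ("bob",), {}, class_name="DB")
```

Because the key is a pure function of the call signature, any later call with the same arguments can find the cached value without the programmer ever naming a key.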

Of course, caching itself is only part of the task - we also need to change or remove the cached data when our data storage changes. txcaching makes this easy as well: it provides the keyregistry module to help you work with cached data.

Examples

Let us see how this works in two examples.

Both examples implement almost the same functionality: a simple web server that lets users and their emails be added to a data store. Getting an email by username represents a long, heavy operation (a two-second delay is added manually), so we want to cache the results of those requests. In the first example we use the cache decorator to cache the storage itself:

import time

from twisted.internet import defer
from txcaching import cache, keyregistry


class DB:
    def __init__(self):
        self.data = {}

    @cache.cache(class_name="DB", exclude_self=True)
    def get(self, username):
        """Very heavy request"""

        print "Reading from DB"
        time.sleep(2)
        email = self.data.get(username, None)
        if email:
            return defer.succeed(email)
        else:
            return defer.fail(Exception("User not found"))

    def set(self, username, email):
        self.data[username] = email
        cache_key = keyregistry.key(DB.get, args=(username,))
        if cache_key:
            cache.replace(cache_key, email)

The DB.get method is decorated with cache.cache, so its results will be cached. Pay attention to DB.set: it uses keyregistry.key to check whether a value for the given username has been cached, and updates the cache if so. Note that we never had to work with cache keys directly.
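The replace-on-write pattern that DB.set implements can be sketched with a plain dict standing in for the memcached server. All the names below are illustrative, not txcaching's API:

```python
db_data = {"alice": "alice@example.com"}   # the "slow" backing storage
cache_store = {}                           # stands in for memcached
registry = {}                              # call signature -> cache key
db_calls = []                              # records every hit on the backing store

def slow_lookup(username):
    db_calls.append(username)
    return db_data.get(username)

def cached_get(username):
    sig = ("DB.get", username)
    key = registry.setdefault(sig, "key:" + username)
    if key in cache_store:
        return cache_store[key]        # cache hit: skip the heavy lookup
    value = slow_lookup(username)      # cache miss: do the work, then store it
    cache_store[key] = value
    return value

def set_email(username, email):
    db_data[username] = email
    key = registry.get(("DB.get", username))
    if key is not None:
        cache_store[key] = email       # replace the stale cached value in place

first = cached_get("alice")    # miss: hits the backing store
second = cached_get("alice")   # hit: served from the cache
set_email("alice", "new@example.com")
third = cached_get("alice")    # cache was replaced on write: still no store hit
```

The point of the registry is that the writer never constructs a key itself; it only asks "was this call ever cached?" and acts on the answer.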

In the second example we take another approach and use cache.cache_async_render_GET to cache the service itself. (Using cache.cache_sync_render_GET would be almost the same.)

from twisted.web import server
from twisted.web.resource import Resource

# A single module-level storage instance (assumed from the full example).
db = DB()

class EmailGetter(Resource):
    def __init__(self, username):
        Resource.__init__(self)
        self.username = username

    @cache.cache_async_render_GET(class_name="EmailGetter")
    def render_GET(self, request):
        # email_response and email_not_found are response templates
        # defined elsewhere in the full example.
        d = db.get(self.username)
        d.addCallback(lambda email: request.write(email_response % email))
        d.addErrback(lambda failure: request.write(email_not_found % self.username))
        d.addBoth(lambda _: request.finish())

        return server.NOT_DONE_YET

class EmailSetter(Resource):
    def render_GET(self, request):
        # set_email is a form template defined elsewhere in the full example.
        return set_email

    def render_POST(self, request):
        username = request.args.get("username", [""])[0]
        email = request.args.get("email", [""])[0]

        cache_key = keyregistry.key(EmailGetter.render_GET, args=(EmailGetter(username),))
        if cache_key:
            cache.delete(cache_key)

        db.set(username, email)
        return email_set_confirmation % (username, email)

EmailGetter.render_GET is decorated with cache_async_render_GET, so its results will be cached. Note that in this case the result depends on the state of the Resource object (the self.username field), so we do not set exclude_self=True in cache_async_render_GET. EmailSetter.render_POST uses keyregistry.key to check whether a value for the given username has been cached, and drops the cache entry for that particular username.
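The delete-on-write variant used here - dropping the stale entry so that the next GET misses, re-renders, and re-caches the page - can be sketched the same way, again with a dict standing in for memcached and purely illustrative names:

```python
# A dict stands in for memcached; names are illustrative, not txcaching's API.
cache_store = {"key:alice": "<cached page for alice>"}
registry = {("EmailGetter.render_GET", "alice"): "key:alice"}

def invalidate(username):
    """Drop the cached page so the next request misses and re-renders it."""
    key = registry.get(("EmailGetter.render_GET", username))
    if key is not None:            # only act if the page was ever cached
        cache_store.pop(key, None)

invalidate("alice")                # registered key: entry is dropped
alice_dropped = "key:alice" not in cache_store
invalidate("bob")                  # never cached: nothing to do
```

Deleting instead of replacing is the simpler choice when the cached value (a rendered page) is expensive to rebuild outside the request path.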

To work with cached data directly, you may also use the other functions provided by the cache module: get(), set(), append(), flushAll(), etc.

Conclusion

This library lets you add caching to your Twisted project quickly and easily. A similar approach could work for ordinary synchronous APIs as well; I may add that capability in the future.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

