How to Use Redis for Caching and Pub/Sub in Python | by Lynn Kwong


Redis is an open-source in-memory data structure store that can be used as an in-memory key-value database, caching system, and pub/sub message broker. Redis is special in that all its data is stored in memory rather than on disk, which makes it extremely fast and a popular option for caching.

In a previous article, the essentials of Redis data types and common commands were introduced. In this article, we will focus on how to use Redis for caching and Publisher/Subscriber (pub/sub) in Python.

You can install a Redis server directly on your computer. However, for learning purposes, it’s recommended to use a Docker container for Redis because you can always use the latest version of Redis for testing. To start a Redis server with a Docker container, run the following commands in a shell.

A high port 16379 is used to avoid potential port conflicts on your machine. Besides, a Docker network is created to make it easier to connect the Redis server with redis-cli:

You can now run the common commands with redis-cli as demonstrated in the previous article. However, the focus of this article is how to use Redis in Python, so let’s proceed.

To access the Redis server in Python, you need a Python Redis client. The recommended one is redis-py, which is mature and well supported. It is currently seen as “the way to go” for Python.

To use redis-py, we need to install it first. It is recommended to install it in a virtual environment so it won’t interfere with existing libraries. For simplicity, we will use conda to create a virtual environment:

The (redis) prefix on the command line indicates that the virtual environment has been created and activated. If you want to learn more about conda, this article can be a good reference.

Then, to install redis-py, run:

$ pip install redis

Note that the library to install is redis, not redis-py. To make it easier to run Python commands interactively, we will also install IPython in the virtual environment:

$ pip install ipython

We can then start IPython and interact with Redis:

If your Redis server has authentication enabled, you can specify the password with the password parameter. You can also specify the db parameter if your Redis server has multiple databases. The default database number is 0, which is used when you don’t specify one.
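A minimal sketch of the connection step, assuming the Docker-hosted server is reachable on localhost at port 16379. The helper names connect and set_and_get, the default host, and the key greeting are illustrative assumptions rather than part of the original session. The redis import sits inside the function only so that the helpers can be defined without a server at hand.

```python
def connect(host="localhost", port=16379, db=0, password=None):
    """Create a redis-py client (assumed host and port; adjust to your setup)."""
    import redis  # third-party: installed with `pip install redis`

    # No connection is opened here; redis-py connects on the first command.
    return redis.Redis(host=host, port=port, db=db, password=password)


def set_and_get(client, key, value):
    """Write a value, then read it back."""
    client.set(key, value)
    return client.get(key)  # bytes unless the client decodes responses


# Example session against a running server (hypothetical key name):
#   redis_cli = connect()
#   set_and_get(redis_cli, "greeting", "hello")
```

With a client created this way, the value read back is a bytes object rather than a str.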

Note that you should specify the port correctly; here it is the custom high port 16379, not the default 6379. Also note that keys and values for Redis can only be bytes, strings, integers, or floats. As a best practice, we should normally use only meaningful strings as keys. A value with a non-basic type, such as an object or array, must first be converted to a string or bytes before it can be set as a value in Redis. The json.dumps function is commonly used to convert dictionaries or lists in Python to JSON strings.
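The JSON conversion can be sketched as a pair of helpers. The names set_json and get_json are hypothetical, and client is assumed to be any redis-py client created as described earlier.

```python
import json


def set_json(client, key, obj):
    """Serialize a dict or list to a JSON string and store it under `key`."""
    return client.set(key, json.dumps(obj))


def get_json(client, key):
    """Load the stored JSON back into Python; None if the key is missing."""
    raw = client.get(key)  # bytes or str, depending on decode_responses
    return None if raw is None else json.loads(raw)
```

json.loads accepts both str and bytes, so the reading side works whether or not the client decodes responses.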

On the other hand, as we see, by default, all responses are returned as bytes in Python. If we want to decode all string responses from a Redis client, we can specify the decode_responses and encoding parameters when we create the client:
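As a hedged illustration, a decoding client might be created as below. The function names and the default host and port are assumptions; decode_reply is only a stand-in showing what decoding amounts to for a single reply, not something redis-py exposes.

```python
def connect_decoded(host="localhost", port=16379, encoding="utf-8"):
    """Create a client whose string replies are decoded to `str`."""
    import redis  # third-party: installed with `pip install redis`

    return redis.Redis(
        host=host, port=port, decode_responses=True, encoding=encoding
    )


def decode_reply(raw, encoding="utf-8"):
    """Illustrative only: what decoding means for a single bytes reply."""
    return raw.decode(encoding) if isinstance(raw, bytes) else raw
```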

If you are unfamiliar with the concepts of string encoding/decoding and Unicode/UTF-8, this article can be helpful.

To delete a Redis key, just use the delete method of the client:

In the result, 1 means the key is deleted successfully, and 0 indicates that the key does not exist.
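A small sketch that turns this return value into a boolean. The helper name delete_key is hypothetical, and client is assumed to be a redis-py client.

```python
def delete_key(client, key):
    """Delete `key`; return True if it existed, False otherwise."""
    removed = client.delete(key)  # number of keys actually removed
    return removed == 1
```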

When we have many Redis keys to check, there is a convenient method called scan_iter, which is similar to the SCAN command but easier to use because we don’t need to track and pass the cursor ourselves:

Here the colon (:) is just used to separate the object type and the id and doesn’t have any special meaning. The scan_iter method expects a Redis pattern as input. If no pattern is specified, all the keys will be returned. In particular, the star “*” matches any number of any characters, like the “.*” pattern in a regular expression.
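A minimal sketch of pattern-based key listing. The helper name keys_matching and the example key names are assumptions made for illustration.

```python
def keys_matching(client, pattern="*"):
    """Collect every key that matches a Redis glob-style pattern."""
    # scan_iter handles the SCAN cursor internally and yields keys one by one.
    return sorted(client.scan_iter(match=pattern))


# Example (hypothetical keys "user:1", "user:2", "order:1"):
#   keys_matching(redis_cli, "user:*")
```

Collecting the keys into a list is convenient for small keyspaces. For large ones it may be better to iterate over scan_iter directly.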

Above we have introduced the very fundamental usage of Redis in Python. Most Redis methods in Python have their native redis-cli command counterparts. If you are interested in or need to use other methods, you can refer to the previous article that focuses on native redis-cli commands.

Now let’s focus on the Redis pipeline and pub/sub, which are less commonly used with redis-cli and more commonly used through a driver (Python in this post).

For Redis, the pipeline is a way to send multiple commands in one go. The commands are buffered on the client and only one request is sent to the Redis server. In this way, the communication overhead between the Redis server and the client is reduced and throughput can increase. The pipeline is very helpful when you need to run many Redis commands sequentially.

The pipeline is similar to the transaction in SQL databases. However, instead of first starting a transaction and then committing, here you first create a pipeline object and then execute:
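A minimal sketch of the create-then-execute flow. The helper name set_many is hypothetical, and client is assumed to be a redis-py client.

```python
def set_many(client, mapping):
    """Queue one SET per item and send them to the server together."""
    pipe = client.pipeline()
    for key, value in mapping.items():
        pipe.set(key, value)  # buffered locally, nothing is sent yet
    return pipe.execute()  # one reply per queued command, in order


# Example (hypothetical keys):
#   set_many(redis_cli, {"user:1": "Alice", "user:2": "Bob"})
```

By default, redis-py wraps the queued commands in a MULTI/EXEC transaction. Passing transaction=False to pipeline() keeps the batching without the transaction.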

As we see, the pipeline is very straightforward to use. It also has more advanced settings, which make it work in more complex situations.

Redis is a commonly used lightweight message/queue or publish/subscribe (pub/sub) system. For example, Airflow can use it as the Celery broker that forwards messages from the scheduler to the workers.

As with other message/queue systems, we need to create a channel and subscription before we can publish and receive messages. The channel is sometimes called topic in some message/queue systems such as the Pub/Sub service of Google Cloud Platform.
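A minimal sketch of creating a subscription. The helper name subscribe and the variable names in the usage comment are assumptions; the channel name channel-1 is the one used later in the text.

```python
def subscribe(client, *channels):
    """Create a PubSub object and subscribe it to the given channels."""
    pubsub = client.pubsub()
    pubsub.subscribe(*channels)
    return pubsub


# Example (hypothetical client name):
#   p = subscribe(redis_cli, "channel-1")
```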

Actually, we don’t need to create a channel explicitly. The channel is created automatically when it is subscribed to for the first time, as shown above. Moreover, we can subscribe to multiple channels at the same time. We can even subscribe to multiple channels by pattern, as we will see soon with the psubscribe method below.

Let’s now publish some messages to the channel:
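One way to sketch this, assuming client is a redis-py client. The helper name publish_all and the message texts are made up for illustration.

```python
def publish_all(client, channel, messages):
    """Publish each message; collect how many subscribers received each one."""
    return [client.publish(channel, message) for message in messages]


# Example (hypothetical messages):
#   publish_all(redis_cli, "channel-1", ["first message", "second message"])
```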

It should be noted that the Redis client (redis_cli) is used to publish messages to a channel, not the PubSub object created above. The number returned is the number of subscribers to which the message was delivered. Let’s create a new subscription and see if the number changes accordingly.
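A sketch of adding a pattern subscription to an existing PubSub object. The helper name add_pattern_subscription is hypothetical; the pattern channel-* follows the text.

```python
def add_pattern_subscription(pubsub, pattern):
    """Subscribe an existing PubSub object to every channel matching `pattern`."""
    pubsub.psubscribe(pattern)
    return pubsub


# Example (names follow the text; not executed here):
#   add_pattern_subscription(p, "channel-*")
#   redis_cli.publish("channel-1", "hello")  # reaches both subscriptions
```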

In this example, psubscribe means that a subscription is created by pattern. The created subscription will listen to all the channels matching the specified pattern.

As we see, when we publish to channel-*, the return value is 1, meaning the message was delivered to only one subscriber, namely the one created with the pattern channel-*. However, when a message is published to channel-1 again, the return value changes to 2, meaning the message was delivered to two subscribers. Actually, only one PubSub object is created and used, but it has two subscriptions/subscribers: one to the channel channel-1 and the other to all the channels matching channel-*.

Let’s now get the published messages from the system. To do this, we need to call the get_message method of the PubSub object.

The get_message method gets the next message if one is available, otherwise None. It returns a dictionary with four keys:

  • type: The type of the data. The value can be subscribe, unsubscribe, psubscribe, punsubscribe, message, or pmessage. message and pmessage carry the actual data that is published; the other ones describe subscription changes and can be seen as metadata. We are normally only interested in the data with the message or pmessage type.
  • pattern: The pattern for the channels. It is None for all messages except the pmessage type, as shown above. In this example, it means that the message is published to channels matching the pattern channel-*.
  • channel: The channel to which the message is published.
  • data: The actual message data published. For the “metadata” messages, it is the number of subscriptions for the PubSub object at the time when the get_message method was called. For the message type, it is the actual message that was sent.
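The four keys above can be put to use in a small polling helper. This is a sketch: the name drain_messages and the timeout value are assumptions, and pubsub is assumed to be a PubSub object with at least one subscription.

```python
def drain_messages(pubsub, timeout=0.1):
    """Read pending messages, keeping only real payloads."""
    received = []
    while True:
        msg = pubsub.get_message(timeout=timeout)
        if msg is None:  # nothing arrived within the timeout
            break
        if msg["type"] in ("message", "pmessage"):
            received.append((msg["channel"], msg["data"]))
    return received
```

Against a live server, a None result only means that nothing arrived within the timeout, so a long-running consumer would keep polling instead of stopping. Creating the object with client.pubsub(ignore_subscribe_messages=True) is another way to skip the metadata.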

If you are as curious as I am: a “metadata” message is published when a new subscription is created and when it is unsubscribed.

Normally you won’t bother too much with the metadata and only need to work with the real messages that are published.

As we can see from the simple examples above, it is very straightforward to create a channel (topic) and subscription and then publish/receive messages with Redis. You can then create some logic to parse the data received and use it in your application.

In this article, the most common use cases of Redis in Python are introduced. Together with the previous article on using native Redis commands, you should have a fairly good understanding of Redis now and can start to use it in your work.
