How to solve the problems of Redis cache avalanche, breakdown and penetration

01-16-2023

This article explains how to solve three Redis caching problems: cache avalanche, cache breakdown, and cache penetration.

First, cache avalanche

1. What is cache avalanche?

Cache avalanche means that a large number of requests fail to hit cached data in Redis at the same time. Because the data cannot be found in Redis, the business system has to query the database, so all of those requests end up being sent to the database.

[Figure: requests miss the Redis cache and go to the database]

The database cannot handle as many requests as Redis can. The surge in requests caused by a cache avalanche can easily overload the database or even bring it down, which in turn affects the business system, so a cache avalanche can be fatal to the business system.

2. Why is there a cache avalanche?

Under what circumstances does a cache avalanche occur? To sum up, there are two reasons:

  • A large amount of cached data in Redis expires at the same time, so the requests sent to Redis cannot hit any data and can only query the database.

  • The Redis server goes down, so no request can be served by Redis and all of them turn to the database to query data.

3. How to avoid cache avalanche?

Each cause has its own solution:

  • For a large amount of cached data expiring at the same time, add a random offset to the original expiration time, such as a random value between 1 and 5 minutes, so that a large amount of cached data does not expire at the same moment.

  • For an avalanche caused by Redis downtime, set up Redis master-slave replication in advance and configure the sentinel mechanism, so that when the Redis master cannot provide services because of downtime, the sentinel can promote a Redis slave server to master and continue to provide services.
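The random-offset idea can be sketched as follows. This is a minimal sketch, not a definitive implementation: the helper names `ttl_with_jitter` and `cache_set` are hypothetical, and `client` is assumed to be any object that exposes a redis-py style `set(name, value, ex=seconds)` method.

```python
import random


def ttl_with_jitter(base_ttl_s, min_jitter_s=60, max_jitter_s=300):
    """Return the base TTL plus a random offset (1 to 5 minutes by default)."""
    return base_ttl_s + random.randint(min_jitter_s, max_jitter_s)


def cache_set(client, key, value, base_ttl_s=3600):
    """Write a value whose expiration time is spread out by a random offset.

    `client` is assumed to behave like a redis-py client, where
    set(name, value, ex=seconds) stores the value with a TTL in seconds.
    """
    client.set(key, value, ex=ttl_with_jitter(base_ttl_s))
```

With a base expiration time of one hour, keys written in the same batch expire somewhere between 61 and 65 minutes later instead of all at once.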

Second, cache breakdown

1. What is cache breakdown?

Cache breakdown is similar to cache avalanche. An avalanche is caused by a large amount of data expiring, while cache breakdown refers to the expiration of hot data: once it expires, all requests for that hot data have to be processed by the database.

2. How to avoid cache breakdown?

There are three ways to solve cache breakdown:

  • Do not set the expiration time

If we know in advance that certain data is hot data, we can simply not set an expiration time for it, which avoids cache breakdown. For example, the product data for a flash sale can be written to the cache in advance with no expiration time set.

  • Mutex lock

If we know in advance that some data will be accessed heavily, we can of course set it to never expire, but more often we cannot predict this in advance. How do we deal with that situation?

Let's analyze how cache breakdown happens:

Under normal circumstances, when a piece of cached data in Redis expires and a request arrives, the request queries the database and writes the result back to the cache, so that subsequent requests hit the cache without having to query the database.

When hot data expires, however, the request volume is large. The first request that misses the cache queries the database and writes the data back to Redis, but until that write to Redis completes, other incoming requests also miss the cache and query the database.

Since we know that many requests will query the database after hot data expires, we can protect the database query in the business logic with a mutex lock. Only the request that acquires the lock queries the database and writes the data back to Redis, while the requests that fail to acquire the lock wait for the data to be ready.
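One possible way to sketch the mutex approach is shown below. It is an illustration under stated assumptions rather than the article's exact method: the function name `get_with_mutex`, the `lock:` key prefix, and the `load_from_db` callback are hypothetical. `client` is assumed to behave like a redis-py client created with `decode_responses=True`, where `set(name, value, nx=True, ex=seconds)` only succeeds if the key does not exist yet, and `load_from_db` is assumed to return a non-empty string.

```python
import time
import uuid


def get_with_mutex(client, key, load_from_db, ttl_s=300,
                   lock_ttl_s=10, retries=50, wait_s=0.05):
    """Read a key, letting only one caller rebuild it after it expires."""
    lock_key = "lock:" + key          # hypothetical naming scheme
    token = str(uuid.uuid4())         # identifies this caller's lock
    for _ in range(retries):
        value = client.get(key)
        if value is not None:
            return value              # cache hit, nothing to rebuild
        # SET with NX and EX: only one caller can create the lock key,
        # and the lock expires on its own if that caller crashes.
        if client.set(lock_key, token, nx=True, ex=lock_ttl_s):
            try:
                value = client.get(key)   # re-check after taking the lock
                if value is None:
                    value = load_from_db(key)
                    client.set(key, value, ex=ttl_s)
                return value
            finally:
                # Release only our own lock. This get-then-delete is not
                # atomic; a production version would need an atomic check.
                if client.get(lock_key) == token:
                    client.delete(lock_key)
        time.sleep(wait_s)            # lock is held elsewhere, wait and retry
    raise TimeoutError("cache value for %r was not rebuilt in time" % key)
```

Requests that fail to get the lock poll the cache for a short time instead of querying the database, so the database sees a single query per expired hot key.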

[Figure: mutex lock flow for rebuilding expired hot data]


  • Setting a logical expiration time

Using a mutex lock solves cache breakdown very simply, but requests that do not acquire the lock are queued, which hurts system performance. Another way to solve cache breakdown is to store a redundant expiration time in the business data itself. For example, in the following data we add an expire_at field to indicate when the data expires.

    {"name":"test","expire_at":"1599999999"}

The hot data in the cache carries this redundant logical expiration time, but no expiration time is set for the key in Redis.

When a request reads the data from Redis, it checks whether the logical expiration time has passed. If not, it returns the data directly. If it has, it starts another thread that acquires the lock, queries the database, and writes the latest data back to Redis, while the current request returns the data it already read from the cache.
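The logical-expiration flow might look roughly like the sketch below. The names `get_with_logical_expiry` and `load_from_db` are hypothetical, `client` is assumed to offer redis-py style `get` and `set` methods, and `load_from_db` is assumed to return a dict. For brevity the sketch uses one in-process `threading.Lock` for all keys; with several application processes a lock stored in Redis would be needed instead.

```python
import json
import threading
import time

# One in-process lock for all keys, to keep the sketch short.
_rebuild_lock = threading.Lock()


def get_with_logical_expiry(client, key, load_from_db, ttl_s=300):
    """Return cached data at once, refreshing it in the background if stale."""
    raw = client.get(key)
    if raw is None:
        return None                   # hot data is assumed to be preloaded
    record = json.loads(raw)
    if int(record["expire_at"]) > time.time():
        return record                 # logically still fresh
    # Logically expired: at most one thread rebuilds, nobody waits for it.
    if _rebuild_lock.acquire(blocking=False):
        def rebuild():
            try:
                fresh = dict(load_from_db(key))
                fresh["expire_at"] = str(int(time.time()) + ttl_s)
                client.set(key, json.dumps(fresh))   # no Redis TTL is set
            finally:
                _rebuild_lock.release()
        threading.Thread(target=rebuild, daemon=True).start()
    return record                     # stale data for the current request
```

The trade-off is that callers may briefly see stale data, in exchange for never blocking on the database.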

Third, cache penetration

1. What is cache penetration?

Cache penetration means that the requested data is neither in the cache nor in the database. Because it is not in the cache, the request always reaches the database, and the Redis cache provides no protection.

[Figure: a request for data that exists in neither Redis nor the database]

2. Why does cache penetration occur?

Under what conditions does cache penetration occur? There are mainly three situations:

  • Requests from a malicious user's attack

  • Data in both Redis and the database is deleted by mistake

  • The user has not generated any content yet. For example, when requesting the article list of a user who has not written any articles, there is no data in either the cache or the database

3. How to avoid cache penetration?

a. Cache a null or default value

When no data is found in the Redis cache, query the database. If the database has no data either, cache a null or default value so that the next request does not query the database. To avoid still returning the null value after the corresponding data has been created in the database, set an expiration time on the cached null value, or delete the cached null value when the data is created.
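A minimal sketch of null-value caching follows. The sentinel string `NULL_MARKER`, the function name, and the `load_from_db` callback are all hypothetical; `client` is assumed to behave like a redis-py client created with `decode_responses=True`, and `load_from_db` is assumed to return a string or `None`.

```python
NULL_MARKER = "__NULL__"   # hypothetical sentinel meaning "no such data"


def get_with_null_caching(client, key, load_from_db,
                          ttl_s=3600, null_ttl_s=60):
    """Look up a key, caching the fact that it does not exist."""
    cached = client.get(key)
    if cached == NULL_MARKER:
        return None                   # known to be missing, skip the database
    if cached is not None:
        return cached                 # normal cache hit
    value = load_from_db(key)
    if value is None:
        # Cache the miss with a short TTL so that data created later
        # becomes visible once the marker expires.
        client.set(key, NULL_MARKER, ex=null_ttl_s)
        return None
    client.set(key, value, ex=ttl_s)
    return value
```

A short expiration time on the marker limits how long a newly created record can be hidden by a stale null entry.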

b. Bloom filter

Although caching a null value can solve cache penetration, the database still has to be queried once to determine whether the data exists. If a malicious user issues highly concurrent queries with data ids that do not exist in the system, each new id still has to go through the database, which still puts great pressure on it.

So, is there a way to determine whether data exists without querying the database? Yes: use a Bloom filter.

A Bloom filter consists of two parts, a bit array and N hash functions. It works as follows:

  • Hash the data to be marked with each of the N hash functions.

  • Take each hash value modulo the length of the bit array to get the position of that hash value in the bit array.

  • Set the corresponding positions in the bit array to 1.

When data is written, perform the steps above to compute the positions in the bit array and set them to 1. When a query is executed, compute the same positions: if any of them is 0, the data definitely does not exist; if all of them are 1, the data may exist.

Because of hash collisions, data that does not exist may occasionally be judged to exist by the Bloom filter, and the request then goes on to query the database. With a suitably sized bit array the probability of such collisions is small, so a Bloom filter can still intercept most penetration requests.
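To make the principle concrete, here is a small in-memory Bloom filter. It is a teaching sketch, not a tuned implementation: the class name, the default sizes, and the choice of salted SHA-256 as the N hash functions are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Bit array plus N hash functions, as described above."""

    def __init__(self, size_bits=1 << 20, num_hashes=5):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)   # all bits start at 0

    def _positions(self, item):
        data = str(item).encode("utf-8")
        for i in range(self.num_hashes):
            # Salting one hash with the index i stands in for N functions.
            digest = hashlib.sha256(i.to_bytes(4, "big") + data).digest()
            # Hash value modulo the bit array length gives the position.
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        """Mark every position for this item as 1."""
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        """False means definitely absent; True means possibly present."""
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))
```

A request handler would call `might_contain(item_id)` first and reject the request without touching Redis or the database when it returns `False`.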

Redis provides Bloom filters through the RedisBloom module (included in Redis Stack), so where that module is available we can use the Redis Bloom filter directly instead of implementing it ourselves, which is very convenient.
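When the RedisBloom module is loaded, its `BF.ADD` and `BF.EXISTS` commands can be sent through a client's generic command interface. The wrapper below is a sketch under that assumption: the function names and the `lookup` callback are hypothetical, and `client` is assumed to expose redis-py's `execute_command` method.

```python
def bf_add(client, filter_key, item):
    """Record an item in the Bloom filter stored at filter_key."""
    return bool(client.execute_command("BF.ADD", filter_key, item))


def bf_might_exist(client, filter_key, item):
    """False means the item was never added; True means it may exist."""
    return bool(client.execute_command("BF.EXISTS", filter_key, item))


def guarded_lookup(client, filter_key, item_id, lookup):
    """Run the normal cache-then-database lookup only for plausible ids."""
    if not bf_might_exist(client, filter_key, item_id):
        return None                   # rejected before Redis data or the DB
    return lookup(item_id)
```

Every id must be added to the filter when the corresponding data is created, otherwise valid requests would be rejected.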

Fourth, summary

Cache avalanche, breakdown, and penetration are cache problems that are often encountered when business applications use a cache. Their causes and solutions are shown in the following table:

| Problem | Cause | Solution |
| --- | --- | --- |
| Cache avalanche | A large amount of data expires at the same time, or the Redis server crashes | 1. Random expiration time 2. Master-slave + sentinel cluster |
| Cache breakdown | Hot data expires | 1. Do not set expiration time 2. Add mutex lock 3. Redundant logical expiration time |
| Cache penetration | The requested data is in neither the database nor Redis | 1. Cache null or default value 2. Bloom filter |

