Cache eviction algorithms: LRU and LFU

Caching is a fundamental idea in computing: store the result of a repeated computation so that the next request can return the cached result directly instead of recomputing it, which speeds things up. Of course, for data that changes over time, the cached value eventually becomes stale and must be recomputed.

For example, suppose there are only two cache slots but many items to cache: 1, 2, 3, 4, 5. When the cache is full, one entry must be evicted. Common eviction algorithms include LRU, LFU, FIFO, second chance (SC), the aging algorithm, the clock / working-set algorithms, and so on.

How the algorithms work

LRU (Least Recently Used) keeps cached entries in a linked list ordered by access time; on eviction, the entry with the oldest access time is removed.
For example, take the access sequence 1, 2, 1, 3, 2.
After 1, 2, 1 the cache holds (1, 2), with 1 the more recently used.
When 3 arrives, the least recently used entry, 2, must be evicted, leaving (3, 1).

LFU (Least Frequently Used) keeps entries in a linked list ordered by access frequency; each access increments an entry's count, and on eviction the lowest-frequency entry is removed.
For example, take the sequence 1, 1, 1, 2, 2, 3.
After the first five accesses the cache holds (1 (3 times), 2 (2 times)).
When 3 arrives, the lowest-frequency entry, 2, is evicted, leaving (1 (3 times), 3 (1 time)).
Note the difference: LRU would have evicted 1 instead.

Obviously:
LRU has a poor hit rate for data that recurs in bursts.
For example, in the sequence 1, 1, 1, 2, 2, 2, 3, 4, 1, 1, 1, 2, 2,
the arrivals of 3 and 4 evict 1 and 2, even though many more accesses to 1 and 2 follow.

LFU has a poor hit rate once the hot set shifts to new keys.
For example, in the sequence 1, 1, 1, 2, 2, 3, 4, 3, 4, 3, 4, 3, 4, 3, 4, 4, 4, 3, 4, 3, 4,
after the first five accesses the cache holds (1 (3 times), 2 (2 times)).
Then 3 arrives and evicts 2, 4 arrives and evicts 3, 3 arrives and evicts 4, and so on. Yet 3 and 4 are exactly the keys that most need caching, while 1, having already reached 3 accesses, can never be evicted.

Implementation

There are two corresponding problems on LeetCode:
LRU: https://leetcode.com/problems/lru-cache/description/
LFU: https://leetcode.com/problems/lfu-cache/description/

Both ask for put() to add an entry to the cache and get() to read one, each in O(1).

One way to implement LRU:
Use a doubly linked list to record access order, since linked-list insertion and deletion are O(1); the most recently accessed entry sits at the head, the oldest at the tail.
Use a hash table for the cache lookups: average O(1), worst case O(n) under hash collisions. Each entry stores (key, (value, key_ptr)), where key_ptr is the address of the key's node in the linked list, so the node can be found and promoted to the head in O(1).
Conversely, the key stored in the list node lets the hash entry be found and erased in O(1) when the tail is evicted.

One way to implement LFU:
Use a master doubly linked list whose nodes are (access count, sub-list of keys), ordered by count from small to large; within each sub-list, keys are ordered by access time from newest to oldest.
Use a hash table storing (key, (value, master_list_ptr, sub_list_ptr)), where the pointers are the addresses of the key's nodes in the two lists.
get and put then work through the hash table in roughly O(1): the lookup finds the list nodes in O(1), the node is promoted to the next frequency bucket, and linked-list insertion and deletion are themselves O(1).

--------------------Finally, an accepted (AC) implementation:--------------------
Performance (1,000,000 put and get operations):
LRU: 480ms
LFU: 510ms
NSCache: 2000ms
YYCache: 1400ms

LRU:

 

#include <list>
#include <unordered_map>

using namespace std;

class LRUCache {
    
public:
    LRUCache(int capacity);
    ~LRUCache();
    int get(int key);               // Read from the cache; complexity is that of a hash lookup
    void put(int key, int value);   // Add to the cache (an existing key is overwritten); complexity is that of a hash insertion
    
private:
    int max_capacity;
    list<pair<int, int>> m_list;           // Doubly linked list of pair<key, value>
    unordered_map<int, list<pair<int, int>>::iterator> u_map;   // Hash map of <key, list iterator>
};

LRUCache::LRUCache(int capacity) {
    max_capacity = capacity;
}

LRUCache::~LRUCache() {
    max_capacity = 0;
    u_map.clear();
    m_list.clear();
}

int LRUCache::get(int key) {
    auto it = u_map.find(key);      // auto: C++11 type deduction
    if (it != u_map.end()) {
        // splice() moves the accessed node to the front of m_list in O(1)
        m_list.splice(m_list.begin(), m_list, it->second);
        return it->second->second;      // return value
    }
    return -1;
}

void LRUCache::put(int key, int value) {
    auto it = u_map.find(key);
    if (it != u_map.end()) {
        // Update the key's value and move its node to the front
        it->second->second = value;
        m_list.splice(m_list.begin(), m_list, it->second);
    } else {
        // If the cache is full, evict the least recently used entry first
        if ((int)m_list.size() >= max_capacity) {
            int del_key = m_list.back().first;
            u_map.erase(del_key);
            m_list.pop_back();
        }
        // Insert into u_map and m_list
        m_list.emplace_front(key, value);   // emplace_front constructs the pair in place (no copy or move), unlike push_front
        u_map[key] = m_list.begin();
    }
}

LFU:

 

#include <list>
#include <unordered_map>

using namespace std;

// Value structure stored in the hash map
typedef struct LFUMapValue {
    int value;
    list<pair<int, list<int> > >::iterator main_it;    
    list<int>::iterator sub_it;
} LFUMapValue;

class LFUCache {
public:
    LFUCache(int capacity);
    ~LFUCache();
    int get(int key);
    void put(int key, int value);
    void right_move(LFUMapValue *value);  // Move a node's key one frequency bucket to the right (access count + 1)
    
private:
    int max_cap;
    int cur_cap;
    // Master list of pair<count, sublist<key>>: counts increase from front to
    // back; within each sublist, keys run from newest to oldest access
    list<pair<int, list<int> > > m_list;
    unordered_map<int, LFUMapValue> u_map;      // Stores <key, LFUMapValue>
    unordered_map<int, LFUMapValue>::iterator map_it;
};

LFUCache::LFUCache(int capacity) {
    cur_cap = 0;
    max_cap = capacity;
    m_list.emplace_front(pair<int, list<int> >(1, list<int>()));    // Insert the initial bucket with count == 1
}

LFUCache::~LFUCache() {
    m_list.clear();
    u_map.clear();
}

void LFUCache::right_move(LFUMapValue *value) {
    auto pre = value->main_it;
    auto pre_sub_it = value->sub_it;
    auto next = pre;
    next++;
    
    if (next != m_list.end()) {
        if (pre->first + 1 != next->first) {        // Does the next bucket hold count + 1?
            if (pre->second.size() == 1) {
                pre->first++;       // This bucket holds only this key: increment the count in place, no new node needed
            } else {
                // Insert a new bucket before next
                auto it = m_list.emplace(next, pair<int, list<int> >(pre->first + 1, list<int>()));
                it->second.splice(it->second.begin(), pre->second, pre_sub_it);
                value->main_it = it;
                value->sub_it = it->second.begin();
            }
        } else {
            // Splice the key to the head of next's sublist
            next->second.splice(next->second.begin(), pre->second, pre_sub_it);
            value->main_it = next;
            value->sub_it = next->second.begin();
            
            // If pre's sublist is now empty, release the bucket
            if (pre->second.size() == 0) {
                m_list.erase(pre);
            }
        }
    } else {
        if (pre->second.size() == 1) {
            pre->first++;       // Increment the count in place
        } else {
            // Create and insert a new tail bucket
            list<int> tmp_list;
            tmp_list.splice(tmp_list.begin(), pre->second, pre_sub_it);
            // tmp_list's iterators cannot be kept: emplace_back copy-constructs tmp_list into the new node, and tmp_list is then destroyed
            m_list.emplace_back(pair<int, list<int> >(pre->first + 1, tmp_list));
            value->main_it = m_list.end();
            (value->main_it)--;
            value->sub_it = value->main_it->second.begin();
        }
    }
}

int LFUCache::get(int key) {
    map_it = u_map.find(key);
    if (map_it == u_map.end()) {
        return -1;
    }
    
    LFUMapValue *value = &(map_it->second);
    right_move(value);
    
    return value->value;
}

void LFUCache::put(int key, int value) {
    if (max_cap == 0) {
        return ;
    }
    map_it = u_map.find(key);
    if (map_it == u_map.end()) {
        // Key not found: insert a new entry
        list<int> *firstList = &(m_list.front().second);
        if (cur_cap == max_cap) {
            // Evict one entry
            if (firstList->size() > 0) {
                // Remove it from u_map and from the sublist
                u_map.erase(firstList->back());
                firstList->pop_back();
                cur_cap--;
            }
        }
        cur_cap++;
        if (m_list.front().first != 1) {
            m_list.emplace_front(pair<int, list<int> >(1, list<int>()));
            firstList = &(m_list.front().second);
        }
        firstList->emplace_front(key);
        LFUMapValue map_value;
        map_value.value = value;
        map_value.main_it = m_list.begin();
        map_value.sub_it = firstList->begin();
        u_map[key] = map_value;
    } else {
        // Key found: update the value and increment its access count
        map_it->second.value = value;
        right_move(&(map_it->second));
    }
}



https://www.jianshu.com/p/1f8e36285539
 


Posted by fiztec on Wed, 25 May 2022 21:58:16 +0300