Reading notes on the core technology of 100-million-traffic websites (continuously updated)

1. High concurrency principle

Statelessness, splitting, service-orientation, message queues, data heterogeneity, caching (the "silver bullet"), and concurrency.

Heterogeneous data: order databases and tables are usually sharded by order ID. To query a user's order list you would then have to aggregate data from multiple tables before returning, which makes order reads slow. In that case you can also shard by user ID to build a separate set of user-order tables.

Data closed loop: take the product detail page. Because it has too many data sources, many factors can affect the stability of the service. The best approach is to store all the data it uses heterogeneously and form a closed data loop. Steps: heterogeneous data (receive data changes through an MQ mechanism and store them in a suitable storage engine, such as Redis) -> data aggregation (optional) -> front-end display (fetch the data with one call, or just a few calls).

Cache silver bullet:

Process node: cache technology
Client: browser cache; client application cache
Client network: proxy server with caching enabled
Wide area network: proxy servers (including CDN); mirror servers; P2P technology
Origin site and origin network: caching at the access layer (Nginx); caching at the application layer (in-heap cache, off-heap cache, local Redis cache); distributed cache; static and pseudo-static pages; caching mechanisms of the server operating system

2. High availability principle

Degradation, rate limiting, traffic switching, and rollback.

Design ideas for degradation switches: centralized switch management; multi-level read services that can be degraded; placing switches up front (e.g. at the Nginx layer); business degradation (asynchronous calls, prioritizing high-priority data).

3. Business design principles

Anti-duplication design, idempotent design, process definition, states and state machines, feedback from back-office systems, approval in back-office systems, documents and notes, and backups.

 

High availability

4. Isolation

Thread isolation, process isolation, cluster isolation, data center (machine room) isolation, read-write isolation, dynamic/static isolation, crawler isolation, hotspot isolation, and resource isolation.

Thread isolation: thread pools can be partitioned by service level, with a separate pool for each level;

Cluster isolation: for example, when a flash-sale (seckill) service may affect the stability of other systems, consider giving it a dedicated service cluster;

 

Request isolation based on Servlet3:

Thread model before Servlet3

Introducing Servlet3

Business thread pool isolation

Asynchronization using Servlet3

Asynchronization increases throughput and gives us the flexibility we need, but it does not improve response time.
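A minimal sketch of Servlet 3 asynchronous processing with an isolated business thread pool (the servlet name, URL and pool size are illustrative assumptions, not from the original text):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import javax.servlet.AsyncContext;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// asyncSupported = true frees the container (Tomcat) thread immediately;
// the business logic runs on our own, isolated thread pool.
@WebServlet(urlPatterns = "/order", asyncSupported = true)
public class OrderServlet extends HttpServlet {
    // Pool dedicated to this service level; the size is an illustrative value
    private static final ExecutorService ORDER_POOL = Executors.newFixedThreadPool(20);

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) {
        AsyncContext ctx = req.startAsync();
        ctx.setTimeout(3000);                    // fail fast instead of hanging the request
        ORDER_POOL.execute(() -> {
            try {
                ctx.getResponse().getWriter().write("ok");   // business logic goes here
            } catch (Exception e) {
                // log and/or degrade
            } finally {
                ctx.complete();                  // hand the response back to the container
            }
        });
    }
}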

 

5. Rate limiting

  • Rate-limiting algorithms: token bucket, leaky bucket, and counter.

Token bucket: limits the average inflow rate while allowing a certain degree of burst;

Leaky bucket: used for traffic shaping; its main purpose is to smooth the outflow rate;

Counter: limits the total number of concurrent requests.

 

  • Application-level rate limiting: limit the total number of concurrent connections/requests, limit the total usage of a resource, limit the total concurrency/requests of an interface, limit the number of requests to an interface within a time window (use Guava's Cache to store counters with an expiration time), and smooth the request rate of an interface (Guava RateLimiter provides the token bucket algorithm). A sketch of the last two follows.
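A minimal sketch of the last two techniques using Guava (class name, limits and values are illustrative assumptions):

import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import com.google.common.util.concurrent.RateLimiter;

public class ApplicationLevelLimiter {
    // Time-window counter: the key is the current second; old entries expire quickly
    private final LoadingCache<Long, AtomicLong> counter = CacheBuilder.newBuilder()
            .expireAfterWrite(2, TimeUnit.SECONDS)
            .build(CacheLoader.from(AtomicLong::new));

    // Token bucket: at most 100 permits per second for this interface (illustrative)
    private final RateLimiter rateLimiter = RateLimiter.create(100);

    public boolean allowByTimeWindow(long limitPerSecond) throws Exception {
        long currentSecond = System.currentTimeMillis() / 1000;
        return counter.get(currentSecond).incrementAndGet() <= limitPerSecond;
    }

    public boolean allowByTokenBucket() {
        // Non-blocking: reject or degrade the request when no token is available
        return rateLimiter.tryAcquire();
    }
}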

 

  • Distributed rate limiting: the key is to make the rate-limiting logic atomic. Typical solutions are Redis+Lua or Nginx+Lua. If the application's concurrent traffic is very large, consider sharding with consistent hashing and degrading distributed rate limiting to application-level rate limiting.

Redis+Lua:
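A minimal sketch of a fixed-window counter, assuming a Jedis client (key names and limits are illustrative); the Lua script runs atomically inside Redis:

import java.util.Arrays;
import redis.clients.jedis.Jedis;

public class RedisRateLimiter {
    // INCR + EXPIRE + compare, executed atomically in one Lua script
    private static final String SCRIPT =
            "local current = redis.call('incr', KEYS[1]) " +
            "if current == 1 then redis.call('expire', KEYS[1], ARGV[1]) end " +
            "if current > tonumber(ARGV[2]) then return 0 end " +
            "return 1";

    public boolean allow(Jedis jedis, String key, int windowSeconds, int limit) {
        Object result = jedis.eval(SCRIPT,
                Arrays.asList(key),
                Arrays.asList(String.valueOf(windowSeconds), String.valueOf(limit)));
        return Long.valueOf(1L).equals(result);
    }
}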

Nginx+Lua: use the lua-resty-lock mutex to solve atomicity problems, and implement counters with ngx.shared.DICT shared dictionaries.

 

  • Access-layer rate limiting:

For rate limiting at the Nginx access layer, you can use the connection-count limiting module ngx_http_limit_conn_module and the leaky-bucket request limiting module ngx_http_limit_req_module; you can also use OpenResty's Lua rate-limiting module lua-resty-limit-traffic to handle more complex scenarios.

 

  • Front end: throttling and debouncing

 

6. Degradation techniques

  • Automatic switch degradation: degrade automatically based on system load, resource usage, SLA and other indicators; this includes timeout degradation, degradation after a number of failures, fault degradation, and rate-limit degradation.
  • Manual switch degradation
  • Read service degradation: policies include temporarily switching reads (degrading to a read cache, degrading to static pages) and temporarily blocking reads (blocking a read entry point, blocking a specific read service).
  • Write service degradation: in most scenarios writes cannot be degraded. However, some roundabout tactics can help, such as reducing synchronous writes to asynchronous writes.
  • Multi-level degradation:

Multi-level degradation switches: page/JS degradation switches, access-layer degradation switches, and application-layer degradation switches.

Workflow degradation: give priority to high-priority data, process only data with certain characteristics, and allocate traffic where it is most needed.

Examples:

1) If malicious-order verification is unavailable, it can be bypassed directly or switched to asynchronous processing;

2) If order-production performance drops but orders can still be processed, give priority to high-level orders and to data with simple processing logic (e.g. a single product, single item);

3) When dispatching orders, if warehouse load is saturated, reduce the volume shipped to the JD warehouse and increase the volume shipped to other destinations.

  • Configuration center
  • Degrading with Hystrix
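A minimal sketch of degrading a read service with Hystrix (group key, command name and the fallback value are illustrative assumptions):

import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;

public class GetProductCommand extends HystrixCommand<String> {
    private final String skuId;

    public GetProductCommand(String skuId) {
        // Commands in the same group share a thread pool by default (thread isolation)
        super(HystrixCommandGroupKey.Factory.asKey("productGroup"));
        this.skuId = skuId;
    }

    @Override
    protected String run() throws Exception {
        // Normal path: call the remote service / DB
        return remoteProductService(skuId);
    }

    @Override
    protected String getFallback() {
        // Degraded path on timeout, exception, thread-pool rejection or open circuit
        return "{}";
    }

    private String remoteProductService(String skuId) {
        return "{\"skuId\":\"" + skuId + "\"}";   // placeholder for the real call
    }
}

// usage: String json = new GetProductCommand("123").execute();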

 

7. Timeout and retry mechanism

Proxy-layer timeouts and retries, Web container timeouts, middleware and client timeouts and retries, database client retries, business-level timeouts, and front-end Ajax timeouts.

Note: timeouts at higher layers depend on the timeouts of the layers below them.

1) JDBC timeout settings: connectTimeout (time to wait when establishing the socket connection to the MySQL database; the default 0 means no limit if not set), socketTimeout (timeout for socket reads/writes after the connection between the client and MySQL is established; if not set, the Linux default is about 30 minutes);

2) Connection pool timeout setting: the default value is 0, which means unlimited waiting; the unit is milliseconds, and 60000 is recommended;

3) MyBatis query timeout: defaultStatementTimeout (the default query timeout in the MyBatis configuration file, in seconds; if not set, it waits indefinitely). If some SQL statements need a different timeout, it can be set separately in the Mapper file;

4) Transaction timeout: the total time allowed for the code executed within a transaction, in seconds.
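A minimal sketch of the JDBC-level timeouts from 1), assuming MySQL Connector/J (URL, credentials and values are illustrative):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class JdbcTimeoutExample {
    public static void main(String[] args) throws Exception {
        // connectTimeout: ms to establish the socket; socketTimeout: ms to wait on socket read/write
        String url = "jdbc:mysql://127.0.0.1:3306/demo"
                + "?connectTimeout=3000&socketTimeout=5000";
        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             PreparedStatement ps = conn.prepareStatement("SELECT 1")) {
            ps.setQueryTimeout(3);   // statement-level timeout in seconds (cf. MyBatis defaultStatementTimeout)
            ps.executeQuery();
        }
    }
}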

  • Common strategies for handling timeouts: retry, remove unhealthy nodes, fall back to a bottom-line (default) value, show a waiting page or an error page.
  • For non-idempotent write services, retries should be avoided; alternatively, consider generating a unique serial number in advance to achieve idempotence.
  • Database/cache servers should be checked regularly for slow operations, and service degradation should be considered when timeouts become severe.

 

8. Rollback mechanism

Transaction rollback, code base rollback, deployment version rollback, data version rollback and static resource version rollback.

Transaction rollback: in most distributed-transaction scenarios, aim for eventual consistency rather than strong consistency. The common two-phase commit and three-phase commit protocols make rollback relatively easy but have a large impact on performance. Consider mechanisms such as transaction tables, message queues, compensation, and the TCC pattern (try/confirm/cancel) to achieve eventual consistency.

 

9. Stress testing and contingency plans

  • System stress testing:

Before a stress test there should be a test plan covering the interfaces to test, the concurrency level, the test strategy (sudden burst, gradual ramp-up, concurrency levels) and the metrics (machine load, QPS/TPS, response time); afterwards, produce a stress-test report, then optimize and prepare for disaster tolerance.

During a stress test, choose discrete (spread-out) data; testing only hot data does not reflect the system's real processing capacity. In addition, run full-link stress tests to surface problems caused by calls to non-core services or by resource contention between systems.

 

Offline stress testing: targets a single interface; its fidelity to production is low;

Online stress testing: classified by read/write (read tests, write tests, mixed tests), by data fidelity (simulated tests and traffic-replay tests), and by whether users are still being served (isolated-cluster tests and live online tests).

Simulated stress test: pressure the system with simulated requests; the data can be generated programmatically, constructed by hand, or taken from Nginx access logs. Traffic-replay stress test: use TCPCopy to copy real online traffic and replay it against the stress-test cluster; the traffic can also be amplified N times.

Isolated-cluster stress test: remove some servers that serve external traffic from the online cluster, then route online traffic to them for testing; this is relatively safe. Live online stress testing works by gradually reducing the number of online servers during off-peak periods, which is very risky.

 

  • System tuning: optimization and capacity expansion.

 

  • Emergency plans: classify systems, analyze the full link, configure monitoring and alarms, and finally formulate the emergency plans.

Systems can be classified into core systems and supporting systems.

Example of full link emergency plan:

Network access layer: mainly focus on data center unavailability, DNS failures, and VIP failures;

Application access layer: mainly focus on plans for upstream application route switching, rate limiting, degradation, isolation, and so on;

Web application layer: mainly focus on route switching for service dependencies, connection pool exceptions, rate limiting, timeout degradation, degradation on service exceptions, abnormal application load, database failover, cache failover, etc.;

Data layer: mainly focus on high database/cache load, database/cache failures, etc.

 

10. Application level cache

  • Java cache type

Heap cache: the advantage is that there is no serialization/deserialization, so it is the fastest; the disadvantage is that it is limited by the heap size, and large amounts of cached data lead to long GC pauses;

Off-heap cache: it reduces GC pause times and is limited only by machine memory; the disadvantage is that reads require serialization/deserialization, making it much slower than the heap cache;

Disk cache: data survives a JVM restart;

Distributed cache: solves two problems of single-machine caching: limited single-machine capacity and data consistency across machines.

 

Guava: provides only a heap cache; small and flexible, with the best performance;

Ehcache 3.x: provides heap, off-heap, disk, and distributed caches;

MapDB: an embedded Java database engine and collection framework; it provides Maps, Sets, Lists, Queues and Bitmaps, as well as ACID transactions, incremental backups, and heap / off-heap / disk caches.

 

  • Implementation

1) Heap cache

Guava Cache implementation:

Cache<String, String> myCache = CacheBuilder.newBuilder()
        .concurrencyLevel(4)
        .expireAfterWrite(10, TimeUnit.SECONDS)
        .maximumSize(10000)
        .build();

Cache eviction policies:

Capacity-based: maximumSize sets the cache capacity; when it is exceeded, entries are evicted by LRU;

Time-based: expireAfterWrite sets a TTL (timeToLiveSeconds) and expires cached data after a fixed period; expireAfterAccess sets a TTI (timeToIdleSeconds), which may allow dirty data to survive for a long time;

Based on Java object references: weakKeys/weakValues (weak-reference cache), softValues (soft-reference cache);

Active invalidation: invalidate(Object key) / invalidateAll(Iterable<?> keys) / invalidateAll() actively invalidate some or all cached data;

Concurrency level: concurrencyLevel.

Hit-rate statistics: recordStats (see the sketch below).
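A minimal sketch of reading the hit rate when recordStats() is enabled (class name and values are illustrative):

import java.util.concurrent.TimeUnit;
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheStats;

public class CacheStatsExample {
    public static void main(String[] args) {
        Cache<String, String> cache = CacheBuilder.newBuilder()
                .maximumSize(10000)
                .expireAfterWrite(10, TimeUnit.SECONDS)
                .recordStats()                // enable hit/miss counters
                .build();

        cache.getIfPresent("k");              // miss
        cache.put("k", "v");
        cache.getIfPresent("k");              // hit

        CacheStats stats = cache.stats();
        System.out.println(stats.hitRate());  // 0.5 for the two lookups above
    }
}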

 

Ehcache 3.x implementation:

CacheManager cacheManager = CacheManagerBuilder.newCacheManagerBuilder().build(true);

CacheConfigurationBuilder<String, String> cacheConfig = CacheConfigurationBuilder
        .newCacheConfigurationBuilder(String.class, String.class,
                ResourcePoolsBuilder.newResourcePoolsBuilder().heap(100, EntryUnit.ENTRIES))
        .withDispatcherConcurrency(4)
        .withExpiry(Expirations.timeToLiveExpiration(Duration.of(10, TimeUnit.SECONDS)));

Cache<String, String> myCache = cacheManager.createCache("myCache", cacheConfig);

Cache recycle policy:

Capacity-based: heap(100, EntryUnit.ENTRIES) sets the number of cached entries; entries beyond that are evicted by LRU;

Space-based: heap(100, MemoryUnit.MB) sets the memory budget of the cache; when it is exceeded, entries are evicted by LRU. In addition, you should set withSizeOfMaxObjectGraph(2), the traversal depth of the object graph used when sizing objects, and withSizeOfMaxObjectSize(1, MemoryUnit.KB), the maximum size of an object that may be cached;

Time-based: withExpiry(Expirations.timeToLiveExpiration(Duration.of(10, TimeUnit.SECONDS))) sets a TTL with no TTI; withExpiry(Expirations.timeToIdleExpiration(Duration.of(10, TimeUnit.SECONDS))) sets TTL and TTI at the same time, with both using the same value;

Active invalidation: remove(K key) / removeAll(Set<? extends K> keys) / clear();

Concurrency level: withDispatcherConcurrency;

Hit-rate statistics: none.

 

MapDB 3.x implementation: rarely used, so omitted here.

Cache eviction policies: capacity-based and time-based; TTI can be set; active invalidation is supported;

Concurrency level setting: supported.

Hit-rate statistics: none.

 

2) Out of heap cache

Ehcache 3.x implementation: simply replace the heap pool with an offheap pool (a sketch follows); note that capacity-based (entry-count) eviction is not supported off-heap;
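A minimal sketch of an Ehcache 3.x off-heap pool (sizes are illustrative); the JVM needs a sufficiently large -XX:MaxDirectMemorySize for the off-heap tier:

import java.util.concurrent.TimeUnit;
import org.ehcache.Cache;
import org.ehcache.CacheManager;
import org.ehcache.config.builders.CacheConfigurationBuilder;
import org.ehcache.config.builders.CacheManagerBuilder;
import org.ehcache.config.builders.ResourcePoolsBuilder;
import org.ehcache.config.units.EntryUnit;
import org.ehcache.config.units.MemoryUnit;
import org.ehcache.expiry.Duration;
import org.ehcache.expiry.Expirations;

public class OffHeapCacheExample {
    public static void main(String[] args) {
        CacheManager cacheManager = CacheManagerBuilder.newCacheManagerBuilder().build(true);

        Cache<String, String> offHeapCache = cacheManager.createCache("offHeapCache",
                CacheConfigurationBuilder.newCacheConfigurationBuilder(String.class, String.class,
                        ResourcePoolsBuilder.newResourcePoolsBuilder()
                                .heap(100, EntryUnit.ENTRIES)     // small hot tier on the heap
                                .offheap(100, MemoryUnit.MB))     // larger tier off the heap
                        .withExpiry(Expirations.timeToLiveExpiration(Duration.of(10, TimeUnit.SECONDS))));

        offHeapCache.put("k", "v");
        System.out.println(offHeapCache.get("k"));
    }
}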

MapDB 3.x implementation: omitted;

 

  • Application level cache example

1) Multi level cache API encapsulation

Local cache initialization:

public class LocalCacheInitService extends BaseService {
    @Override
    public void afterPropertiesSet() throws Exception {
        // Item category cache
        Cache<String, Object> categoryCache =
            CacheBuilder.newBuilder()
                .softValues()
                .maximumSize(1000000)    // capacity value is illustrative
                .expireAfterWrite(Switches.CATEGORY.getExpiresInSeconds() / 2, TimeUnit.SECONDS)
                .build();
        addCache(CacheKeys.CATEGORY_KEY, categoryCache);
    }

    private void addCache(String key, Cache<?, ?> cache) {
        localCacheService.addCache(key, cache);
    }
}

Tips:

Set the local cache's expiration time to half that of the distributed cache;

Associate each cache key prefix with its corresponding local cache.

 

2) Write cache API encapsulation

public void set(final String key, final Object value, final int remoteCacheExpiresInSeconds) throws RuntimeException {
    if (value == null) {
        return;
    }

    // Copy the value object:
    // the local cache holds a reference, while the distributed cache needs serialization.
    // Without a copy, changing the data afterwards would leave the local cache
    // inconsistent with the distributed cache.
    final Object finalValue = copy(value);
    // If writing to the local cache is configured, look up the local cache for the key and write
    if (writeLocalCache) {
        Cache localCache = getLocalCache(key);
        if (localCache != null) {
            localCache.put(key, finalValue);
        }
    }
    // If writing to the distributed cache is not configured, return directly
    if (!writeRemoteCache) {
        return;
    }
    // Update the distributed cache asynchronously
    asyncTaskExecutor.execute(() -> {
        try {
            redisCache.set(
                key,
                JSONUtils.toJSON(finalValue),
                remoteCacheExpiresInSeconds);
        } catch (Exception e) {
            LOG.error("update redis cache error, key : {}", key, e);
        }
    });
}

 

3) Read cache API encapsulation

private Map<String, Object> innerMget(List<String> keys, List<Class> types) throws Exception {
    Map<String, Object> result = Maps.newHashMap();
    List<String> missKeys = Lists.newArrayList();
    List<Class> missTypes = Lists.newArrayList();
    // If the local cache is configured, read the local cache first
    if (readLocalCache) {
        for (int i = 0; i < keys.size(); i++) {
            String key = keys.get(i);
            Class type = types.get(i);
            Cache localCache = getLocalCache(key);
            if (localCache != null) {
                Object value = localCache.getIfPresent(key);
                result.put(key, value);
                if (value == null) {
                    missKeys.add(key);
                    missTypes.add(type);
                }
            } else {
                missKeys.add(key);
                missTypes.add(type);
            }
        }
    } else {
        // Without a local cache, every key goes to the distributed cache
        missKeys.addAll(keys);
        missTypes.addAll(types);
    }
    // If reading the distributed cache is not configured, return
    if (!readRemoteCache) {
        return result;
    }
    final Map<String, String> missResult = Maps.newHashMap();

    // Partition the keys; do not fetch too many in a single call
    final List<List<String>> keysPage = Lists.partition(missKeys, 10);
    List<Future<Map<String, String>>> pageFutures = Lists.newArrayList();

    try {
        // Fetch the distributed cache data in batches
        for (final List<String> partitionKeys : keysPage) {
            pageFutures.add(asyncTaskExecutor.submit(
                                () -> redisCache.mget(partitionKeys)));
        }
        for (Future<Map<String, String>> future : pageFutures) {
            missResult.putAll(future.get(3000, TimeUnit.MILLISECONDS));
        }
    } catch (Exception e) {
        pageFutures.forEach(future -> future.cancel(true));
        throw e;
    }
    // Deserializing missResult (by type) into result, and writing it back
    // to the local cache, are omitted here
    return result;
}

4)NULL Cache

// A sentinel object representing "no data in the DB" (compared by reference)
private static final String NULL_STRING = new String();

// Query the DB
String value = loadDB();
// If the DB has no data, store the NULL_STRING sentinel in the cache
if (value == null) {
    value = NULL_STRING;
}
myCache.put(id, value);


// When reading
value = myCache.getIfPresent(id);
// A cached NULL sentinel means "no data": return null
if (value == NULL_STRING) {
    return null;
}

5) Force to get the latest data

Whether to force a cache refresh is decided by a ThreadLocal switch:

if(ForceUpdater.isForceUpdateMyInfo()) {
    myCache.refresh(skuId);
}
String result = myCache.get(skuId);
if (result == NULL_STRING) {
    return null;
}

6) Failure statistics and delay alarm (omitted)

 

  • Cache usage pattern practice

1) Cache-aside: business code is written around the cache, and the business code maintains the cache directly.

// 1. Read scenario: try the cache first
value = myCache.getIfPresent(key);
if (value == null) {
    // 2.1. On a cache miss, go back to the source (SoR) for the data
    value = loadFromSoR(key);
    // 2.2. Put the data into the cache so the next read can be served from it
    myCache.put(key, value);
}

// 2. Write scenario: update the cache immediately after writing to the SoR
writeToSoR(key, value);
myCache.put(key, value);

// 3. Write scenario: invalidate the cache after writing to the SoR
writeToSoR(key, value);
myCache.invalidate(key);

Concurrent updates can leave the cache inconsistent:

For user-dimension data, concurrent updates are very rare and can usually be ignored;

For product data, you can subscribe to the binlog to update the cache, accepting a small update delay;

For read-service scenarios, consistent hashing can route the same key to the same instance to reduce the probability of concurrent updates, or you can simply set a relatively short expiration time.

2) Cache-as-SoR: the cache is treated as the SoR; all operations are performed against the cache, which delegates the real reads and writes to the SoR (for example read-through via a cache loader). The original text implements this with Guava and Ehcache; the source code is omitted here.
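A minimal sketch of a read-through cache-as-SoR using Guava's CacheLoader (loadFromSoR stands for the same back-end read as in the cache-aside example; values are illustrative):

import java.util.concurrent.TimeUnit;
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

public class ReadThroughExample {
    // Business code only talks to the cache; the cache delegates misses to the SoR
    private final LoadingCache<String, String> cache = CacheBuilder.newBuilder()
            .maximumSize(10000)
            .expireAfterWrite(10, TimeUnit.SECONDS)
            .build(new CacheLoader<String, String>() {
                @Override
                public String load(String key) throws Exception {
                    return loadFromSoR(key);   // back to the source on a miss
                }
            });

    public String get(String key) throws Exception {
        return cache.get(key);                 // read-through: loads and caches on a miss
    }

    private String loadFromSoR(String key) {
        return "value-for-" + key;             // placeholder for a DB / service call
    }
}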

 

11. HTTP cache

  • HTTP cache (source code omitted)

1) The server responds with Last-Modified; the client then sends the If-Modified-Since request header so the server can verify whether the document has changed, and the server returns 304 if it has not;

2) Cache-Control: max-age (HTTP/1.1) and Expires (HTTP/1.0) determine how long the browser caches the content; the former has higher priority;

3) In general, Expires = current system time + cache time (Cache-Control: max-age);

4) ETag (HTTP/1.1) can be used to determine whether the page content has been modified (see the sketch below).
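A minimal server-side sketch of points 1)-4) using the Servlet API (header values and the ETag scheme are illustrative):

import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class CachedPageServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        String body = "hello";
        String etag = "\"" + body.hashCode() + "\"";    // ETag derived from the content

        // If the client already holds this version, answer 304 without a body
        if (etag.equals(req.getHeader("If-None-Match"))) {
            resp.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
            return;
        }

        long now = System.currentTimeMillis();
        resp.setHeader("ETag", etag);
        resp.setHeader("Cache-Control", "max-age=60");      // HTTP/1.1, takes precedence
        resp.setDateHeader("Expires", now + 60 * 1000L);    // HTTP/1.0: now + cache time
        resp.setDateHeader("Last-Modified", now);
        resp.getWriter().write(body);
    }
}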

 

  • HttpClient client cache

HttpClient provides HTTP/1.1-compatible client caching starting with version 4.3 (HTTP/1.0 caching is not implemented).

To use the HttpClient client cache, add the following dependency:

<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient-cache</artifactId>
    <version>4.5.2</version>
</dependency>

Configuration:

CacheConfig cacheConfig = CacheConfig.custom()
    .setMaxCacheEntries(1000)    // Cache up to 1000 entries
    .setMaxObjectSize(1 * 1024 * 1024)    // The maximum cache object size is 1MB
    .setAsynchronousWorkersCore(5)    // Asynchronous update cache thread pool minimum number of idle threads
    .setAsynchronousWorkersMax(10)    // Maximum threads of asynchronous update cache thread pool
    .setRevalidationQueueSize(10000)    // Asynchronous update thread pool queue size
    .build();
// Cache storage
HttpCacheStorage cacheStorage = new BasicHttpCacheStorage(cacheConfig);
// Create HttpClient
httpClient = CachingHttpClients.custom()
    .setCacheConfig(cacheConfig)    // Cache configuration
    .setHttpCacheStorage(cacheStorage)    // Cache storage
    .setSchedulingStrategy(new ImmediateSchedulingStrategy(cacheConfig))    //Cache scheduling policy when validating cache
    .setConnectionManager(manager)
    .build();

BasicHttpCacheStorage stores entries in memory (a simple LRU implemented with a LinkedHashMap). Ehcache- and memcached-backed storage implementations are also provided. There is no time-based expiration policy, so Ehcache is recommended in practice.

ImmediateSchedulingStrategy creates a thread pool from the configured parameters and then issues revalidation requests asynchronously.

For the cache implemented by Ehcache, please refer to: https://www.cnblogs.com/dehai/p/5063106.html

 

  • Nginx Http cache
  • Nginx proxy layer cache

 

12. Multilevel cache

  • Overall process

1) Traffic reaches the access-tier Nginx, which load-balances requests to the application Nginx using round-robin or consistent hashing;

2) The application Nginx reads its local cache (implemented with a Lua shared dict, Nginx proxy cache, or a local Redis) to reduce back-end pressure, especially for hotspot problems;

3) If the Nginx local cache misses, read the distributed cache and write the result back to the Nginx local cache;

4) If the distributed cache misses, fall back to the Tomcat cluster, again using round-robin or consistent hashing;

5) In the Tomcat application, read the local heap cache first; if it hits, return the data and write it to the main Redis cluster;

6) As an optional step, you can query the main Redis cluster again, to avoid hitting the back end when the slave cluster has problems;

7) If all caches miss, query the DB or the relevant services to get the data and return it;

8) The data returned in step 7) is written asynchronously to the main Redis cluster; note that multiple Tomcat instances may write at the same time, which can produce out-of-order (dirty) data.

 

  • How to cache data

1) Expiring and non-expiring caches

For cached data, consider both non-expiring caches and caches with an expiration time.

Non-expiring cache scenarios: long-tail data, data accessed with high frequency in most scenarios, or situations where cache space is plentiful; examples are users, categories, products, prices and orders. When the cache is full, use an LRU mechanism to evict old entries. Use the cache-aside pattern;

Expiring cache mechanism: generally used for caching data from other systems (where you cannot subscribe to change messages, or the cost of doing so is very high), for limited cache space, for low-frequency hotspot caches, and similar scenarios.

2) Dimensional cache and incremental cache

3) Large Value cache

4) Hotspot cache

 

  • Distributed caching and application load balancing

 

  • Hotspot data and update cache

1) Single-machine full cache + master-slave

The full data set is cached on each application's local machine. After going back to the source, data is written to the master Redis cluster and then replicated to the slave Redis clusters via master-slave replication. Cache updates can be done by lazy loading or by subscribing to change messages.

2) Distributed cache + application-local hotspot cache

On top of the distributed cache, cache hotspots in the Nginx+Lua application layer to reduce the load on the Redis cluster: first query the application-local cache; if it hits, return directly; if it misses, query the Redis cluster, return the data to Tomcat, and then cache it locally in the application.

 

  • Update cache and atomicity

1) Use an update timestamp or version comparison when updating data; if Redis is used, its single-threaded execution can be leveraged for atomic updates;

2) Use a tool such as Canal to subscribe to the database binlog;

3) Distribute update requests to several queues according to a routing rule, consume each queue with a single thread, and pull the latest data before saving when applying an update;

4) Use distributed locks.

 

  • Cache crash and quick fix

1) Modulo hashing: one failed instance causes a large number of cache misses at once; avoid this with a master-slave setup;

2) Consistent hashing: one failed instance affects only part of the cache (see the sketch below);

3) Quick repair: master-slave mechanisms, request degradation, and cache rebuilding.
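A minimal sketch of consistent hashing with virtual nodes (the hash function and node count are illustrative), showing why losing one instance only remaps part of the keys:

import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

public class ConsistentHash {
    private final TreeMap<Integer, String> ring = new TreeMap<>();
    private static final int VIRTUAL_NODES = 100;   // virtual nodes per instance

    public ConsistentHash(List<String> nodes) {
        for (String node : nodes) {
            addNode(node);
        }
    }

    public void addNode(String node) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    public void removeNode(String node) {
        // Only the keys that mapped to this node's virtual nodes move elsewhere
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.remove(hash(node + "#" + i));
        }
    }

    public String getNode(String key) {
        SortedMap<Integer, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    private int hash(String s) {
        // Illustrative hash; a production ring would use a better-distributed hash (e.g. murmur)
        return s.hashCode() & 0x7fffffff;
    }
}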

 

13. Detailed explanation of connection pool and thread pool

  • Database connection pool
  • HttpClient connection pool: it is only a real connection pool when long connections (keep-alive) are enabled; with short connections it merely acts as a semaphore limiting the total number of requests, and connections are not reused. Remember to close the connection pool and release its connections when the JVM stops or restarts. HttpClient is thread safe, so do not create a new instance per request. If the pool would need to be very large, consider creating several HttpClient instances instead. When using the connection pool, consume the response body and release the connection as soon as possible.
  • Thread pool:

Java thread pools: when using a thread pool, always set the pool size and the queue size, and configure an appropriate rejection policy (a sketch follows);
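A minimal sketch of such a pool (sizes, queue length and policy are illustrative):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BusinessThreadPool {
    // Bounded pool and bounded queue plus an explicit rejection policy,
    // so overload fails fast instead of exhausting memory or hanging.
    public static final ThreadPoolExecutor EXECUTOR = new ThreadPoolExecutor(
            10,                                    // core pool size
            20,                                    // maximum pool size
            60, TimeUnit.SECONDS,                  // keep-alive for idle non-core threads
            new ArrayBlockingQueue<>(1000),        // bounded work queue
            new ThreadPoolExecutor.AbortPolicy()); // reject when the pool and queue are full
}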

Tomcat thread pool
