Building an Elasticsearch hot-cold separation cluster

Introduction to the hot-cold separation architecture

Hot-cold separation is currently a very popular Elasticsearch (ES) architecture. It takes advantage of the differing hardware of the machines in a cluster to schedule and allocate resources. Indexing and query speed in an ES cluster depends mainly on disk I/O, so the key point of hot-cold separation is to keep hot data on solid-state disks. Using solid-state disks for everything is too expensive, and it is wasteful for cold data. Combining ordinary mechanical disks with solid-state disks makes full use of resources while still improving performance considerably. We can therefore store real-time data (the last 5 days) on hot nodes and historical data (older than 5 days) on cold nodes, and use ES features to migrate data from the hot nodes to the cold nodes according to its age. Because we create one index per day here, the migration is more convenient.
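The age-based selection described above can be sketched in shell. This is only an illustration: it assumes daily indices named with a zero-padded date suffix such as logs-2022.05.06 (a hypothetical naming scheme, not one stated in this article), and the function name older_than is made up for the example.

```shell
#!/bin/sh
# Sketch only: assumes hypothetical daily index names like logs-2022.05.06.
# Reads index names on stdin and prints those dated before the cutoff.
older_than() {
  cutoff="$1"
  while read -r name; do
    day="${name##*-}"   # text after the last "-", e.g. 2022.05.01
    # Zero-padded YYYY.MM.DD strings sort chronologically as plain text.
    if expr "$day" \< "$cutoff" >/dev/null; then
      echo "$name"
    fi
  done
}
```

A possible use, assuming GNU date is available for the cutoff calculation: curl -s 'http://192.168.80.100:9200/_cat/indices/logs-*?h=index' | older_than "$(date -d '5 days ago' +%Y.%m.%d)". The names it prints are the candidates for migration to the cold nodes.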

Example:

With hot-cold separation, an index is first created on the hot nodes and migrated to the cold nodes after a certain time. The number of shards therefore has to be chosen with the number of hot nodes in mind.
For example, suppose we have 6 hot nodes and 9 cold nodes, and the primary shards of an index hold about 500 GB of data. The index is created with 18 shards, all of them on hot nodes; 18 divides evenly by both 6 and 9, which gives 3 shards per hot node and later 2 shards per cold node. At this point the shard distribution of the index is hot nodes: 18, cold nodes: 0. Once the data is no longer hot, all shards of the index are migrated to the cold nodes, and the distribution becomes hot nodes: 0, cold nodes: 18.
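The arithmetic behind this example can be checked with a few lines of shell. This is a rough sketch that counts primary shards only and ignores replicas; the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: primary shards, no replicas.
# Shards each node holds when the shards are spread evenly (rounded up).
shards_per_node() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# Approximate size of one primary shard in GB (integer division).
gb_per_shard() {
  echo $(( $1 / $2 ))
}

shards_per_node 18 6   # hot tier: prints 3
shards_per_node 18 9   # cold tier: prints 2
gb_per_shard 500 18    # prints 27
```

With 18 shards, both tiers are loaded evenly and each shard stays under 30 GB.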

Implementing the Elasticsearch hot-cold separation architecture

The Elasticsearch hot-cold separation architecture is a design approach. It is implemented with Elasticsearch's allocation routing (shard allocation filtering): tag each data node with a custom attribute, which gives two kinds of data nodes (here node.attr.box_type is set to hot or cold), then specify which kind of node an index should be allocated to when creating it. After a period of time, migrate the data of these indices to the other data nodes according to business requirements.
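The two steps described above (create the index on hot nodes, later move it to cold nodes) might look like the following sketch. It uses the index.routing.allocation.require.* index setting together with the box_type attribute from this setup. The function names and any index name passed to them are hypothetical, and the ES address is the one used elsewhere in this article.

```shell
#!/bin/sh
# Sketch only: box_type and its hot/cold values come from the node settings
# in this article; the function names are illustrative.
ES_URL="${ES_URL:-http://192.168.80.100:9200}"

# Print the index setting that requires nodes tagged with the given box_type.
allocation_body() {
  printf '{"index.routing.allocation.require.box_type":"%s"}' "$1"
}

# Create an index whose shards may only be allocated to hot nodes.
create_hot_index() {
  curl -s -XPUT "$ES_URL/$1" -H 'Content-Type: application/json' \
    -d "{\"settings\":$(allocation_body hot)}"
}

# Retag an existing index as cold; ES then relocates its shards to cold nodes.
move_to_cold() {
  curl -s -XPUT "$ES_URL/$1/_settings" -H 'Content-Type: application/json' \
    -d "$(allocation_body cold)"
}
```

For example, create_hot_index logs-2022.05.06 on the day the index is created, and move_to_cold logs-2022.05.06 once the data is no longer hot (the index name is made up). Changing the setting is enough; no data has to be copied by hand.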

Node name   Port mapping (host:container)   Label   Remarks
es01        9200:9200, 9300:9300            hot     -
es02        9201:9200, 9301:9300            hot     -
es03        9202:9200, 9302:9300            hot     -
es04        9203:9200, 9303:9300            cold    -
es05        9204:9200, 9304:9300            cold    -
es06        9205:9200, 9305:9300            cold    -
kibana      5601:5601                       -       Graphical management tool
cerebro     9000:9000                       -       Monitoring tool
eshead      9100:9100                       -       elasticsearch-head tool

 

#docker-compose.yml

version: '3.7'
services:
  es01:
    image: elasticsearch:7.17.3-ik
    container_name: es01
    #restart: always
    environment:
      - node.name=es01
      - network.publish_host=es01
      - cluster.name=es-docker-cluster
      - cluster.initial_master_nodes=es01,es02,es03,es04,es05,es06
      - "discovery.seed_hosts=es02,es03,es04,es05,es06"
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms256M -Xmx256M"
      - node.attr.box_type=hot
    ulimits:
      nproc: 65535
      memlock:
        soft: -1
        hard: -1
    volumes: 
      - /data/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
      - /data/es1:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
      - 9300:9300
 
  es02:
    image: elasticsearch:7.17.3-ik
    container_name: es02
    #restart: always
    environment:
      - node.name=es02
      - network.publish_host=es02
      - cluster.name=es-docker-cluster
      - cluster.initial_master_nodes=es01,es02,es03,es04,es05,es06
      - "discovery.seed_hosts=es01,es03,es04,es05,es06"
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms256M -Xmx256M"
      - node.attr.box_type=hot
    ulimits:
      nproc: 65535
      memlock:
        soft: -1
        hard: -1
    volumes: 
      - /data/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
      - /data/es2:/usr/share/elasticsearch/data
    ports:
      - 9201:9200
      - 9301:9300
 
  es03:
    image: elasticsearch:7.17.3-ik
    container_name: es03
    #restart: always
    environment:
      - node.name=es03
      - network.publish_host=es03
      - cluster.name=es-docker-cluster
      - cluster.initial_master_nodes=es01,es02,es03,es04,es05,es06
      - "discovery.seed_hosts=es01,es02,es04,es05,es06"
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms256M -Xmx256M" 
      - node.attr.box_type=hot
    ulimits:
      nproc: 65535
      memlock:
        soft: -1
        hard: -1
    volumes: 
      - /data/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
      - /data/es3:/usr/share/elasticsearch/data
    ports:
      - 9202:9200
      - 9302:9300
 
  es04:
    image: elasticsearch:7.17.3-ik
    container_name: es04
    #restart: always
    environment:
      - node.name=es04
      - network.publish_host=es04
      - cluster.name=es-docker-cluster
      - cluster.initial_master_nodes=es01,es02,es03,es04,es05,es06
      - "discovery.seed_hosts=es01,es02,es03,es05,es06"
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms256M -Xmx256M"
      - node.attr.box_type=cold
    ulimits:
      nproc: 65535
      memlock:
        soft: -1
        hard: -1
    volumes: 
      - /data/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
      - /data/es4:/usr/share/elasticsearch/data
    ports:
      - 9203:9200
      - 9303:9300

  es05:
    image: elasticsearch:7.17.3-ik
    container_name: es05
    #restart: always
    environment:
      - node.name=es05
      - network.publish_host=es05
      - cluster.name=es-docker-cluster
      - cluster.initial_master_nodes=es01,es02,es03,es04,es05,es06
      - "discovery.seed_hosts=es01,es02,es04,es03,es06"
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms256M -Xmx256M"
      - node.attr.box_type=cold
    ulimits:
      nproc: 65535
      memlock:
        soft: -1
        hard: -1
    volumes: 
      - /data/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
      - /data/es5:/usr/share/elasticsearch/data
    ports:
      - 9204:9200
      - 9304:9300

  es06:
    image: elasticsearch:7.17.3-ik
    container_name: es06
    environment:
      - node.name=es06
      - network.publish_host=es06
      - cluster.name=es-docker-cluster
      - cluster.initial_master_nodes=es01,es02,es03,es04,es05,es06
      - "discovery.seed_hosts=es01,es02,es04,es03,es05"
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms256M -Xmx256M"
      - node.attr.box_type=cold
    ulimits:
      nproc: 65535
      memlock:
        soft: -1
        hard: -1
    volumes:
      - /data/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
      - /data/es6:/usr/share/elasticsearch/data
    ports:
      - 9205:9200
      - 9305:9300

  kibana:
    image: kibana:7.17.3
    container_name: kibana
    #restart: always
    environment:
      - SERVER_NAME=192.168.80.100
      - I18N_LOCALE=zh-CN
    ports:
      - 5601:5601
    ulimits:
      nproc: 65535
      memlock:
        soft: -1
        hard: -1
    volumes:
      - /data/config/kibana.yml:/usr/share/kibana/config/kibana.yml
  
  eshead:    
    image: mobz/elasticsearch-head:5-alpine
    container_name: eshead
    ports:
      - 9100:9100

  cerebro:
    image: lmenezes/cerebro:latest
    container_name: cerebro
    ports:
      - 9000:9000
    command:
      - -Dhosts.0.host=http://es01:9200

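#elasticsearch.yml configuration file

cluster.name: "es-docker-cluster"
network.host: 0.0.0.0
# Allow browser tools such as elasticsearch-head to call the HTTP API
http.cors.enabled: true
http.cors.allow-origin: "*"
# Every node is both master-eligible and a data node
# (deprecated since 7.9 in favour of node.roles, but still accepted by 7.17)
node.master: true
node.data: true
# Accepted but ignored by 7.x; the first election uses cluster.initial_master_nodes
discovery.zen.minimum_master_nodes: 3
# Deprecated in 7.x
node.max_local_storage_nodes: 5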


#kibana.yml configuration file

server.host: "0.0.0.0"
server.shutdownTimeout: "10s"
elasticsearch.hosts: ["http://192.168.80.100:9200","http://192.168.80.100:9201"]
monitoring.ui.container.elasticsearch.enabled: true

#Create directory
mkdir -p /data/config /data/es{1..6}

chmod -R 777 /data/es* /data/config

Start cluster

#Start the ES cluster
docker-compose up -d

#View all output logs
docker-compose logs -f

#View the log of the specified container
docker-compose logs -f es03

#View cluster status

curl -XGET http://192.168.80.100:9200/_cluster/health

curl -XGET http://192.168.80.100:9200/_cat/health

#View the hot and cold distribution of cluster nodes
curl -XGET http://192.168.80.100:9200/_cat/nodeattrs
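
To check that the labels match the node table, the nodeattrs output can be filtered by box_type. This is a small sketch: it assumes the request selects the columns node, attr and value via the h parameter, and the function name is made up.

```shell
#!/bin/sh
# Sketch only: expects "node attr value" rows on stdin, as returned by
# _cat/nodeattrs?h=node,attr,value, and prints the nodes whose box_type is $1.
nodes_with_box_type() {
  awk -v want="$1" '$2 == "box_type" && $3 == want { print $1 }'
}
```

Possible use: curl -s 'http://192.168.80.100:9200/_cat/nodeattrs?h=node,attr,value' | nodes_with_box_type cold, which should list es04, es05 and es06 in this setup.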

#View cluster node status
curl -XGET http://192.168.80.100:9200/_cat/nodes

#View all indices
curl -XGET 'http://192.168.80.100:9200/_cat/indices?pretty'

#Open the firewall ports
firewall-cmd --zone=public --add-port=9000/tcp --permanent
firewall-cmd --zone=public --add-port=9100/tcp --permanent
firewall-cmd --zone=public --add-port=9200-9205/tcp --permanent
firewall-cmd --zone=public --add-port=9300-9305/tcp --permanent
firewall-cmd --reload

#Open the cerebro monitoring UI in a browser
http://192.168.80.100:9000

#Open the elasticsearch-head UI in a browser
http://192.168.80.100:9100

#The elasticsearch-head Chrome extension is also easy to use. You can add it from the Chrome Web Store.

Tags: Big Data ElasticSearch Distribution Virtualization

Posted by coldkill on Fri, 06 May 2022 13:23:41 +0300