Elasticsearch hot-cold separation cluster construction -- the road of building a dream

Introduction to the hot-cold separation architecture: Hot-cold separation is currently a very popular architecture for ES. It makes full use of the strengths and weaknesses of the cluster's machines to schedule and allocate resources. The indexing and query speed of an ES cluster depends mainly on disk I/O speed. The key point ...

Posted by coldkill on Fri, 06 May 2022 13:23:41 +0300

Hadoop learning from 0 to 1 -- Chapter 12 Hadoop data compression

1. Compression overview: Compression can effectively reduce the number of bytes read from and written to the underlying storage system, and it makes better use of network bandwidth and disk space. When running an MR program, I/O operations, network transmission, Shuffle, and Merge take a lot of time, especially in the case of la ...

Posted by frkmilla on Fri, 06 May 2022 07:30:28 +0300

[Frontiers of Trading Technology] Experience Sharing of Real-time Computing System Construction

Abstract: Real-time computing technology has been applied to various fields such as advertising, e-commerce, games, and entertainment. For example, e-commerce websites analyze user attributes in real time, and push relevant products to customers based on the analysis results; online games analyze player data in real time, and then analyze game ...

Posted by alvinho on Fri, 06 May 2022 03:10:34 +0300

Airflow advanced: demo of task dependencies between two DAGs with different start times

Preface and background: I have a scheduling requirement. After looking through the existing DAGs, I found one that can serve as the upstream of my new schedule, so I wanted to see how tasks in different DAGs can be made to depend on each other, which led to the following demo. If you have unrestricted Internet access and your English listening is good ...

Posted by tom2oo8 on Fri, 06 May 2022 02:11:16 +0300

Flink basics: Flink SQL query statement operators

1 Scan, Projection and Filter. Operator / Description. Scan / Select / As (batch, streaming): SELECT * FROM Orders; SELECT a, c AS d FROM Orders. Where / Filter (batch, streaming): SELECT * FROM Orders WHERE b = 'red'; SELECT * FROM Orders WHERE a % 2 = 0. User-defined scalar function (Scalar UDF) (batch, streaming): Custom f ...

Posted by jaret on Thu, 05 May 2022 22:14:31 +0300

Notes on deploying stand-alone Hadoop + Hive

Preface: This deployment test was carried out on an Ubuntu 18 virtual machine on my local computer, following the official documentation (hadoop: link address; hive: link address). Versions used: hadoop 3.2.1, hive 3.1.2. The whole process is configured with the root account. Hadoop installation and configuration: hadoop uses a virtual cluster, that is, a singl ...

Posted by anshu.sah on Thu, 05 May 2022 05:22:18 +0300

Recommendation system - Hadoop fully distributed (development focus)

Development focus: Hadoop in fully distributed mode. Contents: 1. Copy hadoop100 to 101 and 102; 2. Passwordless SSH login; 3. Cluster configuration; 4. Write the xsync distribution script (optional); 5. Start and test the cluster. 1. Copy hadoop100 to 101 and 102 (1) scp (secure copy): scp copies data from one server to another. (from ser ...

Posted by Ralf Jones on Wed, 04 May 2022 17:23:48 +0300

Jupyter example for Pegasus WMS

Continuing from the previous installation post, this is the Docker example from the official tutorial. Installing Docker is omitted, since it was installed earlier. The commands given on the official website are: docker pull pegasus/tutorial:5.0.0, followed by: docker run --privileged --rm -p 9999:8888 pegasus/tutorial:5.0.0. Then visit the pa ...

Posted by 2005 on Wed, 04 May 2022 04:05:47 +0300

HiveSQL interview question: consecutive check-ins to earn gold coins [Baidu - difficult question - general solution]

Contents: 0 Problem description; 1 Data preparation; 2 Problem analysis; 3 Summary. 0 Problem description: user behavior log table tb_user_log, with columns id, uid, artical_id, in_time, out_time, sign_in. Rows: (1, 101, 0, 2021-07-07 10:00:00, 2021-07-07 10:00:09, 1); (2, 101, 0, 2021-07-08 10:00:00, 2021-07-08 10:00:09, 1); (3, 101, 0, 2021-07-09 10:00:00, 2021-07-09 10:00:42, 1); (4, 101, 0, 2021-07-10 10:00:00, 2021- ...

Posted by erika_web on Tue, 03 May 2022 21:59:16 +0300

Analysis of MapReduce job submission to YARN

Analysis of MapReduce job submission to YARN. Related classes: Configuration: configures the job; if nothing is set, the default configuration is used. Job: encapsulates the running information of a job. Cluster: an object representing the local connection between ResourceManager and the file system; internally it encapsulates the file system information of Jo ...

Posted by grenouille on Tue, 03 May 2022 13:23:53 +0300