Spark Streaming + Kafka programming practice in Python

Overview: Many articles already explain the principles of Spark Streaming, so they are not covered here. This article mainly introduces the programming model, coding practice, and some optimization notes for Spark Streaming with Kafka as the data source. spark streaming: http://spark.apache.org/docs/1.6.0/streaming-programming-guide.html streaming-kafka ...

Posted by Adam_28 on Mon, 16 May 2022 21:59:57 +0300

Building Kafka 2.13-2.6.0 and ZooKeeper 3.6.2 clusters on Kubernetes

1. Service version information: Kafka: v2.13-2.6.0, ZooKeeper: v3.6.2, Kubernetes: v1.18.4. 2. Create the ZooKeeper image: ZooKeeper uses the official image from Docker Hub, which you can pull directly with the following command: docker pull zookeeper:3.6.2 Since the startup script used ...

Posted by miles_rich on Mon, 09 May 2022 02:15:25 +0300

Deploying a Kafka cluster with Docker Compose (also applies to Alibaba Cloud ECS)

This post records how to deploy a Kafka cluster with Docker Compose and test it through Spring Boot. I followed guides from experts online who build Kafka clusters with Docker Compose, but still ran into many problems: the server would start, yet the test cases failed, among other strange issues. So when you succeed in building, you think i ...

Posted by suspect on Sun, 08 May 2022 03:31:51 +0300

Kafka Producer (including interceptor, partition, serializer and asynchronous message sending mode)

The Kafka producer is one of the roles in the overall Kafka architecture; it can be any component that integrates with Kafka. KafkaProducer is thread-safe and can be shared by multiple threads at the same time. 1. How to build a KafkaProducer: There are two way ...

Posted by digi24 on Wed, 04 May 2022 08:50:29 +0300

Spark Streaming (real-time word-frequency counting on a stream)

First, import the Maven dependencies in IDEA: <dependency> <groupId>org.apache.kafka</groupId> <artifactId>kafka_2.11</artifactId> <version>2.0.0</version> </dependency> <dependency> <groupId>org.apache.kafka</groupId> <artifactId>kaf ...

Posted by ym_chaitu on Sun, 01 May 2022 03:48:44 +0300

Getting started with Kafka -- Python Kafka Client performance test

1. Foreword: I use Kafka at work, and the existing code cannot meet the performance requirements, so an efficient tool for reading from and writing to Kafka is needed. This article records a performance test of Python Kafka clients. Through this test, we can find out which third-party library offers the highest performanc ...

Posted by unbreakable9 on Thu, 28 Apr 2022 15:24:53 +0300

Kafka: topic management

Topic management includes creating topics, viewing topic information, modifying topics, deleting topics, and other operations. You can use the kafka-topics.sh script to perform these operations. The script is in the $KAFKA_HOME/bin/ directory. The script body has only one line: #!/bin/bash # Licensed to the Apache Software Foundation (AS ...

Posted by sanstenarios on Tue, 19 Apr 2022 13:24:54 +0300

ZooKeeper installation and startup

Installing ZooKeeper on CentOS 7. 1 Preparation: 1. Prepare the server. This installation uses a virtual machine running CentOS 7 with 2 GB of memory and 60 GB of storage. 2. Install a Java environment on the server: refer to the blog article "installing jdk8 on CentOS 7". 3. Prepare the ZooKeeper installation package. Zookeeper-3.4.11 is ad ...

Posted by craka on Tue, 19 Apr 2022 12:56:57 +0300

Getting MySQL data with Flume + Kafka

Abstract: MySQL is widely used as the storage database for large-scale business systems. In the era of big data, we urgently need to analyze this massive data, but analyzing it directly on MySQL is clearly unrealistic because it would affect the stability of the business system. If we want to analyze these data in real time, we need to copy them t ...

Posted by Voodoo Jai on Thu, 14 Apr 2022 03:20:23 +0300

ksqlDB basic usage

Basic concepts: ksqlDB Server. ksqlDB is an event streaming database, a special-purpose kind of database. Built on Kafka's real-time stream processing engine, ksqlDB provides a powerful, easy-to-use interactive SQL interface for processing Kafka data streams without writing code. KSQL offers high scalability, high elasticity and fault toler ...

Posted by sushant_d84 on Wed, 06 Apr 2022 08:14:29 +0300