
爱宝儿

Game development; IoT big-data cluster maintenance, deployment, and development; data storage and related design; big data, ETL, Flink, Kafka, the Hadoop framework, and other development technologies. Languages covered include Java, Go, Python, C, C#, and more.

For learning and reference only.

Flume KafkaChannel test


Testing stream processing through the KafkaChannel. Tune the following according to the hardware configuration and JVM parameters:

a1.channels.c2.kafka.consumer.linger.ms = 0 (0 ms, no delay)
a1.channels.c2.kafka.consumer.batch.size = 1000000 (batch size, 100W = 1,000,000)
a1.channels.c2.kafka.consumer.max.request.size = 90000000 (bytes, ≈ 85.83 MB)

1. Throughput can be tuned via batch.size, the amount of data processed per batch.
2. Request size can be tuned via max.request.size, the maximum request size in bytes.
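The overrides above can be written into a Flume properties file. A minimal sketch follows, with assumed channel/topic names and the broker list reused from the sink config later in this post. One caveat worth hedging: linger.ms, batch.size, and max.request.size are Kafka producer properties, so for a KafkaChannel they would normally be passed under the kafka.producer.* prefix rather than kafka.consumer.* as spelled in the original text.

```properties
# KafkaChannel sketch (assumed names; brokers taken from the sink config below)
a1.channels.c2.type = org.apache.flume.channel.kafka.KafkaChannel
a1.channels.c2.kafka.bootstrap.servers = vm0104:9092,vm0204:9092,vm0402:9092
a1.channels.c2.kafka.topic = flume-channel-c2
a1.channels.c2.kafka.consumer.group.id = flume-c2

# Producer-side tuning discussed in the text:
a1.channels.c2.kafka.producer.linger.ms = 0            # no batching delay
a1.channels.c2.kafka.producer.batch.size = 1000000     # 100W bytes per producer batch
a1.channels.c2.kafka.producer.max.request.size = 90000000  # ~85.83 MB per request
```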

Flume tuning parameter test


Flume 1.9.0

Testing filechannel + kafkasink with the following parameter changes:
a1.sinks.k1.kafka.producer.retries = 1
a1.sinks.k1.kafka.producer.max.block.ms = 0
Observation: the file channel fills up first, and only then does the sink drain events to Kafka.
Test: 30,000 (3W) records took 256.008 s in total.
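As a quick sanity check on that figure (my arithmetic, not from the original post), the average throughput works out to roughly 117 events per second:

```python
# Throughput implied by the numbers quoted above: 30,000 events in 256.008 s.
events = 30_000
seconds = 256.008
rate = events / seconds
print(f"{rate:.1f} events/s")  # roughly 117 events/s
```

With a1.sinks.k1.kafka.flumeBatchSize = 20 in the config below, the sink hands Kafka at most 20 events per batch, which likely contributes to the modest rate.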

----------------------------------- configuration start --------------------------------------
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = http
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 55000
a1.sources.r1.contextPath = agent
a1.sources.r1.channels = c1
a1.sources.r1.handler=com.zzwl.flume.source.ZzwlHttpServerHandler

a1.channels.c1.type=file
a1.channels.c1.checkpointDir=/home/bigdata/flume/data/datacheck
a1.channels.c1.dataDirs=/home/bigdata/flume/data
a1.channels.c1.useDualCheckpoints=true
a1.channels.c1.backupCheckpointDir=/home/bigdata/flume/data/bakdatacheck
a1.channels.c1.checkpointInterval=30000


a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.topic = flume-hlwx-stream
a1.sinks.k1.kafka.bootstrap.servers = vm0104:9092,vm0204:9092,vm0402:9092
a1.sinks.k1.kafka.flumeBatchSize = 20
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.kafka.producer.linger.ms = 1
a1.sinks.k1.kafka.producer.compression.type = snappy
a1.sinks.k1.kafka.producer.client.id = flume-kafka-producer
a1.sinks.k1.kafka.producer.max.block.ms = 0
a1.sinks.k1.kafka.producer.retries = 1

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

----------------------------------- configuration end --------------------------------------
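For completeness, an agent with this configuration would typically be launched with flume-ng; the install path and config file location below are assumptions to adjust for your environment:

```
# Assumed install path and config file name; adjust for your setup.
/opt/flume/bin/flume-ng agent \
  --conf /opt/flume/conf \
  --conf-file /home/bigdata/flume/conf/a1.conf \
  --name a1 \
  -Dflume.root.logger=INFO,console
```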

Testing memorychannel + kafkasink with the same parameter changes:
a1.sinks.k1.kafka.producer.retries = 1
a1.sinks.k1.kafka.producer.max.block.ms = 0
Observation: unlike the file channel, the memory channel writes events to the Kafka sink in real time.
Test: 30,000 (3W) records took 256.008 s in total.