Aggregator first runs a global repartitioning operation (global) on the input stream, merging all of a batch's partitions into a single partition, and then runs the aggregation function once per batch; in other words, it operates at batch granularity. It is very similar to ReducerAggregator (a minimal comparison sketch follows).
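To make the comparison concrete, here is a minimal sketch of the same per-batch sum written as a ReducerAggregator. The class name ScoreSum is made up for illustration; the init/reduce signatures are the ones Trident's ReducerAggregator interface defines, assuming Storm 1.x package names:

import org.apache.storm.trident.operation.ReducerAggregator;
import org.apache.storm.trident.tuple.TridentTuple;

// Hypothetical ReducerAggregator computing the same per-batch sum.
// init() builds the initial accumulator for each batch; reduce() is then
// called once per tuple. There is no complete(): Trident itself emits the
// final reduced value, which is the main difference from BaseAggregator.
public class ScoreSum implements ReducerAggregator<Integer> {
    @Override
    public Integer init() {
        return 0; // initial accumulator for each batch
    }

    @Override
    public Integer reduce(Integer curr, TridentTuple tuple) {
        return curr + tuple.getInteger(0); // fold in each tuple's score
    }
}

It would be plugged in the same way as the anonymous aggregator below: .aggregate(new Fields("score"), new ScoreSum(), new Fields("sum")).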
Part of the code is omitted here; for the omitted parts, see: https://blog.csdn.net/nickta/article/details/79666918
static class State {
    int count = 0;
}

FixedBatchSpout spout = new FixedBatchSpout(new Fields("user", "score"), 3,
        new Values("nickt1", 4),
        new Values("nickt2", 7),
        new Values("nickt3", 8),
        new Values("nickt4", 9),
        new Values("nickt5", 7),
        new Values("nickt6", 11),
        new Values("nickt7", 5));
spout.setCycle(false);

TridentTopology topology = new TridentTopology();
topology.newStream("spout1", spout)
        .shuffle()
        .each(new Fields("user", "score"), new Debug("shuffle print:"))
        .parallelismHint(5)
        .aggregate(new Fields("score"), new BaseAggregator<State>() {
            // Called once before each batch is processed,
            // including empty batches.
            @Override
            public State init(Object batchId, TridentCollector collector) {
                return new State();
            }

            // Called once for every tuple in the batch.
            @Override
            public void aggregate(State state, TridentTuple tuple, TridentCollector collector) {
                state.count = tuple.getInteger(0) + state.count;
            }

            // Called after all tuples in the batch have been processed.
            @Override
            public void complete(State state, TridentCollector collector) {
                collector.emit(new Values(state.count));
            }
        }, new Fields("sum"))
        .each(new Fields("sum"), new Debug("sum print:"))
        .parallelismHint(5);
Output (the max batch size is 3, so the 7 tuples form three batches and the per-batch sums are 4+7+8=19, 9+7+11=27, and 5):
[partition4-Thread-136-b-0-executor[37 37]]> DEBUG(shuffle print:): [nickt1, 4]
[partition4-Thread-136-b-0-executor[37 37]]> DEBUG(shuffle print:): [nickt3, 8]
[partition3-Thread-118-b-0-executor[36 36]]> DEBUG(shuffle print:): [nickt2, 7]
[partition4-Thread-136-b-0-executor[37 37]]> DEBUG(shuffle print:): [nickt5, 7]
[partition3-Thread-118-b-0-executor[36 36]]> DEBUG(shuffle print:): [nickt4, 9]
[partition3-Thread-118-b-0-executor[36 36]]> DEBUG(shuffle print:): [nickt6, 11]
[partition1-Thread-82-b-1-executor[39 39]]> DEBUG(sum print:): [19]
[partition2-Thread-66-b-1-executor[40 40]]> DEBUG(sum print:): [27]
[partition4-Thread-136-b-0-executor[37 37]]> DEBUG(shuffle print:): [nickt7, 5]
[partition3-Thread-54-b-1-executor[41 41]]> DEBUG(sum print:): [5]
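For a simple sum like this one, a hand-written BaseAggregator is not strictly necessary: Trident ships a built-in Sum aggregator (a CombinerAggregator in org.apache.storm.trident.operation.builtin, assuming Storm 1.x package names) that yields the same one-value-per-batch result. A minimal sketch of the shortened topology:

import org.apache.storm.trident.operation.builtin.Sum;

// Same per-batch sum, using the built-in Sum aggregator instead of the
// anonymous BaseAggregator above.
topology.newStream("spout1", spout)
        .shuffle()
        .each(new Fields("user", "score"), new Debug("shuffle print:"))
        .parallelismHint(5)
        .aggregate(new Fields("score"), new Sum(), new Fields("sum"))
        .each(new Fields("sum"), new Debug("sum print:"))
        .parallelismHint(5);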