
Spark Machine Configuration Calculation

Posted: 2018-07-10


The worked example assumes a 10-node cluster with 16 cores and 64 GB of RAM per node (figures implied by the arithmetic below); a configuration sketch applying the results follows the list.

● Based on the recommendations mentioned above, let's assign 5 cores per executor => --executor-cores = 5 (for good HDFS throughput)

● Leave 1 core per node for Hadoop/YARN daemons => num cores available per node = 16 - 1 = 15

● So, total available cores in the cluster = 15 x 10 = 150

● Number of available executors = (total cores / num-cores-per-executor) = 150 / 5 = 30

● Leaving 1 executor for the ApplicationMaster => --num-executors = 29

● Number of executors per node = 30 / 10 = 3

● Memory per executor = 64 GB / 3 = 21 GB

● Subtract off-heap overhead, roughly 7% of 21 GB, rounded up here to ~3 GB for headroom => actual --executor-memory = 21 - 3 = 18 GB
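As a concrete illustration, here is a minimal PySpark sketch applying the derived values; the app name is a placeholder, not from the original post. The same three values can equally be passed to spark-submit as --executor-cores, --num-executors, and --executor-memory.

```python
from pyspark.sql import SparkSession

# Minimal sketch: apply the sizing derived above.
spark = (
    SparkSession.builder
    .appName("sizing-example")                 # hypothetical app name
    .config("spark.executor.cores", "5")       # --executor-cores = 5
    .config("spark.executor.instances", "29")  # --num-executors = 29
    .config("spark.executor.memory", "18g")    # --executor-memory = 18 GB
    .getOrCreate()
)
```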


The basic idea is to start from a rule of thumb: each executor runs 5 tasks. Spark reads and writes HDFS through an HDFS client, and several clients working in parallel give good throughput, which is why about 5 concurrent tasks per executor works well.

From there, first compute the number of cores.

Then compute the number of executors, both the cluster total and the per-node count: derive the total first, then the per-node figure. Note that one executor must be set aside for the ApplicationMaster.

Finally, compute memory. The memory figures are all per machine, and a node's memory cannot all be handed to the JVM; some must be left for off-heap overhead. A generalized sketch of the whole calculation follows.
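To make the arithmetic reusable, here is a small Python sketch of the same calculation. The function name, signature, and rounding choice are my own; note that a strict 7% overhead gives 19 GB, whereas the post rounds the overhead up more conservatively to ~3 GB and lands on 18 GB.

```python
import math

def size_executors(nodes, cores_per_node, mem_per_node_gb,
                   cores_per_executor=5, overhead_frac=0.07):
    """Reproduce the post's sizing arithmetic for an arbitrary cluster."""
    usable_cores = cores_per_node - 1                  # 1 core for OS/daemons
    total_cores = usable_cores * nodes                 # 15 x 10 = 150
    total_executors = total_cores // cores_per_executor    # 150 / 5 = 30
    num_executors = total_executors - 1                # 1 for the ApplicationMaster
    executors_per_node = total_executors // nodes      # 30 / 10 = 3
    mem_per_executor = mem_per_node_gb // executors_per_node   # 64 / 3 = 21
    overhead_gb = math.ceil(mem_per_executor * overhead_frac)  # ceil(1.47) = 2
    # The post rounds the overhead up further, to ~3 GB, landing on 18 GB.
    executor_memory = mem_per_executor - overhead_gb
    return num_executors, cores_per_executor, executor_memory

print(size_executors(10, 16, 64))  # -> (29, 5, 19); the post settles on 18 GB
```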


Original article: https://www.cnblogs.com/xiashiwendao/p/9286736.html
