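In short, on that day an AWS engineer was investigating a billing-related problem with S3 in the Northern Virginia (US-EAST-1) Region: the S3 billing system had become slow. (My guess is that inside Amazon this counted as a Sev2 incident; Sev2 is a fairly serious severity level at Amazon and has to be resolved quickly.) The on-call development engineer (note: at Amazon, operations work is done by the development engineers themselves, which is why SDE, Software Development Engineer, is jokingly expanded internally as "Someone Do Everything") wanted to remove a small number of servers from one subsystem of the billing system (presumably those servers were faulty and were to be taken out and redeployed). One command was entered incorrectly, and as a result a large portion of S3's control systems was removed, including two very important subsystems: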
Test code
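import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class TaskExecutor {
    private static final int DEFAULT_TASK_QUEUE_LEN = 200;
    // Keep-alive for idle threads, in milliseconds (10 seconds).
    private static final long DEFAULT_KEEP_ALIVE_TIME = 10 * 1000;

    private static TaskExecutor instance;

    private final BlockingQueue<Runnable> taskQueue;
    private volatile ThreadPoolExecutor executor;

    // Lazily created singleton; synchronized so concurrent callers share one instance.
    public static synchronized TaskExecutor getInstance() {
        if (instance == null) {
            instance = new TaskExecutor();
        }
        return instance;
    }

    private TaskExecutor() {
        this.taskQueue = new ArrayBlockingQueue<>(DEFAULT_TASK_QUEUE_LEN);
    }

    // Must be called once before execute(); later calls resize the existing pool.
    public synchronized void setPoolSize(int poolSize) {
        if (this.executor == null) {
            this.executor = new ThreadPoolExecutor(poolSize, poolSize,
                    DEFAULT_KEEP_ALIVE_TIME, TimeUnit.MILLISECONDS, this.taskQueue);
        } else if (poolSize > this.executor.getMaximumPoolSize()) {
            // Growing: raise the maximum first so the core size never exceeds it.
            this.executor.setMaximumPoolSize(poolSize);
            this.executor.setCorePoolSize(poolSize);
        } else {
            // Shrinking: lower the core size first so the maximum never drops below it.
            this.executor.setCorePoolSize(poolSize);
            this.executor.setMaximumPoolSize(poolSize);
        }
    }

    // Blocks the caller while the queue is full, then submits the task.
    // The check and the submit are not atomic, so this assumes a single producer thread.
    public void execute(Runnable task) throws InterruptedException {
        if (executor == null) {
            throw new IllegalStateException("call setPoolSize() before execute()");
        }
        while (taskQueue.remainingCapacity() <= 0) {
            Thread.sleep(1);
        }
        executor.execute(task);
    }

    // Waits until the queue is empty; tasks already taken by workers may still be running.
    public void waitQueueClean() throws InterruptedException {
        while (taskQueue.remainingCapacity() < DEFAULT_TASK_QUEUE_LEN) {
            Thread.sleep(1);
        }
    }
}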
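The TaskExecutor class throttles the producer by polling the queue's remaining capacity before each submit. For comparison, here is a minimal sketch of a different way to get similar back-pressure using only the JDK: a bounded queue combined with `ThreadPoolExecutor.CallerRunsPolicy`, so that a full queue makes the submitting thread run the task itself. This is not the original author's approach; the class name `BackPressureDemo`, the method `runTasks`, and the sizes and timeouts used are hypothetical and chosen only for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of the original article.
class BackPressureDemo {

    // Runs taskCount trivial tasks on a fixed-size pool with a bounded queue
    // and returns how many of them completed.
    static int runTasks(int taskCount, int poolSize, int queueLen) {
        AtomicInteger completed = new AtomicInteger();
        BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(queueLen);
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                poolSize, poolSize,
                10, TimeUnit.SECONDS,
                queue,
                // When the queue is full, the submitting thread runs the task itself,
                // which slows the producer down instead of rejecting the work.
                new ThreadPoolExecutor.CallerRunsPolicy());

        for (int i = 0; i < taskCount; i++) {
            executor.execute(completed::incrementAndGet);
        }

        // Stop accepting new tasks, then wait for queued and running tasks to finish.
        executor.shutdown();
        try {
            if (!executor.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runTasks(1000, 4, 200)); // prints 1000
    }
}
```

Unlike `waitQueueClean()`, which only observes that the queue is empty, `shutdown()` followed by `awaitTermination()` also waits for tasks that workers have already picked up. The trade-off is that the pool cannot be reused after shutdown, so this fits a one-shot batch better than a long-lived singleton.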
Original article: http://www.cnblogs.com/sheefee/p/7324388.html