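In short, on that day an AWS engineer was investigating a billing-related problem with S3 in the Northern Virginia (US-EAST-1) Region: the S3 billing system had become slow. (My guess is that inside Amazon this counted as a Sev2 incident; Sev2 is a fairly serious level there and has to be resolved quickly.) The on-call development engineer (note: at Amazon, operations work is done by development engineers, which is why SDE, Software Development Engineer, is jokingly expanded internally as "Someone Do Everything") wanted to remove a small number of servers from one subsystem of the billing system (presumably those servers were misbehaving, and the plan was to take them out and redeploy them). One of the commands was entered incorrectly, however, and it ended up removing a large part of the S3 control systems, including two very important subsystems: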

Test code
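import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class TaskExecutor {
    private static final int DEFAULT_TASK_QUEUE_LEN = 200;
    // Idle-thread keep-alive in milliseconds (10 seconds), hence TimeUnit.MILLISECONDS below.
    private static final long DEFAULT_KEEP_ALIVE_TIME = 10 * 1000;
    private final BlockingQueue<Runnable> taskQueue;
    private volatile ThreadPoolExecutor executor;
    private static TaskExecutor instance;

    // Lazily creates the singleton; synchronized so concurrent callers get the same instance.
    public static synchronized TaskExecutor getInstance() {
        if (instance == null) {
            instance = new TaskExecutor();
        }
        return instance;
    }

    private TaskExecutor() {
        this.taskQueue = new ArrayBlockingQueue<>(DEFAULT_TASK_QUEUE_LEN);
    }

    // Creates the pool on the first call and resizes it on later calls.
    public synchronized void setPoolSize(int poolSize) {
        if (this.executor == null) {
            this.executor = new ThreadPoolExecutor(poolSize, poolSize,
                    DEFAULT_KEEP_ALIVE_TIME, TimeUnit.MILLISECONDS, this.taskQueue);
        } else if (poolSize > this.executor.getMaximumPoolSize()) {
            // Growing: raise the maximum first so the core size never exceeds it.
            this.executor.setMaximumPoolSize(poolSize);
            this.executor.setCorePoolSize(poolSize);
        } else {
            // Shrinking: lower the core size first so the maximum never drops below it.
            this.executor.setCorePoolSize(poolSize);
            this.executor.setMaximumPoolSize(poolSize);
        }
    }

    // Blocks the caller while the queue is full, then hands the task to the pool.
    public void execute(Runnable task) throws InterruptedException {
        if (executor == null) {
            throw new IllegalStateException("call setPoolSize() before execute()");
        }
        while (taskQueue.remainingCapacity() <= 0) {
            Thread.sleep(1);
        }
        executor.execute(task);
    }

    // Waits until the queue is empty; tasks already taken by workers may still be running.
    public void waitQueueClean() throws InterruptedException {
        while (taskQueue.remainingCapacity() < DEFAULT_TASK_QUEUE_LEN) {
            Thread.sleep(1);
        }
    }
}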
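
The class above throttles producers by polling the queue with Thread.sleep(1). As a possible alternative, the following minimal sketch gets the same back-pressure from a Semaphore and waits for completion with awaitTermination. It is not part of the original code: the class name BoundedSubmitDemo, the method runTasks and its parameters are hypothetical names chosen for illustration, and the tasks only increment a counter.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Semaphore;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedSubmitDemo {

    // Hypothetical helper: runs taskCount trivial tasks on poolSize threads with a
    // bounded queue of queueLen slots and returns how many tasks completed.
    public static int runTasks(int poolSize, int queueLen, int taskCount)
            throws InterruptedException {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                poolSize, poolSize, 10, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueLen));
        // At most queueLen tasks hold a permit at once, so the queue can never be
        // full when execute() is called and the task is never rejected.
        Semaphore permits = new Semaphore(queueLen);
        AtomicInteger completed = new AtomicInteger();

        for (int i = 0; i < taskCount; i++) {
            permits.acquire(); // blocks the producer instead of sleeping in a loop
            try {
                executor.execute(() -> {
                    try {
                        completed.incrementAndGet(); // stand-in for real work
                    } finally {
                        permits.release();
                    }
                });
            } catch (RuntimeException e) {
                permits.release(); // do not leak the permit if submission fails
                throw e;
            }
        }

        // Unlike checking that the queue is empty, this also waits for running tasks.
        executor.shutdown();
        if (!executor.awaitTermination(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("tasks did not finish in time");
        }
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runTasks(2, 4, 50));
    }
}
```

The trade-off in this sketch is that the number of in-flight tasks is capped at queueLen rather than queueLen plus the pool size, which keeps the rejection-free guarantee simple. It also shuts the pool down at the end, so it suits a one-shot batch rather than a long-lived singleton like TaskExecutor.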

Original article: http://www.cnblogs.com/sheefee/p/7324388.html