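This is a legacy system that uses commonj for its timer and work-task thread pools. On the timer side, the system used to have a single fixed-rate timer task, and it had always run fine. Recently we added health monitoring (Dropwizard health check), which needed one more timer task. During testing, the health check task occasionally ran several times at a single trigger point.

I could not make sense of it until I read the commonj source code (de.myfoo.commonj.timers.FooTimerManager.java and de.myfoo.commonj.timers.TimerExecutor.java), which gave me a rough idea of what was probably happening. The code shows that when there are two timer tasks and the first one is still executing, its scheduled execution time is still in the past, because timer.computeNextExecutionTime() is only called after timer.execute() returns. The value returned by timer.getScheduledExecutionTime() therefore makes waitTime negative, and the while loop spins without waiting. Suppose the second task becomes due at this point: it is submitted to the pool. If its state has not changed by the next pass of the loop (there is no wait, since waitTime is not positive), it is submitted again. In our system it was submitted up to 7 times.

Multithreading is to blame. FooTimerManager has no way of knowing when a task handed to the pool will change its state, and deciding whether a task is running from a flag that another thread sets is not sound. In the end we replaced commonj with Spring's scheduled tasks, which are simple and work well.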
//FooTimerManager
@Override
public void run() {

    // while we are not stopped...
    while (!stopped) {

        long nextTime = System.currentTimeMillis() + 1000;

        synchronized (this) {
            // iterate all timers and check if the need to be executed
            for(Iterator<TimerExecutor> iter = timers.iterator(); iter.hasNext();) {
                TimerExecutor timerExecutor = iter.next();
                FooTimer timer = timerExecutor.getTimer();

                // Added by SOSI-DCC
                // check if timer is cancelled
                if(timer.isCancelled()) {
                    iter.remove();
                }

                // check if timer is expired
                else if(timer.isExpired() && !timerExecutor.isRunning()) {
                    // execure timer if not suspended / suppending
                    if(!suspended && !suspending) {
                        pool.execute(timerExecutor);
                    }

                    // remove one shot timers after execution
                    if(timer instanceof OneShotTimer) {
                        iter.remove();
                    }
                } else {
                    // find the soonest execution time
                    long time = timer.getScheduledExecutionTime();
                    if(time < nextTime) {
                        nextTime = time;
                    }
                }
            }

            // count running timers that are still running
            boolean running = false;
            for(Iterator<TimerExecutor> iter = timers.iterator(); iter.hasNext();) {
                TimerExecutor timerExecutor = iter.next();
                if(timerExecutor.isRunning()) {
                    running = true;
                }
            }

            if(suspending && !running) {
                suspended = true;
            }

            if(stopping && !running) {
                stopped = true;
            }

            // wait til next execution is due...
            long waitTime = nextTime - System.currentTimeMillis();
            if(waitTime > 0) {
                try {
                    wait(waitTime);
                } catch(InterruptedException e) { // ignore
                }
            }

        }
    }

}
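
The duplicate submission can be reproduced outside commonj with a small sketch. Everything in it is hypothetical and is not commonj code: FlagTask stands in for TimerExecutor, DuplicateSubmitDemo.runPasses stands in for a few non-waiting passes of the manager loop, and the task is assumed to be due on every pass. To make the race deterministic, the pool has a single worker that is kept busy by a blocking first task, which exaggerates the short window that exists in the real system.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for TimerExecutor: the flag is set only once a pool
// thread actually enters run(), not when the task is submitted.
class FlagTask implements Runnable {
    private volatile boolean running = false;
    final AtomicInteger executions = new AtomicInteger();

    boolean isRunning() {
        return running;
    }

    @Override
    public void run() {
        running = true;
        try {
            executions.incrementAndGet(); // the "work" of the timer task
        } finally {
            running = false;
        }
    }
}

class DuplicateSubmitDemo {

    // Simulates `passes` non-waiting iterations of the manager loop while the
    // pool's only worker is busy, and returns how often the task executed.
    static int runPasses(int passes) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        CountDownLatch release = new CountDownLatch(1);
        FlagTask healthCheck = new FlagTask();
        try {
            // The "first task": keeps the worker busy until released.
            pool.execute(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            for (int i = 0; i < passes; i++) {
                // Same shape as the manager's check; the task is assumed due on every pass.
                if (!healthCheck.isRunning()) {
                    pool.execute(healthCheck);
                }
            }

            release.countDown(); // let the worker drain the queue
            pool.shutdown();
            if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not drain in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return healthCheck.executions.get();
    }

    public static void main(String[] args) {
        System.out.println("executions = " + runPasses(7));
    }
}
```

Because the flag never becomes true while the task is only queued, every pass submits it again, so runPasses(7) executes the task 7 times.
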
//TimerExecutor
@Override
public void run() {
    running = true;
    try {
        // execute the timer
        timer.execute();

        // compute next execution time
        timer.computeNextExecutionTime();
    } catch (Exception e) {
        // ignore
    } finally {
        running = false;
        // timerManager.notifyAll();
    }
}
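
One way to close the gap, short of replacing the library, would be to mark the task as taken on the scheduling thread at submission time instead of inside run(). The sketch below is an assumption about how that could look, not a patch to commonj and not what we ended up doing: GuardedTask, tryClaim, release and GuardedSubmitDemo are hypothetical names, and the setup mirrors the scenario described above (one busy worker, the task assumed due on every pass).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical task whose "in flight" state is claimed by the submitter,
// so a queued task already counts as taken.
class GuardedTask implements Runnable {
    private final AtomicBoolean claimed = new AtomicBoolean(false);
    final AtomicInteger executions = new AtomicInteger();

    // Called on the scheduling thread; only one caller wins until release().
    boolean tryClaim() {
        return claimed.compareAndSet(false, true);
    }

    void release() {
        claimed.set(false);
    }

    @Override
    public void run() {
        try {
            executions.incrementAndGet(); // the "work" of the timer task
        } finally {
            release(); // the task may be scheduled again from now on
        }
    }
}

class GuardedSubmitDemo {

    // Same scenario as a spinning manager loop with a busy pool, but the
    // task is claimed before it is handed to the pool.
    static int runPasses(int passes) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        CountDownLatch release = new CountDownLatch(1);
        GuardedTask healthCheck = new GuardedTask();
        try {
            // Keeps the only worker busy until released.
            pool.execute(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            for (int i = 0; i < passes; i++) {
                if (healthCheck.tryClaim()) {
                    try {
                        pool.execute(healthCheck);
                    } catch (RejectedExecutionException e) {
                        healthCheck.release(); // never queued, so give the claim back
                    }
                }
            }

            release.countDown(); // let the worker drain the queue
            pool.shutdown();
            if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not drain in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return healthCheck.executions.get();
    }

    public static void main(String[] args) {
        System.out.println("executions = " + runPasses(7));
    }
}
```

Here only the first pass wins the claim, so runPasses(7) executes the task once. The design point is that the thread making the scheduling decision also owns the state change that the decision depends on.
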
Troubleshooting -- Commonj task fires multiple times at the same trigger time
Original article: http://blog.csdn.net/cloud_ll/article/details/45114535