
Run Away Scheduler


Dan Munns


Hi all,

I set up some scheduled tasks on Thursday and have come in this morning to find that one of them has generated 1821 tasks.

It looks to have generated 1 new task every minute since it kicked off on the 1st at midnight. I have since stopped the task.

None of the other scheduled tasks have done anything weird (so far) 

I have attached the config of the task. 

Also, could someone delete all tasks named 'Systems User Review' from our instance fairly urgently? Other genuine tasks are lost in a sea of duplicates and I don't fancy sitting around all day deleting them.

Thanks 

Dan

Capture.PNG


Hi @Dan Munns,

I'm sorry to hear about the issue. We will start investigating what happened here, and once we understand the issue we will remove the tasks named "Systems User Review".

Can you please confirm that you want us to remove these generated tasks from your instance?

Also, can you let me know if you made any further changes to the Scheduled Job after stopping it?

Thank you,

Daniel.


Hi All,

Just an update: we found the problem with the run-away schedule. If you set a monthly schedule on the 30th and then enable the month of Feb, the schedule throws a permanent error when it runs, which *should* have set the job status to failed. Unfortunately, a defect in our implementation meant we neither output the failure error message to the log nor marked the job as failed. The effect was that the schedule would continue to run every cycle. So we have done the following...

1. We now output the failure message to the server log

2. We now output the failure message and state info to the scheduled job log

3. We have properly handled this condition and now set the job status to failed. 

4. We are going to add better input validation on the APIs used to create and update scheduled jobs

5. We are going to add input validation in the UI to prevent such erroneous schedule configurations from being created. 

6. We are going to re-work this to improve the way it works at the end of the month and automatically bring the schedule forward for any month that does not include the day you specified. 
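For anyone curious what the fix in points 1 to 3 amounts to, here is a minimal Python sketch. It is not our actual implementation; `ScheduledJob`, `PermanentScheduleError` and `run_cycle` are hypothetical names made up for illustration. The point it shows is that a permanent error has to be logged and has to flip the job status to failed, otherwise the scheduler will happily pick the job up again on the next cycle.

```python
from dataclasses import dataclass, field


class PermanentScheduleError(Exception):
    """Hypothetical error: the schedule can never fire as configured."""


@dataclass
class ScheduledJob:
    # Hypothetical job record, not the real data model.
    name: str
    status: str = "active"
    log: list = field(default_factory=list)


def run_cycle(job, action):
    """Run one scheduler cycle; return True if the job should run again."""
    if job.status != "active":
        # A failed or stopped job is skipped, so it cannot run away.
        return False
    try:
        action()
    except PermanentScheduleError as exc:
        # Record the failure in the job log and mark the job as failed.
        job.log.append(f"failed: {exc}")
        job.status = "failed"
        return False
    return True


def bad_schedule():
    # Stand-in for a schedule that hits a date which does not exist.
    raise PermanentScheduleError("day 30 does not exist in February")


job = ScheduledJob("Systems User Review")
run_cycle(job, bad_schedule)   # first cycle: error is logged, job marked failed
run_cycle(job, bad_schedule)   # later cycles: job is skipped
print(job.status, len(job.log))
```

Without the status change in the `except` branch, every cycle would retry the same failing job, which is the run-away behaviour described above.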

These changes will be rolled out as soon as is practical.  We will put in a temporary client-side validation into the UI and get that out asap to help prevent this in the short term.

In the meantime, the workaround is not to specify a day of the month greater than 28 when setting up a monthly schedule.
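To illustrate the workaround and the planned end-of-month behaviour, here is a rough Python sketch. The function names `is_safe_month_day` and `clamp_day` are hypothetical and this is only one way it could be done: the first is the kind of check a validator could apply today, the second brings the run date forward to the last day of any month that is too short for the requested day.

```python
import calendar
from datetime import date


def is_safe_month_day(day):
    """Workaround check: days 1 to 28 exist in every month."""
    return 1 <= day <= 28


def clamp_day(year, month, day):
    """Bring the requested day forward to the month's last day if needed."""
    # monthrange returns (weekday of the 1st, number of days in the month).
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day, last_day))


print(is_safe_month_day(30))    # False
print(clamp_day(2019, 2, 30))   # 2019-02-28
print(clamp_day(2020, 2, 30))   # 2020-02-29 (leap year)
print(clamp_day(2019, 4, 30))   # 2019-04-30 (unchanged)
```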

If you have been affected by this problem, please let us know and we can clear down the tasks that have been created in your instance database. Please accept our apologies for this defect. This should not have gotten through our tests or code reviews, so hands up, we are sheepishly off to go and right our wrong.

Thanks,

Gerry

 

 

 


Hi @Dan Munns,

Managing times, especially in a system that can handle multiple time zones, is complicated. In your case, the time was set to 00:00:00 on the 1st. In the UK we are currently on summer time, so the time is shifted back one hour, which gives 23:00:00 on the 30th of September.

We will add another small change so that the default time is 12:00:00, to avoid this kind of issue.
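The one-hour shift is easy to reproduce. The Python sketch below assumes the schedule time is converted from UK summer time (UTC+1) to UTC; the year is arbitrary and the fixed offset is a simplification for illustration. It shows why midnight on the 1st lands on the 30th of the previous month, and why a 12:00:00 default stays on the same day.

```python
from datetime import datetime, timedelta, timezone

# British Summer Time as a fixed UTC+1 offset (simplifying assumption).
BST = timezone(timedelta(hours=1), "BST")

# Midnight on the 1st of October, UK local time. The year is arbitrary.
local_midnight = datetime(2018, 10, 1, 0, 0, 0, tzinfo=BST)
midnight_utc = local_midnight.astimezone(timezone.utc)
print(midnight_utc.isoformat())   # 2018-09-30T23:00:00+00:00

# With a midday default the converted time stays on the 1st.
local_noon = datetime(2018, 10, 1, 12, 0, 0, tzinfo=BST)
noon_utc = local_noon.astimezone(timezone.utc)
print(noon_utc.isoformat())       # 2018-10-01T11:00:00+00:00
```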

Thanks,

Daniel.

 


41 minutes ago, Dan Munns said:

Hi @Gerry

Did you work out why my schedule went haywire? 

As you can see from the image I posted earlier I had set the task to run on the 1st rather than the 30th.

Thanks

Hi Dan,

[edit] My apologies, I see my colleague has already provided the explanation for your specific scenario two posts above. I hope that's OK.

That's interesting, we will need to look at that scenario too. The problem is that the error reporting internally (at the code level) is quite good, but we were not outputting the error messages in any meaningful way. The running-away problem was a bad coding error: we were handling the said error messages but not reporting them and, more crucially, not marking the scheduled job as failed. I am 100% sure the fix we have done will prevent any future run-aways, and if there is a problem with your specific schedule we will get a clear error message/explanation.

Is your scheduled job still on your system, disabled?

Gerry


@Gerry no worries, it's fine. If I hadn't set it up wrong in the first place it may have been missed for a while.

I did think it may have been time related as I have used systems in the past where we had to use 23:59:59 or 00:00:01 as 00:00:00 caused issues.


Oh well lessons learned. That's the fun of IT!! 


If that's the only option then yes, that's fine.

Once the offending tasks had been cleared down, the notifications went back to normal for a time. I only noticed this last night. I was hoping an update would fix it, but I have updated and there is no change.
