Stored Procedure to determine when a job is done

This is a stored procedure that I have found useful in a number of circumstances. It came into being because there are times that I need to run a job via T-SQL and wait for it to complete before continuing the script. The problem is that sp_start_job does just what it says: it starts the job and then immediately continues. There is no option to wait for it to finish.

I needed some way to pause script execution pending the completion (hopefully successfully) of the job.

One important note: this procedure using the undocumented xp_sqlagent_enum_jobs system procedure. While this has been around for ages, it is unsupported. I get why using it may bother some, but this procedure is the only way that I know of to reliably determine the current run status of a job.

The procedure accepts five parameters. The first is a string @job_name containing the name of the job to be monitored. This is the only mandatory parameter. The second parameter is an integer @job_timeout_minutes, which controls how long in minutes that the procedure will wait for the job to complete. If that threshold is exceeded, the proc will raise an error. It defaults to 60 minutes. If @job_timeout_minutes is 0, the procedure will run until the job exits.

The remaining three parameters are all outputs from the procedure. @job_failed is a boolean (bit) value that is true (1) if the job completed but in a failed state. The last two, @job_start_time and @job_complete_time, indicate the beginning and end times for the job execution.

The procedure checks the job status every 5 seconds, so there may be a brief lag between the actual completion of the job and when the procedure returns. For me, this short delay has never been an issue.

Here is some sample code for calling the procedure:

exec msdb.dbo.sp_start_job @job_name = 'Name of job to run';

declare @job_failed bit,
    @job_start_time datetime,
    @job_complete_time datetime;

exec dbo.usp_WaitForJobToFinish @job_name = 'Name of job to run',
    @job_timeout_minutes = 60,
    @job_failed = @job_failed output,
    @job_start_time = @job_start_time output,
    @job_complete_time = @job_complete_time output;

if @job_failed = 1
begin
       raiserror ('Job failed', 16, 1);
end

And finally, here is the stored procedure definition.

create procedure dbo.usp_WaitForJobToFinish
(
       @job_name sysname,
	   @job_timeout_minutes INT = 60,
       @job_failed BIT = 0 OUTPUT,
	   @job_start_time datetime = null output,
	   @job_complete_time datetime = null output
)
AS
set nocount on;

-- Brief wait to let job system stabilize.
waitfor delay '0:00:05';
 
if not exists (select * from msdb.dbo.sysjobs j where j.name = @job_name)
begin
	return;
end

declare @timeoutTime datetime = dateadd(minute, @job_timeout_minutes, getdate());

DECLARE @xp_results TABLE
(
                     job_id                UNIQUEIDENTIFIER NOT NULL,
                     last_run_date         INT              NOT NULL,
                     last_run_time         INT              NOT NULL,
                     next_run_date         INT              NOT NULL,
                     next_run_time         INT              NOT NULL,
                     next_run_schedule_id  INT              NOT NULL,
                     requested_to_run      INT              NOT NULL,
                     request_source        INT              NOT NULL,
                     request_source_id     sysname          COLLATE DATABASE_DEFAULT NULL,
                     running               INT              NOT NULL,
                     current_step          INT              NOT NULL,
                     current_retry_attempt INT              NOT NULL,
                     job_state             INT              NOT NULL
);

/* Values for "job_state":
              0 = Not idle or suspended
              1 = Executing
              2 = Waiting For Thread
              3 = Between Retries
              4 = Idle
              5 = Suspended
              [6 = WaitingForStepToFinish]
              7 = PerformingCompletionActions
*/

DECLARE @can_see_all_running_jobs INT = 1;
DECLARE @job_owner sysname = SUSER_SNAME();
DECLARE @job_state INT = 0;

WHILE @job_state != 4
BEGIN
       DELETE @xp_results;

       INSERT INTO @xp_results
       EXECUTE master.dbo.xp_sqlagent_enum_jobs @can_see_all_running_jobs, @job_owner;

       SELECT @job_state = x.job_state
       FROM @xp_results x
       JOIN msdb.dbo.sysjobs j ON j.job_id = x.job_id
       WHERE j.name = @job_name;

       IF GETDATE() > @timeoutTime
       AND @job_timeout_minutes > 0
       BEGIN
              RAISERROR('Timeout expired', 16, 1);
              RETURN;
       END

       IF @job_state != 4
       BEGIN
              WAITFOR DELAY '0:00:05';
       END
END

SELECT TOP (1) @job_failed = CASE WHEN jh.run_status = 0 /* Failed */ THEN 1 ELSE 0 END,
	@job_start_time = msdb.dbo.agent_datetime(jh.run_date, jh.run_time),
	@job_complete_time = dateadd(second, jh.run_duration, msdb.dbo.agent_datetime(jh.run_date, jh.run_time))
FROM msdb.dbo.sysjobs j
JOIN msdb.dbo.sysjobhistory jh ON jh.job_id = j.job_id
WHERE j.name = @job_name
and jh.step_id = 0
ORDER BY jh.run_date DESC, jh.run_time DESC;
GO

Update 3/23/2020: I addressed a couple of issues in the original procedure. First, if this procedure is executed immediately after the call to sp_start_job, sometimes the initial time that xp_sqlagent_enum_jobs is called, the job is still shown with an idle status, causing the procedure to exit prematurely. I have therefore added a 5-second delay at the beginning of execution to allow the job system to stabilize.

The second fix was to the final select statement. I added jh.step_id = 0 into the WHERE clause so that the overall job status is retrieved, and not just the last step to run. Normally, this will be one and the same, but I added the additional predicate for more correctness.

Update 3/24/2020: Based on Mario’s suggestion in the comments, I have made another change so that if @job_timeout_minutes = 0, the procedure will wait indefinitely until the calling job completes.

3 thoughts on “Stored Procedure to determine when a job is done”

  1. Hi.

    Thanks for your work on this. Gonna try it.

    How would you like to change it so that if @job_timeout_minutes = 0 then there is no timeout and the job runs as long as it takes? I’ve got a job that takes anywhere from 10 to 12 hours to run.

    Just a thought.

    Thanks again.

    Mario

    1. Mario, that is a good suggestion. I have modified the proc to wait indefinitely if @job_timeout_minutes = 0.

  2. Hi,

    This solves all my problems. We are not allowed to use openrowset for security reasons. Thanks a lot.

    Edgar.

Leave a Reply

Your email address will not be published. Required fields are marked *