[ Back to Sprockit home page ]

Restart and rewind

One of Sprockit's key differences is that restart after error is very easy. Restarts are usually necessary after an unexpected error has caused your ETL batch to fail – at this point you'll be in a hurry to get up and running again, so easy restart is a real advantage.

Your approach to managing errors in your ETL probably looks something like this:

  1. Detect an error. You find out that something has gone wrong during your ETL batch and that processing has not completed. If anything has failed, you will notice that Sprockit's process manager SQL Agent job1) itself is reported to have failed – this indicates that at least one process managed by Sprockit has failed.

    You can identify failed processes using this query. You might choose to run it directly as part of your regular checks, or you could embed it in your ETL monitoring dashboard.

    SELECT *
    FROM sprockit.Process
    WHERE [Status] = 'Errored'

    (When processing completes successfully, this query returns no rows).

  2. Diagnose the error. If an error has occurred, you can see more detail in the table sprockit.Error:

    SELECT TOP 10 *
    FROM sprockit.Error

    In the case of errors thrown from SSIS packages, error messages reported in the SSIS catalog database are copied into this table. To look at SSIS catalog execution events directly you will need the SSIS execution ID for the package – you can find it in sprockit.Execution like this:

    , epv.PropertyValue AS SsisExecutionId
    FROM [sprockit].[Error] e
      INNER JOIN sprockit.Execution ex ON ex.ExecutionId = e.ExecutionId
      OUTER APPLY sprockit.tvf_ParsePropertyValue('SsisExecutionId', ex.ExecutionConfiguration) epv

    (This query uses sprockit.Error.[ExecutionId], available from Sprockit v1.5. For earlier versions you will need to use the values in [SourceDb], [SourceSchema] and [SourceObject] to identify the process, then find the Sprockit execution record corresponding to the process's [LatestExecutionId]).

  3. Fix the error. Like a pro.

  4. Restart processing. To restart processing, start the process manager agent job2) at step 2. And that's it!

This feature requires Sprockit v1.5 or higher.

Sometimes a process fails not because of a problem in the process itself, but instead due to some unexpected effect of an earlier process. In this situation, fixing the problem may require a change to that earlier process (and perhaps no change at all to the failed process).

After fixing a process like this, you can't just restart from the point of failure – you need the now-fixed earlier process to be run again to produce the correct results. Sprockit's usp__Rewind utility enables you to 'rewind' processing to that point, allowing you to restart from there.

In a typical scenario, after an ETL failure as described above you find that a process P has failed. You then find that its behaviour is correct, but that it failed because of bad data produced by a predecessor process Q. To fix the problem, you make a change to Q, then you're ready to restart processing starting from Q. To do this:

  1. Rewind to Q. Do this using the utility SP usp__Rewind:

    EXEC sprockit.usp__Rewind 'Q' 
  2. Restart processing. To restart processing, start the process manager agent job3) at step 2. And that's it!

Rewind can also be useful in development and testing environments. When you've deployed a change to dev or test, you might want to run the ETL just from the point of the change. To do that, simply rewind processing to that point, then start the process manager at step 2.

1) , 2) , 3)
The default process manager job name is Run Sprockit (DbName).