[ Back to Sprockit home page ]

Sprockit tutorial

In this tutorial, you'll learn:

  • how processes are configured in Sprockit;
  • what normal (error-free) execution looks like;
  • basic process configuration;
  • what execution errors look like, and how to handle them.

If you haven't done so already, you should make sure you understand the basic concepts by reading the Introduction to Sprockit.

You will need:

  1. the tutorial demo setup script, available here:

    Download it, open the script in SSMS, connect to your Sprockit database, then run it.

The demo configuration contains ten simple stored procedures, named demo.usp_ProcessA through demo.usp_ProcessJ. The demo setup script creates these SPs, registers them in sprockit.Process, and records their inputs and outputs in sprockit.Resource.

The processes need to be run in an order that respects these dependencies:

In this lesson we'll look at some of the objects in the Sprockit framework.

  1. Have a look at the tables and stored procedures in the database. The SPs and tables in the demo schema are just for this tutorial, everything in the sprockit schema is part of Sprockit.

    Notice particularly the tables:

  2. You'll notice that process demo.usp_ProcessF hasn't been configured – that's OK, we'll do it later.

    Have a look at what's in sprockit.Process. In here you'll see an entry for each stored procedure that you found in the demo schema.

    (From Sprockit v1.5 you'll also see a process called 'SprockitIndexMaintenance' – this is a process used by Sprockit to maintain its own table indexes).

    Look in sprockit.Resource and you'll see the input and output resources used by each process.

  3. Finally, have a look at the list of SQL Server Agent jobs. You'll find a new job called RunSprockit (SprockitDb) – this runs Sprockit's process manager.

    (Instead of SprockitDb, you'll see the name of the database where you installed Sprockit).

In this lesson we'll run an ETL batch, watch it in execution, then look at the results.

  1. Start the 'Run Sprockit' SQL Agent job at step 1.

  2. In your query window, run this monitoring query:

    Double-click anywhere in the code to select the whole block.

    SELECT *
    FROM sprockit.Process
    ORDER BY ProcessId

    Keep re-running it every few seconds and you'll notice the status of processes changing as they become Ready, start Running, and finally are Done.

    You will use the query again – keep it open in SSMS.

  3. When everything's Done, take a look at what happened:

    • In table sprockit.Execution you will find a record for each process executed in the batch. Each one has a unique [ExecutionId], a [ProcessId] (from sprockit.Process) indicating which process was executed, and a [HandlerId] indicating which handler ran the process – this refers to a record in sprockit.Handler.

    • The table sprockit.Handler contains one record for each process handler that was used in the batch. You'll find a few records in here, each with a unique [HandlerId] and with a [BatchId] referring to the entry in sprockit.Batch.

    • sprockit.Batch records a unique identifier for each batch, along with its start time and the values of various system properties when the batch started.

  4. Start the 'Run Sprockit' job again, then rerun the monitoring query – now you'll notice that many of values in fields [AvgDuration] and [BranchWeight] are non-zero. These values are updated at batch start using data from sprockit.Execution, and are used to make process selection decisions.

    (If you watch carefully, you'll notice that when multiple processes are Ready at the same time, the process that runs next is the one with with the highest [BranchWeight] value).

  5. In sprockit.Handler, notice the field [SqlAgentJobName]. This stores the name of the SQL agent job that ran each handler, but if you refresh the list of agent jobs after all processes are Done, you won't find any handler jobs – each one is deleted automatically when the handler exits.

    Start the 'Run Sprockit' job again, then refresh the list of agent jobs – you'll notice immediately that a job called something like SprockitDb_SprockitHandler_20180306211345 has been created. Keep refreshing the list of jobs while the process manager runs and you will see more, similar jobs created as the manager spawns new handlers. Each of these gradually disappears as the handlers finish and the jobs are deleted.

  • The Run Sprockit job is Sprockit's process manager. Whenever it finds processes ready, it starts a new handler. Add a schedule to this job to run your ETL processing automatically at your preferred time.
  • Handlers run in separate SQL Agent jobs with names like SprockitDb_SprockitHandler_20180306211345. They are deleted automatically when they finish.

In this lesson we'll add a new process – demo.usp_ProcessF – to the ETL routine.

  1. To run demo.usp_ProcessF, we need to add it to the table sprockit.Process. Run this script to do that:

    INSERT INTO sprockit.Process (
      DbName
    , SchemaName
    , ProcessName
    ) VALUES (
      DB_NAME()
    , 'demo'
    , 'usp_ProcessF'
    )
  2. To run the new process in the right order, we need to add its resources to sprockit.Resource.

    Look at the definition of demo.usp_ProcessF and you will see that the procedure selects data from tables demo.Table_03 and demo.Table_04, and uses it to populate demo.Table_08. Sprockit refers to Table_03 and Table_04 as inputs for usp_ProcessF, and to Table_08 as its output.

    Populate the process's inputs and outputs by running this script:

    INSERT INTO [sprockit].[Resource] (
      [ProcessId]
    , [IsInput]
    , [DbName]
    , [SchemaName]
    , [ResourceName]
    )
    SELECT 
      p.ProcessId
    , t.IsInput 
    , DB_NAME()
    , 'demo'
    , t.TableName
    FROM sprockit.Process p
      CROSS JOIN (VALUES
        ('Table_03', 1)
      , ('Table_04', 1)
      , ('Table_08', 0)
      ) t (TableName, IsInput)
    WHERE p.ProcessName = 'usp_ProcessF'

    We're configuring usp_ProcessF in this way so that you can see exactly what's going on, but other options are available!

  3. Start the 'Run Sprockit' job again, then rerun the monitoring query. Keep refreshing the results to watch processes being run – you will notice that usp_ProcessF runs after usp_ProcessB and before usp_ProcessI (as indicated by the process dependency graph at the top of the page).

    Notice also that when you configured usp_ProcessF, you didn't specify any information directly about usp_ProcessB or usp_ProcessI – you just provided the list of resources used and produced by usp_ProcessF.

  • For a process to run, it must appear in sprockit.Process. For processes to run in the right order, their resource dependencies must be recorded in sprockit.Resource.
  • Configuring resource dependencies means you don't need to know anything about other processes that are already set up.

In this lesson we'll encounter an ETL error, fix it, then resume processing where we left off.

  1. Run this SQL script to alter the definition of demo.usp_ProcessE:

    ALTER PROCEDURE [demo].[usp_ProcessE]
    AS 
     
    DECLARE @i INT = CAST('abc' AS INT)
    WAITFOR DELAY '00:00:05'

    This is going to cause an error when it attempts to cast the string 'abc' to an INT.

  2. Start the 'Run Sprockit' job, then use the monitoring query from lesson 2 to monitor progress. After a while you'll notice a few things:

    • demo.usp_ProcessE gets the status Errored;
    • demo.usp_ProcessI gets the status Blocked because it depends on demo.usp_ProcessE (as shown in the dependency graph at the top of this page);
    • other processes not dependent on demo.usp_ProcessE continue to run normally, until everything reachable (not Errored or Blocked) is Done.

  3. When the process manager finishes, you will see in the agent job history that the 'Run Sprockit' job is reported to have failed – this indicates that not all processes were Done when it finished.

    To see what exactly went wrong, take a look in the table sprockit.Error:

    SELECT * FROM [sprockit].Error

    In here you will find a record indicating that demo.usp_ProcessE failed, reporting the error “Conversion failed when converting the varchar value 'abc' to data type int” on line 4.

    Fix the error by altering the procedure, commenting out the offending statement:

    ALTER PROCEDURE [demo].[usp_ProcessE]
    AS 
     
    --DECLARE @i INT = CAST('abc' AS INT)
    WAITFOR DELAY '00:00:05'
  4. We're ready to carry on processing now, but we don't want to have to start again from the beginning – we just want to pick up from where we left off.

    To do this, start the 'Run Sprockit' SQL Agent job at step 2, then run the progress monitoring query. You'll notice that everything that was already Done is still Done – but that demo.usp_ProcessE is now Ready (or already Running again) and demo.usp_ProcessI is Not ready. When demo.usp_ProcessE now finishes successfully, demo.usp_ProcessI will become Ready, start Running, and finally be Done in the usual way.

  5. Finally, if you take another look in the sprockit.Execution table you'll find two entries for demo.usp_ProcessE in the latest batch – one which failed with error number 245 and one which succeeded.

    SELECT 
      e.*
    FROM sprockit.Process p
      INNER JOIN sprockit.Execution e ON e.ProcessId = p.ProcessId
      INNER JOIN sprockit.Handler h ON h.HandlerId = e.HandlerId
    WHERE p.ProcessName = 'usp_ProcessE'
    AND h.BatchId = (
      SELECT MAX(BatchId) FROM sprockit.Batch
    )

  • The Run Sprockit job is Sprockit's process manager. Starting it at step 2 causes processing to pick up where it left off (usually after an error).
  • This is the only time you would start it at step 2 – normally it runs on a schedule and starts at step 1.

Now you're ready to start using Sprockit!

  1. Create a new Sprockit install (don't use the demo database you've been using – make a fresh one).
  2. Add your processes to sprockit.Process and their inputs and outputs to sprockit.Resource.
  3. Schedule the Run Sprockit job to run when you want your ETL processing to start.

To clean up after the tutorial, you can just drop the tutorial database and the 'Run Sprockit' agent job.

Ready to level up? Time to consider:

Any problems? Have a look at some frequently asked questions.