[ Back to Sprockit home page ]

Configuring processes in Sprockit

Process configuration in Sprockit has two parts:

A process in Sprockit is either:

  • something that is executed (a stored procedure or an SSIS package), or;
  • a stored query (a view or a user-defined function).

To have Sprockit run a stored procedure or package, you must make an entry for it in sprockit.Process; if its input resources include views or user-defined functions, you should also make an entry in sprockit.Process for each such stored query.

Configuring views and functions in sprockit.Process ensures that dependent processes don't attempt to use them until their underlying base tables are ready.

When making an entry for a process, you need only provide values for three attributes: [DbName], [SchemaName], and [ProcessName], e.g.:

INSERT INTO sprockit.Process (
  DbName
, SchemaName
, ProcessName
) VALUES (
  'MyDb'
, 'MySchema'
, 'usp_MyProcess'
)

SSIS packages must be installed in the local SSIS catalog and are configured in exactly the same way. Set the values of [DbName] and [SchemaName] to the names of the catalog folder and project containing the package. The value of [ProcessName] is the name of the package, including its .dtsx extension, e.g.:

INSERT INTO sprockit.Process (
  DbName
, SchemaName
, ProcessName
) VALUES (
  'MySsisCatalogFolder'
, 'MySsisProject'
, 'MyPackage.dtsx'
)

A resource in Sprockit is something that provides data to, or is the product of, a process. Resources that provide data to processes are referred to as the process's inputs; those produced by the process are its outputs. Processes' resources are configured in sprockit.Resource.

Sprockit uses process dependencies to control the order of process execution, but you specify them as resource dependencies because they are easier to configure.

A process can have any number of inputs or outputs (zero or more of each), but a resource can be the output of only one process. A stored query's output – the output of a view or function process – is the view or function, and must be configured as such.

The process with which a resource is associated is indicated by its [ProcessId]: when configuring a resource you must specify the relevant process ID, the three parts of the resource's name, and whether the resource is an input or output. e.g.:

INSERT INTO sprockit.Resource (
  ProcessId          
, DbName     
, SchemaName 
, ResourceName       
, IsInput            
) VALUES 
  (123, 'MyDb', 'MySchema', 'MyTable', 1)   -- input resource
, (123, 'MyDb', 'MySchema', 'MyOutput', 0)  -- output resource

To avoid having to find the process ID yourself, you could do something like this:

WITH cte AS (
  SELECT * FROM (VALUES 
    ('MyDb', 'MySchema', 'MyTable', 1)   -- input resource
  , ('MyDb', 'MySchema', 'MyOutput', 0)  -- output resource
  ) t (    
    DbName     
  , SchemaName 
  , ResourceName       
  , IsInput            
  )
)
INSERT INTO sprockit.Resource (
  ProcessId          
, DbName     
, SchemaName 
, ResourceName       
, IsInput            
)
SELECT
  p.ProcessId
, cte.DbName     
, cte.SchemaName 
, cte.ResourceName       
, cte.IsInput            
FROM cte
  INNER JOIN sprockit.Process p
    ON p.DbName = 'MyDb'
    AND p.SchemaName = 'MySchema'
    AND p.ProcessName = 'usp_MyProcess'

Double-click anywhere in the code to select the whole block.

Instead of having to write INSERT statements yourself, you can give dependency information to Sprockit in a small piece of XML and it will do it for you!

This piece of XML describes the process configured above:

<Process name="MyDb.MySchema.usp_MyProcess">
  <Inputs>
    <Resource name="MyDb.MySchema.MyTable" />
  </Inputs>
  <Outputs>
    <Resource name="MyDb.MySchema.MyOutput" />
  </Outputs>
</Process>

Process and resource names are treated like other three-part names in TSQL – if your name contains a space, a full-stop, or other problem characters, you must surround it in square brackets. This is always necessary for SSIS package names, e.g. MySsisCatalogFolder.MySsisProject.[MyPackage.dtsx].

To configure the process, pass the XML to the stored procedure sprockit.usp__ConfigureProcess:

EXEC sprockit.usp__ConfigureProcess '
<Process name="MyDb.MySchema.usp_MyProcess">
  <Inputs>
    <Resource name="MyDb.MySchema.MyTable" />
  </Inputs>
  <Outputs>
    <Resource name="MyDb.MySchema.MyOutput" />
  </Outputs>
</Process>
'

sprockit.usp__ConfigureProcess shreds the XML to extract the process and resource details, then makes the necessary changes in sprockit.Process and sprockit.Resource. You can use this approach to add a new process or to update the configuration of an existing one.

When updating an existing process configuration, you must either re-specify all its resources or specify partialConfig=“true” in the <Process> tag, e.g.:

EXEC sprockit.usp__ConfigureProcess '
<Process name="MyDb.MySchema.usp_MyProcess" partialConfig="true">
  <Inputs>
    <Resource name="MyDb.MySchema.MyExtraTable" />
  </Inputs>
</Process>
'

If you omit partialConfig=“true”, any of the process's resources already present in sprockit.Resource but not specified in the XML will be removed.

A stored query is a view or user-defined function. Objects of these kinds encapsulate access to underlying base tables (perhaps via other views or functions), so are not ready for use until those underlying tables are all up to date.

To ensure that a stored query is not used until it is ready, you should configure views and functions as processes in their own right. The (single) output of such a process is the stored query itself, e.g.:

EXEC sprockit.usp__ConfigureProcess '
<Process name="MyDb.MySchema.uvw_MyView2">
  <Inputs>
    <Resource name="MyDb.MySchema.MyTable1" />
    <Resource name="MyDb.MySchema.MyTable2" />
    <Resource name="MyDb.MySchema.uvw_MyView1" />
  </Inputs>
  <Outputs>
    <Resource name="MyDb.MySchema.uvw_MyView2" />
  </Outputs>
</Process>
'

This ensures that any processes using the view or function do not attempt to do so before its base tables are ready.

You can use the utility stored procedure sprockit.usp__CheckProcessConfiguration to report cases in which views or functions are used as inputs but do not appear as the output of a process with a matching name.