Row diversion on error

SSIS easily supports the diversion of error rows, but diagnosis of the error that caused the row to be diverted is less straightforward. A data flow component's error output includes all of its input columns, with two additional ones:

  • ErrorCode – the SSIS code identifying the type of error encountered;
  • ErrorColumn – the integer ID of the column that caused the error.

Neither of these two attributes are immediately human-readable. This article presents a pattern for translating them using a script component that requires only minimal modification for inclusion in a new data flow.

  1. Build your data flow as normal.
  2. Drag a script component onto the data flow surface, setting its type to 'Transformation'
  3. Take the error output of the component requiring diversion handling and connect it to the input of the new script transformation. Change the row error handling mode from 'Fail component' to 'Redirect row'
  4. Double-click on the script component to open its editor, then revise it as follows:
    • On the Input Columns page, select all available input columns
    • On the Inputs and Outputs page, drill down into the properties of the components output ('Output 0') and click on 'Output Columns'
    • Use the 'Add column' button to add columns with the following Name and DataType properties:

      • ErrorDescription DT_WSTR(255) – this column will contain the description associated with the ErrorCode value supplied by SSIS
      • ErrorColumnName DT_WSTR(128) – this column will contain the column name associated with the ID supplied by SSIS in ErrorColumn
      • ErrorValue DT_NTEXT – this column will contain the value of the column that caused the error
      • RowId DT_WSTR(255) – this column will contain an (implementation-specific) row identifier, identifying the row that caused the error

      The latter two columns may be unnecessary if you are diverting errors into destination-specific error tables, but will be useful if you are diverting them into a single generic error table.

    • On the Script page, click 'Edit Script'.
  5. The default script that opens in VSTA includes the line public class ScriptMain : UserComponent. This is the declaration of the ScriptMain class.
    • Select the code for this pattern (below – you can select the whole block by double-clicking on it somewhere) and copy it to the clipboard
    • Paste it into the script, replacing the existing class declaration and everything after it in the script file.
    • In the body of method Input0_ProcessInputRow, edit this line:

      Row.RowId = Row.YourRowIdFieldNameHere_IsNull ? “” : Row.YourRowIdFieldNameHere.ToString();

      replacing YourRowIdFieldNameHere and YourRowIdFieldNameHere_IsNull with the properties representing your own unique row identifier. (It doesn't actually matter what property you specify here – but it's useful to have a row ID. If you don't have one in your data flow, it may be useful to add a row number in the data flow itself.)

  6. Save, close, etc. Now you can connect the output of the script component to write the translated error details into a destination of your choice (flat file, error table, etc).

For each row that passes through the component, the Input0_ProcessInputRow method populates the four new columns:

  • RowId is given the value of the property you specify when you replace YourRowIdFieldNameHere and YourRowIdFieldNameHere_IsNull
  • ErrorDescription is looked up in the component's metadata
  • ErrorColumnName is determined by looking up the column's identification string from its ID, then transforming it is something close to the original column name by comparing it to the list of input columns. (The list of input columns is cached in the columnNames list during the execution of the component's PreExecute() method).
  • A type-independent ErrorValue is read from the errored column as a byte array. This enables the script to remain metadata-agnostic and much more reusable.
public class ScriptMain : UserComponent
{
  private System.Collections.Generic.List<string> columnNames;
  private IDTSComponentMetaData130 metadata;
 
  public override void PreExecute()
  {
    base.PreExecute();
    metadata = this.ComponentMetaData as IDTSComponentMetaData130;
    columnNames = new System.Collections.Generic.List<string>();
    foreach (IDTSInputColumn130 c in metadata.InputCollection[0].InputColumnCollection)
      columnNames.Add(c.Name);
  }
 
  public override void Input0_ProcessInputRow(Input0Buffer Row)
  {
    try
    {
      var columnName = GetColumnName(Row.ErrorColumn);
 
      Row.RowId = Row.YourRowIdFieldNameHere_IsNull ? "" : Row.YourRowIdFieldNameHere.ToString();
      Row.ErrorDescription = metadata.GetErrorDescription(Row.ErrorCode);
      Row.ErrorColumnName = columnName;
      Row.ErrorValue.AddBlobData(GetColumnValue(columnName, Row));
    }
    catch (Exception e)
    {
      // transform .Net error into SSIS error
      bool pbCancel = true;
      metadata.FireError(-1, "", e.StackTrace, "", -1, out pbCancel);
    }
  }
 
  private string GetColumnName(int errorColumn)
  {
    if (errorColumn < 1)
      return "<unavailable>";
    string idStr = metadata.GetIdentificationStringByID(errorColumn);
    foreach (string c in columnNames)
      if (idStr.EndsWith("[" + c + "]"))
        return c;
    return idStr; // if column names have been revised between the errored component and this one...
  }
 
  private byte[] GetColumnValue(string columnName, Input0Buffer row)
  {
    columnName = columnName.Replace("_", "");
 
    object value = null;
    foreach (var p in row.GetType().GetProperties())
      if (p.Name == columnName + "_IsNull")
        value = (bool)p.GetValue(row) ? "<null>" : row.GetType().GetProperty(columnName).GetValue(row);
    string str = value == null ? "<unavailable>" : value.ToString();
 
    byte[] bytes = new byte[str.Length * sizeof(char)];
    System.Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);
    return bytes;
  }
}