Calculating Azure Data Factory test coverage

This is the sixth and final part of my series on automated testing for Azure Data Factory (ADF) pipelines. If this is the first part you're reading, you may prefer to start at the beginning. If it's the sixth part, thanks for coming with me and well done 😊.

In software engineering, code coverage (or test coverage) measures the proportion of a program's source code executed by a given test suite. This provides assurance around the completeness – or otherwise – of a program's testing. In the previous article I used Azure Data Factory's pipeline execution history to verify that specific activities were executed. In this article I use execution history data to measure the proportion of a data factory's activities (across all pipelines) executed during a full test run – the test suite's activity coverage.

To determine which activities have been executed by a test suite, I need to collect and aggregate activity run data from every pipeline execution triggered from any test fixture. In the previous post I developed components to retrieve and cache activities for a pipeline run – I'll use those components here to collect data systematically.

I'm going to create a new helper class to contain functions specific to coverage measurement. It's a subclass of the database helper because I want to exploit functionality from classes further up the hierarchy:

This means that pipeline-specific helper classes must now be subclasses of CoverageHelper – you can see that change in this post's Visual Studio solution.

You may prefer to follow this section in the complete VS solution. All the code in this article is available on GitHub – there's a link at the end.

The coverage helper class adds a new RunPipeline() method, which hides the original implementation in the pipeline run helper. This allows it to do a couple of other things, in addition to calling the original method directly:

  • Before the pipeline is run (line 7), the new method caches the current call stack – this enables me to identify the test scenario that triggered the pipeline (e.g. PL_Stage_Titles.GivenExternalDependencies)
  • After the pipeline has completed (lines 9-11) – and if coverage reporting has been enabled – a new GetActivityRuns() method (in the pipeline run helper) gets a list of all the activity runs from the pipeline's execution, logging them using the private RecordActivityRun() method.
  1. namespace AdfTesting.Helpers
  2. {
  3. public class CoverageHelper<T> : DatabaseHelper<T> where T : CoverageHelper<T>
  4. {
  5. public new async Task RunPipeline(string pipelineName)
  6. {
  7. var callStack = new System.Diagnostics.StackTrace();
  8. await base.RunPipeline(pipelineName);
  9. if (ReportTestCoverage)
  10. foreach (var ar in await GetActivityRuns())
  11. RecordActivityRun(ar, callStack.ToString());
  12. }
  13.  
  14. public bool ReportTestCoverage
  15. {
  16. get
  17. {
  18. try
  19. {
  20. var measure = GetSetting("ReportTestCoverage");
  21. if (measure == "true")
  22. return true;
  23. }
  24. catch (Exception) { }
  25. return false;
  26. }
  27. }
  28.  
  29. private void RecordActivityRun(ActivityRun ar, string context)
  30. {
  31. var parameters = new Dictionary<string, object>
  32. {
  33. ["@pipelineName"] = ar.PipelineName,
  34. ["@activityName"] = ar.ActivityName,
  35. ["@context"] = context
  36. };
  37. ExecuteStoredProcedure("test.RecordActivityRun", parameters);
  38. }
  39. }
  40. }
  • The property ReportTestCoverage (line 14) returns true if the runsettings file contains a “ReportTestCoverage” setting with the value “true”. Setting it to “false” (or omitting it) allows tests to be run without the overhead of collecting activity run data.
  • RecordActivityRun() (line 29) passes the pipeline and activity names, along with the method call stack trace, to database stored procedure [test].[RecordActivityRun] (via a new database helper method ExecuteStoredProcedure()).
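
Because RunPipeline() is declared with `new` rather than `override`, which version runs depends on the compile-time type of the reference, not the runtime type of the object. The minimal sketch below shows the effect. BaseHelper and CoverageAwareHelper are hypothetical stand-ins, not types from the solution, and they log to a list instead of calling ADF or a database:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the pipeline run helper.
public class BaseHelper
{
    public List<string> Log { get; } = new List<string>();

    public void RunPipeline(string pipelineName)
    {
        Log.Add($"run {pipelineName}");
    }
}

// Hypothetical stand-in for the coverage helper.
public class CoverageAwareHelper : BaseHelper
{
    // 'new' hides BaseHelper.RunPipeline instead of overriding it.
    public new void RunPipeline(string pipelineName)
    {
        base.RunPipeline(pipelineName);
        Log.Add($"record coverage for {pipelineName}");
    }
}

public static class HidingDemo
{
    public static void Main()
    {
        // Called through the derived type: the hiding method runs.
        var viaDerived = new CoverageAwareHelper();
        viaDerived.RunPipeline("PL_Stage_Titles");
        Console.WriteLine(viaDerived.Log.Count); // 2

        // Called through a base-typed reference: only the original method runs.
        BaseHelper viaBase = new CoverageAwareHelper();
        viaBase.RunPipeline("PL_Stage_Titles");
        Console.WriteLine(viaBase.Log.Count); // 1
    }
}
```

If the real helpers behave like this sketch, a fixture that called RunPipeline() through a reference typed as one of the classes above CoverageHelper would run the pipeline but silently skip the coverage logging.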

The pipeline run helper's new GetActivityRuns() method uses the existing InitialiseActivityRuns() behaviour to cache all activity runs in the helper. This ensures that no additional calls to ADF will be required to run subsequent unit tests.

public async Task<IEnumerable<ActivityRun>> GetActivityRuns()
{
    await InitialiseActivityRuns();
    return _activityRuns;
}

The database helper method ExecuteStoredProcedure() provides a convenient way to run a named stored procedure, optionally with parameters:

public void ExecuteStoredProcedure(string spName, Dictionary<string, object> parameters = null)
{
    using (var cmd = new SqlCommand(spName, _conn))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        if (parameters != null)
            foreach (string parameterName in parameters.Keys)
                cmd.Parameters.Add(new SqlParameter(parameterName, parameters[parameterName]));
        cmd.ExecuteNonQuery();
    }
}

The coverage helper's RecordActivityRun() method uses ExecuteStoredProcedure() to call database SP [test].[RecordActivityRun]. The SP writes its parameters – the details of a single activity run – into testing log table [test].[ActivityRun]:

After a test suite has completed execution, the data in this table describes every ADF pipeline activity run which took place during any of the suite's tests.

The coverage helper's RunPipeline() method allows me to identify every activity run from the test suite's execution, but that's only half the story. To determine if activities have not been run, I also need a list of every activity from every pipeline in the data factory instance. Obtaining pipelines' activities should be independent of test fixtures, because I want to be able to identify pipelines I've forgotten to test. I also don't want to waste time retrieving the same pipeline's activities more than once.

In addition to test fixtures, NUnit has the notion of a setup fixture that can be used to run [OneTimeSetUp] (and/or [OneTimeTearDown]) once for a group of test fixtures. A [OneTimeSetUp] method in a setup fixture not in any namespace will be run once for the entire test suite, before any tests are run. I use this feature to collect all the data factory's pipeline activities before test execution begins.

This also gives me the opportunity to truncate the [test].[ActivityRun] table at the start of each test run.

I create class CoverageHelperSetup as a [SetUpFixture] outside any namespace (line 1). It subclasses CoverageHelper to take advantage of other helper resources.

If coverage reporting is enabled, [OneTimeSetUp] takes three actions:

  • empties table [test].[ActivityRun] (using the database helper's existing WithEmptyTable() method; line 9)
  • empties table [test].[PipelineActivity] in the same way (line 10)
  • calls GetPipelines(), a new method in the data factory helper that returns all the factory instance's pipelines and activities, logging each activity using the private RecordActivity() method (lines 12-14).

Afterwards, the method calls TearDown() explicitly to ensure that connections in the database and factory helpers are properly disposed (line 17).

  1. [SetUpFixture]
  2. public class CoverageHelperSetup : CoverageHelper<CoverageHelperSetup>
  3. {
  4. [OneTimeSetUp]
  5. public async Task SetupCoverageHelper()
  6. {
  7. if (ReportTestCoverage)
  8. {
  9. WithEmptyTable("test.ActivityRun");
  10. WithEmptyTable("test.PipelineActivity");
  11.  
  12. foreach (var p in await GetPipelines())
  13. foreach (var a in p.Activities)
  14. RecordActivity(p.Name, a);
  15. }
  16.  
  17. TearDown();
  18. }
  19.  
  20. private void RecordActivity(string pipelineName, Activity act)
  21. {
  22. var parameters = new Dictionary<string, object>
  23. {
  24. ["@pipelineName"] = pipelineName,
  25. ["@activityName"] = act.Name
  26. };
  27. ExecuteStoredProcedure("test.RecordActivity", parameters);
  28.  
  29. if (act is ForEachActivity)
  30. foreach (var a in ((ForEachActivity)act).Activities)
  31. RecordActivity(pipelineName, a);
  32.  
  33. if (act is UntilActivity)
  34. foreach (var a in ((UntilActivity)act).Activities)
  35. RecordActivity(pipelineName, a);
  36.  
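  37. // Branch and case lists are optional in the SDK model, so treat null as empty
  38. if (act is IfConditionActivity ifCondition)
  39. {
  40. foreach (var a in ifCondition.IfTrueActivities ?? new List<Activity>())
  41. RecordActivity(pipelineName, a);
  42. foreach (var a in ifCondition.IfFalseActivities ?? new List<Activity>())
  43. RecordActivity(pipelineName, a);
  44. }
  45.  
  46. if (act is SwitchActivity switchActivity)
  47. {
  48. foreach (var c in switchActivity.Cases ?? new List<SwitchCase>())
  49. foreach (var a in c.Activities ?? new List<Activity>())
  50. RecordActivity(pipelineName, a);
  51. // Activities in the Switch's default case are held in a separate list
  52. foreach (var a in switchActivity.DefaultActivities ?? new List<Activity>())
  53. RecordActivity(pipelineName, a);
  54. }
  55. }
  56. }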

RecordActivity() uses the database helper's ExecuteStoredProcedure() method to call a database stored procedure. Each call to the specified SP – [test].[RecordActivity] – makes an entry in database table [test].[PipelineActivity]:

Some pipeline activities – like ForEach or If Condition – contain other activities. Nested activities are flattened into a single list by recursive calls in RecordActivity().

The data factory helper's new method – GetPipelines() – returns every pipeline published to the data factory instance.

public async Task<List<PipelineResource>> GetPipelines()
{
    await InitialiseClient();
    var page = await _adfClient.Pipelines.ListByFactoryAsync(_rgName, _adfName);
    var pipelines = page.ToList();
 
    while (!string.IsNullOrWhiteSpace(page.NextPageLink))
    {
        page = await _adfClient.Pipelines.ListByFactoryNextAsync(page.NextPageLink);
        pipelines.AddRange(page.ToList());
    }
    return pipelines;
}

Large factory instances may return pipeline lists in more than one “page” – the purpose of the while loop is to iterate over all pages.
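
The same pattern (fetch the first page, then follow the next-page link until there isn't one) can be exercised without a data factory. The sketch below is a simplified stand-in: Page, FakeClient and their members are hypothetical and only mimic the shape of a paged result, they are not SDK types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical page type: some items plus a link to the next page (null on the last page).
public class Page
{
    public List<string> Items { get; set; } = new List<string>();
    public string NextPageLink { get; set; }
}

// Hypothetical client that serves a fixed list in pages of a given size.
// The "link" is simply the offset of the next page, stored as a string.
public class FakeClient
{
    private readonly List<string> _all;
    private readonly int _pageSize;

    public FakeClient(IEnumerable<string> all, int pageSize)
    {
        _all = all.ToList();
        _pageSize = pageSize;
    }

    public Page First() => Get(0);

    public Page Next(string nextPageLink) => Get(int.Parse(nextPageLink));

    private Page Get(int offset)
    {
        var end = offset + _pageSize;
        return new Page
        {
            Items = _all.Skip(offset).Take(_pageSize).ToList(),
            NextPageLink = end < _all.Count ? end.ToString() : null
        };
    }
}

public static class PagingDemo
{
    // Collects the items from every page, following links until none remains.
    public static List<string> ListAll(FakeClient client)
    {
        var page = client.First();
        var results = new List<string>(page.Items);
        while (!string.IsNullOrWhiteSpace(page.NextPageLink))
        {
            page = client.Next(page.NextPageLink);
            results.AddRange(page.Items);
        }
        return results;
    }

    public static void Main()
    {
        var client = new FakeClient(new[] { "PL_A", "PL_B", "PL_C", "PL_D", "PL_E" }, 2);
        Console.WriteLine(string.Join(", ", ListAll(client))); // PL_A, PL_B, PL_C, PL_D, PL_E
    }
}
```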

To measure activity coverage, I compare activities executed by the test suite against activities in the data factory's pipelines. I do this using database view [test].[CoverageReport] (defined in the Visual Studio solution for this post), which combines data from the [test].[ActivityRun] and [test].[PipelineActivity] tables.

This screenshot reports coverage for the test suite from the same VS solution:

I copied the test suite from my previous post, then deleted the Given100Rows scenarios for pipeline “PL_Stage_Titles_With_Warning” (in both the unit and functional test fixtures). As a result, the scenario in which the row count is above the minimum warning threshold is never tested. The coverage report highlights untested activities, allowing gaps in coverage like this to be identified and addressed.

From the coverage report above, 18 of 19 – around 95% – of the factory's activities were executed at least once by my test suite.
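
The comparison behind a figure like this amounts to a set difference. As a rough illustration only (it is not the definition of [test].[CoverageReport], which lives in the solution), the sketch below works on in-memory (pipeline, activity) pairs with made-up names and reproduces the 18-of-19 arithmetic:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class CoverageDemo
{
    // Returns the activities that exist in the factory but never appear in the run log.
    public static List<(string Pipeline, string Activity)> Untested(
        IEnumerable<(string Pipeline, string Activity)> allActivities,
        IEnumerable<(string Pipeline, string Activity)> executedActivities)
    {
        var executed = new HashSet<(string, string)>(executedActivities);
        return allActivities.Distinct().Where(a => !executed.Contains(a)).ToList();
    }

    // Percentage of activities executed at least once.
    // An empty factory is treated as fully covered, which is an arbitrary choice.
    public static double CoveragePercent(int total, int untested)
    {
        return total == 0 ? 100.0 : 100.0 * (total - untested) / total;
    }

    public static void Main()
    {
        // Illustrative data: 19 activities in one made-up pipeline, 18 of them executed.
        var all = Enumerable.Range(1, 19)
            .Select(i => ("PL_Example", $"Activity {i}"))
            .ToList();
        // An activity that ran several times still counts once.
        var executed = all.Take(18).Concat(all.Take(3)).ToList();

        var untested = Untested(all, executed);
        var percent = CoveragePercent(all.Count, untested.Count);
        Console.WriteLine(percent.ToString("F1", CultureInfo.InvariantCulture) + "%"); // 94.7%
        Console.WriteLine(untested[0].Activity); // Activity 19
    }
}
```

The list of untested activities is the more useful output: the percentage only says that a gap exists, while the list says where it is.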

The equivalent to activity coverage in a general-purpose programming language is statement coverage. Other coverage measures with ADF analogues include:

  • function coverage – has every Execute Pipeline activity been run?
  • branch coverage – have the true/false branches of every If Condition been run? Has every case of a Switch activity been run?

In languages with more open-ended flow of control, measures like these can be important. In the case of ADF, each is implied by complete activity coverage – if every activity is executed, then every Execute Pipeline activity has been run, and so has every branch or case that contains at least one activity.

For me, % activity coverage isn't particularly useful, beyond telling me when coverage is incomplete. Incomplete coverage means that one or more activities were never executed during a test suite – i.e. that they have not been tested. I have two principal use cases for activity coverage measurement:

  1. to make sure that every pipeline activity has been tested at least once
  2. to influence local practice in the absence of isolated testing.

If you're in a shop that only ever runs integration tests, a coverage measure showing that some pipeline activities are never tested is a powerful argument for isolating and automating tests.

In this post I used the execution history for all ADF pipelines executed during a test run to calculate activity coverage for my test suite. Gaps in activity coverage reveal untested activities, and may indicate that scenarios under test are not sufficiently varied. Highlighting coverage gaps improves testing by uncovering scenarios which are not tested.

  • Code: The code for the series is available on GitHub. The Visual Studio solution specific to this article is in the adf-testing-series/vs/06-CodeCoverage folder. It contains three projects: the NUnit project AdfTests along with database projects for the [ExternalSystems] and [AdfTesting] databases. Tables in the [ExternalSystems] database are based on Microsoft's Northwind sample database.

    The GitHub repo includes my “tests.runsettings” file, but its association with the VS solution is not persisted – this is a VS issue. Before running tests for the first time you will need to specify the solution's runsettings file.

  • Share: If you found this article useful, please share it – and thanks again for reading!