• 14-Sep-2021Infrastructure as nearly-all-code

    T-SQL Tuesday #142 (September 2021) is hosted by Frank Geisler. Frank's choice of subject is "using descriptive techniques to build database environments".

  • 23-Feb-2021More Get Metadata in ADF

    Last year I wrote a post about doing this in pure ADF, with really terrible performance. By way of apology, I've had another go in an Azure Function!

  • 10-Nov-2020Azure Data Factory, the ADF UX and Git

    If you're using Azure Data Factory (ADF), you're probably using Git and almost certainly using the ADF User Experience (ADF UX) – ADF's online integrated development environment (IDE). These three components are so closely interlinked that sometimes it's hard to think about them separately – in this article I try to do exactly that.

  • 28-Oct-2020Catch-22: Automating MSI access to an Azure SQL Database

    A problem I ran into recently is how to automate granting access to SQL databases in different environments. I'm using Terraform to build parallel data engineering environments which – amongst other things – include one or more SQL databases and instances of Azure Data Factory (ADF). ADF pipelines need access to databases, and I want to authenticate linked service connections using the factory's managed identity (MSI). Building this automatically is harder than it sounds!

  • 13-Oct-2020The Ice Cream Van of Abstraction

    October 2020's T-SQL Tuesday is hosted by Rob Volk (b|t) with the subject Data Analogies, or: Explain Databases Like I’m Five! Thanks for hosting, Rob! It turns out that I don't do much specifically databasplaining™ by analogy, but there's one analogy I go back to again and again when talking to people unfamiliar with a useful concept, both inside and outside of SQL Server: abstraction.

  • 07-Oct-2020Google Analytics API pagination in Azure Data Factory

    In a previous post I created a pipeline to retrieve data from the Google Analytics reporting API, using an OAuth 2.0 access token for authorisation. Azure Data Factory's Copy data activity handles various styles of paged API response, but it doesn't support the approach taken by the Google Analytics reporting API. In this post I look at how to make that work.

  • 29-Sep-2020Get Metadata recursively in Azure Data Factory

    Azure Data Factory's Get Metadata activity returns metadata properties for a specified dataset, but not recursively. In this post I try to build an alternative using just ADF.

  • 10-Sep-2020Extract data from Google Analytics with Azure Data Factory

    In a previous post I prepared an ADF pipeline to make authorised Google Analytics API requests, using an Azure Function to obtain an OAuth 2.0 access token. In this article I'm going to use the returned token to authorise an API connection and extract data from Google Analytics.

  • 08-Sep-2020Automate to replicate

    In September 2020's T-SQL Tuesday, Elizabeth Noble (b|t) wants to know what members of the SQL community have automated to make their lives easier. Thanks for asking, Elizabeth 😀.

  • 28-Aug-2020Access Google Analytics with Azure Data Factory

    At the time of writing, Azure Data Factory has no connector to enable data extraction from Google Analytics, but it seems to be a common requirement – here's how to do it using ADF's current feature set.

  • 13-Aug-2020Parameterising the Execute Pipeline activity

    A shortcoming of Azure Data Factory's Execute Pipeline activity is that the pipeline to be triggered must be hard-coded into the activity – so it's impossible to use metadata-driven approaches like iterating over a list of pipeline names. This post looks at an alternative approach.

  • 11-Aug-2020Time capsule

    T-SQL Tuesday for August 2020 is hosted by Tamera Clark (b|t). She's asking for help in assembling a #SQLCommunity time capsule -- thanks for hosting, Tamera!

  • 24-Jul-2020Why automate ADF pipeline testing?

    Since writing my series on automated testing Azure Data Factory pipelines, I've had a few questions along the lines of "why bother?". One reader commented that "a typical ADF developer tests their pipeline doing debug runs". This is exactly how I develop pipelines: make a change, run the change, repeat until the pipeline does what I want. "Why bother with more testing?" is a good question!

  • 14-Jul-2020Default fault

    July 2020's T-SQL Tuesday is hosted by Kerry Tyler (b|t). Kerry describes pilots' use of plane crash reports to learn aviation safety lessons, and asks for similar – albeit hopefully lass catastrophic – tales of SQL Server-related disaster.

  • 09-Jul-2020Dropping temporary tables

    Local temporary tables – tables with names that begin with a single # character – are dropped automatically by SQL Server when they are no longer in scope. So why drop them explicitly at all? Here are some ideas!

  • 30-Jun-2020Print big

    A feature of T-SQL is that strings longer than 8000 bytes are truncated by PRINT. If you haven't already discovered this, you might wonder why it's a problem – the answer (for me at least) is dynamic SQL.

  • 25-Jun-2020From Azure Pipelines to GitHub Actions

    I use Azure Pipelines a lot, both for CI/CD (including automated testing for Azure Data Factory) and for building disposable Azure environments repeatably. Most of my code, however, lives in GitHub. GitHub provides its own service for automating CI/CD software workflows – GitHub Actions – and in this post I compare it to Azure Pipelines.

  • 17-Jun-2020Calculating Azure Data Factory test coverage

    This is the sixth and final article in a series on automated testing for Azure Data Factory pipelines. In software engineering, code coverage (or test coverage) measures the proportion of a program's source code executed by a given test suite. In this article I use execution history data to measure the proportion of a data factory's activities (across all pipelines) executed during a full test run – the test suite's activity coverage.

  • 09-Jun-2020The under-appreciated Start → Run

    This month's T-SQL Tuesday, hosted by Kenneth Fisher (b|t), asks for specifically non-SQL related tips and tricks. I found that much harder than any number of technical problems, which I guess is the point of T-SQL Tuesday...

  • 03-Jun-2020Unit testing Azure Data Factory pipelines

    In part three of this series I looked at functional tests for ADF pipelines: verifying, in isolation, that pipelines are “doing things right”. In this post I'll be testing isolated pipelines to check that they're “doing the right things” – this is one description of a unit test.

  • 21-May-2020Testing Azure Data Factory in your CI/CD pipeline

    In my previous post I used ADF pipeline parameters to implement dependency injection for ADF pipelines and build isolated functional tests using the NUnit testing framework. In this article, I integrate the NUnit testing solution into an Azure DevOps pipeline, so that I can run tests automatically whenever changes are made to ADF resources.

  • 12-May-2020A tale of two smartphone apps

    This month's T-SQL Tuesday is hosted by Glenn Berry (b|t). Glenn's invitation was “to write about what you have been doing as a response to COVID-19”, and it feels like my answer should be this: not much...

  • 06-May-2020Isolated functional tests for Azure Data Factory

    In this article, I'll look at isolating a pipeline from its external dependencies in order to test it independently and with a range of testing scenarios. I'll be trying to establish that a pipeline is “doing things right” in isolation – this is a functional test.

  • 26-Apr-2020Automate integration tests in Azure Data Factory

    In my previous post, I set up and ran one basic test of a single pipeline. In this article, I refactor my VS testing solution to make it easier to add new tests and to test new pipelines.

  • 20-Apr-2020Set up automated testing for Azure Data Factory

    Test automation allows you to run more tests, in less time, with guaranteed repeatability. If you change an existing ADF dataset definition for a new ADF pipeline, how do you know haven't broken something else? Automatically re-testing all your ADF pipelines before deployment gives you some protection against regression faults. Automated testing is a key component of CI/CD software development approaches: inclusion of automated tests in CI/CD deployment pipelines for Azure Data Factory can significantly improve quality.

  • 26-Jan-2020Errors in script tasks and components

    Unhandled errors thrown out of SSIS script tasks are usually accompanied by the message “Exception has been thrown by the target of an invocation” – this isn't very informative! A more manageable approach is to wrap as much of your code as possible inside a try/catch, then raise caught errors to SSIS for cleaner failure and easier diagnosis. This article shows you how to do that.

  • 02-Jan-2020Managing extended properties

    Extended properties are a means of attaching user-defined key-value pairs to database objects. This page provides some utility scripts to facilitate interaction with extended properties.