Blog

  • 11-Oct-2022 – Metadata-driven SQL code generation

    October 2022's T-SQL Tuesday is hosted by Steve Jones – thanks Steve 😊. Steve's invitation this month is to write about using dynamic SQL – that is, T-SQL statements built up from text strings and then executed as code.

  • 12-Jul-2022 – Argument {0} is null or empty. Parameter name: paraKey

    I encountered this error while using Azure Data Factory's Script activity -- with a bit of luck, writing it down will help me remember how to fix it next time!

  • 09-Mar-2022 – SQL database project (SSDT) merge conflicts

    More merge conflict fun! This time, using pre-commit hooks to duck conflicts in SSDT .sqlproj files.

  • 01-Feb-2022 – Merge conflicts in SSAS Tabular

    I sometimes find working with Visual Studio's projects a challenge in multi-developer environments, because each project type seems to have its own vulnerability to Git merge conflicts. This post looks at how to avoid them when working with SSAS tabular.

  • 18-Jan-2022 – Ordered STRING_SPLIT

    I'm a bit late to this party, but some long-awaited news is finally here – STRING_SPLIT now returns the ordinal position of string elements in various Azure SQL offerings.

  • 11-Jan-2022 – Breaking the rules

    January 2022's T-SQL Tuesday is hosted by Andy Yun -- thanks Andy! This month, Andy's asking about learning that changes your opinion.

  • 14-Sep-2021 – Infrastructure as nearly-all-code

    T-SQL Tuesday #142 (September 2021) is hosted by Frank Geisler. Frank's choice of subject is "using descriptive techniques to build database environments".

  • 23-Feb-2021 – More Get Metadata in ADF

    Last year I wrote a post about doing this in pure ADF, with really terrible performance. By way of apology, I've had another go in an Azure Function!

  • 10-Nov-2020 – Azure Data Factory, the ADF UX and Git

    If you're using Azure Data Factory (ADF), you're probably using Git and almost certainly using the ADF User Experience (ADF UX) – ADF's online integrated development environment (IDE). These three components are so closely interlinked that sometimes it's hard to think about them separately – in this article I try to do exactly that.

  • 28-Oct-2020 – Catch-22: Automating MSI access to an Azure SQL Database

    A problem I ran into recently is how to automate granting access to SQL databases in different environments. I'm using Terraform to build parallel data engineering environments which – amongst other things – include one or more SQL databases and instances of Azure Data Factory (ADF). ADF pipelines need access to databases, and I want to authenticate linked service connections using the factory's managed identity (MSI). Building this automatically is harder than it sounds!

  • 13-Oct-2020 – The Ice Cream Van of Abstraction

    October 2020's T-SQL Tuesday is hosted by Rob Volk (b|t) with the subject Data Analogies, or: Explain Databases Like I’m Five! Thanks for hosting, Rob! It turns out that I don't do much specifically databasplainingβ„’ by analogy, but there's one analogy I go back to again and again when talking to people unfamiliar with a useful concept, both inside and outside of SQL Server: abstraction.

  • 07-Oct-2020 – Google Analytics API pagination in Azure Data Factory

    In a previous post I created a pipeline to retrieve data from the Google Analytics reporting API, using an OAuth 2.0 access token for authorisation. Azure Data Factory's Copy data activity handles various styles of paged API response, but it doesn't support the approach taken by the Google Analytics reporting API. In this post I look at how to make that work.

  • 29-Sep-2020 – Get Metadata recursively in Azure Data Factory

    Azure Data Factory's Get Metadata activity returns metadata properties for a specified dataset, but not recursively. In this post I try to build an alternative using just ADF.

  • 10-Sep-2020 – Extract data from Google Analytics with Azure Data Factory

    In a previous post I prepared an ADF pipeline to make authorised Google Analytics API requests, using an Azure Function to obtain an OAuth 2.0 access token. In this article I'm going to use the returned token to authorise an API connection and extract data from Google Analytics.

  • 08-Sep-2020 – Automate to replicate

    In September 2020's T-SQL Tuesday, Elizabeth Noble (b|t) wants to know what members of the SQL community have automated to make their lives easier. Thanks for asking, Elizabeth πŸ˜€.

  • 28-Aug-2020 – Access Google Analytics with Azure Data Factory

    At the time of writing, Azure Data Factory has no connector to enable data extraction from Google Analytics, but it seems to be a common requirement – here's how to do it using ADF's current feature set.

  • 13-Aug-2020 – Parameterising the Execute Pipeline activity

    A shortcoming of Azure Data Factory's Execute Pipeline activity is that the pipeline to be triggered must be hard-coded into the activity – so it's impossible to use metadata-driven approaches like iterating over a list of pipeline names. This post looks at an alternative approach.

  • 11-Aug-2020 – Time capsule

    T-SQL Tuesday for August 2020 is hosted by Tamera Clark (b|t). She's asking for help in assembling a #SQLCommunity time capsule -- thanks for hosting, Tamera!

  • 24-Jul-2020 – Why automate ADF pipeline testing?

    Since writing my series on automated testing Azure Data Factory pipelines, I've had a few questions along the lines of "why bother?". One reader commented that "a typical ADF developer tests their pipeline doing debug runs". This is exactly how I develop pipelines: make a change, run the change, repeat until the pipeline does what I want. "Why bother with more testing?" is a good question!

  • 14-Jul-2020 – Default fault

    July 2020's T-SQL Tuesday is hosted by Kerry Tyler (b|t). Kerry describes pilots' use of plane crash reports to learn aviation safety lessons, and asks for similar – albeit hopefully lass catastrophic – tales of SQL Server-related disaster.

  • 09-Jul-2020 – Dropping temporary tables

    Local temporary tables – tables with names that begin with a single # character – are dropped automatically by SQL Server when they are no longer in scope. So why drop them explicitly at all? Here are some ideas!

  • 30-Jun-2020 – Print big

    A feature of T-SQL is that strings longer than 8000 bytes are truncated by PRINT. If you haven't already discovered this, you might wonder why it's a problem – the answer (for me at least) is dynamic SQL.

  • 25-Jun-2020 – From Azure Pipelines to GitHub Actions

    I use Azure Pipelines a lot, both for CI/CD (including automated testing for Azure Data Factory) and for building disposable Azure environments repeatably. Most of my code, however, lives in GitHub. GitHub provides its own service for automating CI/CD software workflows – GitHub Actions – and in this post I compare it to Azure Pipelines.

  • 17-Jun-2020 – Calculating Azure Data Factory test coverage

    This is the sixth and final article in a series on automated testing for Azure Data Factory pipelines. In software engineering, code coverage (or test coverage) measures the proportion of a program's source code executed by a given test suite. In this article I use execution history data to measure the proportion of a data factory's activities (across all pipelines) executed during a full test run – the test suite's activity coverage.

  • 09-Jun-2020 – The under-appreciated Start β†’ Run

    This month's T-SQL Tuesday, hosted by Kenneth Fisher (b|t), asks for specifically non-SQL related tips and tricks. I found that much harder than any number of technical problems, which I guess is the point of T-SQL Tuesday...

  • 03-Jun-2020 – Unit testing Azure Data Factory pipelines

    In part three of this series I looked at functional tests for ADF pipelines: verifying, in isolation, that pipelines are β€œdoing things right”. In this post I'll be testing isolated pipelines to check that they're β€œdoing the right things” – this is one description of a unit test.

  • 21-May-2020 – Testing Azure Data Factory in your CI/CD pipeline

    In my previous post I used ADF pipeline parameters to implement dependency injection for ADF pipelines and build isolated functional tests using the NUnit testing framework. In this article, I integrate the NUnit testing solution into an Azure DevOps pipeline, so that I can run tests automatically whenever changes are made to ADF resources.

  • 12-May-2020 – A tale of two smartphone apps

    This month's T-SQL Tuesday is hosted by Glenn Berry (b|t). Glenn's invitation was β€œto write about what you have been doing as a response to COVID-19”, and it feels like my answer should be this: not much...

  • 06-May-2020 – Isolated functional tests for Azure Data Factory

    In this article, I'll look at isolating a pipeline from its external dependencies in order to test it independently and with a range of testing scenarios. I'll be trying to establish that a pipeline is β€œdoing things right” in isolation – this is a functional test.

  • 26-Apr-2020 – Automate integration tests in Azure Data Factory

    In my previous post, I set up and ran one basic test of a single pipeline. In this article, I refactor my VS testing solution to make it easier to add new tests and to test new pipelines.

  • 20-Apr-2020 – Set up automated testing for Azure Data Factory

    Test automation allows you to run more tests, in less time, with guaranteed repeatability. If you change an existing ADF dataset definition for a new ADF pipeline, how do you know haven't broken something else? Automatically re-testing all your ADF pipelines before deployment gives you some protection against regression faults. Automated testing is a key component of CI/CD software development approaches: inclusion of automated tests in CI/CD deployment pipelines for Azure Data Factory can significantly improve quality.

  • 26-Jan-2020 – Errors in script tasks and components

    Unhandled errors thrown out of SSIS script tasks are usually accompanied by the message β€œException has been thrown by the target of an invocation” – this isn't very informative! A more manageable approach is to wrap as much of your code as possible inside a try/catch, then raise caught errors to SSIS for cleaner failure and easier diagnosis. This article shows you how to do that.

  • 02-Jan-2020 – Managing extended properties

    Extended properties are a means of attaching user-defined key-value pairs to database objects. This page provides some utility scripts to facilitate interaction with extended properties.