Wednesday, 29 June 2016

Calling C# Methods from SQL Server

I was recently able to take advantage of a useful Microsoft SQL Server feature that enabled me to call a .NET library class method from a stored procedure.

Brief Context


I was working on a client project in which our team were required to write an ETL process. The ETL process was mostly written using SQL Server Integration Services (SSIS). In one part of the ETL process, we needed to do some fuzzy string similarity matching. We found that the Levenshtein string distance algorithm suited our requirements.

Aside: The Levenshtein algorithm takes two strings and computes the "edit distance" which indicates how many transformations are necessary to reach a given target string from a given source string. For example, the function Levenshtein("test", "tear") would return the value two.

In our case, we wanted to abstract the Levenshtein output into a percentage, so for example, the function GetTextSimilarity("test", "tear") would return 50% - indicating that the inputs are half similar.

Creating the Class Library


The full code for the class library is available on GitHub.

The first step is to create a .NET class library which targets a .NET framework version that the flavour of SQL Server you are working with supports. In my case, we created a C# class library targeting .NET framework 4.5 as we were going to use the class library in Microsoft SQL Server 2014.

Ensure that the method you want to call from SQL Server is marked static. You will need to decorate the method with the Microsoft.SqlServer.Server.SqlProcedure attribute. If you look at my class library code, you'll find two classes, namely, LevenshteinDistance and StringCompare. The LevenshteinDistance class contains the algorithm implementation in a static method marked with the SqlProcedure attribute. Similarly, the StringCompare method (shown below) contains one method, GetTextSimilarity, which is the method we want to call as a function from SQL server.

using System;
using Microsoft.SqlServer.Server;

namespace SqlLibrary.Text
{
    public class StringCompare
    {
        [SqlProcedure]
        public static double GetTextSimilarity(
            string inputOne, string inputTwo
        )
        {
            if (inputOne == null)
                throw new ArgumentNullException(nameof(inputOne));

            if (inputTwo == null)
                throw new ArgumentNullException(nameof(inputTwo));

            if (inputOne == string.Empty && inputTwo == string.Empty)
                return 1;

            int distance = LevenshteinDistance
                .GetLevenshteinDistance(inputOne, inputTwo);

            return 1.0 - distance / 
                (double)Math.Max(inputOne.Length, inputTwo.Length);
        }
    }
}

The GetTextSimilarity method calls the LevenshteinDistance.GetLevenshteinDistance method and returns the percentage similarity value that we required. Note that only the method you want to call as a function requires the SqlProcedure attribute. In this case, we could have only marked the GetTextSimilarity method with this attribute as this is the only method we want to call from SQL Server.

Registering the Class Library in SQL Server


Once you've finished writing the class library and you are ready to call your method(s) from SQL Server, you'll need to execute the following SQL script (obviously, you'll need to replace the database name, assembly file path, function name and method namespace).

USE MyDatabase
GO

CREATE ASSEMBLY SqlLibrary from 'C:\SqlLibrary.dll' WITH PERMISSION_SET = SAFE
GO

CREATE FUNCTION GetTextSimilarity(@inputOne [nvarchar](4000), @inputTwo [nvarchar](4000))
RETURNS [floatWITH EXECUTE AS CALLER, RETURNS NULL ON NULL INPUT
AS 
EXTERNAL NAME [SqlLibrary].[SqlLibrary.Text.StringCompare].[GetTextSimilarity]
GO

sp_configure 'clr enabled', 1
GO

RECONFIGURE
GO

Once the script has finished executing, you will find the registered assembly under the Database\Programmability\Assemblies node in SQL Server Management Studio.

If you look under Database\Programmability\Functions, you'll find the method you registered earlier (see SQL script).


You can now call your method as a standard T-SQL function call. Quite useful!

Saturday, 19 December 2015

C# CSV Library and HTTP Handlers

A colleague and I was recently looking at some old ASP.NET WebForms code which generated a potentially large CSV string and returned it in a response to a HTTP request. The original code was placed in a standard ASP.NET WebForms page in the OnLoad event handler. As the code was working with the HttpResponse object and writing the CSV directly to the HTTP output stream, we determined that this code was better placed in a HTTP Handler. The primary advantage of using a HTTP handler to generate the file is that we can bypass the unnecessary ASP.NET WebForms page life cycle.

The existing code was using a StringBuilder to generate the CSV string in the page OnLoad event handler. As you may imagine, this looked messy and was a pain to maintain. To abstract the CSV string creation logic, my colleague introduced me to a useful CSV library called LINQ to CSV. The project page link for this library is here and the library is available as a NuGet package.

The LINQ to CSV library enabled us to decorate some properties in a model class with attributes that specify how to output the property in CSV form. The example below (from the library project page) shows how to apply the attributes:

public class Product
{
    [CsvColumn(Name = "ProductName", FieldIndex = 1)]
    public string Name { get; set; }
    [CsvColumn(FieldIndex = 2, OutputFormat = "dd MMM HH:mm:ss")]
    public DateTime LaunchDate { get; set; }
    [CsvColumn(FieldIndex = 3, CanBeNull = false, OutputFormat = "C")]
    public decimal Price { get; set; }
    [CsvColumn(FieldIndex = 4)]
    public string Country { get; set; }
    [CsvColumn(FieldIndex = 5)]
    public string Description { get; set; }
}

Once your model class is decorated, you can then use other classes of the library to generate a CSV representation of a collection containing your model instances. This representation can be output to disk as demonstrated in the code example below:

IList<Product> products = new List<Product>();
products.Add(
    new Product
    {
        Name = "Samung Galaxy S6",
        LaunchDate = DateTime.Today,
        Price = 500,
        Country = "United Kingdom",
        Description = "This is a Samsung phone product"
    }
);

products.Add(
    new Product
    {
        Name = "Apple iPhone 5",
        LaunchDate = DateTime.Today,
        Price = 600,
        Country = "United States",
        Description = "This is an Apple phone product"
    }
);

CsvFileDescription inputFileDescription = new CsvFileDescription();
inputFileDescription.SeparatorChar = ',';
inputFileDescription.FirstLineHasColumnNames = true;

CsvContext csvContext = new CsvContext();
csvContext.Write(products, @"C:\Products.csv");

Usefully, the CsvContext class of the library also supports an overload to the Write method which accepts a TextWriter stream. Passing a TextWriter stream to this overload results in the library outputing the CSV representation to your stream rather than a file on disk. In our case, we used a StreamWriter that wrote to a MemoryStream. Once the CSV file content was in the MemoryStream instance, we sent the stream contents in the HTTP response from the HTTP handler. The full HTTP handler code is below (using the example product data).

public class ProductsExportHandler : IHttpHandler
{
    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        context.Response.AddHeader(
            "content-disposition"
            "attachment;filename=BulkExport.csv"
        );
        context.Response.Cache.SetCacheability(HttpCacheability.NoCache);
        context.Response.ContentType = "application/vnd.text";

        IList<Product> products = new List<Product>();
        // Add products...
            
        var inputFileDescription = new CsvFileDescription();
        inputFileDescription.SeparatorChar = ',';
        inputFileDescription.FirstLineHasColumnNames = true;
            
        using (var memoryStream = new MemoryStream())
        {
            using (var streamWriter = new StreamWriter(memoryStream))
            {
                var csvContext = new CsvContext();
                csvContext.Write(products, streamWriter);
                    
                streamWriter.Flush();
                    
                context.Response.BinaryWrite(memoryStream.GetBuffer());
            }
        }
    }
}

Sunday, 6 September 2015

Practice Test Driven Development with C# Exercises on Exercism.io

I was recently introduced to Exercism.io which is a place where you can download a number of C# exercises to work on. Exercism.io not only supports C# but also other programming languages. Each exercise comes with a pre-written suite of unit tests and it is your task to implement the required functionality in order to make the unit tests pass.

The exercises are a good way to get familiar with Test Driven Development (TDD) without having to write the tests. You can still semi-follow the prescribed red-green-refactor approach, where you initially have a test that fails, you then write an implementation to make the test pass and then you refactor your implementation until you're happy with it.

The Exercism.io website provides a useful command line interface tool which you use to fetch exercises and then also submit your solutions. Once a solution is submitted, it is made available on Exercism.io for others to "nitpick". This is a great feature where others who have done the same exercise can view your solution and suggest improvements. Although I've found the C# community on Exercism.io to be fairly quiet (with only a few people nitpicking), I can see this being a very useful feature once more people start to use the site.

If you've also heard of Project Euler (which I blogged about here), then yes, Exercism.io is similar in principle, but there are differences which you'll find quickly after trying both sites. I've tried a few exercises in the C# section of Exercism.io - they are fun and a good to practice your C#/.NET framework knowledge. Highly recommend this site to anyone who enjoys small programming exercises in their spare time and anyone who is interested in trying TDD. You can get started here.