Partial Function Application to the rescue


I recently posted a question to the “Code Gurus” at The Code Project and was impressed with how quickly I received an answer (thank you, Richard Deeming). Even though I can always head back to TCP to see my question and the answer, I thought I would write it out here so I would have it ‘at my fingertips’, so to speak.

The gist of what I was trying to accomplish was this:

I have to write out a file for which each field has a set of specifications that must be met. I wanted to create a dictionary where the key is the field identifier and the value is a set of functions that could be called to validate that the data being written meets the criteria for each field. For example:

_rules = new FileRules()
{
    { "F01", new FieldRules() { AlphaNumeric, Mandatory, ... } },
    { "F02", new FieldRules() { PhoneNumber, ... } },
     ...
};

internal class FieldRules : List<Func<object, string>> { }

So far it is easy enough. I just have to define AlphaNumeric(), Mandatory(), and PhoneNumber() as functions that accept an object and return a string or throw an error. I can add as many validation rules as I wish to each field. Then, to validate the fields, I can use

object validatedData = fieldData;
try
{
    foreach (var rule in _rules[fieldId])
    {
        validatedData = rule(validatedData);
    }
}
catch (ValidationException ex) { ... }
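
To make the two snippets above concrete, here is a minimal, self-contained sketch. The bodies of Mandatory and AlphaNumeric, the ValidationException class, the FileRules base type, and the Validate helper are all assumptions made for illustration; the original only names them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical exception type; the original only shows it being caught.
public class ValidationException : Exception
{
    public ValidationException(string message) : base(message) { }
}

// A field's rules: an ordered list of functions from object to string.
public class FieldRules : List<Func<object, string>> { }

// Assumed shape of the rule table: field identifier -> rules for that field.
public class FileRules : Dictionary<string, FieldRules> { }

public static class RuleDemo
{
    // Assumed rule: the value must be present and non-blank.
    public static string Mandatory(object value)
    {
        var text = value?.ToString();
        if (string.IsNullOrWhiteSpace(text))
            throw new ValidationException("Value is mandatory.");
        return text;
    }

    // Assumed rule: only letters and digits are allowed.
    public static string AlphaNumeric(object value)
    {
        var text = value?.ToString() ?? string.Empty;
        if (!text.All(char.IsLetterOrDigit))
            throw new ValidationException("Value must be alphanumeric.");
        return text;
    }

    public static readonly FileRules Rules = new FileRules
    {
        { "F01", new FieldRules { Mandatory, AlphaNumeric } },
    };

    // Runs every rule for the field in order, feeding each result to the next.
    public static string Validate(string fieldId, object fieldData)
    {
        object validatedData = fieldData;
        foreach (var rule in Rules[fieldId])
        {
            validatedData = rule(validatedData);
        }
        return validatedData?.ToString();
    }
}
```

With these definitions, RuleDemo.Validate("F01", "abc123") returns the text unchanged, while a value containing a space throws a ValidationException from the AlphaNumeric rule.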

Now comes the fun part:

One of the validation functions I wanted to add was for maximum field length. But different fields have different limits. I needed a way to say that, for one field, the validation function checks that the field fits in 10 characters and, for another, the field fits in 30. And I need to do that in the first block of code, where I am listing the rules for each field.

But how do you create a reference to a function where some parameters are assigned when the reference is assigned to a variable and others are specified when the reference is invoked?

I thought it was impossible to do… but I figured, “what’s the harm in asking?” And lo and behold, there is a way to do it using partial function application.

Essentially, partial function application means writing a function that builds another function: calling functionA creates a functionB with certain values baked in and returns a reference to it. Any parameters passed to functionA are embedded in the definition of functionB. Different calls to functionA can pass different parameters, producing different copies of functionB, each of which uses its own set of embedded values. So functionB is only ‘partially applied’: some of its inputs are fixed by functionA, and the rest are supplied when functionB is finally called.

int B(int x, int y) { var result = dosomething(x, y); return result; } // the original function
Func<int, Func<int, int>> A => n => m => B(m, n); // partial function application: n is fixed first, m is supplied later

var s = A(2); // n=2
// s is a reference to a function that accepts m and 
// calls B with parameters (x:m, y:2)
...
var t = s(3); // m=3
// B has been called with (x:3, y:2)
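
As a runnable version of the same idea, here is a small sketch. B is a stand-in that subtracts (an assumption, chosen only so that the order of the arguments is visible in the result), and the lambda parameters are named after the parameters of B that they end up filling.

```csharp
using System;

public static class PartialDemo
{
    // Stand-in for the original function; subtraction makes argument order visible.
    public static int B(int x, int y) => x - y;

    // Takes y now and returns a function that still waits for x.
    public static Func<int, Func<int, int>> A => y => x => B(x, y);

    public static void Main()
    {
        Func<int, int> s = A(2);   // y is fixed at 2
        int t = s(3);              // x is supplied now, so this calls B(3, 2)
        Console.WriteLine(t);      // prints 1
    }
}
```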

So, how would I define that for the MaxLen validation function?

Func<int, Func<object, string>> MaxLen => max => value => CheckLength(value, max);
Func<object, int, string> CheckLength => (toCheck, maxlength) => {...};

The code can be a bit hard to follow, but I’ll try to break it down. First, we must recognise that the first line of code has some syntactic sugar in it: MaxLen is an expression-bodied property. If we expand it out, it is equivalent to this:

Func<int, Func<object, string>> MaxLen { get { return max => value => CheckLength(value, max); } }
  • Func<int, Func<object, string>> MaxLen { get { return ...; } }
    • MaxLen is a read-only property that returns a function. The function is defined as a set of nested lambda expressions.
      • To read a lambda expression, just remember that the parameters appear to the left of the => and the code to execute appears to the right.
  • a: max => (value => CheckLength(value, max))
    • The function returned by MaxLen is the top-level lambda expression. It accepts a single integer parameter, max, and returns the second-level lambda expression, which is itself a function.
  • b: value => CheckLength(value, max)
    • The second-level lambda expression is a function that accepts an object parameter, value, and returns the string produced by calling CheckLength() with value and max as parameters.
  • Func<object, int, string> CheckLength => (toCheck, maxlength) => {…};
    • CheckLength() is another lambda expression: this one accepts an object (toCheck) and an integer (maxlength) and performs some unspecified action on them in order to return a string.
      • Note that CheckLength() does not need to be a lambda expression: it could have been defined as a traditional method.

This allows me to write

    { "F03", new FieldRules() { AlphaNumeric, MaxLen(10), ... } },
    { "F04", new FieldRules() { AlphaNumeric, MaxLen(30), ... } },

In the first line, when I reference MaxLen(10), I am adding to the dictionary the result of a call to the lambda expression (a) above. That result is a reference to a function (b) that has been generated with max set to 10.

Later, when I call

    foreach (var rule in _rules[fieldId])
    {
        validatedData = rule(validatedData);
    }

I will be extracting the reference to the function into the variable rule. Invoking rule with the parameter validatedData will cause the lambda expression b to be evaluated (since this is what the reference points to). The result of that evaluation is a call to the function CheckLength with toCheck set to the contents of validatedData and maxlength set to max (10).

Similarly, the second addition to the dictionary performs the same operations, but with max set to 30.

Or, using the format from the other example I gave above regarding functions A and B:

var testFor10 = MaxLen(10); // max=10
// testFor10 is a reference to a function that accepts value and 
// calls CheckLength with parameters (toCheck:value, maxlength:10)

var testFor30 = MaxLen(30); // max=30
// testFor30 is a reference to a function that accepts value and
// calls CheckLength with parameters (toCheck:value, maxlength:30)
...
var firstTest = testFor10("abc"); // value="abc"
// CheckLength has been called with (toCheck:"abc", maxlength:10) 
var secondTest = testFor30("abc"); // value="abc"
// CheckLength has been called with (toCheck:"abc", maxlength:30)
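
For a version that can actually be run, here is a sketch with the elided body of CheckLength filled in. That body is an assumption (return the text unchanged if it fits, otherwise throw), as is the ValidationException class; the original leaves both unspecified.

```csharp
using System;

// Hypothetical exception type, assumed for illustration.
public class ValidationException : Exception
{
    public ValidationException(string message) : base(message) { }
}

public static class LengthDemo
{
    // Assumed behaviour: return the text unchanged if it fits, otherwise throw.
    public static Func<object, int, string> CheckLength => (toCheck, maxlength) =>
    {
        var text = toCheck?.ToString() ?? string.Empty;
        if (text.Length > maxlength)
            throw new ValidationException($"Value exceeds {maxlength} characters.");
        return text;
    };

    // max is fixed when MaxLen(...) is called; value arrives when the rule runs.
    public static Func<int, Func<object, string>> MaxLen =>
        max => value => CheckLength(value, max);

    public static void Main()
    {
        var testFor10 = MaxLen(10);
        var testFor30 = MaxLen(30);
        var twelveChars = "abcdefghijkl";

        Console.WriteLine(testFor30(twelveChars)); // fits within 30, returned as-is
        try
        {
            testFor10(twelveChars);                // 12 characters do not fit in 10
        }
        catch (ValidationException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

The same 12-character value passes one rule and fails the other, even though both rules share the single CheckLength implementation.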

Thus, we have a way to create references to functions wherein one (or more) of the functions’ parameters are assigned when the references are created (e.g. max=10 or max=30), and others are supplied when the functions are invoked (e.g. value=validatedData or value="abc").

Partial function applications can also be defined to accept multiple ‘pre-defined’ parameters, or multiple functions can reuse the same ‘backing function’ with slightly different parameters.
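
Both variations can be sketched briefly. The names CheckRange, LengthBetween and ExactLen are hypothetical, and the sketch uses ordinary methods that return lambdas rather than properties, which gives the same effect.

```csharp
using System;

public static class MultiParamDemo
{
    // Hypothetical backing function shared by the rules below.
    public static string CheckRange(object toCheck, int min, int max)
    {
        var text = toCheck?.ToString() ?? string.Empty;
        if (text.Length < min || text.Length > max)
            throw new ArgumentException($"Length must be between {min} and {max}.");
        return text;
    }

    // Two values are fixed up front; only the value to check arrives later.
    public static Func<object, string> LengthBetween(int min, int max) =>
        value => CheckRange(value, min, max);

    // A second rule reusing the same backing function with different fixed parameters.
    public static Func<object, string> ExactLen(int length) =>
        value => CheckRange(value, length, length);
}
```

Under these assumptions, LengthBetween(2, 5) and ExactLen(3) can both be dropped into a FieldRules list exactly as MaxLen(10) is.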

Of course, the code for CheckLength() could have been rolled right into the definition of MaxLen(), but I kept it separate for clarity.
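
For comparison, a rolled-in version might look like the sketch below. The length check itself is an assumed implementation, since the original does not show the body of CheckLength().

```csharp
using System;

public static class InlineDemo
{
    // The inner lambda carries the whole check, so no separate CheckLength is needed.
    public static Func<int, Func<object, string>> MaxLen => max => value =>
    {
        var text = value?.ToString() ?? string.Empty;
        if (text.Length > max)
            throw new ArgumentException($"Value exceeds {max} characters.");
        return text;
    };
}
```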

I hope to make greater use of this technique in the future.
