Partial Function Application to the rescue


I recently posted a question to the “Code Gurus” at The Code Project. I was impressed with the speed at which I received an answer (thank you Richard Deeming). And even though I know I can always head back to TCP to see my question and answer, I thought I would write it out here so I would have it ‘at my fingertips’, so to speak.

The gist of what I was trying to accomplish was this:

I have to write out a file for which each field has a set of specifications that must be met. I wanted to create a dictionary where the key is the field identifier and the value is a set of functions that could be called to validate that the data being written meets the criteria for each field. For example:

_rules = new FileRules()
{
    { "FH01", new FieldRules() { AlphaNumeric, Mandatory, ... } },
    { "FH02", new FieldRules() { PhoneNumber, ... } },
     ...
};

internal class FieldRules : List<Func<object, string>> { }

So far it is easy enough. I just have to define AlphaNumeric(), Mandatory(), and PhoneNumber() as functions that accept an object and return a string or throw an error. I can add as many validation rules as I wish to each field. Then, to validate the fields, I can use

object validated = stuffToOutput;
try
{
    foreach (var rule in _rules[fieldId])
    {
        validated = rule(validated);
    }
}
catch (ValidationException ex) { ... }

Now comes the fun part:

One of the validation functions I wanted to add was for maximum field length. But different fields have different limits. I needed a way to say that, for one field, the validation function checks that the field fits in 10 characters and, for another, the field fits in 30. And I need to do that in the first block of code, where I am listing the rules for each field.

But how do you create a reference to a function where some parameters are assigned when the reference is assigned to a variable and others are specified when the reference is invoked?

I thought it was impossible to do… but I figured “what’s the harm in asking”?

And lo and behold, there is a way to do it using partial function application. Essentially, given a function X that takes inputs a and b and returns c, I would define a function Y that takes a parameter z and returns a new function; that new function takes w and calls X with a=w and b=z. Calling Y with z gives me a function with b already fixed, and I can later obtain c from X by calling that function with only 1 parameter, w.

int X(int a, int b) { int c = dosomething(a, b); return c; } // the original function
Func<int, Func<int, int>> Y => z => w => X(w, z); // partial function application

var m = Y(2); // z=2
// m is now a function that accepts w and calls X with a=w and b=2
...
var n = m(3); // w=3
// X has been called with a=3, b=2
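
For anyone who wants to run the X and Y example, here is a minimal sketch. The body of X is my own stand-in (the original only says it does "something" with a and b); I picked subtraction so that the argument order is visible. The class name PartialDemo is also just an assumption for the demo.

```csharp
using System;

public static class PartialDemo
{
    // Stand-in for the original two-parameter function (assumed body).
    public static int X(int a, int b) => a - b;

    // Y fixes b (passed as z) now and returns a function that still needs a (passed as w).
    public static Func<int, Func<int, int>> Y => z => w => X(w, z);

    public static void Main()
    {
        Func<int, int> m = Y(2);    // z=2, so b is fixed at 2
        Console.WriteLine(m(3));    // calls X(3, 2), which is 3 - 2
        Console.WriteLine(m(10));   // the same m can be reused: X(10, 2)
    }
}
```

The point to notice is that m can be stored and invoked any number of times, and each call only needs the one remaining argument.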

So, how would I define that for the MaxLen validation function?

Func<int, Func<object, string>> MaxLen => max => value => CheckLength(value, max);
Func<object, int, string> CheckLength => (toCheck, maxlength) => {...};

The code can be a bit hard to follow, but I’ll try to break it down. However, we must first recognise that the first line of code has some syntactic sugar in it: the => directly after MaxLen declares an expression-bodied, read-only property. If we expand it out, it would look more like this:

Func<int, Func<object, string>> MaxLen { get { return max => value => CheckLength(value, max); } }
  • Func<int, Func<…>> MaxLen { get { return …; } }
    • MaxLen is a read-only property that returns a function. It is defined as a set of nested lambda expressions.
      • To read a lambda expression, just remember that parameters appear to the left of the => and the function to execute appears to the right.
  • a: max => value => …
    • The function returned by MaxLen is the top-level lambda expression. It accepts a single integer parameter, max, and returns the second-level lambda expression starting at value => (the function itself, not the result of calling it).
  • b: value => CheckLength(value, max)
    • The second-level lambda expression is a function that accepts an object parameter, value, and returns the string produced by calling CheckLength() with value and max as arguments.
  • Func<object, int, string> CheckLength => (toCheck, maxlength) => {…}
    • CheckLength() is another lambda expression: this one accepts an object (toCheck) and an integer (maxlength) and performs some unspecified action on them in order to return a string.
      • Note that CheckLength() does not need to be a lambda expression: it could have been defined as a traditional method.

This allows me to write

    { "FH03", new FieldRules() { AlphaNumeric, MaxLen(10), ... } },
    { "FH04", new FieldRules() { AlphaNumeric, MaxLen(30), ... } },

In the first line, when I reference MaxLen(10), I am adding into the dictionary the result of a call to the lambda expression in a above. That result is a reference to a function (b) that has been generated with max set to 10.

Later, when I call

    foreach (var rule in _rules[fieldId])
    {
        validated = rule(validated);
    }

I will be extracting the reference to the function into the variable rule. Invoking rule with the parameter validated will cause the lambda expression b to be evaluated (since this is what the reference points to). The result of that evaluation is a call to the function CheckLength with toCheck set to the contents of validated and maxlength set to max (10).

Similarly, the second addition to the dictionary performs the same operations, but with max set to 30.

Or, using the format from the other example I gave above regarding functions X and Y:

var testFor10 = MaxLen(10); // max=10
// testFor10 is now a function that accepts value and calls CheckLength with maxlength=10
...
var tested = testFor10("abc"); // value="abc"
// CheckLength has been called with toCheck="abc" and maxlength=10 

Thus, we have a way to create references to functions wherein one (or more) of the functions’ parameters are assigned when the references are created (e.g. max=10 or max=30), and others are defined when the functions are invoked (e.g. value=validated or value="abc").
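
To tie the pieces together, here is a small end-to-end sketch that should compile on its own. Several things in it are my assumptions rather than part of the real project: the body of CheckLength (the original leaves it unspecified), the Mandatory rule, the local ValidationException class, the Validate helper, and the use of a plain Dictionary in place of FileRules.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical exception type; the post only shows that one is caught.
public class ValidationException : Exception
{
    public ValidationException(string message) : base(message) { }
}

public class FieldRules : List<Func<object, string>> { }

public static class Validators
{
    // Assumed behaviour: return the text unchanged if it fits, throw otherwise.
    public static string CheckLength(object toCheck, int maxlength)
    {
        string text = toCheck?.ToString() ?? string.Empty;
        if (text.Length > maxlength)
            throw new ValidationException($"'{text}' is longer than {maxlength} characters.");
        return text;
    }

    // Partial application: max is fixed here, value is supplied when the rule runs.
    public static Func<int, Func<object, string>> MaxLen =>
        max => value => CheckLength(value, max);

    // Assumed rule: rejects null or empty input.
    public static string Mandatory(object value)
    {
        string text = value?.ToString();
        if (string.IsNullOrEmpty(text))
            throw new ValidationException("A value is required.");
        return text;
    }
}

public static class Program
{
    // Runs every rule registered for the field, feeding each result into the next rule.
    public static string Validate(Dictionary<string, FieldRules> rules, string fieldId, object input)
    {
        object validated = input;
        foreach (var rule in rules[fieldId])
        {
            validated = rule(validated);
        }
        return validated as string;
    }

    public static void Main()
    {
        var rules = new Dictionary<string, FieldRules>
        {
            { "FH03", new FieldRules { Validators.Mandatory, Validators.MaxLen(10) } },
            { "FH04", new FieldRules { Validators.Mandatory, Validators.MaxLen(30) } },
        };

        Console.WriteLine(Validate(rules, "FH03", "abc"));  // passes both rules

        try
        {
            Validate(rules, "FH03", "this is far too long");  // 20 characters, limit is 10
        }
        catch (ValidationException ex)
        {
            Console.WriteLine(ex.Message);
        }

        Console.WriteLine(Validate(rules, "FH04", "this is far too long"));  // fits within 30
    }
}
```

The same 20-character string fails for FH03 and passes for FH04, even though both fields share one CheckLength implementation.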

Partial function applications can also be defined to accept multiple ‘pre-defined’ parameters, or multiple functions can reuse the same ‘backing function’ with slightly different parameters.
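
As a rough illustration of both ideas, the sketch below fixes two parameters up front and then builds other rules on the same backing function. All of the names here (CheckLengthRange, LengthBetween, ExactLen) are hypothetical, and I have written the rule factories as ordinary methods that return a Func rather than as properties, which is just an alternative spelling of the same technique.

```csharp
using System;

public static class MoreRules
{
    // Hypothetical backing function shared by the rules below.
    public static string CheckLengthRange(object toCheck, int min, int max)
    {
        string text = toCheck?.ToString() ?? string.Empty;
        if (text.Length < min || text.Length > max)
            throw new ArgumentException($"Length {text.Length} is outside {min}..{max}.");
        return text;
    }

    // Two parameters (min and max) are fixed when the rule is created.
    public static Func<object, string> LengthBetween(int min, int max) =>
        value => CheckLengthRange(value, min, max);

    // Other rules reuse the same backing function with different fixed parameters.
    public static Func<object, string> ExactLen(int length) => LengthBetween(length, length);
    public static Func<object, string> MaxLen(int max) => LengthBetween(0, max);

    public static void Main()
    {
        Console.WriteLine(LengthBetween(2, 5)("abc"));  // 3 characters, within 2..5
        Console.WriteLine(ExactLen(3)("abc"));
        Console.WriteLine(MaxLen(10)("abc"));

        try
        {
            ExactLen(4)("abc");  // 3 characters, but exactly 4 are required
        }
        catch (ArgumentException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```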

Of course, the code for CheckLength() could have been rolled right into the definition of MaxLen(), but I kept it separate for clarity.

I hope to make greater use of this technique in the future.
