I recently posted a question to the “Code Gurus” at The Code Project, and I was impressed with how quickly I received an answer (thank you, Richard Deeming). Even though I can always head back to The Code Project to see my question and its answer, I thought I would write it out here so I would have it ‘at my fingertips’, so to speak.
The gist of what I was trying to accomplish was this:
I have to write out a file for which each field has a set of specifications that must be met. I wanted to create a dictionary where the key is the field identifier and the value is a set of functions that could be called to validate that the data being written meets the criteria for each field. For example:
_rules = new FileRules()
{
    { "F01", new FieldRules() { AlphaNumeric, Mandatory, ... } },
    { "F02", new FieldRules() { PhoneNumber, ... } },
    ...
};
internal class FieldRules : List<Func<object, string>> { }
So far, it is easy enough. I just have to define AlphaNumeric(), Mandatory(), and PhoneNumber() as functions that accept an object and return a string or throw an error. I can add as many validation rules as I wish to each field. Then, to validate the fields, I can use
object validatedData = fieldData;
try
{
    foreach (var rule in _rules[fieldId])
    {
        validatedData = rule(validatedData);
    }
}
catch (ValidationException ex) { ... }
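To make the loop above concrete, here is a minimal sketch of two rules and the validation pass. The rule bodies, the Validate helper, the plain Dictionary/List types standing in for FileRules/FieldRules, and the use of ArgumentException in place of ValidationException are all assumptions for illustration, not the original implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical rule: rejects null or empty input.
Func<object, string> Mandatory = value =>
{
    var text = value?.ToString();
    if (string.IsNullOrEmpty(text))
        throw new ArgumentException("Value is mandatory.");
    return text;
};

// Hypothetical rule: accepts only letters and digits.
Func<object, string> AlphaNumeric = value =>
{
    var text = value?.ToString() ?? "";
    foreach (var c in text)
    {
        if (!char.IsLetterOrDigit(c))
            throw new ArgumentException($"'{c}' is not alphanumeric.");
    }
    return text;
};

// Stand-in for FileRules/FieldRules: field identifier -> ordered list of rules.
var rules = new Dictionary<string, List<Func<object, string>>>
{
    { "F01", new List<Func<object, string>> { Mandatory, AlphaNumeric } },
};

// Runs every rule for a field, feeding each rule's output into the next one.
string Validate(string fieldId, object fieldData)
{
    object validated = fieldData;
    foreach (var rule in rules[fieldId])
    {
        validated = rule(validated);
    }
    return (string)validated;
}

Console.WriteLine(Validate("F01", "ABC123"));   // prints ABC123
```

Because every rule has the same shape (object in, string out), a rule can be added to or removed from a field without touching the loop.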
Now comes the fun part:
One of the validation functions I wanted to add was for maximum field length. But different fields have different limits. I needed a way to say that, for one field, the validation function checks that the field fits in 10 characters and, for another, the field fits in 30. And I need to do that in the first block of code, where I am listing the rules for each field.
But how do you create a reference to a function where some parameters are assigned when the reference is assigned to a variable and others are specified when the reference is invoked?
I thought it was impossible to do… but I figured, “what’s the harm in asking?” And lo and behold, there is a way to do it, using partial function application.
Essentially, partial function application means defining a function that defines a function: calling functionA creates a functionB with certain characteristics and returns a reference to it. This allows you to pass parameters to functionA that will be embedded in the definition of functionB. Different parameters can be passed to different calls to functionA, resulting in different copies of functionB, each of which uses its own set of embedded values. So functionB is only a ‘partial function’ that is not fully defined until it is given some parameters from functionA.
int B(int x, int y) { var result = dosomething(x, y); return result; }   // the original function
Func<int, Func<int, int>> A => n => m => B(m, n);   // partial function application
var s = A(2);   // n=2
// s is a reference to a function that accepts m and
// calls B with parameters (x:m, y:2)
...
var t = s(3);   // m=3
// B has been called with (x:3, y:2)
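A runnable version of the same idea may help show which argument gets fixed first. In this sketch B is a hypothetical stand-in (plain subtraction, so the argument order is visible in the result), and A is a local variable assigned with = rather than a property. A is written so that its first argument is the one that ends up as y.

```csharp
using System;

// Hypothetical stand-in for B: subtraction makes the argument order visible.
int B(int x, int y) => x - y;

// A fixes y first and returns a function that is still waiting for x.
Func<int, Func<int, int>> A = n => m => B(m, n);

var s = A(2);   // s behaves like m => B(m, 2)
var t = s(3);   // B(3, 2)

Console.WriteLine(t);          // prints 1
Console.WriteLine(A(10)(3));   // prints -7
```

Each call to A produces an independent function, so A(2) and A(10) can be kept side by side without interfering with each other.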
So, how would I define that for the MaxLen validation function?
Func<int, Func<object, string>> MaxLen => max => value => CheckLength(value, max);
Func<object, int, string> CheckLength => (toCheck, maxlength) => {...};
The code can be a bit hard to follow, so I’ll try to break it down. First, though, we must recognise that the first line of code has some syntactic sugar in it: the => that follows MaxLen declares a read-only, expression-bodied property. If we expand it out, it looks more like this:
Func<int, Func<object, string>> MaxLen { get { return max => value => CheckLength(value, max); } }
- Func<int, Func<object, string>> MaxLen { get { return max => value => CheckLength(value, max); } }
  - MaxLen is a read-only property that returns a function. It is defined as a set of nested lambda expressions.
  - To read a lambda expression, just remember that the parameters appear to the left of the => and the function to execute appears to the right.
  - a: max => value => CheckLength(value, max)
    The function returned by MaxLen is the top-level lambda expression. It accepts a single integer parameter, max, and returns the second-level lambda expression (a function, not the result of calling one).
  - b: value => CheckLength(value, max)
    The second-level lambda expression is a function that accepts an object parameter, value, and returns the string produced by calling CheckLength() with value and max as parameters.
- Func<object, int, string> CheckLength => (toCheck, maxlength) => {…}
  - CheckLength() is another lambda expression: this one accepts an object (toCheck) and an integer (maxlength) and performs some unspecified action on them in order to return a string.
  - Note that CheckLength() does not need to be a lambda expression: it could have been defined as a traditional method code block.
This allows me to write
{ "F03", new FieldRules() { AlphaNumeric, MaxLen(10), ... } },
{ "F04", new FieldRules() { AlphaNumeric, MaxLen(30), ... } },
In the first line, when I reference MaxLen(10), I am adding into the dictionary the result of a call to the lambda expression in a above. That result is a reference to a function (b) that has been generated with max set to 10.
Later, when I call
foreach (var rule in _rules[fieldId])
{
    validatedData = rule(validatedData);
}
I will be extracting the reference to the function into the variable rule. Invoking rule with the parameter validatedData will cause the lambda expression b to be evaluated (since this is what the reference points to). The result of that evaluation is a call to the function CheckLength with toCheck set to the contents of validatedData and maxlength set to max (10).
Similarly, the second addition to the dictionary performs the same operations, but with max set to 30.
Or, using the format from the other example I gave above regarding functions A and B:
var testFor10 = MaxLen(10);   // max=10
// testFor10 is a reference to a function that accepts value and
// calls CheckLength with parameters (toCheck:value, maxlength:10)
var testFor30 = MaxLen(30);   // max=30
// testFor30 is a reference to a function that accepts value and
// calls CheckLength with parameters (toCheck:value, maxlength:30)
...
var firstTest = testFor10("abc");    // value="abc"
// CheckLength has been called with (toCheck:"abc", maxlength:10)
var secondTest = testFor30("abc");   // value="abc"
// CheckLength has been called with (toCheck:"abc", maxlength:30)
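Putting the pieces together, the following sketch runs the whole flow end to end. The body of CheckLength is an assumption (the original leaves it unspecified), as are the Validate helper, the Dictionary/List stand-ins for FileRules/FieldRules, and ArgumentException in place of ValidationException. MaxLen and CheckLength are local variables here, so they are assigned with = instead of the property-style =>.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical body for CheckLength: throws when the text is too long.
Func<object, int, string> CheckLength = (toCheck, maxlength) =>
{
    var text = toCheck?.ToString() ?? "";
    if (text.Length > maxlength)
        throw new ArgumentException($"'{text}' is longer than {maxlength} characters.");
    return text;
};

// max is fixed when the rule is created; value arrives when the rule is invoked.
Func<int, Func<object, string>> MaxLen = max => value => CheckLength(value, max);

// Stand-in for FileRules/FieldRules.
var rules = new Dictionary<string, List<Func<object, string>>>
{
    { "F03", new List<Func<object, string>> { MaxLen(10) } },
    { "F04", new List<Func<object, string>> { MaxLen(30) } },
};

string Validate(string fieldId, object fieldData)
{
    object validated = fieldData;
    foreach (var rule in rules[fieldId])
    {
        validated = rule(validated);
    }
    return (string)validated;
}

var input = "abcdefghijklmno";               // 15 characters
Console.WriteLine(Validate("F04", input));   // fits within 30, prints the input

try
{
    Validate("F03", input);                  // exceeds 10, so the rule throws
}
catch (ArgumentException ex)
{
    Console.WriteLine(ex.Message);
}
```

The same input passes for F04 and fails for F03, even though both fields share a single CheckLength.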
Thus, we have a way to create references to functions wherein one (or more) of the functions’ parameters are assigned when the references are created (e.g. max=10 or max=30), and others are defined when the functions are invoked (e.g. value=validatedData or value="abc").
Partial function application also works with several ‘pre-defined’ parameters, and several functions can share the same ‘backing function’, each one fixing slightly different parameters.
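As a sketch of both variations, the rules below share one backing function. The names CheckRange, LengthBetween, MinLen and ExactLen are hypothetical and do not come from the original code.

```csharp
using System;

// Hypothetical backing function shared by every rule below.
string CheckRange(object toCheck, int min, int max)
{
    var text = toCheck?.ToString() ?? "";
    if (text.Length < min || text.Length > max)
        throw new ArgumentException($"Length {text.Length} is outside {min}..{max}.");
    return text;
}

// Two parameters fixed up front, one supplied at invocation time.
Func<int, int, Func<object, string>> LengthBetween =
    (min, max) => value => CheckRange(value, min, max);

// Different rules reusing the same backing function with different fixed values.
Func<int, Func<object, string>> MinLen = min => value => CheckRange(value, min, int.MaxValue);
Func<int, Func<object, string>> ExactLen = n => value => CheckRange(value, n, n);

Console.WriteLine(LengthBetween(2, 4)("abc"));   // prints abc
Console.WriteLine(ExactLen(3)("abc"));           // prints abc
```

All three rules still have the shape Func<object, string> once their fixed parameters are supplied, so they can sit in the same rule list as any other validation function.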
Of course, the code for CheckLength() could have been rolled right into the definition of MaxLen(), but I kept it separate for clarity.
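For comparison, a rolled-in version might look like the sketch below. The length check itself is an assumption, since the original does not show the body of CheckLength(), and ArgumentException stands in for whatever exception the real rules throw.

```csharp
using System;

// MaxLen with the (hypothetical) length check written inline
// instead of delegating to a separate CheckLength.
Func<int, Func<object, string>> MaxLen = max => value =>
{
    var text = value?.ToString() ?? "";
    if (text.Length > max)
        throw new ArgumentException($"'{text}' is longer than {max} characters.");
    return text;
};

Console.WriteLine(MaxLen(10)("abc"));   // prints abc
```

The inner lambda still captures max, so the behaviour is the same; the trade-off is that the check can no longer be reused by other rules.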
I hope to make greater use of this technique in the future.