Kestrels, Quirky Birds,
and Hopeless Egocentricity

“Working through @raganwald's combinator book. I WROTE A THRUSH! Love the book.”—Corey Haines
“I bought it a month ago and it is brilliant! It let me connect my functional PL class to Ruby.”—Zach Abbott
“O_O... it seems there's no limit in the joy of coding and thinking in #ruby”—Edoardo Rossi

The road to this web site began when I first applied ideas from Combinatory Logic to my daily programming. Combinatory Logic is a fascinating subject in its own right, and stands beside the Lambda Calculus as one of the foundations of Computer Science.

In addition to being a powerful theoretical basis for reasoning about computers, it's chock full of ideas that are surprisingly practical for day-to-day programming.

my book

I've blogged about some of the applications of Combinatory Logic to my Ruby programming, and many of them were featured on /r/programming and/or Hacker News. More than a few people suggested I write a book about combinators in Ruby, so I adapted the blog posts into a convenient and portable ebook.

Read the entire text online, below. It's covered by The MIT License, so feel free to deep link, copy, and share. At the end, you'll find some additional resources to read and enjoy.

NOTE: The text below was generated automatically from the book copy and there are some formatting errors in the source code. The PDF, Kindle, and iBook versions are properly formatted.

Kestrels, Quirky Birds, and Hopeless Egocentricity

Introduction

Like the Lambda Calculus, Combinatory Logic is a mathematical notation that is powerful enough to handle set theory and issues in computability.

Combinatory logic is a notation introduced by Moses Schönfinkel and Haskell Curry to eliminate the need for variables in mathematical logic. It has more recently been used in computer science as a theoretical model of computation and also as a basis for the design of functional programming languages. It is based on combinators. A combinator is a higher-order function that uses only function application and earlier defined combinators to define a result from its arguments.

In this book, we’re going to meet some of the standard combinators, and for each one we’ll explore some of its ramifications when writing programs using the Ruby programming language. In Combinatory logic, combinators combine and alter each other, and our Ruby examples will focus on combining and altering Ruby code. From simple examples like the K Combinator and Ruby’s .tap method, we’ll work our way up to meta-programming with aspects and recursive combinators.

about the bird names

When Combinatory Logic was first invented by Haskell Curry, the standard combinators were given upper-case letters. For example, the two combinators needed to express everything in the Lambda Calculus and in Set Theory are the S and K combinators. In 1985, Raymond Smullyan published To Mock a Mockingbird, an exploration of combinatory logic for the recreational layman. Smullyan used a forest full of songbirds as a metaphor, with each of the combinators given the name of a songbird rather than a single letter. For example, the S and K combinators became the Starling and Kestrel, the I combinator became the Idiot bird, and so forth.

These ornithological nicknames have become part of the standard lexicon for combinatory logic.

thanks

There are too many people to name, but amongst the crowd, Alan Smith stands out.

Kestrels

In Combinatory Logic, a Kestrel (or “K Combinator”) is a function that returns a constant function, normally written Kxy = x. In Ruby, it might look like this:

# for *any* x,
kestrel.call(:foo).call(x)
  => :foo
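A kestrel itself takes only a line to define. Here is a sketch as nested lambdas; the binding of the name kestrel is mine, for illustration:

```ruby
# K combinator: take x, return a function that ignores its argument and returns x
kestrel = lambda { |x| lambda { |y| x } }

kestrel.call(:foo).call(:bar)  # => :foo
kestrel.call(:foo).call(42)    # => :foo
```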

Kestrels are to be found in Ruby. You may be familiar with their Ruby 1.9 name, #tap. Let’s say you have a line like address = Person.find(...).address and you wish to log the person instance. With tap, you can inject some logging into the expression without messy temporary variables:

address = Person.find(...).tap { |p| logger.log "person #{p} found" }.address

tap is a method in all objects that passes self to a block and returns self, ignoring whatever the last item of the block happens to be. Ruby on Rails programmers will recognize the Kestrel in slightly different form:

address = returning Person.find(...) do |p|
  logger.log "person #{p} found"
end.address

Again, the result of the block is discarded, it is only there for side effects. This behaviour is the same as a Kestrel. Remember kestrel.call(:foo).call(x)? If I rewrite it like this, you can see the similarity:

Kestrel.call(:foo) do
  x
end
  => :foo
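If you are not using Rails, returning itself is tiny to sketch. This minimal stand-in (mine, not the Rails implementation) is enough to follow the examples below:

```ruby
# A minimal returning: yield the value to the block, discard the block's
# result, and return the value
def returning(value)
  yield value
  value
end

returning([]) { |a| a << 1 }       # => [1]
returning(:foo) { |x| "ignored" }  # => :foo
```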

Both returning and tap are handy for grouping side effects together. Methods that look like this:

def registered_person(params = {})
  person = Person.new(params.merge(:registered => true))
  Registry.register(person)
  person.send_email_notification
  person
end

Can be rewritten using returning:

def registered_person(params = {})
  returning Person.new(params.merge(:registered => true)) do |person|
    Registry.register(person)
    person.send_email_notification
  end
end

It is obvious from the first line what will be returned and it eliminates an annoying error when the programmer neglects to make person the last line of the method.

Object initializer blocks

The Kestrel has also been sighted in the form of object initializer blocks. Consider this example using Struct:

Contact = Struct.new(:first, :last, :email) do
  def to_hash
    Hash[*members.zip(values).flatten]
  end
end

The method Struct#new creates a new class. It also accepts an optional block, evaluating the block for side effects only. It returns the new class regardless of what happens to be in the block (it happens to evaluate the block in class scope, a small refinement).
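Both behaviours are easy to verify: the block runs in the scope of the new class (so def defines an ordinary instance method), and Struct.new still returns the class. This Point example is mine, for illustration:

```ruby
Point = Struct.new(:x, :y) do
  # evaluated in class scope, so this is an ordinary instance method
  def magnitude
    Math.sqrt(x ** 2 + y ** 2)
  end
end

Point.new(3, 4).magnitude  # => 5.0
```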

You can use this technique when writing your own classes:

class Bird < Creature
  def initialize(*params)
    # do something with the params
    yield self if block_given?
  end
end

Forest.add(
  Bird.new(:name => 'Kestrel') { |k| combinators << k }
)

The pattern of wanting a Kestrel/returning/tap when you create a new object is so common that building it into object initialization is useful. And in fact, it’s built into ActiveRecord. Methods like new and create take optional blocks, so you can write:

class Person < ActiveRecord::Base
  # ...
end

def registered_person(params = {})
  Person.new(params.merge(:registered => true)) do |person|
    Registry.register(person)
    person.send_email_notification
  end
end

In Rails, returning is not necessary when creating instances of your model classes, thanks to ActiveRecord’s built-in object initializer blocks.

Inside, an idiomatic Ruby Kestrel

When we discussed Struct above, we noted that its initializer block has a slightly different behaviour than tap or returning. It takes an initializer block, but it doesn’t pass the new class to the block as a parameter, it evaluates the block in the context of the new class.

Putting this into implementation terms, it evaluates the block with self set to the new class. This is not the same as returning or tap, both of which leave self untouched. We can write our own version of returning with the same semantics. We will call it inside:

module Kernel

  def inside(value, &block)
    value.instance_eval(&block)
    value
  end

end

You can use this variation on a Kestrel just like returning, only you do not need to specify a parameter:

inside [1, 2, 3] do
  uniq!
end
  => [1, 2, 3]

This isn’t particularly noteworthy. Of more interest is your access to private methods and instance variables:

sna = Struct.new('Fubar') do
  attr_reader :fu
end.new

inside(sna) do
  @fu = 'bar'
end
  => #<struct Struct::Fubar>

sna.fu
  => 'bar'

inside is a Kestrel just like returning. No matter what value its block generates, it returns its primary argument. The only difference between the two is the evaluation environment of the block.

The Enchaining Kestrel

In Kestrels, we looked at #tap from Ruby 1.9 and returning from Ruby on Rails. Now we're going to look at another use for tap. As already explained, Ruby 1.9 includes the new method Object#tap. It passes the receiver to a block, then returns the receiver no matter what the block contains. The canonical example inserts some logging in the middle of a chain of method invocations:

address = Person.find(...).tap { |p| logger.log "person #{p} found" }.address

Object#tap is also useful when you want to execute several methods on the same object without having to create a lot of temporary variables, a practice Martin Fowler calls Method Chaining (http://martinfowler.com/dslwip/MethodChaining.html). Typically, you design such an object so that it returns itself in response to every modifier message. This allows you to write things like:

HardDrive.new.capacity(150).external.speed(7200)

Instead of:

hd = HardDrive.new
hd.capacity = 150
hd.external = true
hd.speed = 7200
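A class designed for this style simply returns self from every modifier. A hypothetical HardDrive sketched along those lines:

```ruby
class HardDrive
  def initialize
    @external = false
  end

  def capacity(gigabytes)
    @capacity = gigabytes
    self  # returning self is what makes the chaining work
  end

  def external
    @external = true
    self
  end

  def speed(rpm)
    @speed = rpm
    self
  end
end

hd = HardDrive.new.capacity(150).external.speed(7200)
```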

And if you are a real fan of the Kestrel, you would design your class with an object initializer block so you could write:

hd = HardDrive.new do
  @capacity = 150
  @external = true
  @speed = 7200
end

But what do you do when handed a class that was not designed with method chaining in mind? For example, Array#pop returns the object being popped, not the array. Before you validate every criticism levelled against Ruby for allowing programmers to rewrite methods in core classes, consider using #tap with Symbol#to_proc or String#to_proc to chain methods without rewriting them.

So instead of

def fizz(arr)
  arr.pop
  arr.map! { |n| n * 2 }
end

We can write:

def fizz(arr)
  arr.tap(&:pop).map! { |n| n * 2 }
end

I often use #tap to enchain methods for those pesky array methods that sometimes do what you expect and sometimes don’t. My most hated example is Array#uniq!:

arr = [1,2,3,3,4,5]
arr.uniq, arr
  => [1,2,3,4,5], [1,2,3,3,4,5]
arr = [1,2,3,3,4,5]
arr.uniq!, arr
  => [1,2,3,4,5], [1,2,3,4,5]
arr = [1,2,3,4,5]
arr.uniq, arr
  => [1,2,3,4,5], [1,2,3,4,5]
arr = [1,2,3,4,5]
arr.uniq!, arr
  => nil, [1,2,3,4,5]

Let’s replay that last one in slow motion:

[  1,  2,  3,  4,  5  ].uniq!
  => nil

That might be a problem. For example:

[1,2,3,4,5].uniq!.sort!
  => NoMethodError: undefined method `sort!' for nil:NilClass

Object#tap to the rescue: When using a method like #uniq! that modifies the array in place and sometimes returns the modified array but sometimes helpfully returns nil, I can use #tap to make sure I always get the array, which allows me to enchain methods:

[1,2,3,4,5].tap(&:uniq!).sort!
  => [1,2,3,4,5]

So there’s another use for #tap (along with Symbol#to_proc for simple cases): We can use it when we want to enchain methods, but the methods do not return the receiver.

In Ruby 1.9, #tap works exactly as described above. Ruby 1.8 does not have #tap, but you can obtain it by installing the andand gem. This version of #tap also works like a Quirky Bird, so you can write things like HardDrive.new.tap.capacity(150) for enchaining methods that take parameters and/or blocks. To get andand, sudo gem install andand. Rails users can also drop andand.rb in config/initializers.

The Obdurate Kestrel

The andand gem includes Object#tap for Ruby 1.8. It also includes another kestrel called #dont, which does what it says, or rather doesn't do what it says.

:foo.tap { p 'bar' }
"bar"
  => :foo # printed "bar" before returning a value!

:foo.dont { p 'bar' }
  => :foo # without printing 'bar'!

Object#dont simply ignores the block passed to it. So what is it good for? Well, remember our logging example for #tap?

address = Person.find(...).tap { |p| logger.log "person #{p} found" }.address

Let’s turn the logging off for a moment:

address = Person.find(...).dont { |p| logger.log "person #{p} found" }.address

And back on:

address = Person.find(...).tap { |p| logger.log "person #{p} found" }.address

I typically use it when doing certain kinds of primitive debugging. And it has another trick up its sleeve:

arr.dont.sort!

Look at that, it works with method calls like a quirky bird! So you can use it to NOOP methods. Now, you could have done that with Symbol#to_proc:

arr.dont(&:sort!)

But what about methods that take parameters and blocks?

JoinBetweenTwoModels.dont.create!(...) do |new_join|
  # ...
end

Object#dont is the Ruby-semantic equivalent of commenting out a method call, only it can be inserted inside of an existing expression. That’s why it’s called the obdurate kestrel. It refuses to do anything!
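The block-ignoring half of #dont can be sketched in a couple of lines. This is a minimal stand-in of mine, not the andand implementation, and it does not support the parameterless arr.dont.sort! form:

```ruby
class Object
  # ignore any block entirely and return the receiver unchanged
  def dont
    self
  end
end

:foo.dont { raise "this never runs" }  # => :foo
```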

If you want to try Object#dont, or want to use Object#tap with Ruby 1.8, sudo gem install andand. Rails users can also drop andand.rb in config/initializers as mentioned above. Enjoy!

Kestrels on Rails

As mentioned, Ruby on Rails provides #returning, a method with K Combinator semantics:

returning(expression) do |name|
  # name is bound to the result of evaluating expression
  # this block is evaluated and the result is discarded
end
  => # the result of evaluating the expression is now returned

Rails also provides object initializer blocks for ActiveRecord models. Here’s an example from one of my unit tests:

@board = Board.create(:dimension => 9) do |b|
  b['aa'] = 'black'
  b['bb'] = 'black'
  b['cb'] = 'black'
  b['da'] = 'black'
  b['ba'] = 'white'
  b['ca'] = 'white'
end

So, it looks like in Rails you can choose between an object initializer block and #returning:

@board = returning(Board.create(:dimension => 9)) do |b|
  b['aa'] = 'black'
  b['bb'] = 'black'
  b['cb'] = 'black'
  b['da'] = 'black'
  b['ba'] = 'white'
  b['ca'] = 'white'
end

In both cases the created object is returned regardless of what the block would otherwise return. But beyond that, the two Kestrels have very different semantics. “Returning” fully evaluates the expression, in this case creating the model instance in its entirety, including all of its callbacks. The object initializer block, on the other hand, is called as part of initializing the object, before the object’s lifecycle, including its callbacks, begins.

“Returning” is what you want when you want to do stuff involving the fully created object and you are trying to logically group the other statements with the creation. In my case, that’s what I want, I am trying to say that @board is a board with black stones on certain intersections and white stones on other intersections.

Object initialization is what you want when you want to initialize certain fields by hand and perform some calculations or logic before kicking off the object creation lifecycle. That wasn’t what I wanted in this case because my []= method depended on the object being initialized. So my code had a bug that was fixed when I changed from object initializers to #returning.

Summary: In Rails, object initializers are evaluated before the object’s life cycle is started, #returning’s block is evaluated afterwards. And that is today’s lingua obscura.

Rewriting “Returning” in Rails

One of the most useful tools provided by Ruby on Rails is the #returning method, a simple but very useful implementation of the K Combinator or Kestrel. For example, this:

def registered_person(params = {})
  person = Person.new(params.merge(:registered => true))
  Registry.register(person)
  person.send_email_notification
  person
end

Can and should be expressed using #returning as this:

def registered_person(params = {})
  returning Person.new(params.merge(:registered => true)) do |person|
    Registry.register(person)
    person.send_email_notification
  end
end

Why? Firstly, you avoid the common bug of forgetting to return the object you are creating:

def broken_registered_person(params = {})
  person = Person.new(params.merge(:registered => true))
  Registry.register(person)
  person.send_email_notification
end

This creates the person object and does the initialization you want, but doesn’t actually return it from the method, it returns whatever #send_email_notification happens to return. If you’ve worked hard to create fluent interfaces you might be correct by accident, but #send_email_notification could just as easily return the email it creates. Who knows?

Second, in methods like this, as you read from top to bottom, you are declaring what the method returns right up front:

def registered_person(params = {})
  returning Person.new(params.merge(:registered => true)) do # ...
    # ...
  end
end

It takes some optional params and returns a new person. Very clear. And the third reason I like #returning is that it logically clusters the related statements together:

returning Person.new(params.merge(:registered => true)) do |person|
  Registry.register(person)
  person.send_email_notification
end

It is very clear that these statements are all part of one logical block. As a bonus, my IDE respects that and it’s easy to fold them or drag them around as a single unit. All in all, I think #returning is a big win and I even look for opportunities to refactor existing code to use it whenever I’m making changes.

DWIM

All that being said, I have observed a certain bug or misapplication of #returning from time to time. It’s usually pretty subtle in production code, but I’ll make it obvious with a trivial example. What does this snippet evaluate to?

returning [1] do |numbers|
  numbers << 2
  numbers += [3]
end

This is the kind of thing that sadistic interviewers use in coding quizzes. The answer is [1, 2], not [1, 2, 3]. The << operator mutates the value assigned to the numbers variable, but the += statement overwrites the reference assigned to the numbers variable without changing the original value. #returning remembers the value originally assigned to numbers and returns it. If you have some side-effects on that value, those count. But assignment does nothing to the value.

This may seem obvious, but in my experience it is a subtle point that causes difficulty. Languages with referential transparency escape the confusion entirely, but OO languages like Ruby have this weird thing where we have to keep track of references and labels on references in our head.
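You can see the same distinction without #returning at all: mutation versus rebinding shows up with two plain locals referencing one array.

```ruby
original = [1]
numbers  = original
numbers << 2        # mutates the array that original also references
numbers += [3]      # rebinds numbers to a brand new array; original is untouched

original  # => [1, 2]
numbers   # => [1, 2, 3]
```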

Here’s something contrived to look a lot more like production code. First, without #returning:

def working_registered_person(params = {})
  person = Person.new(params.merge(:registered => true))
  if Registry.register(person)
    person.send_email_notification
  else
    person = Person.new(:default => true)
  end
  person
end

And here we’ve refactored it to use #returning:

def broken_registered_person(params = {})
  returning Person.new(params.merge(:registered => true)) do |person|
    if Registry.register(person)
      person.send_email_notification
    else
      person = Person.new(:default => true)
    end
  end
end

Oops! This no longer works as we intended. Overwriting the person variable is irrelevant, #returning returns the unregistered new person no matter what. So what’s going on here?

One answer is to “blame the victim.” Ruby has a certain well-documented behaviour around variables and references. #returning has a certain well-documented behaviour. Any programmer who makes the above mistake is, well, mistaken. Fix the code and set the bug ticket status to Problem Between Keyboard And Chair (“PBKAC”).

Another answer is to suggest that the implementation of #returning is at fault. If you write:

returning ... do |var|
  # ...
  var = something_else
  # ...
end

You intended to change what you are returning from #returning. So #returning should be changed to do what you meant. I’m on the fence about this. When folks argue that designs should cater to programmers who do not understand the ramifications of the programming language or of the framework, I usually retort that you cannot have progress and innovation while clinging to familiarity, an argument I first heard from Jef Raskin. The real meaning of “The Principle of Least Surprise” is that a design should be internally consistent, which is not the same thing as familiar.

Ruby’s existing use of variables and references is certainly consistent. And once you know what #returning does, it remains consistent. However, this design decision isn’t really about being consistent with Ruby’s implementation, we are debating how an idiom should be designed. I think we have a blank canvas and it’s reasonable to at least consider a version of #returning that handles assignment to the parameter.

Rewriting #returning

The RewriteRails plug-in adds syntactic abstractions like Andand to Rails projects without monkey-patching. RewriteRails now includes its own version of #returning that overrides the #returning that ships with Rails.

When RewriteRails is processing source code, it turns code like this:

def registered_person(params = {})
  returning Person.new(params.merge(:registered => true)) do |person|
    if Registry.register(person)
      person.send_email_notification
    else
      person = Person.new(:default => true)
    end
  end
end

Into this:

def registered_person(params = {})
  lambda do |person|
    if Registry.register(person)
      person.send_email_notification
    else
      person = Person.new(:default => true)
    end
    person
  end.call(Person.new(params.merge(:registered => true)))
end

Note that in addition to turning the #returning “call” into a lambda that is invoked immediately, it also makes sure the new lambda returns the person variable’s contents. So assignment to the variable does change what #returning appears to return.

Like all processors in RewriteRails, #returning is only rewritten in .rr files that you write in your project. Existing .rb files are not affected, including all code in the Rails framework: RewriteRails will never monkey with other people’s expectations. RewriteRails doesn’t physically modify the .rr files you write: The rewritten code is put in another file that the Ruby interpreter sees. So you see the code you write and RewriteRails figures out what to show the interpreter. This is a little like a Lisp macro.

The Thrush

In Combinatory Logic, the thrush is an extremely simple permuting combinator; it reverses the normal order of evaluation. The thrush is written Txy = yx. In Ruby terms,

thrush.call(a_value).call(a_proc)
  => a_proc.call(a_value)
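Like the kestrel, the thrush can be written down directly as nested lambdas (this binding is mine, for experimentation):

```ruby
# T combinator: take a value, return a function that applies its argument to that value
thrush = lambda { |x| lambda { |f| f.call(x) } }

thrush.call(2).call(lambda { |n| n * 10 })  # => 20
```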

In No Detail Too Small, I defined Object#into, an implementation of the thrush as a Ruby method:

class Object
  def into expr = nil
    expr.nil? ? yield(self) : expr.to_proc.call(self)
  end
end

If you are in the habit of violating the Law of Demeter, you can use #into to make an expression read consistently from left to right. For example, this code:

lambda { |x| x * x }.call((1..100).select(&:odd?).inject(&:+))

Reads “Square (take the numbers from 1 to 100, select the odd ones, and take the sum of those).” Confusing. Whereas with #into, you can write:

(1..100).select(&:odd?).inject(&:+).into { |x| x * x }

Which reads “Take the numbers from 1 to 100, keep the odd ones, take the sum of those, and then answer the square of that number.”

A permuting combinator like #into is not strictly necessary when you have parentheses or local variables. Which is kind of interesting, because it shows that if you have permuting combinators, you can model parentheses and local variables.

But we are not interested in theory. #into may be equivalent to what we can accomplish with other means, but it is useful to us if we feel it makes the code clearer and easier to understand. Sometimes a longer expression should be broken into multiple small expressions to make it easier to understand. Sometimes it can be reordered using tools like #into.

Let

Object#into defines the thrush as a method that takes a block, lambda, or anything that can become a block or lambda as its argument. There is another way to formulate a Thrush:

class Kernel
  def let it
    yield it
  end
end

It’s remarkably simple, so simple that it appears to be less useful than #into. The example above would look like this if we used let:

let (1..100).select(&:odd?).inject(&:+) do |x|
  x * x
end

How does that help? I’ll let you in on a secret: Ruby 1.9 changes the game. In Ruby 1.8, x is local to the surrounding method, so it doesn’t help. But in Ruby 1.9, x is a block local variable, meaning that it does not clobber an existing variable. So in Ruby 1.8:

def say_the_square_of_the_sum_of_the_odd_numbers(x)
  sotos = let (1..x).select(&:odd?).inject(&:+) do |x|
    x * x
  end
  "The square of the sum of the odd numbers from 1..#{x} is #{sotos}"
end

say_the_square_of_the_sum_of_the_odd_numbers(10)
 => "The square of the sum of the odd numbers from 1..25 is 625"

1..25!? What happened here is that the x inside the block clobbered the value of the x parameter. Not good. In Ruby 1.9:

say_the_square_of_the_sum_of_the_odd_numbers(10)
 => "The square of the sum of the odd numbers from 1..10 is 625"

Much better, Ruby 1.9 creates a new scope inside the block and x is local to that block, shadowing the x parameter. Now we see a use for let:

let(some_expression) do |my_block_local_variable|
  # ...
end

let creates a new scope and defines your block local variable inside the block. This signals that the block local variable is not used elsewhere. Imperative methods can be easier to understand when they are composed of smaller blocks with well-defined dependencies between them. A variable local to the entire method creates a dependency across the entire method. A variable local to a block only creates dependencies within that block.
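A small self-contained demonstration of the shadowing, using a minimal let of my own under Ruby 1.9+ semantics:

```ruby
def let(it)
  yield it
end

x = 10
# the block parameter x shadows the outer x instead of clobbering it
result = let(3) { |x| x * x }

result  # => 9
x       # => 10, unchanged
```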

Although Ruby 1.8 does not enforce this behaviour, it can be useful to write code in this style as a signal to make the code easier to read.

Summary

We have seen two formulations of the thrush combinator, #into and let. One is useful for making expressions more consistent and easier to read, the other for signaling the scope of block-local variables.

Songs of the Cardinal

In Combinatory Logic, the cardinal is one of the most basic permuting combinators; it reverses and parenthesizes the normal order of evaluation. The cardinal is written Cxyz = xzy. In Ruby:

cardinal.call(proc_over_proc).call(a_value).call(a_proc)
  => proc_over_proc.call(a_proc).call(a_value)

What does this mean? Let’s compare it to the Thrush. The thrush is written Txy = yx. In Ruby terms,

thrush.call(a_value).call(a_proc)
  => a_proc.call(a_value)

The salient difference is that a cardinal doesn’t just pass a_value to a_proc. What it does is first pass a_proc to proc_over_proc and then pass a_value to the result. This implies that proc_over_proc is a function that takes a function as its argument and returns a function.

Or in plainer terms, you want a cardinal when you would like to modify what a function or a block does. Now you can see why we can derive a thrush from a cardinal. If we write:

identity = lambda { |f| f }

Then we can write:

thrush = cardinal.call(identity)

What we have done is say a thrush is what you get when you use a cardinal and a function that doesn’t modify its function but answers it right back.
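Written as explicit lambdas, the derivation can be checked directly (these bindings are mine, following Cxyz = xzy):

```ruby
# C combinator: pass the proc to proc_over_proc, then pass the value to the result
cardinal = lambda { |f| lambda { |x| lambda { |g| f.call(g).call(x) } } }
identity = lambda { |f| f }

# a cardinal, given an identity bird, answers a thrush
thrush = cardinal.call(identity)

thrush.call(3).call(lambda { |n| n + 1 })  # => 4
```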

Note to ornithologists and ontologists:

This is not object orientation: a thrush is not a kind of cardinal. The correct relationship between them in Ruby is that a cardinal creates a thrush. Or in Smullyan’s songbird metaphor, if you call out the name of an identity bird to a cardinal, it will call out the name of a thrush back to you.

Now, this bizarre syntactic convention of writing foo.call(bar).call(bash) is not very helpful for actually writing software. It is great for explaining what’s going on, but if we are going to use Ruby for the examples, we need to lift our game up a level and make some idiomatic Ruby.

Building a Cardinal in Ruby

The next chunk of code works around the fact that Ruby 1.8 can’t define a proc that takes a block and also doesn’t allow define_method to define a method that takes a block. So for Ruby 1.8, we will start by making a utility method that defines methods that can take a block, based on an idea from coderr. For Ruby 1.9 this is not necessary: you can use define_method to define methods that take blocks as arguments.

def define_method_taking_block(name, &method_body_proc)
  self.class.send :define_method, "__cardinal_helper_#{name}__", &method_body_proc
  eval <<-EOM
    def #{name}(a_value, &a_proc)
      __cardinal_helper_#{name}__(a_value, a_proc)
    end
  EOM
end

Now we can see what the expression “accidental complexity” means. Do you see how we need a long paragraph and a chunk of code to explain how we are working around a limitation in our tool? And how the digression to explain the workaround is longer than the actual code we want to write? Ugh!

With that out of the way, we can write our cardinal:

def cardinal_define(name, &proc_over_proc)
  define_method_taking_block(name) do |a_value, a_proc|
      proc_over_proc.call(a_proc).call(a_value)
  end
end

Ready to try it? Here’s a familiar example. We’ll need a proc_over_proc, our proc that modifies another proc. Because we’re trying to be Ruby-ish, we’ll write it out as a block:

do |a_proc|
  lambda { |a_value|
    a_proc.call(a_value) unless a_value.nil?
  }
end

This takes a_proc and returns a brand new proc that only calls a_proc if the value you pass it is not nil. Now let’s use our cardinal to define a new method:

cardinal_define(:maybe) do |a_proc|
  lambda { |a_value|
    a_proc.call(a_value) unless a_value.nil?
  }
end

Let’s try it out:

maybe(1) { |x| x + 1 }
  => 2
maybe(nil) { |x| x + 1 }
  => nil

If we’re using Rails, we can make a slightly different version of maybe:

cardinal_define(:unless_blank) do |a_proc|
  lambda { |a_value|
    a_proc.call(a_value) unless a_value.blank?
  }
end

unless_blank(Person.find(...).name) do |name|
  register_name_on_title(name)
end

Remember we said the cardinal can be used to define a thrush? Let’s try our Ruby cardinal out to do the same thing. Recall that expressing the identity bird as a block is:

do |a_proc|
  a_proc
end

Therefore we can define a thrush with:

cardinal_define(:let) do |a_proc|
  a_proc
end

let((1..10).select { |n| n % 2 == 1 }.inject { |mem, var| mem + var }) do |x|
  x * x
end
  => 625

As you can see, once you have defined a cardinal, you can create an infinite variety of methods that have thrush-like syntax (a method that applies a value to a block), but you can modify or augment the semantics of the block in any way you want.

In Ruby terms, you are meta-programming. In Smullyan’s terms, you are Listening to the Songs of the Cardinal.

Quirky Birds and Meta-Syntactic Programming

In Combinatory Logic, the Queer Birds are a family of combinators which both parenthesize and permute. One member of the family, the Quirky Bird, has interesting implications for Ruby. The quirky bird is written Q3xyz = z(xy). In Ruby:

quirky.call(value_proc).call(a_value).call(a_proc)
  => a_proc.call(value_proc.call(a_value))
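As with the other birds, the quirky bird can be written out as nested lambdas to check the definition (the bindings below are mine):

```ruby
# Q3 combinator: Q3xyz = z(xy), i.e. apply value_proc to the value first,
# then pass the result to the final proc
quirky = lambda { |f| lambda { |x| lambda { |g| g.call(f.call(x)) } } }

double = lambda { |n| n * 2 }
quirky.call(double).call(3).call(lambda { |n| n + 1 })  # => 7, i.e. (3 * 2) + 1
```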

Like the Cardinal, the quirky bird reverses the order of application. But where the cardinal modifies the function that is applied to a value, the quirky bird modifies the value itself. Let’s compare how cardinals and quirky birds work.

a cardinal refresher

The cardinal is defined in its simplest Ruby form as:

cardinal.call(proc_over_proc).call(a_value).call(a_proc)
  => proc_over_proc.call(a_proc).call(a_value)

From that definition, we wrote a method called cardinal_define that writes methods in idiomatic Ruby. For example, here’s how we used cardinal_define to generate the maybe method:

cardinal_define(:maybe) do |a_proc|
  lambda { |a_value|
    a_proc.call(a_value) unless a_value.nil?
  }
end

maybe(1) { |x| x + 1 }
  => 2
maybe(nil) { |x| x + 1 }
  => nil

Now we are not looking at the source code for maybe, but from the definition of a cardinal above we know that any method defined by cardinal_define will look roughly like:

def defined_by_a_cardinal(a_value, &a_proc)
  proc_over_proc.call(a_proc).call(a_value)
end

Or in our case:

def maybe(a_value, &a_proc)
  lambda do |a_proc|
    lambda { |a_value|
      a_proc.call(a_value) unless a_value.nil?
    }
  end.call(a_proc).call(a_value)
end

and now to the quirky bird

From the definition for the quirky bird, we expect that if we write quirky_bird_define, the methods it generates will look roughly like:

def defined_by_a_quirky_bird(a_value, &a_proc)
  a_proc.call(value_proc.call(a_value))
end

So, are we ready to write quirky_bird_define? This seems too easy. Just copy the cardinal_define code, make a few changes, and we’re done:

def quirky_bird_define(name, &value_proc)
  define_method_taking_block(name) do |a_value, a_proc|
    a_proc.call(value_proc.call(a_value))
  end
end

# method_body_proc should expect (a_value, a_proc)
# see http://coderrr.wordpress.com/2008/10/29/using-define_method-with-blocks-in-ruby-18/
def define_method_taking_block(name, &method_body_proc)
  self.class.send :define_method, "__quirky_bird_helper_#{name}__", method_body_proc
  eval <<-EOM
    def #{name}(a_value, &a_proc)
      __quirky_bird_helper_#{name}__(a_value, a_proc)
    end
  EOM
end
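
Incidentally, the eval workaround above is only necessary on Ruby 1.8. From Ruby 1.9 onward, define_method accepts an explicit block parameter, so a modern sketch (mine, not the book’s) can dispense with the helper method entirely:

```ruby
# Ruby 1.9+ sketch: define_method takes &a_proc directly,
# so no eval or helper-method indirection is needed
def quirky_bird_define(name, &value_proc)
  Object.send(:define_method, name) do |a_value, &a_proc|
    a_proc.call(value_proc.call(a_value))
  end
end

quirky_bird_define(:square_first) { |a_value| a_value * a_value }

square_first(2) { |n| n + 1 }  # => 5
```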

Ok, let’s try it out on something really trivial:

quirky_bird_define(:square_first) do |a_value|
  a_value * a_value
end

square_first(1) { |n| n + 1 }
  => 2

square_first(2) { |n| n + 1 }
  => 5

It works, good. Now let’s define maybe using the quirky bird we just wrote. Just so we’re clear, I want to write:

quirky_bird_define(:maybe) do |a_value|
  # ... something goes here ...
end

maybe(1) { |n| n + 1 }
  => 2

maybe(nil) { |n| n + 1 }
  => nil

Scheisse! Figuring out what to put in the block to make maybe work is indeed queer and quirky!!

Now, the simple truth is, I know of no way to use a quirky bird to cover all of the possible blocks you could use with maybe so that it works exactly like the version of maybe we built with a cardinal. However, I have found that sometimes it is interesting to push an incomplete idea along if it is incomplete in interesting ways. “Maybe” we can learn something in the process.

A limited interpretation of the Quirky Bird in Ruby

Let’s solve maybe any-which-way-we-can and see how it goes. When we used a cardinal, we wanted a proc that would modify another proc such that if it was passed nil, it would answer nil without evaluating its contents.

Now we want to modify a value such that if it is nil, it responds nil to the method +. This is doable with the help of the BlankSlate class, also called a BasicObject. You’ll find BlankSlate and BasicObject classes in various frameworks and in Ruby 1.9, and there’s one at blank_slate.rb you can use.

BlankSlate is a class with no methods, which is very different from the base class Object. That’s because Object in Ruby is heavyweight: it has lots of useful stuff. But we don’t want useful stuff, because our mission is to answer a value that responds nil to any method you send it.

The Ruby way to handle any method is with method_missing. Here’s a really simple expression that answers an object that responds nil to any method:

returning(BlankSlate.new) do |it|
  def it.method_missing(*args)
    nil
  end
end
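
If you don’t have blank_slate.rb handy, minimal stand-ins for BlankSlate and returning look something like the following. These are approximations for experimenting with the examples, not the book’s exact helpers:

```ruby
# A minimal BlankSlate: undefine every inherited instance method
# except the reserved __*__ methods and object_id
class BlankSlate
  instance_methods.each do |m|
    undef_method(m) unless m.to_s =~ /^__|^object_id$/
  end
end

# Kestrel-style helper: yield the value to the block, then return it
def returning(value)
  yield(value)
  value
end

nothing = returning(BlankSlate.new) do |it|
  def it.method_missing(*args)
    nil
  end
end

nothing + 1       # => nil
nothing.anything  # => nil
```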

Hmmm. What about:

quirky_bird_define(:maybe) do |value|
  if value.nil?
    returning(BlankSlate.new) do |it|
      def it.method_missing(*args)
        nil
      end
    end
  else
    value
  end
end

This is saying, “Let’s define a quirky bird method based on a value_proc as usual. Our value_proc will take a value, and if the value is nil we will return an object that responds with nil to any method. But if the value is not nil, our value_proc will respond with the object.”

Let’s try it:

maybe(1) { |n| n + 1 }
  => 2

maybe(nil) { |n| n + 1 }
  => nil

Now, I admit this is very flawed:

maybe(nil) { |n| n + 1 + 1 }
  => NoMethodError: undefined method `+’ for nil:NilClass

maybe(nil) { |n| 1 + n }
  => TypeError: coerce must return [x, y]

The basic problem here is that we only control the value we pass in. We can’t modify how other objects respond to it, nor can we control what happens to any objects we return from methods called on it. So, the quirky bird turns out to be useful in the case where (a) the value is the receiver of a method, and (b) there is only one method being called, not a chain of methods.

Hmmm again.

Embracing the Quirky Bird

Maybe we shouldn’t be generating methods that deal with arbitrary blocks and procedures. One way to scale this down is to deal only with single method invocations. For example, what if instead of designing our new version of maybe so that we invoke it by writing maybe(nil) { |n| n + 1 } or maybe(1) { |n| n + 1 }, we design it so that we write nil.maybe + 1 or 1.maybe + 1 instead?

In that case, maybe becomes a method on the Object class that applies value_proc to its receiver rather than being a method that takes a value and a block. Getting down to business, we are going to open the core Object class and add a new method to it. The body of that method will be our value_proc:

def quirky_bird_extend(name, &value_proc)
  Object.send(:define_method, name) do
    value_proc.call(self)
  end
end

Just as we said, we are defining a new method in the Object class.

We are using define_method and a block rather than the def keyword. The reason is that when we use define_method and a block, the body of the method executes in the context of the block, not the context of the object itself. Blocks are closures in Ruby, which means that the block has access to value_proc, the parameter from our quirky_bird_extend method.

Had we used def, Ruby would try to evaluate value_proc in the context of the object itself. So our parameter would be lost forever. Performance wonks and compiler junkies will be interested in this behaviour, as it has very serious implications for garbage collection and memory leaks.

Now let’s use it with exactly the same block we used with quirky_bird_define:

require 'quirky_bird'
require 'blank_slate'
require 'returning'

quirky_bird_extend(:maybe) do |value|
  if value.nil?
    returning(BlankSlate.new) do |it|
      def it.method_missing(*args)
        nil
      end
    end
  else
    value
  end
end

nil.maybe + 1
  => nil

1.maybe + 1
  => 2

It works. And it looks familiar! We have defined our own version of andand, only this is much more interesting. Instead of a one-off handy-dandy, we have created a method that creates similar methods.

Let’s try it again, this time emulating Chris Wanstrath’s try:

quirky_bird_extend(:try) do |value|
  returning(BlankSlate.new) do |it|
    def it.__value__=(arg)
       @value = arg
    end
    def it.method_missing(name, *args)
      if @value.respond_to?(name)
        @value.send(name, *args)
      end
    end
    it.__value__ = value
  end
end

nil.try + 1
  => nil

1.try + 1
  => 2

1.try.ordinalize
  => nil

As you can see, we can use the quirky bird to create a whole family of methods that modify the receiver in some way to produce new semantics. I can’t show you the source code, but here is something from a proprietary Rails application:

Account.without_requiring_authorization.create!(...)

In this case, without_requiring_authorization follows the quirky bird pattern, only instead of taking an instance and producing a version that handles certain methods specially, this one takes a class and produces a version that doesn’t enforce authorization for use in test cases.

so what have we learned?

The quirky bird is superficially similar to the cardinal; however, it can be used to generate syntax that is a little more method-oriented than function-oriented. And what’s better than a handy method like andand? A method for defining such methods, of course.

Andand even more

As we’ve discovered, “andand” is a Quirky Bird. Here’s a little tip for using it effectively: You already know that you can use it to conditionally invoke a method on an object:

"foo".andand + "bar"
    # => "foobar"
nil.andand + "bar"
    # => nil

In other words, it’s a Quirky Bird. But did you know that you can also use it to conditionally invoke a block?

(1..10).andand do |numbers|
  doubles = numbers.map { |n| n * 2 }
  double_doubles = doubles.map { |n| n * 2 }
end
    # => [4, 8, 12, 16, 20, 24, 28, 32, 36, 40]

nil.andand do |numbers|
  doubles = numbers.map { |n| n * 2 }
  double_doubles = doubles.map { |n| n * 2 }
end
    # => nil

It’s not just a Quirky Bird, it’s also a Cardinal!
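
If you want to experiment with the block-taking form without installing the gem, here is a minimal sketch of just that side of andand. The real gem is more sophisticated (it also supports the proxy form shown earlier), so treat this as an approximation, not its actual implementation:

```ruby
# Sketch: the block-taking half of andand only -- yields the
# receiver to the block unless the receiver is nil
class Object
  def andand(&a_block)
    self.nil? ? self : a_block.call(self)
  end
end

(1..3).andand { |numbers| numbers.map { |n| n * 2 } }  # => [2, 4, 6]
nil.andand { |numbers| numbers.map { |n| n * 2 } }     # => nil
```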

Consider this conditional code:

if my_var = something_or_other()
  3.times do
    yada(my_var)
  end
end

I’m not a big fan. The obvious sin is the pathetic 90s cultural reference. But I’m even more annoyed by having side-effects in the predicate of an if clause, in this case assigning something to the variable my_var. Although I’m not switching to a purely functional language any time soon, I strongly prefer that when you write if something(), then “something()” should not cause any side effects, ever.

Another problem is that we are obviously creating my_var just to use inside the block, but we’re declaring it in top-level scope. We could fool around with a Thrush like let, but instead let’s use Object#andand:

something_or_other().andand do |my_var|
  3.times do
    yada(my_var)
  end
end

Now we are making it clear that we wish to execute this block only if something_or_other() is not nil. Furthermore, we are assigning the result of something_or_other() to my_var and using it within the block. Crisp and clean, no caffeine.

Note that if we don’t actually need my_var in the block, we don’t really need andand either:

something_or_other() and begin
  3.times do
    yada()
  end
end

Like anything else, andand do ... end is a tool to be used in specialized situations. Use it whenever you want to do something more complicated than a simple message send, but only when the subject is not nil.

Aspect-Oriented Programming in Ruby using Combinator Birds

In Combinatory Logic, the bluebird is one of the most important and fundamental combinators, because the bluebird composes two other combinators. Although this is usually discussed as part of functional programming style, it is just as valuable when writing object-oriented programs. In this chapter, we will develop an aspect-oriented programming (or “AOP”) module that adds before methods and after methods to Ruby programs, with the implementation inspired by the bluebird. The bluebird is written Bxyz = x(yz). In Ruby, we can express the bluebird like this:

bluebird.call(proc1).call(proc2).call(value)
  => proc1.call(proc2.call(value))

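Transcribed into curried lambdas, the bluebird is (a sketch mirroring the notation above, not library code):

```ruby
# The bluebird composes two procs: Bxyz = x(yz)
bluebird = lambda do |proc1|
  lambda do |proc2|
    lambda do |value|
      proc1.call(proc2.call(value))
    end
  end
end

add_one = lambda { |n| n + 1 }
double  = lambda { |n| n * 2 }

# (3 * 2) + 1, built by composition rather than syntax
bluebird.call(add_one).call(double).call(3)  # => 7
```
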
If this seems a little arcane, consider a simple Ruby expression (x * 2) + 1: This expression composes multiplication and addition. Composition is so pervasive in programming languages that it becomes part of the syntax, something we take for granted. We don’t have to think about it until someone like Oliver Steele writes a library like functional javascript that introduces a compose function, then we have to ask what it does.

Before we start using bluebirds, let’s be clear about something. We wrote that bluebird.call(proc1).call(proc2).call(value) is equivalent to proc1.call(proc2.call(value)). We want to be very careful that we understand what is special about proc1.call(proc2.call(value)). How is it different from proc1.call(proc2).call(value)?

The answer is:

proc1.call(proc2.call(value))
  => puts value into proc2, then puts the result of that into proc1

proc1.call(proc2).call(value)
  => puts proc2 into proc1, getting a function out, then puts value into the new function

So with a bluebird you can chain functions together in series, while if you didn’t have a bluebird all you could do is write functions that transform other functions. Not that there’s anything wrong with that, we used that to great effect with cardinals and quirky birds.

Giving methods advice

We’re not actually going to Greenspun an entire aspect-oriented layer on top of Ruby, but we will add a simple feature: before and after methods. You already know what a normal method is. A before method simply specifies some behaviour you want executed before the method is called, while an after method specifies some behaviour you want executed after the method is called. In AOP, before and after methods are called “advice.”

There is an unwritten rule that says every Ruby programmer must, at some point, write his or her own AOP implementation –Avdi Grimm

Ruby on Rails programmers are familiar with method advice. If you have ever written any of the following, you were using Rails’ built-in aspect-oriented programming support:

after_save
validates_each
alias_method_chain
before_filter

These and other features of Rails implement method advice, albeit in a very specific way tuned to portions of the Rails framework. We’re going to implement method advice in a module that you can use in any of your classes, on any method or methods you choose. We’ll start with before methods. Here’s the syntax we want:

def something(parameter)
  # do stuff...
end

before :something do |parameter|
  # stuff to do BEFORE we do stuff...
end

before :something do |parameter|
  # stuff to do BEFORE stuff to do BEFORE we do stuff...
end

As we can see, the before methods get chained together before the method. To keep this nice and clean, we are going to make them work just like composable functions: whatever our before method’s block returns will be passed as a parameter up the chain. We also won’t fool around with altering the order of before methods, we’ll just take them as they come.

This is really simple, we are composing methods. To compare to the bluebird above, we are writing before, then the name of a method, then a function. I’ll rewrite it like this:

bluebird.call(something).call(stuff_to_do_before_we_do_stuff).call(value)
  => something.call(stuff_to_do_before_we_do_stuff.call(value))

Now we can see that this newfangled aspect-oriented programming stuff was figured out nearly a century ago by people like Alonzo Church.

Okay, enough history, let’s get started. First, we are not going to write any C, so there is no way to actually force the Ruby VM to call our before methods. So instead, we are going to have to rewrite our method. We’ll use a trick I found on Jay Fields’ blog:

module NaiveBeforeMethods

  module ClassMethods

    def before(method_sym, &block)
      old_method = self.instance_method(method_sym)
      if old_method.arity == 0
        define_method(method_sym) do
          block.call
          old_method.bind(self).call
        end
      else
        define_method(method_sym) do |*params|
          old_method.bind(self).call(*block.call(*params))
        end
      end
    end

  end

  def self.included(receiver)
    receiver.extend         ClassMethods
  end

end

As you can see, we have a special case for methods with no parameters, and when we have a method with multiple parameters, our before method must answer an array of parameters. And the implementation relies on a “flock of bluebirds”: our before methods and the underlying base method are composed with each other to define the method that is actually executed at run time.

Using it is very easy:

class SuperFoo

  def one_parameter(x)
    x + 1
  end

  def two_parameters(x, y)
    x * y
  end

end

class Foo < SuperFoo

  include NaiveBeforeMethods

  before :one_parameter do |x|
    x * 2
  end

  before :two_parameters do |x, y|
    [x + y, x - y]
  end

end

Foo.new.one_parameter(5)
  => 11

Foo.new.two_parameters(3,1)
  => 8

This could be even more useful if it supported methods with blocks. Adventurous readers may want to combine this code with the tricks in cardinal.rb and see if they can build a version of before that supports methods that take blocks.

The super keyword, perhaps you’ve heard of it?

Of course, Ruby provides a means of ‘decorating’ methods like this by overriding a method and calling super within it. So we might have written:

class Foo < SuperFoo

  def one_parameter(x)
    super(x * 2)
  end

  def two_parameters(x, y)
    super(x + y, x - y)
  end

end

On a trivial example, the two techniques seem equivalent, so why bother with the extra baggage? The answer is that using super is a little low level. When you see a method definition in a language like Ruby, you don’t know whether you are defining a new method, overriding an existing method with entirely new functionality, or “decorating” a method with before advice. Using advice can be useful when you want to signal exactly what you are trying to accomplish.

Another reason to prefer method advice is when you want to share some functionality:

class LoggingFoo < SuperFoo

  def one_parameter(x)
    log_entry
    returning(super) do
      log_exit
    end
  end

  def two_parameters(x, y)
    log_entry
    returning(super) do
      log_exit
    end
  end

end

This could be written as:

class LoggingFoo < SuperFoo

  include NaiveBeforeMethods

  before :one_parameter, :two_parameters do # see below
    log_entry
  end

  after :one_parameter, :two_parameters do
    log_exit
  end

end

This cleanly separates the concern of logging from the mechanism of what the methods actually do.

Although this is not the main benefit, method advice also works with methods defined in modules and the current class, not just superclasses. So in some ways it is even more flexible than Ruby’s super keyword.

The Queer Bird

That looks handy. But we also want an after method, a way to compose methods in the other order. Good news, the Queer Bird combinator is exactly what we want. Written Qxyz = y(xz), the Ruby equivalent is:

queer_bird.call(something).call(stuff_to_do_after_we_do_stuff).call(value)
  => stuff_to_do_after_we_do_stuff.call(something.call(value))

Which is, of course:

def something(parameter)
  # do stuff...
end

after :something do |return_value|
  # stuff to do AFTER we do stuff...
end

The difference between before and after advice is that after advice consumes and transforms whatever the method returns, while before advice consumes and transforms the parameters to the method.
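
Mirroring NaiveBeforeMethods, a naive after sketch is a straightforward transcription of the queer bird. This is my own illustration of the pattern, ignoring for now the real-world requirements discussed next:

```ruby
# A naive after implementation, mirroring NaiveBeforeMethods:
# the block consumes and transforms whatever the method returns
module NaiveAfterMethods

  module ClassMethods

    def after(method_sym, &block)
      old_method = self.instance_method(method_sym)
      define_method(method_sym) do |*params|
        block.call(old_method.bind(self).call(*params))
      end
    end

  end

  def self.included(receiver)
    receiver.extend ClassMethods
  end

end

class Doubler
  include NaiveAfterMethods

  def double(x)
    x * 2
  end

  after :double do |returned|
    returned + 1
  end
end

Doubler.new.double(5)  # => 11
```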

We could copy, paste and modify our bluebird code for the before methods to create after methods. But before you rush off to implement that, you might want to think about a few interesting “real world” requirements:

  1. If you define before and after methods in any order, the final result should be that all of the before methods are run before the main method, then all of the after methods. This is not part of combinatory logic, but it’s the standard behaviour people expect from before and after methods.
  2. You should be able to apply the same advice to more than one method, for example by writing after :foo, :bar do ... end
  3. If you declare parameters for before advice, whatever it returns will be used by the next method, just like the example above. If you do not declare parameters for before advice, whatever it returns should be ignored. The same goes for after advice.
  4. If you override the main method, the before and after methods should still work.
  5. The blocks provided should execute in the receiver’s scope, like method bodies.

One implementation meeting these requirements is in the appendix. Embedded in a lot of extra moving parts, the basic pattern of composing methods is still evident:

naive_before_advice.rb


module NaiveBeforeAdvice

  module ClassMethods

    def before(method_sym, &block)
      old_method = self.instance_method(method_sym)
      if old_method.arity == 0
        define_method(method_sym) do
          block.call
          old_method.bind(self).call
        end
      else
        define_method(method_sym) do |*params|
          old_method.bind(self).call(*block.call(*params))
        end
      end
    end

  end

  def self.included(receiver)
    receiver.extend         ClassMethods
  end

end

That is why we looked at supporting just before methods first. If you are comfortable with the naïve implementation of before advice discussed above, the mechanism is easy to understand. The complete version is considerably more powerful. As mentioned, it supports before and after advice. It also uses instance_exec to evaluate the blocks in the receiver’s scope, providing access to private methods and instance variables. And it works properly even when you override the method being advised.

Mockingbirds

In this chapter, we will meet a combinator that duplicates its arguments, and see how to use it to achieve recursion. Such combinators are called recursive combinators, and are an important foundation for separating the concrete implementation of an algorithm from its definition.

Duplicative Combinators

Almost all of the combinators we’ve seen in previous chapters “conserve” their arguments. For example, if you pass xyz to a Bluebird, you get one x, one y, and one z back, exactly what you passed in. You get x(yz) back, so they have been grouped for you, but nothing has been added and nothing has been taken away. Likewise the Thrush reverses its arguments, but again it answers back the same number of arguments you passed to it. The Kestrel, on the other hand, does not conserve its arguments: it erases one. If you pass xy to a Kestrel, you only get x back. The y is erased.

Today we are going to meet another combinator that does not conserve its arguments, the Mockingbird. Where a Kestrel erases one of its arguments, the Mockingbird duplicates its argument. In logic notation, Mx = xx. Or in Ruby:

mockingbird.call(x)
  #=> x.call(x)

The Mockingbird is not the only combinator that duplicates one or more of its arguments. Logicians have also found important uses for many other duplicating combinators like the Starling (Sxyz = xz(yz)), which is one half of the SK combinator calculus, and the Turing Bird (Uxy = y(xxy)), which is named after its discoverer.
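
For the curious, the Starling can also be written out directly as curried lambdas. A quick sketch (mine, just to show the duplication of the last argument):

```ruby
# The starling duplicates its last argument: Sxyz = xz(yz)
starling = lambda do |x|
  lambda do |y|
    lambda do |z|
      x.call(z).call(y.call(z))
    end
  end
end

add    = lambda { |a| lambda { |b| a + b } }
double = lambda { |n| n * 2 }

# add(3)(double(3)), i.e. 3 + 6
starling.call(add).call(double).call(3)  # => 9
```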

The great benefit of duplicative combinators from a theoretical perspective is that combinators that duplicate an argument can be used to introduce recursion without names, scopes, bindings, and other things that clutter things up. Being able to introduce anonymous recursion is very elegant, and there are times when it is useful in its own right.

Recursive Lambdas in Ruby

Let’s write a simple recursive combinator in Ruby from first principles. To start with, let’s pick a recursive algorithm to implement: We’ll sum the numbers of a nested list. In other words, we’re going to traverse a tree of numbers and generate the sum of the leaves, a recursive problem.

This is a trivial problem in Ruby: [1, [[2,3], [[[4]]]]].flatten.inject(&:+) will do the trick. Of course, it does so by calling .flatten, a built-in method that is itself recursive. However, by picking a really simple example, it’s easy to focus on the recursion rather than on the domain-specific parts of our problem. That will make things look a little over-engineered here, but when you’re interested in the engineering, that’s a good thing.

So what is our algorithm?

  1. If we are given a number, return it.
  2. If we are given a list, call ourself for each item of the list and sum the numbers that are returned.

In Ruby:

sum_of_nested_list = lambda do |arg|
  arg.kind_of?(Numeric) ? arg : arg.map { |item| sum_of_nested_list.call(item) }.inject(&:+)
end

One reason we don’t like this is that it breaks badly if we ever modify the variable sum_of_nested_list. Although you may think that’s unlikely, it can happen when writing the method combinators you’ve seen in previous chapters. For example, imagine you wanted to write to the log when calling this function, but only once, you don’t want to write to the log when it calls itself.

old_sum = sum_of_nested_list
sum_of_nested_list = lambda do |arg|
  puts "sum_of_nested_list(#{arg.inspect})"
  old_sum.call(arg)
end

sum_of_nested_list.call([[[[[6]]]]])

  sum_of_nested_list([[[[[6]]]]])
  sum_of_nested_list([[[[6]]]])
  sum_of_nested_list([[[6]]])
  sum_of_nested_list([[6]])
  sum_of_nested_list([6])
  sum_of_nested_list(6)
    #=> 6

This doesn’t work because inside our original sum_of_nested_list, we call sum_of_nested_list by name. If that gets redefined by a method combinator or anything else, we’re calling the new thing and not the old one.

Another reason to eschew having lambdas call themselves by name is that we won’t be able to create anonymous recursive lambdas. Although naming things is an important part of writing readable software, being able to make anonymous things like object literals opens up a world where everything is truly first class and can be created on the fly or passed around like parameters. So by figuring out how to have lambdas call themselves without using their names, we’re figuring out how to make all kinds of lambdas anonymous and flexible, not just the non-recursive ones.

Recursive Combinatorics

The combinator way around this is to find a way to pass a function to itself as a parameter. If a lambda only ever calls its own parameters, it doesn’t depend on anything being bound to a name in its environment. Let’s start by rewriting our function to take itself as an argument:

sum_of_nested_list = lambda do |myself, arg|
  arg.kind_of?(Numeric) ? arg : arg.map { |item| myself.call(myself, item) }.inject(&:+)
end

One little problem: How are we going to pass our function to itself? Let’s start by currying it into a function that takes one argument, itself, and returns a function that takes an item:

sum_of_nested_list = lambda do |myself|
  lambda do |arg|
    arg.kind_of?(Numeric) ? arg : arg.map { |item| myself.call(myself).call(item) }.inject(&:+)
  end
end

Notice that we now have myself call itself and have the result call an item. To use it, we have to have it call itself:

sum_of_nested_list.call(sum_of_nested_list).call([1, [[2,3], [[[4]]]]]) #=> 10

This works, but is annoying. Writing our function to take itself as an argument and return a function is one thing, we can fix that, but having our function call itself by name defeats the very purpose of the exercise. Let’s fix it. First thing we’ll do, let’s get rid of myself.call(myself).call(item). We’ll use a new parameter, recurse (it’s the last parameter in an homage to callback-oriented programming style). We’ll pass it myself.call(myself), thus removing myself.call(myself) from our inner lambda:

sum_of_nested_list = lambda do |myself|
  lambda do |arg|
    lambda do |arg, recurse|
      arg.kind_of?(Numeric) ? arg : arg.map { |item| recurse.call(item) }.inject(&:+)
    end.call(arg, myself.call(myself))
  end
end

sum_of_nested_list.call(sum_of_nested_list).call([1, [[2,3], [[[4]]]]])
#=> 10

Next, we hoist our code out of the middle and make it a parameter. This allows us to get rid of sum_of_nested_list.call(sum_of_nested_list) by moving it into our lambda:

sum_of_nested_list = lambda do |fn|
  lambda { |x| x.call(x) }.call(
    lambda do |myself|
      lambda do |arg|
        fn.call(arg, myself.call(myself))
      end
    end
  )
end.call(
  lambda do |arg, recurse|
    arg.kind_of?(Numeric) ? arg : arg.map { |item| recurse.call(item) }.inject(&:+)
  end
)

sum_of_nested_list.call([1, [[2,3], [[[4]]]]])
#=> 10

Lots of code there, but let’s check and see that it works as an anonymous lambda:

lambda do |fn|
  lambda { |x| x.call(x) }.call(
    lambda do |myself|
      lambda do |arg|
        fn.call(arg, myself.call(myself))
      end
    end
  )
end.call(
  lambda do |arg, recurse|
    arg.kind_of?(Numeric) ? arg : arg.map { |item| recurse.call(item) }.inject(&:+)
  end
).call([1, [[2,3], [[[4]]]]])
#=> 10

Looking at this final example, we can see it has two cleanly separated parts:

# The recursive combinator

lambda do |fn|
  lambda { |x| x.call(x) }.call(
    lambda do |myself|
      lambda do |arg|
        fn.call(arg, myself.call(myself))
      end
    end
  )
end.call(

  # The lambda we wish to make recursive

  lambda do |arg, recurse|
    arg.kind_of?(Numeric) ? arg : arg.map { |item| recurse.call(item) }.inject(&:+)
  end

)

Recursive Combinators in Idiomatic Ruby

We’ve now managed to separate the mechanism of recursing (the combinator) from what we want to do while recursing. Let’s formalize this and make it idiomatic Ruby. We’ll make it a method for creating recursive lambdas and call it with a block instead of a lambda:

def lambda_with_recursive_callback
  lambda { |x| x.call(x) }.call(
    lambda do |myself|
      lambda do |arg|
        yield(arg, myself.call(myself))
      end
    end
  )
end

sum_of_nested_list = lambda_with_recursive_callback do |arg, recurse|
  arg.kind_of?(Numeric) ? arg : arg.map { |item| recurse.call(item) }.inject(&:+)
end

sum_of_nested_list.call([1, [[2,3], [[[4]]]]])
#=> 10

Not bad. But hey, let’s DRY things up. Aren’t x.call(x) and myself.call(myself) the same thing?

The Mockingbird

Yes, x.call(x) and myself.call(myself) are the same thing:

def mockingbird &x
  x.call(x)
end

def lambda_with_recursive_callback
  mockingbird do |myself|
    lambda do |arg|
      yield(arg, mockingbird(&myself))
    end
  end
end

sum_of_nested_list = lambda_with_recursive_callback do |arg, recurse|
  arg.kind_of?(Numeric) ? arg : arg.map { |item| recurse.call(item) }.inject(&:+)
end

sum_of_nested_list.call([1, [[2,3], [[[4]]]]])
#=> 10

But does it blend?

lambda_with_recursive_callback { |arg, recurse|
  arg.kind_of?(Numeric) ? arg : arg.map { |item| recurse.call(item) }.inject(&:+)
}.call([1, [[2,3], [[[4]]]]])
#=> 10

Yes!

And now we have our finished recursive combinator. We are able to create recursive lambdas in Ruby without relying on environment variables, just on parameters passed to blocks or lambdas. Our recursive combinator is built on the simplest and most basic of duplicating combinators, the Mockingbird.

In the next chapter, we’ll build more sophisticated (and practical) recursive combinators. And while doing so, we’ll take an aggressive approach to separating interfaces from implementations in algorithms.

Refactoring Methods with Recursive Combinators

In previous chapters, we have met some of Combinatory Logic’s most interesting combinators like the Kestrel, Thrush, Cardinal, Quirky Bird, and Bluebird. Today we are going to learn how combinators can help us separate the general form of an algorithm like “divide and conquer” from its specific concrete steps. Consider the method #sum_squares: It sums the squares of a tree of numbers, represented as a nested list.

def sum_squares(value)
  if value.kind_of?(Enumerable)
    value.map do |sub_value|
      sum_squares(sub_value)
    end.inject() { |x,y| x + y }
  else
    value ** 2
  end
end

p sum_squares([1, 2, 3, [[4,5], 6], [[[7]]]])
  => 140

And the method #rotate: It rotates a square matrix, provided the length of each side is a power of two:

def rotate(square)
  if square.kind_of?(Enumerable) && square.size > 1
    half_sz = square.size / 2
    sub_square = lambda do |row, col|
      square.slice(row, half_sz).map { |a_row| a_row.slice(col, half_sz) }
    end
    upper_left = rotate(sub_square.call(0,0))
    lower_left = rotate(sub_square.call(half_sz,0))
    upper_right = rotate(sub_square.call(0,half_sz))
    lower_right = rotate(sub_square.call(half_sz,half_sz))
    upper_right.zip(lower_right).map { |l,r| l + r } +
      upper_left.zip(lower_left).map { |l,r| l + r }
  else
    square
  end
end

p rotate([[1,2,3,4], [5,6,7,8], [9,10,11,12], [13,14,15,16]])
  => [[4, 8, 12, 16], [3, 7, 11, 15], [2, 6, 10, 14], [1, 5, 9, 13]]

Our challenge is to refactor them. You could change sub_square from a closure to a private method (and in languages like Java, you have to do that in the first place). What else? Is there any common behaviour we can extract from these two methods?

Looking at the two methods, there are no lines of code that are so obviously identical that we could mechanically extract them into a private helper. Automatic refactoring tools fall down given these two methods. And yet, there is a really, really important refactoring that should be performed here.

Divide and Conquer

Both of these methods use the Divide and Conquer strategy.

As described, there are two parts to each divide and conquer algorithm. We’ll start with conquer: you need a way to decide if the problem is simple enough to solve in a trivial manner, and a trivial solution. You’ll also need a way to divide the problem into sub-problems if it’s too complex for the trivial solution, and a way to recombine the pieces back into the solution. The entire process is carried out recursively.

For example, here’s how #rotate rotated the square. We started with a square matrix of size 4:

[
  [  1,  2,  3,  4],
  [  5,  6,  7,  8],
  [  9, 10, 11, 12],
  [ 13, 14, 15, 16]
]

That cannot be rotated trivially, so we divided it into four smaller sub-squares:

[            [
  [  1,  2],   [  3,  4],
  [  5,  6]    [  7,  8]
]            ]

[            [
  [  9, 10],   [ 11, 12],
  [ 13, 14]    [ 15, 16]
]            ]

Those couldn’t be rotated trivially either, so our algorithm divided each of them into four smaller squares again, giving us sixteen squares of one number each. Those are small enough to rotate trivially (they do not change), so the algorithm could stop subdividing.

We said there was a recombination step. For #rotate, four sub-squares are recombined into one square by moving them counter-clockwise 90 degrees. The sixteen smallest squares were recombined into four sub-squares like this:

[            [
  [  2,  6],   [  4,  8],
  [  1,  5]    [  3,  7]
]            ]

[            [
  [ 10, 14],   [ 12, 16],
  [  9, 13]    [ 11, 15]
]            ]

Then those four squares were recombined into the final result like this:

[            [
  [  4,  8],   [ 12, 16],
  [  3,  7]    [ 11, 15]
]            ]

[            [
  [  2,  6],   [ 10, 14],
  [  1,  5]    [  9, 13]
]            ]

And smooshed (that is the technical term) back together:

[
  [  4,  8,  12, 16],
  [  3,  7,  11, 15],
  [  2,  6,  10, 14],
  [  1,  5,   9, 13]
]

And Voila! There is your rotated square matrix.

Both rotation and summing the squares of a tree combine the four steps of a divide and conquer strategy:

  1. Deciding whether the problem is divisible into smaller pieces or can be solved trivially,
  2. A solution for the trivial case,
  3. A way to divide a non-trivial problem up,
  4. And a way to piece it back together.

Here are the two methods re-written to highlight the common strategy. First, #sum_squares_2:

public

def sum_squares_2(value)
  if sum_squares_divisible?(value)
    sum_squares_recombine(
      sum_squares_divide(value).map { |sub_value| sum_squares_2(sub_value) }
    )
  else
    sum_squares_conquer(value)
  end
end

private

def sum_squares_divisible?(value)
  value.kind_of?(Enumerable)
end

def sum_squares_conquer(value)
  value ** 2
end

def sum_squares_divide(value)
  value
end

def sum_squares_recombine(values)
  values.inject() { |x,y| x + y }
end

And #rotate_2:

public

def rotate_2(value)
  if rotate_divisible?(value)
    rotate_recombine(
      rotate_divide(value).map { |sub_value| rotate_2(sub_value) }
    )
  else
    rotate_conquer(value)
  end
end

private

def rotate_divisible?(value)
  value.kind_of?(Enumerable) && value.size > 1
end

def rotate_conquer(value)
  value
end

def rotate_divide(value)
  half_sz = value.size / 2
  sub_square = lambda do |row, col|
    value.slice(row, half_sz).map { |a_row| a_row.slice(col, half_sz) }
  end
  upper_left = sub_square.call(0,0)
  lower_left = sub_square.call(half_sz,0)
  upper_right = sub_square.call(0,half_sz)
  lower_right = sub_square.call(half_sz,half_sz)
  [upper_left, lower_left, upper_right, lower_right]
end

def rotate_recombine(values)
  upper_left, lower_left, upper_right, lower_right = values
  upper_right.zip(lower_right).map { |l,r| l + r } +
  upper_left.zip(lower_left).map { |l,r| l + r }
end

Now the common code is glaringly obvious. The main challenge in factoring it into a helper is deciding whether you want to represent methods like #rotate_divide as lambdas or want to fool around specifying method names as symbols. Let’s go with lambdas for the sake of writing a clear example:

public

def sum_squares_3(list)
  divide_and_conquer(
    list,
    :divisible? => lambda { |value| value.kind_of?(Enumerable) },
    :conquer    => lambda { |value| value ** 2 },
    :divide     => lambda { |value| value },
    :recombine  => lambda { |list| list.inject() { |x,y| x + y } }
  )
end

def rotate_3(square)
  divide_and_conquer(
    square,
    :divisible? => lambda { |value| value.kind_of?(Enumerable) && value.size > 1 },
    :conquer => lambda { |value| value },
    :divide => lambda do |square|
      half_sz = square.size / 2
      sub_square = lambda do |row, col|
        square.slice(row, half_sz).map { |a_row| a_row.slice(col, half_sz) }
      end
      upper_left = sub_square.call(0,0)
      lower_left = sub_square.call(half_sz,0)
      upper_right = sub_square.call(0,half_sz)
      lower_right = sub_square.call(half_sz,half_sz)
      [upper_left, lower_left, upper_right, lower_right]
    end,
    :recombine => lambda do |list|
      upper_left, lower_left, upper_right, lower_right = list
      upper_right.zip(lower_right).map { |l,r| l + r } +
      upper_left.zip(lower_left).map { |l,r| l + r }
    end
  )
end

private

def divide_and_conquer(value, steps)
  if steps[:divisible?].call(value)
    steps[:recombine].call(
      steps[:divide].call(value).map { |sub_value| divide_and_conquer(sub_value, steps) }
    )
  else
    steps[:conquer].call(value)
  end
end
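As a quick sanity check, the refactored version produces the same answer as the original hand-rolled recursion. The helper is reproduced here so the snippet stands alone:

```ruby
# divide_and_conquer and sum_squares_3, reproduced from above so this
# snippet runs on its own.
def divide_and_conquer(value, steps)
  if steps[:divisible?].call(value)
    steps[:recombine].call(
      steps[:divide].call(value).map { |sub_value| divide_and_conquer(sub_value, steps) }
    )
  else
    steps[:conquer].call(value)
  end
end

def sum_squares_3(list)
  divide_and_conquer(
    list,
    :divisible? => lambda { |value| value.kind_of?(Enumerable) },
    :conquer    => lambda { |value| value ** 2 },
    :divide     => lambda { |value| value },
    :recombine  => lambda { |list| list.inject() { |x,y| x + y } }
  )
end

p sum_squares_3([1, 2, 3, [[4,5], 6], [[[7]]]])   # => 140
```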

Now we have refactored the common algorithm out. Typically, something like divide and conquer is treated as a “pattern,” a recipe for writing methods. We have changed it into an abstraction by writing a #divide_and_conquer method and passing it our own functions which it combines to form the final algorithm. That ought to sound familiar: #divide_and_conquer is a combinator that creates recursive methods for us.

You can also find recursive combinators in other languages like Joy, Factor, and even JavaScript (the recursive combinator presented here as #divide_and_conquer is normally called multirec). Eugene Lazutkin’s article on [Using recursion combinators in JavaScript](http://lazutkin.com/blog/2008/jun/30/using-recursion-combinators-javascript/) shows how to use combinators to build divide and conquer algorithms in JavaScript with the Dojo libraries. This example uses binrec, a recursive combinator for algorithms that always divide their problems in two:

var fib0 = function(n){
    return n <= 1 ? 1 :
        arguments.callee.call(this, n - 1) +
            arguments.callee.call(this, n - 2);
};

var fib1 = binrec("<= 1", "1", "[[n - 1], [n - 2]]", "+");
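Ruby has no built-in binrec, but one is easy to sketch in the same style as our #divide_and_conquer, assuming the :divide step always returns exactly two sub-problems. Here is that sketch, along with the same naive Fibonacci function:

```ruby
# A hypothetical binrec for Ruby: like divide_and_conquer, except :divide
# must return exactly two sub-problems and :recombine takes two results.
def binrec(value, steps)
  if steps[:divisible?].call(value)
    left, right = steps[:divide].call(value)
    steps[:recombine].call(
      binrec(left, steps), binrec(right, steps)
    )
  else
    steps[:conquer].call(value)
  end
end

# The naive Fibonacci function from the JavaScript example
def fib(n)
  binrec(
    n,
    :divisible? => lambda { |n| n > 1 },
    :conquer    => lambda { |n| 1 },
    :divide     => lambda { |n| [n - 1, n - 2] },
    :recombine  => lambda { |a, b| a + b }
  )
end

p fib(10)   # => 89
```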

The Merge Sort

Let’s look at another example, implementing a merge sort. This algorithm has a distinguished pedigree: It was invented by John Von Neumann in 1945.

Von Neumann was a brilliant and fascinating individual. He is most famous amongst Computer Scientists for formalizing the computer architecture which now bears his name. He also worked on game theory, and it was no game to him: he hoped to use math to advise the United States whether and when to launch a thermonuclear war on the USSR. If you are interested in reading more, Prisoner’s Dilemma is a very fine book about both game theory and one of the great minds of modern times.

Conceptually, a merge sort works as follows:

  1. If the list has one element (or none), it is already sorted,
  2. Otherwise, divide the list into two halves,
  3. Sort each half, and
  4. Merge the two sorted halves back into one sorted list.

The merge sort part will be old hat given our #divide_and_conquer helper:

def merge_sort(list)
  divide_and_conquer(
    list,
    :divisible? => lambda { |list| list.length > 1 },
    :conquer    => lambda { |list| list },
    :divide     => lambda do |list|
      half_index = (list.length / 2) - 1
      [ list[0..half_index], list[(half_index + 1)..-1] ]
    end,
    :recombine  => lambda { |pair| merge_two_sorted_lists(pair.first, pair.last) }
  )
end

The interesting part is our #merge_two_sorted_lists method. Given two sorted lists, our merge algorithm works like this:

  1. If either list is empty, the result is the other list,
  2. Otherwise, compare the first items of the two lists,
  3. The smaller of the two becomes the first item of the merged list,
  4. And the rest of the merged list is the result of merging what remains of the two lists.

As you can tell from the description, this is another divide and conquer algorithm:

def merge_two_sorted_lists(*pair)
  divide_and_conquer(
    pair,
    :divisible? => lambda { |pair| !pair.first.empty? && !pair.last.empty? },
    :conquer => lambda do |pair|
      if pair.first.empty? && pair.last.empty?
        []
      elsif pair.first.empty?
        pair.last
      else
        pair.first
      end
    end,
    :divide => lambda do |pair|
      preceding, following = case pair.first.first <=> pair.last.first
        when -1 then [pair.first, pair.last]
        when 0  then [pair.first, pair.last]
        when 1  then [pair.last, pair.first]
      end
      [
        [[preceding.first], []],
        [preceding[1..-1], following]
      ]
    end,
    :recombine => lambda { |pair| pair.first + pair.last }
  )
end

That’s great. Well, that’s barely ok, actually. The problem is that when doing our merge sort, when we decide which item is the preceding item (least most, front most, whatever you want to call it), we already know that it is a trivial item and that it doesn’t need any further merging. The only reason we bundle it up in [[preceding.first], []] is because our #divide_and_conquer method expects to recursively attempt to solve all of the sub-problems we generate.

In this case, #merge_two_sorted_lists does not really divide a problem into a list of one or more sub-problems, some of which may or may not be trivially solvable. Instead, it divides a problem into a part of the solution and a single sub-problem which may or may not be trivially solvable. This common strategy also has a name, linear recursion.

Let’s write another version of #merge_two_sorted_lists, but this time instead of using #divide_and_conquer, we’ll write a linear recursion combinator:

def merge_two_sorted_lists(*pair)
  linear_recursion(
    pair,
    :divisible? => lambda { |pair| !pair.first.empty? && !pair.last.empty? },
    :conquer => lambda do |pair|
      if pair.first.empty? && pair.last.empty?
        []
      elsif pair.first.empty?
        pair.last
      else
        pair.first
      end
    end,
    :divide => lambda do |pair|
      preceding, following = case pair.first.first <=> pair.last.first
        when -1 then [pair.first, pair.last]
        when 0  then [pair.first, pair.last]
        when 1  then [pair.last, pair.first]
      end
      [ preceding.first, [preceding[1..-1], following] ]
    end,
    :recombine => lambda { |trivial_bit, divisible_bit| [trivial_bit] + divisible_bit }
  )
end

def linear_recursion(value, steps)
  if steps[:divisible?].call(value)
    trivial_part, sub_problem = steps[:divide].call(value)
    steps[:recombine].call(
      trivial_part, linear_recursion(sub_problem, steps)
    )
  else
    steps[:conquer].call(value)
  end
end

You may think this is even better, and it is.
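Linear recursion shows up in plenty of places besides merging. As a tiny illustration, here is a hypothetical #sum method declared with the same combinator (reproduced here so the example runs on its own):

```ruby
# #linear_recursion, reproduced from above so this snippet is standalone.
def linear_recursion(value, steps)
  if steps[:divisible?].call(value)
    trivial_part, sub_problem = steps[:divide].call(value)
    steps[:recombine].call(
      trivial_part, linear_recursion(sub_problem, steps)
    )
  else
    steps[:conquer].call(value)
  end
end

# A hypothetical #sum: each division step peels one element off the front,
# leaving a single smaller sub-problem -- the shape of linear recursion.
def sum(list)
  linear_recursion(
    list,
    :divisible? => lambda { |list| !list.empty? },
    :conquer    => lambda { |list| 0 },
    :divide     => lambda { |list| [list.first, list[1..-1]] },
    :recombine  => lambda { |first, rest| first + rest }
  )
end

p sum([1, 2, 3, 4])   # => 10
```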

Separating Declaration from Implementation

Recursive combinators like #divide_and_conquer and #linear_recursion are abstraction wins. They make recursive code much easier to read, because you know the general form of the algorithm and don’t need to pick through it to discover the individual steps. But there’s another benefit we should consider: Recursive combinators separate declaration from implementation.

Consider #linear_recursion again. This is not the fastest possible implementation. There is a long and tedious argument that arises when one programmer argues it should be implemented with iteration for performance, and the other argues it should be implemented with recursion for clarity, and a third programmer who never uses recursion claims the iterative solution is easier to understand…

Imagine a huge code base full of #linear_recursion and #divide_and_conquer calls. What happens if you decide that each one of these algorithms should be implemented with iteration? Hmmm… How about we modify #linear_recursion and #divide_and_conquer, and all of the methods that call them switch from recursion to iteration for free?

Or perhaps we decide that we really should take advantage of multiple threads… Do you see where this is going? You can write a new implementation and again, all of the existing methods are upgraded.

Even if you do not plan to change the implementation, let’s face a simple fact: when writing a brand new recursive or iterative method, you really have two possible sources of bugs: you may not have declared the solution correctly, and you may not have implemented it correctly.

Using combinators like #divide_and_conquer simplifies things: You only need to get your declaration of the solution correct, the implementation is taken care of for you. This is a tremendous win when writing recursive functions.

For these reasons, I strongly encourage the use of recursion combinators, either those supplied here or ones you write for yourself.

Practical Recursive Combinators

We’ve seen how recursive combinators like #divide_and_conquer and #linear_recursion are abstraction wins. They make recursive code much easier to read, because you know the general form of the algorithm and don’t need to pick through it to discover the individual steps.

We also saw that by separating the recursion implementation from the declaration of how to perform the steps of an algorithm like #rotate, we leave ourselves the opportunity to improve the performance of our implementation without the risk of adding bugs to our declaration. And today we’re going to do just that, along with a few tweaks for usability.

In this section, we’re going to optimize our combinators’ performance and make them a little easier to use with goodies like string_to_proc. To do that, we’re going to work with closures, defining methods with define_method, and implement functional programming’s partial application. We’ll wrap up by converting linrec from a recursive to an iterative implementation.

First, a little organization. Here are the original examples. I’ve placed them in a module and named the combinators multirec and linrec in conformance with common practice:

module RecursiveCombinators

  def multirec(value, steps)
    if steps[:divisible?].call(value)
      steps[:recombine].call(
        steps[:divide].call(value).map { |sub_value| multirec(sub_value, steps) }
      )
    else
      steps[:conquer].call(value)
    end
  end

  def linrec(value, steps)
    if steps[:divisible?].call(value)
      trivial_part, sub_problem = steps[:divide].call(value)
      steps[:recombine].call(
        trivial_part, linrec(sub_problem, steps)
      )
    else
      steps[:conquer].call(value)
    end
  end

  module_function :multirec, :linrec

end

Since they are also module functions, call them by sending a message to the module:

def merge_sort(list)
  RecursiveCombinators.multirec(
    list,
    :divisible? => lambda { |list| list.length > 1 },
    :conquer    => lambda { |list| list },
    :divide     => lambda do |list|
      half_index = (list.length / 2) - 1
      [ list[0..half_index], list[(half_index + 1)..-1] ]
    end,
    :recombine  => lambda { |pair| merge_two_sorted_lists(pair.first, pair.last) }
  )
end

Or you can include the RecursiveCombinators module and call either method directly:

include RecursiveCombinators

def merge_two_sorted_lists(*pair)
  linrec(
    pair,
    :divisible? => lambda { |pair| !pair.first.empty? && !pair.last.empty? },
    :conquer => lambda do |pair|
      if pair.first.empty? && pair.last.empty?
        []
      elsif pair.first.empty?
        pair.last
      else
        pair.first
      end
    end,
    :divide => lambda do |pair|
      preceding, following = case pair.first.first <=> pair.last.first
        when -1 then [pair.first, pair.last]
        when 0  then [pair.first, pair.last]
        when 1  then [pair.last, pair.first]
      end
      [ preceding.first, [preceding[1..-1], following] ]
    end,
    :recombine => lambda { |trivial_bit, divisible_bit| [trivial_bit] + divisible_bit }
  )
end

merge_sort([8, 3, 10, 1, 9, 5, 7, 4, 6, 2])
  => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Ok, we’re ready for some slightly more substantial work. These methods were fine for illustration, but I have a few questions for the author(!)

Spicing things up

First, note that every single time we call a method like merge_sort, we create four new lambdas from scratch. This seems wasteful, especially since the lambdas never change. Why create some objects just to throw them away?

On the other hand, it’s nice to be able to create algorithms without having to define a method by name. Although I probably wouldn’t do a merge sort anonymously, when I need a one-off quickie, I might like to write something like:

RecursiveCombinators.multirec(
  [1, 2, 3, [[4,5], 6], [[[7]]]],
  :divisible? => lambda { |value| value.kind_of?(Enumerable) },
  :conquer    => lambda { |value| value ** 2 },
  :divide     => lambda { |value| value },
  :recombine  => lambda { |list| list.inject() { |x,y| x + y } }
)
  => 140

But when I want a permanent sum of the squares method, I don’t want to write:

def sum_squares(list)
  RecursiveCombinators.multirec(
    list,
    :divisible? => lambda { |value| value.kind_of?(Enumerable) },
    :conquer    => lambda { |value| value ** 2 },
    :divide     => lambda { |value| value },
    :recombine  => lambda { |list| list.inject() { |x,y| x + y } }
  )
end

…because that would create four lambdas every time I call the function. There are a couple of ways around this problem. First, our “recipe” for summing squares is a simple hash. We could extract that from the method into a constant:

SUM_SQUARES_RECIPE = {
   :divisible? => lambda { |value| value.kind_of?(Enumerable) },
   :conquer    => lambda { |value| value ** 2 },
   :divide     => lambda { |value| value },
   :recombine  => lambda { |list| list.inject() { |x,y| x + y } }
}

def sum_squares(list)
  RecursiveCombinators.multirec(list, SUM_SQUARES_RECIPE)
end

That (and the isomorphic solution where the constant SUM_SQUARES_RECIPE is instead a private helper method #sum_squares_recipe) is nice if you have some reason you wish to re-use the recipe elsewhere. But we don’t, so this merely clutters our class up and separates the method definition from its logic.

I have something in mind. To see what it is, let’s start by transforming our method definition from using the def keyword to using the define_method private class method. This obviously needs a module or class to work:

class Practicum

  include RecursiveCombinators

  define_method :sum_squares do |list|
    multirec(
      list,
      :divisible? => lambda { |value| value.kind_of?(Enumerable) },
      :conquer    => lambda { |value| value ** 2 },
      :divide     => lambda { |value| value },
      :recombine  => lambda { |list| list.inject() { |x,y| x + y } }
    )
  end

end

Practicum.new.sum_squares([1, 2, 3, [[4,5], 6], [[[7]]]])

As you probably know, any method taking a block can take a lambda using the & operator, so:

define_method :sum_squares, &(lambda do |list|
  multirec(
    list,
    :divisible? => lambda { |value| value.kind_of?(Enumerable) },
    :conquer    => lambda { |value| value ** 2 },
    :divide     => lambda { |value| value },
    :recombine  => lambda { |list| list.inject() { |x,y| x + y } }
  )
end)

This is useful, because now we can express what we want: a lambda taking one argument that in turn calls multirec with the other arguments filled in. Functional programmers call this Partial Application. The idea is that if you have a function or method taking two arguments, if you only give it one argument you get a function back that takes the other. So:

multirec(x).call(y)
  => multirec(x,y)
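Partial application itself is easy to sketch in Ruby. The #partial helper below is not part of our combinators; it is a hypothetical illustration of the general idea:

```ruby
# A hypothetical #partial helper, for illustration only: given a lambda
# and some leading arguments, it returns a new lambda that awaits the
# remaining arguments.
def partial(fn, *fixed)
  lambda { |*rest| fn.call(*(fixed + rest)) }
end

add       = lambda { |x, y| x + y }
increment = partial(add, 1)   # the first argument is now fixed

p increment.call(41)   # => 42
```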

Now the drawback with this “standard” implementation of partial application is that we would pass a list to multirec and get back a function taking a hash of declarations. That isn’t what we want. We could partially apply things backwards so that multirec(x).call(y) => multirec(y,x) (if Ruby were a concatenative language, we would be concatenating the multirec combinator with a thrush). The trouble with that is it is the reverse of how partial application works in every other programming language and functional programming library.

Instead, we will switch the arguments to multirec ourselves, so it now works like this:

multirec(
  {
    :divisible? => lambda { |value| value.kind_of?(Enumerable) },
    :conquer    => lambda { |value| value ** 2 },
    :divide     => lambda { |value| value },
    :recombine  => lambda { |list| list.inject() { |x,y| x + y } }
  },
  list
)

The drawback with this approach is that we lose a little of Ruby’s syntactic sugar, the ability to fake named parameters by passing hash arguments without {} if they are the last parameter. And now, let’s give it the ability to partially apply itself. You can do some stuff with allowing multiple arguments and counting the number of arguments, but we’re going to make the wild assumption that you never attempt a recursive combinator on nil. Here’s multirec, you can infer the implementation for linrec:

def multirec(steps, optional_value = nil)
  worker_proc = lambda do |value|
    if steps[:divisible?].call(value)
      steps[:recombine].call(
        steps[:divide].call(value).map { |sub_value| worker_proc.call(sub_value) }
      )
    else
      steps[:conquer].call(value)
    end
  end
  if optional_value.nil?
    worker_proc
  else
    worker_proc.call(optional_value)
  end
end

Notice that you get the same correct result whether you write:

RecursiveCombinators.multirec(
  :divisible? => lambda { |value| value.kind_of?(Enumerable) },
  :conquer    => lambda { |value| value ** 2 },
  :divide     => lambda { |value| value },
  :recombine  => lambda { |list| list.inject() { |x,y| x + y } }
).call([1, 2, 3, [[4,5], 6], [[[7]]]])
  => 140

Or:

RecursiveCombinators.multirec(
  {
     :divisible? => lambda { |value| value.kind_of?(Enumerable) },
     :conquer    => lambda { |value| value ** 2 },
     :divide     => lambda { |value| value },
     :recombine  => lambda { |list| list.inject() { |x,y| x + y } }
  },
  [1, 2, 3, [[4,5], 6], [[[7]]]]
)
  => 140

Let’s go back to what we were trying to do with &:

define_method :sum_squares, &(lambda do |list|
  multirec(
    list,
    :divisible? => lambda { |value| value.kind_of?(Enumerable) },
    :conquer    => lambda { |value| value ** 2 },
    :divide     => lambda { |value| value },
    :recombine  => lambda { |list| list.inject() { |x,y| x + y } }
  )
end)

Now we know how to build our lambda:

require 'partial_application_recursive_combinators'

class Practicum

  extend PartialApplicationRecursiveCombinators   # so we can call multirec in class scope

  define_method :sum_squares, &multirec(
   :divisible? => lambda { |value| value.kind_of?(Enumerable) },
   :conquer    => lambda { |value| value ** 2 },
   :divide     => lambda { |value| value },
   :recombine  => lambda { |list| list.inject() { |x,y| x + y } }
  )

end

Practicum.new.sum_squares([1, 2, 3, [[4,5], 6], [[[7]]]])
  => 140

You can verify for yourself that no matter how many times you call sum_squares, you do not build those lambdas again. What we have just done is added partial application to multirec and linrec, which in turn allows us to ensure that the cost of constructing lambdas for our methods is only paid when a method is defined, not every time it is called.

Building on a legacy

We have already renamed divide_and_conquer and linear_recursion to bring them into line with standard practice and other programming languages. Now it’s time for us to bring the parameters–the declarative lambdas–into line with standard practice.

The four arguments to both methods are normally called cond, then, before, and after:

  1. cond replaces divisible?, but with the sense reversed: it tests whether the problem is trivial,
  2. then replaces conquer: it solves a trivial problem,
  3. before replaces divide: it breaks a non-trivial problem up,
  4. And after replaces recombine: it puts the pieces back together.

Things look very similar with the new scheme for now:

require 'legacy_recursive_combinators'

class Practicum

  extend LegacyRecursiveCombinators   # so we can call multirec in class scope

  define_method :sum_squares, &multirec(
   :cond   => lambda { |value| value.kind_of?(Numeric) }, # the only change right now
   :then   => lambda { |value| value ** 2 },
   :before => lambda { |value| value },
   :after  => lambda { |list| list.inject() { |x,y| x + y } }
  )

end

All right, now our combinators will look familiar to functional programmers, and even better when we look at functional programs using recursive combinators we will understand them at a glance. Okay, let’s get serious and work on making our combinators easy to use and our code easy to read.

Seriously

As long as you’re writing these lambdas out, writing :cond => isn’t a hardship. And in an explanatory article like this, it can help at first. However, what if you find a way to abbreviate things? For example, you might alias lambda to L. Or you might want to use string_to_proc:

string_to_proc.rb


class String
  unless ''.respond_to?(:to_proc)
    def to_proc &block
      params = []
      expr = self
      sections = expr.split(/\s*->\s*/m)
      if sections.length > 1 then
          eval sections.reverse!.inject { |e, p| "(Proc.new { |#{p.split(/\s/).join(', ')}| #{e} })" }, block && block.binding
      elsif expr.match(/\b_\b/)
          eval "Proc.new { |_| #{expr} }", block && block.binding
      else
          leftSection = expr.match(/^\s*(?:[+*\/%&|\^\.=<>\[]|!=)/m)
          rightSection = expr.match(/[+\-*\/%&|\^\.=<>!]\s*$/m)
          if leftSection || rightSection then
              if (leftSection) then
                  params.push('$left')
                  expr = '$left' + expr
              end
              if (rightSection) then
                  params.push('$right')
                  expr = expr + '$right'
              end
          else
              self.gsub(
                  /(?:\b[A-Z]|\.[a-zA-Z_$])[a-zA-Z_$\d]*|[a-zA-Z_$][a-zA-Z_$\d]*:|self|arguments|'(?:[^'\\]|\\.)*'|"(?:[^"\\]|\\.)*"/, ''
              ).scan(
                /([a-z_$][a-z_$\d]*)/i
              ) do |v|
                params.push(v) unless params.include?(v)
              end
          end
          eval "Proc.new { |#{params.join(', ')}| #{expr} }", block && block.binding
      end
    end
  end
end
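To get a feel for what string_to_proc buys us, here is a pared-down sketch covering just two of its forms, the explicit `x -> expr` style and the implicit underscore parameter (the full version above handles several more):

```ruby
# A miniature String#to_proc, for illustration only. It handles:
#   "x -> expr"  -- explicit parameters, possibly curried
#   "_ * 2"      -- an expression over a single implicit parameter
class String
  def to_proc
    sections = split(/\s*->\s*/m)
    if sections.length > 1
      eval sections.reverse!.inject { |e, p| "(Proc.new { |#{p}| #{e} })" }
    else
      eval "Proc.new { |_| #{self} }"
    end
  end
end

p 'x -> x + 1'.to_proc.call(2)   # => 3
p '_ * 2'.to_proc.call(3)        # => 6
p [1, 2, 3].map(&'_ * _')        # => [1, 4, 9]
```

Because & calls #to_proc on its operand, strings can now stand in for lambdas anywhere a block is expected.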

So we should support passing the declarative arguments by position as well as by ‘name.’ And with a final twist, if any of the declarative arguments aren’t already lambdas, we’ll try to create lambdas by sending them the message to_proc. So our goal is to write what we wrote above or either of the following and have it “just work:”

define_method :sum_squares, &multirec(
  lambda { |value| value.kind_of?(Numeric) }, # the only change right now
  lambda { |value| value ** 2 },
  lambda { |value| value },
  lambda { |list| list.inject() { |x,y| x + y } }
)

require 'string_to_proc'

define_method :sum_squares, &multirec("value.kind_of?(Numeric)", "value ** 2", "value", "value.inject(&'+')")

And here is the code that makes it work:

recursive_combinators.rb


module RecursiveCombinators

  separate_args = lambda do |args|
    if ![1,2,4,5].include?(args.length)
      raise ArgumentError
    elsif args.length <= 2
      steps = [:cond, :then, :before, :after].map { |k| args.first[k].to_proc }
      steps.push(args[1]) unless args[1].nil?
      steps
    else
      steps = args[0..3].map { |arg| arg.to_proc }
      steps.push(args[4]) unless args[4].nil?
      steps
    end
  end

  define_method :multirec do |*args|
    cond_proc, then_proc, before_proc, after_proc, optional_value = separate_args.call(args)
    worker_proc = lambda do |value|
      if cond_proc.call(value)
        then_proc.call(value)
      else
        after_proc.call(
          before_proc.call(value).map { |sub_value| worker_proc.call(sub_value) }
        )
      end
    end
    if optional_value.nil?
      worker_proc
    else
      worker_proc.call(optional_value)
    end
  end

  define_method :linrec do |*args|
    cond_proc, then_proc, before_proc, after_proc, optional_value = separate_args.call(args)
    worker_proc = lambda do |value|
      trivial_parts, sub_problem = [], value
      while !cond_proc.call(sub_problem)
        trivial_part, sub_problem = before_proc.call(sub_problem)
        trivial_parts.unshift(trivial_part)
      end
      trivial_parts.unshift(then_proc.call(sub_problem))
      trivial_parts.inject do |recombined, trivial_part|
        after_proc.call(trivial_part, recombined)
      end
    end
    if optional_value.nil?
      worker_proc
    else
      worker_proc.call(optional_value)
    end
  end

  module_function :multirec, :linrec

end

Now when we have trivial lambdas, we can use nice syntactic sugar to express them. string_to_proc is not part of our recursive combinators, but by making our recursive combinators flexible, we let them “play well with others,” which is a win for our code.

Separating Implementation from Declaration

In Refactoring Methods with Recursive Combinators, we read the claim that by separating the recursion implementation from the declaration of how to perform the steps of an algorithm like #rotate, we leave ourselves the opportunity to improve the performance of our implementation without the risk of adding bugs to our declaration.

In other words, we can optimize linrec if we want to. Well, we want to. So what we’re going to do is optimize its performance by trading time for space. Let’s have a quick look at the worker_proc lambda inside of linrec:

worker_proc = lambda do |value|
  if cond_proc.call(value)
    then_proc.call(value)
  else
    trivial_part, sub_problem = before_proc.call(value)
    after_proc.call(
      trivial_part, worker_proc.call(sub_problem)
    )
  end
end

As you can see, it is recursive, it calls itself to solve each sub-problem. And here is an iterative replacement:

worker_proc = lambda do |value|
  trivial_parts, sub_problem = [], value
  while !cond_proc.call(sub_problem)
    trivial_part, sub_problem = before_proc.call(sub_problem)
    trivial_parts.unshift(trivial_part)
  end
  trivial_parts.unshift(then_proc.call(sub_problem))
  trivial_parts.inject do |recombined, trivial_part|
    after_proc.call(trivial_part, recombined)
  end
end

This version doesn’t call itself. Instead, it uses an old-fashioned loop, accumulating the results in an array. In a certain sense, this uses more explicit memory than the recursive implementation. However, we both know that the recursive version uses memory for its stack, so that’s a bit of a wash. However, the Ruby stack is limited while arrays can be much larger, so this version can handle much larger data sets.
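To see the payoff concretely, here is a standalone sketch of the iterative worker, wired up with hypothetical step lambdas that sum a list element by element. It chews through an input much deeper than a comfortable recursion depth without ever growing the Ruby stack:

```ruby
# Hypothetical step lambdas for summing a list, in the cond/then/before/
# after style: cond tests for the trivial case.
cond_proc   = lambda { |list| list.empty? }
then_proc   = lambda { |list| 0 }
before_proc = lambda { |list| [list.first, list[1..-1]] }
after_proc  = lambda { |first, rest| first + rest }

# The iterative worker from the text: accumulate trivial parts in an
# array with a loop, then fold them back together with inject.
worker_proc = lambda do |value|
  trivial_parts, sub_problem = [], value
  while !cond_proc.call(sub_problem)
    trivial_part, sub_problem = before_proc.call(sub_problem)
    trivial_parts.unshift(trivial_part)
  end
  trivial_parts.unshift(then_proc.call(sub_problem))
  trivial_parts.inject do |recombined, trivial_part|
    after_proc.call(trivial_part, recombined)
  end
end

p worker_proc.call((1..10_000).to_a)   # => 50005000
```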

If you drop the new version of worker_proc into the linrec definition, each and every method you define using linrec gets the new implementation, for free. This works because we separated the implementation of recursive divide and conquer algorithms from the declaration of the steps of each particular algorithm. Here’s our new version of linrec:

define_method :linrec do |*args|
  cond_proc, then_proc, before_proc, after_proc, optional_value = separate_args.call(args)
  worker_proc = lambda do |value|
    trivial_parts, sub_problem = [], value
    while !cond_proc.call(sub_problem)
      trivial_part, sub_problem = before_proc.call(sub_problem)
      trivial_parts.unshift(trivial_part)
    end
    trivial_parts.unshift(then_proc.call(sub_problem))
    trivial_parts.inject do |recombined, trivial_part|
      after_proc.call(trivial_part, recombined)
    end
  end
  if optional_value.nil?
    worker_proc
  else
    worker_proc.call(optional_value)
  end
end

A Really Simple Recursive Combinator

In [Recursive Lambdas in Ruby using Object#tap](http://ciaranm.wordpress.com/2008/11/30/recursive-lambdas-in-ruby-using-objecttap/), Ciaran McCreesh explained how he used #tap to write a recursive function without cluttering the scope up with an unneeded variable. (If you would like a refresher, Object#tap is explained in Kestrels).

Ciaran’s final solution was:

lambda do | recurse, spec |
  case spec
    when AllDepSpec, ConditionalDepSpec
      spec.each { | child | recurse.call(recurse, child) }
    when SimpleURIDepSpec
      puts spec
  end
end.tap { | r | r.call(r, id.homepage_key.value) } if id.homepage_key

There are two great things about this solution. First, Ciaran doesn’t need to calculate a result, he is just performing this computation for its side-effect, puts. Therefore, using a kestrel like #tap signals that he is not interested in the result. Second, he is using an off-the-shelf component and not writing a “horrid untyped lambda calculus construct” to get the job done. Fewer moving parts is a laudable goal.

That being said, when solving other problems, this solution may not meet our needs:

  1. Sometimes we want the result of the computation, not just a side-effect like puts, so #tap will not do;
  2. We would rather not clutter the body of the lambda by explicitly passing it to itself (recurse.call(recurse, child)) on every recursive call.

If we find ourselves needing to work around these limitations, we’ll need to go a bit further. Let’s use a brutally trivial example, factorial. (The naive implementation of factorial is a terrible piece of programming, but it’s simple enough that we can focus on how we’re implementing recursion and not what we are computing).

We could use one of our existing recursive combinators like linrec:

require 'string-to_proc'

linrec('< 2', '1', 'n -> [n, n - 1]', '*').call(5)
  => 120

# or perhaps you prefer...

linrec(
  lambda { |n| n < 2 },
  lambda { |n| 1 },
  lambda { |n| [n, n - 1] },
  lambda { |n, m| n * m }
).call(5)
  => 120

That gets us what we want without using an untyped lambda calculus construct, because it uses a combinatory logic construct instead. But let’s work something out that is closer to the spirit of Ciaran’s approach. For starters, we can’t use #tap because we need the result of the computation, so we’ll imagine we have a new method, #rcall. Our first cut will look like this:

class Proc

  def rcall(*args)
    call(self, *args)
  end

end

lambda { |r, n| n < 2 ? 1 : n * r.call(r, n-1) }.rcall(5)

That solves our first problem very nicely: we can call a lambda with a value and it knows to pass itself to itself. Now what about our second problem? We are still cluttering up the inside of our function with passing itself to itself. Instead of calling r.call(r, n-1), can we just call r.call(n-1)?

That would make our function look a lot simpler.

Well, we start with lambda { |r, *args| ... }. But if we are to call r.call(n), we need to pass in a function like lambda { |*args| ... }. What does that function do? Send the message #rcall to our original function, of course. So we can write:

class Proc

  def rcall(*args)
    call(lambda { |*args| self.rcall(*args) }, *args)
  end

end

lambda { |r, n| n < 2 ? 1 : n * r.call(n-1) }.rcall(5)
  => 120

And that’s it, we’ve accomplished recursion without using any untyped lambda calculus constructs. It may look at first glance like we’re using an anonymous recursive combinator like Y, but we aren’t. We’re actually taking advantage of Ruby’s self variable, so #rcall does not really implement anonymous recursion, it just lets us write recursive lambdas without explicitly binding them to a variable.

And our new method, #rcall, returns a value from our recursion and doesn’t force us to remember to pass our lambda to itself when making a recursive call.
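Here is one more example of #rcall at work, using the Proc#rcall from above on a Fibonacci lambda (fib is named here only so we can exercise it twice; the lambda itself never refers to its own name):

```ruby
# Proc#rcall, as defined above: it passes a self-invoking wrapper as the
# first argument, so the lambda can recurse via r.call(...)
class Proc
  def rcall(*args)
    call(lambda { |*a| self.rcall(*a) }, *args)
  end
end

fib = lambda { |r, n| n < 2 ? n : r.call(n - 1) + r.call(n - 2) }
fib.rcall(10)  # => 55
fib.rcall(1)   # => 1
```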

Cheers!

class Proc

  def rcall(*args)
    call(lambda { |*args| self.rcall(*args) }, *args)
  end

end

You can’t be serious!?

In Practical Recursive Combinators, we enhanced multirec (a/k/a “Divide and Conquer”) and linrec (“Linear Recursion”) to accept as arguments any object that supports the #to_proc method. Today we’re going to demonstrate why: We will look at how removing the ceremony around lambdas makes using combinators like multirec more valuable for code we share with others.

Using recursive_combinators.rb to define how to sum the squares of a nested list of numbers, we can write:

require 'recursive_combinators'

include RecursiveCombinators

multirec(
  lambda { |x| x.kind_of?(Numeric) },
  lambda { |x| x ** 2 },
  lambda { |x| x },
  lambda { |x| x.inject { |sum, n| sum + n } }
)

The trouble with this, to quote a seasonally appropriate character, is the noise, noise, Noise, NOISE! All those lambdas and parameter declarations outweigh the actual logic we are declaring, so much so that declaring this function using our abstraction is longer and may seem more obscure than declaring it without the abstraction.

This whole thing reminds me of languages where the keywords must be in UPPER CASE. Reading code in such languages is like listening to a poetry reading where the author shouts the punctuation:

Two roads diverged in a yellow wood COMMA!
And sorry I could not travel both
And be one traveler COMMA! long I stood
And looked down one as far as I could
To where it bent in the undergrowth SEMI-COLON!!

Finding ways to abbreviate our declaration is more than just a little “syntactic sugar:” It’s a way of emphasizing what is important, our algorithms, and de-emphasizing what is not important, the scaffolding and ceremony of instantiating Proc objects in Ruby. One of those ways is to use String#to_proc.

String to Proc

String#to_proc adds the #to_proc method to the String class in Ruby. This allows you to write certain simple lambdas as strings instead of using the lambda keyword, the proc keyword, or Proc.new. The reason why you’d bother is that String#to_proc provides some shortcuts that get rid of the noise.

gives

String#to_proc provides several key abbreviations: First, -> syntax for lambdas in Ruby 1.8. So instead of lambda { |x,y| x + y }, you can write 'x,y -> x + y'. I read this out loud as “x and y gives x plus y.”

This syntax gets rid of the noisy lambda keyword and is much closer to Ruby 1.9 syntax. And frankly, reading it out loud makes much more sense than reading lambdas aloud. Our example above could be written:

require 'string_to_proc'

multirec(
  'x -> x.kind_of?(Numeric)',
  'x -> x ** 2',
  'x -> x',
  'x -> x.inject { |sum, n| sum + n }'
)

This is a lot better than the version with lambdas, and if the -> seems foreign, it is only because -> is in keeping with modern functional languages and mathematical notation, while lambda is in keeping with Lisp and lambda calculus notation, minus the ability to use a single unicode lambda character.
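To make the -> form less mysterious, here is a toy sketch of how that one abbreviation could be implemented with eval. This is not raganwald’s string-to_proc library (which also handles inferred parameters, the hole, and point-free strings); it supports only the explicit params -> body form:

```ruby
# Toy String#to_proc: supports ONLY the 'params -> body' form.
class String
  def to_proc
    raise ArgumentError, 'this sketch only handles params -> body' unless include?('->')
    params, body = split(/\s*->\s*/, 2)
    eval "lambda { |#{params}| #{body} }"  # build the lambda from the string
  end
end

'x, y -> x + y'.to_proc.call(3, 4)  # => 7
[1, 2, 3].map(&'x -> x * 2')        # => [2, 4, 6], & invokes String#to_proc
```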

inferred parameters

Second, String#to_proc adds inferred parameters: If you do not use ->, String#to_proc attempts to infer the parameters. So if you write 'x + y', String#to_proc treats it as x,y -> x + y. There are certain expressions where this doesn’t work, and you have to use ->, but for really simple cases it works just fine. And frankly, for really simple cases you don’t need the extra scaffolding. Here’s our example with the first three lambdas using inferred parameters:

multirec(
  'x.kind_of?(Numeric)',
  'x ** 2',
  'x',
  'z -> z.inject { |sum, n| sum + n }'
)

I have good news and bad news about inferred parameters and String#to_proc in general. It uses regular expressions to do its thing, which means that complicated things often don’t work. For example, nesting -> only works when writing functions that return functions. So 'x -> y -> x + y' is a function that takes an x and returns a function that takes a y and returns x + y. That works. But 'z -> z.inject(&"sum, n -> sum + n")' does NOT work.

I considered fixing this with more sophisticated parsing, however the simple truth is this: String#to_proc is not a replacement for lambda, it’s a tool to be used when what you’re doing is so simple that lambda is overkill. If String#to_proc doesn’t work for something, it probably isn’t ridiculously simple any more.

it

The third abbreviation is a special case. If there is only one parameter, you can use _ (the underscore) without naming it. This is often called the “hole” or pronounced “it.” If you use “it,” then String#to_proc doesn’t try to infer any more parameters, so this can help you write things like:

multirec(
  '_.kind_of?(Numeric)',
  '_ ** 2',
  '_',
  '_.inject { |sum, n| sum + n }'
)

Admittedly, use of “it”/the hole is very much a matter of taste.

point-free

String#to_proc has a fourth and even more extreme abbreviation up its sleeve, point-free style: “Function points” are what functional programmers usually call parameters. Point-free style consists of describing how functions are composed together rather than describing what happens with their arguments. So, let’s say that I want a function that combines .inject with +. One way to say that is to say that I want a new function that takes its argument and applies an inject to it, and the inject takes another function with two arguments and applies a + to them:

lambda { |z| z.inject { |sum, n| sum + n } }

The other way is to say that I want to compose .inject and + together. Without getting into a compose function like Haskell’s . operator, String#to_proc has enough magic to let us write the above as:

".inject(&'+')"

Meaning “I want a new lambda that does an inject using plus.” Point-free style does require a new way of thinking about some things, but it is a clear win for simple cases. Proof positive of this is the fact that Ruby on Rails and Ruby 1.9 have both embraced point-free style with Symbol#to_proc. That’s exactly how (1..100).inject(&:+) works!
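Since Symbol#to_proc ships with Ruby 1.8.7 and later (and with Rails before that), point-free style for simple messages needs no extra library at all:

```ruby
# Symbol#to_proc is built in: &:message converts the symbol to a proc
# that sends that message to its argument.
with_block = (1..100).inject { |sum, n| sum + n }  # => 5050
point_free = (1..100).inject(&:+)                  # => 5050, same thing
names = %w(kestrel thrush finch).map(&:upcase)     # => ["KESTREL", "THRUSH", "FINCH"]
```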

String#to_proc supports fairly simple cases where you are sending a message or using a binary operator. So if we wanted to go all out, we could write our example as:

multirec('.kind_of?(Numeric)', '** 2', 'x', ".inject(&'+')")

There’s no point-free magic for the identity function, although this example tempts me to special case the empty string!

When should we use all these tricks?

String#to_proc provides these options so that you as a programmer can choose your level of ceremony around writing functions. But of course, you have to use the tool wisely. My personal rules of thumb are:

  1. Embrace inferred parameters for well-known mathematical or logical operations. For these operations, descriptive parameter names are usually superfluous. Follow the well-known standard and use x, y, z, and w; or a, b, and c; or n, i, j, and k for the parameters. If whatever it is makes no sense using those variable names, don’t use inferred parameters.
  2. Embrace the hole for extremely simple one-parameter lambdas that aren’t intrinsically mathematical or logical such as methods that use .method_name and for the identity function.
  3. Embrace point-free style for methods that look like operators.
  4. Embrace -> notation for extremely simple cases where I want to give the parameters a descriptive name.
  5. Use lambdas for everything else.

So I would write:

multirec( '_.kind_of?(Numeric)', '** 2', '_', "_.inject(&'+')")

I read the parameters out loud as “it kind of Numeric,” “to the power of two,” “it,” and “it inject plus.”

And yes, I consider multirec( '_.kind_of?(Numeric)', '** 2', '_', "_.inject(&'+')") more succinct and easier to read than:

def sum_squares(value)
  if value.kind_of?(Numeric)
    value ** 2
  else
    value.map do |sub_value|
      sum_squares(sub_value)
    end.inject { |x,y| x + y }
  end
end

If all this is new to you, String#to_proc may seem like gibberish and def sum_squares may seem reassuringly sensible. But try to remember that combinators like multirec are built to disentangle the question of what we are doing from how we are doing it. This is the third straight post about recursive combinators using one of three different examples. So of course we know what sum_squares does and how it does it.

But try to imagine you are looking at a piece of code that isn’t so simple, that isn’t so obvious. Maybe it was written by someone else, maybe you wrote it a while ago. If you see:

def rotate(square)
  return square unless square.kind_of?(Enumerable) && square.size > 1
  half_sz = square.size / 2
  sub_square = lambda do |row, col|
    square.slice(row, half_sz).map { |a_row| a_row.slice(col, half_sz) }
  end
  upper_left = rotate(sub_square.call(0,0))
  lower_left = rotate(sub_square.call(half_sz,0))
  upper_right = rotate(sub_square.call(0,half_sz))
  lower_right = rotate(sub_square.call(half_sz,half_sz))
  upper_right.zip(lower_right).map { |l,r| l + r } +
    upper_left.zip(lower_left).map { |l,r| l + r }
end

Do you see at once how it works? Do you see at a glance whether the recursive strategy was implemented properly? Can you tell whether there’s something buggy about it? For example, this code only works for rotating square matrices whose sides are powers of two. What needs to be changed to fix that? Are you sure you can fix it without breaking the divide and conquer strategy?

For a method like this, I would write:

multirec(
  :cond => "!(_.kind_of?(Enumerable) && _.size > 1)",
  :then => "_",
  :before => lambda do |square|
    half_sz = square.size / 2
    sub_square = lambda do |row, col|
      square.slice(row, half_sz).map { |a_row| a_row.slice(col, half_sz) }
    end
    upper_left = sub_square.call(0,0)
    lower_left = sub_square.call(half_sz,0)
    upper_right = sub_square.call(0,half_sz)
    lower_right = sub_square.call(half_sz,half_sz)
    [upper_left, lower_left, upper_right, lower_right]
  end,
  :after => lambda do |list|
    upper_left, lower_left, upper_right, lower_right = list
    upper_right.zip(lower_right).map(&'+') + upper_left.zip(lower_left).map(&'+')
  end
)

And be assured that months from now if I wanted to support rotating rectangular matrices of arbitrary size, I could modify :cond, :before, and :after with confidence that the basic method was not being broken.

The Message

The message here is that taken by themselves, tools like recursive combinators or String#to_proc just look strange. But when we use them together, they reinforce each other and the whole becomes much greater than the sum of its parts. In the case of String#to_proc, it looks like frivolity to most Ruby programmers, because they don’t use that many lambdas: Why should they when the existing syntax makes combinators hard to use? But when we have combinators in hand, we see how String#to_proc can make them a win. So two things that look weird on their own are a useful tool when used in conjunction.

Our final example ended up being slightly longer than a naive version; however, it is longer in ways that matter, not in the mindless ceremonial way some languages impose.

And that’s the point of languages like Ruby: You have the tools to decide which portions of your code matter more than others, and to make the parts that matter stand out and the parts that don’t go away. You may disagree with my choice of what matters for a recursive divide and conquer algorithm, but I hope we can agree that it’s valuable to be able to make that choice for yourself or your team.

Seriously.

The Hopelessly Egocentric Book Chapter

In Raymond Smullyan’s delightful book on Combinatory logic, To Mock a Mockingbird, Smullyan explains combinatory logic and derives a number of important results by presenting the various combinators as songbirds in a forest. One of his concepts is the Hopelessly Egocentric Bird:

We call a bird B hopelessly egocentric if for every bird x, Bx = B. This means that whatever bird x you call out to B is irrelevant; it only calls B back to you! Imagine that the bird’s name is Bertrand. When you call out “Arthur,” you get the response “Bertrand”; when you call out “Raymond,” you get the response “Bertrand”; when you call out “Ann,” you get the response “Bertrand.” All this bird can ever think about is itself!

Some folks have proposed that by making Ruby’s nil hopelessly egocentric, we can avoid the need for monadic idioms like #andand. Let’s examine the idea and see what consequences this has.

Object-oriented egocentricity

One of the tenets of OO programming is that programs consist of objects that respond to messages they send each other. A hopelessly egocentric object is easy to imagine: No matter what message you send it, the hopelessly egocentric object responds with itself:

require 'blankslate' # the gem that provides BlankSlate, a class with (almost) no predefined methods

class HopelesslyEgocentric < BlankSlate

  def method_missing(*arguments); self; end

end

Now you can create a hopelessly egocentric object with HopelesslyEgocentric.new and no matter what message you send it, you will get it back in response. And? What good is this? What can it do? Why should we put it in our Zoo?
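If you are on Ruby 1.9 or later, BasicObject makes a convenient BlankSlate substitute, and we can watch the egocentricity in action:

```ruby
# A self-contained hopelessly egocentric object: every message, no matter
# what it is, simply returns the receiver.
class HopelesslyEgocentric < BasicObject
  def method_missing(*); self; end
end

ego = HopelesslyEgocentric.new
ego.upcase.reverse.frobnicate(42).equal?(ego)  # => true: always itself
```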

In Objective C, nil is hopelessly egocentric. As Learn Objective-C puts it, “You usually don’t need to check for nil before calling a method on an object. If you call a method on nil that returns an object, you will get nil as a return value.” The idea here is that instead of getting a NoMethodError when we send a message to nil, we get nil back.

Some people like this so much they’ve composed the same semantics for Ruby:

class NilClass

  def method_missing(*args); nil; end

end

Now instead of writing person && person.name && person.name.upcase or person.andand.name.andand.upcase, you write person.name.upcase and either get the person’s name in upper case or nil. Wonderful! Or is it? Let’s take a look at what we’re trying to accomplish and the limitations of this approach.

queries

Hopelessly egocentric nil works reasonably for querying properties, in other words sub-entities when an entity is constructed by composition, things like .name. I’m quite happy if person.name returns nil whether we don’t have a person or if the person doesn’t have a name. And we can extend this to what I would call purely functional transformations like .upcase. Just as ''.upcase is '', it is reasonable to think of nil.upcase as nil.

Now let’s look at some things that aren’t properties and aren’t purely functional transformations. What do we do with methods that are intended to update their receiver? Consider a bank account object. Do we really want to write things like:

person.account = nil
person.account.increment_balance(100)
  => nil

This makes no sense. If we want to give them a hundred dollars, we had better have their actual account on hand! Clearly there is a huge difference between methods that are queries and methods that are updates. (Note that andand doesn’t save us either, except by virtue of being explicit rather than magical so we can eschew it for update methods like #increment_balance.)

updates

Now that we are talking about methods with side-effects, let’s be more specific. Our hopelessly egocentric nil does return nil to any method. But it has another property, it has no side-effects. This is sometimes what we want! Let’s look at our nil account again. What about this code:

person.account.update_attribute(:primary_email, 'reg@braythwayt.com')

To decide what we think of this, we need to be specific about the meaning of nil. Generally, nil means one of two things:

  1. NONE, meaning “There isn’t one of these,” or;
  2. UNKNOWN, meaning “There is one of these, but we don’t know what it is.”

person.account.update_attribute(:primary_email, 'reg@braythwayt.com') is an example of why this difference matters. If person.account is an account, we want to update its primary email address, of course. And if person.account is NONE, we might be very happy not updating its primary email address. Perhaps our code looks like this:

class Person < ActiveRecord::Base
  belongs_to :account

  def update_email(new_email)
    self.class.transaction do
      update_attribute(:primary_email, new_email)
      account.update_attribute(:primary_email, new_email)
    end
  end

  # ...

end

Person.find(:first, :conditions => {...}).update_email('reg@braythwayt.com')

Meaning, update our person’s primary email address, and if they have an account, update it too. If nil means NONE, this works. But what if nil really means UNKNOWN rather than NONE? Now it is wrong to silently fail. Let me give you a very specific way this can happen. When performing a database query, we can specify the exact columns we want returned. In Active Record, we might write something like this:

person = Person.find(:first, :conditions => {...}, :select => 'id, name')

What this means is that there is an account_id column in the people table, however we are deliberately not loading it into person. ActiveRecord will still supply us with a #account method, however it will return nil. This absolutely, positively means that person.account is UNKNOWN, not NONE. There could well be an account in our database for this person, and now if we write:

person.update_email('reg@braythwayt.com')

We do not want it to silently ignore the account email update just because we haven’t loaded the associated account model. So for UNKNOWN, our two rules are:

  1. Querying UNKNOWN returns UNKNOWN;
  2. All attempts to update UNKNOWN are errors.

What about NONE? We gave two examples of updates, one of which was a really bad idea, #increment_balance, and the other of which was fine, update_attribute(:primary_email, new_email). Thus we have three rules for NONE:

  1. Querying NONE returns NONE;
  2. Some updates to NONE may return NONE and have no side effects;
  3. Some updates to NONE may be errors.

With a little forethought and design, you may be able to construct one or more classes in your application for which all updates to NONE return NONE and have no side effects. But for all others, methods like #increment_balance represent a semantic problem with using a hopelessly egocentric nil to represent NONE. We also see a problem with writing a hopelessly egocentric nil to handle UNKNOWN: How does it know which methods are queries and which methods are updates?

If we work really hard and eliminate all possibility of an update to NONE being an error, are there any other issues with using a hopelessly egocentric nil? Let’s return to our initial case:

person.name
  => nil

person.name.upcase
  => nil

Makes sense. And then we write:

person.name + ", esq."
  => nil

Dubious, but let’s go with it. If this makes sense, we ought to be able to write this as well:

"Mister " + person.name
  => TypeError: can't convert nil into String

Why is this an error? Things don’t get any better using a hopelessly egocentric nil to handle UNKNOWN. Even if we can get past the issue of update methods, we have another problem that is much more difficult to resolve. UNKNOWN introduces tri-value logic:

UNKNOWN == Object.new
  => UNKNOWN
UNKNOWN != Object.new
  => UNKNOWN
UNKNOWN == UNKNOWN
  => UNKNOWN
UNKNOWN != UNKNOWN
  => UNKNOWN
Object.new == UNKNOWN
  => UNKNOWN
Object.new != UNKNOWN
  => UNKNOWN

When you don’t know something’s value, it is neither equal to nor not equal to any other value, including another unknown value. And our fifth and sixth examples suffer from the same problem as nil + ", esq." vs. "Mister " + nil. We would need to patch all sorts of other objects to make equality testing and many, many other methods work. (What is 42 < UNKNOWN?) But things get worse:

How does truthiness work? In Ruby, you cannot override the way and, or, if, unless, &&, and || work. What are the semantics of if UNKNOWN? What do expressions like true && UNKNOWN or UNKNOWN or true return? Before implementing a true UNKNOWN in any language, I would want those questions answered.

Finally, there are actually a fifth and a sixth rule that we are ignoring because these examples are in Ruby rather than a language with an expressive type system. Consider:

'Reg Braithwaite'.wealthy?
  => NoMethodError: undefined method `wealthy?' for "Reg Braithwaite":Strin\
g

And now we write:

person.name.wealthy?         # or...
person.name.andand.wealthy?

What happens if person.name is NONE? What happens if person.name is UNKNOWN? Our problem here is that #wealthy? is never a valid message to send to something returned by person.name. Our behaviour ought to be:

  1. Sending an invalid message to NONE is an error;
  2. Sending an invalid message to UNKNOWN is an error.

There is no easy way to do this in Ruby, of course. Not only do we have trouble disambiguating queries from updates, we have trouble disambiguating valid from invalid messages.

For all of these reasons, I am loath to implement a hopelessly egocentric nil and prefer to use an explicit idiom like #andand or #try. With explicit idioms, I can deal with the ambiguity between nil meaning NONE and nil meaning UNKNOWN and make sure my code does not violate the rules given here. But what I like about the idea of a hopelessly egocentric nil is that thinking through the consequences provokes me to really think about the semantics of my data schemas.

Representing NONE and UNKNOWN values is a subtle problem requiring a deep and pervasive approach to typing similar to C++’s const keyword and/or writing custom null objects that understand which methods are safe to respond egocentrically and which are errors.
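To make that last idea concrete, here is one sketch of such a custom null object for NONE. (NullAccount and its whitelist are hypothetical illustrations, not part of any library.)

```ruby
# A NONE account: answers whitelisted queries egocentrically, but treats
# every update as an error instead of silently swallowing it.
class NullAccount
  QUERIES = [:name, :primary_email, :balance]

  def method_missing(name, *args)
    return self if QUERIES.include?(name)
    raise NoMethodError, "cannot #{name} a NONE account"
  end

  def respond_to_missing?(name, include_private = false)
    QUERIES.include?(name) || super
  end
end

none = NullAccount.new
none.name.equal?(none)  # => true: queries come back as the null object
begin
  none.increment_balance(100)
rescue NoMethodError => e
  e.message  # => "cannot increment_balance a NONE account"
end
```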

Bonus Chapter: Separating Concerns in Coffeescript using Aspect-Oriented Programming

This chapter isn’t strictly about combinatory logic and it especially isn’t about Ruby programming. However, once you grasp the underlying fundamental principles, you can apply them in other environments using other programming languages.

You shouldn’t find it too difficult to relate the content to previous chapters, the title alone provides a massive hint.

Modern object-oriented software design favours composition over inheritance and celebrates code that is DRY. The idea is to separate each object’s concerns and responsibilities into separate units of code, each of which has a single responsibility. When two different types of objects share the same functionality, they do not repeat their implementation; instead, they share their implementation.

When composing functionality at the method level of granularity, techniques such as mixins and delegation are effective design tools. But at a finer level of granularity, we sometimes wish to share functionality within methods. In a traditional design, we have to extract the shared functionality into a separate method that is called by other methods.

decomposing methods

You might think of extracting smaller methods from bigger methods as decomposing methods. You break them into smaller pieces, and thus you can share functionality or rearrange the pieces so that your code is organized by responsibility.

For example, let’s say that we are writing a game for the nostalgia market, and we wish to use partially constructed objects to save resources. When we go to actually use the object, we hydrate it, loading the complete object from persistent storage. This is a coarse kind of lazy evaluation.

Here’s some bogus code:

class Wumpus
  roar: ->
    # code that hydrates a Wumpus
    # ...
    # code that roars
    # ...
  run: ->
    # code that hydrates a Wumpus
    # ...
    # code that runs
    # ...

class Hunter
  draw: (bow) ->
    # code that hydrates a Hunter
    # ...
    # code that draws a bow
    # ...
  run: ->
    # code that hydrates a Hunter
    # ...
    # code that runs
    # ...

We can decompose it into this:

class Wumpus
  roar: ->
    hydrate(this)
    # code that roars
    # ...
  run: ->
    hydrate(this)
    # code that runs
    # ...

class Hunter
  draw: (bow) ->
    hydrate(this)
    # code that draws a bow
    # ...
  run: ->
    hydrate(this)
    # code that runs
    # ...

hydrate = (object) ->
  # code that hydrates the object from storage

composing methods

On an ad hoc basis, decomposing methods is fine. But there is a subtle problem. Implementation tricks like hydrating objects, memoizing return values, or other performance tweaks are orthogonal to the mechanics of what methods like roar or run are supposed to do. So why is hydrate(this) in every method?

Now the obvious answer is, “Ok, it might be orthogonal to the main business of each method, but it’s just one line.” The trouble with this answer is that method decomposition doesn’t scale. We need a line for hydration, a line or two for logging, a few lines for error handling, another for wrapping certain things in a transaction…

Even when each orthogonal concern is boiled down to just one line, you can end up having the orthogonal concerns take up more space than the main business. And that makes the code hard to read in practice. You don’t believe me? Take a look at just about every programming tutorial ever written. They almost always say “Hand waving over error handling and this and that” in their code examples, because they want to make the main business of the code clearer and easier to read.

We ought to do the same thing: move hydration, error handling, logging, transactions, and anything else orthogonal to the main business of a method out of the method. And we can.

method combinations

Here’s our code again, this time using the YouAreDaChef library to provide before combinations:

YouAreDaChef = require('YouAreDaChef.coffee').YouAreDaChef

class Wumpus
  roar: ->
    # ...
  run: ->
    #...

class Hunter
  draw: (bow) ->
    # ...
  run: ->
    #...

hydrate = (object) ->
  # code that hydrates the object from storage

YouAreDaChef(Wumpus, Hunter)
  .before 'roar', 'draw', 'run', () ->
    hydrate(this)

Whenever the roar, draw, or run methods are called, YouAreDaChef calls hydrate(this) first. And the two concerns, how a Wumpus works and when it ought to be hydrated, are totally separated. This isn’t a new idea; it’s called aspect-oriented programming, and practitioners describe what we’re doing in terms of method advice and point cuts.

Ruby on Rails programmers are familiar with this idea. If you have ever written any of the following, you were using Rails’ built-in aspect-oriented programming support:

after_save
validates_each
alias_method_chain
before_filter

These and other features of Rails implement method advice, albeit in a very specific way tuned to portions of the Rails framework.

the unwritten rule

There is an unwritten rule that says every Ruby programmer must, at some point, write his or her own AOP implementation –Avdi Grimm

Let’s look at how YouAreDaChef works. Here’s a simplified version of the code for the before combination:

YouAreDaChef: (clazzes...) ->
    before: (method_names..., advice) ->
      _.each method_names, (name) ->
        _.each clazzes, (clazz) ->
          if _.isFunction(clazz.prototype[name])
            pointcut = clazz.prototype[name]
            clazz.prototype[name] = (args...) ->
              advice.apply(this, args)
              pointcut.apply(this, args)

This is really simple, we are composing a method with a function. The method already defined in the class is called the pointcut, and the function we are supplying is called the advice. Unlike a purely functional combinator, we are only executing the advice for side-effects, not for its result. But in object-oriented imperative programming, that’s usually what we want.
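The same before-combination shape can be sketched in Ruby with Module#prepend (illustrative names; this is not the YouAreDaChef library):

```ruby
# before: prepend a module whose methods run the advice, then invoke the
# original method (the pointcut) via super.
def before(klass, *method_names, &advice)
  wrapper = Module.new do
    method_names.each do |name|
      define_method(name) do |*args|
        instance_exec(*args, &advice)  # advice first, for its side-effects
        super(*args)                   # then the pointcut
      end
    end
  end
  klass.prepend(wrapper)
end

class Wumpus
  def roar; 'roar!'; end
end

log = []
before(Wumpus, :roar) { log << :hydrated }
Wumpus.new.roar  # => "roar!"
log              # => [:hydrated]
```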

other method combinations

That looks handy, but we also want to compose methods in the other order, running the advice after the pointcut. Good news: YouAreDaChef provides an after combination for exactly this. After combinations are very handy for things like logging method calls or cleaning things up.

But there’s another great use for after combinations: triggering events. Event triggering code is often very decoupled from method logic: The whole point of events is to invert control so that an object like a Wumpus doesn’t need to know which objects want to do something after it moves. For example, a Backbone.js view might be observing the Wumpus and wish to update itself when the Wumpus moves:

YouAreDaChef(Wumpus, Hunter)
  .after 'run', () ->
    this.trigger 'move', this

CaveView = Backbone.View.extend
  initialize: ->
    # ...
    @model.bind 'move', @wumpusMoved
  wumpusMoved: (wumpus) ->
    # ...

The code coupling the view to the model has now been separated from the code defining the model itself.

YouAreDaChef also provides other mechanisms for separating concerns. Around combinations (also called around advice) are a very general-purpose combinator. With an around combination, the original method (the pointcut) is passed to the advice function as a parameter, allowing it to be called at any time.

Around advice is useful for wrapping methods. Using an around combinator, you could bake error handling and transactions into methods without encumbering their code with implementation details. In this example, we define the methods to be matched using a regular expression, and YouAreDaChef passes the result of the match to the advice function, which wraps them in a transaction and adds some logging:

class EnterpriseyLegume
  setId:         (@id)         ->
  setName:       (@name)       ->
  setDepartment: (@department) ->
  setCostCentre: (@costCentre) ->

YouAreDaChef(EnterpriseyLegume)
  .around /set(.*)/, (pointcut, match, value) ->
    performTransaction () ->
      writeToLog "#{match[1]}: #{value}"
      pointcut(value)

summary

Method combinations are a technique for separating concerns when the level of granularity is smaller than a method. This makes the code DRY and removes the clutter of orthogonal responsibilities.

Appendix: Finding Joy in Combinators

In this book, we have looked at a few interesting combinators and some Ruby code inspired by them. Today we’ll review the definition of a combinator, and from there we’ll learn something intriguing about an entire family of programming languages, the Concatenative Languages.

Let’s start at the beginning: What is a combinator?

One definition of a combinator is a function with no free variables. Another way to put it is that a combinator is a function that takes one or more arguments and produces a result without introducing anything new. In Ruby terms, we are talking about blocks, lambdas or methods that do not call anything except what has been passed in.

So if I tell you that:

finch.call(a).call(b).call(c)
  => c.call(b).call(a)

Then you know that finch is a combinator because the effect it produces is made up solely of combining the effects of the things it takes as parameters. That’s easy, and yet… where is our vaunted simplicity? Working with Ruby’s lambdas and braces and calls gets in our way. We can learn a lot from combinatorial logic to help our Ruby programming, but Ruby is a terrible language for actually learning about combinatorial logic.
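To see the point, here is the finch written out as a curried Ruby lambda; all the braces and calls Ruby needs make the case for a simpler notation. The list-building `pair` lambda is invented here so we can watch the rearrangement happen:

```ruby
# The finch, Fabc => cba, as a curried Ruby lambda.
finch = lambda { |a|
  lambda { |b|
    lambda { |c| c.call(b).call(a) }
  }
}

# A list-building lambda lets us observe the order of application:
pair = lambda { |x| lambda { |y| [x, y] } }

finch.call(:a).call(:b).call(pair)
  # => [:b, :a]
```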

Languages for combinatorial logic

Combinatorial logicians use a much simpler, direct syntax for writing expressions:

Fabc => cba

Whenever a logician writes abc, he means the same thing as when a Rubyist writes a.call(b).call(c). Note that, as in Ruby, application in combinatorial logic associates to the left, so abc is equivalent to (ab)c just as in Ruby a.call(b).call(c) is equivalent to (a.call(b)).call(c).

I think you’ll agree that abc is much simpler than a.call(b).call(c). Here’s another look at the combinators we’ve met in this series, using the simpler syntax:

Kxy => x
Txy => yx
Cxyz => xzy
Q3xyz => z(xy) # Q3 is shorthand for the Quirky bird
Bxyz => x(yz)
Qxyz => y(xz) # Q is shorthand for the Queer bird
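As a sketch, the same birds transcribed back into curried Ruby lambdas, using Proc#[] as shorthand for #call (the `inc` and `dbl` helpers are invented for demonstration):

```ruby
# The table above, transcribed into curried Ruby lambdas.
kestrel  = lambda { |x| lambda { |y| x } }                      # Kxy   => x
thrush   = lambda { |x| lambda { |y| y[x] } }                   # Txy   => yx
cardinal = lambda { |x| lambda { |y| lambda { |z| x[z][y] } } } # Cxyz  => xzy
quirky   = lambda { |x| lambda { |y| lambda { |z| z[x[y]] } } } # Q3xyz => z(xy)
bluebird = lambda { |x| lambda { |y| lambda { |z| x[y[z]] } } } # Bxyz  => x(yz)
queer    = lambda { |x| lambda { |y| lambda { |z| y[x[z]] } } } # Qxyz  => y(xz)

inc = lambda { |n| n + 1 }
dbl = lambda { |n| n * 2 }
bluebird[inc][dbl][3]
  # => 7, because Bxyz => x(yz): inc(dbl(3))
```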

There are many, many more combinators, of course. Infinitely more, in fact. We only have names for some of the most useful. For example, the Warbler Twice Removed, or W**, is written:

W**xyzw => xyzww

(Warblers are actually in a whole ‘nother family of birds that introduce duplication. Other members of that family include the Mockingbird and Starling. They’re incredibly useful for introducing ideas like iteration and recursion.)

You could say that combinators take a string of symbols (like x, y, z, w, and so forth), introduce some erasing, some duplication, and some permutation, and add some parentheses. In other words, they work to rearrange our string of symbols.

We have seen that parentheses are allowed, and that some combinators introduce parentheses. Before you say that the combinators introduce new symbols, remember that parentheses are punctuation. If you think of the symbols as words and the parentheses as punctuation, you see that the combinators simply rearrange the words and change the punctuation without introducing new words.

Now I said that combinators work with strings of symbols. This was a terrible analogy, because it made us talk about punctuation and why parentheses are not symbols. Another thing you could say is that combinators work with lists of symbols: they rearrange the symbols, including removing symbols, introducing sub-lists, and duplicating symbols.

This is more interesting! Now we can see that in our notation, adding parentheses is a way of introducing a sub-list. Let’s revisit the bluebird:

Bxyz => x(yz)

Now what we can say is this: The bluebird takes a list of three symbols and answers a list of one symbol and a sublist of two symbols. In Ruby:

bluebird = lambda { |*args|
  x, y, z = args
  [x, [y, z]]
}

bluebird.call(:x, :y, :z)
  => [:x, [:y, :z]]

This is easy. What about the Thrush?

thrush = lambda { |*args|
  x, y = args
  [y, x]
}

thrush.call(:x, :y)
  => [:y, :x]

Now let’s pause for a moment. Imagine we had an entire programming language devoted to this style of programming. The primary thing it does is define combinators that take a list of symbols and recombine them. Since it works with lists and we are thinking about combinatory logic, we will represent our expressions as lists:

idiot :x
  => :x

mockingbird :x
  => :x :x

bluebird :x :y :z
  => :x [:y :z]

thrush :x :y
  => :y :x

Wait! Do not shout Lisp! Just because we have lists of things does not mean we are programming in Lisp!! Let’s keep going, and you will see in the next example that I do not mean Lisp:

bluebird thrush :x :y :z
  => thrush [:x :y] :z
  => :z [:x :y]

And therefore in our fictitious language we can write:

quirky = bluebird thrush

And thus:

quirky :x :y :z
  => :z [:x :y]
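We can verify this composition with curried Ruby lambdas: applying the bluebird to the thrush yields a lambda that behaves exactly like the quirky bird (the `inc` and `dbl` helpers are invented for demonstration):

```ruby
# Bxyz => x(yz) and Txy => yx, so BTxyz => T(xy)z => z(xy),
# which is exactly the quirky bird Q3. Checking with lambdas:
bluebird = lambda { |x| lambda { |y| lambda { |z| x.call(y.call(z)) } } }
thrush   = lambda { |x| lambda { |y| y.call(x) } }

quirky = bluebird.call(thrush)

inc = lambda { |n| n + 1 }
dbl = lambda { |n| n * 2 }
quirky.call(inc).call(3).call(dbl)
  # => 8, that is, dbl(inc(3))
```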

This looks familiar. Have you ever written a program in Postscript? Or [Forth](http://en.wikipedia.org/wiki/Forth_(programming_language))? What if instead of using a thrush we used a word called swap? Or instead of a mockingbird we used a word called dup?

Concatenative languages

Concatenative (or stack-based) programming languages, like Postscript, Forth, Factor, and Joy, are almost direct representations of combinatorial logic. There is a list of things; words (combinators) permute the list of things, and the things can be anything: data, other combinators, or even programs. These languages are called concatenative languages because the primary way to compose programs and combinators with each other is to concatenate them together, like we did with the bluebird and thrush above.
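To make the connection concrete, here is a toy stack machine in Ruby. The `ToyStack` class and its words are simplified inventions for illustration, not real Forth or Joy, but `swap` and `dup_top` act on the top of the stack exactly as the thrush and mockingbird act on symbols:

```ruby
# A toy stack machine. Each word returns self, so words can be
# concatenated with method chaining, as in a concatenative language.
class ToyStack
  attr_reader :stack

  def initialize
    @stack = []
  end

  def push(x)
    @stack.push(x)
    self
  end

  # swap acts like the thrush: ... x y => ... y x
  def swap
    a = @stack.pop
    b = @stack.pop
    @stack.push(a, b)
    self
  end

  # dup_top acts like the mockingbird: ... x => ... x x
  # (named dup_top because Object#dup already exists in Ruby)
  def dup_top
    @stack.push(@stack.last)
    self
  end
end

ToyStack.new.push(:x).push(:y).swap.stack
  # => [:y, :x]
```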

For me the purpose of life is partly to have joy. Programmers often feel joy when they can concentrate on the creative side of programming. So Ruby is designed to make programmers happy. –Yukihiro Matsumoto

You have probably heard that it is a good idea to learn a new programming language every year. Is a concatenative language on your list of languages to learn? No? Well, here is the reason to learn a concatenative language: You will learn to think using combinatorial logic. For example, the Y Combinator is expressed in Joy as:

[dup cons] swap concat dup cons i

Where dup is a mockingbird, swap is a thrush, i is an idiot bird, and cons and concat are likewise two other combinators. Writing in Joy is writing directly in combinators.
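For comparison, here is a sketch of the same idea in Ruby lambdas: the applicative-order Y combinator (strictly, the Z combinator, since Ruby evaluates eagerly), used here to define factorial without naming it recursively:

```ruby
# The applicative-order Y combinator (the Z combinator) in Ruby.
# The inner self-application is wrapped in a lambda to delay
# evaluation, because Ruby is strictly evaluated.
y = lambda do |f|
  lambda { |x| x.call(x) }.call(
    lambda { |x| f.call(lambda { |*args| x.call(x).call(*args) }) }
  )
end

factorial = y.call(lambda do |recurse|
  lambda { |n| n.zero? ? 1 : n * recurse.call(n - 1) }
end)

factorial.call(5)
  # => 120
```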

In other programming languages, combinatorial logic is an underpinning. It helps us explain and prove certain things; it inspires us to invent certain things. It is behind everything we do. That’s good. But in a concatenative language, it is not an underpinning or behind a curtain. It is right out there in front of you. And learning to program in a concatenative language means learning to think in combinators.

The combinators we’ve discussed in depth so far are all fascinating; however, as a basis for writing programs they are incomplete. You cannot represent every possible program using only kestrels, thrushes, cardinals, quirky birds, bluebirds, and queer birds. To represent all possible programs, we need at least one combinator that duplicates symbols, like a mockingbird or another member of its family.

Appendix: Source Code

All source code is published under the following license:

# The MIT License
# 
# All contents Copyright (c) 2004-2008 Reginald Braithwaite
# <http://braythwayt.com>  except as otherwise noted.
# 
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# 
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
# 
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
# 
# http://www.opensource.org/licenses/mit-license.php

kestrels

returning.rb


unless respond_to?(:returning)
  def returning(it)
    yield it
    it
  end
end
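A quick usage sketch: returning is a kestrel, yielding its argument to the block for side effects and then returning it (the definition is repeated here so the snippet stands alone):

```ruby
# returning yields its argument for side effects, then returns it,
# ignoring the block's value. Handy for initializing and returning
# a value in a single expression.
def returning(it)
  yield it
  it
end

returning([]) { |list| list << 1 << 2 }
  # => [1, 2]
```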

thrushes

into.rb


class Object
  def into expr = nil
    expr.nil? ? yield(self) : expr.to_proc.call(self)
  end
end
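A quick usage sketch: into is a thrush, passing its receiver to a block or to anything responding to #to_proc (the definition is repeated here so the snippet stands alone):

```ruby
# Object#into passes the receiver to a block, or to anything that
# responds to #to_proc, and returns the result.
class Object
  def into expr = nil
    expr.nil? ? yield(self) : expr.to_proc.call(self)
  end
end

(1 + 2).into { |x| x * 10 }
  # => 30
```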

let.rb


class Kernel
  def let it
    yield it
  end
end

the cardinal

cardinal.rb


def cardinal_define(name, &proc_over_proc)
  define_method_taking_block(name) do |a_value, a_proc|
    proc_over_proc.call(a_proc).call(a_value)
  end
end

# method_body_proc should expect (a_value, a_proc)
# see http://coderrr.wordpress.com/2008/10/29/using-define_method-with-blocks-in-ruby-18/
def define_method_taking_block(name, &method_body_proc)
  self.class.send :define_method, "__cardinal_helper_#{name}__", method_body_proc
  eval <<-EOM
    def #{name}(a_value, &a_proc)
      __cardinal_helper_#{name}__(a_value, a_proc)
    end
  EOM
end

quirky birds

andand.rb


module AndAnd

  # This module is included in Object, so each of these methods are added
  # to Object when you require 'andand'. Each method is an *adverb*: they are
  # intended to be enchained with another method, such as receiver.adverb.method
  #
  # The purpose of an adverb is to modify what the primary method returns.
  #
  # Adverbs also take blocks or procs, passing the receiver as an argument to the
  # block or proc. They retain the same semantics with a block or proc as they
  # do with a method. This behaviour weakly resembles a monad.
  module ObjectGoodies

    # Returns nil if its receiver is nil, regardless of whether nil actually handles the
    # actual method or what it might return.
    #
    #    'foo'.andand.size => 3
    #    nil.andand.size => nil
    #    'foo'.andand { |s| s << 'bar' } => 'foobar'
    #    nil.andand { |s| s << 'bar' } => nil
    def andand (p = nil)
      if self
        if block_given?
          yield(self)
        elsif p
          p.to_proc.call(self)
        else
          self
        end
      else
        if block_given? or p
          self
        else
          MockReturningMe.new(self)
        end
      end
    end

    # Invokes the method and returns the receiver if nothing is raised. Therefore,
    # the purpose of calling the method is strictly for side effects. In the block
    # form, it resembles #tap from Ruby 1.9, and is useful for debugging. It also
    # resembles #returning from Rails, with slightly different syntax.
    #
    #    Object.new.me do |o|
    #      def o.foo
    #        'foo'
    #      end
    #    end
    #      => your new object
    #
    # In the method form, it is handy for chaining methods that don't ordinarily
    # return the receiver:
    #
    #    [1, 2, 3, 4, 5].me.pop.reverse
    #      => [4, 3, 2, 1]
    def me (p = nil)
      if block_given?
        yield(self)
        self
      elsif p
        p.to_proc.call(self)
        self
      else
        ProxyReturningMe.new(self)
      end
    end

    unless Object.instance_methods.include?('tap')
      alias :tap :me
    end

    # Does not invoke the method or block and returns the receiver.
    # Useful for commenting stuff out, especially if you are using #me for
    # debugging purposes: change the .me to .dont and the semantics of your
    # program are unchanged.
    #
    #    [1, 2, 3, 4, 5].me { |x| p x }
    #      => prints and returns the array
    #    [1, 2, 3, 4, 5].dont { |x| p x }
    #      => returns the array without printing it
    def dont (p = nil)
      if block_given?
        self
      elsif p
        self
      else
        MockReturningMe.new(self)
      end
    end

  end

end

class Object
  include AndAnd::ObjectGoodies
end

unless Module.constants.include?('BlankSlate')
  if Module.constants.include?('BasicObject')
    module AndAnd
      class BlankSlate < BasicObject
      end
    end
  else
    module AndAnd
      class BlankSlate
        def self.wipe
          instance_methods.reject { |m| m =~ /^__/ }.each { |m| undef_method m }
        end
        def initialize
          BlankSlate.wipe
        end
      end
    end
  end
end

module AndAnd

  # A proxy that returns its target without invoking the method you
  # invoke. Useful for nil.andand and #dont
  class MockReturningMe < BlankSlate
    def initialize(me)
      super()
      @me = me
    end
    def method_missing(*args)
      @me
    end
  end

  # A proxy that returns its target after invoking the method you
  # invoke. Useful for #me
  class ProxyReturningMe < BlankSlate
    def initialize(me)
      super()
      @me = me
    end
    def method_missing(sym, *args, &block)
      @me.__send__(sym, *args, &block)
      @me
    end
  end

end

blank_slate.rb


unless Module.constants.include?('BlankSlate')
  if Module.constants.include?('BasicObject')
    class BlankSlate < BasicObject
    end
  else
    class BlankSlate
      def self.wipe
        instance_methods.reject { |m| m =~ /^__/ }.each { |m| undef_method m }
      end
      def initialize
        BlankSlate.wipe
      end
    end
  end
end

quirky_bird.rb


def quirky_bird_define(name, &value_proc)
  define_method_taking_block(name) do |a_value, a_proc|
    a_proc.call(value_proc.call(a_value))
  end
end

# method_body_proc should expect (a_value, a_proc)
# see http://coderrr.wordpress.com/2008/10/29/using-define_method-with-blocks-in-ruby-18/
def define_method_taking_block(name, &method_body_proc)
  self.class.send :define_method, "__quirky_bird_helper_#{name}__", method_body_proc
  eval <<-EOM
    def #{name}(a_value, &a_proc)
      __quirky_bird_helper_#{name}__(a_value, a_proc)
    end
  EOM
end

def quirky_bird_extend(name, &value_proc)
  Object.send(:define_method, name) do
    value_proc.call(self)
  end
end

quirky_songs.rb


require 'quirky_bird'
require 'blank_slate'
require 'returning'

quirky_bird_extend(:maybe) do |value|
  if value.nil?
    returning(BlankSlate.new) do |it|
      def it.method_missing(*args)
        nil
      end
    end
  else
    value
  end
end

quirky_bird_extend(:try) do |value|
  returning(BlankSlate.new) do |it|
    def it.__value__=(arg)
       @value = arg
    end
    def it.method_missing(name, *args)
      if @value.respond_to?(name)
        @value.send(name, *args)
      end
    end
    it.__value__ = value
  end
end

bluebirds

before_and_after_advice.rb


# This code contains ideas snarfed from:
#
#   http://github.com/up_the_irons/immutable/tree/master
#   http://blog.jayfields.com/2006/12/ruby-alias-method-alternative.html
#   http://eigenclass.org/hiki.rb?bounded+space+instance_exec
#
# And a heaping side of http://blog.grayproductions.net/articles/all_about_struct

module BeforeAndAfterAdvice

  # Random ID changed at each interpreter load
  UNIQ = "_#{object_id}"

  Compositions = Struct.new(:before, :between, :after)

  module MethodAdvice; end

  module ClassMethods

    # Example:
    #
    #   before :foo, :bar do
    #     # ...
    #   end
    #
    # This executes the body of the block before the #foo and #bar instance methods
    # for side effects without modifying the parameters (if any) passed to #foo
    # and #bar
    #
    #   before :fizz, :buzz do |p1, p2|
    #     # ...
    #     [p1, p2]
    #   end
    #
    # This executes the body of the block before the #foo and #bar instance methods
    # for side effects, AND determines what is passed along as parameters. If the block
    # takes parameters, it acts as a filter, transforming the parameters.
    #
    # It is possible to chain #before advice, and you can add more advice in subclasses:
    #
    #   class Foo
    #     def foo(bar); end
    #   end
    #
    #   class Bar < Foo
    #     include MethodAdvice
    #
    #     before :foo do
    #       # ...
    #     end
    #
    #   end
    #
    #   class Blitz < Bar
    #     include MethodAdvice
    #
    #     before :foo do |bar|
    #       # ...
    #       bar
    #     end
    #
    #   end
    #
    def before(*method_symbols, &block)
      options = method_symbols[-1].kind_of?(Hash) ? method_symbols.pop : {}
      method_symbols.each do |method_sym|
        __composed_methods__[method_sym].before.unshift(__unbound_method__(block, options[:name]))
        __rebuild_method__(method_sym)
      end
    end

    # Example:
    #
    #   after :foo, :bar do
    #     # ...
    #   end
    #
    # This executes the body of the block after the #foo and #bar instance methods
    # for side effects without modifying the return values of the #foo and #bar methods
    #
    #   after :fizz, :buzz do |r|
    #     # ...
    #     r
    #   end
    #
    # This executes the body of the block after the #foo and #bar instance methods
    # for side effects, AND determines what is returned from the call. If the block
    # takes parameters, it acts as a filter, transforming the return value.
    #
    # It is possible to chain #after advice, and you can add more advice in subclasses:
    #
    #   class Foo
    #     def foo(bar); end
    #   end
    #
    #   class Bar < Foo
    #     include MethodAdvice
    #
    #     after :foo do
    #       # ...
    #     end
    #
    #   end
    #
    #   class Blitz < Bar
    #     include MethodAdvice
    #
    #     after :foo do |r|
    #       # ...
    #       r
    #     end
    #
    #   end
    #
    def after(*method_symbols, &block)
      options = method_symbols[-1].kind_of?(Hash) ? method_symbols.pop : {}
      method_symbols.each do |method_sym|
        __composed_methods__[method_sym].after.push(__unbound_method__(block, options[:name]))
        __rebuild_method__(method_sym)
      end
    end

    # Removes all advice from the named methods. Intended for testing.
    #
    def reset_befores_and_afters(*method_symbols)
      method_symbols.each do |method_sym|
        __composed_methods__[method_sym].before = []
        __composed_methods__[method_sym].after = []
        __rebuild_method__(method_sym)
      end
    end

    # Modified to re-apply advice when a method is overridden. So:
    #
    #   class Foo
    #     def foo(bar); end
    #   end
    #
    #   class Bar < Foo
    #     include MethodAdvice
    #
    #     after :foo do
    #       # ...
    #     end
    #
    #   end
    #
    #   class Blitz < Bar
    #     include MethodAdvice
    #
    #     def foo(bar)
    #       # ...
    #     end
    #
    #   end
    #
    # In this case the class Blitz overrides the method #foo, but the advice in
    # class Bar is still applied; the override happens ONLY on the inner method,
    # not the advice.
    #
    # Note well that super has undefined behaviour in this situation.
    #
    def method_added(method_sym)
      unless instance_variable_get("@#{UNIQ}_in_method_added")
        __safely__ do
          __composed_methods__[method_sym].between = self.instance_method(method_sym)
          @old_method_added and @old_method_added.call(method_sym)
          __rebuild_method__(method_sym)
        end
      end
    end

    def __composed_methods__
      ancestral_composer = ancestors.detect { |ancestor| ancestor.instance_variable_defined?(:@__composed_methods__) }
      if ancestral_composer
        ancestral_composer.instance_variable_get(:@__composed_methods__)
      else
        @__composed_methods__ ||= Hash.new { |hash, method_sym| hash[method_sym] = BeforeAndAfterAdvice::Compositions.new([], self.instance_method(method_sym), []) }
      end
    end

    def __rebuild_without_advice__(method_sym, old_method)
      if old_method.arity == 0
        define_method(method_sym) { old_method.bind(self).call }
      else
        define_method(method_sym) { |*params| old_method.bind(self).call(*params) }
      end
    end

    def __rebuild_advising_no_parameters__(method_sym, old_method, befores, afters)
      define_method(method_sym) do
        befores.each do |before_advice_method|
          before_advice_method.bind(self).call
        end
        afters.inject(old_method.bind(self).call) do |ret_val, after_advice_method|
          after_advice_method.bind(self).call
        end
      end
    end

    def __rebuild_advising_with_parameters__(method_sym, old_method, befores, afters)
      define_method(method_sym) do |*params|
        afters.inject(
          old_method.bind(self).call(
            *befores.inject(params) do |acc_params, before_advice_method|
              if before_advice_method.arity == 0
                before_advice_method.bind(self).call
                acc_params
              else
                before_advice_method.bind(self).call(*acc_params)
              end
            end
          )
        ) do |ret_val, after_advice_method|
          if after_advice_method.arity == 0
            after_advice_method.bind(self).call
            ret_val
          else
            after_advice_method.bind(self).call(ret_val)
          end
        end
      end
    end

    def __rebuild_method__(method_sym)
      __safely__ do
        composition = __composed_methods__[method_sym]
        old_method = composition.between
        if composition.before.empty? and composition.after.empty?
          __rebuild_without_advice__(method_sym, old_method)
        else
          arity = old_method.arity
          if old_method.arity == 0
            __rebuild_advising_no_parameters__(method_sym, old_method, composition.before, composition.after)
          else
            __rebuild_advising_with_parameters__(method_sym, old_method, composition.before, composition.after)
          end
        end
      end
    end

    def __safely__
      was = instance_variable_get("@#{UNIQ}_in_method_added")
      begin
        instance_variable_set("@#{UNIQ}_in_method_added", true)
        yield
      ensure
        instance_variable_set("@#{UNIQ}_in_method_added", was)
      end
    end

    def __unbound_method__(a_proc, name_prefx = nil)
      begin
        old_critical, Thread.critical = Thread.critical, true
        n = 0
        n += 1 while respond_to?(mname="#{name_prefx || '__method_advice'}_#{n}")
        MethodAdvice.module_eval{ define_method(mname, &a_proc) }
      ensure
        Thread.critical = old_critical
      end
      begin
        MethodAdvice.instance_method(mname)
      ensure
        MethodAdvice.module_eval{ remove_method(mname) } unless name_prefx rescue nil
      end
    end

  end

  def self.included(receiver)
    receiver.extend         ClassMethods
    receiver.send :include, MethodAdvice
    receiver.instance_variable_set("@#{UNIQ}_in_method_added", false)
    receiver.instance_variable_set(:@old_method_added, receiver.public_method_defined?(:method_added) && receiver.instance_method(:method_added))
  end

end

more about combinatory logic

Some technical papers (links and notes courtesy of Bruce Ediger):

Bruce Ediger has built some interpreters you can use:

He writes:

The strong reduction interpreter is the most primitive. The other two have interesting features that they both don't share. For just amusing things done in various CL bases: Examples. Some of the tests for cl and acl also do amusing things, like exercise cycles, enumerate Ogres/Hopelessly egotistic birds, or run large "monsters" out to normal form.

and the blog posts

Here are the blog posts that started it all: Kestrels, The Thrush, Songs of the Cardinal, Quirky Birds and Meta-Syntactic Programming, Aspect-Oriented Programming in Ruby using Combinator Birds, The Enchaining and Obdurate Kestrels, Finding Joy in Combinators, Refactoring Methods with Recursive Combinators, Practical Recursive Combinators, The Hopelessly Egocentric Blog Post, and Wrapping Combinators.

If you're looking for a place to start, the very first post, Kestrels, presumes no previous knowledge of combinators or advanced Ruby-fu. Finding Joy in Combinators was actually written in the middle, but it can be read at any time; it points towards concatenative programming languages, an interesting area of study.

Or, return to the top and read the book all over again. Thanks for stopping by!