Saturday, April 7, 2012

Callbacks coming back


Historic Context
pun intended

Callbacks are a wonderful thing.  They're the original object-oriented code.

Code A makes a function call to Code B along with a context.

OOP sort of took us off the straight-and-narrow when it generalized this. The problem became that A started to know a LOT about B, tightening the coupling to the point that it was nearly impossible for code to be polymorphic (ironically). Say A wants a list of "Employee" objects so that it can sum up their salaries to produce a company cost. I can't reuse that same "sum-up" algorithm and apply it to, say, matrix math on arbitrary data structures.  I'd need to write adaptor classes/functions to produce a common data structure that both the algorithm (the sum-up function) and the classes themselves conform to.  Now do this in such a way that two vendors would have chosen the EXACT same signature, and your chances of code reuse go out the door.
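To make the sum-up complaint concrete, here is a rough Python sketch of the callback version: the algorithm takes a value-extracting callback instead of knowing about Employee objects. The names `sum_up` and `value_of`, and the sample data, are made up for illustration.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")

def sum_up(items: Iterable[T], value_of: Callable[[T], float]) -> float:
    """Sum whatever value_of extracts; knows nothing about the item type."""
    total = 0.0
    for item in items:
        total += value_of(item)
    return total

# Hypothetical data: employee records as plain dicts.
employees = [{"name": "Ann", "salary": 50000.0}, {"name": "Bob", "salary": 40000.0}]
company_cost = sum_up(employees, lambda e: e["salary"])

# Same algorithm, unrelated data: the trace of a matrix stored as nested lists.
matrix = [[1.0, 2.0], [3.0, 4.0]]
trace = sum_up(range(len(matrix)), lambda i: matrix[i][i])
```

No adaptor classes, and the only signature two vendors have to agree on is "a function of one item that returns a number".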

The same can be said for execution pipelines.  Do A, then B, then C... BUT if A, B or C fail, do error-code E.  Normally we'd think procedurally and write function calls DIRECTLY to E everywhere. OOP gave us the notion of exception handling, which has its merits. But if we're linking to 3rd party code, we have similar problems. How do I differentiate exactly what each exception means generically enough that I can link to any 3rd party code?  I can't; I have to read the documentation and import their specific exception class. And God help us if they handle this poorly.

But, if our code was instead a sequence of callbacks that had minimalist signatures and data-hidden contexts, we could be more versatile.

a( callback-b ( callback-c ( { DONE }, error-cb-E ), error-cb-E ), error-cb-E)

Here A, B and C do their thing, then call either an error callback to take the exit path or a success callback to continue down the forward path.  The void-main decides the linkage order, and here's the beauty: instead of A being hard-coded to call B, B can be swapped out at CALL-time to do any number of alternate things.
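A minimal Python sketch of that chain, assuming each step takes exactly two callbacks (success, error). The helper `make_step` and the `log` list are illustration-only names, not anything from a real library.

```python
log = []

def make_step(name, fails=False):
    """Build a step that records its name, then calls exactly one callback."""
    def step(on_success, on_error):
        log.append(name)
        if fails:
            on_error(name + " failed")
        else:
            on_success()
    return step

a = make_step("A")
b = make_step("B")
c = make_step("C")

def done():
    log.append("DONE")

def error_e(reason):
    log.append("E: " + reason)

# The caller, not A, decides that B follows A and C follows B.
a(lambda: b(lambda: c(done, error_e), error_e), error_e)
first_run = list(log)   # ["A", "B", "C", "DONE"]

# Swap B for a failing variant at call time; A and C are untouched.
log.clear()
bad_b = make_step("B", fails=True)
a(lambda: bad_b(lambda: c(done, error_e), error_e), error_e)
second_run = list(log)  # ["A", "B", "E: B failed"]
```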

If you want specialization, you can write a function which links A to B but leaves C as a callback; or you could write a function which takes A, B and C but defaults to a common E, etc.
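Both kinds of specialization can be sketched in a few lines of Python. This assumes the same two-callback step shape (success, error); `link` and `with_default_error` are hypothetical helper names.

```python
calls = []

def a(on_success, on_error):
    calls.append("A")
    on_success()

def b(on_success, on_error):
    calls.append("B")
    on_success()

def c(on_success, on_error):
    calls.append("C")
    on_error("C failed")  # this step always fails, to exercise the error path

def link(first, second):
    """Return a new step that runs first, then second, sharing one error path."""
    def linked(on_success, on_error):
        first(lambda: second(on_success, on_error), on_error)
    return linked

def with_default_error(step, default_error):
    """Return a step whose error callback is optional."""
    def run(on_success, on_error=default_error):
        step(on_success, on_error)
    return run

a_then_b = link(a, b)  # A is wired to B; whatever follows is still open
pipeline = with_default_error(link(a_then_b, c),
                              lambda why: calls.append("E: " + why))
pipeline(lambda: calls.append("DONE"))
# calls is now ["A", "B", "C", "E: C failed"]
```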

The main obstacle, I would imagine, is readability of code.  And here come the strengths/weaknesses of the underlying language.

Who's got my back?
The caller does

We'll assume a virtually stateless zero-arg input, zero-arg output for these examples

C func-signatures
=============
typedef void (*voidfunc)(void* data);
void runme(void* data, voidfunc cb) { cb(data); }
void mycallback(void* data) { ... }
int x = 5;
runme(&x, mycallback);

The definition of ugly could reference this.

C++  Objects
==========
class Callback {
public:
  virtual void operator()() = 0;
};
void runme(Callback& cb) { cb(); }


class MyCB: public Callback {
public:
   void operator()() { .. }
};
MyCB cb;
runme(cb);

More code, more assembly-setup (constructors/destructors), and only slightly more readable than C (you could replace operator() with something more readable, but then what's the likelihood other vendors would have chosen your name too?).

C++ 1998 templates
===============
template<typename T>
void runme(T& cb) { cb(); }


class MyCB {
public:
   void operator()() { .. }
};


MyCB cb;
runme(cb);
void foo() { .. }
runme(foo);

Getting there, still have boiler-plate.  Sadly the templates need to live in .h files.

C++ 2011 lambdas / closures
==============
void runme(std::function<void()> cb) { cb(); }  // needs #include <functional>

int x = 5;
int y = 6;
runme([](){ printf("Hello\n"); });
runme([=](){ printf("Hello %i\n", x); });
runme([x](){ printf("Hello %i\n", x); });
runme([&](){ x++; });
printf("x=%i\n", x);
runme([&x](){ x++; });
printf("x=%i\n", x);
runme([&x, y](){ x+=y; /* y++; error */});
printf("x=%i, y=%i\n", x, y);

Ok, I think they got the point. You can statically specify the closure (what is copied, what is referenced).  Other than the initial shock of seeing square brackets, it does stand out as a lambda, and the rest seems rather intuitive: parens for callback params, braces for code.  The one remaining oddity is the ret-val signature.

GCC and Visual Studio should support some subset of this syntax. GCC didn't like =x, for example.

For GCC you have to use --std=c++0x
More at:
http://candrews.net/blog/2011/07/understanding-c-0x-lambda-functions/

Apple LLVM C/C++ extension; call-blocks
======================
typedef void (^callback)();
void runme(callback cb) { cb(); }
int x;
runme(^{ printf("%i", x); });

Now we're really on the right track.  Though it gets nasty if you want a reference closure.

__block int x;
runme(^{ x++; });

And let's not get started on the possible bugs introduced by their explicit cloning and reference counting of blocks.

Happiness does not derive from here... Moving on.. (hate mail welcome)

Java Objects
=========
void runme(Runnable r) { r.run(); }


final int x= 5;
final AtomicInteger ai = new AtomicInteger(0);
runme(new Runnable() {
   public void run() { System.out.printf("Hi %d", x);   ai.set(x); }
});

Slightly better than classic C++ objects (due to anonymity).  But hopefully peers have chosen Runnable and Callable as their signatures.  And what if we wanted 2 input params?  What, no TupleN data-type?

language-fail


Javascript anonymous functions with full closure
=======================
function runme(cb) { cb(); }
var x = 5;
runme(function() { x++; ...  })

Pretty much the definition of simplicity and readability.  Sadly, you don't always want closure. Say in:

for (var i = 0; i < 10; i++ ) runme(function() { i--; }) // REALLY BAD


Basically Java mandates one way (capture a copy of a final value), which is safe; JavaScript mandates the other way (capture the live variable), which is unsafe.
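Python, for what it's worth, captures by reference like JavaScript but lets you opt into a copy, which makes the difference easy to see side by side. A small sketch; the default-argument trick is a common Python idiom, and the variable names are just for illustration.

```python
def runme(cb):
    return cb()

# Captured by reference: every callback sees the loop variable's final value.
by_reference = [lambda: i for i in range(3)]
late_results = [runme(cb) for cb in by_reference]

# Captured by copy: a default argument freezes the value at definition time.
by_copy = [lambda i=i: i for i in range(3)]
copied_results = [runme(cb) for cb in by_copy]
```

Here `late_results` comes out as [2, 2, 2] while `copied_results` is [0, 1, 2].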

perl anonymous functions
==================
sub runme($) { my $cb =$_[0]; &$cb(); }
my $x = 5;
runme(sub { $x++ })

Invocation is pretty good, but the dispatch code is nasty looking. Moral of the story.. Perl Rule #1, be a library USER, not a library writer in perl. :)

I will say, however, that the generic stack-structure of perl makes it EXCELLENT for callback-oriented code. You can write a function which neither knows nor cares about the number of arguments. So, for example, sort routines and counting routines are natural fits:

sub count(@) {
   my $tot = 0;
   $tot += $_ for @_;
   return $tot
}

and the type-coercion means if you can stringify something, then numerify that string, the count func will work on it.  But let's not forget Perl Rule #1, shall we. Moving on..

ruby lambdas
==========
def runme(cb)
   cb.call();
end
runme lambda { ... }

Their general lambda syntax is excellent.

lambda { |a1,a2| /* closure+local code, inferred returned data-type */ }

If only it weren't for the rest of ruby syntax. :)  I mean, these guys took BASIC, python and C, and said, eh, I think you missed a combination.  Clearly they were not vi fans.

python lambdas
===========
def runme(cb): cb()


runme(lambda: ... )

Hard to complain, yet somehow I've never bothered.  And I'm one of the rare people that like their tab-oriented syntax.  I white-board pseudo-code in python.. I just never want to have to use it for some reason.

groovy closures
============
def runme(cb) { cb() }

runme( {  ... } );

Groovy has a lot of merit.  It seems to be driving syntactic changes in java itself.  I just fear that it's compromising on a lot of java's strengths: type-safety, tight deterministic execution paths (groovy compiled code can be highly reflective for something that otherwise would look efficient), and ease of static analysis by an editor (e.g. unsafe refactor operations, or 'who calls this code?' queries).

But if you look at it as a language unto itself, it's certainly in the same domain as jython, jruby.

google dart
========
runme(cb) { cb(); }


runme(() => expression);
or
mylocal() { expressions }
runme(mylocal);

I think Dart has merit as a replacement for javascript, being that it's written as a forwardly compatible language. It's definitely more of a system language than javascript and thus more amenable to large-scale 'million-line-apps'.  I think it has some quirks because most feedback in the dart-forum is, "we agree such and such would be better, but we're really trying to maintain the legacy language feel, to help invite new coders".  My response, which is echoed by others, is that you're going to shock them with the deviations anyway, so just because you can make it smell like C++ isn't going to win anyone's heart once they delve into it.

google go closure
=======
func runme(f func()) { f() }

runme(func() { ... });


scala functions with full closure
===========
def runme(cb: => Unit) = cb
var x = 5 // mutable
val y = 6 // immutable
runme { println(x + y); x += 1 }  // scala has no ++ operator

Basically a safer javascript, since you are very prone to lock down almost all variables by default (don't have to lift an extra finger to type "const" or "final").

Almost all traditional syntax is optional (semi-colons, parens).  So we could have been explicit:

def runme(cb: () => Unit): Unit = { cb(); };
...
runme(() => { ... });

I'm becoming prone to using scala as my prototyping language because it's faster to type and more compact than most other languages (including ruby), yet has the full explicit power of the JVM library (threading, async-IO, JNI, video-processing, etc).  Further, they nicely try to mitigate several of the language's bugs, like null-pointers.  And the power to which they've extended callbacks is mind blowing (literally hurts the brain).

def createfunc(x: Int)(y:Int):Int = x + y


val cb = createfunc(5)_; // nasty trailing "_" to denote incomplete execution


print(cb(6)); // prints 11

This is a lambda-factory which generates a callback which wraps "x". The callback will take "y" and add it to the wrapped x.
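For comparison, the same lambda-factory idea sketched in Python, once with a hand-written closure and once with the standard library's `functools.partial`. The function names mirror the scala example and are otherwise arbitrary.

```python
from functools import partial

def createfunc(x):
    """Return a callback that remembers x and adds it to its argument."""
    def add_to_x(y):
        return x + y
    return add_to_x

cb = createfunc(5)
result_closure = cb(6)      # 11

# Equivalent wiring with functools.partial on a plain two-argument function.
def add(x, y):
    return x + y

cb2 = partial(add, 5)
result_partial = cb2(6)     # 11
```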

I only elaborate because the next level of callback functions is the synthesis of callback functions.


Lisp / clojure / etc
==========
(define (runme cb) (cb))
(runme (lambda () ( ... )))

Or even the crazy lambda factory syntax
(define (create-foo x) (lambda (y) (+ x y)))

; WARNING - Scheme-style spelling; clojure and other lisps differ in the details

With the mandatory xkcd link:
http://xkcd.com/224/
or (http://timbunce.files.wordpress.com/2008/07/perl-myths-200807-noteskey.pdf)

Summary
Callback oriented code, IF the underlying syntax promotes it, allows you to almost fully decouple your code.  You can do this with objects, functions, function+data pairs, functions-with-closure.  But the key is that such coding style promotes cross-code-use.  You can scaffold your execution pipeline differently in different contexts within the same executable.

Javascript AJAX and Apple's Grand Central Dispatch are promoting the hell out of callbacks.  They don't leave you any choice.
Java's spring-framework gets you half-way-there with the notion of dependency-injection, forcing you at least to interface your code for testability.  It's a half-measure.

The real power is that once you've callbackified your code, asynchronous programming models become second nature.  If you were going to break up the error-handling or steps 1, 2 and 3 of your code anyway, now you have the option to use a dispatch system (grand-central, java executor-pools, javascript worker-threads) to leverage multiple CPUs.  But this isn't always a positive (inter-CPU cache-coherence can actually slow down performance, and will certainly burn more power for the same unit of work).  Still, it's great for UI stall-mitigation: backgrounding all non-critical-path work-flows.  If nothing else, the explicit concurrency can be flattened into a single thread of execution.  That's just a run-time setting.
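As a rough illustration of the "run-time setting" point, here is a Python sketch using the standard `concurrent.futures` thread pool: the same list of callbacks is dispatched across several workers or flattened onto one, and only a number changes. `run_steps` is a made-up helper name, and on a standard CPython build threads won't speed up CPU-bound work, so read this for the dispatch shape only.

```python
from concurrent.futures import ThreadPoolExecutor

def run_steps(steps, workers):
    """Dispatch independent callbacks; workers=1 flattens them onto one thread."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(step) for step in steps]
        # Collect results in submission order, whatever order they finished in.
        return [f.result() for f in futures]

# Five independent zero-arg callbacks (each captures its own n by copy).
steps = [lambda n=n: n * n for n in range(5)]

parallel = run_steps(steps, workers=4)
serial = run_steps(steps, workers=1)
```

Both calls return [0, 1, 4, 9, 16]; the callbacks themselves never learn how they were scheduled.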
