Wednesday, January 19, 2011

Monad.Zero -- doing it right

For my own benefit (see title of blog) as much as anything else. When I wrote

Rule 1 of monadic parsers is -- if you return Monad.Zero, you're probably doing it wrong... It's likely that you mean monadic return of an empty list.

it's because there's a distinction between positively returning an empty result (i.e. successfully doing nothing) and returning nothing at all (not succeeding in what you were trying to do) -- as the F# documentation for Zero puts it

Called for empty else branches of if...then expressions in computation expressions.

So, for a parser monad Zero is the function that returns a parser that takes no input and always fails:

for the Enumerable or Seq monad, it's the empty sequence, for Maybe its None and so forth -- basically the only thing you can return if you don't have any value to return.

And for individual leaf parsers -- the functions that define an Operation<TInput, TOutput> and return them wrapped as a Parser<TInput, TOutput>, those Operation<TInput, TOutput>s should return the same non-value when faced with a parse failure.

By contrast, when composing a parser that may try to consume some of the input, but fail, and you want to not fail the whole parse, you need a parser that successfully matches nothing (and consumes nothing) as an alternative that provides a success value -- and that is not Zero, but rather Return of the non-value of the appropriate type. So, for example, the parser combinator Many:

is "Consume as many blocks of p as you can -- or, if that fails, do nothing"; by contrast, the combinator Sat fails if the predicate doesn't match:

So in that case it says Zero and means Zero because that's a failure to satisfy the parser -- here in the equivalent the F# computation expression we don't even have to write the else branch; it's filled in for us by the compiler.

And that's really the one subtlety to be borne in mind -- otherwise you end up with an almost intractable debugging issue when your parser fails unexpectedly, partway through consuming valid input.

No comments :