This code is from Querying a Dataset with Scala's Pattern Matching:
object & { def unapply[A](a: A) = Some((a, a)) }
"Julie" match {
case Brothers(_) & Sisters(_) => "Julie has both brother(s) and sister(s)"
case Siblings(_) => "Julie's siblings are all the same sex"
case _ => "Julie has no siblings"
}
// => "Julie has both brother(s) and sister(s)"
How does &
actually work? I don't see a Boolean test anywhere for the conjunction. How does this Scala magic work?
Here's how unapply
works in general:
When you do
obj match {case Pattern(foo, bar) => ... }
Pattern.unapply(obj)
is called. This can either return None
in which case the pattern match is a failure, or Some(x,y)
in which case foo
and bar
are bound to x
and y
.
If instead of Pattern(foo, bar)
you did Pattern(OtherPattern, YetAnotherPatter)
then x
would be matched against the pattern OtherPattern
and y
would be matched against YetAnotherPattern
. If all of those pattern matches are successful, the body of the match executes, otherwise the next pattern is tried.
when the name of a pattern is not alphanumeric, but a symbol (like &
), it is used infix, i.e. you write foo & bar
instead of &(foo, bar)
.
So here &
is a pattern that always returns Some(a,a)
no matter what a
is. So &
always matches and binds the matched object to its two operands. In code that means that
obj match {case x & y => ...}
will always match and both x
and y
will have the same value as obj
.
In the example above this is used to apply two different patterns to the same object.
I.e. when you do
obj match { case SomePattern & SomeOtherPattern => ...}`
first the pattern &
is applied. As I said, it always matches and binds obj
to its LHS and its RHS. So then SomePattern
is applied to &
's LHS (which is the same as obj
) and SomeOtherPattern
is applied to &
's RHS (which is also the same as obj
).
So in effect, you just applied two patterns to the same object.
Let's do this from the code. First, a small rewrite:
object & { def unapply[A](a: A) = Some(a, a) }
"Julie" match {
// case Brothers(_) & Sisters(_) => "Julie has both brother(s) and sister(s)"
case &(Brothers(_), Sisters(_)) => "Julie has both brother(s) and sister(s)"
case Siblings(_) => "Julie's siblings are all the same sex"
case _ => "Julie has no siblings"
}
The new rewrite means exactly the same thing. The comment line is using infix notation for extractors, and the second is using normal notation. They both translate to the same thing.
So, Scala will feed "Julie" to the extractor, repeatedly, until all unbound variables got assigned to Some
thing. The first extractor is &
, so we get this:
&.unapply("Julie") == Some(("Julie", "Julie"))
We got Some
back, so we can proceed with the match. Now we have a tuple of two elements, and we have two extractors inside &
as well, so we feed each element of the tuple to each extractor:
Brothers.unapply("Julie") == ?
Sisters.unapply("Julie") == ?
If both of these return Some
thing, then the match is succesful. Just for fun, let's rewrite this code without pattern matching:
val pattern = "Julie"
val extractor1 = &.unapply(pattern)
if (extractor1.nonEmpty && extractor1.get.isInstanceOf[Tuple2]) {
val extractor11 = Brothers.unapply(extractor1.get._1)
val extractor12 = Sisters.unapply(extractor1.get._2)
if (extractor11.nonEmpty && extractor12.nonEmpty) {
"Julie has both brother(s) and sister(s)"
} else {
"Test Siblings and default case, but I'll skip it here to avoid repetition"
}
} else {
val extractor2 = Siblings.unapply(pattern)
if (extractor2.nonEmpty) {
"Julie's siblings are all the same sex"
} else {
"Julie has no siblings"
}
Ugly looking code, even without optimizing to only get extractor12
if extractor11
isn't empty, and without the code repetition that should have gone where there's a comment. So I'll write it in yet another style:
val pattern = "Julie"
& unapply pattern filter (_.isInstanceOf[Tuple2]) flatMap { pattern1 =>
Brothers unapply pattern1._1 flatMap { _ =>
Sisters unapply pattern1._2 flatMap { _ =>
"Julie has both brother(s) and sister(s)"
}
}
} getOrElse {
Siblings unapply pattern map { _ =>
"Julie's siblings are all the same sex"
} getOrElse {
"Julie has no siblings"
}
}
The pattern of flatMap
/map
at the beginning suggests yet another way of writing this:
val pattern = "Julie"
(
for {
pattern1 <- & unapply pattern
if pattern1.isInstanceOf[Tuple2]
_ <- Brothers unapply pattern1._1
_ <- Sisters unapply pattern1._2
} yield "Julie has both brother(s) and sister(s)
) getOrElse (
for {
_ <- Siblings unapply pattern
} yield "Julie's siblings are all the same sex"
) getOrElse (
"julie has no siblings"
)
You should be able to run all this code and see the results for yourself.
For additional info, I recommend reading the Infix Operation Patterns section (8.1.10) of the Scala Language Specification.
An infix operation pattern p op q
is a
shorthand for the constructor or
extractor pattern op(p,q)
. The
precedence and associativity of
operators in patterns is the same as
in expressions.
Which is pretty much all there is to it, but then you can read about constructor and extractor patterns and patterns in general. It helps separate the syntactic sugar aspect (the "magic" part of it) from the fairly simple idea of pattern matching:
A pattern is built from constants,
constructors, variables and type
tests. Pattern matching tests whether
a given value (or sequence of values)
has the shape defined by a pattern,
and, if it does, binds the variables
in the pattern to the corresponding
components of the value (or sequence
of values).