[r-t] Scala, part three
Mark Davies
mark at snowtiger.net
Thu Jul 5 23:02:09 UTC 2012
Right, I'm not sure how many are still with me (hands up if you're
there?!) but here's the last article in my Scala-for-composers series.
Here I'll focus on the remaining component of the Helixoid Pipeline
described in previous emails: the search for splice-sisters. We'll see a
bit more Scala, including the promised "node tables without table-build"
optimisation.
The starting point for this stage in the pipeline is a list (or
Iterator) of 1200 Double Helixoid methods which have the correct
congruency property. Using the stage-2 TV expansion described in the
last article, this immediately expands to approximately 73,000 methods,
each of which we need to check for course-splices.
The simplest way to find those course-splices is to run the Helixoid
tree-search again, modified to prune off methods which deviate from the
set of rows in the plain course of the method under consideration.
Running 73,000 separate tree-searches sounds like a recipe for a long
wait, but in fact the pruning achieved by the course-splice restriction
is so severe that the whole thing runs quicker than the original,
unrestricted search.
The modification to the original search algorithm (given in article 1)
is very minor. Previously, our inner loop looked like this:
for (pn <- AllPN.filter{ isRightPlace(_, pnSoFar.head) })
{
  val newRow = row.apply(pn)
  if (!rowsThisTime.contains(newRow) &&
      !rowsUsed.contains(newRow))
  {
    ... recursive call ...
  }
}
All we need to do is add another check to the if() statement, to ensure
each new row of the new method exists in the plain course of the old
method:
if (!rowsThisTime.contains(newRow) &&
    !rowsUsed.contains(newRow) &&
    oldMethod.fullCourse.contains(newRow))
Well - not quite. Such methods will not be course splices unless the
leadhead group is the same. The tree search only produces the PN up to
the half-lead, so we need to try out different half-lead and leadhead
PNs to see if we can get the right kind of leadheads.
As we'd expect from Scala, filtering our results to find methods with
the right leadhead is straightforward. First we get the set of allowable
leadheads: given we have generated the entire plain course of the source
method anyway, this boils down to picking the 3rd, 6th, 9th and 12th
leadheads (these will have 123 home).
The quickest way to deal with sets of numbers like this is to use a
Range, another component of Scala's collections library. There are
several ways to create Ranges:
Range(0, 10)
0 until 10
0 to 9
All three of these create a Range object which, when iterated through
using any of the normal collections functions, will produce the numbers
0 to 9. You can also apply a step:
Range(0,10,3)
0 until 10 by 3
0 to 9 by 3
All three give 0, 3, 6, 9. Note that "until", "to" and "by" are just
ordinary functions, the first two operating on numbers, and here shown
using infix notation. We could equally well write
0.until(10)
0.to(9).by(3)
Anyway, using a Range, extracting the right leadheads from the full
course is a one-liner:
val goodLHs = (3 to 12 by 3).map{ n => helixoid.fullCourse(n*HalfLeadLength*2) }
The other part of the job is to generate all the possible leadheads for
the candidate method, given just the first half-lead. Again, it's a
single expression:
List("56","36","16","58","38","18").
map{ new Method(firstHalf++List(PN(_)), PN("14")}.
filter{ goodLHs.contains(_.fullCourse(HalfLeadLength*6)) }
This is simplified by the fact that the nature of my composition was
such that a 14 leadhead was required, so it was just the half-lead PN
possibilities which I needed to check.
There is one more job left to do. This stage of the pipeline produces
quite a lot of splice-sisters, but most of them are pretty poor, with
only very minor variations. It would be nice to output some kind of
score in order to spot examples of very different methods which still
splice. This simple function did the trick for me:
def methodDiff(h1: Method, h2: Method) =
{
  val pns1 = h1.pns.slice(0, HalfLeadLength-1)
  val pns2 = h2.pns.slice(0, HalfLeadLength-1)
  pns1.zip(pns2).filter{ pair => pair._1 != pair._2 }.size
}
This final piece of functional loveliness uses the zip() function, again
provided by the rich API of Scala's collections library. Zip takes two
collections and returns a single collection of pairs of elements. So we
take the two sets of place notation from two splice-sister candidates,
zip them together, and count how many times the PN elements in the
zipped pairs differ.
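If zip is new to you, a quick session in the Scala REPL shows the idea
(the lists here are just made-up examples):
scala> List(1, 2, 3).zip(List("a", "b", "c"))
res0: List[(Int, String)] = List((1,a), (2,b), (3,c))
scala> List("x2","x4").zip(List("x2","34")).filter{ p => p._1 != p._2 }.size
res1: Int = 1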
And there we have it - the final part of the pipeline, and, a few
seconds later, half-lead splice-sisters H and J popping out of the matrix.
However, I want to stop at this point and re-examine the search
algorithm. Compared to "production-quality" machine searches, such as
that implemented by SMC32, our code suffers from an obvious performance
deficit: we are generating permutations in the inner loop. The line of
code which does this doesn't look very significant:
val newRow = row.apply(pn)
However the implementation of the Row function "apply(pn: Pn)" is hidden
from us, and we can assume it is not very quick. My implementation holds
both the Row and the PN as arrays, and steps through the latter in order
to work out which bells in the former to swap. Whilst fast enough for
our purposes here, a proper machine-search would not wish to incur this
penalty. Instead, a table of nodes would normally be pre-built, one
for each row in the search space, each containing a list of pointers to
the next node for each allowable permutation. (For some search
algorithms, nodes are also a home for false-node pointers, music counts,
and the like.)
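To make that concrete, here's a rough sketch of what a pre-built node
table might look like, reusing the Row, PN and AllPN names from article
1. The Node class and buildNodeTable function are my inventions for
illustration, and assume every successor row is itself in the search
space:
// A sketch of a classic pre-built node table. One Node per row in the
// search space; next(i) points directly at the node reached by applying
// AllPN(i), so the search can step node-to-node without permuting rows.
class Node(val row: Row)
{
  val next = new Array[Node](AllPN.size)
}

def buildNodeTable(allRows: Seq[Row]): Map[Row, Node] =
{
  val table = allRows.map{ r => (r, new Node(r)) }.toMap
  for (node <- table.values; i <- 0 until AllPN.size)
    node.next(i) = table(node.row.apply(AllPN(i)))
  table
}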
Unfortunately when writing quick-and-dirty one-off searches it is quite
hard to motivate oneself to go to the trouble of writing the node-table
build. Scala offers another way. Built into the language is the concept
of a lazy variable, which is only initialised when referenced. I don't
use that here, but what if we treated the node table like that, and
built it as we needed it? Looked at this way, the node table is just a
cache of previous row.apply(pn) results.
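(In passing, a lazy val looks like this; buildTable() is just a made-up
stand-in for some expensive initialisation:)
// The right-hand side only runs the first time "table" is referenced
lazy val table = buildTable()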
We can implement a cache using a Map. Here's one suitable for our purposes:
import scala.collection.mutable
val nodeMap = mutable.Map[(Row,PN), Row]()
This declares a mutable Map data structure, keyed on a pair (Row,PN),
and having another Row as the lookup value. Now before applying a PN to
a Row, we'll first look it up in the nodeMap, and if it's there, use the
result. If not, we'll carry out the PN apply and store it back to the
map for future use.
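Spelled out by hand, that lookup-or-compute logic might look like this:
// Manual cache logic: check the map first, fall back to a real PN apply
val newRow = nodeMap.get((row, pn)) match
{
  case Some(cached) => cached            // seen this (Row, PN) pair before
  case None =>
    val r = row.apply(pn)                // compute the permutation...
    nodeMap((row, pn)) = r               // ...and remember it for next time
    r
}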
That still sounds like a bit too much coding for the hard-pressed
composer; but again Scala has something clever to help us. Using Maps as
caches is so common that the collections library provides a method,
getOrElseUpdate(key, defaultValue), which does all of that for us. Here
it is at work. Remember, previously we had:
val newRow = row.apply(pn)
Using the nodeMap we can change this to:
val newRow = nodeMap.getOrElseUpdate((row,pn), row.apply(pn))
The getOrElseUpdate method first tries to get the value out of the Map
using the key passed in the first parameter (here a pair). If this
fails, it then takes the second parameter, puts it into the Map with the
same key, and returns it to you. Exactly what you want a cache to do,
and you don't have to remember to code it yourself. (I have done so in
Java, many times, and once or twice got it wrong - if for instance you
forget to add the new value to the map, the code works, it's just
somewhat slower than it ought to be... very hard for the usual kind of
testing to find!)
But hold on - isn't the second parameter, row.apply(pn), going to be
executed every time? Normally, this would be the case, but Scala
supports the concept of pass-by-name parameters, which are only
evaluated if the function uses them. The getOrElseUpdate method marks
its second parameter as pass-by-name, hence we only incur the cost of
the PN permutation once.
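Here's the mechanism in isolation, as a toy example of my own; the =>
before the parameter type is what makes it pass-by-name:
// msg is pass-by-name: its body only runs if the function actually uses it
def logIf(enabled: Boolean, msg: => String) =
  if (enabled) println(msg)

logIf(false, { println("building message"); "noisy" })  // prints nothing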
So that's it - node table implemented. Well, perhaps not quite. We've
replaced one cost, that of PN permutation, with another, hash table
lookup. Hash tables are pretty clever these days, so it's likely we've
improved things, but maybe not enough. An old-school node table doesn't
do any lookups, but steps directly from node to node, with an array
index at most.
Now, I haven't implemented this yet, but I am pretty sure we could get
even cleverer, and implement "true" nodetables as a library function,
independent of our permutation type, and almost entirely transparent to
our search algorithm. I won't go into this in depth, but Scala supports
the concept of "traits", which are a bit like a combination of Java
interfaces and Ruby mixins. I think we should be able to code a library
trait called Node, which can be mixed in whenever we need true
nodetables. The search algorithm wouldn't change, but all of a sudden
our row.apply(pn) call would become nodetable-efficient, because we'd
mixed in the Node trait to the Row class. I shall get back to you when
I've coded this. :-D
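For anyone who hasn't met traits, here's a toy example of the mixin idea
(nothing to do with nodetables, just the syntax):
// A trait can carry implementation, like a Ruby mixin:
trait Shouty
{
  def shout(s: String) = println(s.toUpperCase + "!")
}

// Mix it in when defining a class...
class Ringer extends Shouty

// ...or even when creating a single object:
val r = new AnyRef with Shouty
r.shout("bob")   // prints BOB!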
In the meantime, the caching idea is pretty neat, eh?
Well, I think that'll do. I hope I haven't bored everyone to total
suicide. Indeed, I hope I've whetted some appetites for the possibilities
opened up by this excellent new language.
Are there any downsides to Scala? Yes, certainly. To make the most of it
you may have to get your head into a new way of working, especially if
you haven't previously experienced functional programming. The tooling
support has struggled at times; there are now pretty good plugins for
both Eclipse and IntelliJ (I'd recommend the latter), but such is the
power and complexity of the language that things like auto-complete
aren't yet as superbly slick as they are with Java. Compile times are a
bit slower, too. And debugging functional code can be tricky - sometimes
it's helpful to expand out a series of map() and filter() operations to
see what the hell is going on in there.
But overall, could I have coded up this Helixoid stuff in any other
language, in the late nights and lunch breaks which are, it seems, all
the spare time I have these days? The answer is certainly not. In Java,
a language I know inside-out after 15 years of working with it, I
wouldn't even have bothered trying some of the things I experimented
with during this project. With Scala they were easy, and fun.
So there we are - Scala, a language for late nights and lunch breaks.
Try it, let me know what you think. Here, again, are the links you'll need:
<http://www.scala-lang.org/>
<http://www.amazon.co.uk/Programming-In-Scala-2nd-Edition/dp/0981531644/ref=sr_1_1?s=books&ie=UTF8&qid=1340897990&sr=1-1>
MBD