[r-t] Big search
mark at snowtiger.net
Tue Jan 22 20:05:36 UTC 2008
> I hope he enjoys sorting through > 400 billion compositions....
Already done! The machine does it as it goes along: only compositions passing
the music filter get written to disk. I saved about 70,000 compositions out
of the 458 billion, giving a file size of about 4MB. This is a "raw" file,
with no music stats apart from the total score for each comp, so not so easy
for a human to look through. SMC32 has a separate analysis mode which parses
this file (against the same or different music criteria) and produces
human-readable output with music counts. It takes about 4 seconds to do the
70,000 compositions, so it is much slower per composition than the main
search, but quick enough.
It will still take me a while to analyse the compositions that I have kept,
even though I'll apply further filters. It is always worth taking time in
setting up the search, and then in analysing the results.
> Fortunately, you weren't writing an air traffic control system.
Yes, and if the "time taken" figure had been critical to the search (i.e. an
error in it would have crashed the search out), I'd have thought about it
harder. Most of the counters
are 64-bit where they need to be.
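
As a rough illustration of why counter width matters for a search of this size: the 458 billion compositions mentioned earlier do not fit in an unsigned 32-bit counter, which tops out just under 4.3 billion, but fit easily in 64 bits. The post does not say which counters are 64-bit, so the helper names below are hypothetical and only the 458 billion figure comes from the message. A minimal Python sketch of the arithmetic:

```python
# Hypothetical helpers for illustration; not taken from SMC32.
COMPOSITIONS_FOUND = 458_000_000_000  # approximate total quoted in the post


def fits_unsigned(value: int, bits: int) -> bool:
    """Return True if value is representable as an unsigned integer of this width."""
    return 0 <= value < (1 << bits)


def wrapped(value: int, bits: int) -> int:
    """Reading an unsigned counter of this width would show after silently wrapping."""
    return value % (1 << bits)


if __name__ == "__main__":
    print("fits in 32 bits:", fits_unsigned(COMPOSITIONS_FOUND, 32))
    print("fits in 64 bits:", fits_unsigned(COMPOSITIONS_FOUND, 64))
    print("a wrapped 32-bit counter would read:", wrapped(COMPOSITIONS_FOUND, 32))
```

A wrapped counter gives no error, just a plausible-looking wrong number, which is why it matters more for counters that control the search than for ones that only report on it.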
> The very high number of compositions found (100,000 per second) will have
> had a significant slowing effect on progress.
Hmm, it's not too bad, Graham. With your "bobs-only extent" search I get
90MN/s; I've just run the first minute or so of the Big Search (which writes
out over 40 compositions to disk, so IO as well as CPU spent on composition
processing) and it achieved 80MN/s. I know the whole search averaged
somewhat less than that, but I suspect that was primarily because I was
running at lower processor priority. It doesn't take very long to analyse a
composition during the search, because the main music "score" is a single
figure calculated by the search loop. The first step in evaluating a
composition is just a check for length, then a comparison of this single
music score against the filter. Both are extremely cheap operations, and
99.9% of the compositions found are rejected at this step. You only have to do slow
operations for the tiny percentage of compositions which pass the basic
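
The cheap-checks-first evaluation described above might be sketched as follows. This is not SMC32's code: the `Candidate` type, the thresholds, and the `slow_analysis` callback are all hypothetical, and Python is used only for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Candidate:
    """Hypothetical record for a composition found by the search loop."""
    length: int  # composition length in changes
    score: int   # single music score accumulated during the search


def evaluate(
    c: Candidate,
    min_length: int,
    max_length: int,
    min_score: int,
    slow_analysis: Callable[[Candidate], bool],
) -> bool:
    """Return True if the composition should be kept."""
    # Step 1: length check, a pair of integer comparisons.
    if not (min_length <= c.length <= max_length):
        return False
    # Step 2: compare the precomputed score against the filter threshold.
    if c.score < min_score:
        return False
    # Step 3: only the few survivors reach the expensive work.
    return slow_analysis(c)


if __name__ == "__main__":
    slow_calls = 0

    def slow_analysis(c: Candidate) -> bool:
        """Stand-in for detailed checks; counts how often it is invoked."""
        global slow_calls
        slow_calls += 1
        return True

    found = [Candidate(5040, 20), Candidate(4000, 300), Candidate(5040, 250)]
    kept = [c for c in found if evaluate(c, 5000, 5200, 100, slow_analysis)]
    print(len(kept), "kept;", slow_calls, "slow analyses run")
```

Because the score is already available when a composition is found, the two early returns cost almost nothing, so the per-composition overhead is dominated by the rare candidates that reach step 3.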