[r-t] PB6 search completes
mark at snowtiger.net
Sat Feb 6 14:42:32 UTC 2010
Ian Broster writes,
> Regarding output size, compression is the key, not file size. As I noted,
> when I started this search, the output was as big as my disk...
> Why not just do something akin to
> program | bzip2 - > output.bz2
My disk is plenty big enough to hold the uncompressed results, but
compressing wouldn't help my file output problems, unless I put the
compression algorithm into the program directly.
The main problem is that SMC32 searches are designed to be restartable,
so as well as the composition output, checkpoint data is written to
disk. In order to write another buffer-full of compositions, the file
pointer must be rewound slightly in order to overwrite the previous
checkpoint. I keep a 64-bit file pointer, but the stdio library I use
only appears to use the low 31 bits, which is annoying. Hence my seek
fails when the file size reaches 2GB.
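
To make the rewind-and-overwrite pattern above concrete, here is a minimal sketch in Python. It is not SMC32's actual code or file format: the one-field checkpoint layout and the names append_with_checkpoint and fits_in_31_bits are assumptions made purely for illustration.

```python
import io
import struct

# Hypothetical checkpoint layout: a single 64-bit little-endian counter.
CHECKPOINT_FMT = "<Q"

def append_with_checkpoint(f, data_end, buffer, state):
    """Overwrite the old checkpoint with new output, then write a fresh one.

    data_end is the offset where composition data ends, which is also
    where the previous checkpoint (if any) begins.
    """
    f.seek(data_end)                 # rewind over the previous checkpoint
    f.write(buffer)                  # the new buffer-full of compositions
    f.write(struct.pack(CHECKPOINT_FMT, state))  # fresh checkpoint at the tail
    f.flush()
    return data_end + len(buffer)    # new end of data, 64-bit safe in Python

def fits_in_31_bits(offset):
    """True if a seek API limited to the low 31 bits could reach this offset."""
    return 0 <= offset < 2**31

# Usage: two buffer-fulls written to an in-memory file.
f = io.BytesIO()
end = 0
end = append_with_checkpoint(f, end, b"comp1\n", 1)
end = append_with_checkpoint(f, end, b"comp2\n", 2)
```

After the second call the file holds both buffers followed by only the latest checkpoint. The seek in the first line of the function is the step that breaks once data_end no longer fits in 31 bits.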
I could try recoding it so that the checkpoint is written to a separate
file, but that would be less neat for most searches. I'm inclined to
think that no-one really needs to see 1.4 billion extents of Plain Bob
Minor. You could filter down to particular types of composition using
music checks (H/B distinctions, perhaps, or wraps) or call counts if you
really wanted to see some.
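
For comparison, the separate-checkpoint-file alternative mentioned above could look roughly like the sketch below. Again this is an assumption rather than anything SMC32 does: the JSON state, the ".tmp" suffix and the function names are hypothetical, and the write-then-rename step is a common technique I have swapped in so that a crash never leaves a half-written checkpoint.

```python
import json
import os

def save_checkpoint(path, state):
    """Write the checkpoint to a temporary file, then rename it into place."""
    tmp = path + ".tmp"              # hypothetical temporary name
    with open(tmp, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())         # push the data to disk before renaming
    os.replace(tmp, path)            # replaces any previous checkpoint

def load_checkpoint(path):
    """Read back the most recently saved checkpoint."""
    with open(path) as f:
        return json.load(f)
```

With this arrangement the output file is only ever appended to, so no backward seek is needed, at the cost of keeping two files in step.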