Reprinted from the Analog Magazine website
Science fiction writers often put a lot of work into researching and developing the backgrounds for their stories. Typically they find and/or invent far more fascinating material than they can smoothly incorporate into the story itself. Part of learning to be a good SF writer is learning to resist the temptation to shoehorn all those extra sidelights into a story, yet many of them are of considerable interest in their own right. "The Science Behind the Story" is a new forum in which we invite writers to show off some of that "behind-the-scenes" work. It's also a place for interested readers to learn more about the science behind some of our stories, and how the stories grew out of it. In the future we hope many writers will be interested in participating in "The Science Behind the Story," and that we'll be able to post their entries at the same time their stories appear. To start things off, though, we have a fascinating offering about "The Fruitcake Genome," which appeared in our December 2004 issue. Author Carl Frederick is himself a scientist (a theoretical physicist, to be specific) and a serious dabbler in many other fields including linguistics. His offering here includes a first: your chance to hear for yourself the musical version of the Drosophila melanogaster genome! (And since the story has already appeared, we also include a copy of it here.) --Stan Schmidt
Click HERE to read the story, 'The Fruitcake Genome'.
The Science Behind the Story
The story, 'The Fruitcake Genome', is about receiving genetic information from space--presumably from an alien life-form. I can't think of anything I'd find more intellectually exciting than for SETI to succeed--to find an extraterrestrial intelligence sending information out into the cosmos. Perhaps that is a particular weakness for us astrophysicists. It's hard for me not to think of Carl Sagan's 'Cosmos', or Sir Fred Hoyle's great novel of years back, 'The Black Cloud'.
But the story was written after I finished a computer program to generate 'music' from parts of the fruit fly genome. I used the output of the program as input to a music synthesizer and was frankly 'blown away' when I listened to the music. It, to me at least, sounds 'composed'--and good! I called the piece, 'The Little March of the Fruit Flies'. You can click to hear an MP3 of it. I'd be very interested to hear what people think of it. Maybe any non-random data stream would make music--I don't know. Or maybe there's something intrinsic to the genome that makes it 'musical'.
As for why I wrote this program: from my years of examining squiggles on chart recorders--trying to pull signals from noise--I got the notion that the eye, while great for two- and three-dimensional pattern detection, is far inferior to the ear for finding one-dimensional patterns. And from my vantage point as one of the world's most dreadful violinists, I thought that music might be the right paradigm for an examination of one-dimensional information streams. Arguably, the genome is the most important linear information stream on the planet--so I decided to write a computer program to translate DNA sequences to musical notes.
The program (Genomeplayer) associates elements of the fruit fly genome with musical notes and durations. The output of Genomeplayer (a 'score' file) was sent to another of my programs (Kral), which takes score files and 'plays' them with 'real' instruments. The instruments are described in 'instrument files', which can handle enormous volumes of data on how particular instruments sound under thousands of different conditions. While the Kral program is finished (or as finished as any computer program usually is), I haven't yet had time to create really credible instruments.
To keep to 'the spirit of the genome', I did the conversion not from the bases (adenine, cytosine, thymine, guanine) but from the 'codons'--the sets of three bases that code for an amino acid. There are 64 (4 x 4 x 4) possible codons and only 20 amino acids. Amino acids, then, are represented by a variable number of codons--from six down to one. I associated the amino acids having the most codon representations with the most common notes of the C major scale. I also tried to code so that only the exons (the actual expressed gene sequences, as opposed to the introns, the noise) were used. For this, I needed to use methionine and the three codons that don't represent an amino acid as start and stop indicators. In addition, I used them as musical codas (endings of a section of music). I used one particular amino acid as a flag to indicate that the following codon represented a change of note duration, or a sharp or flat. These two-codon instructions, since they are comparatively rare, do not drive the music out of C major and don't change the tempo particularly often.
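For readers who like to see such things spelled out, here is a minimal Python sketch of the general idea described above. It is not the Genomeplayer code, which isn't reproduced here. The codon table is the standard genetic code; everything else is an assumption made for illustration: the ranking of C major notes by 'commonness' (NOTE_RANK), the alphabetical tie-breaking among amino acids with equal codon counts, and the names NOTE_FOR and sequence_to_notes. The flag amino acid for duration changes, sharps, and flats is left out entirely.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons that represent each amino acid (six down to one).
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

# ASSUMPTION: a guessed ordering of C major notes from most to least common.
NOTE_RANK = [f"{name}{octave}" for octave in (4, 5, 3) for name in "CGEDFAB"]

# Methionine is reserved as the start marker, so it gets no note of its own.
# Amino acids with the most codons get the most common notes;
# ties are broken alphabetically (ASSUMPTION).
ranked = sorted((aa for aa in degeneracy if aa != "M"),
                key=lambda aa: (-degeneracy[aa], aa))
NOTE_FOR = dict(zip(ranked, NOTE_RANK))


def sequence_to_notes(dna):
    """Turn a DNA string into a list of note names.

    Scanning starts at each ATG (methionine) and reads codons in frame
    until a stop codon, which is emitted as a 'CODA' marker.
    """
    dna = dna.upper()
    notes = []
    i = 0
    in_section = False
    while i + 3 <= len(dna):
        aa = CODON_TABLE.get(dna[i:i + 3])
        if not in_section:
            if aa == "M":
                in_section = True
                i += 3
            else:
                i += 1  # keep sliding one base at a time until a start codon
            continue
        if aa == "*":
            notes.append("CODA")
            in_section = False
        elif aa in NOTE_FOR:
            notes.append(NOTE_FOR[aa])
        # Anything else (an internal methionine, an ambiguous codon) is skipped.
        i += 3
    return notes


print(sequence_to_notes("CCATGCTGTCTGCTTGGTAAGG"))
```

Under these assumptions, leucine (six codons) lands on middle C and tryptophan (one codon) on a low F. A real score would also need durations, which this sketch ignores.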
To my ear at least, the translation sounds remarkably musical--especially since the 'music' was untouched by human hands. (I wonder who owns the copyright.) But then again, when I was a kid, my violin teacher taught me to play the soda straw (I'm serious). I think now, he was trying to tell me something.
When I did the first translation paradigm file (a 'method file'), I thought the result sounded rather like classical music. I assumed that was because classical music, while concentrating on melody, is not as rhythmically advanced as popular music. Although I didn't have a particularly logical schema for changing note durations, I nonetheless did a second method file, this time with more note duration changes. The results sound, to me, rather like jazz.
Recently, I've recast the program to convert virtually any linear data stream to 'music'--stock market data, global climate data, even poetry and fiction. I wonder whether, if I convert some of my fiction to music, I'll be able to hear patterns--particularly elements of bad technique, e.g., repeated words, heavy use of adverbs, or a bad balance between narrative and dialog. I'll take any help that I can get.
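As a rough illustration of how a numeric data stream might be turned into notes, here is a small Python sketch. It is only one possible scheme and not the recast program itself: the function name series_to_notes, the one-octave C major scale, and the choice to divide the data's range into equal-width bins are all assumptions made for the example.

```python
# ASSUMPTION: one octave of C major is enough for a demonstration.
C_MAJOR = ["C4", "D4", "E4", "F4", "G4", "A4", "B4", "C5"]


def series_to_notes(values, scale=C_MAJOR):
    """Map each number to a note by splitting the data's range into equal bins."""
    values = list(values)
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat series has no range to divide; play the lowest note throughout.
        return [scale[0]] * len(values)
    top = len(scale) - 1
    notes = []
    for v in values:
        fraction = (v - lo) / (hi - lo)                 # position within the range, 0..1
        index = min(int(fraction * len(scale)), top)    # bin number, clamped at the top
        notes.append(scale[index])
    return notes


print(series_to_notes([10, 20, 15, 30]))
```

Text would need a different first step, such as turning each word into a number (its length, say, or its frequency so far), before a mapping like this could be applied.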
There's a nifty little primer on genetics on the GlaxoSmithKline website, at http://genetics.gsk.com/overview.htm#overview
You can find the entire fruit fly (Drosophila) genome at the Flybase website--a wonderful resource--at http://www.flybase.org