THE TREMOLO swell of the flute rises to a resonant peak, the delayed accent falling a fraction of a second behind the beat and . . . no good. The harpsichord accompanist has muffed it. Playing straight from the score, he has come in too early for the flute's distinctive phrasing. They work through it again and again, until by the sixth pass the accompanist has adjusted to the precise timing of the soloist's style, and pauses for the perfect interval.

Another rehearsal is over. The soloist picks up his coat and leaves. But the accompanist can't. It's wired to the floor.

The accompanist is a computer model called The Synthetic Performer, one of the many ways in which MIT's Experimental Music Studio is exploring the computer as an instrument, a composition tool, a polyphonic palette of heretofore unhearable sounds. In Cambridge as elsewhere, cybernetic composers are pushing the definition of music toward the outer edge of imagination.

The Digital Ensemble

Since the dawn of rhythm, the composition and performance of music have demanded the use of human musicians and acoustic instruments. The composer who wrote a concerto couldn't hear it until he'd contrived to assemble a soloist and an orchestra of several dozen musicians -- a formidably costly undertaking. But over the last two decades, those constraints have gradually disappeared as recording techniques and music "synthesizers" have made it possible to duplicate, store and manipulate the sounds of familiar instruments.

And now computer technology is on the verge of freeing the sound of music from virtually all physical limitations and its creation from any formal skills of the composer. Digital data storage {see box} makes it possible to save, copy and manipulate every musical component from a single kazoo note or snare-drum snap to a tutti orchestral roar. The computerized processing of that data provides the opportunity for formerly unimaginable combinations and transformations of sound; and the application of "artificial intelligence" routines to those systems can produce devices like The Synthetic Performer.

"At first it was pretty dumb," says EMS director Barry Vercoe, "but gradually we've improved it to where it approaches the intelligence you'd expect from a human performer." As for adapting to a soloist, "it turns out that it takes it about five or six rehearsals -- about the same as it takes live musicians."

Vercoe's research objective is to create a system that can "understand the dynamics of live ensemble performance well enough to replace any member of the group by a computer model so that the remaining live members cannot tell the difference." That entails the near-instantaneous detection of loudness, pitch, subtle variations in tempo and nuances of "attack." This is hard enough for human professionals, whose on-board wetware processor (or "brain") is capable of handling millions of stimuli simultaneously. For a mere machine, it is a stupefying challenge.

To meet it, Vercoe and his colleagues designed a control structure that simultaneously combines three forms of input: 1) the original musical score, 2) a bank of audio monitors to detect the soloist's sounds, and 3) a set of optical sensors mounted on the flute. Those sensors can see a finger begin to move long (in computer terms) before the intended note is played.
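In rough outline, the fusion works something like the Python sketch below. Every name, data structure and threshold here is invented for illustration -- the EMS system is not published in this form -- but the central trick is visible: the optical data can anticipate a note before the microphones hear it.

```python
# Hypothetical sketch of three-way input fusion: the score says what should
# come next, the microphones report what is sounding, and the optical
# sensors report what the fingers are preparing. All names are invented.

SCORE = [
    {"pitch": "E5", "fingering": (1, 1, 0)},   # toy 3-key fingerings
    {"pitch": "F5", "fingering": (1, 0, 0)},
]

def next_event(position, heard_pitch, finger_state):
    """Anticipate the soloist's next note: finger data arrives long
    (in computer terms) before the corresponding sound does."""
    upcoming = SCORE[position + 1]
    if finger_state == upcoming["fingering"]:
        return "onset imminent: " + upcoming["pitch"]
    if heard_pitch == SCORE[position]["pitch"]:
        return "holding: " + heard_pitch
    return "listening"

print(next_event(0, "E5", (1, 0, 0)))   # fingers already set for F5
```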

The data are then run through various procedures, including the revolutionary "4X" audio processor developed at French composer-conductor Pierre Boulez' studio. It can conduct eight activities simultaneously and is capable of handling as many as 500 independent sound sources.

The Synthetic Performer determines how fast or slow it should play through a two-stage process. First, every 12 milliseconds it samples the human soloist's tempo and saves the information in clusters Vercoe calls "beat bins." Then every 200 milliseconds or so the contents of the beat bins are read to determine the live player's apparent position in the score.

The program compares that position with the Synthetic Performer's score position, and, says Vercoe, settles on "an appropriately graceful catch-up action. Just five or so such determinations per second seems to represent adequately the manner in which performers do this kind of thing."
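The figures Vercoe gives -- 12-millisecond samples, 200-millisecond reads, roughly five corrections per second -- are enough to sketch the loop. In the Python below, the bin-averaging and the proportional catch-up follow his description, while the gain and the tempo clamps are invented stand-ins for whatever the real system uses:

```python
# A self-contained sketch of the two-stage tempo tracker described above.

def read_beat_bin(samples):
    """Average the tempo samples collected since the last read (one
    "beat bin": filled every 12 ms, emptied roughly every 200 ms)."""
    return sum(samples) / len(samples)

def catch_up(live_pos, own_pos, gain=0.3):
    """Settle on a graceful catch-up: scale the playback rate in
    proportion to the position error rather than jumping outright."""
    error = live_pos - own_pos               # beats ahead (+) or behind (-)
    rate = 1.0 + gain * error
    return max(0.25, min(4.0, rate))         # clamp to a sane tempo range

# One simulated 200 ms read: the soloist has been rushing slightly.
bin_samples = [1.02, 1.03, 1.05, 1.04] * 4   # ~16 samples, 12 ms apart
live_tempo = read_beat_bin(bin_samples)
live_pos, own_pos = 8.4, 8.25                # beats: soloist vs. machine
print(live_tempo, catch_up(live_pos, own_pos))  # a modest speed-up
```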

Eventually, Vercoe hopes to devise a system that can perform adequately without the rehearsal: "The most demanding test of a Synthetic Performer is how well it behaves in the absence of previously gathered information -- by sight-reading, as it were, on the concert stage."

Inside the Cube

The Synthetic Performer is only one of many projects underway at the EMS, a part of MIT's "Media Lab" -- an academic section devoted to examining creative ways of using computers.

Another is the design and development of "hyperinstruments," control consoles that give a musician or composer easy access to a vast range of musical powers by allowing a single action -- like a simple keystroke -- to control multiple effects at once. In one system, designed for a piano-like keyboard, the computer senses how fast or hard the key is depressed and translates that speed into one of several rhythmic patterns for playing the desired note.
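A toy version of that mapping fits in a few lines of Python. The three patterns and the velocity thresholds below are invented for illustration; the actual hyperinstrument mappings are far richer:

```python
# Map how hard a key is struck (0-127, MIDI-style velocity) to one of
# several rhythmic patterns for playing the desired note.

PATTERNS = {                       # onset times within one bar, in beats
    "gentle":     [0.0, 2.0],
    "syncopated": [0.0, 1.5, 2.5],
    "driving":    [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5],
}

def pattern_for_velocity(velocity):
    if velocity < 40:
        return PATTERNS["gentle"]
    if velocity < 90:
        return PATTERNS["syncopated"]
    return PATTERNS["driving"]

print(pattern_for_velocity(30))    # a soft strike: a sparse figure
print(pattern_for_velocity(110))   # a hard strike: the beat picks up
```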

Composer-professor Tod Machover demonstrates. He is standing at the dual-level keyboard of a synthesizer on the dimly lit floor of The Cube -- a 64-foot-square room acoustically suspended inside MIT's Wiesner building -- all six surfaces of which are covered with gray sound-damping material, giving visitors the sense of being trapped inside a giant charcoal briquet. He strikes a key softly, and a note begins to sound in a syncopated rhythm. He strikes it again hard, and the beat picks up.

Related programs allow patterns to be invoked, superimposed over one another or modified at will during performance. Machover taps another key, and a beat drops out of the rhythmic sequence; hits yet another twice and two staccato pops are added on top of the base rhythm, which in turn can be altered to sound like almost any instrument or entity.

Composition on that sort of system eliminates the need for such traditional skills as the ability to read music, since the computer can display sounds and sequences in any form -- moving shapes, graphic blocks, color blobs or whatever representation is most congenial to the composer. And the "score" sequence can then be edited on a personal computer as easily as text in a word processor. The order of notes can be reversed with a single command, or two musical lines merged, and the results heard immediately.
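With notes as data, those operations are one-liners. In the sketch below, the (start-beat, pitch, duration) representation is an assumption made for illustration, not the EMS format:

```python
# Score editing as text editing: reverse a line's pitches, or merge two
# lines, with a single operation each.

def reverse(line):
    """Play a line's pitches in reverse order, keeping the rhythm."""
    starts  = [note[0] for note in line]
    durs    = [note[2] for note in line]
    pitches = [note[1] for note in reversed(line)]
    return list(zip(starts, pitches, durs))

def merge(line_a, line_b):
    """Combine two musical lines into one, ordered by start time."""
    return sorted(line_a + line_b)

melody = [(0.0, "C4", 1.0), (1.0, "E4", 1.0), (2.0, "G4", 2.0)]
bass   = [(0.0, "C2", 2.0), (2.0, "G2", 2.0)]
print(reverse(melody))        # G4, E4, C4 over the original rhythm
print(merge(melody, bass))    # both lines interleaved in time
```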

That sort of control was not possible on the earliest synthesizers created by such sonic pioneers as Robert Moog. Those "analog" machines mimicked various instruments by using electronic oscillators to make combinations of frequencies. They transformed the nature of pop music, but remained disappointingly unable to reproduce the complex tones and shadings of real acoustic instruments. By the late '70s, Stevie Wonder was lamenting that lack to Raymond Kurzweil, head of Kurzweil Music Systems in Waltham, Mass. The company had produced a device to read books aloud (Wonder was one of the first users), and Kurzweil began pondering whether the same kinds of digital computer routines could be used to analyze and "read" back musical notes.

By that time, analog-to-digital conversion routines {see box} were becoming familiar, allowing Kurzweil and others to take the most complex sonic events and encode them in computer language. Eventually music-hardware manufacturers arrived at a set of standards for data formats, transmission methods and hardware couplings. Known as MIDI (Musical Instrument Digital Interface), this standard allows a Macintosh or IBM-compatible personal computer to be connected to a myriad of musical devices.
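The wire format MIDI defines is simple enough to show directly: a "note on" message is three bytes -- a status byte carrying the channel, then the note number and the striking velocity. The sketch below builds the raw bytes; sent to any MIDI port, they would sound the note on whatever synthesizer is listening:

```python
# Raw MIDI messages: Note On is 0x90 plus the channel, then note, velocity.

def note_on(channel, note, velocity):
    """Build a 3-byte MIDI Note On message."""
    assert 0 <= channel < 16 and 0 <= note < 128 and 0 <= velocity < 128
    return bytes([0x90 | channel, note, velocity])

def note_off(channel, note):
    """Build the matching Note Off message."""
    return bytes([0x80 | channel, note, 0])

print(note_on(0, 60, 100).hex())   # '903c64' -- middle C, channel 1
print(note_off(0, 60).hex())       # '803c00'
```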

And now the theater of experiment is expanding into psychoacoustics, examining ways of modifying the textural nuances of sound. "We want to be able to change the acoustic ambience of the space in mid-note," says Vercoe. "You don't want to be hearing a cathedral sound if you're looking at a video image of a little room. But when the camera pans through the door and goes outside, you should suddenly feel as if you are outside. That control becomes part of the piece, and gives the composer more choice."
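One way to realize such a mid-note change is to run the same dry signal through two different reverberations and crossfade between them. The NumPy sketch below stands in synthetic decaying-noise impulse responses for the measured room and cathedral responses a real system would use; it illustrates the idea, not the EMS implementation:

```python
import numpy as np

rate = 22050
t = np.linspace(0, 2.0, 2 * rate, endpoint=False)
dry = np.sin(2 * np.pi * 440 * t)            # a sustained A, held mid-note

rng = np.random.default_rng(0)
def fake_ir(seconds, decay):
    """A stand-in impulse response: exponentially decaying noise."""
    n = int(seconds * rate)
    return rng.standard_normal(n) * np.exp(-decay * np.arange(n) / rate)

room      = np.convolve(dry, fake_ir(0.15, 40.0))[: dry.size]   # small room
cathedral = np.convolve(dry, fake_ir(1.50,  2.5))[: dry.size]   # long tail

fade = np.clip((t - 0.9) / 0.2, 0.0, 1.0)    # crossfade at about 1 second
wet = (1 - fade) * room + fade * cathedral   # the ambience changes mid-note
```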

There is also the promise of economic freedom. It has been apparent for several years that live orchestral -- and especially operatic -- performances will soon price themselves out of the reach of all but a tiny, affluent minority of music-lovers. Computer music promises an experience of comparably sumptuous depth at a fraction of the cost.

The Synthetic Opera

Another area of research involves modeling the human voice. For example, a program called Chant, originally developed at Pierre Boulez' IRCAM (Institut de Recherche et Coordination Acoustique/Musique), uses a technique called FOF (for "formant wave forms," the sound waves associated with vowels) to synthesize human singing and engineer transformations between vocal and non-vocal sounds.
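The kernel of the FOF technique can be sketched compactly: each formant contributes short, exponentially decaying sine-wave "grains," retriggered once per period of the sung pitch, and their sum approximates a voice singing the vowel. The formant values below are textbook approximations for an "ah," not Chant's actual data, and real FOF grains also carry a smooth attack that is omitted here:

```python
import numpy as np

rate = 22050
f0 = 220.0                                   # the sung pitch (fundamental)
formants = [(700, 0.20), (1220, 0.10), (2600, 0.05)]  # (Hz, amp) for "ah"

grain_len = int(0.02 * rate)                 # 20 ms grains
n = np.arange(grain_len)
out = np.zeros(rate)                         # one second of sound

for start in range(0, out.size - grain_len, int(rate / f0)):
    for freq, amp in formants:
        grain = amp * np.exp(-60.0 * n / rate) \
                    * np.sin(2 * np.pi * freq * n / rate)
        out[start : start + grain_len] += grain
```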

This technology is being refined continuously, and one of its most dramatic implementations will be heard in December, when Tod Machover's new "media opera," "Valis," premieres at the Pompidou Centre in Paris.

Based on a novel of the same name by the late science-fiction author Philip K. Dick, Machover's opera calls for only one actor with a speaking part and five singers. The set will be a mammoth abstract model of an integrated computer circuit, with 200 video monitors sloping toward the audience. Amid the special effects, video and projected images, the musicians -- a keyboard player and a percussionist -- will begin performing normal acoustic music, but their sound will gradually undergo live computer transformation. The manipulations of voice, however, are sure to be the most impressive feature of the show.

Using a computer device called the Phase Vocoder, Machover has developed programs capable of slicing a sentence into over 1,000 slivers, then recombining and remixing them. "The next step," he says, "is to use another program to define very slight time-warping for each of the subsidiary sounds, so that each one slides in and out of phase with the others, with very precise points where all the parts come exactly together again. The result is something one has not heard before: It is as if a human voice diffuses gradually as one listens, turns into pure timbre, and then snaps back into clear speech at integral parts of a sentence, to emphasize particular words or phrases. In this way, speech truly becomes music."
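The scheduling idea in that description can be mimicked on raw samples: cut the signal into slivers, let each one drift slightly in time, and force the drift back to zero at fixed sync points so the voice "snaps back" into alignment. The Python below is a toy run on a sine-wave stand-in for speech, not the Phase Vocoder itself:

```python
import numpy as np

rate = 22050
t = np.linspace(0, 1.0, rate, endpoint=False)
voice = np.sin(2 * np.pi * 180 * t)        # stand-in for recorded speech

sliver = int(0.010 * rate)                 # 10 ms slivers
out = np.zeros(voice.size + sliver)

for i, s in enumerate(range(0, voice.size - sliver, sliver)):
    phase = (i % 50) / 50.0                # drifts, then resets to zero:
    offset = int(sliver * 0.5 * np.sin(2 * np.pi * phase))  # "snaps back"
    dst = max(0, s + offset)
    out[dst : dst + sliver] += voice[s : s + sliver]
```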

Machover cues up the "Valis" theme song, and The Cube fills with a sort of living curtain of pulsing voices, out of which -- like figures bulging from a bas-relief -- words, then phrases begin to emerge, surging and receding, as if straining to become recognizable before sinking again into the vocal mass.

A further flourish will arrive at the end of the program. The sole actor's speaking voice will be monitored, reconstituted and turned into song so that he will be speaking his lines accompanied by his own voice singing.

What's next? "One of my dreams for a long time," Machover recently told Stewart Brand (author of "The Media Lab: Inventing the Future at MIT"), "has been to have compositions which are like living organisms."

This might take the form of a self-generating, conductorless symphony. A programmer could marshal an array of basic musical elements -- tones, chords, melodic forms, certain kinds of harmonies and progressions -- and then write some general rules for their interaction and simply set the system off to find its own musical patterns.
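At its simplest, such a system is a rule table and a random walk through it. The chord vocabulary and successor rules below are invented for illustration; a real attempt would work at many levels at once:

```python
import random

RULES = {                            # allowed successors for each chord
    "C":  ["F", "G", "Am"],
    "F":  ["C", "G", "Dm"],
    "G":  ["C", "Am"],
    "Am": ["F", "Dm", "G"],
    "Dm": ["G"],
}

def generate(start="C", length=16, seed=None):
    """Let the rules find their own pattern: a random walk over RULES."""
    rng = random.Random(seed)
    chord, out = start, [start]
    for _ in range(length - 1):
        chord = rng.choice(RULES[chord])
        out.append(chord)
    return out

print(" ".join(generate(seed=1)))    # one machine-found progression
```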

Alternatively, Machover envisions a sort of hyper-hyperinstrument, a system that puts a performer "somewhere between improvisation and composition. It would be a very powerful way of using a computer to allow amateurs to participate in the musical process in a way they've never been able to." Even the keyboard itself would be gone. "I imagine an instrument something like a potter's wheel," says Machover. By moving his hands across an undulating surface, the player could mold his sound by feel.

Thus creating a paradox: At the utmost extreme of technology, technology itself would disappear, returning music to its aboriginal source -- in the human soul.