Advent of AI

It was on the second verse of Adeste Fideles that my mind wandered to artificial intelligence one night. We were at Harvard University’s 114th annual Christmas Carol Service, my partner and I, perched in the gorgeous Memorial Church in Harvard Yard. It was a peaceful December Sunday, dotted with the gentlest rain; couples huddled close under their New England umbrellas as they waited in line outside.

We knew to get there early; we knew that even in the rain it was worth it. The queue formed slowly behind us, and precisely an hour before the service began we were all allowed to enter the church. We sat in our usual spot, by the organist; we watched him warm up as the people shuffled in. When the choir assembled we waved and grinned. They were divine that night. No angel would dare interrupt them—or so I thought.

Call it an intrusive thought, then, or call it unholy if you must. What else can I say for myself? As we rounded out that second venite adoramus, the notes and the staves of the part I was reading dissolved, and for that matter so did the church I was standing in. I found myself back in my high school computer lab on a bright California evening, and on my screen were the building blocks of AI.

I was 17, then, in the spring of my senior year and living the good life. My college plans were settled: I was to begin studying computer science at Stanford University that September. Until then, I had only an exam or two standing between me and graduation. It was the home stretch of high school; easy enough—except that one of those exams would be for my music theory class.

Three times a week that spring I would walk into my school’s old music room and hear a series of interminable lectures on the Western classical tradition. The course was focused largely on the theory of arranging four-part harmonies, which is a skill I simply never got the hang of. That spring stands out in my memory for the endless stream of four-part harmony assignments I turned in, and the endless stream of red ink I got back. No matter how hard I tried, there was always something wrong with my work. A wrong chord here, a dissonance there, parallel perfects and notes out of range… The Western classical tradition had simply too many rules, I decided, and I had no interest in following them like a machine.

—ah, like a machine—yes, exactly: you can almost see the mischief work its way up my spine. I wish the word “machine” had not entered my head at all that day. But it did, and when it did, it brought with it a curious idea. It was the kind of idea that burrows and itches itself irresistible into your mind. The programmers reading this essay will know what I mean. It is that idea that brought me to the computer lab after school that day, breathless and impatient. I turned on my computer, and—with my music theory textbook propped open in front of me—I began typing rule after rule into it, in the form of simple arithmetic formulae.

There was nothing particularly clever about what I did that day. I happened to know that every note you can play on a piano comes with a number representing its pitch (for example, middle “C” is numbered 60, and C#, 61). I also happened to know that once you have numbers, you can start doing math on them. The precise, severe language of my textbook, no less mechanical than the US tax code, practically begged to be treated this way. When my textbook specified vocal ranges for the soprano, alto, tenor, and bass voices, I typed in mathematical bounds using the “≤” and “≥” operators. When my textbook said “basslines must be stepwise,” I typed an equation saying that adjacent bass notes must be within two (±2) piano-keys of one another.

And then? Well, then I did exactly what you would expect: I asked my computer to solve the equations. The computer happily obliged, unaware that it was composing melodies and harmonies by working out numerical values for x and y and z. For all it knew, x and y and z might represent scientific measurements, or census data, or foreign exchange rates. But take those numerical values, turn them back into piano-keys (so 60 becomes middle C, and 61, C#), and soon it begins to resemble music again. It is a remarkable bit of alchemy—it feels like getting something from nothing. Write out the numbers in music notation, and you can hardly tell that the harmonic arrangement came from a computer. In just a couple of hours of tinkering, I had written a crude little algorithm that could do my harmonization homework for me—better and faster than I ever could by myself, and guaranteed to follow all the rules I knew of.
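For readers who want to see the shape of the idea, here is a minimal sketch in Python. It is not the program described above: the essay does not say which language or solver was used, so the brute-force backtracking search, the bass range, the set of consonant intervals, and every name here (`solve`, `allowed`, `to_name`) are assumptions made for illustration. It fits only a bass line under a melody, using the two kinds of rule mentioned above (range bounds and stepwise motion) plus a simple check against parallel perfects.

```python
from typing import List, Optional

# All numeric values below are illustrative assumptions, not the textbook's.
BASS_LOW, BASS_HIGH = 40, 60      # assumed bass range as pitch numbers (middle C = 60)
MAX_STEP = 2                      # "stepwise": adjacent bass notes within +/-2 piano keys
CONSONANT = {0, 3, 4, 7, 8, 9}    # interval classes (mod 12) treated as consonant here
PERFECT = {0, 7}                  # unisons/octaves and fifths

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def to_name(pitch: int) -> str:
    """Turn a pitch number back into a note name, e.g. 60 -> 'C4'."""
    return f"{NOTE_NAMES[pitch % 12]}{pitch // 12 - 1}"


def allowed(melody: List[int], bass: List[int], candidate: int) -> bool:
    """Check every rule for placing `candidate` under the next melody note."""
    i = len(bass)
    if candidate > melody[i]:
        return False                          # bass must not cross above the melody
    interval = (melody[i] - candidate) % 12
    if interval not in CONSONANT:
        return False                          # no dissonances
    if i == 0:
        return True
    prev = bass[-1]
    if abs(candidate - prev) > MAX_STEP:
        return False                          # bass line must move by step
    prev_interval = (melody[i - 1] - prev) % 12
    if interval in PERFECT and interval == prev_interval and candidate != prev:
        return False                          # no parallel fifths or octaves
    return True


def solve(melody: List[int], bass: Optional[List[int]] = None) -> Optional[List[int]]:
    """Backtracking search: return the first bass line that obeys all rules, or None."""
    bass = [] if bass is None else bass
    if len(bass) == len(melody):
        return bass
    for candidate in range(BASS_LOW, BASS_HIGH + 1):
        if allowed(melody, bass, candidate):
            result = solve(melody, bass + [candidate])
            if result is not None:
                return result
    return None


if __name__ == "__main__":
    melody = [72, 72, 74, 71, 72, 74]         # an arbitrary short melody for the demo
    bass = solve(melody)
    if bass is not None:
        print([to_name(p) for p in bass])
```

Nothing in the search knows it is dealing with music. It only sees integers and inequalities, and the last step of mapping numbers back to note names is what makes the result readable as a bass line.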

Every schoolchild dreams of such a moment. In 2006, children’s writer Dan Gutman indulged our collective fantasy with his novel The Homework Machine. In the novel, a group of kids write a computer program to do their homework for them with AI. A decade later, I was sitting at school with my own little AI “homework machine,” feeling awfully proud of myself. What I had created was more than a homework machine: it was a music machine. When I piped the output through the crusty school speakers, the harmonizations really did sound like music—which meant that with my computer program, I had outsmarted not only my textbook, but music itself.

Of course, at the end of The Homework Machine the kids realize their mistake and throw their “homework machine” deep into the Grand Canyon. Perhaps because I had read Dan Gutman’s book as a child, or perhaps simply because my parents raised me well, I knew even then that it would be wrong to use my computer program to do my music theory homework for me. Once my coder’s itch was scratched—which is to say, once the problem was solved to my satisfaction—I more or less forgot about my program. I did the rest of my harmonizations by hand that semester, somehow passed the final exam, and never looked back. To this day, harmonic intuition eludes me, but if pressed I can limp along on the crutches of the rules I learned years ago.

I’d like to look back on the whole endeavor as a success, then; I’d like to feel proud of what I accomplished that spring. From a purely academic perspective that seems straightforward. Yet even now, as a 24-year-old graduate student in MIT’s AI lab, I find myself also somewhat ashamed of my teenage self’s actions—the entire affair is inflected with contrition and embarrassment. My computer program shortcut feels like a trespass undetected, though only now have I come to understand the kind of ground I strayed on.

Maybe if I had grown up going to churches of any kind, things would be different. I was raised in a mostly-secular-though-somewhat-Hindu household, going to temple maybe once a year. My partner is the one who loves Christmas; she is the one who brought me to the Carol Service that day. The breathtaking magnificence of the Harvard Memorial Church, the overwhelming beauty of a church choir and organ—these were all new experiences for me, fresh context for the musical tradition I was once taught the theory of. Being moved was simply not on my radar when I was a teenager hung up on the nuts and bolts of Western classical music—which were to me forms without meaning, levers without load.

In this narrow sense of nuts and bolts, even my high school teacher could not argue that his music was inaccessible to mechanical mass-production. The proof was in the program I wrote. By writing it, my 17-year-old self had achieved technological dominance over his art, and had reduced those who practice it to something I could control. Who needs skill when one has silicon? My little shortcut was a power play, plain and simple: a play for the command one loses when encountering mysterious, ineffable beauty spilling mindlessly from an artist’s worn fingers.

When I look around at my fellow AI researchers and entrepreneurs today, I worry that (consciously or not) we are making that same power play at an unimaginable scale. The generative AI technologies we have built can create striking images, texts and sounds instantaneously and for free. In many ways it feels like technology has conquered art at last. “Art is dead, dude,” said one AI user to the New York Times last year. In the rhetorical terms of today’s AI companies, creative power has been “unleashed” for the “masses” to deploy. The mystery of creation has been defused—or, at least, refused.

But before we pat ourselves on the back—what have we accomplished here? I think of the man so taken by a street magician’s illusion that he confronts—bribes—even threatens—the illusionist. With the exercise of violence he extracts the performer’s secret—or half of it, at least—the other half being the unimaginable discipline, practice, and labor needed to make the impossible look effortless. (As Teller writes, “sometimes magic is just someone spending more time on something than anyone else might reasonably expect.” Preparation is the weighty hammer; performance the weightless nail.)

The man, however, is satisfied with knowing the secret to levitation; he walks away content. He grins at having captured another curiosity for his collection. But then the enchantment fades. He realizes that he will never master the illusion himself; he knows, too, that he will never enjoy it again. By the time he turns the corner he is left emptied even of what little magic the performer had kindled in him with the performance. The encounter is a loss for that man.

Ada Lovelace, the world’s first programmer, predicted nearly two centuries ago that computers might someday “compose elaborate and scientific pieces of music of any degree of complexity or extent.” Those were radical words when first written—the world was not yet ready to consider creation to be a mechanical activity. Even now, we remain cagey about the prospect of machines themselves creating works of art. The US Copyright Office has not budged on this—as of today, you cannot register a copyright for an AI-generated piece (though people have tried, unsuccessfully, all year long). It may be wishful thinking, or it may be a collective delusion: but the act of artistic creation remains a mysterious process, one we are willing to hold distinct from computation.

I am thinking again of my old music teacher, who could sit at his piano and harmonize any melody on the spot. I saw him do it morning after morning, lesson after lesson. With his decades and decades of experience, he did not even need to look at his keyboard. Or perhaps (as I now suspect) he needed not to see it. As he worked through examples, he would peer absentmindedly out the window, as if afraid that an errant conscious thought would destabilize the more mysterious source from which his hard-earned musical intuition emanated. My memory of this is as salient as the red marks on my homework. Like many artists, my teacher found articulating the mechanical principles of musical composition a perplexing activity, the same way a native speaker of a language is perplexed as he explains (and, at once, discovers for the first time) the conjugation of an irregular verb.

When I think of my high school music theory teacher, I think of another adoration—by Peter Paul Rubens—an enormous Baroque painting that hangs in the Prado in Madrid. I was introduced to this work of art by a college art history professor, who lectured vividly about how the painter conjured figure after figure effortlessly, as if drawing from a bottomless cauldron of imagination. As evidence he offered us the sheer exuberance of the painting: the embarrassment of flowing gowns, sparkle-eyed animals, angels, smoke and musculature. Rubens would have his studio staff read out loud to him as he painted—something my college professor took as evidence that Rubens needed this preoccupied, mindless state to do his artistic work.

Rubens Adoration

I bring up that lecture as much for its content as its form. The lecture was expertly delivered, rehearsed but not over-rehearsed. I learned later that the professor would practice his lectures while driving to work, his notes held tight against the steering wheel of his car. By the time he reached the lecture hall he would no longer need his notes: instead, he would look to the back of the auditorium, over our heads—not as if he were reading from the back wall, but rather as if some distraction on the back wall (the molding, the defibrillator) were necessary to keep his conscious mind occupied long enough to let his unmoderated cauldron of a soul bubble up the words for him. The professor would often say that the lectures would surprise him; he would go as far as to say that he was as much a student of the lecture as any of the undergrads in front of him.

The music teacher, the painter, the professor—and for that matter the organist and the magician—I think every creative person chases that state of mindless, trancelike epiphany, when the rules we learn fall away and our art spills autonomically from our muse-kissed fingers. We call it being in “flow” or in “the zone,” but what we mean by that is the feeling of having a mind so practiced, so proficient, that it shocks even its owner by performing miracle after miracle of creativity. We practice our craft to fashion ourselves into such sources. We chase the otherness of being a mind whose workings we cannot explain in words—the otherness of being spoken through, the otherness of the conduit—the ecstasy of unaccountable inspiration.

In this way, then, artistic creation begins exactly where algorithmic computation leaves off.

Adeste Fideles was on Sunday. On Monday, I flew to New Orleans to present a paper at the thirty-seventh Conference on Neural Information Processing Systems. NeurIPS, as we call it, is the largest AI conference in the world—everyone in AI attends. My flight from Boston was full of wiry graduate students clutching their telltale poster tubes. Or were they holding pickaxes? I imagined us an army of prospectors, drawn to a new gold rush upon a great mountain of data. Tens of thousands of us swarmed New Orleans’ Ernest N. Morial Convention Center that week looking to further our careers. Restaurants across town were booked out, the aisles between tables overflowing with backpacks and white conference lanyards. At night, major tech companies threw extravagant rooftop parties, and students traded tips on scoring invitations.

On my way home from the conference, I took a taxi from my hotel to the Louis Armstrong New Orleans International Airport. The driver had on AM radio jazz, because, again, this was New Orleans. Then out of nowhere came an mbira (a kind of thumb piano), there in his hands, held up on the steering wheel. At 40, maybe 50 miles an hour, he plucked along his own melodies with the radio, never once taking his eyes off the road. I tipped him well for the music, thinking again of Rubens’ Adoration, of the baby plucking absentmindedly at the dish held in front of him. I pulled up the painting on my phone, zoomed in on the child, not unaware that my own fingers were echoing his. There again was that image of the artist just barely teasing the lush infinitude that lies untapped before him, just barely aware of the riches and gifts he draws from. (And where did that leave me? I wondered. There in the back seat of the Highlander, I realized I was the boy kneeling in the painting’s center holding the burning poster tube.)

Rubens Adoration

We market generative AI as effortless creation: type in a prompt, push on a button. In its own peculiar way it is not unlike dipping from a bottomless cauldron, except the cauldron is the Internet and the broth is every text or image ever uploaded. You could say, in defense of the technology, that its effortlessness is no different from the effortlessness of the virtuoso—that generative AI distills down the very essence of inspiration, eases the burden of training, makes creation accessible to the masses who are, in the words of one AI CEO, “creatively constipated.”

But I have my doubts, which were seeded the day I pressed a button, watched my little computer program harmonize the first lines of “My Country, ’Tis of Thee,” and saw nothing at all of myself in the chords it conjured. It is one thing to be spoken through, I see now, and it is entirely another to be spoken for. I push the button on earth, I read the output on earth; I am conscious the whole time. That is the price of the terrestrial shortcut: I never again ascended to that strange Rubensian plane, up past the clouds, where minds heavy with thought cool unaware and rain down their ideas as art and self. (What would it take to build technology that elevates us to that plane—instead of locking us out for good?)

Locked out—yes, there it is. I never trespassed with my shortcut; I locked myself out. Which brings me back to the queue outside the church at Harvard, an hour or more before the Carol Service, a week or more before Christmas.

My partner explains Advent to me as the time of waiting and preparation: an austere, quiet, thoughtful time before the joyous euphoria of Christmas Day. I would like to conceive of artistic experience the same way; to center the solemn, laborious, disciplined preparation that precedes the unconscious rapture of creation—of birth. I think of Advent as my college professor’s notes on the steering wheel, which echoes the taxi driver’s mbira on the steering wheel, which echoes the magician and the musician’s years of training before the moment on the curb when they tip their cauldrons and pour.

So we stand together, patience and anticipation, aware of every raindrop. Then the doors open.