Mus(ick)ing – or not – with AI
Copyright ©️ Elizabeth Sheppard, 24 May 2023. All Rights Reserved.
Over the last forty years or so, I’ve witnessed the escalating incursion of AI (Artificial Intelligence) into the domain of music created by humans. When AI began to generate music content, music scholars compared AI-generated compositions to works made by human composers and performers. As AI imitates human creations ever more closely, it is becoming difficult to distinguish AI content from human content. Some believe that Large Language Model AIs, trained on all that Westerners know about music, can already surpass humans in intelligent music composition. So do any differences remain between virtual AI composition and real time human music composition? Indigenous song systems, based on evolving, adaptive oral systems of place, have much to contribute to this ongoing debate.
The AI revolution has changed the way many humans engage with sound. Like written music, printed music, and musical instruments, AI is inert, not alive, but it imitates the organic creative strategies of living beings so well, and so rapidly, that it appears to be alive. In Western cultures, instruments and scores are seen as an extension of human agency, with performers controlling sonic instrumental output by reading scores that channel composers’ intentions. The instrument is not the music, the score is not the music, the technology is not the music: these are powerful intermediaries, used to activate passive performative codes that output expressive sound designs. In this sense, Western instruments, music scores, and their virtual counterparts can be said to incorporate preconceived interpretive AI algorithms.
In contemporary Western music, and in Indigenous composition, AI software introduces a fourth actor into the aural / oral or written compositional mix – a generative, rule-based, invisible AI algorithm. AI algorithms are often derived from standardised “global music” genres. They may be created by one person, or a group, who usually maintain anonymity. New forms of music copyright attribution are being developed to protect the rights of AI developers, as opposed to the rights of AI users, publishers and performers of AI generated music. The agency of Western music creation has rapidly shifted towards AI developers, and away from passive, non-creative AI users. So for composers, screen musicians and sound designers, it has become essential to avoid passive user engagement with AI, and to engage actively with AI systems as creative developers. To comprehend and adapt to this cataclysmic change, it’s useful to review how AI entered our daily lives, and how it may or may not interact with non-Western Indigenous musical systems.
How should composers approach and relate to AI in Africa and Australia, where all singing voices, instruments, and written inscriptions (whether the songmaker is conscious of this, or not) are embedded in complex Indigenous contexts? In Africa and Australia, sounding Country includes dancing, speaking and singing in channelled ancestral spirit voices, in songs received from ever-evolving, organic, living Countries. Guarding against invasive external AI corruption of the living, organic Voice of Country is a custodial obligation for Indigenous peoples. To communicate this important cultural obligation, in 2004 I co-composed a song with Anmatjere Arrernte singer-songwriter Rhubee Neale, called Keep Guard of our Dreams. We sang it at an Eora Aboriginal College concert, at Petersham RSL Club, and at a Seminar held by Reconciliation for Western Sydney at the Karabi Centre, Wentworthville NSW. In September and October 2022, Dharawal Inuk soprano Sonya Holowell and Biripi opera tenor Elias Wilson sang the eight part SSAATTBB arrangement of this song with The Song Company, as part of the Songs from the Heart concert, in response to the 2017 Uluru Statement from the Heart. It reached a huge audience when it was toured to Canberra, Parramatta, Newcastle, Wollongong and Sydney, and was broadcast on the Australian Digital Concert Hall. If it had been composed with an AI algorithm in 2004, it might have reached the charts, been recorded on a label, and been heard more widely before 2022. But it wasn’t composed with any kind of AI; it was composed in real time by two Australian women pouring out our love for our Country, in a small rehearsal room, strumming chords on a guitar, singing, picking out a melody on an old piano, and scribbling on a scrap of manuscript paper. In 2004, we knew this song was great, but we didn’t realise that AI cultural control of radio playlist codes meant that it would probably be permanently silenced.
I can’t help but wonder how many real time, living Australian songs have been silenced, and consigned to media playlist bins, because of bungled AI music coding.
In my experience, the AI invasion of written and oral music composition began with software programs of two kinds. In global Western music systems, the first kind (Finale, Sibelius) mimicked hand written music scores, and the second kind (DAWs like CoolEdit, Cubase, Reason, Logic and ProTools) offered transposable chord pattern graphs, keyed to beats. Notating composers who prioritised melody and interlocking polyphonic patterns bought Finale and/or Sibelius, while oral musicians who improvised tunes over chord patterns and used chord charts found DAWs an easier option. At the same time, musicians from non-Western cultures (e.g. India, Samoa, China) translated non-Western notations and previously un-notated sound clusters into software that (unlike Western DAWs and digitized scores) integrated oral music systems with ornamented, expressively coded, modal sound fields.
When I used Finale in the 1990s, I applied my acquired Western music handwriting skills to producing and printing digital scores of my own songs, on my Mac SE computer. The Finale scores I produced were not as sophisticated or as flexible as my handwritten scores or my graphic song charts; they were limited by the tasks and signs offered by the Finale software. In 1988 I’d explored acoustic spectral analysis of song using MacSpeech, by measuring sung vowel and phoneme durations and transitions, but MacSpeech didn’t graph sung pitch or intonations adequately, so its usefulness for song research was limited. I also tinkered with hypertext, and made networks of linked files and folders. I replaced my manual typewriter, carbon sheets and roneo with a word processor, then a computer keyboard. By 1989 Australian composers like my contemporaries Martin Wesley-Smith and Barry Conyngham were grappling in earnest with the creative arts potential, and the limitations, of experimental AI computer music programs.
In the mid 1990s I was exploring the acoustics of pitched sung vowels via digital recordings, and Plomp’s European research into the mechanics of the singing voice had taken off. Linguists at Sydney University used acoustic speech research findings to develop AI speech systems that joined segmented vowels and consonants into clunky words and sentences. Their AI speech models had distressingly flat intonation and wobbly phoneme transitions, but were nevertheless tested on astonished commuters, in hilarious Sydney railway announcements. Overseas, in France, Pierre Boulez’s IRCAM acoustic music research centre, which I visited in 1993, was attempting to harness the power of computers to music creation and sound design, with various degrees of success. After observing these distant, costly projects, I decided to prioritise my own cultural heritages, and engage with AI and computer music only insofar as it was useful to my country-based living music goals.
As a trained church Cantor in the 1990s and 2000s, I worked with the interface of texts and neume formulae in Gregorian chant and Western hymns and anthems, singing in Greek, Latin and English. I studied how this chanted and harmonized interface had been adapted and simplified for metrical congregational performance, by Reformation musicians like Luther, Calvin, Merbecke, and the Bach family, who contributed many chant-based metricized hymns, chorales and psalm settings to the Scottish Psalter and English hymnbooks. This tradition of rearranging texts to metricized melodic formulae, and synchronizing accentuated lyrics in parallel languages, is part of my Scottish musical heritage. Familiarity with the rhythmic, accentuation and intonation patterns of any language makes expressive melodic text setting possible, so I can, with cultural guidance, apply this skill to text setting in Indigenous languages.
Like modal chant and tonal music systems, AI organizes formulaic fragments into layers, to develop new music. However, I have chosen not to use random, contextless collections of sound objects to assemble songs; instead I rely on intuitive, context driven reception of lyrical song melodies from Country. Some call this dreaming, but it is not an unconscious process. Like chant, AI algorithmic music is limited by the preconceived formulae and criteria that underlie it. If an AI program designed by an outsider lacks cultural content from my Country, or is derived from foreign music concepts, or doesn’t speak my inherited languages, or is not familiar with the birds and creatures of my lands, and doesn’t know about my people or our laws, how can it walk with, hear or sing the melodies my Country gives me? But I can certainly develop my own country-based AI codes and algorithms, create my own sound samples, and integrate them into my songs, as a self-determined Indigenous AI developer. Many Indigenous Australians are harnessing audio and video recording, AI resources and app coding to serve our cultures, in original ways. Drawing on external AI resources is unnecessary for us. Passive AI music is easy and cheap to produce, compared with the demanding, costly effort of creating and recording “real” human music, but it lacks the infinite, musing malleability and the deep, rich cultural context of humanised, meaningful, emotional, language driven Indigenous music.
The intersection of algorithmic AI machine music with human music making has escalated at astonishing speed. From an Indigenous worldview, the genre coded sound objects that AI apps are spitting out fulfil market expectations, but lack culturally grounded depth and richness. The current AI music genre paradigms are not uniquely Australian; most are imported from overseas. Australian Indigenous composers have begun to work with AI systems as developers, to retain control of the coding and marketing of our cultural musics. Traditionally, the only way a song can be birthed by an Australian Country is still through a human Songmaker embedded in Country. So as an Australian composer, I draw on the live, organic intersection between my trained sensory perceptions, the living inhabitants and features of my Country, and the stories of my human communities, as the primary sources of my creative songmaking practice.
Nevertheless, it’s clear that AI systems are not going to disappear. As long as AI is managed ethically, it offers amazing opportunities, but passive virtual embedment in AI systems (as opposed to active, organic real time embedment in Country) can be socially divisive, alienating, disempowering, addictive, and downright unhealthy. Digital obsessions have created social divides between generations, and widened the gap between the haves and have-nots. People with financially privileged access to passive AI software and virtual gaming systems see no harm in boasting of their technological superiority over people with no access, or limited access.
It’s likely that indulgence in passive AI systems and virtual gaming will never be universally beneficial, unless manufacturers, in collaboration with governments, introduce rules to make it so. Until then, a substantial risk to human diversity and musical evolution exists. Addiction to AI systems that incite social division and conflict, and promote alienation from geographical and cultural contexts of origin, may develop. This risk is enhanced when AI systems promote passive attachment to globalised, regionally marketed music genres, and ignore the contextual needs of local populations. Researchers who create and demonstrate AI music systems need to think about the social effects that dislocated AI systems, and the genre coded music made with them, may have on clients, and on those without AI access.
At present, the primary driver of Australian composer success and playlist track selection in the AI environment is how much money is made. This is determined not by live gig / concert income, but by how many times a track is played. Ethical obligations to enliven and care for human societies, plants, animals and community environments, by funnelling AI track income back to traditional landowners, are rarely met or promoted. Generating quick income from cobbled-together mashups that attract viral click swarms is considered legitimate. As long as a song can be slotted into a globalised music genre, and meets market demands, it ticks the AI box.
The AI exclusion of local music cultures from standardised global music codes appears to be driving a massively expanded cultural suppression campaign. Conformity to culturally impoverished AI criteria is rapidly replacing the obligation to compose ethical, altruistic music that promotes justice, aims to save threatened peoples, species and languages, warns against excess, reproves crime, or praises worthy things. Costly abstract sound designs stripped of cultural associations, with no discernible value apart from attention-grabbing innovation, are warmly praised for their nihilistic absence of attachment, their fleeting audiovisual displays, their unregulated extravagance, or their eclectic pluralism. This utilitarian approach to music creation is, at the moment, a hugely profitable strategy, but there is little social responsibility, and certainly no future for humanity, in it.
Prioritising personally profitable, socially careless music criteria in AI is also reducing human attention spans, scrambling meaningful lyrics, and diminishing human hearing capacity. These are real risks that could be addressed and guarded against, with wise AI governance and expanded, free live music development programs for all ages. If, as musicians and composers, we aim to develop music that sustains healthy, intelligent human societies, music that people can hear and respond to, we need to find ways to balance healthy, active developer interactions with AI, so that it is used responsibly to create peaceful co-existence, and is not permitted to foster hatred, or degrade human welfare and flourishing.