Artificial Intelligence takes over music production (Tuesday 8 August, 2023)


Updated: 15 August 2023; 21 August 2023; 10 September 2023.

 

By the time you read this post, it will probably be out of date, and the use of artificial intelligence (AI) in music will have progressed into even more possibilities. This post gives a very short, ad hoc, and somewhat unstructured history and summary of AI developments that affect music production in the broadest sense. Google, and you will find much more. I will update and improve this post over time.

My message today: within 10 years, but probably much sooner, music production as we know it now will cease to exist. If you are in music production (like me), you will be out of business soon.

AI comprises different approaches, among which is deep learning. If you are not familiar with this, have a look [here] or watch this introductory video on machine learning to get a feeling for how some of these AI programs work:

 

Before talking about music, I suggest you first watch the next video. It gives a general summary of, and reflection on, the current and expected possibilities of AI. It is a long video, but an hour very well spent. You might get convinced that within the next few years, the sky is the limit for AI music production.

 

A simple explanation of so-called Large Language Models (LLMs), recent deep learning models that work on human language, is found [here].

 

Artificial Intelligence (AI) in music production

AI has already been used in music production for some time. For example,

  1. To compose music and lyrics (e.g., the Artificial Intelligence Virtual Artist (AIVA); listen [here]); I recently used ChatGPT from OpenAI to write lyrics for my new song Everything (not yet public);
  2. To automate mixing (e.g., Unchained Music) and mastering (e.g., iZotope or Landr);
  3. To generate rights-free music for content creators (e.g., Soundfull);
  4. To unmix music into its different components (e.g., SpectraLayers, Nectar 4 from iZotope);
  5. To analyze music and its characteristics/patterns to provide you with personalized music recommendations (e.g., AI DJ from Spotify).

 

Software that generates (‘composes’) music is not new. The roots of AI in music go as far back as the work of the scientist Alan Turing, who in 1951 used his gigantic Mark II computer to play melodies, including God Save the Queen and Baa Baa Black Sheep. The melody was played out through a series of short sounds (which Turing described as a mix between a “tap, a click and a thump”) using a loudspeaker. When played quickly enough, the distinct sounds blended together, creating music.

Another early example is the Bach-inspired “Illiac Suite” from 1958. The Illiac Suite was the first musical composition for traditional instruments made through computer-assisted composition, by Lejaren Hiller and Leonard Isaacson. It was based on a Monte Carlo algorithm, information theory, and a set of rules.

I have been using Band in a Box for over a decade (if not longer) to generate ‘new’ songs. However, BiB does not rely on AI; instead, it simply manipulates MIDI and audio (wav files) recorded by real session musicians to fit the selected style, key signature, chords, and tempo. Although in many regards this is a very useful program (e.g., for practicing or generating new ideas), the generated songs sound kind of synthetic and boring. AI will go far beyond the capabilities of the current BiB software.

 

Within a few years, anyone will be able to produce (compose, arrange, mix, master, etc.) music at the push of a button. Training programs and courses in music production will become obsolete.

 

From Alan Turing to Hello World

Hello World (2018) was the first music album composed with the help of an AI technology (Flow Machines). The research leading to these results received funding from the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013, Grant Agreement no. 291156) and was conducted by Francois Pachet at Sony Computer Science Laboratories (Sony CSL Paris) and Pierre and Marie Curie University (UPMC).

Listen to it on Spotify, and then forget about it:

 

Another example, from 2017, is the composition from the Artificial Intelligence Virtual Artist (AIVA) that was subsequently performed by the Avignon symphonic orchestra:

Not bad at all, I think!

 

And another one from 2018, when Taryn Southern launched her album with the significant name “I am AI”, the first album by a solo artist composed and produced with four AI programs: AIVA, Google Magenta, IBM Watson Beat, and Amper Music. Spotify playlist [here]. In all cases, AI software composed the notation, and when Amper was used, the AI also produced the instrumentation. Taryn arranged the compositions and wrote vocal melodies and lyrics, while producer Ethan Carlson handled vocal production, mixing, and mastering.

 

Google Magenta is an open source research project exploring the role of machine learning as a tool in the creative process. It includes MusicLM (see below), SingSong (a system that generates instrumental music to accompany input vocals), and AudioLM (a language-modeling approach that learns to generate realistic speech and piano music by listening to audio only).

Image: SingSong (training and application). SingSong builds on recent developments in musical source separation (e.g., SpectraLayers) and audio generation. Specifically, it applies a state-of-the-art source separation algorithm to a large corpus of music audio to produce aligned pairs of vocal and instrumental sources. The authors then adapt AudioLM (see paper below), a state-of-the-art approach for unconditional audio generation, to be suitable for conditional “audio-to-audio” generation tasks, and train it on the source-separated (vocal, instrumental) pairs. Once trained, SingSong takes a voice as input and generates an instrumental part, which is then mixed with the vocal (inference).
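To make the data-preparation step concrete: Google has not released SingSong itself, so here is only a minimal illustrative sketch in Python of how such aligned (vocal, instrumental) training pairs could be produced with the open-source Demucs separator (the folder names are hypothetical):

```python
# Sketch of a SingSong-style data-preparation step: split a corpus of mixed
# tracks into time-aligned (vocals, instrumental) pairs with a source
# separator. Illustrative only; not Google's actual (unreleased) pipeline.
import pathlib
import subprocess

corpus = pathlib.Path("music_corpus")   # hypothetical folder of mixed .mp3 files
out_dir = pathlib.Path("separated")

for track in sorted(corpus.glob("*.mp3")):
    # --two-stems=vocals makes Demucs write exactly two files per track:
    # vocals.wav and no_vocals.wav (the instrumental), aligned by construction.
    subprocess.run(
        ["demucs", "--two-stems=vocals", "-o", str(out_dir), str(track)],
        check=True,
    )

# The resulting (vocals.wav, no_vocals.wav) pairs are the kind of paired data
# a conditional audio model such as SingSong's adapted AudioLM is trained on.
```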

 

Amper Music is a cloud-based music generator that creates musical tracks from pre-recorded samples. These are then transformed into real audio, which can be modified by changing the key, tempo, individual instruments, and more. Amper Music has been acquired by Shutterstock. You can listen to examples on Shutterstock.

 

Music generation from text

A recent innovation is MusicLM, developed by Google, which generates music from any text description. MusicLM was trained on a data set of 280,000 hours of music. It was not released due to potential copyright issues (see below). The paper describing MusicLM was published on arXiv. You can listen to some generated audio fragments [here].

 

In the video below, you can see how ChatGPT (Chat Generative Pre-trained Transformer; from OpenAI) is used to write a guitar solo. ChatGPT is based on a large language model (LLM), which works by taking an input text and repeatedly predicting the next token or word.
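To make “repeatedly predicting the next token” concrete, here is a minimal sketch in Python using the Hugging Face transformers library, with the small public GPT-2 model as a stand-in (ChatGPT itself is not downloadable); the loop, not the particular model, is the point:

```python
# Minimal sketch of autoregressive next-token generation, the core mechanism
# behind LLMs such as ChatGPT. GPT-2 serves as a small, public stand-in.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "Write a blues guitar solo in A minor, tab:"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

for _ in range(40):                              # generate 40 tokens, one at a time
    with torch.no_grad():
        logits = model(input_ids).logits         # a score for every vocabulary token
    next_id = logits[0, -1].argmax()             # greedy: take the most likely token
    input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)

print(tokenizer.decode(input_ids[0]))
```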

 

Musicians become obsolete

One concern is that AI-powered music could render human musicians and songwriters obsolete. Although there are people who think it will not come this far because AI can’t be creative like human musicians, I wouldn’t be so certain about that given the tremendous speed of development. Moreover, (in the future) AI will not only generate music similar to what was provided as its input (learning) but will surely also be capable of producing completely new styles of music. Some interesting reflections (see below) are given in the video ‘The AI Effect: A New Era in Music and Its Unintended Consequences’ from Rick Beato, who predicts that the audience might like AI-generated music as much as (or better than) human-made music. In addition, large labels may decide to stop working with artists altogether and just release generated music, eliminating the cost of paying artists.

 

Video: ask ChatGPT to write a guitar solo

 

With current virtual instruments (VSTi) and sample libraries from Native Instruments, Spitfire Audio, East West, Toontrack, etc., we can already make great productions that can contain every instrument we wish for, including drums, bass, synths/pianos, guitars, strings, horns, and much more. We even have virtual instruments/sample libraries for choirs and voices that can reproduce text you type in, or that play pre-recorded phrases in any key. The average listener would not be able to tell the difference when these are used in a music production. However, strong lead vocals that sing the lyrics you wrote are still missing. Some progress is being made by, for example, Vocaloid, which uses AI technology to generate more natural-sounding and highly expressive singing voices. See also [Hatsune Miku] and [Wikipedia]. Personally, at this moment, I don’t find Vocaloid very convincing, but a lot can happen in a few years!

Using text-to-speech AI, you can already clone your voice and make it sing:

Video: text to singing

 

AI and MIDI

Video: some interesting AI-driven MIDI tools

 

Mixing and mastering

I expect that mixing and mastering will be largely automated in the coming years. Examples of AI-driven mixing and mastering tools that I already mentioned are provided by Unchained Music, iZotope, and Landr.
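For a taste of how simple push-button mastering already is, here is a minimal sketch using the open-source Matchering library; it uses reference matching rather than the proprietary AI behind Landr or iZotope, but the one-click idea is the same (the file names are hypothetical):

```python
# Minimal automated-mastering sketch with the open-source Matchering library:
# it matches the loudness, EQ, and stereo width of your mix to a commercial
# reference track. Not the proprietary tech behind Landr or iZotope, but it
# illustrates the same push-button idea.
import matchering as mg

mg.process(
    target="my_mix.wav",             # hypothetical path to your unmastered mix
    reference="reference_song.wav",  # hypothetical commercial track to imitate
    results=[mg.pcm16("my_mix_mastered.wav")],  # write a 16-bit WAV master
)
```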

Recently, an AI clone of Wez Clarke was created. Wez Clarke is a Grammy Award-winning engineer who has worked with artists such as Clean Bandit, MK, Becky Hill, Zara Larsson, Naughty Boy, Switch Disco, and Sigala. In an ongoing collaboration with Masterchannel, an AI clone of his engineering process was created that can be used by any producer to give their tracks a competitive edge.

This video provides an interesting interview about AI in mixing plugins between Wytse Gerichhausen (owner of White Sea Studio) and Alexander Wankhammer (one of Sonible’s co-founders):

 

Would you still enjoy mixing and mastering if it became really simple? Or do you like hours of fiddling around with the settings of your compressors? In the past, music production was the domain of analog studios that had access to all the (expensive) equipment needed for recording, mixing, and mastering. This situation has completely changed. Nowadays, it is relatively easy and not too expensive to set up your own home studio. Indeed, many music productions now come from home studios. Still, it requires knowledge, training, and experience to arrive at a convincing music production. The home studio situation will be taken one step further by AI tools. Perhaps, in the near future, we will have a Google or Microsoft ‘MusicProduction’ application on our computers that, with some minor configuration, generates a Spotify-ready song within a matter of minutes based on your preferences, such as style, tempo, instrumentation, etc.

 

AI-cloned music (voice cloning)

AI voice cloning, also known as voice synthesis or voice mimicry, is a technology that uses machine learning to simulate a specific person’s voice. This technology requires a certain amount of voice data to analyze and learn the unique vocal characteristics of the individual. Once trained, it can generate speech that sounds very similar to the original voice. Voice cloning models are typically built using techniques from the field of deep learning, a branch of AI. One common approach is to use a type of model known as a recurrent neural network (RNN), which is particularly well suited to dealing with sequential data like speech. Examples of programs that clone voices are Google’s Tacotron and Speechify Voice Cloning.
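To illustrate why a recurrent network suits sequential data like speech, here is a toy PyTorch sketch of the basic idea: a GRU that predicts the next acoustic feature frame from the frames heard so far. Real voice-cloning systems such as Tacotron are far more elaborate; every name and size below is illustrative only:

```python
# Toy illustration of the RNN idea behind voice models: predict the next
# acoustic feature frame (e.g., an 80-band mel spectrum) from the frames so
# far. Real systems like Tacotron are far more elaborate.
import torch
import torch.nn as nn

N_MELS = 80  # assumed mel-spectrogram resolution

class NextFramePredictor(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        # The GRU carries a hidden state from frame to frame: that state is
        # what makes the model "recurrent" and fit for sequential data.
        self.rnn = nn.GRU(N_MELS, hidden, batch_first=True)
        self.out = nn.Linear(hidden, N_MELS)

    def forward(self, frames):                 # frames: (batch, time, N_MELS)
        states, _ = self.rnn(frames)           # one hidden state per time step
        return self.out(states)                # predicted next frame per step

model = NextFramePredictor()
mel = torch.randn(1, 100, N_MELS)              # 100 fake frames of a voice recording
pred = model(mel[:, :-1])                      # predict frame t+1 from frames up to t
loss = nn.functional.mse_loss(pred, mel[:, 1:])
loss.backward()                                # training repeats this over real voice data
print(f"toy training loss: {loss.item():.4f}")
```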

You have probably heard about the rise of AI voice-cloning scams. But voice cloning can also be used to generate vocals for songs using, for example, Voice Swap or Uberduck.ai. A few examples are given below.

 

AI-generated cover (Barbie Girl by Johnny Cash)

 

As an example, consider this AI-generated cover of Queen’s Don’t Stop Me Now by a fake version of Kanye West. Not yet as good as the original, but come back later…

 

Or what about Get Lucky (Daft Punk) sung by Michael Jackson:

 

Here is the original Daft Punk video of Get Lucky (still much better)

 

Or the recent ‘Heart on My Sleeve‘: it sounds like it features two of the biggest artists in the world. It has original lyrics. It has a trap beat that could fit in at the top of the charts. And it was created with AI software. The song sounds like a collaboration between Drake and The Weeknd, but it was actually made by an anonymous creator called Ghostwriter, who used AI to replicate the artists’ voices to make a new track. Ghostwriter uploaded the song to TikTok, YouTube, and Spotify, where it generated over 500,000 streams by Monday. On Tuesday, the song was no longer available on the platforms. If you are lucky, you can still listen to it [here]. See also This A.I. Drake & The Weeknd song is terrifying (and kinda good). This video, at 5:38, shows how well this works.

 

But what about copyright?

Copyright issues in the music industry are already complex, but AI will probably make them even more complex. Google, and you will find a lot of discussion about this. I am not going into this here, although it is very interesting. Perhaps one quote (see here):

Generative AI software (like Magenta) is “trained” by feeding it vast quantities of content – text, lyrics, code, audio, written compositions – and then programming it to use that source material to generate new material. In October 2022, the RIAA shot a warning flare by declaring that AI-based extractors and mixers were infringing its members’ rights by using their music to train their AI models. Those that side with the RIAA argue that AI’s mindboggling ingestion of copyrighted music violates the Copyright Act’s exclusive rights to reproduce and create “derivative works” based upon one or more preexisting works. Because generative AI produces output “based upon” preexisting works (input), copyright owners insist that a license is needed.

In a recent interview of Rick Beato with Bjorn Ulvaeus (ABBA), Bjorn raised the question of whether artists could opt out of having their work used to train AI software. I don’t think this is possible at the moment, but it would perhaps be a route towards revenues and credits for artists. And how should income be distributed if an AI-generated song comprises components from different artists? In the same interview, several other interesting issues come up. If millions of AI-generated songs were uploaded to the streaming platforms, this would dilute the royalties of the ‘real’ artists; thus, the streaming platforms need to come up with filters. In addition, the large costs of storage and energy (for the ICT) would be wasted on the many ‘bullshit’ productions being uploaded. For music that uses voice cloning, the question arises: who owns the voice? The label owns the sound recording, but your voice is yours. Can you copyright yourself? Where is the distinction?

 

Some interesting reflections on AI in music

The AI Effect: A New Era in Music and Its Unintended Consequences (Rick Beato)

 

AI Dominance – Quit Making Music (Barry Johns Studio Talk)

 

What about recording?

I haven’t looked into this yet. Will AI help during the actual recording?

 

Training

Given the fast progress AI is making in music production, I feel it is very important to incorporate it into training programs that teach music composition, instrumentation, mixing, mastering, etc. Students should at least be aware of these developments and understand the (dis)advantages of using AI-based tools.

 

Quotes

Some quotes I collected from several forums and websites regarding AI in music:

  • I’m in this business to interact and communicate with human beings, friends, (local) community members, musicians, composers, conductors, technicians – I’m pretty sure this is the case for most other folks too.
  • Music is so repetitive and restrained these days, AI could definitely be the co-pilot, or in many genres, the pilot.
  • I feel the same about mastering. It’s being restrained by a set of rules set forth by the DSPs and social media, and AI is getting better each year. How many are just using Ozone presets these days? Sure, us engineer types will be against it, but the artists and labels looking to save at every turn and have things done right now will use it to its fullest. It’s less than 5 years out imho.
  • I think that the audience might be at least a little disappointed when the curtain lifts and there’s just a laptop sitting on a stool!
  • Don’t worry, there will be plenty of real bands playing in bars which are actually cover bands of AI artists.
  • Music creators can use AI-generative tools to generate new ideas for their music and lyrics.
  • “No one of us will be replaced by a machine because it lacks humanity and expression,” a drummer said in 1978.
      • What happened? They were replaced; a few remain, and they struggle to make a living out of it.
  • “Autotune will not make bad singers successful because it sounds robotic,” said a guy who studied singing in 1997.
      • What happened? Robotic and mechanical sound hid the lack of talent, and then it became fashionable to use a drum machine in the 80’s and to sing robotically and unskilled. The low-cost music prevailed.
  • Great music is created from human experience and deep emotion. And all this stuff AI cannot handle. AI can just guess what comes next. Fanbases will become more and more important, since the fans would like to be close to the artist they love (interview).

 

Further resources

Published on: April 26, 2023 · Last updated: January 7, 2024 · Categories: Audio technology, Musical Diary
