Delicate sound of AI restoration - “The Lord of the Things”
Author: Marinko Vukmanović
Journal: Social Informatics Journal
Issue: No. 1, Vol. 3, 2024
Open access
Re-defining important changes in the music and film restoration industry, exploring the possibility to: 1. Extract individually chosen sounds from the final movie track in order to re-balance (re-mix) its audio content. 2. Extract chosen sounds from the original audio recording and re-mix them as separated audio stems. The Results section starts by describing the process of creating a completely new mixdown from AI-generated audio stems (previously extracted from the original mono recordings) and joining it with restored original 16 mm video footage. The result is a full eight-hour movie in three episodes, each between two and three hours long and each covering about one week of the 21 days of studio time. The material was recorded during the filming of The Beatles’ “Get Back” sessions at Twickenham Studios, London, in January 1969, and chronicles the making of “Let It Be” and the band’s famed final “Rooftop Concert.” The second part describes the audio/video production of the last-ever song by The Beatles (“Now and Then”), literally made from scratch, as the core of the project was only a poor-quality analog recording of John Lennon singing and playing piano into an old mono compact cassette recorder.
The Beatles, “Get Back”, AI-generated, re-mix/re-balance
Short URL: https://sciup.org/170204394
IDR: 170204394 | DOI: 10.58898/sij.v3i1.51-54
Article text: Delicate sound of AI restoration - “The Lord of the Things”
The history of sound recording and playback dates back to 1877, when Thomas Alva Edison invented the phonograph (from the Greek for “sound writing”). This invention revolutionized how people perceived sound, particularly music, as it allowed them, for the first time in human history, to hear music without a live performance in their presence.
The subsequent phase brought the invention of the tape recorder, or “Magnetophon,” introduced in 1935 by AEG and I.G. Farben. Once it had proved its stability and delivered consistent recording/playback results, the next logical step was to commercialize the recorded content, making it available to the general public.
Initially, audio recordings were made on tape machines owned by professional recording environments, such as recording studios, and could only be played back on equipment of the same technical specifications. Those machines were very expensive, and the recording medium (recording tape) was not mass-produced or available for private use.
Maintaining tape machines required frequent (daily) upkeep, including cleaning the erase/record/playback heads, tape guides, and pinch roller, adjusting the azimuth and zenith (horizontal and vertical tilt) of the heads, and calibrating the electronics of each channel to ensure correct input/output levels according to the machine requirements and the type of tape used.
It was considered that commercial recordings made for global distribution should be available on a medium other than tape. Thus, a more convenient format was chosen (the vinyl record - 78/45/33 rpm) and record companies, owning the rights to press, pack, design the sleeve, advertise, and distribute vinyl to general store chains, took a major partnership share by taking total control over the process.
And since gramophones already existed...

© 2024 by the authors. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Recording companies and recording studios (owners of the whole “package”)
Once upon a time (not a fairy tale), almost all recording sessions that produced a final master audio recording (delivered on tape) were ordered and paid for by the record company that booked the session, primarily for its own catalog release, and the tapes were solely owned by that company.
Multitrack and 2-track master tapes were carefully stored under guarded conditions and kept well maintained in order to remain playable. They were treated as a business investment, so it was no surprise that many recording companies built and equipped their own recording studio facilities to oversee the process and keep it under total control.
Investing in studio time, engineers, producers, session musicians, and all other possible side expenses was considered an asset only if the master tapes were exclusively owned by the record company that financed the process. Artists, composers, lyricists, arrangers, and producers received their own, much smaller share later, through various forms of incentives and other legal arrangements.
It all started with the famous Abbey Road Studios, established as early as 1931 in London (UK) by the Gramophone Company, which preceded EMI.
Originally a nine-bedroom Georgian townhouse built in 1831, Number 3, Abbey Road was purchased by the Gramophone Company in 1929. Using the large garden at the rear, they constructed the world’s first purpose-built recording studios in 1931. On 12 November, following a merger with Columbia Graphophone Company to form Electric and Musical Industries, EMI Studios were opened in a ceremony that saw Sir Edward Elgar conduct the London Symphony Orchestra in a performance of Land of Hope and Glory. Since its transformation, it has become the venue for a host of the world’s most celebrated recordings from Oasis, Pink Floyd, Radiohead, The Hollies, Adele, Ella Fitzgerald and, of course, The Beatles. Not to mention the incredible cinematic soundtracks, from Harry Potter and Star Wars to Lord Of The Rings and Indiana Jones. More importantly, it has been the home of some of the most important technological breakthroughs. Since EMI engineer Alan Blumlein patented stereo at Abbey Road in 1931, the studios have been famed for innovation in recording technology, largely developed by the Record Engineering Development Department (REDD), who were responding to the needs of the artists and producers using the rooms. Their innovations include the REDD and TG desks, as well as studio techniques such as Artificial Double Tracking (ADT), created by studio technician Ken Townsend, who went on to become the studios’ MD, as well as Vice President of EMI Studios Group.
In the late fifties, Sun, Motown and Chess Recording Company studios were operating on the same principle, signing recording contracts with artists and groups all over the USA, which turned out to be a multimillion dollar investment. Some of the “names” were:
Private recording studios and “new” delivery formats
By the end of the sixties, professional recording equipment had become more accessible to private enterprise. However, most of the final mono/stereo and 1”/2” multitrack master tapes remained the property of recording companies. Tapes were securely stored and duplicated (safety copied) for potential future re-release across various evolving standards. What began with the vinyl record continued with Compact Cassette, Super 8, CD audio, DVD audio, and MiniDisc, with more to come.
It became evident that any new delivery standard was a new source of income, using the same old original master. This made the owners of the master recordings (and the artists as well) very happy, as they could “sell the good old goods - for the good new money.”
When the multitrack recording/mixing era arrived, it was even more important to keep the original tapes in playable condition. Not only 2-track masters but also 1” and 2” multitracks were preserved, as there was always the possibility to create a new mix by re-balancing the original version, possibly adding some new sounds or changing the original arrangement to produce a “new/different version of the same.” With the popularity of dance mixes, re-mixing EP versions became standard, along with preparing extra “dynamic” Radio mixes or special TV (mono/phase-checked) mixes, and occasionally, there was a need for a pure instrumental version too.
Over the past two decades, alongside the decline of physical delivery formats (not to be confused with the “LP/CD comeback,” as fashion trends differ from sales numbers and profits), many new delivery standards have emerged. These include the latest trend of non-standardized digital re-mastering, leading to a “new loudness playback standard” due to the “loudness war.” This period saw mastered recordings becoming almost unlistenable and, by many reproduction standards, nearly unplayable.
To add to the complexity, new LUFS (Loudness Units relative to Full Scale) standards have been established for cinema, TV, radio, and various commercial web platforms.
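To make the idea of a loudness target concrete, the following is a minimal sketch of the arithmetic involved, not a loudness meter. It assumes the integrated loudness of a master has already been measured in LUFS by a proper meter. The function names and the example "streaming" target are hypothetical illustrations; the -23 LUFS value is the EBU R 128 broadcast target.

```python
# Minimal sketch: how much gain is needed to bring a measured master
# to a given loudness target. Measuring LUFS itself (K-weighting and
# gating) is NOT implemented here; the measured value is an input.

# Illustrative targets only. The streaming value is an assumption
# chosen for the example, not a quoted platform specification.
EXAMPLE_TARGETS_LUFS = {
    "broadcast_ebu_r128": -23.0,
    "streaming_example": -14.0,  # hypothetical example value
}


def normalization_gain_db(measured_lufs, target_lufs):
    """Gain in dB that moves the measured loudness to the target.

    One loudness unit (LU) corresponds to a 1 dB level change,
    so the required gain is a simple difference.
    """
    return target_lufs - measured_lufs


def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10 ** (gain_db / 20)


if __name__ == "__main__":
    loud_master = -9.0  # a heavily limited "loudness war" master
    for name, target in EXAMPLE_TARGETS_LUFS.items():
        gain = normalization_gain_db(loud_master, target)
        print(name, gain, round(db_to_linear(gain), 4))
```

A master pushed to a very high loudness is simply turned down by the platform, which is one reason why extreme limiting no longer translates into louder playback.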
Restoring “Get Back” sessions
Sir Peter Jackson (the film director famous for “The Lord of the Rings” and “The Hobbit”) is the only person to have been granted exclusive access to the private film archives of the “Fab Four” by Apple Corps Limited. After previewing 60 hours of raw footage, 130 hours of audio, and the legendary “Rooftop Concert,” Jackson decided that it was crucial to improve the existing audio.
Audio restoration had been considered in the past, but the tools and methods then available offered limited potential for any significant improvement. This was clearly evident during the restoration of “Free as a Bird” for the “Anthology” sessions in 1995/96.
The complete production work was undertaken by Jackson’s company, WingNut Films, in Wellington, New Zealand. The entire process took four years to complete.
The process began with Peter Jackson contacting the New Zealand Police (forensic and investigation department), which, in collaboration with Oxford University, worked on specific software designed to make surveillance field recordings more usable, by separating voices from background noise. As a result, some voices became more recognizable, which was helpful for police work, but the musicality of the results was not as good as expected, necessitating further improvements. This led to the development of the Weta restoration software.
Peter Jackson spoke about the process, for “Esquire” magazine:
The software used the MAL working principle (Machine Audio Learning) and was internally called “Mal” (as a tribute to the late Mal Evans, longtime Beatles friend and roadie). In machine learning, a neural network (also artificial neural network or neural net, abbreviated ANN or NN) is a model inspired by the structure and function of biological neural networks in animal brains.
The software itself is not yet available for commercial use, but at least a dozen similar programs based on the same principle (some even available as free upload/download services) are already on the market.
Presumably, MAL is a process based on the principle of constantly learning and improving from experience. It produces results through continuous processing with known parameters, and those results are then fed back into the next generation of AI data for further analysis and correction. The process continues for as long as each new request to the software to distinguish and separate the wanted elements from the rest shows progress over the previous attempt. As a result, the software pays more attention to specific details (minimal equalization, dynamic and phase differences), thereby providing increasingly better results with each step.
Once the process is finished, the first step is to listen to, check, make notes, and erase all sorts of background noises from the specific instrument or vocal to be separated. In the next phase, all traces of other instruments or sounds still present on that channel (due to acoustic reflections, hiss, buzz, or specific microphone bleed) can be individually marked and further edited out.
The software was also sophisticated enough to separate individual parts of Ringo’s drum kit (kick, snare, hi-hat, toms, and cymbals) and to distinguish, and in many attempts isolate, specific voice timbres. This method made it possible to separate the individual voices of George, Paul, and John, even when they were singing together - on the same microphone!
Restoring dialogues between scenes, especially when there were complex (overlapping) dialogues, or when three, or four, members of the band were speaking simultaneously, was another task accomplished using a similar method.
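How the Weta software performs the separation internally has not been published. As a generic illustration only, many separation systems express “how much of each part of the mixture belongs to which source” as a soft mask. The sketch below assumes that a model has already produced per-bin magnitude estimates for each source; the function names and the toy numbers are hypothetical.

```python
# Toy illustration of soft (ratio) masking, a common building block in
# source separation. This is NOT the MAL algorithm; the per-source
# estimates would normally come from a trained neural network.


def ratio_masks(estimates, eps=1e-12):
    """Turn per-source magnitude estimates into masks in [0, 1].

    estimates: dict mapping source name -> list of non-negative
    magnitudes, one per time-frequency bin (all lists equal length).
    For every bin the masks of all sources sum to (almost exactly) 1.
    """
    names = list(estimates)
    n_bins = len(estimates[names[0]])
    masks = {name: [] for name in names}
    for i in range(n_bins):
        # eps avoids division by zero when every estimate is silent
        total = sum(estimates[name][i] for name in names) + eps
        for name in names:
            masks[name].append(estimates[name][i] / total)
    return masks


def apply_mask(mixture, mask):
    """Scale each mixture bin by the share assigned to one source."""
    return [m * w for m, w in zip(mixture, mask)]


if __name__ == "__main__":
    # Three bins of a mono mixture and hypothetical model estimates
    mixture = [4.0, 2.0, 2.0]
    estimates = {"voice": [3.0, 0.0, 1.0], "piano": [1.0, 2.0, 1.0]}
    masks = ratio_masks(estimates)
    for name in estimates:
        print(name, apply_mask(mixture, masks[name]))
```

Because the masks sum to one in every bin, the separated parts add back up to the original mixture, which is what makes a later re-balance of the stems possible without losing material.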
In the end, the results were nothing short of amazing. Individual “stems” (separated solo voice/instrument tracks) were created for each song (scene), allowing them to be further polished (removing noise, clicks, pops), manipulated with equalization and compression, and ultimately, possibly even altered in timing and pitch.
The final process was undoubtedly the cherry on top: balancing and shaping a new mixdown.
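Re-balancing separated stems into a new mixdown can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the production workflow: stems are plain lists of samples in the range -1.0 to 1.0, each stem gets a gain in dB, and the sum is scaled down only if it would exceed full scale. All names are hypothetical.

```python
# Minimal sketch of a mixdown from stems: per-stem gain, summation,
# and a simple peak safeguard. Real mixing adds EQ, compression,
# panning and much more.


def db_to_gain(db):
    """Convert a fader setting in dB to a linear amplitude factor."""
    return 10 ** (db / 20)


def mixdown(stems, gains_db, ceiling=1.0):
    """Sum stems with per-stem gains; stems without a gain use 0 dB.

    stems: dict mapping stem name -> list of float samples.
    gains_db: dict mapping stem name -> gain in dB.
    Shorter stems are treated as silent after their last sample.
    """
    length = max((len(s) for s in stems.values()), default=0)
    mix = [0.0] * length
    for name, samples in stems.items():
        gain = db_to_gain(gains_db.get(name, 0.0))
        for i, x in enumerate(samples):
            mix[i] += gain * x
    # Scale the whole mix down only if the summed peak would clip
    peak = max((abs(x) for x in mix), default=0.0)
    if peak > ceiling:
        scale = ceiling / peak
        mix = [x * scale for x in mix]
    return mix


if __name__ == "__main__":
    stems = {"vocal": [0.5, -0.5, 0.25], "drums": [0.25, 0.25]}
    # Bring the drums down by 6 dB relative to the vocal
    print(mixdown(stems, {"vocal": 0.0, "drums": -6.0}))
```

Changing only the values in `gains_db` produces a different balance from the same stems, which is the essence of re-mixing material that originally existed only as a single mono track.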
“Now and Then”
Compared with the complexity of the work required for the “Get Back” recordings, isolating John’s voice from the piano (and some TV noise in the background) was an easy task.
Latest
Sir Peter Jackson recently discovered, arranged to purchase, and finally acquired the original “Lost Beatles tapes.” His assistant located a 78-year-old man in Florida who was in possession of the two original tapes, recorded in late December 1962 at the “Star Club” in Hamburg (Germany), being the very first Beatles recordings ever made. The Beatles played at Hamburg’s “Star Club” for three consecutive years. Their repertoire mainly consisted of well-known rock ‘n’ roll standards. Adrian Barber, then the stage manager of the club, decided to tape one of their gigs using a Grundig home reel-to-reel tape recorder, with a small microphone placed on stage, in front of John’s feet.
Now, with Peter Jackson in possession of the tapes, their future might just be similar to...
Conclusions
A final conclusion might be that it is unpredictable how this fantastic AI software will translate into the worlds of:
• Professional Audio: This includes re-mixing and re-balancing countless old mono recordings, adding new singers and languages, creating instrumental versions of legendary vocal songs, or producing new stereo, 5.1, or Dolby Atmos versions.
• Audio Recording: This involves using various elements of original songs, incorporating them into a new “environment” using individual sounds/samples to become a part of completely new songs, changing original arrangements with unpredictable production impressions.
• Classic Film/TV Series: This encompasses the restoration of countless inferior-sounding movies and TV series, since the technology at the time was of much lower quality and with limited manipulation possibilities.
Conflict of interests
The authors declare no conflict of interest.
References
- https://www.abbeyroad.com/news/then-now-a-brief-history-of-the-worlds-most-famous-recording-studios-2595
- https://www.esquire.com/entertainment/music/a45727137/peter-jackson-beatles-now-and-then-interview/
- https://en.wikipedia.org/wiki/Neural_network_(machine_learning)