“It’s a bit like someone you love for years having a slightly different haircut,” says Giles Martin. “And you realize you still love them.” The producer is talking about his Beatles remixes, which have been available in their most expansive form on Apple Music since June, when the service launched its Dolby Atmos-driven spatial audio feature. As more listeners than ever before encounter Martin’s Atmos mixes of Sgt. Pepper’s Lonely Hearts Club Band and Abbey Road (which happens to be one of the most impressive spatial-audio showcases from any artist), he dug into the technical marvels and challenges of three-dimensional sound, explained why the current version of Sgt. Pepper’s won’t stick around much longer, and much more.
Sgt. Pepper’s was the first Atmos mix you did. What was that process like?
Sgt. Pepper’s, how it’s being presented right now, I’m actually going to change it. It doesn’t sound quite right to me. It’s out on Apple Music right now. But I’m gonna replace it. It’s good. But it’s not right. Sgt. Pepper’s was, I think, the first album ever mixed in Dolby Atmos. And we did that as a theatrical presentation. I liked the idea of the Beatles being the first to do something. It’s cool that they can still be the first to do something. So Sgt. Pepper’s is a theatrical mix that’s then being converted into a smaller medium. Therefore, it’s not quite right. I’m gonna go back to the theatrical mix and make it into what’s called near-field Dolby Atmos, as opposed to the cinema Dolby Atmos. It’s a bit bright. It’s a bit digital. But again, I’m gonna replace it, so that’s cool.
Abbey Road does seem to sound quite a bit better. There’s something a little float-y about the way Sgt. Pepper’s sounds right now.
It seems to lack a bit of bass and a little bit of weight behind it. Abbey Road is a much better-functioning Atmos mix because it’s much closer to the stereo mix, sonically.
I presume you start with the stereo mix and then proceed to the multi-channel one, right?
We start off with the stereo. I feel immersive audio should be an expansion of the stereo field, in a way. I like the idea of a vinyl record melting and you’re falling into it. That’s the analogy I like to use. And if you have lots and lots of things all around you all the time, it can get slightly irritating and confusing, depending on what the music is. If it’s EDM, it’s obviously fine. But the interesting thing about immersive audio is there’s a center point to it. So it’s almost like mono, but expanded. It’s like having a bit of toffee and smashing the toffee with a hammer and all the shattered bits going around you. And if you don’t have a focal point, if you don’t put your drums in the center or the vocals in the center, you don’t really get a sense of immersion. It’s a bit like a James Turrell room, where you’re just in this colorless room.
And we are, by our nature, forward-facing individuals who don’t like too many things creeping up behind us. If you have a lot of sound coming behind you, you want to turn your head. I get criticized sometimes for not being expansive enough with these mixes, but it’s what I believe. I like the idea of falling into the record as opposed to just being circled around.
It does seem like there’s something very cool going on with John Lennon’s vocal on “A Day in the Life” in the Atmos mix, where it feels like the reverb is behind you.
With Beatles mixes, because we have, I suppose, the money to do it, and the luxury of time, what I and [engineer] Sam Okell tend to do, opposed to using digital effects, is we’ll place speakers back in Studio Two [the Abbey Road space where the Beatles originally recorded]. And we’ll re-record John’s voice in Studio Two, so what you’re hearing are the reflections of the room he’s singing in. It brings the vocal closer to you.
What are some of your favorite moments in the Abbey Road Atmos mixes?
“Because” is three tracks of vocal, three tracks of three Beatles singing together. John, Paul, and George sang harmonies together and then did it three times. We also put that back into Studio Two. You get to create this beautiful sound field that wraps around you and you fall into it. It sounds unearthly.
“Sun King” is an interesting example, because you have the crickets in the back. And then the guitar on the [original] record pans left to right. But now in Dolby Atmos, I pushed the sound field further so it comes to the side of you and goes around. You’re dealing with a record that’s been around for 50 years that everyone loves, and no one’s ever said sounded bad. So it’s a bit of a tricky job. But what I like to do is follow what they wanted to do. They panned, so I follow the panning, but we pan around you, as opposed to just side to side. And that’s a really good example of immersive audio.
In “I Want You (She’s So Heavy),” you have the organ in the back right. It’s simple, but very striking.
Yeah, it works well in the back. And then of course there’s that white noise in that song that goes on for years. We have that swirling around you. Yeah, that moves around. And again, if I had the whole record swirling around, it would probably make you feel a bit sick. It’s like being in a tornado — you need to be stationary to feel the tornado. If you’re in the tornado, you’re gonna be just moving round and round. You want to be stationary so you can feel the wind around you.
I’ve heard some Atmos mixes of classic songs where the vocal is more upfront than it’s ever been before, and you suddenly realize that the vocalist was actually pitchy that day. It exposes flaws. How do you avoid that?
I agree with you, and the thing is, it’s a bit like doing an autopsy, where you’re opening up the body and you’re showing the separate parts. Sometimes you start hearing tuning discrepancies that you didn’t know were there. And those discrepancies are part of what made the record great before, so they shouldn’t be tuned. So we never tune the thing or fix timing errors or anything like that. That’s not what my job is. And if you have a very direct bit of audio coming out of a single speaker, you start listening to the speaker and not the sound field. We try and be less discrete with the vocal. In fact, we don’t really put vocals hard center. We used to in the olden days. We don’t anymore. We tend to blend the sound, so it’s more like a Monet painting. Music’s like that, you know: When you have a drum kit or a singer in front of you, you don’t necessarily hear them directly like that. You hear them together in a space.
You don’t always put the drums in the center, either, right?
It depends what the track is, to be honest with you. Especially with earlier Beatles stuff where you have four tracks. If, in “A Day in the Life,” you put the drums in the center, you’d have to have bass and drums and vocal in the center, and everything else on the left-hand side. Also, in the case of the Beatles, because of the nature of the way Ringo is as a drummer… He’s quite often more of a parts drummer, a song drummer, rather than just a rhythm drummer. So when songs are being driven by the drums, they should be in the center, but when he’s like a percussionist… On “A Day in the Life,” the center is John’s voice, or Paul’s voice in the mid-section, and the rest of the world is wrapped around that.
What’s the actual mechanism for placing parts in the sound field? I think people might picture almost a joystick.
In the old days, when I was doing the Love show [in Las Vegas] and the Love album, I had a joystick. But now, actually, I have a mouse. I want my joystick back! Essentially you’re looking at a three-dimensional square where you can see inside, and you have a dot and you can move it around that space. And then you can also make that dot bigger or smaller so it dissipates among the speakers.
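The dot-in-a-box interface Martin describes maps naturally onto object-based audio: each sound is an object with a 3D position and a size, and the renderer turns those into per-speaker gains. The sketch below is purely illustrative — the speaker layout, the inverse-distance gain law, and the way `size` spreads energy are all assumptions for the example, not how the actual Dolby Atmos renderer computes gains.

```python
import math

# Hypothetical 5-speaker layout (x, y, z) on a unit cube: an
# illustration of object-based panning, not the real Atmos renderer.
SPEAKERS = {
    "front_left":  (-1.0, 1.0, 0.0),
    "front_right": ( 1.0, 1.0, 0.0),
    "rear_left":   (-1.0, -1.0, 0.0),
    "rear_right":  ( 1.0, -1.0, 0.0),
    "top":         ( 0.0, 0.0, 1.0),
}

def object_gains(pos, size=0.0):
    """Per-speaker gain for an audio object at `pos` (x, y, z).

    `size` spreads the object: 0.0 keeps it point-like (nearest
    speaker dominates), larger values flatten the gains so the sound
    dissipates among all speakers -- like making the dot bigger.
    """
    weights = {}
    for name, spk in SPEAKERS.items():
        d = math.dist(pos, spk)
        # Inverse-distance weight, softened by `size`.
        weights[name] = 1.0 / (d + 0.1 + size)
    # Normalize so total power stays constant as the object moves.
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {name: w / norm for name, w in weights.items()}

# An object placed hard front-left mostly feeds that speaker...
g = object_gains((-1.0, 1.0, 0.0))
# ...while a large "dot" spreads its energy far more evenly.
g_spread = object_gains((-1.0, 1.0, 0.0), size=5.0)
```

Automating a path for `pos` over time is what replaces the old joystick: the mouse drags the dot, and the renderer re-solves the gains continuously.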
Abbey Road Studios has software that allows you to essentially reverse-engineer an old track of audio with multiple instruments on it and separate it into individual tracks. You used that on Ron Howard’s Eight Days a Week documentary and The Beatles: Live At The Hollywood Bowl companion album, right?
Yeah, on Hollywood Bowl I took the crowd [noise] off and then I sort of put the crowd back on again. With the source separation software, I need to make absolutely sure that it does no harm to the audio whatsoever. With things like Hollywood Bowl, to be honest, the audio is pretty cruddy anyway. So it was actually making the audio sound better, because I was reducing the screaming. I think if you compare it to the Hollywood Bowl release my dad had to work through in the ’70s, it’s far better.
The software is getting a lot better. I’m constantly looking at how we would approach it if I ever get to [remix] Revolver or Rubber Soul, early albums, which a lot of people want me to do. That’s a good example of, “How do we do that?” How do I make sure that John or Paul’s vocal isn’t just in the right-hand speaker, but also make sure that his guitar doesn’t follow him if I put it in the center? On “Taxman,” the guitar, the bass, and drums are all on one track! That’s why the record is basically on the left-hand side, and then there’s a shaker on the right-hand side of the center.
So you want to wait for the source-separation software to continue to improve.
That’s right. Despite the constant requests I get on Twitter or whatever to do these albums, I want to make sure that we can do a good job, and do a beneficial job. You’ve got to make sure that you’re doing things at the right time for the technology.
What do you think of the Atmos experience on headphones? Because, in the end, it’s an emulation of an actual multi-speaker system.
There’s been an exponential growth in technology in spatial audio for headphones, which has happened in an incredibly short space of time. I would say that two years ago, it was unlistenable. And now it’s a good experience. The exciting thing is that it’ll only get better. I think we’re right at the beginning of this. And I think what it can do is create intimacy with music. You can hear the difference with spatial audio. It may not always be better, but there’s a difference. I think we’re learning the tools to provide that difference for people. What’s great is that it creates more of a lean-in listening environment where you’re paying attention to it, as opposed to just having audio being played into your head to stop you from thinking.
Spatial audio in headphones is hugely variable, depending on the size of your head, the way your neck is on your shoulders. Where we perceive sound coming from varies with our physical bone structure. So I think what will happen, what is happening, is that there will be a lot of facial recognition, instant body measurements, pressure testing with headphones. That technology will improve. It will become more personalized for you as a headphone experience. On top of that, as you know, I work with Sonos, I’m head of sound experience for Sonos. So I’m involved in the listen-out-loud experience. There’s a huge push from Sonos and other companies to try to create immersive sound fields from single boxes or multiple boxes, and Dolby is doing this as well.
Can you begin to explain what’s happening on a technical level when we’re hearing an approximation of Atmos in headphones?
It’s so complicated. Essentially, if you think about it, we’re listening to just a stereo signal. But our brains think we’re not. The best way to think about it is that these systems are trying to work out how we process directional sound, through phase, EQ, and different time alignments, and using that to trick our brain.
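The phase and time-alignment tricks Martin mentions correspond to the two classic localization cues: interaural time difference (the far ear hears the sound slightly later) and interaural level difference (the far ear hears it slightly quieter). A toy sketch of those cues, under stated assumptions — the Woodworth-style ITD formula, the fixed head radius, and the crude broadband level model are simplifications; real spatial renderers use measured HRTFs, not these formulas:

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, in air at room temperature
HEAD_RADIUS = 0.0875     # m, a commonly assumed average head radius
SAMPLE_RATE = 48_000     # Hz

def interaural_cues(azimuth_deg):
    """Rough interaural time and level differences for a source at
    `azimuth_deg` (0 = straight ahead, 90 = hard right).

    Returns (delay of the far ear in samples, attenuation in dB):
    the cues a binaural renderer manipulates so a plain stereo
    signal reads to the brain as directional sound.
    """
    az = math.radians(azimuth_deg)
    # Woodworth-style approximation of interaural time difference.
    itd_seconds = (HEAD_RADIUS / SPEED_OF_SOUND) * (math.sin(az) + az)
    delay_samples = itd_seconds * SAMPLE_RATE
    # Crude broadband level difference from head shadowing (toy model).
    ild_db = 10.0 * math.sin(az)
    return delay_samples, ild_db

center = interaural_cues(0)    # straight ahead: no difference
hard_right = interaural_cues(90)
```

Delaying and attenuating one channel of a mono source by these amounts is the simplest way to hear the effect for yourself; full HRTF rendering adds the frequency-dependent filtering of the ear and head on top.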
It feels related to how binaural recordings work, where dual microphones placed in approximation of human ears mean that you perceive a sense of dimension when you listen back.
It is exactly the same. It is binaural. You’ve got two ears, but you don’t listen in stereo. You listen in three dimensions. So what everyone’s trying to do is recreate that three-dimensional space in your two ears before it gets to your brain.