Sound in Lars von Trier’s Dancer In The Dark and Breaking the Waves

Back in 1995, Lars von Trier made waves with his Dogme 95 manifesto, which he co-wrote and co-signed with fellow filmmaker Thomas Vinterberg. The manifesto became notorious for prohibiting technological manipulation and special effects, urging instead a return to traditional values. Whatever the creative goal of the filmmaker, it had to be achieved through the merits of story, theme, and actors’ performances alone.

Trier himself failed to fully adhere to the strict rules of the manifesto in the films he made after writing it, but the limitations he imposed on himself nevertheless led to the creation of wonders such as Breaking the Waves (1996) and Dancer in the Dark (2000).

With regard to sound, rule number 2 of the manifesto stipulated that “sound must never be produced apart from the images or vice versa. (Music must not be used unless it occurs where the scene is being shot)”. And that’s on top of the manifesto’s ban on post-production work.

My initial reaction when I first read this rule was to think that the man was out of his mind. Isn’t the ability to separate, manipulate, and recombine image and sound the heart and soul of filmmaking? I was convinced this rule would condemn his films to total sonic bleakness.

To my surprise, and as you can probably guess from the length of this post, I was very wrong. Sound does play a very active role in both Breaking the Waves and Dancer in the Dark. It’s firmly embedded in the thematic and structural framework of the narrative, and perhaps the most satisfying thing about it is that it emerges organically from within the story, something that surely came as a consequence of the manifesto’s prohibition of artificiality.

The two films belong to the ‘Golden Heart’ trilogy, which was inspired by a children’s book that Trier loved as a kid and which tells the story of a girl who gives away everything she has and ends up broke and alone in the woods.

They’re also strikingly similar in many ways.

Both are melodramas, a genre which Trier chose because it provided the perfect platform for creating the high levels of emotion he was after.

Both share the same themes of transcendence and sacrifice, or more precisely, transcendence through self-sacrifice.

Both have as protagonist a female Christ-like figure who seems childlike, weak, and helpless but deep inside is strong and committed to staying true to herself, even if it means dying. It’s ultimately through their commitment to this value that they attain transcendence.

Both share the same basic plot: a woman sacrifices her life to save the person she loves because she believes it was her fault that this person got into his predicament.

In terms of sound, both films use it similarly in some respects and differently in others. Most significantly, both use it to dramatise the thematic element of transcendence at the end. But each film takes a contrastingly different route to get there. In Dancer in the Dark, sound has a rich presence and forms part of Selma’s characterisation, thus playing a key role in facilitating audience identification with her (See this post for more on identification). In Breaking the Waves, there’s only one active sonic element, which makes its presence felt mostly through absence and which we only hear at the end of the film. Even so, it acts as the central symbol that delivers the story’s thematic resolution.

I’ll start with Dancer in the Dark.

Dancer in the Dark contains probably one of the most brutally devastating and heart-shattering endings in the history of cinema. What makes it so harrowing is not just the intensity of the particular situation – and of Björk’s performance – but also how viscerally we the audience feel the pain and the agony.

From a filmmaking point of view, this ending owes its effectiveness to Trier having done an exceptionally good job of putting us in a deep state of identification with the protagonist, and of getting us inside her head and keeping us there all the way to the end. In this respect, the film is a real tour-de-force.

Three key requirements to getting identification right are, ONE, that the audience understand the protagonist’s belief structure, so that they can comprehend how he or she is being affected by the events taking place; TWO, that the audience experience the same emotional states as the protagonist and arrive at them through the same perceptual and cognitive processes; and THREE, that the audience develop a strong emotional connection with the protagonist and get to care for what happens to him or her.

In Dancer in the Dark, sound is at the helm of all three.

The film is about Selma, a single mother and an immigrant factory worker living in rural America for whom life is a struggle. She’s going blind due to a congenital condition which her son has inherited. She’s keeping her progressive blindness secret because she fears she might lose her job and would not be able to pay for an operation to save her son’s eyes. One day Bill, a local cop and her landlord, confesses to her that he’s in a dire financial situation. Selma, to make him feel less ashamed, confesses to him that she’s going blind and tells him about the money she’s saving for her son’s operation. Shortly after, Bill steals the money from her and, in a tragic unfolding of events, Selma ends up killing him and being sentenced to death by hanging. Although the opportunity arises to change the sentence, she refuses as it would mean breaking a promise she made to Bill not to tell anyone about his money situation, and, worst of all, it would jeopardise her son’s chance to have the operation as the money would have to be used to pay for a proper lawyer.

The only little joy Selma has in life is her love for musicals of the classical Hollywood era. She also has a gift for musicality, one of her most distinctive traits. Selma can hear music in the small sounds of everyday life, sounds which magically transport her into a fantasy world of song and dance.

Both her love of the musicals and her musicality serve as a powerful means of identification with Selma. She repeatedly talks about her love for the sonic world and the things she likes and dislikes about musicals in a way that pulls us into her mental model and helps us understand how she copes with the pain in her life.

In terms of getting us into Selma’s head perceptually and cognitively, Trier manipulates image and sound to take us into her inner fantasy world. First, an apparently insignificant noise slowly begins to sound different. It gains prominence and becomes more regular and rhythmic. Trier then shows us the source of the sound and cuts to Selma becoming entranced by it and slipping into a dreamy state. The colours become brighter. The sounds finally become music, Selma starts to sing, and everyone else starts to dance with her.

Even when things get really ugly, Selma manages to find a way out of her pain through her gift for musicality.

But this trait does far more than help position us perceptually and cognitively inside Selma’s head. It also helps us create an emotional bond with her.

A character having admirable qualities is a sure way of creating the emotional connection between audience and character required for identification to be effective. Selma’s musicality says a lot about her. This is a woman who’s been dealt a very bad hand in life, and yet she still manages to find beauty through one of the few things she’s got left – the little sounds around her. It does make her very likeable.

The lyrics of her songs, too, give us direct access to her inner world and help us create a strong emotional bond with her. Particularly touching is this song.

Here, Selma demonstrates her truly stoic nature – her ability to turn the most painful of situations into something to embrace with joy rather than fear. The song shows us that Selma not only has accepted her blindness but almost embraces it as an opportunity to more deeply enjoy the richness of the sonic world. Sight has become redundant to her.

With this tactic, Trier is effectively preparing us for the kill. Selma’s love for the sounds of life will play a crucial role in achieving the extreme pathos he wants to elicit in us at the end of the film. After this song, it’s impossible not to deeply care for Selma and not to admire her resilience. By now, we’re deeply immersed in her inner world and understand that no matter what life throws at her, so long as there are sounds around her that she can turn into music, she’ll be fine. When Trier takes even those little sounds away from her towards the end of the film, our hearts will bleed for her, because we understand the devastation she’s feeling.

As part of his strategy for strengthening our emotional connection to Selma, Trier also makes all the people in Selma’s life, including her landlord Bill, deeply care for her. If they do, so do we.

This tactic reaches its climax with Brenda, a prison guard who also develops empathy with Selma. “I know you love your son very much”, she tells Selma. “Got a boy of my own back home”.

What Trier in fact does is combine the two tactics – Selma finding solace in the little sounds around her, and other characters deeply caring for her – to create a Molotov cocktail of raw emotion at the end, which he delivers through a setup/payoff structure.

Brenda wants to make Selma’s final moments as bearable as possible. In the setup scene, she finds out about Selma’s love for sounds and the soothing effect they have on her – as well as the distress their absence causes her.

The payoff comes in the scene where Selma has to march the 107 steps to the gallows but struggles to take even the first one. Brenda has devised a plan to make the walk more bearable for her. It is a most moving act of kindness and makes the injustice that Selma is suffering all the more viscerally felt and painful to watch.

There’s yet another interesting point about Selma’s characterisation through sound. Selma’s drive, apart from wanting to save her son’s sight, is a desire to be true to herself. This value is at the core of her belief structure, and she explicitly expresses it several times in the film through speech. What is most interesting is the words she chooses: “I listen to my heart”. These words reveal her dominant sensory modality, which is so central to the film.

We first hear the sentence barely a minute into the first scene. We hear it again over two thirds into the film. And we hear it again at the end. It is in fact the very last line of spoken dialogue in the film. It can’t get more privileged than that.

Beyond belief structure, perceptual and cognitive alignment, and emotional connection, Trier also uses the musical genre itself to deepen our identification with Selma. His use of the genre serves identification in more than one way.

First, there are the songs. Songs in musicals can serve all sorts of dramatic functions – commentary, narration, exposition, emotional climax, and so on. In Dancer in the Dark, they serve to reveal aspects of Selma’s belief structure – her hopes and fears, her joys, her pain, and so on – and to echo her psychological and emotional states.

But Trier also does something truly original with the genre: he uses the conventions of the musical themselves as metalanguage. In the film, Selma loves musicals. She’s taking part in an amateur staging of The Sound of Music, she regularly goes to screenings of Hollywood musicals at the local cinema, and she talks with others about which conventions of the genre she likes and which she doesn’t.

Trier then uses these conventions she talks about as stylistic elements at the end of the film, making the overall effect a lot more tragic.

Earlier we heard Selma tell Brenda, “In a musical, nothing dreadful ever happens”. In the previous clip, she tells Bill how much she hates the last song and how she used to cheat to avoid it. She also tells him how she hates it when “It goes really big, and the camera goes like out of the roof, and you just know it’s gonna end”.

This is how the film ends:

Something dreadful happens to Selma – she’s about to be hanged. She sings her last song. This time she can’t escape into her fantasy world. There are no beautiful sounds around her. People don’t start to dance. Colours remain unchanged. Everything is too real. It really is the last song. And the film ends with the camera going out of the roof.

There’s one final detail in this ending in terms of the use of sound. Selma’s love for the sonic world has been one of her most defining features, and a great deal of our identification with her has come through her perception of sound. A distinctive feature of sound is that it’s made of mechanical waves that come into actual contact with our bodies. This has made our experience of identification with Selma all the more physical. It has helped us inhabit her being more palpably.

Trier uses this property of sound to take the distress we feel at witnessing Selma’s hanging one notch higher by completely cutting out all sound the moment she dies.

The effect is very unsettling. It’s as if we’ve gone to the other side with her. We’re still inhabiting her being, only it’s now a soul staring in shock at the lifeless body that has so violently been taken away from her.

We know, though, that Selma has attained her transcendence. She did the moment Kathy put her son’s glasses in her hands.

Although she doesn’t manage to finish her last song, Trier writes on the screen the last few lines she meant to sing:

“They say it’s the last song. They don’t know us, you see. It’s only the last song if we let it be”.

Breaking the Waves

Breaking the Waves uses sound differently, although to the same end of dramatising the theme. It consists mostly of one sonic element – bells – which make their presence felt through absence and which we only hear at the end.

The bells act as a symbol for the journey Bess has to take to attain transcendence and also to dramatise its completion. They acquire significance through the events that transpire in the story, so we’ll need to go through the key narrative points to understand how this happens.

Like Selma, Bess will need to sacrifice her life to attain transcendence. There’s a caveat, though. It’s not the act of sacrifice itself that will ultimately lead to transcendence, but the quality of character that empowers her to do so. This quality is the strength to live a self-determined life, or what Selma called listening to one’s heart.

Selma was already 100% there from the beginning. She didn’t need to grow. It was her friend Kathy who had to do the growing inspired by Selma. “You were right, Selma. Listen to your heart!”, she tells Selma at the end.

Bess, on the contrary, is not quite there yet. She still has to overcome some inner blocks to reach the level of self-determination needed to attain transcendence.

The obstacle standing in the way of Bess is the repressive nature of the social order within which she has been brought up – a severely authoritarian Calvinist community in a remote part of Scotland run by male elders who deny a voice to women and who have no mercy for those who sin and disobey their rules, ostracising them in life and condemning them to Hell in death.

The psychological power of the antagonising forces Bess has to overcome is embodied in her mother, who’s so fearful of standing up to the tyranny of this social order that she even participates in the ostracising of her very own daughter out of fear that she might be cast out herself.

When the film starts, Bess already has a good dose of the quality of character she needs, and she seems to be in touch with her inner purpose. In the opening scene, she wins the approval of the elders to marry an outsider, something they tend to oppose. She comes across as astute and fearless, though respectful, of the elders.

The trajectory of the film is hinted at in the sermon the priest gives at Bess’ wedding.

First, there’s mention of God as “the author of every good and perfect gift”. When the priest says this, the camera is on Jan. Later, we hear Bess thank God for the gift of Jan. That’s what Jan is to her: a gift from God. This is one key motivator of the actions that move the narrative forward.

Second, there’s mention of Christ, which foreshadows Bess’ fate to become a Christ figure by sacrificing her life to save Jan.

Third, the priest talks about Bess often giving her time and effort to cleaning the church building, “not so as to be well-thought-of here on earth but out of your love for God in heaven”. This is the source of her strength and self-determination. If Selma listened to her heart and not to reason, Bess listens to God and not to worldly authority.

Bess’ greatest talent is her belief in God and in the power of prayer to bring miracles. She believes she can win God’s favours through conversations with Him. It’s a well-founded belief, and it has much to do with what transpires in the story. After Jan leaves for the first time after their wedding, Bess begs God for him to come home early. God answers her prayer, although in a “be-careful-what-you-wish-for” manner. Jan comes back home early, paralysed from the neck down due to an accident at the rig.

This is how God puts Bess through the test she needs to go through in order to overcome her limitations and level up.

Jan regrets he can no longer make love to Bess. “I can hardly remember what it’s like to make love, and if I forget that, I die”. He then urges her to find other men to make love to and then come and tell him all about it. For Jan, this would be like being together again.

Bess gets upset and refuses. She’s struggling morally, a sign in the context of the film that she still lacks full self-determination and is living according to the moral values imposed on her by a worldly power.

Jan’s condition worsens. Bess fears she might be losing him and begs God not to let Jan die.

“There’s nothing I can do”. She knows there is, but she’s not ready to take that step. “Prove me that you love him and I’ll let him live”. At this point, Bess understands, and begins to accept, that this means giving in to Jan’s request to have sex with other men, so that he won’t forget what love is and die.

Bess tries with the doctor, but he refuses out of respect. Then she jumps on a bus and discreetly masturbates a man. Afterwards, she’s disgusted with herself and begs God for forgiveness. This is what God replies:

After God’s mention of Mary Magdalene, Bess finally understands that having sex with other men to save Jan may be reason for the elders to condemn her to Hell but it’s not a sin in the eyes of God. That’s all Bess needs to know. From this point on, she becomes more emboldened and less respectful of the tyrannical powers surrounding her:

But her efforts are not enough. The turning point for Bess comes when Dodo tells her that Jan is dying. God has also stopped responding to her, presumably because of her half-hearted attempts at saving Jan. She decides to make the ultimate sacrifice to save him.

Now for the bells.

The bells, both through absence and presence, symbolise Bess’ journey towards her attainment of transcendence. It’s probably not a coincidence that “Bess” sounds a lot like “bells”.

This is how they come into the story:

Their absence symbolises the despotism of the elders of the Church.

For centuries, bells have been an intrinsic part of Christian life and identity and have been used to mark important moments in people’s lives such as birth, adolescence, marriage and death. Their sound has become almost primal and deeply implanted in the psyches of the inhabitants of such communities.

Because of the sway they hold over people, bells have often been used as a political weapon. This is precisely what the elders of Bess’ community are doing. Removing bells as a way of imposing a social order is in fact something Jacobin and Calvinist iconoclasts have done in the past.

The absence of bells also acts in the film as symbol for Bess’ call to adventure, to borrow from the Hero’s Journey terminology. In this scene, we see her heed the call:

“Let’s put them back on” is a bold declaration of war on the elders and what they stand for.

For most of the film, the bells remain invisible and silent, perhaps present only in Bess’ mind.

The first time we actually hear a bell is when Bess gets on a boat to approach the ship where the men who will eventually kill her are. There’s nothing apparently significant about it on this first occasion. But when Bess gets on the boat the second time, now knowing that she’s heading towards her death, Trier uses cinematic means to imbue the bell with symbolic meaning.

In the third shot, when she gets on board, the boat bell is framed right in the middle. Then Bess starts her conversation with God. When she asks, “But you’re with me now?” (God had stopped talking to her), God replies, “Of course I’m with you, Bess. You know that”. Bess listens to this reply with her eyes closed, to feel it more deeply. Shortly after “You know that”, the boat bell rings and Bess opens her eyes. She says, “Thank you” and looks into the camera, something she often does when she gets her way.

At the literal level, the bell is announcing the arrival of the boat at its destination. At the symbolic level, it’s announcing that Bess has won her place with God and therefore her transcendence, an announcement which she herself notes. It’s also an anticipation of the way in which transcendence will be dramatised at the end.

After this trip, Bess dies and Jan recovers miraculously thanks to her sacrifice and to Dodo’s prayer at Bess’ request.

Jan, knowing that the elders won’t give Bess a proper burial and will use the ceremony to condemn her to Hell, steals her body and gives her a proper farewell at the rig. The following day this happens:

Bess has put the bells back on. She’s attained her transcendence and is in Heaven next to God. And just as we retain Selma’s point of view/hearing from her place of transcendence beyond physical life in Dancer in the Dark, so we do with Bess in Breaking the Waves.

One other thing.

It may be tempting to conclude that there’s not much in the way of cinematic sound in Breaking the Waves beyond its symbolic function. That function is undoubtedly a very clever deployment of the resource, but for most of the film it offers little hearing-wise. However, there’s more sound to the film than meets the ear.

After Bess tells Jan, “I like bells. Let’s put them back on”, she attends church, and this is what she does:

She’s imagining how bells in the tower would sound. And what do we, the audience, do? Since we’re in a position of identification with her, we imagine what she’s imagining: the sound of bells pealing away. And since the brain can’t tell real from imagined…

This is actually a trick that directors in the silent era often used to engage the sense of hearing. They’d include shots of objects that are known for the sound they make and of people doing things like cupping their ears or using body language that suggests intentional listening. It is very effective for involving the audience because it engages the brain by giving it a gap that it needs to fill.

Trier uses this technique in another scene:

Here, Jan and Bess are having a little ‘moment’ on the phone. Since they can’t be physically present with each other, they satisfy their desire to be together this way. Later on, Jan will make reference to this moment when he asks Bess to have sex with other men: “Remember when I phoned you from the rig? We made love without being together”.

The scene is constructed in a way that prompts our brain to imagine what the character on the screen is hearing – Jan’s breathing first, and then Bess’. By using this technique of imaginary listening, Trier is effectively making us participants in the couple’s erotic games. This is probably more effective than if Trier had simply added the sound of their breathing, manipulated to seem to be coming from a phone line. Our brain has to fill the gap. It is more engaged and finds it more satisfying – perhaps more titillating, too.

The great thing about Breaking the Waves is that, although it doesn’t seem to offer much in the way of sound, it actually makes very creative use of it. And it clearly illustrates one very important point about film sound: it’s not all about an abundance of loud and spectacular sound effects carefully crafted in post-production. It’s about using sound imaginatively and actively, as part of the narrative and the process of narration, and of the form and style of the film.

Finally, Breaking the Waves contains only one active sonic element, but it’s so effective because Trier has done it the right way. He’s selected this element not only because of the rich symbolic meaning it already comes with, but also because, being sonic, it has affordances that allow Trier to tell the story in ways that would not have been possible with a visual symbol.

The concept of affordance – what a medium can offer that others can’t – is essential to understanding the creative potential of the relationship between image and sound in film. It is also a very important concept in Semiotics.

I’ll be starting the exploration of film sound from the angle of Semiotics in my next post. Till then, have a nice one and hope to see you back.

On Combining Image And Sound In Film

Many screenwriters and filmmakers find it very difficult to come up with ways of using sound more cinematically and creatively. Whenever they try, all they get is silence in their heads. It’s almost as if the muses themselves panicked and stampeded out of their minds the moment they heard the word “sound”.

This is not because sound doesn’t have much cinematic potential – it does. The problem lies in the way film sound is thought about. So that’ll be the subject of today’s post, the wrong way of thinking about sound and the alternative that can break the impasse.

Movies are so compelling because they’re made of the same stuff as reality – electromagnetic waves the eyes can see and mechanical waves the ears can hear. It is a mistake, though, to think that faithfully imitating perceptual reality will draw audiences more effectively into the story world. It is in fact the main cause of the creative impasse, like trying to complete a shape in a sliding tile puzzle that has no empty space to move the tiles around.

First, combining image and sound according to ready-made perceptual configurations amounts to mere mechanical reproduction, a big no-no in any form of representation.

Second, real does not necessarily mean authentic.

But most importantly, mechanically adding sound to images for the sake of veridicality can have the undesired effect of creating an attention deficit in the audience.

Attention is what makes a film possible. If a movie fails to engage the audience’s attention, then it is not a journey through a fictional world but a mere pile of celluloid or pixels.

The ability to get the sustained and undivided attention of an audience is what sets a good filmmaker apart from the rest. Filmmaking is in fact all about designing the cognitive processes required to create the illusion of reality in the audience’s mind. It’s the art of guiding attention through the skilful use of cinematic techniques such as camera movement, composition, editing, and sound.

But anyone who’s had a go knows that this is not easy. Why? Because attention is metabolically expensive. It requires a lot of energy, of which the brain has only so much at its disposal. If it’s to make it through the day, the brain needs to find ways of making ends meet. In the case of perception, which is the outcome of attention, the brain saves energy by storing and organising the sensory information it finds in the environment by means of two systems known as event files and schemas.

Event files are where all the relevant sensory information about people, places, events, and objects is stored, so that the next time the brain encounters them, it can retrieve the necessary information – characteristic visual features, sounds, and so on – quickly and efficiently.

Schemas are mental structures for organising more complex types of information: general knowledge about sequences of events, rules, norms, procedures, and social situations that have been acquired through experience. Film is a good example of a schema. It contains all the norms and conventions required to make sense of a movie, all of which we have acquired through experience, i.e., repeated exposure.

This system makes sense: why waste energy reinventing the wheel each time? By using event files and schemas, the brain can quickly and efficiently form a percept whenever it detects a familiar cue.
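If it helps to make the idea concrete, here’s a deliberately crude sketch in Python – every name and value in it is invented purely for illustration, not drawn from the cognitive-science literature – of an event file as a cache: a familiar cue retrieves its stored features cheaply, while an unfamiliar one forces the costly work of attention.

```python
# Toy analogy: event files as a cache of sensory features keyed by a cue.
# All cues and feature values here are invented for illustration.

event_files = {
    "door": {"visual": "rectangular, hinged", "sound": "creak"},
    "bell": {"visual": "metal, swinging", "sound": "peal"},
}

def perceive(cue):
    """Return a cheap cached percept for a familiar cue;
    signal that costly attention is needed for an unfamiliar one."""
    if cue in event_files:
        return event_files[cue]   # economy mode: reuse stored features
    return None                   # unfamiliar: full attention required

print(perceive("bell"))   # familiar cue: percept retrieved instantly
print(perceive("siren"))  # unfamiliar cue: no event file yet
```

The point of the sketch is only the lookup-before-analysis pattern: the “brain” pays the full cost of processing once, then reuses the result.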

The brain also has a surveillance system which constantly monitors the environment by scanning every one of the 11 million bits of data that the sensory organs send each second. The brain constantly compares all this data against its existing event files and schemas. If everything is as expected, then it can carry on running in economy mode. Only if anything deviates from the expectations or if anything suddenly changes will the brain ‘wake up’ and deploy its attentional resources to find out whether these deviations or changes represent a threat, an opportunity, or nothing worth investing energy in.
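The same caveat applies here: this is a toy illustration, not a model of real neural processing, but the surveillance logic just described – cruise along while everything matches expectations, deploy attention only on a deviation – can be sketched like this (the baseline values and threshold are invented):

```python
# Toy analogy: a surveillance loop that stays in 'economy mode' while
# incoming signals match expectations and wakes attention on a deviation.

expected = {"hum": 0.2, "light": 0.5}   # baseline the brain predicts
THRESHOLD = 0.3                          # deviation large enough to matter

def monitor(signals):
    """Compare incoming signals against expectations; return the
    channels that deviate enough to deserve attentional resources."""
    alerts = []
    for name, value in signals.items():
        baseline = expected.get(name, 0.0)
        if abs(value - baseline) > THRESHOLD:
            alerts.append(name)          # deviation: deploy attention here
    return alerts

# Everything roughly as predicted: attention stays asleep.
print(monitor({"hum": 0.25, "light": 0.5}))   # []
# A sudden change wakes the system.
print(monitor({"hum": 0.25, "light": 0.95}))  # ['light']
```

This is exactly the risk for the filmmaker described below: a soundtrack that never deviates from the expected baseline never triggers the wake-up call.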

And this is why it is risky to assume that combining image and sound realistically will draw audiences deeper into the story world. Ready-made perceptual configurations can have the opposite effect of telling the audience’s brain that there’s no need to spend attentional resources, as everything is as it should be.

This instinct towards naturalness is understandable. From a survival point of view, one major advantage of being able to acquire information about the same event through multiple sensory channels is that it allows the brain to verify the truthfulness of our perceptions. After all, despite their complexity, our senses are still prone to misperceptions and illusions which could potentially be deadly. Having senses that carry information along separate pathways therefore allows the brain to cross-check our perceptions and confirm that they’re accurate. So it’s little surprise that many feel realistic combinations of image and sound in film are the safest approach, lest the brain detect the perceptual fallacy that a film is.

But the truth is that we don’t need to worry about veridicality when it comes to film. First, for reasons we’ll see later, the brain is hardwired to automatically accept images and sounds that happen simultaneously in time and space as belonging to the same object, person, or event. In normal circumstances, it may immediately carry out a reappraisal and either confirm that it was right or realise that it was wrong. But in the case of film, because it knows film is a schema where everything happens for a reason, it will never discard the image–sound connection as wrong.

Also, film is a form of pretend play. Pretend play allows us to modify representations of reality in our heads. It is valuable in that it opens up a world of new possibilities for exploring different options. But for it to be effective, the brain needs a way of making sure that real and imagined don’t get mixed up – it would be disastrous if we took a real fire or lion to be imagined. The brain gets around this problem by creating a copy of the original percept and then activating a decoupling mechanism that dissociates the copy from reality. That way, we can modify the copy as much as we like without jeopardising the truth value of the real percept. So, our safety secured, the brain is happy to suspend disbelief and go along with whatever comes its way, no matter how improbable or out of this world it may be.

Veridicality, therefore, is not something a filmmaker should be worrying about when it comes to combining image and sound in film.

Does all this mean we should avoid perceptual realism? No – but it should be a deliberate choice, one aimed at having a specific effect on how the audience will perceive the scene. It’s a bit like deciding whether to use a standard medium shot that feels natural and safe instead of a dramatic low angle, or mid-key lighting as opposed to low-key.

Nor does it mean that we can combine image and sound willy-nilly. The combination of the two must meet one fundamental requirement: it must be done in a way that the brain can make sense of. And what is that way?

Since filmmaking is all about hacking the perceptual and cognitive processes of the brain, it’s simply a matter of finding the right process to hack. This shouldn’t be too difficult, since the brain too has to combine images and sounds that have been captured separately into a coherent and meaningful whole.

The brain receives information about the environment through five different sensory organs, each of which captures a different spectrum of physical reality. The eyes pick up electromagnetic waves, the ears mechanical waves, the nose chemical substances in the air, and so on. Also, each sense is processed for the most part in a different region of the brain.

To integrate these multiple sources of information into a unified and meaningful percept, the brain has to solve what is commonly known as the binding problem: it must determine which features belong to the same object or event. And it has to do so at two different levels – physical and semantic.

At the physical level, the brain has to determine what multisensory stimuli belong together. It does so by searching for patterns of neural activity across the different cerebral regions.

The senses send signals to the brain which activate the cerebral regions responsible for processing each sense. Other aspects of reality, such as time and space, also have their own dedicated regions that get activated by sensory stimuli. So what happens every time we perceive something is that the regions responsible for processing vision, sound, time, and space all fire up at the same time, and this is how the brain determines which stimuli belong together – by detecting overlapping patterns of neural activity across its different regions.

At the semantic level, the brain solves the problem of integration also by searching for overlapping patterns of information. The only difference is that the information is of a semantic nature instead of physical.

Here the problem is not to determine what belongs together but how it belongs together. If the brain grouped multisensory stimuli physically but not semantically, we would experience the world incoherently. We would be able to recognise the different sounds and images but they would not be grouped meaningfully. On a busy road, while talking to someone, we might perceive the noise of car engines as coming out of their mouth and their words as coming out of car engines. Everything would be random and incoherent and we would not be able to use that information to guide our actions and decisions effectively.

You can get a sense of the nature of the binding problem at the semantic level by imagining an image with three layers of figures – triangles, squares, and circles – each varying in size and colour, and then trying to answer the following question: which three figures are alike?

Any luck? Are you maybe thinking, “It depends…?”

What makes it impossible to answer this question is that there are three dimensions of information – shape, size, and colour – spread over three different layers – triangles, squares, and circles – but no way to link them. In order to answer the question, you’d need to be given the specific dimension to use as a reference point or unifying value for making meaningful associations across the different layers. If the question were “Which three figures are alike in the dimension of size?”, then your brain would be able to look for patterns of the “size” cue across the three layers, and thus connect the layers meaningfully. If it were colour, it would look for overlapping patterns of that cue instead.

This is how it would work in a simple real-life audiovisual situation. Let’s say the brain wants to determine a person’s gender. In that case, the physical elements are the face in the visual channel and speech in the auditory channel. The semantic element is “gender”. “Gender”, then, will be the linking value or dimension the brain uses to integrate the audiovisual information meaningfully.

The brain will then start looking for overlapping patterns across the two channels. At the visual level it will find things like skin texture and bone structure. At the auditory level, it will find things like pitch and sound power. It will then fuse them and the result will be a coherent percept that communicates something meaningful, i.e., the gender of the person.

We could change this parameter to “truthfulness” and the final percept would be different. The brain would be searching for different types of cues in each sensory channel that normally indicate whether a person is telling the truth or not – sweat and stress in the voice for example – and the final percept would have a different meaning.

For the record, in perceptual terms, the dimensions are all the different things going on in the environment on which we could potentially focus our attention, and the layers are the senses.

So how does all this apply to film?

You may recall what I said earlier about the brain being hardwired to automatically accept images and sounds that happen simultaneously in time and space as being causally connected. This is because of physical binding. It is so common that the auditory, visual, temporal, and spatial processors fire up simultaneously in the brain that evolution has concocted a rule that goes something like, “If auditory and visual stimuli happen simultaneously in time and space, then automatically synchronise them”. That’s just what we get at the movie theatre.

As for semantic binding, this is the principle we need to follow to combine image and sound in a way that the brain can make sense of, whether the combination is faithful to our everyday perceptual reality or not. It is the process a filmmaker can exploit to construct auditory and visual elements in ways that serve the dramatic, narrative, and cinematic needs of the story, and not just the brain’s demand for veridical perception.

The process is the same: to select the auditory and visual elements that will overlap to form an audiovisual pattern, and to manipulate them so that they are congruent with each other by way of sharing a common unifying dimension or value.

So first you need to select a unifying value to integrate image and sound meaningfully. Then you need to include features in both the image and sound channels that a) are semantically related to that unifying value and b) have a counterpart in the other channel that the brain can associate. Or put more simply, you need features in both image and sound that are related to a unifying value and that combined form an audiovisual pattern that the brain can detect and make sense of within the context of the story.

One phenomenon that clearly demonstrates how semantic binding works in film is the ability to successfully use different types of music with the same set of images. Say we have as the visual setting a couple by the beach at sunset. If we add a romantic melody, it will work perfectly well, but so will a suspenseful tune. Why?

In the first case, the visuals contain elements that, at least in our culture, are perceived as romantic: a secluded natural setting and dim light that inclines couples to more freely express their feelings. The music, too, contains elements that are perceived as romantic: simple chord progressions, a predictable linear melody, use of a major key, and so on. In that case, the brain has no problem finding overlapping patterns across image and sound that it can integrate meaningfully, i.e., that the couple are about to consummate their love for each other.

But the visuals also contain dimensions that we tend to associate with danger. In the darkness our sexual inhibitions may decrease, but our darker side may also feel freer to come out. In the darkness we also feel more vulnerable. Therefore, the elements that characterise suspenseful music – dissonant chords, eerie intervals, non-linear sounds, a minor key and so on – will also align well with the visual elements of danger.

In short, a lonely beach at sunset is as good a setting for romance as it is for murder. Therefore, whether we use a romantic melody or a suspenseful tune, the brain will find corresponding patterns in both image and sound and will align them in our minds either way. Each alignment, though, will produce a very different meaning.

As we saw earlier, in everyday life perception, our needs and intentions dictate the unifying value we choose to focus on, which in turn will dictate what aspects of the environment make it to our perceptual field.

And as we’ve just seen, in film, the cinematic needs of the story (romance, suspense, and so on) dictate the unifying value, and the filmmaker determines what auditory and visual elements will be selected to align with this value, so that both image and sound work harmoniously with each other to create the desired effect and meaning.

But there’s more to it. This unifying value will not just determine what auditory and visual elements go in but also how they will be manipulated to make the overlapping pattern work. In the beach example, we would use different camera angles, framing, and editing for the romantic option than we would use for the suspenseful one.

In summary, in film, the unifying value is determined by the cinematic needs of the story and the scene in question. The unifying value in turn determines what auditory and visual elements will be required and how they need to be manipulated so as to create correspondences or patterns between image and sound, so that the brain can make the right associations.

Two scenes from two different films that fabulously illustrate this principle are the chopper scene in Predator (1987) and the chopper scene in Apocalypse Now (1979). They are ideal because both films share the same subject matter, war, and both scenes share the same setting and many elements: a chopper, soldiers, and a stereo playing music. Yet because each scene has a different purpose, each requires a very different set of choices regarding the selection and manipulation of auditory and visual elements to create an effective audiovisual pattern.

Predator. This is one of those films where the audience must be made to care for the team as a whole and not just for the main character. It is in fact the group’s bantering, their comradeship, and how well they work as a team that makes the film so enjoyable to watch.

The chopper scene plays a key role in that respect. Its aim is to establish and build that sense of comradeship among the team and their knack of teasing each other. Comradeship therefore is the unifying value that drives the interaction between image and sound. And this is how the auditory and visual elements were selected, manipulated, and combined to serve that purpose:

On the visual side of things, the inside of the helicopter is lit with a dim red light to create a sense of warmth and intimacy, perfect for creating an atmosphere conducive to bonding. Framing consists of medium close-ups, and editing mostly of action and reaction shots that not only create a stronger connection with the characters but also show the nuances of their interactions and the sense of camaraderie emerging between them.

Sound-wise, the dialogue consists mostly of their bantering. “Long Tall Sally”, a 1956 Rock and Roll song, is playing in the background, with a small portable stereo player as its source. This is an interesting choice of music, since Rock and Roll has traditionally been used for male bonding. And that’s just what the song is doing here: helping them bond and putting them in the right frame of mind for bantering and for developing that sense of care for each other that is so essential to the plot and to the audience getting to like them and wanting to spend time with them. As for the manipulation of the music, because it is there for the benefit of the characters, it has to be diegetic, so the frequency range of the song had to be adjusted to make it sound like it is coming from a small stereo and also so as not to interfere with the dialogue.

The correspondences between image and sound that form the audiovisual pattern, then, are these: the dim red light, which works well with the Rock and Roll to create an atmosphere conducive to bonding; the reduced frequency range of the music, which gives a natural sense of space and keeps the dialogue clear; and the clarity of the dialogue, which plays well with the medium close-ups, which in turn serve the purpose of conveying the sense of emerging camaraderie.

Apocalypse Now. This film is about the moral ambiguity of war, particularly of the Vietnam War. It reveals this theme through the actions of US Army soldiers whose moral values rapidly disintegrate as a result of their participation in a futile, morally unjustified war. Most notorious is their use of Western cultural artefacts (Wagner, T. S. Eliot…) as ‘weapons’ intended to represent a greater “civilised” power that can easily subjugate the indigenous peoples of Vietnam.

The scene of the Ride of the Valkyries captures these thematic elements very skilfully. Its aim is to display the scale and might of the US Army and the way they exploit Western cultural artefacts to tyrannise the invaded. The unifying value that will drive the interaction between image and sound then is scale (of superiority).

Visually, the scene consists of a large number of spectacular extreme wide shots of the fleet getting ready to attack and then charging at the inhabitants of the village.

Aurally, we have Wagner’s “Ride of the Valkyries” – from his opera Die Walküre – being played full blast by a soldier from a stereo inside the helicopter because “It scares the hell out of the slopes”, along with the sound of large explosions.

If in Predator images and sounds were warm and intimate, here they are large, distant, and even intimidating. Most interesting is the manipulation of the music. In both scenes the source of the music is a stereo player. But in Apocalypse Now the music had to be manipulated very differently to serve the unifying value of scale and superiority. Even though the music is technically diegetic, it had to be made to work ambi-diegetically, since reducing the frequency range the way Predator does would defeat the purpose of the scene: to display scale and might. It wouldn’t match the extreme wide shots, and it would not sound large, threatening, and imposing. It would also be asking too much of the audience to believe that the “slopes” would be able to hear a thin-sounding tune, let alone be intimidated by it.

The final point I’d like to make is, why bother? After all, most films seem to be doing just fine with the naturalistic approach.

The reason is simple: the audience. When people invest money, energy and two hours of their time, they want to get a return for it. And what would that return be? Pleasure.

As I mentioned earlier, film is a form of cognitive pretend play. Pretend play is a behaviour that has survival value. Anything that is good for our survival comes with a ‘thank you’ gift from our genes – a shot of dopamine and other feel-good chemicals. And this is ultimately what the audience are after and pay for.

Films are like a gym for the mind. They allow us to hone one of the most fundamental skills we humans need to survive our environment: pattern recognition. That’s what films are, a system of interconnected patterns. And that’s great, because the brain gets a kick out of completing patterns. It loves to impose order on an otherwise highly chaotic environment. It constantly looks for coincidences that alert it to possible causal relationships between events. And when it makes the right connections, that’s when the dopamine flows.

The most common types of patterns used in film tend to be patterns of shapes, light, colour, sound, movement, cause and effect, time, space, behaviour, character, and action. But the relationship between image and sound itself is another rich source of pattern, one that filmmakers seldom exploit. It offers the opportunity to convey meaning in a non-linear, more interesting way, and to shake the brain out of its perceptual slumber. So what is there to lose by bothering? Nothing – and there is plenty to gain. The bottom line is that, when it comes to film, the brain wants to be engaged, and the more layers of engagement, the better. The more patterns to solve, the bigger the fix of dopamine the audience will get.

What I’ve been talking about in this post is far from everything there is to know about the relationship between image and sound. You can think of the organising principle I’ve described as the overall strategy. Then there are tactics, which offer myriad ways of putting image and sound together. Ultimately, it’s all about creating a whole that is greater than the mere sum of the parts, and that requires a dynamic process.

Exploring what dynamic means requires a different approach to film sound than the one I’ve been taking so far. So shortly I’ll be putting evolution and cognition aside for a while and instead exploring the relationship through the lens of semiotics, which deals with human-made meaning.

But before jumping into this fascinating world of meaning making, I’d like to pause and take stock of some of the things I’ve been talking about so far. Nothing beats seeing things in action, so in my next post I’ll be discussing sound in Lars von Trier’s Dancer in the Dark and Breaking the Waves.

I hope you’ll be joining me. Till then, have a great time.

Film As A Simulation Of The Brain’s Mind


In my previous post, I suggested that it is more helpful to look at story as a simulation rather than in terms of plot and character when it comes to thinking about sound cinematically, at least in the early stages. So today I’d like to explore the concept of simulation more in detail.

A simulation is a model of some aspect of reality that allows us to safely carry out experiments, learn new things, and practice skills that we can then apply to real-life situations. To that end, it needs to have some kind of interface that gives us the means to interact with the virtual world it represents. A flight simulator, for example, has a real cockpit that can tilt in any direction, that makes real sounds, and that has real controls linked to a computer system that interprets the pilot’s actions and moves the cockpit accordingly as the pilot reacts to faithfully recreated settings and situations such as airports, mountains, dangerous weather conditions, and emergency landings. All these elements allow the pilot to become immersed in the situation at hand and experience it ‘for real’.

Stories, too, are powerful simulations that allow us to explore and learn about the social world we inhabit. As Jonathan Gottschall puts it in his book The Storytelling Animal, they are a place “where people go to practice the key skills of human social life”. And the interface that allows us to interact with this simulation is identification with fictional characters.

It makes sense. All species are hardwired to learn specifically about that which is essential to surviving their environment. To us humans, one of these essentials is being able to figure out other people’s needs and intentions so that we can adjust our actions accordingly. Not an easy task, since we have to cohabit and cooperate with large numbers of strangers we know nothing about.

If stories are simulations of the social world, then it makes sense that we interact with them by ‘stepping into the shoes’ of a fictional character. Through identification with them we get to feel their longings, frustrations, virtues and flaws. We experience first-hand their struggles, the moral dilemmas they face, and the consequences of the choices they make. We walk their walk and that’s how we learn.

But how exactly does identification work as an interface? It is easy to see how we can use a cockpit to interact with a simulation, but identification? Stepping into the character’s shoes sounds great as a metaphor but if you’re entrusted with the task of performing the feat, it can leave you feeling a bit baffled.

Luckily, there’s a way of bringing this concept of identification with fictional characters down to earth by looking at it from a more biological point of view.

I’ll start with transportation. This is another term that goes hand-in-hand with identification. Rule number one is that, in order to step into the character’s shoes, the audience need to be transported to the story world first. Another metaphor, but this time it is actually closer to the real thing, since films transport us to the story world almost literally – though it is not our legs that take us there, but the physiological responses and emotions we feel in the body as a result of being exposed to the story events.

Physiological responses ‘transport’ us because, if we’re feeling anything, it means our brains think we’re there. If our brains think we’re there, we might as well be there. And why do our brains think we’re there?

Films work as an illusion because they exploit loopholes in the perceptual and cognitive processes that we evolved to help us navigate the environment. One of them is the communication time lapse that exists between the unconscious and the conscious brain.

In very simple terms, the brain works like this: it interprets the data that the senses have picked up in the environment, it establishes the context, it evaluates the information according to this context, it determines its importance, and it decides on the best course of action. Is it a threat? Run. An opportunity maybe to reproduce? Strike a sexy pose. A significant change? Get closer and find out more.

The first thing we feel as a result of this process is the physiological response, which tends to be quite basic and whose main function is to prepare the muscles to move. Then comes a fully-fledged emotion, which contains more detailed information about the required response. A racing heart in itself does not tell us much. If it comes with a rush of fear or of lust, then we get a much more precise idea of what it’s all about.

On the whole, we’re not aware of these processes. This is because the brain operates in two basic modes, unconscious and conscious. The unconscious mode is much faster at processing things than the conscious. It can organise the neurochemistry and behaviours of our system within 80 milliseconds, whereas it takes the conscious mode about 250 milliseconds to catch up with things.

It is thanks to this gap that we experience films the way we do. When we’re watching a movie, all our brains know for the first 250 milliseconds is that the senses are sending information that is organised in patterns of light, colour, sound, movement and behaviour that feel just like the real world. So the brain does with this information what it evolved to do. It processes and evaluates it, and it prepares the body for the right action. Luckily, though, just before we actually stampede out of the theatre at the sight of a dinosaur, the conscious brain realises that it’s just a movie.

In short, by the time the conscious brain has figured out that we’re only watching a movie, the unconscious brain has already made a full cognitive evaluation of a situation it deemed to be real and it has triggered all sorts of physiological and emotional responses in our bodies. These bodily sensations are what anchor us in the story world. As far as the unconscious brain is concerned, we are there.

All this works very well for transportation into the story world, but that’s only the first stage. Other things still need to happen for identification with a character to take place.

One good way of understanding identification is by looking at what it is not – empathy. These two terms are often used interchangeably, but they are not the same. We can, for example, empathise with humans and animals alike, but we can only identify with humans. Making this distinction is crucial.

Empathy is when we recognise and share the emotions and feelings of another being. We see their situation from their perspective and as a result we get to feel what they feel. This is possible thanks to a mirroring system that we evolved to deal with our social world. It works by replicating in our biology the neural processes that happen in the brain for coordinating and carrying out actions.

Let’s say that a man decides to open the door. A set of neurons will fire away in his brain and will activate the regions responsible for coordinating the actions involved. The motor system will then receive instructions and perform the task. Thanks to this mirroring system, if I watch this man carry out this action, the same neurons will fire in my brain and activate the same regions involved in the operation, except that the motor system will not perform the action.

This mechanism works just the same with emotions. By picking up very subtle sensory cues from things such as facial expressions, body language, and tone of voice in another person, the brain is able to replicate in our bodies all the neural processes involved.

It is easy to see from an adaptive point of view the benefits of having such a system. It gives us first-hand information about the intentions and motivations of others and about their mental and emotional states, and it helps us adjust our responses accordingly. If you’ve ever seen someone in distress and immediately understood that all they needed was a hug or a few friendly words, that was your mirror neurons giving you a tip. Or if you’ve ever had a feeling that someone was lying to you or was up to something, that was your mirror neurons too.

As for film, you can imagine how useful these mirror neurons are when it comes to getting the audience to connect both biologically and emotionally with the characters on the screen. This alone, though, is not enough. For identification to happen, the audience must not simply come to experience the same emotions and thoughts as the character; they must reach those mental and emotional states through the same cognitive processes that took the character there.

With empathy, things happen there and then, mostly thanks to our mirroring system. Identification is much wider in scope. It requires the audience aligning with not only the character’s feelings and emotions but also with his or her cognitive processes – how he or she reasons, evaluates things, solves problems, sets goals, formulates plans, and so on. That is why we can empathise with an animal such as a dog but not identify with it. Dogs feel emotions similar to ours because they are mammals, but they use different cognitive processes to solve their problems and as a result we cannot ‘step’ into them.

There is one assumption in NLP (Neuro-Linguistic Programming) that makes it easier to understand this concept of identification with fictional characters: “The map is not the territory”.

The territory refers to reality, the physical world that exists independently of our experience of it. The map refers to our minds and to the model of the world we have built through our perceptions, personal experiences, culture, and what we have learnt from the significant others that have been present since early in our lives. It contains, among other things, our beliefs and values, which play a key part in determining the decisions we make and the actions we take.

We only ever get to know our own version of reality. Because it is so unique and personal, we can’t really step into someone else’s mind and experience the world through the lens of their model, not unless he or she is a fictional character, that is.

And that brings us back to story as simulation of the social world and identification as its interface.

If our mental model of the world determines the cognitive processes by which our brains perceive, interpret, and react to events in the environment…

And if identification is achieved by getting the audience to feel and think what the character does through the cognitive processes that took the character there, i.e. through the same mental model…

And if simulations allow us to create models of some aspect of the world that we can use in its stead…

… then we can achieve identification with a character by creating a customised mental model of the character’s world that contains the values, beliefs, and experiences that will lead to the emotions and behaviours we want to explore in the simulation-story. We can then ‘install’ such a model in the audience’s minds. This way, their brains will be operating within that specific mental model, and as a result, they will walk the character’s walk, and arrive at the same thoughts and emotions by means of the same filters that led the character to do, think, and feel what he or she did.

Equally important is that this whole process generates a lot of neural activity in the brain. The resulting synaptic connections are what get stored in the nervous system as memories. When this happens, the film has served its evolutionary purpose, and the audience get a reward in the form of a shot of feel-good chemicals such as dopamine. That’s why films that get identification right tend to do well at the box office.

One of the main tools for achieving identification with characters is narrative structure. The set-up, for example, is all about ‘installing’ in the audience’s brains the specific parameters in the character’s model of the world that will cause him or her to take the actions and make the decisions that will lead to his or her success or demise. From that point on, even if the character is not present in the scene, the audience will be evaluating the events from within their position of identification. “Are these events good or bad?” And “Will they facilitate achievement of the goal or hamper it?”

This tool, narrative structure, is the staple of all forms of fiction. But there are different forms of storytelling – novels, theatre, film – and each, as anyone who’s had a go at adaptation will know, offers dramatic possibilities unique to its own format. Each allows you to get into the character’s mind in ways that the other formats can’t.

What is unique about film is that stories are told through the arbitrary combination of images and sounds which are arranged according to an established cinematic code.

It is easy to see the advantages such a feature would bring to a simulation. First, the presence of the two primary senses adds realism and acts as a form of veridical confirmation. The brain knows it is prone to perceptual illusions. Seeing is believing, and so is hearing, although each alone leaves plenty of room for mistakes. But when you both hear and see, there is little doubt in the brain’s mind that the perception must have been accurate. This makes film an even more effective illusion.

Also, sound alone does an incredible job of transporting and immersing us in the story world. Being a mechanical wave, it literally touches us and can, through sympathetic resonance, influence our biorhythms, and with them our mental and emotional states.

Then there are point-of-view visuals and point-of-hearing sound, both of which can take identification to deeper levels. Although, be warned, in and of itself this technique is not enough to get the audience to identify with a character. In horror films, for instance, a point-of-view shot of the victim does not necessarily lead to identification with the killer (identification, remember, is a process).

But where the real possibilities lie is in the actual relationship between image and sound – in how they are combined meaningfully for the purpose of creating a specific mental or emotional effect.

In reality, the brain uses emotions and perception to guide our actions. After carrying out its evaluation, it activates the right behavioural programs, i.e. emotions, and it determines which bits of information on each sensory channel are most useful to guide our actions. If most of the important information comes in the form of, say, sound waves, it will reduce the presence of other sensory data. The result will be a streamlined percept that includes only what is essential to perform the task at hand efficiently, and that will have excluded any distracting sensory stimuli.

In filmmaking, the director takes charge of this process. He or she creates a percept, or movie, by manipulating and combining images and sounds so as to fool the brain into interpreting things in a way that will lead it to trigger the physiological and emotional responses required by the narrative. Filmmaking is the art of hijacking the brain and tricking it into thinking that the film is a meaningful percept that it, the brain, created by itself.

So what we have on one side is that identification requires that the audience get to feel and think what the character does by the same perceptual and cognitive means.

On the other side, we have that filmmakers can capture image and sound separately and then recombine the two arbitrarily to manipulate the perceptual and cognitive processes that guide the audience through the narrative.

In summary, we can use the relationship between image and sound to deepen identification – or to give it some slack if that’s what the narrative requires, as when the character is a dubious figure and you need to break identification so the audience can step back, reflect on what has transpired, and learn the moral lesson of the film. Identification, after all, moves along a continuum, and the manipulation of audiovisual information can help things move along it.

Over the last few paragraphs I have mentioned the phrase ‘the relationship between image and sound’ several times. This is what film sound is all about. Not the sounds. Not even the act of combining the two. The relationship itself, as an entity in its own right, is the actual cinematic device, the dynamic that breathes life into film sound.

So, in my next post, I’ll be talking about the guiding principle that can make this partnership go from ‘mere combination of image and sound’ to ‘meaningful relationship between image and sound’.

The Story of Story

I ended my first post concluding that until we begin to understand film sound at a deeper level, we will not get it past the creative impasse it is stuck at right now. If you ask me, the first thing we need to do is to start thinking of film sound as a subsystem within a system rather than in terms of sound effects and music, the sole purpose of which is to add realism and emotion to the images.

A good way to start this process is by defining the system sound belongs to: the film. A film is a story told in pictures and sound. So far so good. But that leaves us with another question. What is a story? Not so easy to answer. Yet, this is the first question we need to address if we are to understand this concept of film sound as a subsystem and as a cinematic tool that takes on an active role in the process of narration.

Over 2000 years ago, Aristotle set the precedent in his Poetics for how story would be defined in Western culture. Most attempts today still revolve around the same type of questions: is it action, character, plot, conflict, form, content…?

I spent some time delving into such questions, and I gained some interesting insights, but unfortunately, they proved to be almost (not entirely) futile when it came to grasping the cinematic role of film sound. I decided to look somewhere else, and I found the answer in a somewhat unexpected place: evolution.

The main force behind evolution is natural selection. This is the process by which a behavioural, physical, or physiological genetic mutation either makes it into the permanent genetic make-up of a species or dies away. The selection criterion is simple: does the mutation give an organism a competitive advantage, making it better able to adapt to its environment and fitter to survive in it? Yes? Pass. No? Out.

From an evolutionary point of view, stories are a behavioural adaptation. It follows then that they must have given us some sort of adaptive advantage. They have. And this advantage holds not only the key to understanding what makes a good story, and therefore a good film (I will talk about this in another post), but it also provides very important clues to the cinematic potential of film sound. So, let’s dive right into the story of story.

Long before we could tell tales, we were a chimp-like species, much like any other. There was nothing remarkable about us. Then, one day, the story goes, a group of us was outcompeted by other apes and had to find another way of foraging. Smart things that we are, in the process of solving this crisis we invented a new way of working together: collaboration, which resembles cooperation in that it requires the members of a group to work towards a common goal, but differs from it in a number of important ways.

Cooperation involves working together towards the same goal, but ultimately, participants do it for their own benefit. In the case of chimps, for instance, if teaming up and cooperating with another chimp makes it more likely for them to achieve their aim, then they’ll go along with it, but essentially, they prefer to acquire and eat their food alone if the circumstances allow for it.

With collaboration, however, our thinking became geared toward figuring out ways to coordinate our actions with those of others in order to achieve a joint goal that had been agreed in advance (cooperation lacked this element of predetermination).

To us today, this may not seem like a big deal, but back then, it was a revolutionary way of doing things that required us to push our cognitive skills to the limit. We had to develop the ability to form shared goals, to assign each member of the group an individual task, and to understand how both our own task and that of the others fit within the scheme of things. Then we had to focus our joint attention on the same aim and synchronise our actions in order to achieve our goal. This behaviour was so radically different that it led to us splitting from the chimp lineage and becoming an altogether new species: the human race.

In itself, this was not enough to turn us into what we eventually became: the most successful species on earth. It allowed us to form slightly larger societies than other species, but that was about it. The true revolution would come many thousands of years later, with a rare mutation in the brain that gave us the capacity for story.

Before this mutation happened, our brain was modular in nature. That is, it had separate modules to process different aspects of the environment. There was one module for inanimate objects, one for artefacts, one for animals, one for members of the same species, and so on. Our brain also had modules with dedicated types of intelligence designed to solve specific problems. For instance, it had a technical intelligence for building tools, such as a hammer or a knife, and a social intelligence for making sense of things like facial expressions.

These intelligences were pre-programmed by evolution to obey the rules of the natural world and they could not be consciously changed or controlled. A lot of brains still work like this. A bird, for example, cannot consciously decide to change the way it goes about building its nest since the process is hardwired into its brain.

Having such a brain structure meant that we could only process the environment literally. Animals were animals, people were people, flowers were flowers, and ice was ice. There was a fire here and a lion there. Although we had the capacity for imaginative thoughts within each module – we could imagine that if we hit something with a hammer it might break – we could not bring knowledge from one module to another. This was because there were no neural networks connecting them that could provide such a cross-over of information.

Then, this magic mutation changed the wiring of the brain, allowing the separate modules to communicate with one another. Thus were metaphor, fantasy, and wild imagination born, and thus we went from, “Careful! Lion there!” to, “Once upon a time, there lived in a land far, far away a man with a lion’s head, surrounded by the flames of eternal fire. His wife, who had eyes blue like the sky and was beautiful like a flower, but had snakes for hair, spent her days spinning the fates of the inhabitants of this land.”

To see this in action, you only have to visit the section of any natural history museum holding objects dated to around 35,000 years ago. There, you’ll find practical utensils like spears, cups, hammers and so on. Skip forward a couple of thousand years and you’ll start finding more bizarre things, like the 33,000-year-old lion-headed man carving from Hohlenstein-Stadel in Germany.

Anyone with a bit of common sense would have bet that such a futile and potentially dangerous behavioural feature alone would have been enough to bring our entire species to an end – daydreaming in the caveman days does not sound like a wise thing to do. But for some strange reason, that wasn’t the case. On the contrary. Not only did our fellow Neanderthals, who didn’t develop this capacity for telling stories about peoples and places that didn’t exist, start disappearing at an alarming rate from this point on, but we also started living in larger and larger settlements.

What happened was that stories allowed us to invent narratives about our past that gave us a sense of belonging together. Because our bonds were built on a mental ground, there were no limits to how many people these bonds could unite in one single stroke. Compared to our other fellow ape species, who still had to rely on one-to-one grooming to build trust among each other, this represented a big advantage. It meant we could build infinitely bigger communities, achieve more together, and protect each other more efficiently.

We could also imagine stories about the future, better worlds where the problems we faced had been eradicated. We could imagine the values, beliefs, and behaviours that made such worlds possible.

Something very interesting is happening here. On one side, we have a newly-acquired set of cognitive abilities that allow us to form shared goals, to focus our collective attention on them, and to synchronise our actions in order to achieve them. On the other, we have the newly-acquired ability to imagine other worlds and the behaviours required for them to exist.

When we attend to a story collectively, our minds unite and become tuned to the same scenario. The story triggers in us the same thoughts, emotions and learning experiences.

If we put the two skills together, shared intentionality and story, what we get is the ability to synchronise our collective beliefs, values, and behaviours so as to bring our imagined worlds into existence. And that’s just what we did. The outcome? The creation of a new human-made environment: the social domain.

One thing about this new environment is that it is a lot less predictable than the physical and biological habitats we had inhabited until then. In the physical realm, we can be sure that the sun will rise in the morning and set in the evening. In the biological realm, we can safely bet that a hungry lion will make lunch of us, or that a poisonous mushroom will make us ill at best. It is not so easy, however, to guess what folk whose behaviour is dictated by complex psychological computations will do next. And that’s what this new environment did to us: it turned us into highly unpredictable and often cunning creatures, difficult to figure out.

No need to worry. We had stories, and they turned out to be the solution to the very problem they had created. They became invaluable tools for helping us learn to navigate this new complex social world. Through stories, we could learn the values, attitudes and behaviours that would help us function in our societies or we could explore new behaviours and their consequences. We could also use stories to sharpen our ability to make inferences from and scrutinise other minds – their inner worlds, expectations, intentions, and motivations. This, in turn, would help us make better decisions and adjust our behaviours accordingly, so as to get the best possible outcome.

That’s precisely why stories got the thumbs-up from natural selection. They increased our capacity to adapt to our environment. They made us fitter to survive in the new cognitive niche we had just moved into.

One word that sums up very well the adaptive role of stories is LEARNING, but learning with a twist.

Biologically speaking, learning is the process by which information about the environment is stored in our nervous system as memories. It is an essential adaptation for any creature to be able to survive. The only problem is that for learning to take place, we must have some form of direct experience with the environment, something that comes with its risks. What if we don’t survive the experience?

That’s where the twist comes in. Stories provide us with an effective means for learning about the environment and for expanding our repertoire of beneficial behaviours without putting ourselves at risk. They allow us to learn, not through direct experience, but through SIMULATION.

It is when we look at story and film from that perspective – as a simulation of the social world rather than in terms of plot or character – that the cinematic role of sound slowly begins to reveal itself. In actual fact, the whole concept of filmmaking takes on a fresh dimension.

Simulation, therefore, will be the subject of my next blog post.

Why Sound (Still) Is the Ugly Duckling of Filmmaking

When I first started working as a sound recordist, I was full of enthusiasm. I meticulously studied the script for every project I took on and thought carefully about the sounds I wanted to capture and the things to look out for on set. When filming started, I gave it my everything. That was in the beginning. By the end of year one, I had learnt that asking for more time to figure something out was the equivalent of Oliver Twist begging for more gruel.

Eventually, I decided to quit recording to concentrate on post-production. Things didn’t get much better. As a sound editor, I found myself spending my time mostly “fixing it in post”. As a sound designer, except for a few occasions, I spent my hours mostly looking for sound effects that would “sit nicely” with the image. That was it.

At first, I couldn’t comprehend why there was such a negative attitude towards a part of filmmaking that has so much potential. But then, all those stories I had read as a student about the coming of sound to cinema made me realise this was something we inherited, and it still lurks in the collective unconscious of filmmakers today.

The story of the coming of sound to cinema reads like a tale of terror: the tale of how sound mercilessly murdered the beautiful language of silent cinema. The truth is, that’s what happened. The transition to sound was an apocalypse.

By the time sound came, filmmakers had created a unique way of telling stories through the use of editing techniques and camera movements that had the power to infiltrate the viewer’s mind with the same fluidity and magic of dreams. Then sound happened, and the whole process was turned upside down. Cameras had to be locked in sound-proof booths, filming had to be done in sound studios, actors had to stay put in fixed spots in order to be within range of the microphone, editing had to succumb to the physical laws of real time and space...the list goes on.

If that wasn’t enough, many businesses that couldn’t afford the technology went under, and so did the careers of well-established directors and stars who could not adapt to sound.

And if that still wasn’t enough, audiences simply loved the novelty. They wanted to hear actors talk on the screen. They couldn’t get enough of the very thing most filmmakers hated: the “talkies.”

It is understandable that many filmmakers in the silent era grew to hate sound and what it had done to their precious art. Gone were the days of roaming the earth unencumbered and free to take the camera where they wanted. That’s what they thought, anyway.

Luckily, they were wrong. A few refused to succumb to “canned theatre,” knowing in their hearts there must be more to sound than talk. They became the big heroes of the transition. Lubitsch, Clair, Mamoulian, Vidor, and the Soviets (though more in theory than in practice, since they couldn’t afford the technology!) all summoned the courage to overcome the odds, rescue sound film from the claws of photographed plays, and propel it into a new era of exploration.

Their courage and their trial and error led them to discover the true soul of sound films. They realised that the commonly held belief that everything seen on screen had to be heard, and that only what was seen could be heard, was nonsense. It dawned on them that they could film silently and add the sound afterwards. That, in turn, led them to realise they could manipulate sound to suit the dramatic needs of their stories; they could evoke mood and atmosphere in ways they couldn’t with the image; they could add new levels of fluidity by using asynchronous sound; and, rather than spoon-feeding audiences, they could engage their curiosity by combining image and sound in ways that required active engagement and interpretation.

Whatever happened to their inquisitive spirit? It seemed to die with them. Once they were gone, things went back to “normal”, and sound resumed its passive role as mere accompaniment to the image. The sound technology used in films today may be state-of-the-art, but sound as a narrative and cinematic tool has barely evolved. If anything, it has gone backwards.

The crux of the matter is that we don’t really understand non-musical sound as a creative form of expression. If you think about it, we’ve been making images since our cave days, and we have been scrutinising and perfecting this practice for over 2,000 years. In contrast, we’ve only been recording sound for just over 100 years. In actual fact, even less. It was not until the era of magnetic tape recorders in the 1940s that sound recording technology became widely available and people were able to start experimenting freely with sound as an expressive medium. In film, it was only in 1979 that the term ‘sound designer’ was introduced, in recognition of the contribution this role made to the medium (or, more specifically, of the contribution Walter Murch made to Apocalypse Now).

To that, we have to add that there is an alarmingly low number of books on the subject of film sound, and the ones available are either written by sound designers for sound designers or by scholars for scholars. Unfortunately, most of them are as obscure as they are interesting. For example, the books of Michel Chion, one of the most influential figures in film sound theory (and in my career), are fabulously insightful but hardly make one’s beach vacation more relaxing. And if you are a screenwriter or director wanting to extract some practical advice from them, you had better be prepared to forego a few hours of strolling aimlessly along the seashore and instead spend that time digging for the treasures buried in these books.

As for books that deal with filmmaking techniques in general, they all surely have a chapter on film sound. Some are longer and more detailed than others, but their content can invariably be boiled down to one sentence:

Film sound can be diegetic/non-diegetic, simultaneous/non-simultaneous, and synchronous/asynchronous; it consists of dialogue, sound effects, and music; and its role is to enhance the audience's experience, create mood, and elicit emotion.

That’s it. Hardly surprising then that sound is used mostly as mere accompaniment to the image.

The problem is that these principles feel as if they were written in stone. Not many have questioned them and not many have wondered why such principles have failed to inspire a more creative use of sound in film. This type of blind acceptance is a very common problem. We only have to look at the timeline of art history to see how artists often spend many decades stuck in one way of doing things, taking for granted that’s just how things are done. Until, that is, someone the likes of Da Vinci or Picasso comes along with a very different vision, breaking all the conventions that had been written in stone up till that point in history, and suddenly everyone realises that there was another way of doing things after all.

I can’t help feeling that’s what’s happening with film sound. We’re stuck with a theory which, if you ask me, barely scratches the surface.

It’s not as if the real voice of film sound hasn’t been discovered yet. As I mentioned earlier, a few pioneers in the early era had their moments of great revelation, and a few directors more recently, like Darren Aronofsky for example, have used sound incredibly well. The problem is that not many seem to be noticing, let alone following in their steps. Again, in my opinion, things are this way due to a lack of proper understanding of sound’s place in cinematic language.

We need to start opening our minds to a new way of thinking about film sound: as an active element that holds as much power as any lighting, camera, or editing technique, and as a subsystem within a system, rather than as a nice sound effect here and there, or as music that guides the emotional responses of the audience. Film sound is not something that happens mostly in post-production. Film sound has to start with the screenwriter using sound as an active narrative element, continue with the director using it as a cinematic tool in its own right, and end with the sound designer bringing it all to life in an aesthetically pleasing and coherent manner.

“This sounds grand”, you may be thinking, “but how do I do that? And where do I start?”

I myself have thought long and hard about all this, and for some time, when I started asking these questions, my mind was blank.

My breakthrough came when I realised the solution is to understand film sound at a deep level. That’s how creativity works. It starts with the process of gathering information. Then there’s a period of incubation when we don’t consciously think about this information, and during which the unconscious part of the brain starts making associations internally. After that, insights and ideas start emerging as if by magic. And because they have been processed unconsciously by the brain, they feel organic to the whole we are trying to create rather than artificially imposed.

My approach, therefore, will be to talk about film sound – and sound in general – from many different perspectives, in the hope that this knowledge will slowly make its way into the unconscious of screenwriters and directors, and then back out in the form of inspiration and insights that can be put into useful form and give rise to films that offer us all a richer, more fulfilling cinematic experience.

It will be a long journey that will start with my next post, where I will be talking about story from an evolutionary point of view.