Film as Language: The Method and Form of Sergei Eisenstein


   “Now why should the cinema follow the forms of theater and painting rather than the methodology of language, which allows wholly new concepts of ideas to arise from the combination of two concrete denotations of two concrete objects? Language is much closer to film than painting is.” – Sergei Eisenstein, A Dialectic Approach to Film Form (1949)

Sergei Eisenstein considered the communicative power of film, as a combination of previously established art forms, to be the highest expression of human emotion.  It was his understanding that all human expression is born out of conflict: not aggravated conflict (as in one man murdering another) but passive conflict (as in disunity, or spatial/temporal/auditory distinction).  For instance, a prerequisite for the articulation of human emotion is a clear personal distinction between emotions.  This distinction is only possible through process, i.e. a person is happy, and then the person is sad.  (Dialectic)

Emotions are non-tangible and non-rational, and are therefore not bound by logical laws such as the law of non-contradiction.  It is possible for two emotions to be experienced simultaneously because, contrary to popular belief, no two emotions are in contradiction to each other.  However, as concerns the communicative art form of film, it is only relevant to view each human emotion as a singular piece to be stimulated at a specific point in the process of a total emotional experience.  When attempting to understand the Soviet Montage, we must recognize that the technique refers as much to a sequence of emotions as it does to a succession of individual frames within a scene. (The Filmic Fourth Dimension)

In his essay A Dialectic Approach to Film Form, Eisenstein posits that the implied motion produced through successive frames in a given shot is by its nature a foundational montage, and that his juxtaposition of spatially dissimilar shots is merely a continuation of the same idea.  The audience sees an image of a horse, updated twenty-four times a second by other images of the same horse, albeit differentiated temporally.  Eisenstein is careful to point out that our minds do not see these images as following one after another, but rather superimposed, one on top of – and overriding – the previous in the same sequence.  Our minds infer that because of the geographical contradictions between the images (the horse has one foot on the ground, then two, then one again) the contents of the image must be moving.  This builds a rhythm that may or may not represent comparable and familiar rhythms in reality.  (Dialectic)

Now that is merely a complicated description of how the mind perceives motion within a single shot, but to understand the Soviet Montage, we must understand its artistic and linguistic DNA.  The human mind is trained through natural experience to connect correlated occurrences, and attempts to make sense out of those connections.  For instance, a child may learn very early to connect the sight of lightning with the sound of thunder, producing a rational understanding and expectation that the two pieces form a full-sensory whole.  It is this type of mental process that creates the possibility of coherent cinematic montage.

Several still images of a horse running placed in correct sequence can be juxtaposed with images of a crowd cheering to produce in the rational audience’s mind an idea of a horse running in front of the cheering crowd.  We need not see the two together to unquestionably connect their geographical presence in our imaginations.

This whole routine is unreservedly dependent upon the audience’s emotional perception.  Art is a sensory experience.  Eisenstein once wrote of this, saying, “because the limit of organic form (the passive principle of being) is Nature. The limit of rational form (the active principle of production) is Industry. At the intersection of Nature and Industry stands Art.  The logic of organic form vs. the logic of rational form yields, in collision, the dialectic of the art-form.” (Dialectic)

Any art form ought to be understood as a communicative medium in which the thing being communicated is not an idea, but an emotion.  Language communicates intellect, whereas art communicates sensation.  The two are certainly compatible, as in poetry, but also just as certainly inimitably unique.  And as communication requires the process of a message being sent and received, we must acknowledge that distinct communication is impossible without the process of time.  Thus, as words in a sentence are given meaning through context of contiguous words in the same sentence, and sentences are given sub-textual meaning through context of other sentences within a conversation, given shots within a scene will conform to an over-tonal meaning intrinsically contextualized by other shots within the same scene, and in a broader sense, other scenes throughout the film.

A single image has no more inherent meaning than a single letter of the alphabet.  Also, the constant presence of all images has no meaning because it is impossible for the human mind to perceive multiple images simultaneously.  We require chronological juxtaposition for context, and as proven above, context is required for the receiver/audience to deduce meaning.  That is not to say that the audience is incapable of discerning multiple layers of information simultaneously, but rather that such information must exist free from superimposition between parts.  (Film Form)

In the essay The Filmic Fourth Dimension, Eisenstein compares film to music thus: “There, along with the vibration of a basic dominant tone, comes a whole series of similar vibrations . . . Their impacts against each other . . . envelop the basic tone in a whole host of secondary vibrations . . . We find the same thing in optics, as well. All sorts of aberrations, distortions, and other defects, which can be remedied by systems of lenses, can also be taken into account compositionally, providing a whole series of definite compositional effects.”  To simplify, he is describing the methods by which musicians and filmmakers are capable of manipulating audience emotion.  The interesting thing about this analogy is that film is not only comparable to music, it can also incorporate music into its own being.  The overtones and undertones of an operatic motif complement the visual overtones and undertones of the cinematography and editing.  This is the heart of Soviet Montage theory: a non-meaningful point is complemented, supplemented, and superimposed by a carefully crafted selection of spatially related points to create a full-sensory emotional experience.  (Dialectic, The Filmic Fourth Dimension, Methods of Montage)

Let’s look at this in practice.  The prime example of Eisenstein’s technique is the famous Odessa Steps sequence from the 1925 film Battleship Potemkin.  The scene begins with a crowd of peasants celebrating the bravery and sacrifice of the crew of the Battleship Potemkin.  “And suddenly,” they are attacked by militarized evil manifest as faceless troops opposed to the ideals of the people.  Though these troops march in unison, they do not march to the beat of the music.  This creates a sense of contradiction.  Our mind tries to make sense of what happens onscreen with what the musical overtones tell us.  But the obvious connection (marching to a beat) is skewed by this audio/visual contradiction.

The people scatter from the gunfire in a chaotic and brazenly anti-uniform fashion.  It could be said that the disharmony of the peasants’ flight is a spatial counterpoint to the geographical consistency of the military troops.  The troops do not waver.  Their murderous march is the most dependably predictable element of the entire movie.  Thus, what the audience would normally take comfort in (regularity) has become inimitably destructive and evil.

Where this scene excels is in its progression of devastation.  At first the audience feels uplifted at the peasants’ security and celebration.  Eisenstein could have interrupted this by abruptly cutting to the soldiers firing upon the civilians.  But instead he chose to prepare the way for this dramatic interruption through the use of an inter-title: “And suddenly . . .”  Had Eisenstein truly wanted to cut the ecstatic experience short without warning, he might have left this card out.  But instead he builds anticipation by cueing us in to the fact that there will be some sort of terrible change.  By starting a sentence with the words “and suddenly,” Eisenstein has put the audience into a position of expectation.  These two words do not bear any meaning in themselves.  The statement requires clarification; thus, as the audience is looking for an answer to the question “and suddenly what?”, the answer is presented as an antithesis to the previous emotion felt.

The journey of the audience here is not as simple as directly transitioning from happiness to shock and awe.  The emotional curve actually has that subtle middle step of the inter-title, which puts the audience in the position of wanting to know how next to feel, only to be sorely disappointed by the onslaught of violence portrayed.  The audience then regrets demanding that the film clarify the ambiguous two-word sentence in the inter-titles.  We do not want to see our beloved “comrades” suffer.  And as the scene progresses, we continue to see things that further offend our moral senses.

“Is it not bad enough,” we ask, “for these innocent men and women to die?  Must the children and handicapped also be slaughtered?”  The film’s antagonists are clearly apathetic toward this dilemma.  They march over the body of a young boy, paying as little attention to his presence as that of the nearest flea.  At this our mind screams.  We think surely the worst has come.  But Eisenstein knew how much lower the human spirit could go.  Over the next six minutes, we are subjected to images of mayhem that knows no boundaries of age, class, or intelligence.  It is not enough for a baby to be murdered.  We are subjected to the tension of watching it fall down steps for over a minute of screen time, only to be slashed by a military officer.

This sort of experience progresses from slight feeling to intense feeling by continually contradicting audience expectations.  As established earlier, we humans are trained to connect everything rationally.  But when some of the peasants attempt to speak with the troops to appeal to their reason, they are shot.  Everything the audience thinks of, the peasants also think of.  But according to Eisenstein’s design, none of these methods (fleeing, rationalizing, fighting) are strong enough to defeat this evil.  That is, until the arrival of the hero ship, the Battleship Potemkin.

There can be no doubt that the emotion received is that which Eisenstein intended to communicate, which demonstrates the power of film as a communicative medium.  This type of exercise has been imitated in many movies since, notably in the church-burning scene from Roland Emmerich’s The Patriot.  As the British lock the townspeople in the building and set it ablaze, the victims attempt everything the audience wants them to.  Open the doors?  They’re locked.  Break through the windows?  There are bars across them.  The connection between audience and victim works because the audience naturally projects itself into the position of the protagonists.  Thus, the hero’s failures become our failures.

“The film’s job is to make the audience ‘help itself,’ not to ‘entertain’ it.  To grip, not to amuse,” wrote Eisenstein in the essay A Course in Treatment.  The discussion at hand was whether or not it was the role of the filmmaker to entertain the masses.  He took the position that a communicative medium must not be held down by any obligation to communicate a specific thing.  Instead, films ought to harness the very things that make up the psycho/physiological core of every person going to the cinema.

The linguistic possibilities of art have been explored since the beginnings of Creation.  Time allows for Process, and Process allows for temporal art forms such as music and film to harness emotional expression through the means of story.  As audience needs will change over time, the methods by which foundational human emotions are articulated may change with them.  But the potential for human expression concerning spiritual/emotional/psycho/physiological experiences will only grow.

The Odessa Steps and the use of Montage (Battleship Potemkin)


Of the many pioneers of modern editing theory, one of the most important is Sergei Eisenstein.  Known for his use of montage, Eisenstein was capable of directing audience emotions through juxtaposition of images that would collectively bear a given meaning.  Much of this theory would later be pursued by Alfred Hitchcock and Martin Scorsese, ultimately finding its most extreme form in the medium of music videos.  Eisenstein’s famous “Odessa Steps” sequence from Battleship Potemkin is one of the most influential montages in film history, with references to it finding their way into The Untouchables and Star Wars Episode III: Revenge of the Sith.

What makes this sequence so memorable?  Is it the content?  Or the cinematography?  Can editing alone be given credit for the end result?  I think it is all these things and more that make this sequence work as it does.  The fact that the sequence takes place toward the end of the film, after we’ve already been introduced to a number of the peasants who get murdered in the scene, helps give us a narrative context for how to feel.  The fact that the soldiers are on top of the stairs walking down imbues them with a sense of authority and power, making their slaughter of the poor masses seem that much more unnecessary.  The soldiers’ actions could have been portrayed as acceptable, or even heroic, had the civilians not been portrayed in such a miserable and sympathetic light.  Typically audiences are much more accepting of battles between equal opponents, and much less accepting of any powerful figure beating down a weaker one.  The only context in which we enjoy watching a stronger force defeat a weaker one is when it comes in defense of weaker characters that we identify with, as is the case when the Potemkin comes to the rescue of the peasants.

With respect to the meaning and purpose of editing a sequence this way, it was Eisenstein’s belief that two images juxtaposed together would create a mental image greater than the individual parts.  By extension, this means that eighty shots put together will call for a uniquely strong response within the audience.  This is the heart of Eisenstein’s use of Montage.  To him, film is a language that communicates emotion, and having proper editing is the equivalent of having proper grammar.

In Film Form, Eisenstein describes the Odessa Steps sequence as a “Rhythmic Montage” where the film is cut to a certain beat, giving a methodical impression of the scene.  But as the director points out, the marching of the soldiers and the beat of the drum consistently come in off-beat, creating a sensation that something is amiss and that things are not as they ought to be.  The rhythm of the scene is transferred over from the soldiers marching to the baby in the carriage, garnering methodical sympathy from the audience.  This whole sequence causes something in the viewer to cry out at the tragedy.  We naturally try to make sense of the world and the things in it.  But there is no rationality here.  Just meaningless violence.  There is no rational response to this.  We only feel.  And what we feel is technically contrived by the many tools at the filmmaker’s hand.

It is no surprise that this film (and the Odessa Steps sequence in particular) has gone on to influence a wealth of filmmakers around the world.  In some cases the homage is deliberate, as in The Untouchables or Star Wars Episode III: Revenge of the Sith.  In other cases, it is implicitly felt, as in the various training montages in every movie ever made about an underground fighter going for the gold.  Is the montage an artistic tool, or a linguistic discovery?  Or perhaps the real question is, is there a difference?