Lescop, Laurent. (2017). Narrative grammar in 360. In 2017 IEEE International Symposium on Mixed and Augmented Reality ISMAR 2017. Nantes : IEEE. DOI 10.1109/ISMAR-Adjunct.2017.86
VR has now moved from industry to everyday application. Mainstream software and devices allow artists to create content with a fast learning curve. Since 2014, with the launch of Google Cardboard and reasonably priced 360 cameras, and with the massive success of Unity 3D and Unreal UDK, real-time immersion is no longer controlled or guided by experts but has spread to creative enthusiasts, which has resulted in extensive production of content. As in the early ages of photography and then cinema, questions about composition, narrative structure and visual grammar are slowly emerging. This article is a raw presentation of issues of narrative grammar in 360.
Keywords: narrative, 360 images, immersion, VR, scenology, ambiance.
Index Terms: J.5 [Arts and Humanities]: Miscellaneous—Ethics
1787 is an important milestone in VR and immersion. Scottish painter Robert Barker opened the first immersive panorama, giving the audience the illusion of standing on a belvedere above Edinburgh. For this panorama, Barker designed an image with multiple vanishing points for a scenographic point of view. The illusion of immersion rests on the control of flow, light and sound, all of which contribute to the wonderment. Panoramas would then be, for a century, the state of the art among machines of illusion for immersing the audience in a 360 narrative space. The platform where the visitors stood then became a real boat deck, a train or a balloon. The image became more realistic with photography and animated with film. Sounds, lights and air were added to reinforce the sensation of being somewhere else, travelling the oceans, witnessing a battlefield or overlooking distant cities. Although cinema replaced panoramas as mainstream entertainment, they still remain a paragon of total performance that today’s digital techniques try to reach.
Panoramas, as 360 immersive devices, brought many solutions in terms of narrative, not only with the picture but also in the way space, sights and time are organised.
19th-century panorama, ill. L. Lescop.
The main difference between narrative in 360 and classical narrative is the frame. When perspective emerged as a theory with Brunelleschi in 1425, it was not only the figuration of a three-dimensional world but also a way to structure and control society. Francastel wrote that perspective organises the world and renders it commensurable. It sets a specific location for the viewer (in front of the image) for the illusion to work. The central point of view, as the basis of immersion, led to the conception of the classic Italian theatre, where spectators look in a given direction enclosed in a frame. The frame has two very important purposes. The first is to separate what is visible in the storytelling from what is not visible, which is where the spectator's imagination lies. The second is pragmatic: the off-frame hides all technical elements and separates what is seen from what builds the illusion. Viewing in 360 doesn’t mean that the frame has disappeared; it still exists as the field of view. But there would be no room left for imagination as an extension of the visible, nor for technique as a support of illusion.
When shooting in reality with a 360 camera, the technical crew has to be hidden so as not to be seen. Director Céline Tricart (Los Angeles, US) explains that in her film Marriage Equality (2015), she had to camouflage the camera operator, the sound recorder and other technicians within the crowd while she herself stood behind a bush delivering instructions.
In full 3D, there is no technical crew, camera or lights to hide. Still, as these worlds are not yet total, infinite open worlds, tricks have to be set up to block the user and let his imagination go beyond. This is solved in several ways: physically closed areas, inner spaces, gaps and cliffs, and walls are all examples of framing the action in an open world.
Another important difference in the narrative is the use of time. 360 immersion seems to imply a real-time experience: events should occur in real time. As we will see in the next part, cinema very early played with real time to operate ellipses, dilatation, parallel narration, disjunctions and so on. As a mimic of a real personal experience, the narrative in 360 may not be able to distort time into a nonrealistic experience. On this point, video games brought many innovative experiences. In many of them, the walking speed is much faster than in reality and, in open worlds, one day lasts only a few minutes. A full day in GTA 5 takes 48 minutes of real time and one hour takes 2 minutes. In the game Max Payne, the bullet-time effect slows down time. More interestingly, ellipses in video games are made possible by the design of spaces: in Assassin’s Creed or GTA, games which claim to simulate real environments, spaces are cropped or shortened so that the player goes faster from one place of interest to another.
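The time compression quoted above (a game day lasting 48 real minutes) can be written as a simple scale factor. The following is a minimal sketch of that arithmetic only; the constant and function names are hypothetical and do not describe how any game engine actually implements its clock.

```python
# Minimal sketch of the time compression described above.
# All names are illustrative assumptions, not taken from any game engine.

REAL_MINUTES_PER_GAME_DAY = 48   # figure quoted in the text for GTA 5
MINUTES_PER_DAY = 24 * 60        # 1440 minutes in a day


def time_scale(real_minutes_per_game_day=REAL_MINUTES_PER_GAME_DAY):
    """Return how many game minutes elapse per real minute."""
    return MINUTES_PER_DAY / real_minutes_per_game_day


def real_to_game_minutes(real_minutes):
    """Convert a real-time duration into in-game minutes."""
    return real_minutes * time_scale()


def game_to_real_minutes(game_minutes):
    """Convert an in-game duration into real-time minutes."""
    return game_minutes / time_scale()


if __name__ == "__main__":
    print(time_scale())                           # game minutes per real minute
    print(real_to_game_minutes(2))                # one game hour, in game minutes
    print(game_to_real_minutes(MINUTES_PER_DAY))  # one game day, in real minutes
```

With the figures given in the text, the factor is 30: two real minutes correspond to one game hour, which matches the ratio stated above.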
Framing and timing are the two main elements of picture composition, and they can be summarised with the three rules of dramaturgy: the unity of action, of time and of place. Photography, cinematography and video games often respect those three unities, understood as capturing reality. A frame is a frozen time. But many counterexamples can be found in painting: in the Feast of Herod by Fra Filippo Lippi (1452-1460), three distant actions are figured in the same space. The difficulties we may have in reading or understanding such images are bound to our cultural environment, whereas in other areas time is not seen as layers or a continuous line, but as a whole still acting on us. Despite this, many images in panoramas used simultaneous chronologies within the same image to tell a story and add temporality.
Narration in a 360 device
Narration can’t be thought of without the device. Devices can roughly be categorised as follows: flat screens, whatever their size; 360 cylinder screens, such as ancient panoramas or current devices; hemispherical screens (SAT in Montreal or ElbeDom in Magdeburg); and total spherical screens as simulated in immersive headsets like the Oculus Rift, HTC Vive or Google Cardboard-like viewers.
Referring to theatre or circus and virtual scenography, we can list four configurations combining the narrative world, seen as a “virtual bubble”, and the position of the audience, or of the user in the case of a solo experience. In the first one, the audience stands in a central position and the virtual bubble is around it. This is, for example, what is experienced with a 360 VR photograph. In the second one, the audience and the “virtual bubble” move at the same time. This synchrony between the audience’s movement and the virtual one is a very rare situation. It can be experienced in the game park The Void. In the third one, the audience is moving but the “virtual bubble” is fixed. It is very much like video games where the horizon and the distant environment are painted on a hemisphere or a cylinder; it is a trick used in cinema to picture landscapes in studios. The fourth one is when the audience is fixed and the virtual bubble is moving. This is, for example, a ride shot in 360 or a planetarium.
Audience and narrative spaces
If we focus on a 360 cylinder screen, we can expect three kinds of narrative involving the body and the point of view. First, it can be an image or a film with no vanishing point. It means that depth effects are rendered by the decreasing size of objects and the desaturation of colours with distance. The Water Lilies (or Nymphéas, 1920-26) by Claude Monet is a good example of panoramic immersive painting with no vanishing point. The audience can look around; there is no favoured point of view.
Image 1: no perspective, image 2: central perspective, image 3: 360 image with 4 points of view.
The second kind of image is an image with one vanishing point. It means the audience must look in only one direction to have a correct view. The sides of the image are distorted, but they are outside the main field of view and only give light and colour information. It’s like being in a car, watching through the windshield without turning the head. Those images are efficient for either static scenery or dynamic travelling. The third kind is the now classic four-vanishing-point perspective image, in which the audience can look around without experiencing any distortions or weird effects.
We now see many combinations depending on the device, the position of the audience and the kind of image shown. It’s possible to find examples of nearly all configurations, which a longer article could illustrate. What is important here is to have an overview of the range of possibilities and what it implies in terms of narration.
Narration in VR
Narration in VR still means telling a good story. If there is no story, it’s a technical demo or a tutorial, although one may consider that those can also hold a storyline. The classic narrative structure is one where a character whose existence is disrupted by a triggering event is then confronted with many hardships, either to return to the initial situation or to create a new one by confronting his main threat and overcoming it. Whether it is in VR or not, the issues remain the same: creating interest and empathy, and having charismatic characters and a challenging environment.
We speculated that in VR, time is converted into space. It means that a keyframe in time becomes a kind of key space that pushes the action. This is what Jessica Brillhart calls Probabilistic Experiential Editing, with the idea of Points of Interest. In classic editing, a scene is what exists between two moments in time during which the emotional value switches from one side to the other: positive to negative or negative to positive. In VR, in an open world, this has to be solved in space, with an entrance and an exit now understood as key spaces. The objective is to indicate to the user where the exit could be and to draw him from the entrance to the exit.
The term probabilistic is used because there is a probability that the user will not follow the storyline and will not go from A to B. The reputation of the video game Grand Theft Auto (GTA) has also been built upon not following the storyline and wandering in the city. We can assume that there is a hidden motivation that pushes the player more than anything else: the feeling of transgression. The player feels allowed to do whatever is forbidden in the real world: mugging people, stealing, driving at high speeds, and killing.
Kevin Lynch’s key elements
Whatever the user’s motivation is, space must be filled with elements in order to become a narrative space. Those elements, borrowed from Kevin Lynch, are landmark, node, path, edge and district. They can describe either a natural or an urban place, and they give the user many possibilities, from being guided to exploring a totally open world.
One last topic is cutting and editing. In classic narration, the story is always a line, even when it is possible to choose between several options. Those options are like nodes. There is a potential of infinite options for each node, but in a 1D universe, like in a book, it’s not possible to have a synoptic vision of the nodes. It’s the same for cinema. In VR, nodes are laid out in 2D: time becomes space, and it is then possible to see all possible nodes, which are every single point of space. It means that, even if the storyline remains linear, the user can have a vision of infinite possibilities.
The next question is: is it possible to have multiple storylines simultaneously? In literature, it is not possible, and the human ear cannot follow two different conversations at the same time. It’s infrequent in cinema, although some examples of parallel editing exist, in Brian de Palma’s films for instance. One example rises above the others: Jacques Tati’s Playtime. In Playtime, Tati used a huge frame, on 70 mm film, and put many mini-plots in the image. He asks the audience to pick one of the plots and follow it. He even encourages the audience to comment and share discoveries. In VR, it is obviously possible to have multiple storylines simultaneously. Because time becomes space, and space is at least in 3D, this becomes possible, which brings to mind massively multiplayer online role-playing games as a perfect paradigm. Finally, the main question is not whether the story is linear or not, as a storyline is linear, but whether the user is following what was the author’s initial concept.
Dealing with narration would mean using a grammar. In basic communication, a message is held by a code that is shared by sender and receiver. A grammar helps to identify “words” that are organised in a “sentence”. In many fields of creation, the notion of grammar is highly discussed and may not be efficient; for instance, an “architectural grammar” is a dead end according to Nelson Goodman in Languages of Art: An Approach to a Theory of Symbols. Nevertheless, cinema possesses a grammar resulting from the narrative innovations accomplished by early pioneers such as Georges Méliès, David Wark Griffith, Lev Kuleshov and Sergei Eisenstein. The question is whether it’s possible to transpose the film grammar to a 360 immersive experience.
The words of the film grammar are, for example, long shot, medium shot and close-up. As it is in 3D, those words have spatial values such as bird’s eye, canted or high, which all refer to camera angles. There are still shots and also camera movements: pan, crab, track, zoom, ped or tilt. Framing is a real language that helps the spectator to understand non-verbal information: feelings, conflicts, motivations. Editing is like creating sentences: organising words/shots either creates more accurate information, making the message more precise, or plays with the inner signification of words/shots to create a new meaning, as the Kuleshov effect demonstrates.
Longing for Wilderness, Marc Zimmermann, Filmakademie Baden-Württemberg, 2011.
Editing is also a way to play with time, with slow motion or accelerated motion, with the use of ellipses as previously mentioned, or with parallel storylines. In 360, everything has to be reconsidered. If the idea is to keep a natural vision, the focal length can’t be changed and the framing cannot use distances and angles.
The user has the final decision on where to look, so it is necessary to adjust the levers of storytelling. As we saw previously, time can be transformed into space. By compressing or expanding space, it is possible to accelerate or decelerate time without creating a sensation of velocity. Framing remains possible by attracting attention in a specific direction; this can be accomplished by cinematic direction, by light and shadows, and also with sounds. These are a few solutions, but many more could be listed. Specific shots like the close-up can be interpreted through the user’s interactions. Using binoculars or having a HUD are solutions for creating a close-up on the details that matter.
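Since the frame survives as the field of view, one way to reason about attracting attention is to test whether a point of interest currently falls inside what the user sees, and to fire a cue (a sound or a light, for instance) only when it does not. The sketch below is a hedged illustration under simplifying assumptions: directions are reduced to horizontal yaw angles in degrees, the 90-degree field of view is an arbitrary default, and all names are hypothetical rather than taken from any VR toolkit.

```python
# Minimal sketch: is a point of interest inside the user's field of view?
# Assumptions (not from the original text): directions are yaw angles in
# degrees on the horizontal plane, and the field of view is 90 degrees wide.


def signed_angle(from_yaw, to_yaw):
    """Smallest signed rotation from one yaw to another, in [-180, 180)."""
    return (to_yaw - from_yaw + 180) % 360 - 180


def poi_in_view(user_yaw, poi_yaw, fov=90):
    """True if the point of interest lies within the field of view."""
    return abs(signed_angle(user_yaw, poi_yaw)) <= fov / 2


def cue_direction(user_yaw, poi_yaw, fov=90):
    """Suggest which side an attention cue should come from, if any."""
    if poi_in_view(user_yaw, poi_yaw, fov):
        return "none"  # already visible, no cue needed
    return "right" if signed_angle(user_yaw, poi_yaw) > 0 else "left"


if __name__ == "__main__":
    # User looks at 350 degrees, point of interest at 10 degrees.
    print(poi_in_view(350, 10))
    # User looks at 0 degrees, point of interest at 120 degrees.
    print(cue_direction(0, 120))
```

The wrap-around in `signed_angle` matters because a 360 image has no left or right border: a point at 10 degrees is only 20 degrees away from a gaze at 350 degrees.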
In the end, we see that time becomes space and framing becomes choice and interaction. That is the reason why creators like Keith Largo declare that the main inspiration now for VR narrative is theatre. As a result, the area of references slides from cinema to socio-psychological or behavioural problems, without losing the issue of storytelling, which is why users get into it. Behaviour, immersion and storytelling also bring new concerns that
Figurative vs non-figurative
The more realistic an image is, the better the illusion of being immersed in a virtual world. As the processing power of graphics cards has increased, it is now possible to have complex simulations of environments in real time. Lights, shadows and materials can be rendered at a minimum of 50 fps with thousands of polygons, even on small devices such as smartphones. But when looking closely at what a realistic image is, one soon realises that it is an image with, beyond a good light simulation, visual effects like glare, motion blur, glow, dust and so on. A realistic image mimics what is produced for cinema, which is never a naturalist image. A realistic image is therefore a non-naturalist image.
Through the history of computer games and interactive applications, we saw that good storytelling outweighs bad graphics. But this subject is a dead end: today's expectations have nothing in common with those we had before, and remembering games we played years ago may be a kind of nostalgia. More interesting is trying to understand how nonfigurative VR is also appealing and creates amazing immersive experiences.
White Box, PurForm with Alain Thibault, music, and Yan Breuleux, graphics, 2011.
The piece White Box is a VR journey into an abstract world populated with black lines on a white screen. Lines are organised in patterns that blend and cross, producing an ever-renewed landscape. Instinctively, the viewer recognises a storyline made, as in classic storytelling, of loops, crossings, starts and stops, and a climax. Non-figurative VR can also be a way to experience disabilities. Hotel Blind (2016) is a way to embody a blind person in a hotel room. More interestingly, Notes on Blindness tells the real story of John Hull, who gradually lost his sight. The VR application “shows” the soundscape represented with dots.
This leads us to reconsider sound as a major part of the narrative, an “audio-vision” experience as in cinema. In VR, sound design is often weaker than graphic design, as can be seen from how many people are involved in one speciality compared with the other. Hearing is faster than sight, which makes it our best tool for alertness, warning of dangers that can come from anywhere. Our narrative grammar can be completed with sound design. Thereby, a sudden sound can act like a jump cut or a close-up: it focuses attention, surprises, and drags the vision toward a certain location.
Sound design for VR can be divided into a few groups: immersing and located sounds, non-repeatable localised sounds, non-localised repeatable sounds (sound patterns) and ambient sounds. Ambient sounds whose sources are not visible act exactly like an off-screen world. It means that in VR we can also have an off-screen world where imagination resides.
Narrative studies are now taking into account new media like films, video games and VR experiences. We saw that originally, following Propp and Todorov, narratology studied the combination of the text’s modes, where the storyline is structured by the reader. However the events are ordered, the reader rebuilds a narrative logic because he knows that the story is aiming at a purpose. For Propp, we have this kind of sequence: 1/ possibility, 2/ action or no action, 3/ success or not, with the reader acknowledging that success is wished for but comes only by confronting big opponents.
Narratology then took cognitive functions into account. This allows the storyline to be described as perpetually reconsidered: it is not a fixed structure, and it reshapes itself with emotional and cognitive inputs. It means that the way a story can be experienced depends on the cognitive background and the mood of the moment. Acting on the plot also affects the storytelling. Acting helps adherence and therefore the illusion. There are two kinds of VR narrative, ergodic and non-ergodic. Ergodic means with strong interactions from the subject to the media, and non-ergodic signifies low or no interactions; ergodic media are games, and non-ergodic media are literature or cinema. There are, as in games, two major possibilities in ergodic media: one is to be bound to very strict rules, and the other is to have a lot of freedom and initiative.
360 photo of an immersive device.
The last question would be: can we regard VR as an imitation of real life, or is VR trying to copy the look and grammar of the visual arts? We saw that VR narrative is strongly linked to the device in which it is played. There is a general idea of progression from the flat screen to the immersive headset, with the image filling the entire field of view and the user being more immersed. The history of machines of illusion shows that the way the audience is prepared, the size of the picture, the shared experience and the set where the story is played are just as important as the content itself.
Classic narrative figures can be adapted to VR narrative, creating a specific grammar. In this grammar, time becomes space. Figures like ellipses, prolepses and analepses are modelled as spaces, especially when there is a continuous narration. The grammar of film also has to be adapted, as it is harder to change frame values or to have cuts, shot/reverse shot or even reaction shots. In many video games, this has been resolved by interrupting the action with a cinematic sequence. But here again, a transposition is possible using sound. Sound can drive attention from one point to another, or create a close-up by isolating a sound. We saw that non-figurative VR experiences are perfect platforms to test and invent this narrative grammar.
The main issue is this off-screen notion. In classic filmmaking, as in theatre, the off-screen is where the audience fills what is not seen with imagination. On the operational side, it’s also where the technical apparatus is. In VR, the off-screen has to be suggested by sound, opening the fields of virtual worlds even wider.
Robichon F., Les Panoramas en France au XIXe siècle. Thèse de doctorat, Nanterre, 1982
Bapst G., Essai sur l’histoire des panoramas et de dioramas, Impr. nationale (Paris), 1891
Francastel. Binary display of images when spot size exceeds step size. Applied Optics, 15:2513–2519, August 1980.
Tricart C. Marriage Equality VR, 2015 (https://www.celine-tricart.com/fr/portfolio/marriage-equality-vr/).
Descolla and Lance Williams. Animating images with drawings. In Andrew Glassner, editor, Proceedings of SIGGRAPH ’94 (Orlando, Florida, July 24–29, 1994), Com- puter Graphics Proceedings, Annual Conference Series, pages 409–412. ACM SIGGRAPH, ACM Press, July 1994.
Look Into the Cut: Jessica Brillhart on Editing VR
McKee R. Story: Substance, Structure, Style and the Principles of Screenwriting, ReganBooks, 1997
Lynch K., The Image of the city, Cambridge, Mass, MIT Press, 1960
Jakobson R., Closing statement: Linguistics and Poetics, in Style in Language, T.A. Sebeok (ed.), New York, 1960
Notes on Blindness is an Archer’s Mark production, in association with Fee Fie Foe Films and 104 Films, and in co-production with Agat Films & Cie and ARTE France. It has been supported by Creative England, British Film Institute, Impact Partners, ARTE France, Swedish Film Institute, BBC Storyville, Cinereach, BRITDOC, New York Times and PROCIREP-ANGOA. It was developed at the Documentary Campus Masterschool 2013.
Chion M. L´Audio-Vision. Son et image au cinéma. Nathan, 2005
Propp V. Morphology of the tale, Leningrad 1928
Guelton B. Les Figures de l’immersion, Rennes : Presses Universitaires de Rennes, 2014
Gilbert J.A. (2013), Les variations de l’imitation, Paris, le Cerf