Towards an Ultimate Display
In 2019, Jak Wilmot livestreamed himself living a week in virtual reality. He ate with a VR headset on and didn’t take it off to sleep or to go to the restroom. When he showered he kept his eyes closed. He watched old black-and-white movies, played Skyrim, hung out with other people in VRChat, traversed the savanna, and drove a virtual bus for eight hours from Tucson to Las Vegas.
Over the course of the week, Wilmot swung between ecstasy and despair. But it was not until he took the headset off on day seven that euphoria struck. Slowly creaking his eyes open, and with a wide grin, he muttered, “the graphics are so good.” Later he went outside on his porch, took a long inhale, and said, “I have never appreciated the smell of outside air so much.”
While the average person might understand technology simply as that which is “new,” Wilmot’s experience suggests the need for a more precise definition. As Marshall McLuhan argued in his 1964 classic, Understanding Media, technology must be understood as an “extension” of the body. A car is an extension of the legs. A stove is an extension of the stomach. The internet is an extension of the nervous system.
Virtual reality is a technology not because it is new but because it is an extension of our body. It’s perhaps the most comprehensive technology today because it extends so many parts of our bodies: our eyes, ears, hands, feet, nervous system, and vestibular system.
In fact the idea of virtual reality is not new at all. And while manufacturers have struggled to make VR a household commodity, its comprehensiveness as a technology reveals much about the broader media ecosystem in which we all increasingly dwell. VR reflects the tendency of all media to converge and combine into a larger, more immersive medium. Observing its properties can prepare us to exist in the hyper-unified, comprehensive, and immersive media landscape to come.
VR also presents an opportunity to re-envision how we can exist. Technology tends to change our definitions of things. The locomotive redefined (compressed) geography, the national newspaper redefined community, and computers are redefining intelligence. In particular, VR media and other nascent immersive technology will initiate a redefinition of what it means to “be.” Paradoxically, this technology centers the human while also rendering the human invisible. It concurrently reveals and redefines our environment, and consequently, who we are as a people. Can we control it?
In 1935, Stanley Weinbaum wrote his short story Pygmalion’s Spectacles, the earliest known narrative to describe a mechanical face-worn gadget that transports the wearer to another place. In 1962, nearly three decades after Weinbaum’s short story, “The Sensorama” was released to the public. It was an arcade-like box designed so that the viewer could place their head inside of it to watch a stereoscopic film and have wind blown on their face and smells delivered to their nose. At the time it was called “Experience Theater.”
1968 was a pivotal year for VR. Ivan Sutherland, a professor of computer science at the University of Utah, and his team developed “The Sword of Damocles,” a head-mounted display which hung suspended from the ceiling of the lab’s office. It covered the wearer’s eyes and allowed them to enter a spatial computing reality with interactive graphics. Many recognize it as the first true demo of VR. It was a technical and artistic breakthrough.
“The Sword of Damocles” might have made a deeper impression on the public if it were not for Douglas Engelbart’s “Mother of all Demos,” presented that very same year. A behemoth in the computer science world and a research leader at the Stanford Research Institute, Engelbart pioneered the field of human-computer interaction. Over the course of 90 minutes on a sunny day in San Francisco, Engelbart showed an audience of computer scientists the first-ever demo of what would become many of the key elements of personal computing: video conferencing, windowed interfaces, real-time collaborative text editing, and even the mouse.
Ideas from the landmark demo trickled throughout Silicon Valley. Researchers at Xerox PARC developed a compelling Engelbartian computer interface which Apple later famously stole and used to release its first all-in-one personal computer with an integrated flat display. And so the world went the way of the screen.
Since then, VR has sustained a series of commercial failures. VR did find some niche footholds in academia, the arts, and the military. In the ’80s, the Air Force started using VR to train its pilots in flight simulators. But the technology failed to gain real traction with the public. The tech was not head-tracked, and it made many people nauseous. In 1995 Nintendo released the Virtual Boy, a low-tech VR headset. It flopped, and a year later, the company discontinued the product.
Today, despite some technological advances and minor inroads in the gaming market, we are in the midst of what pundits lament as a “VR Winter.” HTC and Facebook’s Oculus, the big brands behind modern VR software and hardware, have cut R&D spending as their gadgets continue to underperform.
Still, despite VR’s history of market failures, generations of artists and educators have nourished the ideological vision behind the technology. In 1975, Myron Krueger exhibited his “VIDEOPLACE” work, which demonstrated what a shared virtual reality could feel like. In 1977, Michael Naimark led the “Aspen Moviemap” project, an early precursor to what we can think of now as Google Street View. More recently, Rachel Rossin has created interactive works, like “The Sky is a Gap,” which experiment with mapping time, space, and room-scale tracking together to provide commentary on the changing nature of digital space.
Like these artists, Ivan Sutherland understood early on — even before inventing the Sword of Damocles — that just being in VR had a profoundly immersive effect on the person wearing the headset. At the end of a 1965 essay, Sutherland wrote, “The ultimate display would, of course, be a room within which the computer can control the existence of matter. Handcuffs displayed in such a room would be confining, and a bullet displayed in such a room would be fatal. With appropriate programming such a display could literally be the Wonderland into which Alice walked.”
In 1987, the psychonaut Terence McKenna gave a lecture at the Earth Trust Benefit in LA in which he described an embodied mode of communication. Human language for him was a “meta-linguistic system,” abstracted from the genetically based communication system apparent in all life. McKenna also had an interest in VR, which came from his belief that it could bring humans one step closer to the low-frequency vibrations of this more foundational communication system. To him, the octopus was the prime example of an un-abstracted communication being. An octopus, he said, communicates with its entire body, through movement, and as it changes shape, color, and texture: “The octopus is its own syntax. It doesn’t generate its own syntax. It becomes syntax. The mind of an octopus is worn on its surface… it operationally is a naked mind.”
In 2018, I joined a VR software company which was developing a peculiar tool. The tool allowed me to upload 360-degree videos into a headset which I could then manipulate with a set of hand controllers. I could rotate the world, drag objects in space, pause time, all by pointing, gesturing, and looking, without taking off the headset.
Editing this world in VR with my entire body felt like “becoming my own syntax.” There was no translation between thought and effect — just effect. This is a special feeling and I don’t think there’s a good enough word in the Oxford English Dictionary to describe it. I propose the term “ingenic,” a portmanteau of interior genesis, or creation from within, to describe this phenomenon.
- Content that is ingenic has been generated in the same medium in which it is consumed.
- Creating ingenically means generating content in the same medium in which it will be consumed.
If I use a VR app like Gravity Sketch, which allows a designer to “paint” a sculpture in 3D, and I hand off that sculpture for someone else to view in a headset, that’s ingenic. If I film a movie in VR, edit it in VR, and premiere it for my friends in VR, that’s ingenic. If I perform live music for an audience in VR, that’s ingenic.
Ingenic implies its opposite: exgenic. Exgenic, or exterior genesis, describes content created in a different medium from the one in which it is consumed. Today most of the media we encounter is produced in another medium: iPhone apps coded on desktop computers, digital movies scanned from celluloid film, and analog music compressed and streamed from Spotify servers.
In the age of the internet, the nomadic and remixable quality of exgenic content is what makes it valuable. To illustrate this, media theorist Lev Manovich chronicled how visual media gains value as it is imported and exported across software in his 2006 essay, the aptly titled “Import/Export.”
“[The] ‘import’ and ‘export’ commands of graphics, animation, video editing, compositing and modeling software are historically more important than the individual operations these programs offer,” Manovich writes. Imagine how much less useful Photoshop would be if you could only draw images within the program, and could not import JPEGs, PNGs, GIFs, TIFFs, and dozens of other file formats. Photoshop is useful precisely because it can accommodate a heterogeneous landscape of media.
Exgenic media is important because it enables the invention of entirely new mediums. Indeed, the medium of recent VR itself comes from an import/export remix of a diversity of hardware and software. Mass-produced and cheap smartphone components (gyroscopes, accelerometers, high-resolution and high-framerate LCDs), combined with spatial tracking software and 3D game engines, have enabled the recent delivery of consumer VR headsets.
Manovich’s import/export model describes the mode of exgenic media: exgenic media is media in motion. It is “transmedia,” as theorist Henry Jenkins calls it. It is media perpendicular, inverted, and parallel to its varied neighbors. Exgenic media is a messy desktop, files strewn up, down, left, and right.
The personal computer with a flat monitor is an optimal exgenic machine. A screen is a map and provides a distanced as-a-God perspective that allows for easier commingling of disparate elements. “What if I combined this PNG on the left with that JPEG on the right?” The mode of being in an exgenic media landscape is that of remix.
The central metaphor of the 21st century is the internet, the network. The internet metaphor has propagated through our language like a virus. Spam, branch, and stream are all real-world nouns transformed into internet-speak verbs. Cloud, mining, crash, leak, bit, freeze, and web are all mechanomorphisms. Conversely, language from internet-speak has made its way into everyday language: algorithm, bandwidth, and data.
As we spend more time in VR and other comprehensive media, our vocabulary and metaphors may be altered. Some common terms used in the VR industry today are degrees of freedom, parallax, field of view, haptics, co-location, immersion, and embodiment. These terms are all sensory.
The VR headset is the optimal ingenic machine. VR is spatial, rather than screen-based. There is a general shape to a screen — rectangular — which allows us to pinpoint areas via a coordinate grid — up, down, left, right. But as the architect Buckminster Fuller noted in his 1969 book Utopia or Oblivion, in a spatial reality, “there is no shape.” He scolded MIT scientists for using such delusional, conditioned, and anti-scientific terminology as “up” or “down” when, he claimed, there is neither direction nor shape to space. Space, Fuller insisted, required a base observation that there is only an “omnidirectional conceptual ‘out’ and the specifically ‘directioned conceptual in.’”
Perhaps the central metaphor of the next century is the body. VR and other ingenic media recenter the body to replace the network as the central metaphor. Import/export is an extension of self into the relation between multiple tools. Ingenic is a willing extension of the self into the direct experience of one tool; it allows the self to become its own syntax. Ingenic media rids itself of a coordinate grid of ups and downs, and cultivates a bodily experience instead. Ingenic creation brings the computer inside our bubble of consciousness, making it an extension of ourselves rather than making us an extension of it.
Ingenic media has three phenomenological properties. The first is flow. In the ’70s, the psychologist Mihaly Csikszentmihalyi described flow as “a state in which people are so involved in an activity that nothing else seems to matter.”
In The Digital Plenitude, Jay David Bolter applies Csikszentmihalyi’s theory of flow to new digital media. For Bolter, flow can be induced in what he calls “passive media.” Going down a YouTube rabbit hole and watching videos back-to-back until 3 AM is a passive form of flow. However, he observes that the greater flow experience comes from within “active media” or high-engagement activities, like playing video games. Csikszentmihalyi himself cites very active media like rock climbing, playing tennis, or playing piano as allowing the participant to achieve a deep and near-spiritual level of flow.
Seen through Bolter’s lens, ingenic media is highly active media. VR literally brackets your senses and defocuses distractions. When you are deep into the flow of VR, “nothing else seems to matter.” Have you seen those videos of people in headsets ducking to avoid a projectile in their VR video game but slamming into their dresser in real life? That reaction happens because of flow.
Flow produces the second property of ingenic media: myopia. While flowing, you enter the reality distortion field of the activity. Distanced objectivity is lost when you are in the thick of things.
The artist Hito Steyerl observes a similar effect in a lecture titled “Bubble Vision.” Her thesis is that the emblem of the bubble represents a new paradigm of technology which is eliminating the human subject. One of the bubbles is 360-degree VR video. In the VR bubble, “the viewer is absolutely central, but at the same time, he or she is missing from the scene.” This is true. In nearly all VR apps today, if I were to look straight down at my body, it would be gone — eliminated! For Steyerl, the body-less state of VR indicates a loss of the human subject.
This is where Steyerl and I part ways. If flow is a way to find yourself, and flow makes one feel that “nothing else matters,” then “elimination” is also the act of finding oneself. When you are in flow you lose a sense of your body. You don’t catch the football, your body does. You don’t play the piano, your body does. You don’t sculpt in VR, your body does.
When you are in flow, playing your sport, instrument, or video game, or reading, watching, or listening, you are unaware of yourself. You have eliminated yourself. But are you not your zenith self when you are doing these activities?
Flow is thus an oscillation between destruction of the sense of self and a heightened sense of self. It’s an oscillation which breaks down the dichotomy between you and the object of the activity. What’s left is only an awareness of experience as such. This awareness of the stuff of experience is a product of being in ingenic media. This myopic self-elimination is the second phenomenological property of ingenic media.
Wilmot, the man who existed for a week in VR, lived ingenically, to say the least. When he took off his headset he experienced the diametric opposite of flow and the third (and final) phenomenological property of ingenic media: re-alienation.
In theater this is called the V-effect, or distancing effect. This moment of rupture makes the audience critically aware of the reality they inhabit day-to-day. One thing that ingenic media is particularly effective at is establishing the conditions for this V-effect — or alienation. Any medium which induces higher flow also introduces the higher potential to fall back to reality. The return to reality is a re-alienation because it refreshes what was once familiar.
For Jaron Lanier re-alienation is what makes VR special. Lanier is a scientist and artist essential to the field of VR. It was he who actually coined the term “virtual reality” back in 1987, and he has since arrived at a lot of outwardly counterintuitive opinions about the technology. One of these counterintuitive opinions is that VR headsets should remain ugly. By ugly, he means he wants to keep the headsets looking like a gadget that sticks madly out from your face.
A dangerous VR headset for Lanier would be invisible. Without a clear distinction between VR and reality, the whole point of VR is lost. If the headset were invisible, it would be impossible to achieve the beautiful breakage from virtual reality back into reality. For Lanier the breakage back is what makes the technology of VR so magical.
Lanier is hardly the first to play with this concept. In Being and Time, Martin Heidegger developed the concept “readiness-to-hand,” which describes a basic way that humans exist in the world through, for example, the use of tools. Heidegger describes a person who uses a hammer. As they hammer away at a task, the hammer integrates with their flow of the task and the dichotomy between the person and hammer dissipates.
Like Lanier, Heidegger highlights an inflection point, which he calls un-readiness-to-hand. This occurs when flow is broken. A flow might break because the hammer itself malfunctions and physically breaks. In these moments when the hammer breaks, it reveals something new and re-alien about the hammer, and also reveals something re-alien about our “average everydayness,” the part of existence that is normally invisible to us.
Given present currents in politics and technology, it seems likely that eventually, one medium will subsume all others, and all media will be under one roof, in what one could call a “media singularity.”
We can see this trend today. Private corporations are building larger and larger proprietary media ecosystems. Apple hardware, for instance, requires Apple software to run. Apple continues to buy other companies, configuring the new tech to work only in its walled garden. Media streaming sites like Netflix, HBO, and Hulu are increasingly verticalizing and producing exclusive content which requires a user to subscribe to an entire ecosystem of video just to watch one movie. Within software, the advent of “superapps” (apps with countless features) means we never have to switch apps. WeChat in China is used as the primary platform for social media, payments, news, transportation, and other daily activity. It is effectively the operating system of everyday life for a billion people.
VR is the prime example of a technology that reflects an enlarging (and consolidating) media ecosystem. VR itself is an aggregate of many other technologies: cinema, video game engines, digital photography, surround sound, smartphones, social media. Those technologies themselves are aggregates of older technologies. For instance, cinema aggregates photography, animation, collage, chemistry, and painting.
Each new technology that appears in the media ecosystem is an exgenic remix of multiple mediums. Each new technology is more comprehensive than the previous, as it carves out a greater volume of “inside” space in the medium to accommodate more ingenic activity. Over time, as the prevailing media theories go, one technology remains. Marshall McLuhan describes this final technology as the “final phase of the extension of man [sic] — the technological simulation of consciousness.”
In 1922, the philosopher Pierre Teilhard de Chardin visualized a kind of “final phase of the extension of man” and called it the Noosphere, which translates literally to “mind sphere.” He described it as a sort of consciousness-sheath for the planet. It hovers above the atmosphere of the Earth in the form of a life-advanced mesh of pure thinking, communication, and ideas. It is media incarnate, materialized as a bubble, a sort of virtual twin of the Earth overlaid upon it. For Teilhard, the Noosphere was an evolved form of the Earth as a superorganism; it is part of a natural progression of a biosphere into a technosphere, and finally a “mind sphere.”
We already have hints at what this Noosphere could look like. Companies and governments alike are transforming the physical Earth into a unified operating system. For instance, Pokémon Go parent company Niantic is generating an exact 3D map of the world from player-generated scans. The GIS continues to scan cities and landscapes at centimeter accuracy to simulate natural disasters, like flooding. Wearables like the Fitbit track personal biometrics throughout the day. Machine-learned object detection is just beginning to taxonomize and label our system of things.
While still fractured and fragmented today, the data across these platforms is beginning a long, tedious journey of import/export into one another. Application Programming Interfaces (APIs), Software Development Kits (SDKs), and open-source licensing accelerate this import/export process, transfiguring the heaps of data into a gigamesh mirrorworld and into a media singularity.
In a media singularity we will flow. We will most certainly destroy ourselves, but we will also find ourselves. However, we cannot assume this media singularity will re-alienate us. The uncontrollable scale of such a thing is daunting, terrifying, and raises questions. Who will benefit? Who will it hurt more? But if it is used as a tool and understood as an extension of ourselves, I believe we can actually guide it. This will require a conscious, collective effort to make VR headsets visible and ugly, and to equalize power within such an omnipresent media ecosystem. And, as with any extension of ourselves, we should always have the choice to amputate it. A media singularity could attune us to unmediated reality. It could act as our literary foil to reveal to us who we are. We just have to know how to take the headset off.