Vozzy
2020-2022
The hackathon project
Work on the project began during quarantine. Most people rarely left the house, and when they did, it was to pick up groceries or other essentials. The team started thinking about the dual forces of social isolation and falling business patronage. How would people find meaningful forms of community when conventional gathering spaces (pubs, bars, restaurants, diners, etc.) were closed? What would happen to bookshops, museums, cafes, record stores, and other small independent businesses that primarily rely on physical sales?
The goal was to address these challenges by building an authentic digital space for people to spend time together, much as they do in the physical world, while providing a way for businesses to keep promoting themselves through virtual means. The platform would let businesses and individuals welcome others into their own spaces and events, such as museums, coffee shops, and parties, and make them feel like they're having genuine interactions. Large group gatherings and events would appear on a real-time map, so users could choose where to be and when.
Many businesses were turning to virtual platforms, such as Zoom, to host large gatherings and meetings. However, these platforms don't offer the natural ability to mingle from conversation to conversation like one would at an in-person gathering (the cocktail party effect). Vozzy would provide a means for users to engage with others and their surroundings in a more organic way.
In the excitement of the caffeinated evenings, late-night Zoom pair programming sessions, and take-out pizza that accompany a virtual hackathon, the team built a prototype. Along the way, a few interesting challenges emerged. First: how to mimic the experience of sound in the real world?
The sound-distance formula came in handy here as a way to dynamically scale the loudness of source audio streams based on the user's position. In that formula, one term is the source audio volume and 1000 is an arbitrary multiplication factor.
Agora.io SDKs handled real-time audio streaming; the loudness of each stream was adjusted in real time based on the distance between individuals in the room. This mimics the natural way sound is perceived in physical space.
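The exact formula isn't reproduced here, so the following is only a minimal sketch of distance-based loudness scaling. It assumes an inverse-square falloff with the gain clamped to 1; the function name `scaledVolume` and the `{ x, y }` position shape are hypothetical, and only the factor of 1000 comes from the text.

```javascript
// Hypothetical helper: the formula Vozzy actually used is not shown in this write-up.
// Assumption: inverse-square falloff, with the gain clamped so it never exceeds 1.
const SCALE = 1000; // arbitrary multiplication factor mentioned in the text

function scaledVolume(sourceVolume, listener, speaker) {
  const dx = speaker.x - listener.x;
  const dy = speaker.y - listener.y;
  const distanceSquared = dx * dx + dy * dy;
  if (distanceSquared === 0) return sourceVolume; // same spot: full volume
  const gain = Math.min(1, SCALE / distanceSquared);
  return sourceVolume * gain;
}

// A speaker 100 units away is heard at a tenth of its source volume.
console.log(scaledVolume(1, { x: 0, y: 0 }, { x: 100, y: 0 }));
```

The result would then be handed to whatever per-stream volume control the streaming SDK provides, each time a position changes.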
Moving around the space needed to feel as fluid as possible. Each user gets an avatar when they enter the space and can drag it with the mouse to move around. Firebase stored the location of each person in the room and updated it in real time.
Mapbox displayed all the open spaces users could join on a world map.
The video submitted to the hackathon competition:
Taking a fresh look
After the hackathon finished, the team dispersed and daily routines resumed, but the potential of what had been built lingered. Gyan kept going on his own, pushing Vozzy from a hackathon demo toward a usable product.
In the spring of 2021, a sound design for theatre course gave him an assignment to design a soundscape that would evoke a sense of physical presence in the listener. That prompt opened further questions about space and sound, and how presence might be prototyped in digital form.
He built a simple prototype with Mousetrap.js for capturing keyboard input and Howler.js for managing audio. One notable feature of Howler.js is its support for spatial audio: not only is volume dynamically adjusted when the user moves, but orientation toward the object affects perception of the audio. This is accomplished by re-computing the stereo pan and volume of the audio track based on the following variables:
- orientation of the audio source (degrees)
- orientation of the listener (degrees)
- position of the audio source relative to the global listener (Cartesian x, y, z)
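To make the role of those three variables concrete, here is a simplified two-dimensional sketch of how a stereo pan and a volume could be derived from them. It is not Howler.js's internal algorithm: the function `panAndVolume`, the heading convention (degrees clockwise from the +y axis), and the falloff curve are all assumptions made for illustration, and the z coordinate is ignored.

```javascript
const toRadians = (degrees) => (degrees * Math.PI) / 180;

// Hypothetical helper, not part of Howler.js.
// listener and source are { x, y, heading }, heading in degrees clockwise from +y.
function panAndVolume(listener, source) {
  const dx = source.x - listener.x;
  const dy = source.y - listener.y;
  const distance = Math.hypot(dx, dy);

  // Bearing of the source relative to the direction the listener is facing.
  const relativeBearing = Math.atan2(dx, dy) - toRadians(listener.heading);
  const pan = distance === 0 ? 0 : Math.sin(relativeBearing); // -1 = left, +1 = right

  // How directly the source points at the listener: 1 when facing, 0 when facing away.
  const bearingToListener = Math.atan2(-dx, -dy);
  const directivity =
    distance === 0 ? 1 : 0.5 + 0.5 * Math.cos(bearingToListener - toRadians(source.heading));

  // Assumed falloff: volume halves at a distance of one unit.
  const volume = directivity / (1 + distance);
  return { pan, volume };
}

// A speaker one unit to the listener's right, pointed back at the listener.
console.log(panAndVolume({ x: 0, y: 0, heading: 0 }, { x: 1, y: 0, heading: 270 }));
```

Turning the listener or the speaker changes the result even when neither moves, which is the effect described above.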
Below is a prototype with a single speaker. Sound isn't captured in this screen recording, but as you move closer to the speaker, volume increases.
Moving around the space with arrow keys felt more natural than dragging an avatar with a mouse. After all, human movement is incremental: you can walk or run, but never teleport.
In his next iteration, Gyan dropped zooming and panning entirely. The place must be explored manually. Furthermore, everything would be an object: people, speakers, decorative objects, walls, and so on. The "object graph" is the system by which objects are managed, rendered, and removed.
Objects allow collisions. On each move, a shadow of the avatar checks for intersection with an object. On collision, 20% of the animation still runs to produce a "bump" effect. With no intersection, movement proceeds.
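A minimal sketch of that check, assuming every object is an axis-aligned rectangle. The names `intersects` and `attemptMove` and the shape of the returned value are hypothetical; only the shadow test and the 20% bump come from the description above.

```javascript
// Axis-aligned rectangles: { x, y, width, height }.
function intersects(a, b) {
  return (
    a.x < b.x + b.width &&
    a.x + a.width > b.x &&
    a.y < b.y + b.height &&
    a.y + a.height > b.y
  );
}

const BUMP_FRACTION = 0.2; // share of the move that is still animated on collision

// Hypothetical helper: tests a "shadow" of the avatar at the target position.
function attemptMove(avatar, delta, obstacles) {
  const shadow = { ...avatar, x: avatar.x + delta.x, y: avatar.y + delta.y };
  const blocked = obstacles.some((obstacle) => intersects(shadow, obstacle));
  if (!blocked) {
    return { position: { x: shadow.x, y: shadow.y }, bumped: false, animate: delta };
  }
  // Blocked: the avatar stays put, but part of the animation plays as a "bump".
  return {
    position: { x: avatar.x, y: avatar.y },
    bumped: true,
    animate: { x: delta.x * BUMP_FRACTION, y: delta.y * BUMP_FRACTION },
  };
}

const avatar = { x: 0, y: 0, width: 10, height: 10 };
const wall = { x: 15, y: 0, width: 10, height: 10 };
console.log(attemptMove(avatar, { x: 10, y: 0 }, [wall]).bumped); // true
```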
The camera follows the avatar only when it leaves the viewport, then pans over accordingly. This lets the viewer develop spatial positioning within the "room" they are in, similar to the dynamics of a game.
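One way to get that behaviour is to treat the scene as a grid of viewport-sized pages and move the camera only when the avatar crosses into another page. This is a sketch under that assumption; how far the camera actually pans in Vozzy isn't stated, and `panCamera` is a hypothetical name.

```javascript
// camera: { x, y, width, height } in scene coordinates; avatar: { x, y }.
// Assumption: the camera moves in whole viewport widths and heights.
function panCamera(camera, avatar) {
  const pagesX = Math.floor((avatar.x - camera.x) / camera.width);
  const pagesY = Math.floor((avatar.y - camera.y) / camera.height);
  return {
    ...camera,
    x: camera.x + pagesX * camera.width,
    y: camera.y + pagesY * camera.height,
  };
}

const camera = { x: 0, y: 0, width: 800, height: 600 };
console.log(panCamera(camera, { x: 400, y: 300 }).x); // still 0: avatar is in view
console.log(panCamera(camera, { x: 810, y: 300 }).x); // 800: avatar left via the right edge
```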
Based on the absolute position of the object, the camera decides whether it should bother to do anything. Generally, the Object Manager will instantiate an object when it is inside the camera's scene or within 20% of the scene's size in any direction outside of it. Whenever the person moves, the Object Manager re-assesses the scene and checks whether any objects should be detached. The advantage is that only the relevant objects are ever loaded, so only they have event listeners attached, which makes for a speedy user experience.
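The loading rule can be sketched as a rectangle test against the camera padded by 20% on each side. The function names, the `id` field, and the attach/detach return shape are assumptions for illustration, not the Object Manager's real interface.

```javascript
const MARGIN = 0.2; // keep objects up to 20% beyond each edge of the camera

// object and camera are { x, y, width, height } in absolute scene coordinates.
function isNearCamera(object, camera) {
  const padX = camera.width * MARGIN;
  const padY = camera.height * MARGIN;
  return (
    object.x + object.width >= camera.x - padX &&
    object.x <= camera.x + camera.width + padX &&
    object.y + object.height >= camera.y - padY &&
    object.y <= camera.y + camera.height + padY
  );
}

// Hypothetical reconciliation step, run whenever the person moves:
// returns which object ids to attach and which to detach.
function reassessScene(allObjects, loadedIds, camera) {
  const wanted = new Set(
    allObjects.filter((object) => isNearCamera(object, camera)).map((object) => object.id)
  );
  return {
    attach: [...wanted].filter((id) => !loadedIds.has(id)),
    detach: [...loadedIds].filter((id) => !wanted.has(id)),
  };
}

const camera = { x: 0, y: 0, width: 1000, height: 500 };
const objects = [
  { id: 'chair', x: 1100, y: 100, width: 50, height: 50 }, // just outside, within margin
  { id: 'radio', x: 1300, y: 100, width: 50, height: 50 }, // too far away
];
console.log(reassessScene(objects, new Set(['radio']), camera));
```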
The components of the above diagram:
os: This is the "operating system" of Vozzy. It handles:
- UI sounds
- the command bar
- keyboard handling
- toasts and notifications
- alerts and confirm popups
- right-click menus
- audio stuff, including mute, unmute, managing WebRTC streams, etc.
scene: Manages the interactions of objects in Vozzy. It handles:
- Calculating spatial things, i.e. calculating the overlapping area between two rectangular objects given their coordinates; getting the absolute x and y of an object based on the relative viewport of the scene; calculating the distance between any two bounding boxes; creating a bounding box given an x, y, width, and height; getting a new bounding box with a given bounding box and x and y offset
- Panning the scene based on the user's location (os.object.get('me'))
- Garbage collection for objects outside of the scene (offload objects which are no longer in the scene or nearby)
- Handling object mutations (i.e. adding, updating, deleting)
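The spatial helpers listed for the scene are small geometry functions. Here is a sketch of what they might look like; the function names and the `{ left, top, right, bottom }` box shape are assumptions, not the scene's actual API.

```javascript
// Create a bounding box from an x, y, width, and height.
function createBox(x, y, width, height) {
  return { left: x, top: y, right: x + width, bottom: y + height };
}

// New bounding box shifted by an x and y offset.
function offsetBox(box, dx, dy) {
  return { left: box.left + dx, top: box.top + dy, right: box.right + dx, bottom: box.bottom + dy };
}

// Overlapping area of two boxes (0 when they do not overlap).
function overlapArea(a, b) {
  const width = Math.min(a.right, b.right) - Math.max(a.left, b.left);
  const height = Math.min(a.bottom, b.bottom) - Math.max(a.top, b.top);
  return width > 0 && height > 0 ? width * height : 0;
}

// Shortest distance between two boxes (0 when they touch or overlap).
function boxDistance(a, b) {
  const gapX = Math.max(0, a.left - b.right, b.left - a.right);
  const gapY = Math.max(0, a.top - b.bottom, b.top - a.bottom);
  return Math.hypot(gapX, gapY);
}

// Absolute scene coordinates of a point given relative to the viewport.
// Assumes the camera's x and y mark the viewport's top-left corner in the scene.
function toAbsolute(point, camera) {
  return { x: point.x + camera.x, y: point.y + camera.y };
}

const a = createBox(0, 0, 10, 10);
console.log(overlapArea(a, createBox(5, 5, 10, 10))); // 25
console.log(boxDistance(a, createBox(13, 14, 5, 5))); // 5
```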
object: The base class on which all Vozzy objects (participants, furniture, speakers, etc.) are created. Its interface includes:
- load(): Load the object's data from the database into memory
- render(): Create the object element on the canvas
- attachInteractivity(): Attach event listeners to the object for dragging, clicking, etc.
- bump(): When the object collides into another (hard) object
- contextmenu(): Right clicking on the object
- beforeUnload(): Any rendering/confirmation before the object is removed from the scene
- edit(): Interfaces with the OS to provide an editing UI for the object
- paint(): Paint the object
- move(): Move the object in the scene
- sync(): Sync the object (uses the messenger interface)
- getDistanceFromMe(): Returns the distance between a given object and the current user
- distanceFromMeChanged(): Triggered whenever the distance between the object and the user changes, after some threshold and until another threshold (distance or event)
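As an illustration of how an object type might build on that base class, here is a skeletal sketch. Only the method names come from the interface; the constructor arguments, the method bodies, the `Radio` subclass, and the assumption that bump() is also called on the object that gets hit are all hypothetical.

```javascript
// Hypothetical base class: most hooks are empty and meant to be overridden.
class VozzyObject {
  constructor({ id, x = 0, y = 0 }) {
    this.id = id;
    this.x = x;
    this.y = y;
  }
  load() {}
  render() {}
  attachInteractivity() {}
  bump() {}
  contextmenu() {}
  beforeUnload() {}
  edit() {}
  paint() {}
  sync() {}
  distanceFromMeChanged(distance) {}
  move(dx, dy) {
    this.x += dx;
    this.y += dy;
  }
  // Assumption: the current user's position is passed in as { x, y }.
  getDistanceFromMe(me) {
    return Math.hypot(this.x - me.x, this.y - me.y);
  }
}

// Example object type: a radio whose mute state toggles when bumped.
class Radio extends VozzyObject {
  constructor(options) {
    super(options);
    this.muted = false;
  }
  bump() {
    this.muted = !this.muted;
  }
}

const radio = new Radio({ id: 'radio', x: 3, y: 4 });
radio.bump();
console.log(radio.muted, radio.getDistanceFromMe({ x: 0, y: 0 })); // true 5
```

The base class fixes when each hook runs, while what a hook does is left to the object type.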
messenger: Sends multi-purpose messages across a specified channel ID. It exposes a subscribe and send interface, and manages subscriptions internally.
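A subscribe and send interface of this kind can be sketched in a few lines. The version below is an in-memory stand-in with no network transport; the class name, method signatures, and the returned unsubscribe function are assumptions.

```javascript
// Hypothetical in-memory messenger: channel ID -> set of handlers.
class Messenger {
  constructor() {
    this.channels = new Map();
  }

  // Register a handler for a channel; returns a function that unsubscribes it.
  subscribe(channelId, handler) {
    if (!this.channels.has(channelId)) this.channels.set(channelId, new Set());
    this.channels.get(channelId).add(handler);
    return () => this.channels.get(channelId).delete(handler);
  }

  // Deliver a message to every handler subscribed to the channel.
  send(channelId, message) {
    const handlers = this.channels.get(channelId);
    if (!handlers) return;
    for (const handler of handlers) handler(message);
  }
}

const messenger = new Messenger();
const unsubscribe = messenger.subscribe('room-1', (message) => console.log(message));
messenger.send('room-1', { type: 'move', id: 'me', x: 10, y: 20 });
unsubscribe();
```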
The underlying philosophy here is that Vozzy is quite opinionated about the mechanics of object interaction and management but is uninterested in the shape, design, and interactivity of an object itself. That's for the object's creator to decide!
The goal was to mirror the physical delineations of object interactions in the real world. After all, a radio manufacturer can't alter gravity, but they can control their radio's weight and appearance.
A screen recording of chairs being added to a Vozzy space:
Adding a radio to play music for everyone in the space:
Objects are interacted with by bumping against them. Here, the radio is muted by bumping into it: