Jitsi Meet
(in Space)
Issues in video-conferencing
As most of us know, video conferencing can be a boring, tiresome affair
Latency, network reliability, and visual and audio fidelity can all contribute to a fatiguing experience
Can we do better with the reproduction of audio?
Starting with sound
We shouldn't reinvent the wheel; current implementations aren't too bad...
but they weren't designed with everyday user experience in mind!
Core idea: Let's start with audio and focus on the goal of telepresence for more realistic conversations
Contributions
- Open-source, web-based implementation of binaural video conferencing (with technical evaluation)
- A synopsis of the current literature around spatial audio in video conferencing platforms
- A small-scale experiment to record participants' experience with the platform (through the lens of telepresence)
Spatial audio
Nearly all videoconferencing apps transmit mono audio in multiparty contexts
Spatialization over loudspeakers would be difficult from an acoustic perspective, but most of us use headphones anyway...
1. I can't build a teleconferencing system from scratch
A videoconferencing system is a sprawling, complicated beast
I needed a free & open-source app that was purely web-based
It would serve as a template for this novel integration
2. The academic & conceptual angle
There's a lot to say about teleconferencing and how we might best experience each other's voices over the net
Similar works that have tested this idea
Before I began - some considerations
A balance between over-programming and over-writing
Jitsi Meet
An open-source video conferencing platform
WebAudio
An API in JavaScript for working with audio
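As a rough illustration of what working with WebAudio looks like, the sketch below routes a source node through a stereo panner to the output. The function name `connectWithPan` is hypothetical and not part of Jitsi or WebAudio; `createStereoPanner()`, `pan`, `connect()` and `destination` are standard WebAudio members.

```javascript
// Hypothetical helper: connect any WebAudio source node to the speakers
// through a StereoPannerNode. pan = -1 is full left, 1 is full right.
function connectWithPan(ctx, sourceNode, pan) {
  const panner = ctx.createStereoPanner();
  // Clamp to the range the API accepts.
  panner.pan.value = Math.max(-1, Math.min(1, pan));
  sourceNode.connect(panner);
  panner.connect(ctx.destination);
  return panner;
}

// Browser usage (assumes a user gesture has already allowed audio):
//   const ctx = new AudioContext();
//   const osc = ctx.createOscillator();
//   connectWithPan(ctx, osc, -0.5); // test tone, slightly to the left
//   osc.start();
```

Everything in WebAudio follows this pattern: nodes are created from a context and chained with `connect()`.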
Jitsi is huge
Working with a mature application in React (a library built on top of JavaScript) is hard
Just understanding the ecosystem, finding the audio, was challenging
The difficulty was compounded when making changes that needed to propagate
How should audio be represented in group discussions?
- Panning vs. ambisonics (HRTFs)
- User control vs. automated
- Headphones vs. speakers
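To make the panning option concrete, here is a minimal sketch of equal-power panning gains. The function name `equalPowerGains` and the pan convention (-1 = full left, 1 = full right) are assumptions for illustration, not code taken from Jitsi.

```javascript
// Equal-power panning: map a pan position in [-1, 1] to left/right gains
// whose squares always sum to 1, so perceived loudness stays roughly
// constant as a voice moves across the stereo field.
function equalPowerGains(pan) {
  const clamped = Math.max(-1, Math.min(1, pan));
  // Map [-1, 1] onto an angle in [0, pi/2].
  const angle = ((clamped + 1) / 2) * (Math.PI / 2);
  return { left: Math.cos(angle), right: Math.sin(angle) };
}
```

A centred voice gets equal gains of about 0.707 in each ear rather than 0.5, which avoids the dip in loudness that simple linear panning produces in the middle. HRTF rendering goes further by filtering each ear's signal, which panning alone does not do.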
Lots of interesting stuff
Corona has pushed everyone online
- Estimated 10-40% increase in internet traffic (Nokia, Cloudflare)
- And away from cities!
Audio is way more important than video in conferencing and communication
- Video for social bond forming
Lateralizing/spatializing audio can improve intelligibility in many scenarios
- Double talk!
- Reduces cognitive load, improves intelligibility and memory, and is generally preferred by listeners
Where am I now?
- I have a server running for development and testing
- It's in Oslo (~300ms ping)
- Public and readily accessible
- I've integrated WebAudio in Jitsi
- It should be working well in Firefox and Chrome
- Both equal-power panning and HRTF-based binaural lateralization
- And automatic spatialization of users
- Intelligent rearrangement of users' audio tracks
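A rough sketch of how automatic placement and HRTF rendering could fit together in WebAudio. The helper names (`arrangeAzimuths`, `azimuthToPosition`, `spatializeSource`), the 120 degree default spread and the even-spacing rule are all assumptions for illustration; the actual integration may arrange users differently.

```javascript
// Hypothetical helper: spread `count` participants evenly over an arc
// (in degrees) centred in front of the listener. 0 = straight ahead,
// negative = left, positive = right.
function arrangeAzimuths(count, spreadDeg = 120) {
  if (count <= 0) return [];
  if (count === 1) return [0];
  const step = spreadDeg / (count - 1);
  return Array.from({ length: count }, (_, i) => -spreadDeg / 2 + i * step);
}

// Convert an azimuth to WebAudio coordinates. By default the listener
// sits at the origin facing -Z, with +X to the right.
function azimuthToPosition(azimuthDeg, distance = 1) {
  const rad = (azimuthDeg * Math.PI) / 180;
  return { x: Math.sin(rad) * distance, y: 0, z: -Math.cos(rad) * distance };
}

// Route one participant's source node through an HRTF panner.
function spatializeSource(ctx, sourceNode, azimuthDeg) {
  const panner = ctx.createPanner();
  panner.panningModel = "HRTF"; // "equalpower" is the other built-in model
  const { x, y, z } = azimuthToPosition(azimuthDeg);
  panner.positionX.value = x;
  panner.positionY.value = y;
  panner.positionZ.value = z;
  sourceNode.connect(panner);
  panner.connect(ctx.destination);
  return panner;
}
```

In a browser, each remote participant's source node could come from `ctx.createMediaStreamSource(stream)`. When someone joins or leaves, calling `arrangeAzimuths` again with the new count and updating each panner's position would rearrange the voices.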
A general idea
A little diagram
A small experiment soon!
To see how binaural audio in a standard conferencing layout might affect a number of metrics and user opinions - stay tuned!
What's left to do?
A little more serious examination of performance
- Test CPU utilization
- More people --> more CPU, all local tracks are spatialized separately
- Test differently sized groups
- De-syncing possibilities
- Audio/video may not sync
Thanks!
Spatial audio in Jitsi Meet
By jacksongoode