Jitsi Meet

(in Space)

Issues in video-conferencing

As most of us know, video conferencing can be a boring, tiresome affair

Latency, network reliability, and visual and audio fidelity can all contribute to a fatiguing experience

Can we do better with the reproduction of audio?

Starting with sound

We shouldn't reinvent the wheel; current implementations aren't too bad...

but they weren't designed with the user experience of daily use in mind!

Core idea: Let's start with audio and focus on the goal of telepresence for more realistic conversations

Contributions

Open source, web based implementation of binaural video conferencing (with technical evaluation)
A synopsis of the current literature on spatial audio in video conferencing platforms
A small-scale experiment to record participants' experience with the platform (through the lens of telepresence)

Spatial audio

Nearly all videoconferencing apps transmit mono audio in multiparty contexts

Spatialization over loudspeakers would be difficult from an acoustic perspective, but most of us use headphones anyway...

1. I can't build a teleconferencing system from scratch

A videoconferencing system is a sprawling, complicated beast

I needed a free & open-source app that was purely web based

It would serve as a template for this novel integration

2. The academic & conceptual angle

There's a lot to say about teleconferencing and how we might best experience each other's voices over the net

Similar works that have tested this idea

Before I began - some considerations

A balance between over-programming and over-writing

Jitsi Meet

An open source video conferencing platform

WebAudio

A JavaScript API for working with audio

Jitsi is huge

Working with a mature application written in React (a library built on top of JavaScript) is hard

Just understanding the ecosystem, finding the audio, was challenging

The difficulty was compounded when making changes that needed to propagate

How should audio be represented in group discussions?

Panning vs. ambisonics (HRTFs)
User control vs. automated
Headphones vs. speakers
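
To make the panning option concrete, here is a minimal sketch of equal-power stereo panning. It is not taken from the Jitsi integration; the function names `equalPowerGains` and `panMonoSample` are hypothetical. It uses the same cosine/sine law that the Web Audio specification describes for `StereoPannerNode` with mono input, so the summed power of the two channels stays constant as a voice moves from left to right.

```javascript
// Equal-power (constant-power) panning sketch. Names are illustrative only.
// pan: -1 = hard left, 0 = centre, +1 = hard right.
function equalPowerGains(pan) {
  // Clamp so out-of-range values do not produce negative gains.
  const p = Math.max(-1, Math.min(1, pan));
  // Map [-1, 1] onto a quarter circle [0, PI/2].
  const angle = ((p + 1) * Math.PI) / 4;
  return { left: Math.cos(angle), right: Math.sin(angle) };
}

// Hypothetical helper: turn one mono sample into a [left, right] pair.
function panMonoSample(sample, pan) {
  const { left, right } = equalPowerGains(pan);
  return [sample * left, sample * right];
}

// Usage: a voice placed halfway to the right.
const [l, r] = panMonoSample(1.0, 0.5);
console.log(l.toFixed(3), r.toFixed(3));
```

Panning like this only changes level between the ears. An HRTF-based renderer also adds timing and spectral cues, which is why the two approaches are worth comparing.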

Lots of interesting stuff

Corona has pushed everyone online
- Estimates of +10-40% (Nokia, Cloudflare)
- And away from cities!
Audio is way more important than video in conferencing and communication
- Video for social bond forming
Lateralizing/spatializing audio can improve intelligibility in many scenarios
- Double talk!
- Reduces cognitive load, improves intelligibility and memory, and is generally preferred by listeners

Where am I now?

I have a server running for development and testing
- It's in Oslo (~300ms ping)
- Public and readily accessible
I've integrated WebAudio in Jitsi
- It should be working well for Firefox and Chrome
- Both equal-power panning and HRTF-based binaural lateralization
- And automatic spatialization of users
- Intelligent rearrangement of users' audio tracks
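
The automatic spatialization described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code running in the Jitsi fork: the helper names (`arrangeAzimuths`, `azimuthToPosition`, `spatializeStream`) and the 120-degree frontal arc are hypothetical choices. The WebAudio calls used (`createMediaStreamSource`, `createPanner`, `panningModel = 'HRTF'`, and the `positionX/Y/Z` parameters) are standard parts of the API.

```javascript
// Spread n participants evenly across a frontal arc, left to right.
// arcDeg = 120 is an assumed default, not a value from the slides.
function arrangeAzimuths(n, arcDeg = 120) {
  if (n <= 0) return [];
  if (n === 1) return [0]; // a single remote speaker sits straight ahead
  const step = arcDeg / (n - 1);
  return Array.from({ length: n }, (_, i) => -arcDeg / 2 + i * step);
}

// Convert an azimuth (degrees, 0 = straight ahead, positive = right)
// into WebAudio coordinates. The default listener faces -Z with +X to the right.
function azimuthToPosition(azimuthDeg, radius = 1) {
  const a = (azimuthDeg * Math.PI) / 180;
  return { x: radius * Math.sin(a), y: 0, z: -radius * Math.cos(a) };
}

// Browser-only: route one remote MediaStream through an HRTF panner.
// ctx is an AudioContext; stream is a participant's MediaStream.
function spatializeStream(ctx, stream, azimuthDeg) {
  const source = ctx.createMediaStreamSource(stream);
  const panner = ctx.createPanner();
  panner.panningModel = 'HRTF';
  const { x, y, z } = azimuthToPosition(azimuthDeg);
  panner.positionX.value = x;
  panner.positionY.value = y;
  panner.positionZ.value = z;
  source.connect(panner).connect(ctx.destination);
  return panner;
}
```

When someone joins or leaves, calling `arrangeAzimuths` again with the new participant count and updating each panner's position is one simple way to rearrange the voices.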

A general idea

A little diagram

A small experiment soon!

To see how binaural audio in a standard conferencing layout might affect a number of metrics and user opinions - stay tuned!

What's left to do?

A little more serious examination of performance

Test CPU utilization
- More people --> more CPU, because all local tracks are spatialized separately
- Test differently sized groups
De-syncing possibilities
- Audio/video may not sync
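
One possible starting point for examining de-syncing is to ask the audio context how much delay it reports. The sketch below is an assumption about how this could be measured, not something from the project: `estimateAudioDelayMs` is a hypothetical helper. `baseLatency` and `outputLatency` are real `AudioContext` properties reported in seconds, but `outputLatency` is not available in every browser, so missing values are treated as zero.

```javascript
// Estimate the delay (in milliseconds) that the WebAudio path reports.
// ctx is expected to be an AudioContext; any object with the same
// numeric properties also works, which keeps the function easy to test.
function estimateAudioDelayMs(ctx) {
  const base = Number.isFinite(ctx.baseLatency) ? ctx.baseLatency : 0;
  const output = Number.isFinite(ctx.outputLatency) ? ctx.outputLatency : 0;
  return (base + output) * 1000;
}

// Usage in a browser (not run here):
//   const ctx = new AudioContext();
//   console.log(`reported audio delay: ${estimateAudioDelayMs(ctx)} ms`);
```

This only reflects what the browser reports for the audio path. It says nothing about video delay, so it is a lower bound on any audio/video offset rather than a measurement of it.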

Thanks!

Spatial audio in Jitsi Meet

jacksongoode
