Introducing Receiver Audio Subscriptions
Jitsi Meet has had support for ReceiverConstraints for video for a long time. Receivers can specify which streams they wish to receive, and at what resolutions, and the backend will attempt to satisfy these constraints subject to the available bandwidth. But the same flexibility was not available for audio — clients always receive all available audio sources. Until now.
This project, implemented as part of GSoC 2025, introduces the ability for receivers to specify which audio sources they wish to receive.
What can we build with this API? Here are some ideas we’ve had, but we’re also excited to see what the community comes up with:
- Breakout rooms without entirely separate meetings
- Deafen
- Live-translated meetings (multilingual webinars)
In the rest of this post we describe how audio subscriptions are implemented and what challenges we faced.
Signaling message semantics
At the heart of this project is a new signaling message: ReceiverAudioSubscription. This message expresses which audio sources a participant wishes to receive from the bridge. It has two fields:
type ReceiverAudioSubscription = {
    mode: "All" | "None" | "Include" | "Exclude";
    list?: string[];
};
- All: subscribe to every audio source
- None: do not receive any audio
- Include: receive only the sources listed in list
- Exclude: receive all sources except those listed in list
The list field is relevant only for Include and Exclude. By default, every participant is in All mode, which means that unless a subscription message is sent, all audio sources are forwarded (preserving the old behavior). For example:
{ "colibriClass": "ReceiverAudioSubscription", "mode": "All" }
{ "colibriClass": "ReceiverAudioSubscription", "mode": "Include", "list": ["alice-a0", "bob-a0"] }
This message format defines the contract between clients and the Videobridge: clients describe what they want, and the bridge enforces it.
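To make the four modes concrete, here is a minimal sketch of how a receiver's subscription could be evaluated against a single source name. The helper name wantsAudioSource is hypothetical and is not taken from the Videobridge code; it only restates the semantics listed above.

```typescript
// Illustrative only: restates the message semantics described in this post,
// not the actual Videobridge implementation.
type ReceiverAudioSubscription = {
    mode: "All" | "None" | "Include" | "Exclude";
    list?: string[];
};

// Hypothetical helper: does a receiver with this subscription want
// audio from the source named `sourceName`?
function wantsAudioSource(sub: ReceiverAudioSubscription, sourceName: string): boolean {
    // `list` only matters for Include and Exclude; treat a missing list as empty.
    const list = sub.list ?? [];
    switch (sub.mode) {
        case "All":
            return true;
        case "None":
            return false;
        case "Include":
            return list.includes(sourceName);
        case "Exclude":
            return !list.includes(sourceName);
    }
}
```

With { mode: "Include", list: ["alice-a0", "bob-a0"] }, this helper accepts alice-a0 and bob-a0 and rejects every other source; with no message at all, the default All mode accepts everything.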
Client support
On the client side, support for audio subscriptions was added in lib-jitsi-meet PR #2869. Applications can now send ReceiverAudioSubscription messages via simple API calls. The main entry point is:
conference.setAudioSubscriptionMode({
    mode: "Include",
    list: ["alice-a0", "bob-a0"]
});
This instructs the bridge to forward only the streams from Alice and Bob to the local client. For convenience, there is also a higher-level method for the deafen feature:
conference.muteRemoteAudio(true); // equivalent to { mode: "None" }
conference.muteRemoteAudio(false); // equivalent to { mode: "All" }
Internally, lib-jitsi-meet translates these calls into JSON messages sent over the bridge channel. If Include or Exclude is requested with an empty list, the client library automatically coerces the request to None or All respectively, since including nothing is the same as receiving no audio and excluding nothing is the same as receiving everything. This avoids sending messages with an empty list and keeps behavior consistent.
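The empty-list coercion can be sketched as a small normalization step applied before the message is sent. The function name normalizeSubscription is an assumption made for illustration; it is not the lib-jitsi-meet source.

```typescript
type ReceiverAudioSubscription = {
    mode: "All" | "None" | "Include" | "Exclude";
    list?: string[];
};

// Hypothetical helper: collapses Include/Exclude with an empty list
// into the equivalent list-free mode.
function normalizeSubscription(sub: ReceiverAudioSubscription): ReceiverAudioSubscription {
    const list = sub.list ?? [];
    if (sub.mode === "Include") {
        // Including nothing means receiving nothing.
        return list.length === 0 ? { mode: "None" } : { mode: "Include", list };
    }
    if (sub.mode === "Exclude") {
        // Excluding nothing means receiving everything.
        return list.length === 0 ? { mode: "All" } : { mode: "Exclude", list };
    }
    // All and None never carry a list.
    return { mode: sub.mode };
}
```

Normalizing on the client means the bridge only ever sees Include and Exclude messages with a non-empty list, at least from clients that go through this path.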
Backend Support
On the backend, Jitsi Videobridge receives the ReceiverAudioSubscription messages and enforces them. Each conference has an AudioSubscriptionManager, which tracks the subscriptions for all participants. For every incoming RTP packet, it decides which participants should receive it based on their current subscription mode.
Subscription Management in Jitsi Videobridge
Within the Videobridge, each conference has a dedicated AudioSubscriptionManager. This class maintains the subscriptions for all participants and decides, for each incoming packet, which participants should receive it and which should not.
The above sounds straightforward, but in practice there were two main challenges: co-existing with the route-loudest-only option, and handling multi-bridge and mesh scenarios.
Co-existing with route-loudest-only
Videobridge has a configuration option called route-loudest-only. When enabled, it forwards only the three loudest audio sources. This filtering happens early in the packet pipeline, before any other processing. To coexist with this setting, Videobridge ensures that explicitly subscribed sources are always forwarded, regardless of their loudness. For example, in a breakout room scenario, participants must still hear each other even if none of them are among the loudest speakers. Without special handling, route-loudest-only=true would discard these streams before the subscription logic could act.
To solve this, the pre-decrypt loudness check was extended to also verify whether any participant has explicitly subscribed to the source. Because this check runs on every incoming packet, it must be highly efficient. To achieve this, the AudioSubscriptionManager keeps a pre-computed set of explicitly subscribed sources, allowing constant-time checks during packet processing.
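The pre-computed set can be sketched as follows. This is an illustration in TypeScript rather than the bridge's own code, and it makes two assumptions that the post does not spell out: that "explicitly subscribed" means listed by some receiver in Include mode, and that the set is rebuilt whenever a subscription changes. The class and method names are hypothetical.

```typescript
type ReceiverAudioSubscription = {
    mode: "All" | "None" | "Include" | "Exclude";
    list?: string[];
};

// Hypothetical sketch: moves the cost to subscription updates (rare)
// so that the per-packet check (frequent) is a single Set lookup.
class ExplicitSubscriptionIndex {
    private readonly byReceiver = new Map<string, ReceiverAudioSubscription>();
    private explicitSources = new Set<string>();

    // Called when a receiver sends a new ReceiverAudioSubscription.
    setSubscription(receiverId: string, sub: ReceiverAudioSubscription): void {
        this.byReceiver.set(receiverId, sub);
        // Assumption: only Include lists count as explicit subscriptions.
        const next = new Set<string>();
        this.byReceiver.forEach((s) => {
            if (s.mode === "Include") {
                (s.list ?? []).forEach((name) => next.add(name));
            }
        });
        this.explicitSources = next;
    }

    // Per-packet check: keep the packet if the source is among the loudest
    // or if at least one receiver explicitly asked for it.
    shouldKeep(sourceName: string, isAmongLoudest: boolean): boolean {
        return isAmongLoudest || this.explicitSources.has(sourceName);
    }
}
```

Rebuilding the whole set on each change is the simplest way to keep it correct; a real implementation could update it incrementally if subscription changes turned out to be frequent.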
Multi-Bridge Conferences
In large conferences, Jitsi Meet often deploys multiple Videobridge servers connected in a mesh. A participant connected to one bridge may subscribe to a source hosted on another bridge (“remote source”). However, subscriptions are not automatically propagated across bridges.
To address this, we introduce two new inter-bridge signaling messages: AddAudioSubscription and RemoveAudioSubscription.
- When a participant subscribes to a remote source, the first subscription on that bridge triggers an AddAudioSubscription message to the bridge that owns the source.
- Conversely, when the last subscription to a remote source is removed, a RemoveAudioSubscription is sent.
This ensures that the subscription state is propagated across the mesh with minimal signaling overhead. The same mechanism applies to “Receiver” endpoints, that is, endpoints that do not send media themselves but only receive streams (see this post for the details). When a bridge receives an AddAudioSubscription, it first checks whether the source belongs to one of its directly connected participants. If yes, it updates its AudioSubscriptionManager; if not, it forwards the message toward the bridge that might host the source. This propagation allows subscriptions from Receiver endpoints, even those not directly connected in the mesh, to reach the correct bridge.
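The first-subscription and last-subscription rule described above amounts to reference counting per remote source. The sketch below shows that idea in TypeScript; the class name, the message shape, and the send callback are assumptions made for illustration and do not reflect the actual inter-bridge message format.

```typescript
// Hypothetical message shape; the real inter-bridge format may differ.
type RelayMessage = {
    type: "AddAudioSubscription" | "RemoveAudioSubscription";
    source: string;
};

// Counts local subscriptions per remote source and signals the owning
// bridge only on the 0 -> 1 and 1 -> 0 transitions.
class RemoteSubscriptionCounter {
    private readonly counts = new Map<string, number>();
    private readonly send: (msg: RelayMessage) => void;

    constructor(send: (msg: RelayMessage) => void) {
        this.send = send;
    }

    subscribe(source: string): void {
        const n = (this.counts.get(source) ?? 0) + 1;
        this.counts.set(source, n);
        if (n === 1) {
            // First local subscriber: tell the remote bridge.
            this.send({ type: "AddAudioSubscription", source });
        }
    }

    unsubscribe(source: string): void {
        const n = this.counts.get(source);
        if (n === undefined) {
            return; // no local subscriber, nothing to remove
        }
        if (n === 1) {
            // Last local subscriber gone: tell the remote bridge.
            this.counts.delete(source);
            this.send({ type: "RemoveAudioSubscription", source });
        } else {
            this.counts.set(source, n - 1);
        }
    }
}
```

With this scheme, any number of local participants subscribing to the same remote source produces a single AddAudioSubscription, which is where the low signaling overhead comes from.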
Wrapping up
With this new subscription model, clients can now choose which audio sources to receive, much as they already could for video. The feature supports use cases like breakout rooms, deafen, and selective forwarding in large conferences. The API will soon be available on the server side too, so you can build custom features on top of it.