In this post I want to collect some problems I stumbled over while administering a BigBlueButton instance during the last year. I am writing it in the hope that others will not experience the same misery I went through while troubleshooting some nasty problems with my setup. But let’s start at the beginning.

TURN – A Solution for a Problem that Shouldn’t Exist

Let’s imagine an optimal world. Everyone has an IPv6 address, there are no NATs, no content filters, no overly restrictive firewalls towards the internet – oh, and rainbow-puking unicorns of course. If you want to meet others in a BigBlueButton (wherever that belongs in such a world), you can just connect to the server. You simply[1] send your media streams (audio, video, screen sharing, etc.) via UDP to the server, which mixes the streams and distributes them to the other participants. Everyone can see and hear everybody else. Nice.

But the real world is cruel and dark – at least from a network guy’s perspective. Several things can disrupt the free exchange of media streams:

  1. Restrictive outgoing firewalls in the local network towards the internet, which prevent communication via “unknown” UDP ports
  2. Restrictive outgoing firewalls on the local machine, sometimes enforced by company policies or virus scanners
  3. Badly configured NAT gateways
  4. Missing or badly configured IPv6

To overcome those obstacles, TURN relays are used. They are basically a way to trick firewalls into thinking that the multimedia traffic is “valid” according to their rules. A TURN relay listens on a specified port that firewalls are unlikely to block and that content-filtering proxies let through (like TCP 443). It then brokers a UDP stream to the streaming endpoint and relays the traffic.

Sounds quite simple. So where is this technique required specifically in a BigBlueButton (BBB) setup? Let’s take a look at this beauty of a diagram from the BBB documentation:

[Figure: BBB high-level architecture]

You will quickly notice that BBB is not a single piece of software but rather a software stack built from various components, tied together by carefully crafted configuration. Two components in this diagram are of interest when debugging multimedia streams from and to participants:

  • Kurento for webcam, screen sharing and whiteboard streams (video)
  • Freeswitch for audio streams

A client needs to connect to both of these components to be able to use BBB to its full extent. So let’s set up a TURN server. Should not be that hard, should it?

My Setup

In my specific setup, I don’t use the TURN relay exclusively for BBB, but also for Nextcloud Talk and Jitsi Meet. My configuration might therefore look pretty familiar to you if you have ever seen the Coturn configuration proposed by Jitsi:

fingerprint
lt-cred-mech
use-auth-secret
keep-address-family
static-auth-secret=SECRET
realm=<realm>
cert=/etc/coturn/certs/fullchain.pem
pkey=/etc/coturn/certs/privkey.pem
no-multicast-peers
no-cli
no-loopback-peers
no-tcp-relay
#no-tcp
listening-port=3478
tls-listening-port=5349
relay-ip=<external_v4>
relay-ip=<external_v6>
listening-ip=<external_v4>
listening-ip=<external_v6>
no-tlsv1
no-tlsv1_1
# https://ssl-config.mozilla.org/#server=haproxy&version=2.1&config=intermediate&openssl=1.1.0g&guideline=5.4
cipher-list=ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
denied-peer-ip=0.0.0.0-0.255.255.255
denied-peer-ip=10.0.0.0-10.255.255.255
denied-peer-ip=100.64.0.0-100.127.255.255
denied-peer-ip=127.0.0.0-127.255.255.255
denied-peer-ip=169.254.0.0-169.254.255.255
denied-peer-ip=172.16.0.0-172.31.255.255
denied-peer-ip=192.0.0.0-192.0.0.255
denied-peer-ip=192.0.2.0-192.0.2.255
denied-peer-ip=192.88.99.0-192.88.99.255
denied-peer-ip=192.168.0.0-192.168.255.255
denied-peer-ip=198.18.0.0-198.19.255.255
denied-peer-ip=198.51.100.0-198.51.100.255
denied-peer-ip=203.0.113.0-203.0.113.255
denied-peer-ip=240.0.0.0-255.255.255.255
syslog
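
The use-auth-secret and static-auth-secret options above mean that Coturn does not keep a user database: clients present time-limited credentials that the application server derives from the shared secret. The following Python sketch shows how such a credential pair is typically derived under the scheme the Coturn documentation describes for this mode (the username carries the expiry timestamp, the password is a Base64-encoded HMAC-SHA1 over the username). The function name, the default lifetime and the "bbb" user label are assumptions made for illustration; they are not taken from BBB, Jitsi or Nextcloud.

```python
import base64
import hashlib
import hmac
import time


def turn_credentials(secret, ttl=86400, user="", now=None):
    """Derive time-limited TURN credentials from a shared secret.

    username = "<expiry unix timestamp>[:<user>]"
    password = base64(HMAC-SHA1(secret, username))
    `now` exists only to make the function testable (hypothetical parameter).
    """
    expiry = int(time.time() if now is None else now) + ttl
    username = f"{expiry}:{user}" if user else str(expiry)
    digest = hmac.new(secret.encode(), username.encode(), hashlib.sha1).digest()
    return username, base64.b64encode(digest).decode()


# "SECRET" is the placeholder from the configuration above, not a real value.
username, password = turn_credentials("SECRET", ttl=3600, user="bbb")
print(username, password)
```

Because the password can be recomputed from the username and the secret, every application sharing the relay only needs to know static-auth-secret.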

As I hate IPv4 (and v4 addresses are relatively expensive), I decided to run the TURN server in parallel to another web application (like Jitsi) that I host on my server, on the same address. Again inspired by the Jitsi Meet configuration, I used Nginx with a config snippet like this to multiplex the traffic between my web application and the TURN server, based on the ALPN protocols the client advertises in its TLS handshake:

# cat modules-available/60-turn.conf

stream {
    upstream web {
        server <web application>:443;
    }
    upstream turn {
        server [<turn IPv6>]:5349;
    }
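    # Clients that advertise HTTP via ALPN go to the web application;
    # everything else (including clients that send no ALPN) goes to TURN.
    map $ssl_preread_alpn_protocols $upstream {
        ~\bh2\b         web;
        ~\bhttp/1\.     web;
        default         turn;
    }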
    server {
        listen 80;
        listen [::]:80;
        proxy_pass <web application>:80;
    }
    server {
        listen 443;
        listen [::]:443;

        ssl_preread on;
        proxy_pass $upstream;
    }
}

This also has the neat advantage that you can still use Let’s Encrypt with HTTP challenges for the TURN server’s certificate. A certificate for the TURN server is not strictly required, as the WebRTC traffic should already be encrypted using DTLS. But you can never be too paranoid, can you? The BBB side is configured exactly as BBB recommends. Et voilà! Everything seems to work as intended… or does it?

Pitfall #1: “My shitty old computer shows an empty page!”

After a while, complaints reached me that the web application could not be reached from a very old MacBook. Some debugging revealed that this MacBook did not send or understand ALPN (granted: ALPN is relatively new in the context of those technologies), so its requests fell through to the map’s default and ended up at the TURN server. So I dropped the idea of multiplexing the web application and bought another IPv4 address for the TURN server. I still kept Nginx to allow the Let’s Encrypt HTTP ACME challenge.

“But why do you use the HTTP challenge in the first place instead of the DNS challenge?” I hear you ask. In my particular setup, HTTP is currently much easier and I want to use it.
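
To make the ALPN pitfall concrete, here is a small Python model of the routing decision the Nginx map takes. It is only a sketch: the function name is made up, and the input string stands in for $ssl_preread_alpn_protocols, which Nginx fills with the comma-separated protocol list from the client’s TLS handshake (assumed to be empty when the client sends no ALPN extension at all).

```python
import re

# Patterns mirror the map block; anything that matches neither goes to TURN.
ROUTES = [
    (re.compile(r"\bh2\b"), "web"),
    (re.compile(r"\bhttp/1\."), "web"),
]


def pick_upstream(alpn_protocols):
    """Return the upstream name for the ALPN list a client offered."""
    for pattern, upstream in ROUTES:
        if pattern.search(alpn_protocols):
            return upstream
    return "turn"


print(pick_upstream("h2,http/1.1"))  # a current browser
print(pick_upstream(""))             # a client without ALPN support
```

A browser without ALPN support is indistinguishable from a TURN client in this scheme, which is exactly what happened to the old MacBook.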

Pitfall #2: “Audio works, but I can’t send video!” (Error 1020)

After further testing, I noticed that some clients had trouble sending video streams. As audio worked flawlessly, the problem had to be somewhere in the Kurento media server. It turns out that under some circumstances Kurento has trouble deciding which IP address to use as a valid streaming candidate. I use a routed setup to assign BBB’s public IPs to its VM/container using dummy interfaces. Kurento gathers all interface IPs and sometimes tries to use the internal IPs from the transfer network instead of its public ones. A simple fix is to add the following two lines to /etc/kurento/modules/kurento/WebRtcEndpoint.conf.ini:

externalAddressIPv4=<external_ipv4>
externalAddressIPv6=<external_ipv6>
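
If you are unsure which of a machine’s addresses are unsuitable as streaming candidates, a quick check along the following lines may help. This is only a sketch built on Python’s standard ipaddress module, not anything Kurento does internally; the function name is hypothetical, and “public” simply means what ipaddress considers globally routable.

```python
import ipaddress


def split_candidates(addresses):
    """Split addresses into (public, internal) lists.

    Everything that is not globally routable (RFC 1918, link-local,
    loopback, ULA, ...) is treated as internal.
    """
    public, internal = [], []
    for raw in addresses:
        ip = ipaddress.ip_address(raw)
        (public if ip.is_global else internal).append(str(ip))
    return public, internal


# Feed in whatever `ip addr` reports on the BBB host.
public, internal = split_candidates(["10.0.3.1", "fe80::1"])
print("public:", public, "internal:", internal)
```

Any address that ends up in the internal list is one that remote clients cannot reach, so it should not be what Kurento announces.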

Pitfall #3: “Video works, but I can’t send audio!” (Error 1007)

That’s a nasty one. This points to a problem with Freeswitch and only happens if the client does not have an IPv6 address.

It took me quite a lot of time to realize that Freeswitch eliminates streaming candidates (either relayed through the TURN server or direct) based on the address family used to negotiate the parameters via the websocket. So if you initially connect to Freeswitch using IPv4, all non-IPv4 candidates are eliminated in the following negotiation of the media transport. This is also why you need this Nginx map to make BBB compatible with IPv6-only clients: the Freeswitch WebRTC endpoint is reverse proxied in the default setup, and without the map Freeswitch would only choose IPv4 candidates.
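
The filtering behaviour described here can be modelled in a few lines of Python. This is a sketch of the observed behaviour, not Freeswitch’s actual code; the function name is invented and all addresses come from the documentation ranges.

```python
import ipaddress


def surviving_candidates(signalling_addr, candidates):
    """Keep only candidates in the signalling connection's address family."""
    family = ipaddress.ip_address(signalling_addr).version
    return [c for c in candidates if ipaddress.ip_address(c).version == family]


# An IPv4-only client whose only usable candidate is an IPv6 relay address:
print(surviving_candidates("192.0.2.10", ["2001:db8::5"]))
# The same client with an IPv4 relay address:
print(surviving_candidates("192.0.2.10", ["198.51.100.5"]))
```

In the first case nothing survives, so there is no path left for the audio stream.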

So if we want to make this work, we need to keep the address family throughout the complete chain of components on the way to the Freeswitch server. Our Coturn configuration already contains keep-address-family, so the problem here must be in Nginx. And bingo: we proxy only to the IPv6 address of our Coturn. If a client that only has an IPv4 address wants to use the TURN relay, Nginx switches the address family to IPv6, Coturn keeps this family and passes it on to Freeswitch, and Freeswitch eliminates the relayed IPv6 candidate because the client initially connected via IPv4. Therefore, the TURN relay is not considered for use and the client cannot connect to Freeswitch (sigh).

Kurento does not care about the address family, which is in my humble opinion the correct behaviour. The behaviour of Freeswitch may (I am guessing) be rooted in its application structure: profiles for IPv4 and IPv6 are configured separately, and thus Freeswitch may need to separate the handling of the different address families relatively early in the process.

To fix this, we need to add some lines to the Nginx configuration (similar to the map mentioned above) to distinguish between the address families for proxying:

stream {
    upstream web {
        server <web application>:443;
    }
    upstream turn {
        server [<turn IPv6>]:5349;
    }
    upstream turn4 {
        server <turn IPv4>:5349;
    }
    map $ssl_preread_alpn_protocols $upstream {
        ~\bh2\b         web;
        ~\bhttp/1\.     web;
        default         turn;
    }
    
    map $ssl_preread_alpn_protocols $upstream4 {
        ~\bh2\b         web;
        ~\bhttp/1\.     web;
        default         turn4;
    }
    
    server {
        listen 80;
        listen [::]:80;
        proxy_pass 10.0.3.1:80;
    }
    server {
        listen 443;
    
        ssl_preread on;
        proxy_pass $upstream4;
    }
    server {
        listen [::]:443;
    
        ssl_preread on;
        proxy_pass $upstream;
    }
}
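
As a sanity check, the decision this configuration takes can be written down as a small Python function. It is a sketch under the assumption that IPv4 clients only ever reach the `listen 443` server and IPv6 clients only the `listen [::]:443` server; the function name is made up, and the return values are the upstream names from the config.

```python
import ipaddress
import re

# Same two patterns as in both map blocks.
WEB_ALPN = re.compile(r"\bh2\b|\bhttp/1\.")


def route(client_addr, alpn_protocols):
    """Name the upstream chosen for a client address and its ALPN list."""
    if WEB_ALPN.search(alpn_protocols):
        return "web"
    # Non-HTTP traffic stays within the client's address family.
    version = ipaddress.ip_address(client_addr).version
    return "turn4" if version == 4 else "turn"


print(route("192.0.2.10", ""))            # IPv4 TURN client
print(route("2001:db8::10", ""))          # IPv6 TURN client
print(route("192.0.2.10", "h2,http/1.1")) # browser
```

The point of the split is that Coturn now sees the same address family the client used, so keep-address-family produces a relay candidate that Freeswitch will accept.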

With these changes we finally seem to be able to break out of the most “broken” client networks.

Acknowledgements

Thanks to Felix Dörre for proofreading this article and helping to debug those problems.


  1. Let’s assume for the sake of argument that WebRTC can be considered “simple”. That one is a rabbit hole of its own.