In this tutorial we will explain how to create the basic functionality for a browser-based software phone (softphone) using the Jabra library in conjunction with WebRTC. WebRTC is a set of native browser APIs that provides a relatively simple way of doing real-time peer-to-peer communication with audio and video.
Try demo
Prerequisites
- This tutorial assumes some knowledge of how to use the Jabra library, covered in previous articles such as the call control tutorial.
- The tutorial refers to `device`, assuming that selection of an active Jabra device is handled in the UI.
Summary
We will go through the following steps:
- The flow of controlling audio tracks in conjunction with the Jabra library.
- Initialize an audio stream with the selected Jabra device.
- Initialize a peer connection (for demo purposes, both the local and remote ends of the connection will live in the same webpage).
- Playback "remote" audio.
The tutorial will focus on audio, but most concepts can be used with video as well.
Softphone flow
A softphone consists of three key elements: local audio (your own audio), remote audio (the audio of the person or people you are talking to) and the connection between the two. Luckily, the browser does most of the heavy lifting for all three elements, so our primary job is to handle the flow correctly.
Create an ICallControl object
The first step in performing any call control-related functionality on a device is to initialize the Call Control module.
This is done by creating a `CallControlFactory` object and using it to create an `ICallControl` object associated with a Jabra device.
Follow the Using SDK modules guide in the call control tutorial for more information.
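A minimal sketch of that setup is shown below; the package name, transport option and method names are taken from the Jabra JavaScript library but may differ in your setup, so treat this as illustrative and follow the Using SDK modules guide for the authoritative steps:

```typescript
import { init, CallControlFactory, RequestedBrowserTransport } from "@gnaudio/jabra-js";

// Initialize the Jabra library (the transport choice depends on your setup).
const jabraApi = await init({ transport: RequestedBrowserTransport.WEB_HID });

// Create the call control module and an ICallControl for the selected device.
const callControlFactory = new CallControlFactory(jabraApi);
const callControl = await callControlFactory.createCallControl(device);
```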
[Note] For the rest of this tutorial, we will assume the `ICallControl` instance for device "X" is called `callControl`.
Start a call
The first step is to start a call. In our example we assume that the user presses a start-call button in the UI to trigger this intent.
As explained in the call control tutorial, we need to acquire a call lock from the device in order to use the call-control APIs.
If we get the lock, we can proceed with initializing the local audio stream, initializing the peer connection and, lastly, playing back the remote audio stream. We will go into depth with these procedures in a later step - for now we just show the flow.
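A sketch of that flow is shown below. The helpers initializeLocalAudioStream, initializePeerConnection and playRemoteAudio are hypothetical placeholders for the steps covered later in this tutorial, and the call-lock and off-hook method names should be checked against the call control tutorial:

```typescript
let localStream: MediaStream;
let remoteStream: MediaStream;

async function startCall() {
  // Ask the device for the call lock; only proceed if we get it.
  const gotLock = await callControl.takeCallLock();
  if (!gotLock) {
    console.warn("Could not get the call lock - another softphone may be using the device.");
    return;
  }

  // Tell the device we are now in a call (off-hook).
  callControl.offHook(true);

  // 1. Initialize the local audio stream from the selected Jabra device.
  localStream = await initializeLocalAudioStream(device);

  // 2. Set up the peer connection (local and remote in the same page for the demo).
  remoteStream = await initializePeerConnection(localStream);

  // 3. Play back the "remote" audio.
  await playRemoteAudio(remoteStream);
}
```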
[Note] In this tutorial, we loop our own audio back as the remote peer's audio for testing purposes. Hence, when running the demo you should be able to hear your own voice with a slight delay.
[Note] In a real-world example, you would also need to handle accept/reject scenarios from the remote peer.
Mute/unmute
Some Jabra devices actively stop audio from the device when set to the mute state, but others only signal mute by turning on LEDs, leaving it up to the integrator to handle the actual audio suspension.
Therefore, when muting the device we also need to disable the local audio track - and vice versa for unmute - like this:
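A sketch of a combined mute toggle, assuming the localStream from the start-call flow and the mute method described in the call control tutorial:

```typescript
function setMuted(muted: boolean) {
  // Put the device into the mute state (LEDs, busylight, etc.).
  callControl.mute(muted);

  // Also suspend or resume the actual audio, since not all devices
  // cut the microphone signal themselves.
  localStream.getAudioTracks().forEach((track) => {
    track.enabled = !muted;
  });
}
```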
[Note] The mute state can be triggered through the library as we are doing here, but it can also be triggered by pressing the physical mute button that most devices have.
Hold/resume call
WebRTC offers different ways of holding a call, but the simplest one is to mute both the local and remote audio streams. This method also makes it possible to replace the local audio stream with a music track to provide "waiting music".
Put the call on hold:
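A sketch, assuming the hold method from the call control tutorial and the localStream/remoteStream from the earlier steps:

```typescript
// Signal hold on the device and silence both directions of audio.
callControl.hold(true);
localStream.getAudioTracks().forEach((track) => (track.enabled = false));
remoteStream.getAudioTracks().forEach((track) => (track.enabled = false));
```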
Resume call:
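The corresponding resume sketch, under the same assumptions:

```typescript
// Take the device out of hold and re-enable both directions of audio.
callControl.hold(false);
localStream.getAudioTracks().forEach((track) => (track.enabled = true));
remoteStream.getAudioTracks().forEach((track) => (track.enabled = true));
```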
End call
Clean up when ending a call.
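A sketch of the cleanup, assuming the off-hook and call-lock methods from the call control tutorial; localPeerConnection, remotePeerConnection and audioContext refer to the objects created in the sections below:

```typescript
function endCall() {
  // Tell the device the call has ended and release the call lock.
  callControl.offHook(false);
  callControl.releaseCallLock();

  // Stop capturing from the microphone and tear down the connection.
  localStream.getTracks().forEach((track) => track.stop());
  localPeerConnection.close();
  remotePeerConnection.close();
  audioContext.close();
}
```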
Initialize local audio stream
In the section "Start a call" we initialized a local audio stream. In this section we will cover how to do that.
Get microphone permission
Google Chrome requires that the user actively grants permission to use the microphone on a given webpage. This permission only needs to be granted once per domain. Trigger the permission dialog like this:
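For example, using the standard getUserMedia API (the temporary stream is stopped immediately, since we only need the permission at this point):

```typescript
// Trigger the browser's microphone permission prompt.
const permissionStream = await navigator.mediaDevices.getUserMedia({ audio: true });

// We only needed the permission here, so stop the temporary stream again.
permissionStream.getTracks().forEach((track) => track.stop());
```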
[Note] Unfortunately, a browser quirk means that we need to call `getUserMedia` to get microphone permission before we can get a list of connected devices, and then call it again to get the stream from the correct device. The Permissions API will be a better way of handling this once the spec is out of draft.
Get Jabra device from browser
The browser holds a list of connected audio devices, and after getting microphone permission we can retrieve this list by calling `navigator.mediaDevices.enumerateDevices()`.
Most laptops have built-in microphones, so we need to filter the list down to our selected Jabra device.
We can do that by using the `browserLabel` property on the `device` object returned from the Jabra library.
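A sketch of that lookup, assuming browserLabel matches the label the browser reports for the device's audio input:

```typescript
// List every media device the browser knows about.
const browserDevices = await navigator.mediaDevices.enumerateDevices();

// Find the audio input that corresponds to the selected Jabra device.
const jabraInput = browserDevices.find(
  (d) => d.kind === "audioinput" && d.label === device.browserLabel
);

if (!jabraInput) {
  throw new Error("The selected Jabra device was not found among the browser's audio inputs.");
}
```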
Initialize stream for Jabra device
Lastly, we initialize the stream for the selected Jabra device.
This returns a stream object that captures audio without playing it back (we do that in a later step).
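For example, using the deviceId of the input found in the previous step:

```typescript
// Capture audio from the selected Jabra device only, without playing it back yet.
const localStream = await navigator.mediaDevices.getUserMedia({
  audio: { deviceId: { exact: jabraInput.deviceId } },
});
```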
Initialize a peer connection
The WebRTC specification lets you set up a communication channel between peers via an ICE (Interactive Connectivity Establishment) server. We will not go into details about this part, but recommend reading Getting started with peer connections on webrtc.org.
In our example we set up a simple peer connection within the same webpage, routing the outgoing signal back to ourselves.
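A sketch of that in-page loopback, using only standard WebRTC APIs; localStream comes from the previous section:

```typescript
const localPeerConnection = new RTCPeerConnection();
const remotePeerConnection = new RTCPeerConnection();

// Exchange ICE candidates directly in code instead of via a signalling server.
localPeerConnection.onicecandidate = (e) => {
  if (e.candidate) remotePeerConnection.addIceCandidate(e.candidate);
};
remotePeerConnection.onicecandidate = (e) => {
  if (e.candidate) localPeerConnection.addIceCandidate(e.candidate);
};

// Send the local (Jabra) audio to the "remote" peer.
localStream.getAudioTracks().forEach((track) => {
  localPeerConnection.addTrack(track, localStream);
});

// Capture the stream arriving at the "remote" end.
let remoteStream: MediaStream;
remotePeerConnection.ontrack = (e) => {
  remoteStream = e.streams[0];
};

// Standard offer/answer exchange, again done in-page for the demo.
const offer = await localPeerConnection.createOffer();
await localPeerConnection.setLocalDescription(offer);
await remotePeerConnection.setRemoteDescription(offer);
const answer = await remotePeerConnection.createAnswer();
await remotePeerConnection.setLocalDescription(answer);
await localPeerConnection.setRemoteDescription(answer);
```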
Playback remote audio
The last step is to play back the remote audio via an `AudioContext`.
[Note] A Chrome bug requires us to set up a muted audio element alongside the audio context in order to play back sound sent over RTC. Please see this issue on StackOverflow and this bug report on the Chromium project.
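A sketch of the playback, including the muted-element workaround mentioned in the note:

```typescript
// Route the remote stream through an AudioContext.
const audioContext = new AudioContext();
const source = audioContext.createMediaStreamSource(remoteStream);
source.connect(audioContext.destination);

// Chrome workaround (see note above): also attach the stream to a muted
// audio element, otherwise no sound is produced for RTC streams.
const audioElement = new Audio();
audioElement.srcObject = remoteStream;
audioElement.muted = true;
await audioElement.play();
```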
Wrapping up
These are the steps needed to create the basic functionality for a browser-based softphone.
The full flow and how everything is tied together should become clearer when trying out and reading the source code of the call simulation demo, which is based on the concepts of this tutorial.
There are several other features required for a production-ready softphone, such as handling incoming calls, rejecting calls, handling multiple calls, reacting to device signals and setting up a real remote connection. Please consult webrtc.org and MDN for more in-depth knowledge about the WebRTC specification. This list of WebRTC samples is also a useful resource.