It's Here! Monitor Microsoft Teams Audio Video Conferencing

Sponsored Post

Exoprise has released its long-awaited Teams Audio Video Conferencing sensor. The sensor tests audio/video capacity, throughput, and network performance end-to-end through the actual underlying Microsoft Teams and Azure infrastructure. The Teams AV sensor provides deep insight into a network’s capability to handle the Teams/Skype Unified Communications (UC) platform. Desktop, collaboration, and network administrators can now easily diagnose problems and proactively plan a Microsoft Teams rollout and migration.

Exoprise is the only vendor that can automate, test, and proactively monitor the true experience of the Teams platform. When you want to know how Microsoft Teams is performing before that important call or conference, you need Exoprise CloudReady.

Microsoft Teams relies on WebRTC, and Exoprise captures low-level network characteristics without the need for SSL inspection or complex NetFlow instrumentation. The Teams AV Conferencing sensor relies on a hosted media bot to proactively test and monitor the UC workflow and the audio and video streams at a low level in real time. Use the Teams AV sensor for the accurate network testing that Microsoft recommends when deploying Microsoft Teams and assessing network capacity for it.

Microsoft Teams Built Atop WebRTC

Microsoft Teams, both in regular browsers and in the Teams app, relies on WebRTC for audio and video communications. WebRTC is a free and open standard that gives browsers and mobile apps the ability to perform Real-Time Communications (RTC) via simple JavaScript-compatible APIs. The WebRTC project is supported by Google, Mozilla, and, now that Microsoft has announced that Edge will be based on Chromium, also by Microsoft.

WebRTC is responsible for the various workflows involved in establishing real-time communications between peers or to central servers. It utilizes standards such as Session Traversal Utilities for NAT (STUN) and Traversal Using Relays around NAT (TURN) for discovering and optimizing connections. Once network discovery is complete, WebRTC handles session negotiation and establishment with JSEP, SDP, and ICE, works through NAT, and carries media over UDP or TCP; the signaling channel itself (SIP, for example) is left to the application. We help you ascertain the suitability of WebRTC in your LAN/WAN environment.
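
To make the role of STUN and TURN a bit more concrete, here is a minimal Python sketch of how ICE ranks the connection candidates it gathers, using the priority formula and recommended type preferences from RFC 8445. The function name and the defaults (a single network interface, RTP component 1) are illustrative assumptions; this is not code from Teams or from the Exoprise sensor.

```python
# Recommended type preferences from RFC 8445 (higher is preferred).
TYPE_PREFERENCE = {
    "host": 126,   # address taken directly from a local interface
    "prflx": 110,  # peer reflexive, learned during connectivity checks
    "srflx": 100,  # server reflexive, discovered through STUN
    "relay": 0,    # relayed through a TURN server
}


def ice_priority(candidate_type, local_preference=65535, component_id=1):
    """Compute an ICE candidate priority per the RFC 8445 formula.

    priority = 2^24 * type preference
             + 2^8  * local preference
             + (256 - component ID)
    """
    type_pref = TYPE_PREFERENCE[candidate_type]
    return (type_pref << 24) + (local_preference << 8) + (256 - component_id)


# Rank a hypothetical set of gathered candidates, best first.
ranked = sorted(["relay", "srflx", "host"], key=ice_priority, reverse=True)
```

Direct host candidates rank highest, addresses discovered through STUN come next, and TURN relays are the fallback. A session that ends up on a relay takes an extra hop, which can add latency.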

As opposed to the legacy Skype for Business protocols, WebRTC is an open standard that can more easily be supported on mobile clients or Linux desktops. Hurrah!

Skype Transition to Microsoft Teams

At the end of July, Microsoft announced that the Skype for Business Online Service will be ending on July 31, 2021 and that Microsoft is encouraging its customers to use the Microsoft Teams service instead.

The writing has been on the wall for some time. On-premises Skype for Business Server will exist for some time, but Microsoft is putting its full weight behind Microsoft Teams. Thankfully, Exoprise has arrived with its Microsoft Teams monitoring solutions in the nick of time. Now, armed with a cadre of CloudReady Teams, VoIP, and Skype sensors, Unified Communications administrators can properly baseline, test, and evaluate their network for Microsoft Teams performance.

How Exoprise Monitors Microsoft Teams

Deploying and setting up an Exoprise Teams AV Conferencing Sensor is easy. Because the sensor utilizes an Azure-hosted Teams Bot to stream audio and video into a client conference call, it doesn’t require a peer-to-peer setup like the Exoprise Skype sensors. However, it does require more maintenance and infrastructure from the team at Exoprise.

Teams AV Sensor Diagram

Teams AV Monitoring with the Exoprise Teams AV Bot

Easy Deployment

All that’s required for setting up a Teams AV Conference sensor is a Teams account and one-time administrative access to the tenant so that it can accept an Exoprise OAuth registration for the Bot to join the meeting. Accepting the OAuth registration currently requires tenant administrative rights, but it only needs to happen once for the entire tenant. See the following screenshots for reference:


Microsoft Teams AV Conference Sensor Introduction


Teams AV Sensor Configuration

Configure a Teams AV Sensor for Monitoring Microsoft Teams Audio Video Conferencing

Configure A CloudReady Teams AV Conferencing Sensor:

  1. Supply the credentials for a valid account within your tenant. This can be a managed or federated account – the sensor supports both and both types of accounts should be tested.
  2. Optionally, supply a specific team. If you do, you must create the team first and make sure that the account is a member of it. If you leave the field blank, the account must have permission to create a team, and the sensor will create one named in the form exoprise_test_team(account_name). This is detailed in the introduction (please read!).
  3. If you’ve already accepted our Teams AV Bot OAuth registration, select it from the drop-down list. You may have other domains within your tenant, so choose the registration that corresponds to the domain of the test account.
  4. If this is your first time setting up a Teams AV Sensor, you will need to accept our OAuth registration using an account that has administrative access. You don’t need an admin account for the sensor itself, just for the one-time registration acceptance. Supply a registration label and click ‘Add New’. You’ll see a standard Azure AD popup to accept our OAuth registration, like the following screenshot.


Sign In to Accept Teams AV Bot OAuth Registration


That’s it. That is all it takes to configure a Teams AV Conferencing sensor and begin monitoring end-to-end Microsoft Teams network performance. Once you proceed to the next step, your configuration is validated and the sensor is then deployed to the CloudReady private sites you’ve set up.

Diagnose Microsoft Teams Audio Video Performance Issues

Once you deploy a Microsoft Teams AV Sensor, it will create conference sessions (Meet Now) and invite the Exoprise AV bot to join. When the Bot joins, it starts streaming audio and video content into the conference, which the sensor monitors. High-level actions and low-level WebRTC metrics about the session, network, and performance of the audio and video content are aggregated for analysis and alarming. Below are the metric details and how they are used to diagnose and baseline Microsoft Teams performance on your network:

Audio / Video Streams

The streaming of audio and video from the Exoprise AV bot is monitored from the WebRTC perspective and a complement of statistics is captured:

  • Max Audio Jitter
    The maximum audio jitter as perceived by the client is sampled and recorded once a second and aggregated for the session. Jitter, or network jitter, is the variation in time delay, in milliseconds (ms), between data packets over a network. The technical term for jitter is “packet delay variation”. Generally, you want the maximum audio jitter to be below 20ms, but periodic spikes in the 20ms-30ms range with average jitter in the single digits are acceptable.
  • Mean (Avg) Audio Jitter
    Avg audio jitter is the aggregate average over the life of the session. You definitely want to see this value below 20ms or your users’ conference experience will be poor. Average jitter of 20ms or higher, together with higher spikes in Max Jitter, indicates poor network quality for Microsoft Teams.
  • Audio Packet Loss
    Packet loss, expressed as a percentage, is very detrimental to the experience of a conference or meeting in either the audio or video channel. Audio packets are smaller and sent at a steadier rate, so if you see audio packet loss it is a strong indicator of poor networking and congestion.
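
For readers who want to see where jitter numbers like these come from, the sketch below implements the running interarrival jitter estimate from RFC 3550, which is the definition the WebRTC statistics API references for its jitter value. It works in milliseconds rather than RTP timestamp units for readability, and the function name and sample data are made up for illustration. It does not show how the sensor itself samples and aggregates.

```python
def interarrival_jitter(packets):
    """Return the running jitter estimate (ms) after each packet pair.

    packets: list of (send_ms, recv_ms) tuples in arrival order.
    """
    estimates = []
    jitter = 0.0
    prev_transit = None
    for send_ms, recv_ms in packets:
        transit = recv_ms - send_ms
        if prev_transit is not None:
            # Change in transit time between consecutive packets.
            delta = abs(transit - prev_transit)
            # RFC 3550 smoothing: move 1/16 of the way toward the new sample.
            jitter += (delta - jitter) / 16.0
            estimates.append(jitter)
        prev_transit = transit
    return estimates


# Hypothetical 20 ms audio stream where the third packet arrives 16 ms late.
samples = [(0, 50), (20, 70), (40, 106), (60, 110)]
estimates = interarrival_jitter(samples)
max_jitter = max(estimates)                     # analogous to Max Audio Jitter
mean_jitter = sum(estimates) / len(estimates)   # analogous to Mean Audio Jitter
```

Because only differences in transit time are used, any fixed clock offset between sender and receiver cancels out. The 1/16 smoothing also explains why one late packet nudges the estimate instead of spiking it.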

Comparing any of these metrics to the CloudReady crowd helps you pinpoint the problem. If the crowd metrics exhibit the same patterns as your own sensors, that points to a shared problem with Teams or the Azure infrastructure. If only your deployed sensors are experiencing detrimental effects, then the problem is localized to your own networks or to where the sensor is deployed.

  • Video Packet Loss
    Sessions with elevated video packet loss will have jumpy or missed video frames. Video packets are larger and consume more of the transport, so spikes in video packet loss are more common. Most AV protocols and codecs, including those used by WebRTC, prioritize audio channels over video because dropped audio is more detrimental than lost video frames.
  • Audio & Video Bitrate
    Lower streaming bitrates are also an indicator of poor conference networking. Most AV transports and codecs try to maintain a consistent bitrate, measured in bits per second (bps). If the Teams client can’t sustain a healthy bitrate, it will have to make up for lost packets with frame skipping, audio cutouts, or attempts to synthesize the lost bits (error correction). You should see a consistent audio bitrate and a more variable video bitrate (video packets are bigger and less prioritized).
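
As a rough illustration of how bitrate and packet loss percentages can be derived, WebRTC statistics expose cumulative counters such as bytesReceived, packetsReceived, and packetsLost for each inbound stream. The Python sketch below takes two snapshots of those counters and computes the rate between them. The Snapshot class and the sample values are hypothetical, and the sensor's own calculation may differ.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One reading of cumulative inbound stream counters (hypothetical type)."""
    timestamp_s: float
    bytes_received: int
    packets_received: int
    packets_lost: int


def bitrate_bps(prev, curr):
    """Average received bitrate, in bits per second, between two snapshots."""
    elapsed = curr.timestamp_s - prev.timestamp_s
    if elapsed <= 0:
        raise ValueError("snapshots must be in increasing time order")
    return (curr.bytes_received - prev.bytes_received) * 8 / elapsed


def loss_percent(prev, curr):
    """Packets lost as a percentage of packets expected in the interval."""
    lost = curr.packets_lost - prev.packets_lost
    received = curr.packets_received - prev.packets_received
    expected = lost + received
    return 100.0 * lost / expected if expected > 0 else 0.0


# One second of a hypothetical audio stream: 4000 bytes, 49 packets, 1 lost.
before = Snapshot(timestamp_s=0.0, bytes_received=0, packets_received=0, packets_lost=0)
after = Snapshot(timestamp_s=1.0, bytes_received=4000, packets_received=49, packets_lost=1)
audio_bps = bitrate_bps(before, after)
audio_loss = loss_percent(before, after)
```

Working from deltas between snapshots rather than session totals is what lets short dips in bitrate or bursts of loss show up in a time series.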

Of interest for advanced telemetry in the Teams AV sensor are the NACK, FIR, and PLI packets. These RTCP feedback packets are generated in response to poor networking conditions and the detection of dropped frames or audio samples. Basically, the client sends a “reset” to the server or peer and requests that the missing bits or frames be resent or re-synchronized. If these events are sustained, then you have some poor networking on your hands.
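
One way to tell sustained feedback apart from an occasional blip is to watch whether the cumulative counters keep climbing across consecutive sampling intervals. The Python sketch below does that. The function name and the three-interval threshold are assumptions chosen for illustration, not values used by the sensor.

```python
def sustained_recovery(cumulative_counts, min_consecutive=3):
    """Return True if feedback counts rose in min_consecutive intervals in a row.

    cumulative_counts: combined NACK + PLI + FIR totals, sampled at a fixed
    interval (for example once per second).
    """
    streak = 0
    for prev, curr in zip(cumulative_counts, cumulative_counts[1:]):
        # A rising total means at least one new feedback packet this interval.
        streak = streak + 1 if curr > prev else 0
        if streak >= min_consecutive:
            return True
    return False


# Hypothetical sessions: isolated requests versus a steady stream of them.
occasional = sustained_recovery([0, 1, 1, 2, 2])
persistent = sustained_recovery([0, 2, 5, 9])
```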

Meeting Connection

The metrics on the Meeting Connection tab cover the overall quality of the connection for the conference to the actual Microsoft Teams hosting infrastructure.

  • Max RTT
    The maximum round-trip time (RTT) is a measure of latency as perceived by the WebRTC client throughout the session. Every real-time communications (RTC) stack utilizes protocol headers, fields, and control packets to evaluate round-trip times during sessions, so that it can quickly determine when streams may be out of sync or when there are underlying QoS issues. Max RTT is the maximum round-trip time that was seen during the session. Sustained Max RTT values above 300ms between the client and the edge/media servers will reflect a poor conferencing experience.
  • Avg RTT
    This is the average round-trip time seen during the session. For a good audio/video conferencing experience, you want Avg RTT to be below 100ms and, ideally, below 50ms.
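
The RTT guidance above can be written down as a small Python sketch. The 300ms, 100ms, and 50ms thresholds come from the text. The function name, the rating labels, and the idea of treating "sustained" as at least a quarter of the samples are assumptions made for the example.

```python
def summarize_rtt(samples_ms, sustained_fraction=0.25):
    """Summarize per-session RTT samples (ms) and attach a rough rating."""
    if not samples_ms:
        raise ValueError("at least one RTT sample is required")
    avg_rtt = sum(samples_ms) / len(samples_ms)
    max_rtt = max(samples_ms)
    # Share of samples above 300 ms; a lone spike does not count as sustained.
    high_share = sum(1 for s in samples_ms if s > 300) / len(samples_ms)
    if high_share >= sustained_fraction or avg_rtt >= 100:
        rating = "poor"
    elif avg_rtt < 50:
        rating = "ideal"
    else:
        rating = "acceptable"
    return {"max_rtt": max_rtt, "avg_rtt": avg_rtt, "rating": rating}


# Hypothetical sessions.
quiet = summarize_rtt([30, 40, 35, 45])
one_spike = summarize_rtt([60, 70, 80, 90, 60, 70, 80, 90, 310, 60])
congested = summarize_rtt([320, 340, 310, 90])
```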

Teams QoS and Media Quality

  • Video Frames Per Second Dropped
    Frames dropped per second is the rate at which video frames are dropped during the conference. Depending on the network quality and the negotiated video frame size, this number can be higher than expected. The video playback will skip frames and the WebRTC stack will try to compensate for the lost video frames during the session. Audio delivery is also prioritized over video delivery; audio samples are smaller, more constant, and more consistent in packet size.
  • Video Frames Per Second Received
    This is a measure of the video frames received and rendered during the Video conference. Currently, the Exoprise AV Bot will choose and downgrade video frame sizes depending on the end-to-end quality of the connection.
  • Audio Sample Rate
    Audio samples are sent in 20 millisecond packets and are prioritized over video samples.
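
The 20 millisecond packet size makes for simple arithmetic, sketched in Python below. The 48 kHz and 16 kHz sample rates are used only as examples; the rate a given Teams session actually negotiates is not stated here.

```python
def packets_per_second(packet_ms=20):
    """Number of audio packets sent each second for a given packet duration."""
    return 1000 // packet_ms


def samples_per_packet(sample_rate_hz, packet_ms=20):
    """Number of audio samples carried in one packet of the given duration."""
    return sample_rate_hz * packet_ms // 1000


rate = packets_per_second()            # 20 ms packets -> 50 packets per second
wideband = samples_per_packet(16000)   # example 16 kHz stream
fullband = samples_per_packet(48000)   # example 48 kHz stream
```

A steady 50 packets per second is part of why audio loss is such a clear signal: each missing packet is a 20 millisecond gap in what the listener hears.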

Login & Render, Page Layout & Connect Times

The Microsoft Teams AV Sensor includes metrics for the underlying page navigation, login, authentication, and connection times. Aggregate metrics for the underlying TCP/IP Connection and SSL Timings are also captured for each sensor run. This is an aggregate sample of everything it takes to get into a meeting within Microsoft Teams including signing in and starting a meet-now conference session.

Sample Screenshots of the Teams AV Sensor

It wouldn’t be an Exoprise blog post without the requisite set of screenshots. We believe in presenting screenshots and examples to show prospects and customers that our platform and technology are real. There’s no magical hand-waving when it comes to seeing our sensors in action. Try the new Microsoft Teams Audio Video Conferencing sensor and see for yourself.

  • Teams AV Media Stream Metrics
  • Teams AV TCP/IP Connection Timings
  • Teams AV QoS
  • Teams Login & Authentication Metrics
  • Teams Client DOM Loading
  • Teams Conferencing Latency, SSL and Connection Times