Understanding how Skype for Business establishes audio/video paths using ICE

Skype for Business audio not working?

This is a common issue in Skype for Business deployments. Understanding how Skype for Business establishes audio and video (media) paths can improve your ability to troubleshoot these tricky issues. This article will give you some new tools to help you troubleshoot a Skype for Business audio not working issue.

Audio and video establishment in Skype for Business takes a different approach from most network traffic. SIP signalling and control traffic takes the run-of-the-mill approach: traffic is routed directly between server and client. Any delays (latency) in this traffic typically go unnoticed by the user.

On the other hand, any significant delay to audio and video traffic will be immediately noticed by the user and could disrupt the call. To reduce the likelihood of this happening, Skype for Business will attempt to find and use the shortest path between users. This can result in two users on the same network sending their audio and video traffic directly to each other while the SIP signalling traffic continues to flow between server and client. To do this, Skype for Business uses Interactive Connectivity Establishment (ICE). ICE is the overall process that discovers and exchanges ‘candidates’ to find the optimal audio and video path.

Definitions

  • SIP signalling – simply put, it’s the control traffic. For example, it’s what tells the server you want to make a call by sending a SIP INVITE.
  • Interactive Connectivity Establishment (ICE) – Process used to discover and exchange candidates to find the optimal audio or video path.
  • Candidates – A list of possible IP addresses that could be used to establish an audio or video path.
  • Reflective/Session Traversal Utilities for NAT (STUN) – STUN ‘reflects’ or returns the public NAT address to the Skype for Business client, e.g. a home-based user sends a packet to the Edge server, which discovers the public IP address (a candidate) and returns it to the client.
  • Relay/Traversal Using Relays around NAT (TURN) – TURN allows the audio or video traffic to be relayed/proxied by the Edge server to the client by providing the client relay addresses to send audio or video.
  • ICE endpoints – An ICE endpoint is anything that takes part in audio or video, e.g. Skype for Business clients, Skype for Business Web App, Skype for Business phones, Front End Server (App Sharing MCU, RGS, Call Park, A/V Conferencing etc.), Mediation Server, SBA and Exchange UM. Session Border Controllers and the Director role are not ICE endpoints. The Edge server performs STUN and TURN but is not an ICE endpoint; it is more of an ICE server.

The 5 Phases of Audio or Video Path Establishment

If your Skype for Business audio is not working (or any other media type), understanding this process will help you narrow down where to look for potential issues.

1. TURN provisioning and credentials (MRAS)

The Skype for Business client does an SRV lookup to find an Edge server to register against and then performs a SIP REGISTER. The server responds with a 200 OK which includes in-band provisioning details, including MRAS (Media Relay Authentication Service), which tells the client there is an Edge server deployed. With this, the client sends a SIP SERVICE request to the Front End server which includes the client’s location (internal or external). Because the Edge server is not on the domain it can’t authenticate the client directly, so the Front End server requests the credentials on behalf of the client. The AV Edge service creates credentials using the AV Edge certificate and returns them to the Front End server, which sends a 200 OK back to the client with the Edge server it should connect to, the ports, and a username and password. Credentials are valid for 8 hours, and for this period the client can communicate directly with the Edge server. In a conferencing scenario the same thing happens; however, because a participant may join a meeting anonymously, the Front End server first checks that the meeting exists and then gets the credentials and passes them to the anonymous participant.

[Figure: Skype for Business MRAS]

Tip – Always make sure you use the same external certificate for all Edge servers. The certificate is used to create credentials for the client to connect. If an Edge server goes down and the client tries to connect to another Edge server that uses a different certificate, that server will not be able to validate the credentials and authentication will fail. Search for “MRAS” in the client or server SIP logs (use the Snooper tool) to find authentication messages. There should be 3 messages per request. MRAS uses port 5062.
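As a rough illustration of the log search in the tip above, the sketch below scans exported SIP log text for lines that mention MRAS. The function name `find_mras_lines` and the sample log lines are invented placeholders, not real trace output; real traces are far more verbose and are easier to read in Snooper.

```python
def find_mras_lines(log_text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for every log line mentioning MRAS."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        # Case-insensitive match, since log casing may vary.
        if "mras" in line.lower():
            hits.append((number, line.strip()))
    return hits


# Invented placeholder text standing in for an exported client log.
sample_log = """\
SERVICE sip:pool01.example.com SIP/2.0
To: <sip:MRAS-placeholder@pool01.example.com>
SIP/2.0 200 OK
Content-Length: 0
"""

for number, line in find_mras_lines(sample_log):
    print(number, line)
```

If a fresh sign-in produces no hits at all, that points to the client never having requested relay credentials.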

2. Address Discovery (Allocation)

Address discovery is the process the client goes through to determine what IP addresses it might be reached on. These IP addresses are the client’s candidates.
Audio/Video:
  • Discover local UDP candidate for every network card (peer to peer so UDP is best)
  • Connect to media relay (Edge server) to discover reflexive address (the address the Edge server sees the client connect from) and allocate a candidate on the media relay for UDP then TCP
File Transfer and Desktop Sharing (RDP over RTP) – Both require TCP:
  • Discovers local TCP candidate
  • Media Relay TCP only
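The discovery order above can be sketched as a small data model. This is only an illustration: the `Candidate` class, the `gather_candidates` function and the example addresses are hypothetical, and the choice to list one reflexive and one relay entry per transport is a simplifying assumption rather than what the client literally does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    kind: str       # "host", "reflexive" or "relay"
    transport: str  # "UDP" or "TCP"
    address: str


def gather_candidates(local_ips, reflexive_ip=None, relay_ip=None, tcp_only=False):
    """Build a candidate list in the order described above.

    tcp_only=True models file transfer and desktop sharing (TCP required).
    reflexive_ip and relay_ip stay None when no Edge server can be reached.
    """
    candidates = []
    # Local (host) candidates: one per network card.
    host_transport = "TCP" if tcp_only else "UDP"
    for ip in local_ips:
        candidates.append(Candidate("host", host_transport, ip))
    # Edge-derived candidates: UDP first, then TCP (TCP only for sharing).
    for transport in (["TCP"] if tcp_only else ["UDP", "TCP"]):
        if reflexive_ip is not None and not tcp_only:
            candidates.append(Candidate("reflexive", transport, reflexive_ip))
        if relay_ip is not None:
            candidates.append(Candidate("relay", transport, relay_ip))
    return candidates


# Example addresses are taken from documentation ranges.
for c in gather_candidates(["10.0.0.5"], "198.51.100.7", "203.0.113.10"):
    print(c.kind, c.transport, c.address)
```

Calling `gather_candidates(["10.0.0.5"])` with no Edge addresses returns host candidates only, which is the situation you would expect when the client cannot reach an Edge server.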


3. Address Exchange (SIP Invite/200OK) 

Address exchange is the process of sharing candidates with the other endpoints that will be part of the call (peers). This is achieved by sending a SIP INVITE to the peer, which in turn discovers its own candidates and sends them back as part of a SIP 183 Session Progress.


4. Connectivity Checks

This is the process of taking the provided candidates and determining a possible audio or video path. The Skype for Business client validates the list of candidates by opening connections to all entries in the list simultaneously. The first to respond is used to establish the “Early Media” connection; however, the audio or video path may change during the call through a process called candidate promotion. When the called party picks up, it again sends its candidates to the caller, this time as part of a 200 OK. The checks attempt to:
  1. Connect directly (peer to peer)
  2. Connect to reflective address
  3. Connect via media relay by connecting to the Edge server and asking it to contact a candidate and establish a connection on its behalf
Note: If there is no Edge server it only does local candidates.
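The "try everything at once, use the first to respond" idea can be sketched with simulated checks. Nothing below talks to a network: `probe` is a stand-in that sleeps for a made-up delay, and the path names, delays and reachability flags are assumptions chosen purely for illustration.

```python
import asyncio


async def probe(name: str, delay: float, reachable: bool) -> str:
    """Stand-in for a connectivity check (a real check sends packets)."""
    await asyncio.sleep(delay)
    if not reachable:
        raise ConnectionError(name)
    return name


async def first_responder(paths):
    """Start every check at once and return the first path that answers."""
    pending = {asyncio.create_task(probe(*p)) for p in paths}
    try:
        while pending:
            done, pending = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED
            )
            for task in done:
                if task.exception() is None:
                    return task.result()
        return None  # every check failed
    finally:
        # Stop any checks that are still running.
        for task in pending:
            task.cancel()


# (name, simulated delay in seconds, reachable?)
paths = [("direct", 0.01, False), ("reflexive", 0.03, True), ("relay", 0.08, True)]
print(asyncio.run(first_responder(paths)))
```

In this run the direct check fails quickly, so the reflexive path is the first to answer and would carry early media until candidate promotion picks a final path.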


5. Candidate Promotion

This is the process of determining the best possible candidate for the session. If a better path is found then the audio or video path can change during the call. In order of preference:
  • Host/Local Candidate (UDP) – The most preferred candidate is always a local candidate and is the reason that peer to peer audio or video sessions between clients on the same network will never use the Edge server.
  • Reflexive/STUN Candidate (UDP) – The next preferred option is to use the server reflexive candidate which is provided by the Edge Server using STUN. This scenario involves attempting to connect to the reflexive IP addresses for each externally connected user. The reflexive IP address is the public IP address of the external user e.g. a home router.
  • Relay/TURN Candidate (UDP) – In the event that STUN fails then the final option is to utilise the Edge Server as a media relay. The calling client will establish an audio or video session directly with the A/V Edge Server as will the receiving client. This connectivity is relayed through the public IP address of the Audio/Video Edge service.
  • Relay/TURN Candidate (TCP) – when connectivity is not available on UDP. TCP Relay is a last resort.
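The preference order above can be expressed as a simple ranking. This is a sketch of the ordering only, not of the real ICE priority arithmetic; `PREFERENCE`, `best_path` and `maybe_promote` are hypothetical names.

```python
# Most preferred first, following the list above.
PREFERENCE = [
    ("host", "UDP"),
    ("reflexive", "UDP"),
    ("relay", "UDP"),
    ("relay", "TCP"),
]


def best_path(working_paths):
    """Return the most preferred path among those that passed checks."""
    ranked = [path for path in PREFERENCE if path in working_paths]
    return ranked[0] if ranked else None


def maybe_promote(current, working_paths):
    """Switch to a better path if one is available, else keep the current one.

    Assumes current is either None or one of the PREFERENCE entries.
    """
    best = best_path(working_paths)
    if best is None:
        return current
    if current is None or PREFERENCE.index(best) < PREFERENCE.index(current):
        return best
    return current


# A call that started on the TCP relay moves to the reflexive UDP path.
print(maybe_promote(("relay", "TCP"), {("relay", "TCP"), ("reflexive", "UDP")}))
```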

[Figure: Skype for Business audio/video ICE Protocol]

SIP Messages in Audio or Video Path Establishment

When you are troubleshooting Skype for Business audio not working, it’s also good to understand how clients communicate with each other using SIP messages. This helps you confirm whether the negotiation process is working as expected.

  • Out INVITE (SDP, Session Description Protocol – tells the other party what I can do, e.g. which codecs). The first set of candidates is ICE v6 (ms-proxy-2007 fallback) and the second set is ICE v19. OCS 2007 R2 and later use v19, but both are included for backward compatibility. Candidates come in pairs – one for Real-time Transport Protocol (RTP) and one for RTP Control Protocol (RTCP).
    a=candidate 1 1 Protocol (UDP / TCP Passive – candidate I expect to send traffic to / TCP Active – candidate that sends me traffic) Priority (higher is better) IPAddress Port typ (host/relay/server reflexive)
    a=candidate 1 2 UDP Priority (higher is better) IPAddress Port typ (host/relay/server reflexive)
  • In SIP 183 Peer sends its candidates. You may see multiple – one for each endpoint.
  • In SIP 200 OK Peer picks up the call. This still includes a full candidate set as the best have not been negotiated yet.
  • Out INVITE Re-invite which will include the 1 chosen candidate pair as decided in the earlier process.
  • In SIP 200 OK Includes other party’s final candidates.
NOTE: The Edge server is used in the discovery process, but not necessarily once the audio or video path has been established. This is why it can be important for internal clients to be able to access the internal NIC of the Edge server. If the candidate list doesn’t include UDP and TCP reflective candidates then the client probably can’t talk to the Edge server. If you see only UDP or only TCP then a firewall might be blocking ports.
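To make the candidate lines and the rule of thumb in the note easier to apply, here is a minimal parsing sketch. It assumes the RFC 5245 style layout `a=candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type>`, with `srflx` as the type name for reflexive candidates. Lines in a real Skype for Business trace may differ (the older ICE v6 set in particular), in which case `parse_candidate` simply returns None. The sample lines, addresses and the `diagnose` result strings are invented for illustration.

```python
import re
from typing import NamedTuple, Optional


class SdpCandidate(NamedTuple):
    foundation: str
    component: int  # 1 = RTP, 2 = RTCP
    transport: str  # e.g. UDP, TCP-PASS, TCP-ACT
    priority: int
    address: str
    port: int
    kind: str       # host, srflx or relay


CANDIDATE_RE = re.compile(
    r"^a=candidate:(\S+) (\d+) (\S+) (\d+) (\S+) (\d+) typ (\S+)"
)


def parse_candidate(line: str) -> Optional[SdpCandidate]:
    """Parse one candidate line, or return None if the layout is different."""
    match = CANDIDATE_RE.match(line.strip())
    if match is None:
        return None
    foundation, comp, transport, prio, addr, port, kind = match.groups()
    return SdpCandidate(foundation, int(comp), transport.upper(),
                        int(prio), addr, int(port), kind.lower())


def diagnose(candidates) -> str:
    """Apply the rules of thumb from the note above."""
    edge = [c for c in candidates if c.kind in ("srflx", "relay")]
    if not edge:
        return "no-edge"           # probably cannot talk to the Edge server
    has_udp = any(c.transport == "UDP" for c in edge)
    has_tcp = any(c.transport.startswith("TCP") for c in edge)
    if has_udp != has_tcp:
        return "single-transport"  # a firewall may be blocking ports
    return "ok"


# Invented sample lines using documentation address ranges.
sample = [
    "a=candidate:1 1 UDP 2130706431 10.0.0.5 50010 typ host",
    "a=candidate:2 1 UDP 1694234111 198.51.100.7 50012 typ srflx raddr 10.0.0.5 rport 50012",
    "a=candidate:3 1 TCP-PASS 6556159 203.0.113.10 52001 typ relay raddr 198.51.100.7 rport 50014",
]
parsed = [c for c in map(parse_candidate, sample) if c is not None]
print(diagnose(parsed))
```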

Call Scenarios and Connections Options

When you are troubleshooting Skype for Business audio not working you’ll also need to know the differences in how audio or video establishment works when users are inside and outside the corporate network.

Inside <-> Inside

  • Peer to Peer

Inside <-> Outside

  • Peer to peer will not work
  • Outside connects to reflective candidate UDP or TCP
  • Outside connects to own edge server (relay) which hairpins traffic to internal user

[Figure: Skype for Business audio/video ICE Mixed Peer]

Outside <-> Outside

  • Peer to peer might work if clients are on the same network
  • Reflective candidate UDP or TCP
  • Relay via Edge server
[Figure: Skype for Business audio/video ICE External peer to peer]
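The three scenarios above boil down to a small lookup, which can be handy when comparing what a log shows with what you expected. The names `EXPECTED_PATHS` and `expected_paths` are hypothetical, and the table only restates the lists above.

```python
# Keyed by the set of locations involved, so caller/callee order is irrelevant.
EXPECTED_PATHS = {
    frozenset(["inside"]): ["peer to peer"],
    frozenset(["inside", "outside"]): [
        "reflexive (UDP or TCP)",
        "relay via Edge server",
    ],
    frozenset(["outside"]): [
        "peer to peer (same network only)",
        "reflexive (UDP or TCP)",
        "relay via Edge server",
    ],
}


def expected_paths(caller: str, callee: str):
    """Return the paths worth looking for, given each party's location."""
    return EXPECTED_PATHS[frozenset([caller, callee])]


print(expected_paths("inside", "outside"))
```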

Federation OCS 2007

  • Edge servers connect to each other on the 50k port range directly and relay the call. Ports need to be open in both directions.

Federation 2007 R2 (tunnel mode introduced)

  • The Edge server sends a special packet to UDP port 3478 on the other Edge to find out if it is OCS 2007 R2 or above. If it is then tunnel mode can be used, and all UDP traffic can be sent on these ports. Candidate data still includes the 50k ports, but the Edge server just contacts the other Edge server to share this information and connect.
  • TCP is very similar, but because a connection to a source IP/port and destination IP/port can only be in use at one time, the Edge server allocates a port in 50k range as a source port, and then opens a connection to the other Edge server on port 443. This gets around having to have 50k ports open which is required for OCS 2007.
While the 50k port range is not required for OCS 2007 R2 and above, there are still benefits to opening it. Where two Edge servers would normally be involved in relaying audio or video, opening the range allows both clients to connect to the same Edge server. The initiating client connects to its home Edge server, gets candidates and passes those to the other party. The other party then attempts to connect to the 50k range directly on the initiator’s home Edge server. Without these ports open this would not work, and the client would need to involve its own Edge server and ask it to connect to the initiating Edge server to relay on its behalf. This introduces a longer audio or video path.

Troubleshooting Audio or Video Connectivity

  • Get Snooper installed to make reviewing the client and server logs easier.
  • Get client logs from a fresh sign-in – is there MRAS? If no, it can’t talk to the Edge server.
  • Check if the Front End server can telnet to the FQDN of the Edge server’s internal NIC. Check the logs for STUN and TURN candidates. If there are none, then there is an issue between the client and the Edge server.
  • Use a port query tool to test UDP ports
  • When the Edge server sends candidates in a NAT situation, it uses the external IP configured in the topology and sends this to the client – make sure it’s correct!
  • Search client and server logs for a=candidate to find candidate information
  • Search a=remote-candidate to find the final candidates that are chosen
  • After a call is picked up, it can take several seconds before the final candidates are chosen and audio or video paths are subject to change. The final re-invite will include this, but the result may not be in logs for a few seconds after connection.
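Where telnet is not available, the TCP reachability check from the list above can be approximated with a few lines of Python. This is a minimal sketch: `tcp_port_open` is a hypothetical helper, it only proves that a TCP handshake completes, and it cannot test UDP ports, which still need a port query tool.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, `tcp_port_open("edge-internal.example.com", 5062)` would test the MRAS port mentioned earlier; the FQDN here is a placeholder for your own Edge server's internal name.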

You are now ready to troubleshoot Skype for Business audio not working!

Hopefully, you found this article useful and next time you have a Skype for Business audio not working issue you’ll have some new tools to help you.

Thanks to Thomas Binder for this excellent deep dive as well as Jeff Schertz for his summary.

Andrew Morpeth
https://ucgeek.co/author/amorpeth/
Andrew is a Modern Workplace Consultant specialising in Microsoft technologies based in Auckland, New Zealand; Andrew is a Director and Professional Services Manager at Lucidity Cloud Services and a Microsoft MVP.


26 COMMENTS

  1. Good summary. I've been trying to get a handle on the entire process as we have deployed Lync 2013 in our environment (without Enterprise Voice). One question I can't seem to find an answer to is: what path does an external client that is using Lync Web App take when connecting to a conference for desktop sharing? The behavior I see in our environment is that when a guest is connected from external using the Lync Web App, connected either to someone external using the full client or someone internal using the full client, we have a difficult time establishing a desktop sharing session or maintaining one. Looking at our firewall logs, I see dropped communication between our Front End server IPs and the external Edge NIC IPs. I see the external Edge IP trying to talk to the Front End server IP on 3478 (STUN) and occasionally other ports at 50,000 or higher, as well as the Front End server trying to talk to the external Edge IP in the 50,000 range. This traffic will all drop because I was under the impression that communication with the Front End servers or internal clients should go through the Edge internal interface due to our persistent routes on the Edge.

    I started exploring the log files and candidate lists. With an internal full lync client user and an external lwa user scenario, I see the internal user candidate list look correct.. it sends an invitation and has its local ip of typ host on tcp-act and tcp-pass. I also see the relay ip of the external edge ip in the list. What seems odd is that, the internal client then receives a SIP 200 OK with a candidate list, but the list is the local ip of the front end server's nics (both the ip of the default nic for communication and the ip of the nic used to connect to some back end storage for the lync file share). It also shows the relay address of the edge's external ip.

    Looking at the logs on the lwa client, I see a candidate list that looks correct for its local ip information, but I also see candidate lists show up which list the information for our front end server as well.

  2. That's an interesting problem and one that I have not seen before. The Lync Web App is still an ICE client and will attempt to establish the media path in the same way. Signalling will occur via the FE web services. The FE server is also an ICE client and this may be why you are seeing its IP addresses in some logs; however, I would not expect this to be in the client invites. It may be that the FE forwards the request to the internal client, acting as an ICE proxy, but that's just me speculating.

    What are you using to publish web services externally? Make sure that the time-out is set at 200 seconds or more, I normally configure 3600 seconds.

  3. We are publishing web services externally using an F5 Big IP as a reverse proxy which is a supported device. I thought perhaps this was a routing issue on the Edge server but I have double checked the persistent routes.

  4. Shouldn't be a routing issue if your other client types are working externally. Did you check what your time out is set at?

    Is the issue only affecting content sharing? Do the audio and video also drop?

    Are you using 1 or 3 IPs on your Edge server? Are you using NAT?

  5. Thanks Andrew

    I think single pool media flow has been documented well online in regards to Lync and media establishment.
    What I struggle to find is media flow in scenarios with two central sites each with their own edge pool.
    For example, how would a remote user in FEPool1 connect to an internal user in FEPool2?

    Would media flow to Edgepool1 > internally to FEPool1 then intercluster routing to FEPool2 and to the client

    OR

    Would media flow to Edgepool01 then proxied across to Edgepool02 and internally to FEPool02 and to the client

    Thanks Andrew

    • That’s a very good question! I would expect the media to be proxied via EdgePool1, then:
      1. If EdgePool1’s internal interface can route to the client this would take place internally (this is a requirement based on my experience)
      2. EdgePool1 could possibly proxy to EdgePool2 – This one I am not sure about, but this is how it would work for a federated deployment. Considering the ICE process this should be possible.

      • Thanks Andrew, just a question based on your reply.

        “If EdgePool1’s internal interface can route to the client this would take place internally (this is a requirement based on my experience)!”

        Are you suggesting that it should be a requirement for every client on the LAN despite their home pool should be able to access the internal interface on all edge servers in the topology?

        Thanks

        • I’ve seen issues where you have a user connected externally on Site1 who communicates with an internal user on Site2. If the internal user’s PC knows how to route to the Edge internal network in Site1, but traffic is blocked by a firewall or the Edge doesn’t know how to route back, media fails.

  6. we run private ip addressing internally, and have good sized bandwidth available to us (higher ed institution, and can test easily during “downtime”).
    We’re using Skype For Business in the cloud, (no on-prem), and point to point calls are totally fine. When we get into multiparty calls, we occasionally have issues.
    When we use the Skype/Lync FastTrack Network Analysis tool (http://em1-fasttrack.cloudapp.net/o365nwtest), it gives us poor results for Consistency and Quality of service.

    I’m not seeing it reflected in troubleshooting. Where we NAT our clients at our edge firewall, is it possible the NAT’ing is presenting issues? Possibly ALGs on that same firewall? We’re at a loss at this point as to where to look next. Any advice is appreciated. Thanks.

    • Hey Matt,

      Are the point to point calls you talk about with internal parties, or does this include external and federated parties? One thing to note is that when you go multi-party, the O365 conferencing server comes into play. By comparison, with point to point calls the audio path can in some cases be direct. NATing your clients out to the internet shouldn’t be an issue as this is very common. If you could provide a bit more detail on the call flows I may be able to help further.

  7. If you have multiple endpoints active, let’s say a Lync 2013 client on a PC and Lync on a mobile device, with no simultaneous ring set up, and a person with a Lync client calls you, which of the two endpoints will receive the call? Will they ring at the same time?

    • Yes, calls should ring on both devices. In my experience iPhone/Android are slower to notify of incoming calls. By comparison, Windows Phone uses push notifications and often rings before the desktop client. In general SfB applies a preference system for users active on multiple devices. This is most apparent for incoming IMs. If you are active on desktop and mobile, SfB will prefer desktop for delivery. Hope that helps!

  8. Thanks for the Great Article,

    Quick Question:
    Calls to a remote client (Skype or another Company’s Lync) do not connect. The call goes through but won’t establish. The strange behavior is that on our external firewall I can see both the Lync Client and the Lync Edge server trying to talk to the LOCAL IP address of the remote (for example 192.168.1.6). So I tested with a Lync machine and skype machine inside of the same local subnet and it works fine. Ever heard of that?

    For some reason our lync client and edge server know the local RFC1918 IP of the remote user.

    • Thanks for your feedback! It sounds like you have a firewall between client networks and that the ports required for a point to point connection are not open. If this is the case, then the client will try to use the Edge server to relay. If this is not working then you likely have a firewall or routing issue between each client network and the internal interface of the Edge server. Hope that points you in the right direction.

  9. Hi Andrew,

    If two internal client networks have overlapping IP Address spaces will the clients be able to use the internal Edge interface to relay media between the two internal networks/clients?

    • That’s a damn good question, not something I have come across before. I’m picking this would cause an issue from a routing perspective so would be problematic. How can the Edge determine how to route to the correct network when they are overlapping is the question that comes to mind.

  10. We are trying to do a split tunnel configuration for AV traffic, so when users are working from home and they VPN into work, their inside status will be TRUE, as all the traffic will come in directly to the Lync servers, except AV traffic. We would like VPN clients’ AV traffic to come via the Edge server. Can you please shed some light on what we would need to do to achieve this? I am thinking we would need to block the AV ports for VPN clients at the firewall, to prevent AV traffic coming in directly from VPN clients. We have a dedicated DNS server for VPN clients. Do we need to create any DNS records for VPN clients, or is there something else I might be missing? Please advise.

    • Hey, is the VPN capable of managing this by FQDN? Which VPN do you use? This is often how we achieve this – ALL SfB FQDNs are bypassed for DNS and service connection and sent straight out to the internet. An example of what I mean with regard to DirectAccess is here: https://blogs.technet.microsoft.com/edgeaccessblog/2010/05/12/split-brain-dns-configuring-directaccess-for-office-communications-server-ocs/

    • Hey, in the case of conferencing everything goes via the conferencing server (Front End). This is because it needs to be centrally processed and delivered to the many different devices which all have differing capabilities.

  11. Andrew, have you ever seen a case where the Mediation servers simply do not attempt to negotiate STUN/TURN candidates with the Edge servers? We implemented our Edge servers and everything works great except calls from external clients destined to go out to the PSTN, where the media path should build between the clients and the Mediation servers (via Edge). Wireshark on the Mediation servers shows zero 3478/udp packets sent to the Edge. Instead they attempt to communicate directly with the client TURN candidate, which is the external Edge interface.

    Calls from external clients to internal client and IVRs successfully negotiate valid candidates (we see the internal client and the front end servers sending 3478/udp packets to the edge internal)

    It’s as if the mediation servers are not ICE servers or are unaware of the existence of the edge servers.

    • Hey Jeff, no I haven’t seen that specific case. I’ve seen lots of issues where audio in general fails. It usually comes down to firewall issues, asymmetric routing, or an incorrectly configured next hop. Have you tried the Admin Tools port checker? This may help identify the issue – https://ucgeek.co/admin-tools/. Let me know how you get on.

  12. Hi there,
    wondering if you can shed some light please:

    we deployed a sfb2015 environment based on build guides from microsoft, F5 (for the ltm / reverse proxy) and vmware (full virtual enviroment)
    and initially all services worked like a charm.

    month or so later, external federated user cannot create a video call or join a conference that we create.
    yet we can video/audio to free skype (we enabled this a while earlier)

    not sure what im looking for as i cant see anything that stands out.
    any suggestions would be greatly appreciated

    • Hey Karl,

      Is your external federation SRV record correctly published? Through the PIC process, Skype consumer knows your Edge server FQDN, while other federated partners don’t unless they have explicitly listed you as an allowed domain on their end.

  13. Hey Andrew,

    We are migrating from Lync 2013 to SFB15. We have 3 locations where the remote users are currently unable to have audio/video calls or desktop sharing. Calls are getting through, but when Answer is clicked the call goes into connecting and then disconnects with a network error. Remote to remote calls are working fine, and even calls with federated partners are working fine. The issue occurs even with:

    Users in the SFB pool: not working.
    Internal users in the Lync 2013 pool in the same site: also not working.
    Internal users in the Lync 2013 pool in a different site: also not working.

    Any help will appreciated.

    • I can almost guarantee this will be a network/firewall issue. Without all the detail it’s pretty hard to troubleshoot. If I understand correctly, external to external is fine, which means your Edge server is successfully relaying media on its external interface. But you have an issue with media establishing between internal and external users – this requires media to relay through the Edge to your clients on the corp network. Can all your client networks communicate directly with your Edge server’s internal DMZ IP? Have you added static routes on your Edge server so it knows how to route to internal networks?
