A customer recently had a problem with Windows Server 2012 DirectAccess-connected clients performing desktop sharing, audio and video conversations with internal clients. DirectAccess is a Windows service, part of the Remote Access role, that allows domain-joined clients to access internal resources over the internet as if they were on the LAN. It does this by providing seamless VPN connectivity without any user input. Lync can work over DirectAccess (and Lync 2013 works a lot better as it supports IPv6), but because the traffic is encrypted and is real-time communication, it is recommended to use the Lync Edge server for connectivity rather than sending the traffic over the DirectAccess VPN. For more information, see the NextHop article ‘Enabling Lync Media to Bypass a VPN Tunnel’.
The primary issue the client was facing was that when users were out of the office using DirectAccess, application sharing / remote desktop failed with the error ‘Sharing failed to connect due to network issues. Try again later’. This meant that internal support staff could not assist remote users without third-party tools. Audio and video did not work over DirectAccess either.
Not having used DirectAccess before, I had a quick scan over the DirectAccess configuration and moved on to testing. Even with DirectAccess disconnected (forcibly, as it starts up automatically), the client could not share programs from Lync, and audio/video failed. So I set DirectAccess aside and concentrated on the Lync Edge. TMG was being used as the firewall, with a DMZ leg that contained the Lync Edge server. I validated that all required ports were open as per ‘Reference Architecture 1: Port Summary for Single Consolidated Edge’, which they were, but I still saw a lot of denied traffic on TMG from the Lync Edge internal interface to the internal client IP address.
Client logging showed some strange behaviour in the failing session. The SDP (Session Description Protocol) candidate list in the SIP INVITE did not contain any information about the public IP address of the remote client or the Lync Edge server; it listed only the private IP as a host (local address) candidate:
a=candidate:1 1 TCP-PASS 2120613887 192.168.0.102 15380 typ host
a=candidate:1 2 TCP-PASS 2120613374 192.168.0.102 15380 typ host
a=candidate:2 1 TCP-ACT 2121006591 192.168.0.102 10295 typ host
a=candidate:2 2 TCP-ACT 2121006078 192.168.0.102 10295 typ host
An example of a correct SDP candidate list with the media relay (TURN) and reflexive (STUN) addresses:
a=candidate:1 1 TCP-PASS 2120613887 192.168.0.102 1105 typ host
a=candidate:1 2 TCP-PASS 2120613374 192.168.0.102 1105 typ host
a=candidate:2 1 TCP-ACT 2121006591 192.168.0.102 19166 typ host
a=candidate:2 2 TCP-ACT 2121006078 192.168.0.102 19166 typ host
a=candidate:3 1 TCP-PASS 6556159 203.x.x.x 55741 typ relay raddr 101.x.x.x rport 36324
a=candidate:3 2 TCP-PASS 6556158 203.x.x.x 55741 typ relay raddr 101.x.x.x rport 36324
a=candidate:4 1 TCP-ACT 7076607 203.x.x.x 55741 typ relay raddr 101.x.x.x rport 36324
a=candidate:4 2 TCP-ACT 7076094 203.x.x.x 55741 typ relay raddr 101.x.x.x rport 36324
a=candidate:5 1 TCP-ACT 1684797695 101.x.x.x 36324 typ srflx raddr 192.168.0.102 rport 29176
a=candidate:5 2 TCP-ACT 1684797182 101.x.x.x 36324 typ srflx raddr 192.168.0.102 rport 29176
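The difference between the two lists can be checked mechanically. The following is a minimal Python sketch, not part of the original troubleshooting, that counts the candidate types in a=candidate lines and reports whether the reflexive (srflx) or relay candidates are missing. The function names are hypothetical, and the sample lines are a shortened copy of the lists above.

```python
from collections import Counter


def candidate_types(sdp_lines):
    """Count candidate types (host, srflx, relay) in a=candidate lines."""
    counts = Counter()
    for line in sdp_lines:
        line = line.strip()
        if not line.startswith("a=candidate:"):
            continue
        fields = line.split()
        # The candidate type is the token that follows the literal "typ".
        if "typ" in fields:
            idx = fields.index("typ")
            if idx + 1 < len(fields):
                counts[fields[idx + 1]] += 1
    return counts


def missing_edge_candidates(sdp_lines):
    """Return the Edge-derived candidate types that do not appear at all."""
    counts = candidate_types(sdp_lines)
    return [t for t in ("srflx", "relay") if counts[t] == 0]


# Sample lines shortened from the failing and working lists in this post.
failing = [
    "a=candidate:1 1 TCP-PASS 2120613887 192.168.0.102 15380 typ host",
    "a=candidate:2 1 TCP-ACT 2121006591 192.168.0.102 10295 typ host",
]
working = failing + [
    "a=candidate:3 1 TCP-PASS 6556159 203.x.x.x 55741 typ relay raddr 101.x.x.x rport 36324",
    "a=candidate:5 1 TCP-ACT 1684797695 101.x.x.x 36324 typ srflx raddr 192.168.0.102 rport 29176",
]

print(missing_edge_candidates(failing))  # ['srflx', 'relay']
print(missing_edge_candidates(working))  # []
```

A list with only host candidates suggests the client never reached the Edge to allocate a relay address, which is a different symptom from a firewall blocking media after negotiation.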
For a great overview of Lync Edge media negotiation and media traversal for remote clients, which goes into detail on TURN and STUN, see the article by Jeff Schertz, ‘Lync Edge STUN versus TURN’. The failing application sharing session listed only the private 192.168.0.x address, not the STUN or TURN candidates, and failed with this error:
ms-client-diagnostics: 23; reason="Call failed to establish due to a media connectivity failure when one endpoint is internal and the other is remote";CallerMediaDebug="application-sharing:ICEWarn=0x80020,LocalSite=192.168.0.102:15380,RemoteSite=10.x.x.x:7342,RemoteMR=203.x.x.x:52241,PortRange=1025:65000,RemoteMRTCPPort=52241,LocalLocation=1,RemoteLocation=2,FederationType=0"
A successful session would reference the public IP address as the LocalSite and the Lync Edge server as the Local Media Relay:
ms-client-diagnostics: 51007;reason="Callee media connectivity diagnosis info";CalleeMediaDebug="application-sharing:ICEWarn=0x0,LocalSite=101.x.x.x:14747,LocalMR=203.x.x.x:59191,RemoteSite=10.30.3.146:7500,RemoteMR=203.x.x.x:54525,PortRange=1025:65000,LocalMRTCPPort=59191,RemoteMRTCPPort=54525,LocalLocation=1,RemoteLocation=2,FederationType=0"
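To compare the two diagnostics headers side by side, the media debug value can be split into key/value pairs. This is a rough Python sketch with hypothetical function names. It assumes the value always has the shape 'modality:Key=Value,Key=Value,...' seen in the two samples above, and it does not attempt to interpret the individual ICEWarn bits.

```python
def parse_media_debug(value):
    """Split 'modality:Key=Val,Key=Val' into (modality, dict of fields)."""
    # Only the first colon separates the modality; later colons belong to
    # values such as 'LocalSite=192.168.0.102:15380'.
    modality, _, rest = value.partition(":")
    fields = {}
    for pair in rest.split(","):
        key, sep, val = pair.partition("=")
        if sep:
            fields[key.strip()] = val.strip()
    return modality, fields


def summarize(value):
    """Report the two details that differ between the failing and working calls."""
    modality, fields = parse_media_debug(value)
    return {
        "modality": modality,
        "ice_warn": int(fields.get("ICEWarn", "0x0"), 16),
        # LocalMR (local media relay) only appears when the client
        # obtained a relay address from the Edge.
        "has_local_relay": "LocalMR" in fields,
    }


# Values copied from the CallerMediaDebug / CalleeMediaDebug samples above.
failing = ("application-sharing:ICEWarn=0x80020,LocalSite=192.168.0.102:15380,"
           "RemoteSite=10.x.x.x:7342,RemoteMR=203.x.x.x:52241,PortRange=1025:65000,"
           "RemoteMRTCPPort=52241,LocalLocation=1,RemoteLocation=2,FederationType=0")
working = ("application-sharing:ICEWarn=0x0,LocalSite=101.x.x.x:14747,"
           "LocalMR=203.x.x.x:59191,RemoteSite=10.30.3.146:7500,RemoteMR=203.x.x.x:54525,"
           "PortRange=1025:65000,LocalMRTCPPort=59191,RemoteMRTCPPort=54525,"
           "LocalLocation=1,RemoteLocation=2,FederationType=0")

print(summarize(failing))
print(summarize(working))
```

In the failing call the summary shows a non-zero ICEWarn and no local relay, which matches the host-only candidate list.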
This led me down a long path of firewall troubleshooting, where I experimented with opening the high TCP and UDP ports I could see being blocked by TMG, but it did not help. My suspicion that the firewall was the problem was reinforced by the article ‘Application Layer Firewall Blocks Lync Application Sharing’, but in the end the firewall had nothing to do with it. I then started digging with Snooper into the ‘Traces’ tab of the client .uccapilog and found a line stating:
‘ResolveHostName – Name resolution for avconf.domain.com failed’
I had first tested nslookup, ping and telnet to the Edge from my own laptop, and I knew that desktop sharing and audio/video federation worked. I had not tested from the remote client, as I knew it was set up correctly. When testing from the remote DirectAccess client (with DirectAccess disabled), I could telnet to the IP address on port 443 and nslookup resolved the name to the IP, but if I tried to ping the name, it would not resolve, similar to this screenshot:
Ping says it cannot find the host, but DNS resolution is working, so it should be able to find it. This was something I hadn’t seen before, and it pointed at configuration specific to the remote DirectAccess client I was testing with. I added the AV Edge name and IP to the hosts file, and desktop sharing and audio/video worked straight away. This led me to an old OCS and DirectAccess configuration guide, ‘Split-Brain DNS: Configuring DirectAccess for Office Communications Server (OCS)’, which matched the client’s split DNS configuration.
The article above talks about the DNS Name Resolution Policy Table (NRPT). Running ‘netsh namespace show policy’ showed that entries existed for sip.domain.com, sipinternal.domain.com, _sipinternaltls._tcp.domain.com etc., as mentioned in that OCS article. However, the Web Conferencing Edge and AV Edge names were not on the list.
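The mismatch between nslookup and ping is easier to spot when the test goes through the same resolver that applications use. nslookup sends its query straight to a DNS server, whereas ping and Lync ask the operating system to resolve the name. The sketch below is an illustration only: the function name is hypothetical, and it assumes that Python's socket.getaddrinfo follows the same Windows name resolution path as ping. The resolver is passed in as a parameter so the example can be run without network access.

```python
import socket


def resolve_via_os(hostname, port=443, resolver=socket.getaddrinfo):
    """Return the addresses the OS resolver gives for hostname, or [] on failure."""
    try:
        results = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Same situation as ping reporting that it could not find the host.
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address.
    return sorted({info[4][0] for info in results})


# Stand-in resolvers for demonstration (hypothetical, no network access).
def failing_resolver(host, port, proto=0):
    raise socket.gaierror("host not found")


def working_resolver(host, port, proto=0):
    # 203.0.113.10 is a documentation address, not the customer's real IP.
    return [(socket.AF_INET, socket.SOCK_STREAM, proto, "", ("203.0.113.10", port))]


print(resolve_via_os("avconf.domain.com", resolver=failing_resolver))  # []
print(resolve_via_os("avconf.domain.com", resolver=working_resolver))  # ['203.0.113.10']

# On a real client, call resolve_via_os("avconf.domain.com") with the default
# resolver and compare the result with what nslookup returns.
```

An empty result here alongside a successful nslookup would point at client-side name resolution policy rather than at the DNS records themselves.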
In the DirectAccess configuration under ‘Step 3 Infrastructure Servers’:
We added webconf.domain.com and avconf.domain.com to the DNS Server Addresses list with a blank DNS server entry, similar to the screenshot below. This forces the client to use the DNS servers of its primary connection, rather than internal DNS, for these specific names.
After applying the configuration and updating the GPO, I ran ‘gpupdate /force’ on the client (over the internet), which updated the NRPT similar to this, where ‘DirectAccess (DNS Servers)’ is blank and ‘DirectAccess (Proxy Settings)’ is set to ‘Use default browser settings’:
After that, the client could resolve the IP for avconf.domain.com, and application sharing as well as audio and video worked while using DirectAccess. The problem was that the client had the same namespace internally and externally, and the default DirectAccess configuration forces all .domain.com resolution to the internal DNS servers. Even when DirectAccess is not connected, the NRPT configuration remains active. This is why, even though nslookup could resolve the address using the public DNS of the mobile hotspot, the DirectAccess NRPT configuration overrode it as soon as the name was actually used by Lync, ping or telnet.
A simple solution, but not an obvious one when troubleshooting with DirectAccess turned off. The takeaway is that all Lync external namespaces should be added to the DirectAccess NRPT bypass list. That means all of the Lync Edge interfaces as well as the Reverse Proxy entries for external web services, meet, dialin, lyncdiscover (for the Windows Store Lync clients) and Office Web Apps Server.
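The effect of the exemption entries can be illustrated with a simplified model of the NRPT lookup. This is a sketch under stated assumptions, not the actual Windows implementation: it assumes that an exact name rule beats a suffix rule, that longer suffixes beat shorter ones, and that a rule with no DNS servers means 'use the DNS of the current connection'. The function names and the two DNS server addresses are hypothetical.

```python
def nrpt_match(name, rules):
    """Return the (namespace, servers) rule that applies to name, or None."""
    name = name.lower().rstrip(".")
    best, best_rank = None, None
    for namespace, servers in rules.items():
        ns = namespace.lower()
        if ns.startswith("."):
            matched = name.endswith(ns)   # suffix rule such as '.domain.com'
            rank = (0, len(ns))
        else:
            matched = name == ns          # exact FQDN rule
            rank = (1, len(ns))
        # Assumption: exact rules outrank suffix rules, then longest wins.
        if matched and (best_rank is None or rank > best_rank):
            best, best_rank = (namespace, servers), rank
    return best


def dns_servers_for(name, rules, connection_dns):
    """Pick the DNS servers a query for name would be sent to."""
    match = nrpt_match(name, rules)
    if match is None or not match[1]:
        # No rule, or an exemption with a blank server list.
        return connection_dns
    return match[1]


INTERNAL_DNS = ["10.0.0.53"]    # hypothetical internal DNS server
HOTSPOT_DNS = ["192.0.2.53"]    # hypothetical DNS of the mobile hotspot

before = {".domain.com": INTERNAL_DNS}
after = {**before, "avconf.domain.com": [], "webconf.domain.com": []}

print(dns_servers_for("avconf.domain.com", before, HOTSPOT_DNS))  # ['10.0.0.53']
print(dns_servers_for("avconf.domain.com", after, HOTSPOT_DNS))   # ['192.0.2.53']
```

In the model, names such as sipinternal.domain.com still go to the internal servers after the change, because only the two exempted names have a more specific rule.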