Polycom VVX 310 – Unable to do blind transfer internally

I've been working with an SFB customer recently and ran into some unusual issues. I would like to share what I did to solve them.

Issue Description:

The customer's Polycom handsets were unable to transfer external calls to a particular internal four-digit number range (xxxx). All the agent phones are VVX 310s and agents sign in via extension and PIN. When a transfer failed, the caller heard the placid recorded female voice: "We're sorry, your call cannot be completed as dialled. Please check the number and dial again." The interesting thing is that the failure only happened on blind transfers, while supervised transfers worked perfectly, and the handsets could make and receive direct calls without any problem. That didn't make much sense to me.

Investigations:

First, I went through all the SFB dial plans and the gateway routing and transformation rules. The number range was correctly configured and nothing was different from the other ranges.

I upgraded one of the Polycom handsets to the latest SfB-enabled 5.5 firmware. It made no improvement; the result was still the same.

I then suspected the Digitmap settings in the configuration file might be causing the issue, so I logged into the web interface -> Settings -> SIP -> Digitmap, removed the regex in the Digitmap field and rebooted the phone. Still no luck: a blind transfer to the internal range xxxx failed again. :/

Things got interesting when I tested with the SFB client. Logged in as two users within the number range xxxx, the same blind transfer worked. Transferring from the SFB client to a Polycom handset also worked, but it stopped working as soon as the transfer was initiated from the Polycom handset.

Since I could hear the Telco's announcement, I thought it would be good to start with a trace from the Sonus end to see why the transfer failed. In the live trace I could see the number in the INVITE was not what I expected: something went wrong during number normalization and the extension was given the wrong prefix. Where did the wrong prefix come from?

I logged into the SFB Control Panel to re-check the voice routing. Nothing was wrong with the user dial plan or the user normalization rules, yet the Control Panel test tool produced a different prefix from the one in the live trace. Where could it possibly be going wrong? Ext xxx1 is mapped to SFB user 1: when I signed in to my SFB client as user 1 everything worked, but when I signed in to the Polycom phone as Ext xxx1, blind transfers to the problematic range xxxx failed.
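As a cross-check, the normalization rules from both scopes can also be dumped and tested from the SfB Management Shell. A minimal sketch, assuming placeholder dial plan identities and a placeholder extension 1234:

```powershell
# Pull the normalization rules from the Global dial plan and from the dial plan
# actually assigned to the agents (identities below are placeholders).
$globalRules = (Get-CsDialPlan -Identity Global).NormalizationRules
$userRules   = (Get-CsDialPlan -Identity 'Tag:AgentDialPlan').NormalizationRules

# Run the dialled extension through each rule set; rules that match return the
# translated (normalized) number, so the two outputs can be compared directly.
$globalRules | ForEach-Object { Test-CsVoiceNormalizationRule -DialedNumber '1234' -NormalizationRule $_ }
$userRules   | ForEach-Object { Test-CsVoiceNormalizationRule -DialedNumber '1234' -NormalizationRule $_ }
```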

All of a sudden, I noticed the global dial plan had a strange prefix configured that matched the prefix (+613868) appearing in the live trace. So it seems that, for some reason, the Polycom handsets use the global dial plan when performing a blind transfer (this may be a bug), while the SFB client uses the user's dial plan. That explains the behaviour difference between the handsets and the desktop clients.

Solution Summary:

After I created a new entry for the number range xxxx in the global dial plan and rebooted the Polycom phone, blind transfers started working again. The results all looked correct and I verified the issue as resolved.
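For reference, the same fix can be applied from the SfB Management Shell instead of the Control Panel. This is only a sketch: the pattern and translation below are hypothetical placeholders and must be replaced with the real internal range and its E.164 mapping.

```powershell
# Hypothetical example only - substitute the real four-digit pattern and the
# correct E.164 translation for the site before using anything like this.
New-CsVoiceNormalizationRule -Parent Global -Name 'Internal 4-digit range xxxx' `
    -Pattern '^(1\d{3})$' -Translation '+6139990$1' `
    -Description 'Added so the handsets normalize blind transfers to range xxxx correctly'
```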

 

 

Hopefully this helps someone else with a similar issue.

 

 

Resolving presence not up-to-date & unable to dial in to conferences via PSTN issues in Lync 2013

 

I've been working with an SFB customer recently and ran into another unusual issue. I would like to share what I did to solve it.

Issue Description: After SQL patching on the Lync servers, users' presence was not updating and people were unable to dial in to scheduled conferences.

Investigation:

When I used the Lync Management Shell to move a test user from SBA pool A to pool B on one FE server, then checked the user's pool info against SBA pool A, the result still showed the test user under pool A. This indicated that either the FE Lync databases were not syncing with each other properly or there was database corruption.
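For context, the test was essentially the following (identities and pool FQDNs are placeholders). On a healthy pool the RegistrarPool reported afterwards should reflect the new target pool straight away.

```powershell
# Move a test user to the other pool, then ask the directory which registrar
# pool it now believes the user is homed on (placeholder identities/FQDNs).
Move-CsUser -Identity 'sip:test.user@contoso.com' -Target 'poolB.contoso.com' -Confirm:$false
Get-CsUser  -Identity 'sip:test.user@contoso.com' | Select-Object DisplayName, RegistrarPool
```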

I checked all the Lync FE servers: all the Lync services were running and everything looked good. I re-tested the conference scenarios; the PSTN conference bridge number was still unavailable, while people could still make incoming and outgoing calls.

I went back to check the logs on all the Lync FE servers and noticed that one of them was logging "Warning: Revocation status unknown. Cannot contact the revocation server specified in certificate". Weird. Did this mean there was something wrong with the cert on this FE server? I didn't see the error on the other FE server, and both FE servers are supposed to use the same certs, so it wasn't simply a certificate issue; something was wrong with that FE server itself.

Next, I stopped all the Lync services on the problematic FE server to see if it made any difference. An interesting thing happened: once I did that, everyone's presence became up to date and the PSTN conference bridge number became available again, and I could dial in from my mobile. That confirmed it was a server issue.
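Taking the suspect FE out of play is quickest from the Lync Management Shell on that server; run without a service name, Stop-CsWindowsService stops all the Lync services on the local machine.

```powershell
# On the problematic FE server: drain and stop every Lync service, then confirm.
Stop-CsWindowsService -Graceful
Get-CsWindowsService
```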

Root Cause:

What caused the cert error on that FE server, and which cert was it using? I manually relaunched the Deployment Wizard, intending to compare the certs between the two FE servers, and then noticed that the Lync Server configuration was not up to date at the database store level. That was a surprise: there had been no topology change, so I had never thought about re-running the Deployment Wizard after the FE SQL patching. On the other FE server, which was working as expected, every step of the Deployment Wizard showed a green check. Bingo: I was confident that all the inconsistencies users were seeing came down to inconsistent SQL databases between the two FE servers.
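A related check worth running before touching the wizard is the Central Management store replication status; a replica reporting UpToDate = False points at the server whose local configuration store has fallen behind.

```powershell
# Check CMS replication across all replicas, force a replication pass, re-check.
Get-CsManagementStoreReplicationStatus
Invoke-CsManagementStoreReplication
Get-CsManagementStoreReplicationStatus
```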


Solution:

Eventually, after the change request was approved by the CAB, re-running the Deployment Wizard to sync the SQL store and re-assigning the certificates to the Lync services resolved the issue.
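For the record, those wizard steps map roughly onto the following shell commands. This is only a sketch: the certificate thumbprint and usage list below are placeholders, and the exact steps depend on the deployment.

```powershell
# Bring the local configuration databases up to date after the SQL patching.
Install-CsDatabase -Update -LocalDatabases

# Re-assign the existing certificate to the Lync services (placeholder thumbprint
# and usage list - match these to the deployment).
Set-CsCertificate -Type Default,WebServicesInternal,WebServicesExternal `
                  -Thumbprint 'AB12CD34EF56...'

# Restart the services so they pick up the refreshed store and certificate.
Stop-CsWindowsService -Graceful
Start-CsWindowsService
```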

 

Hopefully this helps someone else with a similar issue.


Configure the SBC to forward calls out via the original SG where the incoming call came in

Issue Description:

Recently I completed a project adding additional Telstra SIP trunks to a customer's production Sonus environment. The customer's Sonus environment has primary SIP trunks with another SIP provider, and the new Telstra SIP trunks were set up for a newly established small office. After the new trunks were set up, I had one issue: when calls were forwarded from SFB clients to users' mobile phones, the A-party number did not pass through; instead, the number shown on the mobiles was the pilot number of the primary SIP trunk. A-party number pass-through is not supported by the primary SIP trunk provider, but it is definitely supported on the new Telstra SIP trunk. So the question became: how do I configure the SBC to route forwarded calls back out via the Telstra signalling group where the incoming call came in?

 

Investigation:

I made a test call (A party rang B party; B party was set to forward the call externally to C party) and captured the logs for the whole call-forwarding scenario. I could see the forwarding part of the call has a SIP INVITE sent from the Mediation Server, and its SIP headers contain the numbers of parties A, B and C. Screenshot below:

I could see the HISTORY-INFO data field contains the B-party number during the call forward. My plan was to create a transformation rule that compares the History-Info data field value and, if the value contains the Telstra SIP trunk number range, routes the call out via the Telstra SIP trunk.
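To make the comparison concrete, here is a rough illustration of the check the transformation rule performs against the value collected from History-Info. The numbers and the range pattern are placeholders only; the real Telstra DID range is site-specific.

```powershell
# Hypothetical values standing in for the real Telstra range.
$sgUserValue1      = '+61399901234'        # B-party number pulled from History-Info on the inbound INVITE
$telstraRangeRegex = '^\+6139990\d{4}$'    # regex covering the new Telstra DID range

if ($sgUserValue1 -match $telstraRangeRegex) {
    'In the Telstra range - route the forwarded leg back out the Telstra SG'
} else {
    'Not in the Telstra range - follow the normal route to the primary trunk'
}
```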

Before creating the new rule, I wanted to verify that A-party number pass-through was working. I created an optional rule matching the calling address/number against my mobile number (A party). When I called the Telstra number range, the call came in on the Telstra trunk and went out via the same trunk to ring another mobile (C party), and the A-party number was displayed as the caller number. All good.

I set up the message manipulation rule for the INVITE message based on the first half of this Sonus doc: https://support.sonus.net/display/UXDOC61/Using+HISTORY-INFO+to+set+the+FROM+Number.

After that I tried an optional transformation rule matching the Telstra outgoing calls. It didn't work and the call still went out via the primary trunk. The rule can't be made mandatory under the current SFB-to-Telstra routing table either, because that would disconnect the normal outgoing calls on the Telstra trunk.

Next, I created a mandatory rule to compare the History-Info value and bound it to the Telstra SG, then re-tested the call-forwarding scenario. Still the call went out via the primary trunk. :/

When I checked the Sonus local system log, I couldn't see any entries containing the "SG User" variable. That made me realise the inbound manipulation rule had wrongly been assigned to the Telstra SG: the INVITE comes from the SFB server, so the rule belongs on the SFB SG. Once I corrected that setting, it started working as expected.

Solution Summary:

  1. Create a message manipulation rule "Collect History-Info" on the INVITE message that copies the History-Info number into "SG User Value 1": Applicable Messages – Selected Messages, Message Selection – Invite, Table Result Type – Mandatory. Refer to the screenshot below:

  2. Assign it to the Inbound Message Manipulation of the SFB SG.
  3. Create a transformation rule comparing "SG User Value 1" with the Telstra SIP trunk number range:

  4. Create a new route that matches this transformation rule.

 

After this, I re-tested the call-forwarding scenario: both the inbound leg and the forwarded leg of the call were routed through the Telstra SIP trunk. The results all looked correct and I verified the issue as resolved. 😊

Resolving Skype for Business Server 2015 Backup Service "ErrorState" issue

I've been working with an SFB customer recently and ran into another unusual issue. I would like to share what I did to solve it.

Issue Description:

When I went through the Lync event logs, I noticed the SFB FE servers were logging a lot of LS Backup Service errors with Event IDs 4052, 4098 and 4071. The errors say:

“Skype for business Server 2015, backup service users store backup module has backup data that never gets imported by backup pool. Backup data “file:\filestore\2-backupservice-1\backupstore\userservice\PresenceFocus\Data\Backup.zip

Cause: Import issue in the backup pool. Please check event log of Skype for business Server 2015, Backup service in the backup pool for more information.

Resolution:

Fix import issue in the backup pool”

After I read these errors, I did a health check by running "Get-CsBackupServiceStatus -PoolFqdn primarypoolname". The result showed OverallExportStatus: ErrorState and OverallImportStatus: NormalState.

Running the same cmdlet against the backup pool, "Get-CsBackupServiceStatus -PoolFqdn backuppoolname", showed OverallExportStatus: ErrorState and OverallImportStatus: ErrorState.

I checked the filestore folder permission settings and they looked correct: everyone had been given read & write access to the folder, so this issue was not related to folder permissions. That made sense, because the backup services had been running fine up until a certain point in time.

Then I did a bit of googling: the common advice for backup service problems is to recreate the backup folder. I stopped the SFB Backup Service, File Transfer Agent Service and Master Replicator Agent Service on the FE servers in both the primary pool and the DR pool, deleted the folder structure within the backup service folders, and then restarted the stopped services. After a few seconds the backup folder structures were recreated. I ran "Invoke-CsBackupServiceSync -PoolFqdn primarypoolname" and "Invoke-CsBackupServiceSync -PoolFqdn backuppoolname" and everything looked fine, but when I ran "Get-CsBackupServiceStatus -PoolFqdn poolname" on both pools, I got the same error results as before.
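Put into shell form, the sequence looks roughly like this. The display-name filter and pool FQDNs are assumptions/placeholders; the service stop/start part is run on the FE servers in both pools.

```powershell
# Stop the backup-related services, matching on display name so the short
# service names don't have to be guessed (assumes standard SfB 2015 names).
$backupSvcs = Get-Service | Where-Object {
    $_.DisplayName -match 'Backup Service|File Transfer Agent|Master Replicator Agent'
}
$backupSvcs | Stop-Service

# ...delete the old BackupStore folder structure on the file share here...

$backupSvcs | Start-Service

# Force a resync and re-check status on both pools (placeholder FQDNs).
Invoke-CsBackupServiceSync -PoolFqdn primarypool.contoso.com
Invoke-CsBackupServiceSync -PoolFqdn backuppool.contoso.com
Get-CsBackupServiceStatus  -PoolFqdn primarypool.contoso.com
Get-CsBackupServiceStatus  -PoolFqdn backuppool.contoso.com
```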

To me, this was not good news. I was sure something had changed in the environment background, so I started basic troubleshooting again from the primary site. I browsed to the backup folder on the primary-site servers and re-checked the folder permissions; everything looked good. I also tried browsing to the DR folder from the primary site, which worked fine with nothing wrong. :/

Root Cause:

When I moved to the DR servers and tried to browse to the primary-site backup folder via the same directory path, something interesting happened: the filestore at that path on the DR servers was completely different from the filestore I had browsed on the primary servers. A further ping test verified that the filestore host name resolved differently at the primary site and at the DR site. That meant the primary and DR filestores couldn't talk to each other, and that was the root cause of the backup service error state.
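The check that exposed this is easy to repeat; the filestore name and share path below are placeholders for the customer's DFS namespace. Run it on a primary-site FE and on a DR-site FE and compare the answers.

```powershell
# Resolve the filestore host and confirm the backup share is reachable; on the
# DR FE the A record came back pointing at a different (DR-site) DFS host.
Resolve-DnsName -Name filestore.contoso.com -Type A
Test-Path '\\filestore.contoso.com\SfBFileShare\2-BackupService-1'
```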

What exactly changed?

I spoke with the customer's IT team, who advised that originally both the primary filestore and the DR filestore were located on one DFS host. A couple of weeks earlier, the team had made some changes to the DFS farm, with the end result that the SFB FE servers at the primary site resolved the filestore name against the primary-site DFS host, while the SFB FE servers at the DR site resolved it against the DR-site DFS host, a completely different host. This broke the configuration sync and caused the backup service to fail.

Solutions:

The DFS farm was reconfigured so that all the SFB FE servers across both sites resolve the filestore name against the primary-site DFS host. After that, I restarted the backup services and everything started working again.

Running the health check again, "Get-CsBackupServiceStatus -PoolFqdn primarypoolname", showed OverallExportStatus: FinalState and OverallImportStatus: NormalState. The health check for the DR site also looked correct. Verified the issue resolved.

I'm posting this because I couldn't find any reference to this particular environment-related SFB backup service error anywhere else. Hopefully it helps someone else too.