Scheduled Runbook Tasks to Automatically Back Up VMs with Tag @{backup = 'true'}

I always like to create automation tasks to replace tedious manual click jobs; this can be very helpful for customers with large environments. In this blog, I want to share an Azure Runbook I built that runs in the Azure background and automatically backs up the VMs tagged with @{backup = 'true'}. This standardizes VM backup with a defined backup policy, and it automatically audits the environment to make sure the required compute VM resources are backed up.

To run the runbook, add the modules below to your Azure Automation account (a scripted alternative is sketched after the screenshot):

  • AzureRM.RecoveryServices version 4.1.4
  • AzureRM.RecoveryServices.Backup version 4.3.0

Pic1
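If you prefer to script this step, the modules can also be imported from the PowerShell Gallery with the AzureRM.Automation cmdlets. This is just a sketch, assuming a hypothetical automation account called "myAutomationAccount" in the same resource group used later in the script (the base AzureRM modules must already be present in the account):

# Import the two Recovery Services modules into the Automation account (account name is a placeholder)
$aa = "myAutomationAccount"
$rg = "edmond-guo-rg"
New-AzureRmAutomationModule -AutomationAccountName $aa -ResourceGroupName $rg `
    -Name "AzureRM.RecoveryServices" `
    -ContentLink "https://www.powershellgallery.com/api/v2/package/AzureRM.RecoveryServices/4.1.4"
New-AzureRmAutomationModule -AutomationAccountName $aa -ResourceGroupName $rg `
    -Name "AzureRM.RecoveryServices.Backup" `
    -ContentLink "https://www.powershellgallery.com/api/v2/package/AzureRM.RecoveryServices.Backup/4.3.0"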

Below is the Runbook PowerShell script:


#define login

function Login() {
    $connectionName = "AzureRunAsConnection"
    try
    {
        Write-Verbose "Acquiring service principal for connection '$connectionName'" -Verbose

        $servicePrincipalConnection = Get-AutomationConnection -Name $connectionName

        Write-Verbose "Logging in to Azure..." -Verbose

        Add-AzureRmAccount `
            -ServicePrincipal `
            -TenantId $servicePrincipalConnection.TenantId `
            -ApplicationId $servicePrincipalConnection.ApplicationId `
            -CertificateThumbprint $servicePrincipalConnection.CertificateThumbprint | Out-Null
    }
    catch {
        if (!$servicePrincipalConnection)
        {
            $ErrorMessage = "Connection $connectionName not found."
            throw $ErrorMessage
        } else {
            Write-Error -Message $_.Exception
            throw $_.Exception
        }
    }
}

Login

#define global variables

$rsVaultName = "myRsVault"
$rgName = "edmond-guo-rg"
$location = "Australia Southeast"
$keyvault = "edkeyvault1"
$vmrg = "VMs"
$backupvms = (Get-AzureRmResource -Tag @{ backup="true" } -ResourceGroupName $rgName -ResourceType Microsoft.Compute/virtualMachines).Name

# Register the Recovery Services provider

Register-AzureRmResourceProvider -ProviderNamespace "Microsoft.RecoveryServices"

# Create a Recovery Services Vault and set its storage redundancy type

New-AzureRmRecoveryServicesVault `
    -Name $rsVaultName `
    -ResourceGroupName $rgName `
    -Location $location
$vault1 = Get-AzureRmRecoveryServicesVault -Name $rsVaultName
Set-AzureRmRecoveryServicesBackupProperties -Vault $vault1 -BackupStorageRedundancy LocallyRedundant

# Set Recovery Services Vault context and create protection policy

Get-AzureRmRecoveryServicesVault -Name $rsVaultName | Set-AzureRmRecoveryServicesVaultContext
$schPol = Get-AzureRmRecoveryServicesBackupSchedulePolicyObject -WorkloadType "AzureVM"
$retPol = Get-AzureRmRecoveryServicesBackupRetentionPolicyObject -WorkloadType "AzureVM"
# The "NewPolicy" protection policy referenced below is assumed to already exist; it can be created once with:
# New-AzureRmRecoveryServicesBackupProtectionPolicy -Name "NewPolicy" -WorkloadType "AzureVM" -SchedulePolicy $schPol -RetentionPolicy $retPol

foreach ($backupvm in $backupvms)
{
    # Provide permissions to Azure Backup to access the key vault and enable backup on the VM
    Set-AzureRmKeyVaultAccessPolicy -VaultName $keyvault -ResourceGroupName $rgName -PermissionsToKeys backup,get,list -PermissionsToSecrets backup,get,list -ServicePrincipalName 17078714-cbca-45c7-b486-5d9035fae0b5
    $pol = Get-AzureRmRecoveryServicesBackupProtectionPolicy -Name "NewPolicy"
    Enable-AzureRmRecoveryServicesBackupProtection -Policy $pol -Name $backupvm -ResourceGroupName $vmrg

    # Modify protection policy
    $retPol = Get-AzureRmRecoveryServicesBackupRetentionPolicyObject -WorkloadType "AzureVM"
    $retPol.DailySchedule.DurationCountInDays = 365
    $pol = Get-AzureRmRecoveryServicesBackupProtectionPolicy -Name "NewPolicy"
    Set-AzureRmRecoveryServicesBackupProtectionPolicy -Policy $pol -RetentionPolicy $retPol

    # Trigger a backup and monitor the backup job
    $namedContainer = Get-AzureRmRecoveryServicesBackupContainer -ContainerType "AzureVM" -Status "Registered" -FriendlyName $backupvm
    $item = Get-AzureRmRecoveryServicesBackupItem -Container $namedContainer -WorkloadType "AzureVM"
    $job = Backup-AzureRmRecoveryServicesBackupItem -Item $item
    $joblist = Get-AzureRmRecoveryServicesBackupJob -Status "InProgress"
    Wait-AzureRmRecoveryServicesBackupJob `
        -Job $joblist[0] `
        -Timeout 43200
}

This runbook job runs every day at 5 AM, takes the VM snapshots, and saves the backup images in the Recovery Services vault defined in the script.
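The daily trigger itself is an Automation schedule linked to the runbook. A minimal sketch, assuming the same hypothetical automation account and a runbook named "Backup-TaggedVMs" (adjust names and time zone to your environment):

# Create a daily schedule starting at 5 AM tomorrow and link it to the runbook (names are placeholders)
$params = @{
    AutomationAccountName = "myAutomationAccount"
    ResourceGroupName     = "edmond-guo-rg"
}
New-AzureRmAutomationSchedule @params -Name "Daily-5AM" `
    -StartTime (Get-Date "05:00").AddDays(1) -DayInterval 1
Register-AzureRmAutomationScheduledRunbook @params -RunbookName "Backup-TaggedVMs" -ScheduleName "Daily-5AM"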

pic2

Hopefully this runbook script can help you with your day-to-day operations tasks. 😉

Back Up an API Management Service to Blob Storage by Calling the Azure API from a .NET App

Recently I have been setting up some Azure API Management services and thinking about how to automate the process of backing up and restoring the API Management configuration for disaster recovery scenarios.

I understand there are many ways to do this. I will start with a C# app and show you how I backed up the Azure API Management service's configuration to blob storage via API calls.

Creating an Azure AD Application for Token Authentication

  1. Log in to Azure AD and navigate to App registrations
  2. Create a new application registration
  3. Fill in the application name and select Native for the application type
  4. Enter a URL for the redirect URI field
  5. Complete the app registration
  6. Go to Settings -> Required permissions and add "Windows Azure Service Management API" (a rough PowerShell equivalent of these steps is sketched after the screenshot below)

Windows Azure Service Management API
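If you'd rather script the registration, a rough PowerShell equivalent of the portal steps with the AzureAD module looks like this; the display name and redirect URI are placeholder values, and the "Windows Azure Service Management API" permission is still granted in the portal as described above:

# Create a native (public client) app registration - illustrative values only
Connect-AzureAD
$app = New-AzureADApplication -DisplayName "ApimBackupClient" `
    -PublicClient $true `
    -ReplyUrls @("https://localhost/apim-backup")
$app.AppId   # application ID used later to acquire the token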

Once you finish the steps above, you will have an Azure application ID and a redirect URL; we will use these values to retrieve the auth token used to access the Azure environment.

I installed the "Microsoft.IdentityModel.Clients.ActiveDirectory" NuGet package in my local Visual Studio environment and used the code below to retrieve the token:

var authenticationContext = new AuthenticationContext("https://login.microsoftonline.com/{tenant id}");

var result = authenticationContext.AcquireToken("https://management.azure.com/", "{application id}", new Uri("{redirect uri}"));

My local Visual Studio code looks like this:

Windows Azure Service Management API 1

Once the tenant ID, application ID and redirect URL are correctly specified, you will get the token result below:

Windows Azure Service Management API 2

Calling Azure API to back up or restore API Management service

Back up an API Management Service

POST https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.ApiManagement/service/{serviceName}/backup?api-version={api-version}

Restore an API Management Service

POST https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.ApiManagement/service/{serviceName}/restore?api-version={api-version}

The API request header Content-Type needs to be set to "application/json".

In the request body, specify the target Azure Storage account name, access key, blob container name, and backup blob name:

{
    "storageAccount": "{storage account name for the backup}",
    "accessKey": "{access key for the account}",
    "containerName": "{backup container name}",
    "backupName": "{backup blob name}"
}

Below is the code I created to make the API call that triggers the backup.

Windows Azure Service Management API 3
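My app is a C# HttpClient call, as shown in the screenshot above. As a rough equivalent that can be used to test the same request, the backup can also be triggered from PowerShell; every bracketed value below is a placeholder:

# Trigger an API Management backup via the ARM REST API (placeholder values throughout)
$token = "{auth token acquired above}"
$uri = "https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.ApiManagement/service/{serviceName}/backup?api-version={api-version}"
$body = @{
    storageAccount = "{storage account name for the backup}"
    accessKey      = "{access key for the account}"
    containerName  = "{backup container name}"
    backupName     = "{backup blob name}"
} | ConvertTo-Json

# A 202 (Accepted) response means the backup operation has been queued
Invoke-RestMethod -Method Post -Uri $uri -ContentType "application/json" `
    -Headers @{ Authorization = "Bearer $token" } -Body $body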

Once the code runs successfully and the backup process is initiated, I receive a 202 response code. On the Azure API Management service side, the activity log shows my HTTP client retrieving the token and completing the backup:

Windows Azure Service Management API 4

If we go to the backup storage location, we can see the successful backup in the target storage container. The storage account can be in a different region to provide geo-redundancy.

Windows Azure Service Management API 5

To restore the API Management service, you can follow the same process: re-run the app, retrieve the auth token and call the restore API endpoint. Once the restore completes, the logs in the API Management service look like this:

Windows Azure Service Management API 6

Hopefully this is helpful when it comes to the API Management service. I will try to get the .NET code working in Azure Functions next and share it later. 🙂

 

How far to take response group

I have been working on an SFB Enterprise Voice implementation project recently. The client was very keen to use native response groups to build a corporate IVR for their receptionists. The requirement ended up needing 4 workflows, 19 queues and 2 groups, going well beyond the simple 2-level, 4-option IVR case. The whole implementation cannot be completed in the GUI; Lync PowerShell is the only way to meet the requirement.

I drew the reception IVR workflow below:

RGS

The root-level menu has 7 options, with option 9 looping back, and the sub-menus have up to 8 options each to help reduce the receptionists' workload.

I like to start in the GUI to quickly set up the IVR framework with the first 4 options, and then use scripts to extend the options and manage the framework. Taking the "Reception Main Menu" as an example, I used the scripts below to add option 5, option 6 and option 9.

##Create Option 5

$Workflow=get-csrgsworkflow -name "Reception Main Menu"

$queue = Get-CsRgsQueue -name "Press5sub Queue[R]"

$Question = $workflow.DefaultAction.Question

$Action5 = New-CsRgsCallAction -Action TransferToQueue -QueueID $queue.Identity

$Answer5 = New-CsRgsAnswer -Action $Action5 -DtmfResponse 5 -VoiceResponseList "Option5"

$Question.AnswerList.Add($Answer5)

Set-CsRgsWorkflow -Instance $workflow

##Create Option 6

$Workflow=get-csrgsworkflow -name "Reception Main Menu"

$queue = Get-CsRgsQueue -name "Press6sub Queue[R]"

$Question = $workflow.DefaultAction.Question

$Action6 = New-CsRgsCallAction -Action TransferToQueue -QueueID $queue.Identity

$Answer6 = New-CsRgsAnswer -Action $Action6 -DtmfResponse 6 -VoiceResponseList "Option6"

$Question.AnswerList.Add($Answer6)

Set-CsRgsWorkflow -Instance $workflow

##Create Option 9

$Workflow=get-csrgsworkflow -name "Reception Main Menu"

$queue = Get-CsRgsQueue -name "Press9sub Queue[R]"

$Question = $workflow.DefaultAction.Question

$Action9 = New-CsRgsCallAction -Action TransferToQueue -QueueID $queue.Identity

$Answer9 = New-CsRgsAnswer -Action $Action9 -DtmfResponse 9 -VoiceResponseList "Option9"

$Question.AnswerList.Add($Answer9)

Set-CsRgsWorkflow -Instance $workflow

To manage the business hours of the IVR workflows, I used the script below to reset/update the business hours:

##Business Hours update

$weekday = New-CsRgsTimeRange -Name "Weekday Hours" -OpenTime 08:30:00 -CloseTime 17:30:00

$x = Get-CsRgsHoursOfBusiness -Identity "service:ApplicationServer:nmlpoolaus01.company.com.au" -Name "Reception Main Menu_434d7c29-9893-4946-afcf-3bb9ac7aad8a"

$x.MondayHours1 = $weekday

$x.TuesdayHours1 = $weekday

$x.WednesdayHours1 = $weekday

$x.ThursdayHours1 = $weekday

$x.FridayHours1 = $weekday

Set-CsRgsHoursOfBusiness -Instance $x

$x

To manage the greeting/announcement of the IVR workflows, I used the script below to reset/update the IVR greeting:

##greeting/announcement update

# load the workflow to update (same "Reception Main Menu" workflow as above)
$workflow = Get-CsRgsWorkflow -Name "Reception Main Menu"

$audioFile = Import-CsRgsAudioFile -Identity "service:ApplicationServer:nmlpoolaus01.company.com.au" -FileName "Greeting reception.wma" -Content (Get-Content "C:\temp\Greeting Reception.wma" -Encoding byte -ReadCount 0)

$prompt = New-CsRgsPrompt -AudioFilePrompt $audioFile

$workflow.DefaultAction.Question.Prompt = $prompt

$workflow.DefaultAction.Question

Set-CsRgsWorkflow $workflow

The native Lync response group is a basic IVR platform that covers most simple cases, and it can even go as far as a multi-level, multi-option IVR with text-to-speech and speech recognition (an interactive workflow). That's not too shabby at all!

Hopefully my scripts can help you to extend your Lync IVR RGS workflow. 😊

Visual Studio Team Services (VSTS) Continuous Integration and Continuous Deployment

I have been working on an Azure PaaS project recently and tried to leverage the VSTS DevOps CI/CD features to automate the build and deployment process. Thanks to my colleague Sean Perera, who helped me and provided a deep dive into the VSTS CI/CD process.

I am writing this blog to share the whole workflow:

  1. Create a new project in VSTS and create a Dev branch based on the master branch

1

  2. Establish the connection from local Visual Studio to the VSTS project

2

  3. Push the web app code to the VSTS dev branch

3

3.1

  4. Set up the endpoint connections between VSTS and Azure:
  • Log in to the Azure tenant and create a new app registration for VSTS.

4.1

  • Generate the service principal key and keep it safe

4.2

  • In the VSTS online portal, go to Settings -> Services -> create a new service endpoint; the service principal client ID is the Azure application ID, and the service principal key is the Azure service principal key.

4.3

  • Click "Verify connection" to make sure the connection test passes
  5. Create a build definition:
  • Define the build tasks: select the repo source, define the Azure subscription, the destination to push to, and all the app settings and parameter definitions

5.1

  • Go to Triggers and enable the CI settings:

5.2

  6. Create a new release definition
  • Define the release pipeline: where the source build comes from and what the target environment is. In my case, I am using VSTS to push code to an Azure PaaS environment.

6.1

  • Enable the Continuous Deployment settings

6.2

  • Define the release tasks: in my case I use the built-in Azure App Service deployment task and then swap from the staging slot to the production environment

6.3

6.4

  7. Automated build and release process

Once I make a change to my project code in my local Visual Studio environment, I commit it and push it up to the VSTS dev branch. VSTS automatically starts the build and release process, completes the release and pushes to the Azure web app environment.

7.1

7.2

  8. Done. I tested my code in the dev and prod environments and it looks good. The VSTS DevOps features really speed up the whole deployment process.

 

Azure AD Identity and Access Management & Features

I've been using Azure AD identity for quite a while now. I thought it would be good to share a summary of the Azure AD identity features and gather some feedback.

Azure AD Identity

Azure Active Directory: A comprehensive identity and access management cloud solution for your employees, partners, and customers. It combines directory services, advanced identity governance, application access management, and a rich standards-based platform for developers.

Identity and access management licensing options: Azure Active Directory Premium P2 (E5) and P1 (E3)

“Identity as the Foundation of Enterprise Mobility”

Identity and access management

Protect at the front door: innovative, risk-based conditional access; protect your data against user mistakes; detect attacks before they cause damage.

Identity and access management in the cloud:

  • 1000s of apps, 1 identity: Provide one persona to the workforce for SSO to 1000s of cloud and on-premises apps.
  • Enable business without borders: Stay productive with universal access to every app and collaboration capability.
  • Manage access at scale: Manage identities and access at scale in the cloud and on-premises, with advanced user lifecycle management and identity monitoring tools
  • Cloud-powered protection: Ensure user and admin accountability with better security and governance

Azure AD portal:

Configure users & groups, configure SaaS application identities, configure on-premises applications with Application Proxy, manage licenses, configure password reset, password reset notifications and password reset authentication methods, set company branding, and control whether users can register/consent to applications, whether users can invite external contacts, whether guests can invite external contacts, whether users can register devices with Azure AD, whether MFA is required, and whether to use pass-through authentication or federated authentication.

Azure AD application integration:

3 types of applications integration:

  • LOB applications: using Azure AD for authentication
  • SaaS applications: configure SSO
  • Azure AD Application Proxy: we can publish on-premises applications to the internet through Azure AD Application Proxy.

Inbound/outbound user provisioning to SaaS apps

User experience with integrated apps: the Access Panel at https://myapps.microsoft.com. Custom branding loads by appending your organization's domain, e.g. https://myapps.microsoft.com/company_domain_name. From MyApps, users can change their password, edit password reset settings, manage MFA, view account details, view and launch apps, and self-manage groups. Admins can configure apps to be self-service, so users can add apps by themselves.

Authentication (Front End & Back End) & Reporting (reporting access & alerts, reporting API, MFA)

Front End Authentication

Pass-through authentication:

  • Traffic to the backend app is NOT authenticated in Azure AD
  • Useful for NDES, CRLs, etc.
  • Still has the benefit of not exposing backend apps to HTTP-based attacks

Pre-authentication:

  • Users must authenticate to AAD to access the backend app
  • Allows the ability to plug into the AAD control plane
  • Can also be extended to provide true SSO to the backend app

Back End Authentication

Pass-through authentication:

  • Does not try to authenticate to the backend
  • Useful with forms-based applications
  • Auth headers are returned to the client
  • Can be used with front-end pre-authentication

Kerberos/IWA:

  • Must use pre-authentication on the front end
  • Allows for an SSO experience from AAD to the app
  • Support for SPNego (i.e. non-AD Kerberos)

 

Azure AD Connect Health

Monitor and report on AD FS, Azure AD Connect sync and AD DS. Advanced logs for configuration troubleshooting.

Azure AD Identity Protection (Azure AD Premium P2)

  • The Identity Protection dashboard is a consolidated view for examining suspicious user activities and configuration vulnerabilities
  • Remediation recommendations
  • Risk Severity calculation
  • Risk-based policies for protection for future threats

If a user is at risk, we can either block the user or trigger MFA automatically.

Identity Protection can help identify spoofing attacks, leaked credentials, suspicious sign-in activities, infected devices and configuration vulnerabilities. For example, when a user signs in from an unfamiliar location, we can trigger a password reset, use the user risk condition to allow access to corporate resources only after a password change, or block access straight away. Alternatively, we can configure the alert to send an approval request to an admin.

Identity protection risk types and reports generated:

Azure AD Privileged Identity Management

For example, say I am on leave for two days and I want a colleague to be a global admin for only those two days. If I come back from leave and forget to remove the global admin permission from that colleague, he will still be a global admin, which puts the company at risk, because either global admin account could potentially be compromised.

Just-in-time administrative access: we can use this to grant "global admin" access for only those two days.

Securing privileged access: just-in-time administration

  • Assume breach of existing AD forests may have occurred
  • Provide privileged access through a workflow
  • Access is limited in time and audited
  • Administrative account not used when reading mail/etc.

Result = limited in time & capability

 

 

 

Resolving unable to access App published with Barracuda WAF over Azure Express Route

Recently, one of our customers reported they couldn't access any of the UAT apps from their Melbourne office, while it worked fine from other offices. When they tried to access the UAT app domains, they got the error below: "The request service is temporarily unavailable. It is either overloaded or under maintenance. Please try later."

WAF error

Due to the UAT environment IP restrictions on the WAF, it is normal for me to get this error message, because our Kloud office's public IPs are not in the WAFs' whitelist. The error proved that the web traffic did hit the WAFs. Pinging the URL hostname returned the correct IP without any DNS problems, which means the web traffic went to the correct WAF farm, considering the customer has a couple of other WAF farms in other countries. So we could focus the troubleshooting on the AU WAFs.

I pulled out all the WAF access logs and planned to go through them to verify whether the web traffic was hitting the AU WAFs or going somewhere else. I did a log search based on the public IPs provided by the customer; no results were returned for the last 7 days.

Search Result 1

Interesting. Did it mean no traffic from the Melbourne office came in? I did another search based on my own public IPs, and it clearly returned a couple of access logs related to my testing: correct time, correct client IP, correct WAF domain hostname, method GET, status 503, which is correct because my office IP is restricted.

Search Result 2

Since the customer mentioned all the other offices had no problem accessing the UAT app environment, I asked them to provide one public IP from another office. We tested again and verified that people in the Indian office could successfully open the web app, and I could see their web traffic appear in the WAF logs as well. I believed that when Melbourne staff tried to browse the web app, the traffic should go to the same WAF farm, because the DNS hostname resolved to the same IP whether in Melbourne or in India.

The question was: what exactly happened, and what was the root cause? :/

In order to capture another good example, I noted down the time and asked the customer to browse the website again. This time I did an access log search based on the time instead of the Melbourne public IPs. A couple of results were returned with some unknown IPs.

Search result 3

I googled the unknown IPs, and it turned out they are Microsoft Australian data centre IPs. Now I suspected there was a routing or NAT issue in the customer's network. I contacted the customer and provided the unknown IPs; the customer did a bit of investigation and advised that those unknown IPs are the public IPs of their Azure ExpressRoute interfaces. It all made sense now: because the customer hadn't whitelisted their new Azure public IPs, when the web traffic came from those unknown source IPs (the Azure public IPs), the WAF didn't recognise them and blocked them, just like mine. Once I added the new Azure IPs to the app's whitelist, all the access issues were resolved.

How I prepared for the 70-533 Azure Exam

I have been an IT professional for over 7 years now. During this time, I have seen and experienced many significant changes in the IT infrastructure field. Personally, I started as a network engineer at a software company and then moved to an MSP as an infrastructure engineer, looking after servers, firewalls, networks, application deployments and more for medium and large finance institutions, before I joined Kloud Solutions and started to evolve by learning Microsoft Azure. Obviously, cloud technology is the most significant shift that the IT industry is experiencing today. As a Microsoft guy, it makes sense for me to start using Azure as a platform to provide solutions to our customers.

Back in August, I had a chat with Gary Needham, who is one of the Azure guys at work, and figured it was time to develop my Azure skills via exam 70-533 Implementing Microsoft Azure Infrastructure Solutions. As an ops guy with strong hands-on experience, I decided I learn best by doing the labs one by one. With a full-time job, I had limited time to study, so I wanted to make sure that whatever labs I was doing, they mapped to the exam objectives. As such, I wrote down my goal: obtain the Azure certification by the end of October. I had some previous experience with the Azure environment from doing Azure ExpressRoute, Azure Backup and Azure Site Recovery work, so I was confident I could achieve that within three months.

After quickly getting my subscription ready, I went into the Azure ARM portal and started deploying resource groups, storage, virtual networks, subnets and so on. I was getting really excited about the Azure technology. There was lots of fun, and this enjoyment made the journey an outstanding learning experience.

The Azure GitHub quick-start templates are a fantastic place to learn ARM templates. In the modern IT world, we standardize infrastructure deployment by using JSON templates (infrastructure as code), which brings enormous benefits. For me, this came somewhat naturally since I actually started IT as a coder in school. It cost me some trial and error, but I eventually figured out how to use ARM templates to deploy Azure resources. It took me a couple of weeks of lab sessions to deploy Azure resources via templates with either PowerShell or the Azure CLI, and I felt this was awesome; a minimal example is sketched below.
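As an illustration only (the exact template and parameters depend on what you pick from the quickstart repo), deploying one of these templates with AzureRM PowerShell looks roughly like this:

# Deploy an ARM quickstart template into a lab resource group (placeholder values)
New-AzureRmResourceGroup -Name "lab-rg" -Location "Australia Southeast"
New-AzureRmResourceGroupDeployment `
    -ResourceGroupName "lab-rg" `
    -TemplateUri "{raw URL of the quickstart template's azuredeploy.json}" `
    -TemplateParameterObject @{ adminUsername = "labadmin"; dnsLabelPrefix = "mylabvm001" }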

Over the next few weeks, I continued to expand my lab environment bit by bit. I like to use the Microsoft online documentation: it is up to date and provides best practices and recommendations for deployments and configurations. DSC is a very good example; Microsoft provides very good information about DSC templates, how to build them, how to use them, and how to manage version control in a large DSC environment.

As I went through the exam objectives more and more, I tried to link the labs I did with the exam objectives in my mind. Through this I could remind myself which objectives I needed to go back and review, and it eventually paid off when I took the exam because it helped me consolidate all the knowledge I had quickly learnt.

The online course I used was Udemy's 70-533 videos; I used this course to test myself from different angles and see whether I had mastered all the knowledge required for the exam.

The practice exam I used was the Microsoft Official Practice Test for 70-533. I believe my exam prep was a bit more in depth than most. I didn't try to memorize exam questions and focus only on the answers; I never rely on memorization. For the questions I got wrong, I went back to the lab and the documentation and tried to understand the whole workflow around that particular exam topic. I am a heavy Googler and I like to google the extended questions to learn more.

To prepare for the exam, I went through the entire collection of practice questions twice to see whether I was ready. In the first practice exam, I was uncertain about a lot of questions, and it gave me some objective feedback, such as the Azure technology limitations compared with on-premises: storage limitations, replication limitations, geo-location limitations, the limit on the number of storage accounts per subscription, and storage speed limitations. All of this knowledge is tested when designing Azure solutions. I followed the explanation links and researched any questions I answered incorrectly. At the end of the second try, I still had 9 questions incorrect. I noted all the wrong questions and studied hard on those. After I felt comfortable with all the questions, I booked my exam for a Wednesday morning.

On exam day, I got up at 6:30 AM, had a decent breakfast and rode the train to the test centre. It took me about an hour to finish all the questions. Eventually I passed exam 70-533 Implementing Microsoft Azure Infrastructure Solutions.

 

Trust the process:

Read the Microsoft Azure online documentation, then practice and implement it in the lab.

 

 

Hopefully this blog can help people who are planning to take the Azure exam.

Polycom VVX 310 – Unable to do blind transfer internally

I've been working with an SFB customer recently. I met some unique issues and I would like to share the experience of how I solved the problem.

Issue Description:

The customer was experiencing Polycom handsets being unable to transfer external calls to a particular internal 4-digit number range xxxx. All the agent phones are VVX 310s and agents sign in via extension & PIN. When a call transfer failed, what the callers heard was the placid recorded female voice: "we're sorry, your call cannot be completed as dialled. Please check the number and dial again". The interesting thing is that the failed transfer scenario only happened with blind transfers, while supervised transfers worked perfectly. The Polycom handsets could successfully make direct calls and receive calls. Well, this kind of didn't make any sense to me.

Investigations:

Firstly, I went through all the SFB dial plans and gateway routing and transformation rules. The number range was correctly configured and nothing was different from the other ranges.

I upgraded the firmware on one of the Polycom handsets to the latest SFB-enabled version, 5.5. It didn't make any difference; the result was still the same.

I was wondering whether the digit map settings in the configuration file might be causing the issue, so I logged into the web interface -> Settings -> SIP -> Digitmap, removed the regex in the Digitmap field, and rebooted the phone. Still, when doing a blind transfer to the internal number range xxxx, no luck; it failed again. :/

An interesting thing happened when I tested with the SFB client: I logged in as two users within the number range xxxx and did the same blind transfer, and it worked! When I used the SFB client to transfer to the Polycom handset, it also worked. But it stopped working when I did the transfer from the Polycom handset.

Since I could hear the telco's recorded voice, I thought it would be good to run a trace from the Sonus end first to see why the transfer failed. In the live trace, I could see the INVITE number was not what I expected: something went wrong during number normalization and the extension was given the wrong prefix. Where did the wrong prefix come from?

I logged in to the SFB Control Panel to re-check the voice routing. It showed nothing wrong with the user dial plan or the user normalization rules, yet the Control Panel testing tool gave a different prefix from the one in the live trace. Where could it possibly be going wrong? Ext xxx1 is mapped to SFB user 1: when I log in to my SFB client as user 1, everything works, but when I sign in to the Polycom phone as ext xxx1, the blind transfer fails when transferring to the problematic number range xxxx.

All of a sudden, I noticed the global dial plan had a strange prefix configured that matched the prefix (+613868) appearing in the live trace. So I believed that, for some reason, the Polycom handsets use the global dial plan when doing a blind transfer; this may be a bug. The handsets use the global dial plan during blind transfers while the SFB client uses the user's dial plan, which explains the behaviour difference between the handsets and the desktop clients.

Solution Summary:

After I created a new entry for the number range xxxx in the global dial plan and rebooted the Polycom phone, blind transfers started working again. The results all looked correct, and I verified the issue was resolved.

 

 

Hopefully this can help someone else who has similar issues.

 

 

Resolving presence not up-to-date & unable to dial in to conferences via PSTN issues in Lync 2013

I've been working with an SFB customer recently. I met some unique issues and I would like to share the experience of how I solved the problem.

Issue Description: After SQL patching on the Lync servers, all users' presence was not up to date and people were unable to dial in to scheduled conferences.

Investigation:

When I used the Lync shell on one FE server to move a test user from SBA pool A to pool B, and then checked the user's pool info on SBA pool A, the result still showed the test user under pool A. This indicated that either the FE Lync databases were not syncing with each other properly or there was database corruption.

I checked all the Lync FE servers and all the Lync services were running; everything looked good. I re-tested the conference scenarios: the PSTN conference bridge number was unavailable, while people could still make incoming/outgoing calls.

I decided to go back and check the logs on all the Lync FE servers. On one of them I noticed "Warning: Revocation status unknown. Cannot contact the revocation server specified in certificate". Weird; did this mean there was something wrong with the cert on this FE server? I didn't see this error on the other FE server, and both FE servers are supposed to use the same certs, so it wasn't a certificate issue; something was wrong with that FE server itself.

Next, I tried turning off all the Lync services on the problematic FE server to see if it made any difference. An interesting thing happened: once I did that, all users' presence became up to date and the PSTN conference bridge number became available again; I could dial in from my mobile after that. This verified it was a server issue.

Root Cause:

What caused the cert error on the FE server, and which cert was it using? I manually relaunched the deployment wizard, wanting to compare the certs between the two FE servers. Then I noticed that the Lync server configuration was not up to date at the database store level. This was a surprise to me, because there had been no change to the topology, so I had never thought about re-running the deployment wizard after the FE SQL patching. On the other FE server, which was working as expected, I could see green checks on each step of the deployment wizard. Bingo: I believed all the inconsistent behaviour on the user side was related to the inconsistent SQL databases across the two FE servers.

Solution:

Eventually, after the change request was approved by the CAB, re-running the deployment wizard to sync the SQL store and re-assigning the certificates to the Lync services resolved the issue.

Hopefully this can help someone else who has similar issues.
