OpsMgr 2012 R2 – PSScript: Automate Config Failover Gateway for SCOM agents

A gateway is an easy way to monitor servers in an untrusted domain: with a gateway, SCOM can monitor servers in a domain it has no trust with. Normally you plan two gateways per untrusted domain, so that monitoring of that domain stays highly available. Unfortunately, when you discover a new server in that untrusted domain, the second gateway is not configured as failover automatically. The agent only has a connection to the primary gateway, the one you specified during installation or discovery.

To set the failover you have to use PowerShell. With a few cmdlets you can set the primary and the failover gateway per agent. I’m not a fan of manual actions, so I made one PowerShell script for a monitor and one script that sets the failover on SCOM agents. The reason for automating this: when you add a new server to SCOM, it is easy to forget the manual step of setting the failover, and that is bad for the availability of monitoring when the primary gateway goes down for a restart or any other reason. So it was time to build a script to automate this process.

I have basic PowerShell skills. If you have a better or more efficient idea, or any other comment, please let me know. I appreciate it.

I have to skip some standard steps, otherwise this blog post would get too long.

What I did:

  • Made a new monitor (a rule is also possible). Unfortunately, you have to use the Authoring Console or MPAuthor to make a PowerShell-based monitor or rule; the SCOM console itself cannot do it.
  • Made a PowerShell script to set the failover for the SCOM agent.
  • Made a notification based on the alert from the new monitor. A rule is also possible, but I will only explain the monitor in this blog post.
  • The notification starts the script.
  • The script updates the alert, closes it afterwards and resets the monitor.

I worked with the System Center Authoring Console to build the monitor. That’s why this blog post is based on the Authoring Console only.

Before you build a monitor you have to create a probe action. This probe contains the Microsoft.Windows.PowerShellPropertyBagProbe module.

Click the Edit button to edit the probe. You have to choose which editor you want to use. (You have to click Edit twice to edit the XML file.)

Then add the PowerShell script to the XML file between <ScriptBody> and </ScriptBody>. We are not using Arguments (that is for VBScript) or Parameters. Also put <![CDATA[ at the beginning of the PowerShell script and ]]> at the end. This is needed if the script contains characters that are illegal in XML element content, such as & and <.
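
To illustrate why the CDATA wrapper matters, here is a minimal PowerShell sketch. The sample script text and the Test-XmlParses helper are made up for this example; it only shows that an XML parser rejects a raw & or < inside an element but accepts the same text inside a CDATA section.

```powershell
# Hypothetical helper: returns $true when the text parses as XML, otherwise $false.
function Test-XmlParses {
    param([string]$Text)
    $doc = New-Object System.Xml.XmlDocument
    try {
        $doc.LoadXml($Text)
        return $true
    } catch {
        return $false
    }
}

# Made-up one-line script body that contains both & and <.
$body = 'if ($a -and $b) { "1 < 2 & 3" }'

$raw     = "<ScriptBody>$body</ScriptBody>"
$wrapped = "<ScriptBody><![CDATA[$body]]></ScriptBody>"

Test-XmlParses $raw       # raw & and < are not allowed in element content
Test-XmlParses $wrapped   # the same text inside CDATA is accepted
```

The only sequence a script cannot contain inside the wrapper is ]]> itself, because that ends the CDATA section.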

The complete probe configuration looks like this:

<Configuration p1:noNamespaceSchemaLocation="C:\Users\albert.neef\AppData\Local\Temp\Script - Microsoft.Windows.PowerShellPropertyBagProbe.xsd" xmlns:p1="http://www.w3.org/2001/XMLSchema-instance">
  <ScriptName>GetFailoverConfig.ps1</ScriptName>
  <Arguments />
  <ScriptBody> <![CDATA[
#SCOM settings
$api = New-Object -ComObject "MOM.ScriptAPI"
$bag = $api.CreatePropertyBag()

#Clear old errors, so the check below only reacts to a failed read of the XML file.
$error.Clear()

$api.LogScriptEvent("GetFailoverConfig.ps1", 451, 0, "Script is reading the XML file")

#The folder SystemCenterTest is specific to this environment.
[xml]$XML = Get-Content "C:\Program Files\Microsoft Monitoring Agent\Agent\Health Service State\Connector Configuration Cache\SystemCenterTest\OpsMgrConnector.Config.xml"
$Parents = $XML.Message.State.Parents.Added.Item

#No XML file: report GOOD, because this server is probably a gateway itself.
if($error) {
    $bag.AddValue("Result", "GOOD")
    $bag.AddValue("Info", "No OpsMgrConnector.Config.xml found. Maybe this is a gateway")
    $bag
    $api.LogScriptEvent("GetFailoverConfig.ps1", 451, 0, "No OpsMgrConnector.Config.xml found. Maybe this is a gateway")
    exit
}

#More than one parent means a failover gateway is configured. @() makes sure a single item is counted as 1.
if(@($Parents).Count -gt 1) {
    $bag.AddValue("Result", "GOOD")
    $bag.AddValue("Info", "Failover gateway has been set")
}else{
    $bag.AddValue("Result", "ERROR")
    $bag.AddValue("Info", "No failover gateway found in OpsMgrConnector.Config.xml")
}

$bag
$api.LogScriptEvent("GetFailoverConfig.ps1", 451, 0, "Script is done with reading")
]]> </ScriptBody>
  <TimeoutSeconds>60</TimeoutSeconds>
</Configuration>

The script reads an XML file from the SCOM agent. This file is located in the Health Service State folder and holds the information about the agent’s connection to its gateways. If the agent is connected to more than one gateway, you will see more than one gateway in the XML.

An example of the OpsMgrConnector.Config.xml file:

[Screenshot: XML_Config_SCOM]
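
As a rough illustration of what the monitor script looks for, the sketch below counts the parent entries in a made-up XML fragment. Only the element path Message/State/Parents/Added/Item comes from the script above; the Name attribute, the server names and the Get-ParentCount helper are assumptions for this example, and the real file contains more than this.

```powershell
# Hypothetical helper: counts the Item elements under Message/State/Parents/Added.
function Get-ParentCount {
    param([string]$XmlText)
    $doc = New-Object System.Xml.XmlDocument
    $doc.LoadXml($XmlText)
    # SelectNodes returns a node list, so Count is also correct for zero or one match.
    return $doc.SelectNodes('/Message/State/Parents/Added/Item').Count
}

# Made-up fragments; the Name attribute and the server names are illustrative only.
$oneGateway = @'
<Message><State><Parents><Added>
  <Item Name="gw01.contoso.local" />
</Added></Parents></State></Message>
'@

$twoGateways = @'
<Message><State><Parents><Added>
  <Item Name="gw01.contoso.local" />
  <Item Name="gw02.contoso.local" />
</Added></Parents></State></Message>
'@

foreach ($sample in $oneGateway, $twoGateways) {
    $count = Get-ParentCount -XmlText $sample
    if ($count -gt 1) {
        "GOOD: $count gateways found"
    } else {
        "ERROR: no failover gateway found"
    }
}
```

The sketch uses SelectNodes instead of property access on the XML object because the node list always has a reliable Count, also when there is no Item at all.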

It’s a very simple and basic PowerShell script. It gets the gateway information and counts how many gateways are in the XML. If there is only one, the result is ERROR; if there is more than one, the result is GOOD. Save this and close the editor. You should see that the text boxes are updated with the information from the editor.

Click Apply and OK, or only OK. Next, you have to add a DataSource to the management pack.

Add the newly created probe and a SimpleScheduler to it. The scheduler is needed to run the script every XX seconds. I set the scheduler to 60 seconds for testing the script, but this is temporary; once a day is fine for monitoring. First you have to promote IntervalSeconds so that it can be overridden. With an override you can change the interval at which the script runs. Click the ‘triangle’ icon/button and choose Promote…

SyncTime must be empty. Click OK and go to the Configuration Schema tab.

Here you have to change the type of IntervalSeconds. The default is String, but IntervalSeconds is an integer, so change it to Integer. Go to the next tab, Overridable Parameters.

This tab is empty and you should add IntervalSeconds as an overridable parameter. Click Add and choose $Config/IntervalSeconds$. You have to give it a name; I always use the same name as in the Configuration Schema. Change the Configuration Element to Integer. Click Apply and OK. The next step is to make a MonitorType. A MonitorType is only needed for a monitor, not for a rule.

You have to add three things: the DataSource you created and two ExpressionFilters. With the ExpressionFilters you link the healthy and unhealthy states to the results that the script returns in the property bag. This is how the monitor knows which result is good and which is bad.

Like this:

The parameter name is Property[@Name='Result']. Result is the name of the value that the PowerShell script adds to the property bag. Use the same setting for the ERROR result. The parameter name is case sensitive.
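
To see what this XPath expression selects and why case matters, here is a small PowerShell sketch. The DataItem below is a simplified, hand-written stand-in for the property bag output, not a copy of what SCOM actually produces, and Get-BagValue is a hypothetical helper.

```powershell
# Simplified stand-in for a property bag data item (assumption, trimmed for the example).
$dataItem = @'
<DataItem type="System.PropertyBagData">
  <Property Name="Result">ERROR</Property>
  <Property Name="Info">No failover gateway found in OpsMgrConnector.Config.xml</Property>
</DataItem>
'@

# Hypothetical helper: evaluates an XPath expression relative to the DataItem element.
function Get-BagValue {
    param([string]$XmlText, [string]$XPath)
    $doc = New-Object System.Xml.XmlDocument
    $doc.LoadXml($XmlText)
    $node = $doc.DocumentElement.SelectSingleNode($XPath)
    if ($null -ne $node) {
        return $node.InnerText
    }
    return $null
}

Get-BagValue $dataItem "Property[@Name='Result']"   # selects the Result property
Get-BagValue $dataItem "Property[@Name='result']"   # selects nothing: the name is compared case-sensitively
```

So the name used in the ExpressionFilter has to match the name passed to AddValue in the script exactly, including upper and lower case.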

This is the DataSource. You have to promote IntervalSeconds here as well. Repeat the steps for Configuration Schema and Overridable Parameters too. Click Apply and OK. The next step is to create a monitor.

We will use the Windows Server class, so the target is Microsoft.Windows.Server.Computer. The parent monitor is System.Health.ConfigurationState. Go to the Configuration tab and browse to the MonitorType you created. You have to edit IntervalSeconds for the monitor.

Health and Alerting must be configured as well. Choose which unhealthy state (critical or warning) you want to use and add some text to the alert. Set the alert to close automatically when the object is healthy again.

Best practice is to disable the monitor by default and use overrides to enable it for specific servers; this makes error handling a lot easier. So change Enabled from true to false. Save the management pack and import it into SCOM. I skip the override step; this is a standard step in the SCOM console.

This is the PowerShell script that sets the failover on the SCOM agent.

Param($ComputerName, $alertid)

#GetGatewayServers searches for the gateways in a specific untrusted domain.
Function GetGatewayServers {
    Param($Node, $startTime)

    #Take the domain label from the FQDN: host.domain.local or host.sub.domain.local
    $node = $node.split(".")
    if($node.Count -eq 3) {
        $domain = $node[1]
    }elseif($node.Count -eq 4) {
        $domain = $node[2]
    }

    $Gateway = Get-SCOMGatewayManagementServer -Name "*.$domain.local"

    #Logging
    $GatewayNames = $Gateway.DisplayName
    "$StartTime : The gateways in $domain are $GatewayNames" | Out-File "Failover.log" -Append

    #Make the result available to the body of the script.
    $script:Gateway = $Gateway
}

#SetFailoverForAgent sets the failover gateway on the agent. The primary gateway is left as it is.
Function SetFailoverForAgent {
    Param($ComputerName, $IsFailOver, $startTime, $alertid)

    $Agent = Get-SCOMAgent | where {$_.DisplayName -eq $ComputerName}
    $Failover = Get-SCOMManagementServer -Name $IsFailOver

    Set-SCOMParentManagementServer -Agent $Agent -FailoverServer $Failover

    #Logging
    $FailoverName = $Failover.DisplayName
    "$StartTime : The failover has been set. The failover is: $FailoverName." | Out-File "Failover.log" -Append

    #This is for logging into the alert in SCOM.
    Get-SCOMAlert -Id "$alertid" | Set-SCOMAlert -Comment "Failover has been set. The Failover is $FailoverName"
}

#GetIsPrimary gets the primary gateway. The body of the script uses it later to filter out the primary, so that only the failover gateway remains.
Function GetIsPrimary {
    Param($ComputerName, $startTime)

    $Agent = Get-SCOMAgent | where {$_.DisplayName -eq $ComputerName}
    $isPrimary = Get-SCOMParentManagementServer -Agent $Agent

    #Logging
    $PrimaryName = $isPrimary.DisplayName
    "$StartTime : Function GetIsPrimary has found the primary gatewayserver: $PrimaryName" | Out-File "Failover.log" -Append

    #Make the result available to the body of the script.
    $script:isPrimary = $isPrimary
}

#Reset the monitor that raised the alert.
Function ResetMonitor {
    Param($AlertId)

    $Alert = Get-SCOMAlert -Id $AlertId
    $Monitor = Get-SCOMMonitor -Id $Alert.MonitoringRuleId

    Get-SCOMClassInstance -Id $Alert.MonitoringObjectId | foreach { $_.ResetMonitoringState($Monitor) }

    #Logging
    "$StartTime : Reset monitor: $Monitor" | Out-File "Failover.log" -Append
}

#Date and time for the logfile.
$startTime = [DateTime]::Now

Import-Module OperationsManager

#If $ComputerName is empty, stop the script and log this in the logfile.
if(!$ComputerName) {
    "$StartTime : ERROR No ComputerName: $ComputerName" | Out-File "Failover.log" -Append
    exit
}

#Logging
"$StartTime : Starting for Agent $ComputerName" | Out-File "Failover.log" -Append

#Logging into the alert in SCOM
Get-SCOMAlert -Id "$alertid" | Set-SCOMAlert -Comment "Starting script SetFailoverOnAgent.ps1"

#Get the gateways from the domain where the agent is located.
GetGatewayServers -Node $ComputerName -StartTime $startTime

#Get the primary gateway server.
GetIsPrimary -ComputerName $ComputerName -StartTime $startTime

#Get the failover gateway name: the gateway in the domain that is not the primary.
foreach($GWnode in $Gateway) {
    if($IsPrimary.DisplayName -ne $GWNode.DisplayName) {
        $isFailover = $GWNode.DisplayName
    }
}

#Set failover for the agent
SetFailoverForAgent -ComputerName $ComputerName -IsFailover $IsFailOver -startTime $StartTime -alertid $alertid

#Logging
"$StartTime : Done." | Out-File "Failover.log" -Append

#This is for logging into the alert in SCOM.
Get-SCOMAlert -Id "$alertid" | Set-SCOMAlert -Comment "Script SetFailoverOnAgent.ps1 has finished"
Get-SCOMAlert -Id "$alertid" | Set-SCOMAlert -ResolutionState 255 -Comment "Closed by SetFailoverOnAgent.ps1 Script"

#Call function ResetMonitor for resetting the monitor.
"$StartTime : Reset the monitor" | Out-File "Failover.log" -Append
ResetMonitor -AlertId $alertid

Save this script on all SCOM management servers.
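
The two pieces of plain logic in this script, taking the domain label from the agent name and picking the gateway that is not the primary, can be tried without a SCOM connection. The sketch below mirrors that logic in two hypothetical helper functions (Get-DomainLabel and Select-FailoverName); the host and gateway names are made up.

```powershell
# Mirrors the split logic in GetGatewayServers: 3 labels -> index 1, 4 labels -> index 2.
function Get-DomainLabel {
    param([string]$Fqdn)
    $labels = $Fqdn.Split('.')
    if ($labels.Count -eq 3) {
        return $labels[1]
    } elseif ($labels.Count -eq 4) {
        return $labels[2]
    }
    return $null   # any other name shape is not handled by the script
}

# Mirrors the foreach loop in the script body: the last gateway that is not the primary wins.
function Select-FailoverName {
    param([string[]]$GatewayNames, [string]$PrimaryName)
    $failover = $null
    foreach ($name in $GatewayNames) {
        if ($name -ne $PrimaryName) {
            $failover = $name
        }
    }
    return $failover
}

Get-DomainLabel 'srv01.contoso.local'
Get-DomainLabel 'srv01.sub.contoso.local'
Select-FailoverName -GatewayNames 'gw01.contoso.local', 'gw02.contoso.local' -PrimaryName 'gw01.contoso.local'
```

Both example host names give the same domain label, so both would match gateways named *.contoso.local. A name with two labels, or with more than four, leaves the domain empty in the script, so agents with such names would need an extra branch.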

The next step is to make a new notification. Go to the SCOM console and make a new channel of the type Command. Give the channel a name and click Next.

You have to enter the path to powershell.exe and the path where the script is located. As parameters, add $Data/Context/DataItem/ManagedEntityDisplayName$ and $Data/Context/DataItem/AlertId$. The startup folder is the same as the path to the script. Then add a subscriber and a subscription. The subscription must point to the alert of the monitor you created.

To check, look in the Operations Manager event log (eventvwr) on a server that has the override enabled. The monitor logs event ID 451 there. If you see this event ID, the monitor is working properly and will report to SCOM when it does not find a second (failover) gateway.

You will then get an alert, and that alert triggers the subscription that is linked to the PowerShell script. The script writes a log file in the startup folder. The whole process is also logged in the history of the alert itself. Run the following PowerShell cmdlet to check whether the agent has a failover gateway configured. This way you know that the script ran successfully.

Get-SCOMParentManagementServer -Agent (Get-SCOMAgent | where {$_.DisplayName -eq "YOURHOSTNAMESERVER"})

If you have any questions, let me know.

Thanks for reading!