Quis Custodiet Ipsos Custodes?

In other words: who checks the checker? Or who watches the watchmen? So what checks whether SCOM is still working when its database connection is lost? Nothing?

We could set up an additional monitoring tool such as Nagios or any other free Linux/Unix-based monitoring tool, but the easiest way is to create an Event ID trigger that sends an email.

Strategy

There are two Event IDs that should definitely be monitored. We are going to monitor these events with a PowerShell script that sends an email to the appropriate groups and specialists.

  • DataAccessLayer – Event ID 26308 – This event is generated the very instant SCOM loses connectivity.
Query notification processing failed due to a sql exception. 
System.Data.SqlClient.SqlException (0x80131904): A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.)
   at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection)
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning()
   at System.Data.SqlClient.SqlCommand.InternalEndExecuteReader(IAsyncResult asyncResult, String endMethod)
   at System.Data.SqlClient.SqlCommand.EndExecuteReader(IAsyncResult asyncResult)
   at Microsoft.EnterpriseManagement.DataAccessLayer.QueryNotificationManager.HandleNotifications(Object state)
  • OpsMgr SDK Service – Event ID 26330 – This event is generated after SCOM has been unable to connect to SQL for approximately 5 minutes.
The System Center Data Access service lost database connectivity.
 Database name: OperationsManager
 Server instance name: SQLHOST\instance
 Exception message: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: SQL Network Interfaces, error: 26 - Error Locating Server/Instance Specified)
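
Before attaching any trigger, it can help to check whether these two events have already been logged on the RMS. The following is a minimal sketch, assuming SCOM writes them to the "Operations Manager" event log; adjust the log name if your installation differs.

```powershell
# Assumption: SCOM writes Event IDs 26308 and 26330 to the "Operations Manager" log.
$filter = @{
    LogName = 'Operations Manager'
    Id      = 26308, 26330
}

# Get-WinEvent raises an error when nothing matches, so that case is silenced here.
Get-WinEvent -FilterHashtable $filter -MaxEvents 20 -ErrorAction SilentlyContinue |
    Select-Object TimeCreated, Id, ProviderName, Message |
    Format-List
```

An empty result simply means neither event has been recorded in that log yet.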

Trigger

  • Download Send_Html_Email.ps1 and save it to C:\Scripts on the RMS

  • Open the .ps1 file, fill in the correct information, and save one copy for EACH Event ID (EventID26308.ps1 and EventID26330.ps1), like here:

###########    Script    ################
# Save as C:\Scripts\EventID26308.ps1

# Mail settings: replace the placeholders with your own relay and addresses
$smtp = "internalrelay.domain.local"
$to = "Name <emailaddress>"
$from = "SCOM2012 <scom2012@domain.local>"
$subject = "SCOM2012 Database connectivity lost"

# Build the HTML body of the alert
$body = "<font color=red><b>Attention:</b></font><br><br>"
$body += "The SCOM2012 Root Management Server (RMS) has lost its connection with the SCOM Database.<br>"
$body += "This alarm is triggered by <font color=blue><b>Event ID 26308</b></font>. When this event is generated <b>SCOM will NOT be working</b>.<br><br>"
$body += "How to solve:<br><br>"
$body += "1) Please check servername\instance for any errors<br>"
$body += "2) Reboot RMS.domain.local if there's nothing wrong with the SQL cluster<br>"

# Send the email with Send-MailMessage
Send-MailMessage -SmtpServer $smtp -To $to -From $from -Subject $subject -Body $body -BodyAsHtml -Priority High

########### End of Script ################

###########    Script    ################
# Save as C:\Scripts\EventID26330.ps1

# Mail settings: replace the placeholders with your own relay and addresses
$smtp = "internalrelay.domain.local"
$to = "Name <emailaddress>"
$from = "SCOM2012 <scom2012@domain.local>"
$subject = "SCOM2012 is unable to connect to SQL"

# Build the HTML body of the alert
$body = "<font color=red><b>Attention:</b></font><br><br>"
$body += "The SCOM2012 Root Management Server (RMS) has been unable to connect to SQL for approximately 5 minutes.<br>"
$body += "This alarm is triggered by <font color=blue><b>Event ID 26330</b></font>. When this event is generated <b>SCOM will most likely be having issues</b>.<br><br>"
$body += "How to solve:<br><br>"
$body += "1) Please check servername\instance for any errors<br>"
$body += "2) Reboot RMS.domain.local if there's nothing wrong with the SQL cluster<br>"

# Send the email with Send-MailMessage
Send-MailMessage -SmtpServer $smtp -To $to -From $from -Subject $subject -Body $body -BodyAsHtml -Priority High

########### End of Script ################
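  • Run Set-ExecutionPolicy Unrestricted, then open the Zone.Identifier stream of each script and set ZoneId to “0” so the downloaded file is no longer treated as blocked:
notepad C:\Scripts\EventID26308.ps1:Zone.Identifier
notepad C:\Scripts\EventID26330.ps1:Zone.Identifier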
  • Search for Event IDs 26330 and 26308 in Event Viewer on the RMS

  • Attach a task to the Event ID (right-click the event):

Create basic task

Start a program

Fill in the following information:

Program: C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe
Argument: -command C:\Scripts\EventID26330.ps1
Start in: C:\Scripts\

Next – Run with highest privileges – Run whether user is logged on or not – Finish
(and repeat this for the other Event ID too)
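
The manual steps above (unblocking the downloaded scripts and attaching a task to each event) could also be scripted from an elevated PowerShell prompt. This is only a sketch under a few assumptions: the events are logged in the "Operations Manager" channel, the task names are made up for illustration, and the tasks run as SYSTEM, which only works if your mail relay accepts mail from the RMS computer account. Note that Unblock-File (PowerShell 3.0 or later) removes the Zone.Identifier stream rather than editing it by hand.

```powershell
# Assumptions: scripts are saved as C:\Scripts\EventID<id>.ps1, events are logged in
# the "Operations Manager" channel, and the task names below are hypothetical.
$powershell = 'C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe'

foreach ($id in 26308, 26330) {
    $script = "C:\Scripts\EventID$id.ps1"

    # Remove the Zone.Identifier stream that marks the file as downloaded
    Unblock-File -Path $script

    # Create a task that fires whenever the given Event ID is logged
    schtasks.exe /Create /F `
        /TN "SCOM DB alert $id" `
        /TR "$powershell -command $script" `
        /SC ONEVENT `
        /EC "Operations Manager" `
        /MO "*[System[(EventID=$id)]]" `
        /RU SYSTEM `
        /RL HIGHEST
}
```

Afterwards the two tasks should appear in Task Scheduler, where the trigger and action can be compared with the values entered manually above.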
