Tuesday, October 10, 2017

Using a LINUX machine for receiving SNMP traps

Linux machines normally offer better performance for receiving SNMP traps than Windows machines, so if you need to receive traps, the best approach might be to use a dedicated Linux machine.

This Linux machine is in turn monitored by SCOM, via the SCOM agent for Linux.

We will use this agent to generate the alerts from SNMP traps:


Steps:

1- Preparing the LINUX system

Ensure that the NET-SNMP package is installed.

Edit the file /etc/snmp/snmptrapd.conf; below is my default configuration. The received traps are always logged to /var/log/snmptrapd.log:
disableAuthorization yes
authCommunity log public
logoption f /var/log/snmptrapd.log
logoption s 2

2- Create a management pack that reads alerts from files and inserts them into SCOM

This MP can check, for instance, ~monuser/alerts/new: if there are any files there, it inserts them into SCOM as alerts. The data inside these files must follow a predefined format, such as:
Alert Severity | Alert Priority | Alert name | Alert description | <Other fields you need> ...
Once a file is processed it is removed or moved to another directory.
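As an illustration, a few shell lines that produce one alert file in this format. The directory, field values and .alr extension here are made up for the demonstration (in production the directory would be ~monuser/alerts/new):

```shell
#!/bin/bash
# Write one sample alert file in the pipe-delimited format above.
# ALERT_DIR and all field values are illustrative stand-ins.
ALERT_DIR=$(mktemp -d)

severity=2                     # e.g. 0=critical, 1=warning, 2=information
priority=1
name="Disk almost full"
description="/var is at 95%"

fout="$ALERT_DIR/alert${RANDOM}.alr"
echo "$severity | $priority | $name | $description" > "$fout"
cat "$fout"                    # prints: 2 | 1 | Disk almost full | /var is at 95%
```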


3- Insert the MIB file into the LINUX system

This is done by placing the MIB file in a certain path, normally /usr/share/snmp/mibs (the same path used in the directive below).

Then edit /etc/snmp/snmpd.conf and add the following line:
# Read mib file
mibfile /usr/share/snmp/mibs/<your mib file>   

4- Develop a shell script, for instance named trap2scom.sh, for receiving trap data and creating alert files

It should be something like:
#!/bin/bash
# $1 - alert name
# stdin (from snmptrapd): hostname, IP address, then one "OID value" pair per line
read host
read ip
vars=
while read oid val
do
   # Strip non-printable characters and replace '|' (our field separator)
   val=$(echo $val | tr -dc '[:print:]')
   val=${val//\|/\:}
   oid=$(echo $oid | tr -dc '[:print:]')
   oid=${oid//\|/\:}
   vars="$vars$oid = $val
"
   # Turn the OID into a legal shell variable name
   oid=${oid//\./\_}
   oid=${oid//\:/\_}
   oid=$(echo ${oid// /\_} | tr -dc '[:alnum:]_')
   eval "Var_$oid=\"\$val\""
done
#  Create alert file with the following fields:
#     owner | origin (device) | severity | priority | name | description |!|
FOUT="alert${RANDOM}.alr"
cd ~monuser/alerts/new
echo "SNMP_Trap | $host | 0 | 1 | $1 | $vars" '|!|' > "$FOUT"
chown monuser "$FOUT"

Note: This is the default behavior, but the script also exposes the trap's bind variables as shell variables (named Var_<bind variable>), so you can check certain values before creating the alert. The severity, for instance, might depend on one of these variables:
if [ "$Var_Severity" = "critical" ]
then
   # Create a critical alert ....
fi
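To see the Var_ mechanism at work without a live trap, the parsing loop can be fed sample stdin (hostname, IP, then "OID value" pairs, the same layout snmptrapd hands to a traphandle). A self-contained sketch with a trimmed-down version of the loop; the host name, IP and OIDs are made up:

```shell
#!/bin/bash
# Feed the stdin layout a traphandle script receives into a trimmed-down
# version of the parsing loop above. All input values are made up.
parse_trap() {
  read host
  read ip
  while read oid val
  do
    oid=$(echo "$oid" | tr -dc '[:print:]')
    oid=${oid//./_}
    oid=${oid//:/_}
    oid=$(echo "${oid// /_}" | tr -dc '[:alnum:]_')
    eval "Var_$oid=\"\$val\""      # quoted so values with spaces survive
  done
}

parse_trap <<'EOF'
myhost.example.com
192.0.2.10
MY-MIB::Severity critical
MY-MIB::Message disk failure
EOF

echo "Var_MYMIB__Severity=$Var_MYMIB__Severity"   # prints: Var_MYMIB__Severity=critical
echo "Var_MYMIB__Message=$Var_MYMIB__Message"     # prints: Var_MYMIB__Message=disk failure
```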

5- Edit again the file /etc/snmp/snmptrapd.conf and add handlers for the traps you want to process

traphandle <your-mib>::<trap-name> <path-to-your-script>/trap2scom.sh MyAlert

Finally, restart the snmptrapd daemon. On Red Hat the command is:

service snmptrapd restart

Wednesday, August 9, 2017

Processing Unix logfiles

Processing Unix logfiles can be tricky because we have to keep track of all the entries already sent to SCOM and because these files usually rotate regularly.

If a logfile entry is unique, either by having a date or an ID, it is possible to record what was already processed and sent to SCOM, so we only have to report new entries.

I've created a shell script that finds and reports all the new occurrences of a string inside a log file.

In this case I'm only reporting the number of occurrences, but the script can easily be changed to report the complete lines.

#!/bin/bash

# -- Parameters:
# $1 - Pathname to logfile
# $2 - String to search: Regular expression for 'grep'

umask 022

# -- Create tmp file name

ftmp=$(echo "$1" | tr -dc '[:alnum:]')"-"$(echo "$2" | tr -dc '[:alnum:]')

> "$ftmp.send"

# -- If logfile doesn't exist or is empty, just initialize the "previously found" file

if [ ! -f "$1" ] || [ ! -s "$1" ]; then
   > "$ftmp.prev"
else
  # -- Search log for string

  grep "$2" "$1" > "$ftmp.new"

  # -- Process found lines: send only those not already sent

  if [ ! -f "$ftmp.prev" ] || [ ! -s "$ftmp.prev" ]; then
    cp "$ftmp.new" "$ftmp.send"
  else
    (while read a
     do
       rm "$ftmp.found" 2>/dev/null
       (while read b
        do
          if [ "$b" == "$a" ]; then echo "" > "$ftmp.found"; break; fi
        done) < "$ftmp.prev"
       if [ ! -f "$ftmp.found" ]; then echo "$a" >> "$ftmp.send"; fi
     done) < "$ftmp.new"
  fi

  mv "$ftmp.new" "$ftmp.prev"
fi

# -- Report the number of lines to send

wc -l "$ftmp.send"


exit 0
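For a quick self-contained check of the "send only new matches" step: the nested read loops above are equivalent to a single grep -Fxvf, which prints the lines of the new match set that are absent from the previous set. A sketch with throw-away files in a temporary directory (file names here are stand-ins, not the script's actual files):

```shell
#!/bin/bash
# Demo of the "send only new matches" idea using grep -Fxvf:
# -F fixed strings, -x whole-line match, -v invert, -f patterns from file.
work=$(mktemp -d)

printf 'ERROR disk full\nINFO ok\n' > "$work/app.log"
grep 'ERROR' "$work/app.log" > "$work/prev"        # first run: 1 match, already sent

printf 'ERROR net down\n' >> "$work/app.log"       # log grows
grep 'ERROR' "$work/app.log" > "$work/new"         # now 2 matches

# Lines in "new" that were not in "prev" are the only ones to send:
grep -Fxvf "$work/prev" "$work/new" > "$work/send"

cat "$work/send"          # prints: ERROR net down
wc -l < "$work/send"      # prints: 1
```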

Thursday, June 29, 2017

How to run a SCOM task from Orchestrator

First, we must ensure that PowerShell runs in native mode (normally 64-bit) instead of Orchestrator's internal mode.

I use this type of code:

# -- Set script's parameters, if any

$parms= @("aaaa","bbbb")

# -- Calling powershell, in native mode

$p = Powershell {

  # ... Your code... check $args for parameters

  # -- Return a result object, for instance:

  $ret = @{returnCode=0; returnMsg="OK"; aaa=$args[0]; bbb=$args[1]}
  return $ret

} -args $parms

# -- If needed: set local variables from the returned object

$a=$p.aaa
$b=$p.bbb
$rcode=$p.ReturnCode
$rmsg=$p.ReturnMsg

echo "returnCode=$rcode; returnMsg=$rmsg; aaaa=$a; bbbb=$b"


Then we can run the SCOM task in the following way:

$p = Powershell {
# ...........................................................................

# -- Logging: Activate for debugging purposes.

$PROGNAME="runtask"
$MSG_LOGFILE="C:\TEMP\$PROGNAME.log"
#$MSG_LOGFILE=$false
$MSG_SHOW=$false

# ...........................................................................

Function Msg
{
  $dt = Get-Date -f "yyyy-MM-dd HH:mm:ss"
  $msg="$dt | $args"
  if ($MSG_LOGFILE) {
    $sw = new-object system.IO.StreamWriter($MSG_LOGFILE,1)
    $sw.writeline($msg)  
    $sw.close()
  }
  if ($MSG_SHOW) { write-host $msg }
}
# ...........................................................................

# -- SCOM Parameters

# -- MS where to connect
$SCOM_MS="<Insert MS>"
 
# -- Instance: Normally the server's name where to run the task
$SCOM_INSTANCE="<Insert server name>"

# -- Task to execute
$SCOM_TASK="<Insert task name>"

# -- Task's parameters / overrides
$SCOM_TASKPARMS=@{ <... Insert parameters, if any ... > }

Msg "Starting SCOM TASK * Instance:$SCOM_INSTANCE * Task:$SCOM_TASK"

# -- Initialize

Import-module operationsmanager
New-SCOMManagementGroupConnection -ComputerName $SCOM_MS

$Instances = Get-scomclass -name Agent.Management.Class | Get-SCOMClassInstance | ?  {$_.Displayname -eq $SCOM_INSTANCE}
$Task = Get-SCOMTask -name $SCOM_TASK

# -- Execute task

$exec = Start-SCOMTask -Task $Task -Instance $Instances -Override $SCOM_TASKPARMS
$startDate = Get-Date

Msg "Task Started * Exec ID:$($exec.ID)"

# -- Wait for completion

do {
    start-sleep -s 6
    $r = Get-SCOMTaskResult -Id $exec.ID
    Msg "Task $($exec.ID) * Waiting for completion * Current status:$($r.status)"
} while ($r.status -ne "Succeeded" -and $startDate.AddMinutes(2) -gt (Get-Date))

# -- Return

Msg "Task $($exec.ID) * Result:$($r.status) * Output:$($r.Output)"

$ret = @{returnCode=0; returnMsg=$r.Output}

if($r.status -ne "Succeeded") { $ret.returnCode=1 }

return $ret
# ...........................................................................

}









Thursday, February 9, 2017

How to access HTML5 pages from SCOM 2012 console

You can define HTML views in the SCOM console, which is a very nice feature, allowing you to display further monitoring data, including, for instance, charts based on data from the Operations Manager Data Warehouse database.

The problem is that many of the modern chart libraries take advantage of HTML5:


The SCOM views, however, only allow HTML 4. I guess that, back in 2012, HTML 5 wasn't widely used.

Fortunately there is a simple way of bypassing this problem:

It is possible to create an HTML 4 page that in turn opens another HTML 5 page in a new window. In this case Internet Explorer is invoked, so the new page is displayed according to the current IE configuration; nowadays the chances are that your installed IE version displays HTML 5 correctly.

I've created the calling page in PHP but this method can be used in any language.

My scom2html5.php PHP program accepts at least 2 parameters:

The first parameter is the dimensions and location of the new HTML window.

The second parameter is the page to be called. If there are any more parameters, they will be passed on to the called page.

For instance:
scom2html5.php?900,700,342,74+mypage.html
This opens a window of 900 x 700 pixels at position 342,74 on the screen. This window will then present the page mypage.html.

The program scom2html5.php creates the new window by using javascript:

<?php
...
// -- Get script's parameters
$ARGS = explode('+',$_SERVER['QUERY_STRING']);
$ARG_NO = count($ARGS); if ($ARG_NO < 2) die('Invalid arguments');
$url = $ARGS[1];
$win = explode(',',$ARGS[0]); if (count($win) != 4) die('Invalid arguments');
$uparms = '';
for($t=2;$t<$ARG_NO;$t++) $uparms .= '+' . rawurlencode($ARGS[$t]);
if (strlen($uparms) > 1) $uparms = '?' . substr($uparms,1);
else $uparms = '';
$url .= $uparms;
$pos = "width=${win[0]},height=${win[1]},left=${win[2]},top=${win[3]},screenX=${win[2]},screenY=${win[3]}";
// -- Open new window via javascript
?>
<script type="text/javascript">
var winopen =window.open('<?php echo $url; ?>',
'HTML5Window',<?php
echo "'$pos,toolbar=yes,location=yes,directories=yes,status=yes," .
"menubar=yes,scrollbars=yes,copyhistory=yes,resizable=yes');" ?>
winopen.focus();
</script>
<?php 

...
Once you create a calling page using this strategy, in any language you prefer, you can define a view for your HTML 5 page by following these steps:

Right click the mouse to create the view:




Finally the HTML5 page is presented. In this case, the window position was defined to match the usual dimensions of a maximized console.








Wednesday, February 8, 2017

Implementing SCOM syslog Linux monitoring without using a syslog server

It is possible to configure SCOM to act as a syslog server: you instruct all the Linux machines to send their events to this server, and then define SCOM rules that generate alerts according to the syslog events (check this link).

Using this architecture, the Linux machines send syslog events to SCOM as SNMP traps; these events are then filtered in order to generate the alerts. Because the number of those traps can be huge, this sometimes causes performance issues, besides adding complexity.

If you have SCOM already monitoring Linux then you can use it to monitor syslog, and there are several ways of doing it... I'm describing mine, but I'm open to suggestions, advice and criticism.


STEPS:

1- Configuring syslog

First, I configure syslog on each Linux machine to also write messages to three new files, according to their importance: critical, warning or information. These files will later be processed by SCOM.

1.1- Creating the log directory 

As the monuser user, I create the log directory in monuser's home. Normally monuser's home is "/home/monuser":
cd /home/monuser
mkdir log 

1.2- Setting permissions for the log directory

As superuser:
chgrp root log
chmod 770 log

1.3- Configuring syslog

As superuser, I add the following lines to /etc/rsyslog.conf (or to /etc/syslog.conf if /etc/rsyslog.conf doesn't exist):
# -- SCOM definitions
#
# Log every msg with priority greater or equal to ERROR, as SCOM CRITICAL
*.err;mail.none;auth.none;authpriv.none                        /home/monuser/log/SCOM.Critical
#
# Log every msg with priority WARNING, as SCOM WARNING
*.=warn;mail.none;auth.none;authpriv.none;cron.none            /home/monuser/log/SCOM.Warning
#
# Log every msg with priority NOTICE, as SCOM INFORMATION
*.=notice;mail.none;auth.none;authpriv.none;cron.none         /home/monuser/log/SCOM.Information
In this case, syslog's info-priority messages are ignored because they often generate a vast number of entries.

This syslog configuration will register events in three files, based on the criticality:

/home/monuser/log/SCOM.Critical
/home/monuser/log/SCOM.Warning
/home/monuser/log/SCOM.Information

1.4- Restart the syslog service

The exact command depends on the Linux distribution; in my case it is (as superuser):

service rsyslog restart

1.5- Change the permissions of the log files

After restarting the syslog service I have the following files at /home/monuser/log :

SCOM.Critical
SCOM.Warning
SCOM.Information

As superuser I execute:

chown monuser /home/monuser/log/SCOM.* 
chmod 660 /home/monuser/log/SCOM.*

2- Creating a shell script to manage the logs

This shell script will be later executed via SCOM, to read and process the logs.

The shell script is:

#!/bin/bash
# -- 1st parameter is the logfile, for instance: /home/monuser/log/SCOM.Critical
flog="$1"
# -- File has data?
if [ ! -f "$flog" ] || [ ! -s "$flog" ]; then
  exit 1
fi
# -- Keep a copy and truncate the original
cp "$flog" "$flog.sent"
> "$flog"
# -- Send the output to SCOM
cat "$flog.sent"
exit 0

I ended up converting the script to a single line, so it can be inserted directly into a SCOM rule.

In the case of a critical alert, the line would be:

flog="/home/monuser/log/SCOM.Critical"; if [ ! -f "$flog" ] || [ ! -s "$flog" ]; then exit 1; fi; cp "$flog" "$flog.sent"; > "$flog"; cat "$flog.sent"; exit 0
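Before wiring the one-liner into a rule, it can be dry-run against a throw-away file. The sketch below wraps the same logic in a function (with return instead of exit) and uses a temporary file with a made-up event in place of /home/monuser/log/SCOM.Critical:

```shell
#!/bin/bash
# Dry run of the rule one-liner against a temporary file.
# In the real rule, flog points at /home/monuser/log/SCOM.Critical.
flog=$(mktemp)
echo "Oct 10 12:00:00 host kernel: test critical event" > "$flog"

run_rule() {
  if [ ! -f "$flog" ] || [ ! -s "$flog" ]; then return 1; fi
  cp "$flog" "$flog.sent"; > "$flog"; cat "$flog.sent"; return 0
}

run_rule                 # prints the event and truncates the log
echo "rc=$?"             # prints: rc=0
run_rule                 # log now empty: nothing to send
echo "rc=$?"             # prints: rc=1
```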

3- Defining the SCOM rules

Finally I specify three different rules, for critical, warning and information alerts.

Creating the first rule, the others are similar:





The script is sent in a single line:



The only criterion is ReturnCode Equals 0:



This inserts all the script's output into the Alert Description:



4- Testing 

As a superuser, at the linux machine, enter:

Testing a critical alert:

logger -p user.err "TEST Error Msg"

Testing a warning alert:

logger -p user.warning "TEST warning Msg"

Testing an information alert:

logger -p user.notice "TEST information Msg"

5- Further development

What if you want to create an alert that depends on a specific syslog event?

One possibility is to create a rule that searches inside the file left over by the main rule... For instance, the rule "Syslog critical messages" creates a file named 'SCOM.Critical.sent', so further rules can search this file for a certain regular expression.

Example of a script (not tested):
# -- Where to find
flog="/home/monuser/log/SCOM.Critical.sent";

# -- What text to find, for instance, "hardware"
find="hardware"

# -- Define a unique prefix for this rule
name="hw"

# -- File having the found events
flog2="$flog.$name"

# -- File has data?

if [ ! -f "$flog" ] || [ ! -s "$flog" ]; then
  exit 1
fi

# -- There's new data to process?

if [ ! -f "$flog2" ] || [ "$flog" -nt "$flog2" ]; then

  grep "$find" "$flog" > "$flog2"

  if  [ -s "$flog2" ]; then
    cat "$flog2"
    exit 0
  fi
fi
exit 1
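The same logic can be exercised locally, with a temporary file and made-up data standing in for SCOM.Critical.sent. The second invocation returns 1 because the leftover file is not newer than the rule's own output file:

```shell
#!/bin/bash
# Exercise the leftover-file search with temporary files.
# "$flog" stands in for /home/monuser/log/SCOM.Critical.sent.
flog=$(mktemp)
find_str="hardware"
flog2="$flog.hw"

printf 'disk hardware failure\nunrelated message\n' > "$flog"

search_leftover() {
  if [ ! -f "$flog" ] || [ ! -s "$flog" ]; then return 1; fi
  if [ ! -f "$flog2" ] || [ "$flog" -nt "$flog2" ]; then
    grep "$find_str" "$flog" > "$flog2"
    if [ -s "$flog2" ]; then cat "$flog2"; return 0; fi
  fi
  return 1
}

search_leftover          # prints: disk hardware failure
echo "rc=$?"             # prints: rc=0
search_leftover          # same data, not newer: nothing reported
echo "rc=$?"             # prints: rc=1
```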

This script can then be executed from a rule with a shorter interval than "Syslog critical messages", so it can process the leftover file. For instance, if the main rule runs every 15 minutes, then the rules looking for a certain text should run every 12 minutes.