Pythian Blog: Technical Track

Extend Oracle Enterprise Manager (OEM) Compatibility with Proxy Monitoring

Introduction

Since Adobe Flash became unavailable in most browsers, so did all the graphical tools of Oracle Enterprise Manager up to version 13.3.

This forced most companies that use OEM to migrate to at least version 13.4, which no longer uses Adobe Flash. That created another problem since the 13.4 version doesn’t support some systems that used to be supported by the 13.3, such as Linux 5.

Therefore hosts running this version can no longer be monitored directly by OEM.

Although OEM still supports Oracle Database versions starting with 11.2.0.4, any version prior to that can’t be monitored directly. To overcome these issues, we can use what I like to call: Proxy Monitoring!

The idea is to use a host or database currently supported by OEM 13.4 to monitor targets that are no longer supported. To do so, you’ll have to deploy the current agent on the “Proxy host,” which will be used to monitor one or more targets that are not supported.

Next, you need to make sure that the Enterprise Manager is properly monitoring the “Proxy host.” After that, you’ll need to establish communication between the “Proxy host” and the unsupported target.

If you’re dealing with Oracle databases, you may want to create database links between the one running in the “Proxy host” and the ones running in the unsupported hosts, which will allow OEM’s Extended Metric to query these databases without requiring additional authentication or human interaction.

But this method also allows to create scripts using any tool available on the operating system to check the unsupported targets. The script’s output will then be evaluated by the OEM through the Extended Metric.

I’ll demonstrate two examples:

  • Using a database link between two Oracle databases
  • Using simple SSH user equivalence between two Linux hosts

Monitor an unsupported Oracle Database

Create a database Link

To monitor a database that cannot be added directly into OEM 13.4 and above, you can establish communication with such database either using its own client installed in a machine supported by OEM or, if you need to monitor an unsupported Oracle Database, you can also create a database link between a supported and the unsupported database(s) that you want to monitor.

This example uses a TNS service name, which must be properly registered either in the local tnsnames.ora or in your central directory (LDAP, etc.).

You can also explicitly describe the connection descriptor during the database link creation to avoid using a TNS service.

  • Create and test a database link under the monitoring schema in the “Proxy database,” pointing towards the unsupported one:
SQL> CREATE DATABASE LINK "DB11G" CONNECT TO PROXY_MON IDENTIFIED BY "oracle" USING 'TNS_SERVICE_DB11G';

Database link created.

SQL> SELECT sysdate FROM dual@DB11g;

SYSDATE
-------------------------------
05/19/2023 10:11:23

You should grant the required privileges for the PROXY_MON user so it’s able to query the tables/views used by the metric.

Monitor a single database with each extended metric

This is a simpler approach to this method, allowing only a single database to be monitored by each Extended Metric.

Create the metric

Navigate to the “Metric Extensions” page (Enterprise -> Monitoring -> Metric Extensions). Click on the “Create” button and select “Metric extension”:

Specify the details for the metric as follows, then click “Next”:

I’m using a simple query that uses the database link we just created to check the number of processes at that DB:
Extended metrics support PL/SQL blocks and SQL commands.

After specifying the query to be used, we need to define the columns for this metric. First, we have the instance name, which is a key column:

Then we define the processes count column, which is where we’ll define the thresholds for this metric:
You can customize the Alert and the Clear messages. By hovering the mouse over the “?” sign will show a list of variables you can use.

Your column list should look like this:

Specify which credentials should be used by this metric:

Then add a target to test the new metric by clicking on the “+ Add” button:

Then select which database you’d like to use and click the “Run test” button:

If everything goes according to plan, you’ll see something like this:

Complete the metric creation:

Deploy the metric

Before being able to deploy this new metric, we need to save it as a deployable draft. To do so, select the newly created metric, then go to “Actions” and select “Save as Deployable Draft”:

Now let’s deploy the metric to the PROXYDB database by clicking the “Deploy to targets” button after selecting the desired metric:

Then click the “Add” button and select the PROXYDB database:

Check the metric results

Once the metric is deployed, you can check it by going to the database page and then selecting “All metrics” under the “Oracle Database -> Monitoring” menu:

Monitor multiple databases with a single extended metric

If you’d like to use a single “Proxy database” to monitor multiple targets, I’d recommend creating a table with the list and details of all the databases you’d like to monitor.

Then build a simple PL/SQL block that runs through this list and then executes the required check on each database.

The script sends the result to OEM, which is then parsed using a fixed structure with Key and Data columns. After that alerts, it will raise alerts for each key that exceeds the thresholds.

A simple example of a key pair would be the: Target (i.e., Database) and Object (i.e., Tablespace). More details about key pairs are demonstrated in the operating system monitoring section.

Below are the details to run the same check on multiple databases.

Create a databases list table as follows:

Table name: PROXY_MONITORING_DBS

Column list:

DB_NAME     VARCHAR2(100)    (Primary Key)
DB_LINK     VARCHAR2(100)
ENABLED     VARCHAR2(1)      (Default: 'N')

For each database, insert the database name, database link name, and Y/N to indicate if the checks should be enabled;

The following PL/SQL block will execute the check on each database listed in the table:

DECLARE

    CURSOR cursor_dbs IS   SELECT db_name, db_link
                             FROM proxy_monitoring.processes_usage_db_list 
                            WHERE enabled = 'Y';

    v_dbs 	           cursor_dbs%ROWTYPE;
    
    v_sql_stmt		   VARCHAR2(300);
    
    
    TYPE ProcCurType       IS REF CURSOR;

    v_processes		   ProcCurType;
    TYPE ProcRecType       IS RECORD
    (
      db_name              proxy_monitoring.processes_usage_db_list.db_name%TYPE,
      inst_id		   gv$process.inst_id%TYPE,
      total	           NUMBER
     );

    proc_record		   ProcRecType;	

BEGIN

OPEN cursor_dbs;

LOOP

    FETCH cursor_dbs INTO v_dbs;

    IF cursor_dbs%NOTFOUND THEN
        EXIT;
    END IF;

    v_sql_stmt := 'SELECT ''' || v_dbs.db_name || ''' AS db_name, inst_id, COUNT(1) AS "TOTAL" FROM gv$process@' || v_dbs.db_link || ' GROUP BY ''' || v_dbs.db_name || ''', inst_id';

    EXECUTE IMMEDIATE v_sql_stmt
    INTO proc_record;
    
    dbms_output.put_line('Database: ' || proc_record.db_name || ', Instance ID: '|| proc_record.inst_id || ', Total processes: ' || proc_record.total );

    
END LOOP;

END;

Monitor an unsupported Linux host

This example will use a Bash script and a list of hosts that should be monitored. It also requires password-less communication between the host being monitored by OEM and the unsupported ones.

I used SSH User equivalence to achieve it. You can find details on how to set up user equivalence here: Configuring SSH user equivalence

You can use any authentication method or even run the checks using a third-party tool. I created the metric using a script in the local machine to make it easier to change its code.

Here’s the script code:

#!/bin/bash
#############################################################################################
# Script to check filesystem utilization on a remote host using SSH user equivalence.       #
#                                                                                           #
# The script will go through the hosts in the HOST_LIST file and will run a "df -k"         #
# command to check the free/used space on all file systems                                  #
#                                                                                           #
# Notes: SSH User equivalence must be properly set.                                         #
#                                                                                           #
#    The HOST_LIST file should have the following information:                              #  
#                                                                                           #
#    full_qualified_domain_name|short_name|os_user                                          #
#                                                                                           #
#    The fileds in the output of the script are separated by "|", below is the              #
#    list of columns in order.                                                              #
#                                                                                           #
#    full_qualified_domain_name|mount_point|file_system|used_kb|free_kb|used_percentage     #
#############################################################################################

export HOST_LIST="/u01/oemscripts/proxy_mon_os_fs_hosts.lst"

for i in `cat ${HOST_LIST}`
do
     export HOST=`echo ${i} | cut -d"|" -f1`
     export HOST_USER=`echo ${i} | cut -d"|"  -f3`
     export HOST_SHORT=`echo ${i} | cut -d"|" -f2`
     ssh -o StrictHostKeyChecking=no ${HOST_USER}@${HOST} 'df -k | grep -v "Filesystem" | sed "s/%//g"' | awk -v h=${HOST} '{split($0,a); printf("%s|%s|%s|%s|%s|%s\n", h,a[6], a[1], a[3], a[4], a[5])}'

#   If you want to do any additional processing on the data of each host you can create another loop as follows:
#   for j in `ssh ${HOST_USER}@${HOST} 'df -h | grep -v "Filesystem"' | awk -v h=${HOST} '{split($0,a); printf("Host: %s, Mount: %s, Free: %s \n", h,a[6], a[5])}'`
#   do
#           echo ${j}
#   done

done

This example uses a single threshold for all file systems, but you could have a separate file for each monitored host to store different thresholds for each file system.

In this scenario, you’d have to use the script to evaluate the file system utilization and then report either “NORMAL, “WARNING,” or “CRITICAL” to indicate to OEM the severity of the issue.

Configure the credentials

You might need to set the required operating system credentials to execute this metric, so here’s how you can do it:

Go to the  “Preferred credentials” page:

Select “Host” as the “Target type”:

Select either “Privileged” or “Normal host credentials,” depending on which credential you’re going to use:

Set the credentials for the host you’ll use to execute the metric, then test and save it:

Create the metric

Specify the details for the metric as follows, then click “Next”:

To avoid having metrics running too long, use the “timeout” command as demonstrated below:

Define the columns following the order in the information displayed by the script, first the “Host” column:

Then the mount point column:

Next is the file system column:

Total space used in Kilobytes:

Free space in Kilobytes:

And finally, the percentage of used space, which is the column we’ll use to generate alerts:

Your column list should look like this:

Select which credentials the metric will use, then test it in a host to make sure it’s working:

After the metric is created, you’ll have to “Save as Deployable draft,” as demonstrated before. Then you can deploy it to the target you’ll use as a proxy.

Remember to copy the script to the same location you indicated during the metric creation.

Check the metric results

Once you have deployed the new metric, you can go to the target page and view the metric results:

Notifications

Check that the Extended Metric(s) you create is(are) being properly evaluated by your incident rules. To that end, it’s better to create a specific rule for these metrics, thus keeping it easy to check and adjust.

Here’s an example of an Incident Rule set for the metrics we just created:

Select to which targets this rule will be applied:

Select the metrics we just created:

Specify the actions for Warning and Critical events:In this example, I’m checking the “Clear permanently” check box to avoid having duplicated incidents/notifications for these events:

After creating the rule set, remember to adjust the order to match your configuration. In this example, I want the new rule set to be first in line to avoid duplicated notifications:

These metrics can also be added to an existing rule. Here’s a series of posts that explains in detail how to set up notifications and incident rules:

OEM 13c – Ultimate monitoring guide

Notification examples

Once you set the incident rules, here’s an example of an alert raised by OEM:

I didn’t have an email server set up in my lab, which is the reason why it failed to send the email.

And here’s one example of the database monitoring metric:

Conclusion

Using the idea of Proxy Monitoring, you can extend your OEM capabilities beyond the compatibility matrix to virtually any target, provided you’re able to access it from a supported host/database.

Furthermore, using proxy monitoring in conjunction with heterogeneous services and the right drivers would even allow OEM to monitor other database systems, e.g., PostgreSQL, SQL Server, etc.

No Comments Yet

Let us know what you think

Subscribe by email