Pythian Blog: Technical Track

Can Concurrent Processing Abort Help Automation and Predictable Maintenance?

On the surface, concurrent processing (CP) looks simple: a new request is pending for a while, then it runs and completes. In the background, however, it’s extremely complicated, with all the statuses, schedules, permissions, incompatibilities, etc. If you’d like to know more, there are a few excellent blog posts written on this topic, including this one and this one.

When it comes to automating or scripting, things seem simple as well. We have the adcmctl out-of-the-box control utility with various options, and stop/start should also work out of the box. But does it?

$ADMIN_SCRIPTS_HOME/adcmctl.sh {start|stop|abort|status} 
  [<APPS username/APPS password>|<Applications username/Applications password>]
  [sleep=<seconds>] [restart=<N|minutes>] [pmon=<iterations>] [quesiz=<pmon_iterations>]  
  [diag=Y|N] [wait=Y|N]

I think every single Apps DBA with hands-on experience knows: not so fast.

Keep in mind

When adcmctl.sh stop is issued it’s not the same as httpd stop, which is basically a hard stop. A CP stop is more like a gentle ask.

  DBA: Please, EBS, shut down concurrent processing now—I have important tasks to do.
  EBS: Sure, sir, but I shall complete my tasks first! 
  DBA: Cool. How long will this take? Can you do it in five minutes?
  EBS: I dunno LOL ¯\(°_o)/¯
  DBA: ?!?!?!
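
In scripting terms, the exchange above means the stop request can be acknowledged long before the managers have actually exited, so a script usually has to poll with a deadline instead of assuming the stop is done. Here is a minimal sketch of such a poll. The helper names wait_until and cp_processes_gone are hypothetical, and matching on the FNDLIBR process name is an assumption to check against the managers defined on your system.

```shell
#!/bin/sh
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or TIMEOUT_SECONDS have elapsed.
# Returns 0 if COMMAND succeeded in time, 1 on timeout.
wait_until() {
  wu_timeout=$1
  wu_interval=$2
  shift 2
  wu_deadline=$(( $(date +%s) + wu_timeout ))
  until "$@"; do
    if [ "$(date +%s)" -ge "$wu_deadline" ]; then
      return 1
    fi
    sleep "$wu_interval"
  done
  return 0
}

# Assumption: concurrent manager processes appear as FNDLIBR in the
# process list of the current OS user. Adjust the pattern as needed.
cp_processes_gone() {
  ! pgrep -u "$(id -un)" -f FNDLIBR >/dev/null
}

# Example (not executed here): give CP five minutes after a stop request.
#   "$ADMIN_SCRIPTS_HOME/adcmctl.sh" stop "$APPS_LOGIN"
#   wait_until 300 10 cp_processes_gone || echo "CP still running"
```

The timeout turns "I dunno" into a bounded wait: the script either sees the managers exit or learns, after a known interval, that it needs a different approach.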

And there is another aspect to this that’s more functional than technical. DBAs are usually in control of internal requests like workflow and statistics, but CP is built for functional use. Data ingestion, reporting, modifications, workflows: all of these are done by actual system users, and in a busy environment there’s a lot going on. In the case of maintenance, in a perfect world every involved party is aware of it. Everybody makes sure concurrent requests end on time, no scheduled requests trigger during the window, and schedules are managed by their owners. But is this realistic?

Problem

  • As a DBA runs through the maintenance, they must be very sure that once the system is released, all the requests and their schedules are basically untouched.
  • Maintenance windows should be predictable in duration. With R12.2 there’s usually already a lot of stress around adop cutover activities, so waiting on CP to stop while some requests are still running may not be acceptable.
  • And finally, automation: a DBA must think about scripting and automating processes to avoid human error and reduce overall toil.

What is the best way to achieve controlled, repeatable, managed and preferably automated/scripted results?

On top of that let’s add two possible scenarios a DBA must fulfil:

  • After the cutover, the DBA team must provide an EBS system to the end-to-end testing team so that the system, including CP, is verified prior to release to end users.
    • Certain patches will actually submit concurrent requests that should execute to consider the patch applied (e.g. 12c.AD.Delta).
  • The custom code in R12.2 is not online patching compliant. It can be applied on runfs but needs CP to be up, or requires a restart.

If CP is not required to start during the maintenance, that could be fine. However, how can you verify that patching didn’t affect CP, and that CP will start in the first place?

I hope I’ve managed to scope the problem.

Solution

No, there isn’t a silver bullet and the solution is not quick. The real problem is the running concurrent requests that DBAs need to either kill or terminate in order to perform maintenance. If a DBA kills some requests, how can they track and report on them, and how can they reschedule them under the same user and schedule? It’s hard and messy (e.g. an out-of-office-hours request scheduled by user X, who is on vacation).

When automation is involved it’s best to use all the available out-of-the-box tools. For example, when CP is being stopped sometimes the DBA already has handy scripts in place to kill all the remaining OS processes. Having said that, it’s unlikely that those scripts will list running requests and manage them afterward.
This is where adcmctl.sh abort comes in handy. From the documentation:

Abort

You can abort or terminate individual services.

When you abort (terminate) requests and terminate the Internal Concurrent Manager, all running requests (running concurrent programs) are terminated, and all managers are terminated. Managers previously deactivated on an individual basis are not affected.

Any service that was active when the ICM was aborted will be restarted when the ICM is brought back up. Managers that were deactivated on an individual basis will not be brought back up with the ICM.

It’s extremely important to understand the difference between managers and requests, and not to assume that all the requests will be restarted; they won’t be. Abort is very good at getting rid of OS processes and database sessions: within about two minutes of being issued, all CP processes will be gone. But concurrent request data won’t be touched.

All requests that were running will remain in running state (phase_code='R', status_code='R') in the requests table (fnd_concurrent_requests) without an actual DB session or application OS process. This is another source of confusion (check your scripts), and it’s important to note that the OAM (Oracle Applications Manager) dashboard will not show those requests as running. Note: upon CP start, not all of those requests will be managed. A few will be moved to complete/error state and some may remain in running/running state forever.
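
Because these rows no longer have a session or process behind them, it can help to record them before anything else touches the table, so there is something to report on later. The sketch below generates a SQL*Plus script for that. The function name orphan_report_sql, the APPS_LOGIN variable and the output file name are hypothetical. The columns are standard fnd_concurrent_requests columns, but the selection is only an illustration.

```shell
#!/bin/sh
# Emit a SQL*Plus script that lists requests still flagged Running/Running.
# Intended to be run right after the abort, before any status updates.
orphan_report_sql() {
  cat <<'EOF'
set pagesize 200 linesize 200
select request_id,
       requested_by,
       concurrent_program_id,
       to_char(actual_start_date, 'YYYY-MM-DD HH24:MI:SS') as started,
       hold_flag
  from fnd_concurrent_requests
 where phase_code = 'R'
   and status_code = 'R'
 order by actual_start_date;
exit
EOF
}

# Usage (assumes sqlplus is on PATH and APPS_LOGIN holds apps/<password>):
#   orphan_report_sql | sqlplus -s "$APPS_LOGIN" > running_after_abort.lst
```

Keeping the query in a function makes it easy to review the exact SQL before pointing it at a real instance.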

Here is a CP maintenance sequence that I find helpful to address the previously described issues (please note: test carefully in your systems before proceeding):

  • $ADMIN_SCRIPTS_HOME/adcmctl.sh stop
  • Wait five minutes in order to let small jobs complete
  • $ADMIN_SCRIPTS_HOME/adcmctl.sh abort
  • Wait two minutes while the abort completes
  • SQL> update fnd_concurrent_requests set phase_code='P', status_code='I' where phase_code='R' and status_code='R';
  • SQL> update fnd_concurrent_requests set hold_flag='Y' where phase_code in ('P','I') and hold_flag = 'N';
  • Perform all the maintenance / green light to start app / start EBS
  • SQL> update fnd_concurrent_requests set hold_flag='N' where phase_code in ('P','I') and hold_flag = 'Y';
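
The sequence above could be wrapped in a small script along the following lines. This is a sketch under stated assumptions, not a finished tool: the function names, the DRY_RUN switch, the grace-period variables and APPS_LOGIN (expected to hold apps/<password>) are all hypothetical. It defaults to dry-run mode so the steps can be reviewed before anything is executed.

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints each step; set DRY_RUN=0 to execute.
# Note: dry-run output includes the login string, so mask it in shared logs.
DRY_RUN=${DRY_RUN:-1}
STOP_GRACE_SECONDS=${STOP_GRACE_SECONDS:-300}    # let small jobs complete
ABORT_GRACE_SECONDS=${ABORT_GRACE_SECONDS:-120}  # let the abort finish

run_cmd() {  # run or print an OS command
  if [ "$DRY_RUN" = 1 ]; then echo "CMD: $*"; else "$@"; fi
}

run_sql() {  # run or print one SQL statement, followed by a commit
  if [ "$DRY_RUN" = 1 ]; then
    echo "SQL: $1"
  else
    printf 'whenever sqlerror exit failure\n%s\ncommit;\nexit\n' "$1" |
      sqlplus -s "$APPS_LOGIN"
  fi
}

cp_shutdown() {  # the steps before maintenance
  run_cmd "$ADMIN_SCRIPTS_HOME/adcmctl.sh" stop "$APPS_LOGIN"
  run_cmd sleep "$STOP_GRACE_SECONDS"
  run_cmd "$ADMIN_SCRIPTS_HOME/adcmctl.sh" abort "$APPS_LOGIN"
  run_cmd sleep "$ABORT_GRACE_SECONDS"
  run_sql "update fnd_concurrent_requests set phase_code='P', status_code='I' where phase_code='R' and status_code='R';"
  run_sql "update fnd_concurrent_requests set hold_flag='Y' where phase_code in ('P','I') and hold_flag='N';"
}

cp_release() {  # the final step, after the green light
  run_sql "update fnd_concurrent_requests set hold_flag='N' where phase_code in ('P','I') and hold_flag='Y';"
}
```

Calling cp_shutdown with DRY_RUN=1 prints six lines, one per step, which doubles as a checklist for the maintenance plan. The maintenance itself sits between cp_shutdown and cp_release and is deliberately left out.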

By running those SQL updates against fnd_concurrent_requests, a few things are achieved:

  • All requests left in running state after the abort are moved back into pending/normal status.
  • On all pending/inactive requests (not completed) a hold flag is set.

This guarantees that:

  • None of the pending requests will be started until the hold flag is removed.
  • Requests submitted by patching activity are free to run.
  • Testing of CP can be performed in the maintenance window.
  • The DBA is free to restart CP if custom code is delivered after cutover on runfs to take effect.
  • The DBA is free to troubleshoot CP issues and/or restart CP without worrying about starting any business process.
  • Even if it doesn’t look simple, this sequence is quite possible to automate, and it avoids a complicated custom shell script that manages stops, notes, requests, states, etc.
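
One detail worth testing: an update that clears hold_flag on every pending or inactive request also releases requests that users had put on hold themselves before the maintenance started. A possible way around this, sketched below, is to remember which requests the maintenance put on hold and release only those. The helper table name xx_maint_held_requests and the function names are hypothetical; this is an illustration of the idea, not part of the sequence described above.

```shell
#!/bin/sh
# Hypothetical helper table; pick a name that fits your custom schema.
HELD_TABLE=${HELD_TABLE:-xx_maint_held_requests}

# SQL to record the requests about to be held, then hold exactly those.
hold_sql() {
  cat <<EOF
create table ${HELD_TABLE} as
  select request_id from fnd_concurrent_requests
   where phase_code in ('P','I') and hold_flag = 'N';
update fnd_concurrent_requests set hold_flag = 'Y'
 where request_id in (select request_id from ${HELD_TABLE});
commit;
EOF
}

# SQL to release only the recorded requests, then drop the helper table.
release_sql() {
  cat <<EOF
update fnd_concurrent_requests set hold_flag = 'N'
 where hold_flag = 'Y'
   and phase_code in ('P','I')
   and request_id in (select request_id from ${HELD_TABLE});
commit;
drop table ${HELD_TABLE};
EOF
}

# Usage (assumes sqlplus is on PATH and APPS_LOGIN holds apps/<password>):
#   hold_sql    | sqlplus -s "$APPS_LOGIN"   # before maintenance
#   release_sql | sqlplus -s "$APPS_LOGIN"   # after the green light
```

Requests submitted by patching after hold_sql has run are not in the helper table, so they remain free to run, which matches the behaviour listed above.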

Conclusion

In my opinion, CP abort is a good option as it delivers predictable and consistent results out of the box, which is good for automation. However, it requires a DBA to be very certain of how the feature is used and how it behaves. That knowledge comes from experience, as the Oracle documentation does not go into detail.

Notes:

  • Examples shown in this article may not be appropriate for all environments and use-cases.
  • If you use adcmctl abort because it’s faster than stop and you never deal with requests, please stop doing so; otherwise you’re eventually guaranteed to have problems!

If you have any questions, or thoughts, please leave them in the comments.
