Tuesday, July 6, 2010

JDBC Connections leakage and optimization in Oracle Apps R12.

Identifying issues with JDBC connections in Apps can sometimes be a frustrating process, as the investigations may need to consider multiple failure points and complex architectures. I hope this article will give you a better understanding of JDBC Pooling and where to start looking for issues. Hopefully this article should help system and technical architects in Apps to resolve connection leak problems.

Recently while working with Oracle MEA in world's first major implementation of CMRO module in Apps, we faced some critical performance issues, initially. One of the major performance issue was "JDBC connections leakage". Over a period the "INACTIVE" jdbc connections rise and finally they cross the maximum jdbc connections figure set in database and server goes down. The immediate solution was bouncing OC4J Core and HTTP containers of Oracle Apps application server 10g, but off-course this is something you can not do daily in a production environment.

Since our implementation layer has a strong layer of custom code and standard code , done by various teams and you don't have control on coding of individual new developers. I thought it is next to impossible to identify root cause and fix it ,but thanks to Oracle... in Apps there are some standard ways to fix this problem.

Ok to start with lets start with some basic definitions and facts to understand the problem , I am talking about here :

What is JDBC ?
The Java Database Connectivity (JDBC) API is the industry standard for database-independent connectivity between the Java programming language and a wide range of databases – SQL databases(Oracle,mysql etc) and other tabular data sources, such as spreadsheets or flat files. The JDBC API provides a call-level API for SQL-based database access.

What is JDBC Connection ?
Java JDBC APIs uses connection (session) instance in which SQL statements are executed and results are returned within the context of a connection.

What is JDBC Connection pool?
If you have used a SQL or other similar tool to connect to a database and act on the data, you probably know that getting the connection and logging in is the part that takes the most time. An application can easily spend several seconds every time it needs to establish a connection.

In releases prior to JDBC 2.0 every database session requires a new connection and login even if the previous connection and login used the same table and user account. If you are using a JDBC release prior to 2.0 and want to improve performance, you can cache JDBC connections instead.

Cached connections are kept in a runtime object pool and can be used and reused as needed by the application. One way to implement the object pool is to make a simple hashtable of connection objects.

What is Connection leakage ?
An application(basically application server) is said to be leaking connection, if it acquires a connection and does not close it within specified time period. If this feature is enabled, the application server detects these potential connection leaks and dumps the leak tracing logs to server logs. Looking at the logs the user can figure out which application is leaking connection and fix the issues with application, if exists.

What is Connection locks ?
If your are doing multiple DML jdbc transactions in your code with autocommit flag as false on the connection object,the jdbc connection remains locked till the time commit is issued specifically.(By default in OAF/JTF the autocommit flag is false.)

How to identify that Apps instance has connection leakage problem ?

You can run the following sql statement occasionaly to monitor jdbc connections
behaviour :

select s.machine, s.username, s.module, s.inst_id, count(*) how_many
from (select distinct PROGRAM, PADDR, machine, username, module, inst_id from gV$SESSION) s,
gv$process p
where s.paddr = p.addr
and p.inst_id = s.inst_id
group by s.machine,s.username, s.module, s.inst_id

This will list down number of jdbc connections in each module. For OAF, the module is shown as page AM and for JTF/JTT/JSP, the JDBC connections are listed in "JDBC Thin Client". You can keep of track of connections in each module mainly ("JDBC Thin Client"), if this is growing each day. You can also check the status of these connections in v$session if its "INACTIVE" and these connections are persisting for more than 24-48 hours, means these sum up to connection leakage.

For OA Framework calls AOL/J when it needs a database connection, and it is up to OAF to release any such connection when it has finished with it. There is an added complexity, in that OAF also has its own pooling mechanism for the OAF Pages, which is the "Application Module pool" (AM pool). This means that although a user may have finished with a page, the page and its associated database connection are retained for a period of time.So, generally this layer would not have problem, because developer does not have to release/close connection in his code, that is automatically taken care by framework depending upon AM instance pooling and Database Connection Pooling enabled on the instance.

AOL/J JDBC code is the code that handles JDBC connection, it is often the first area to be blamed, but the JDBC connection pool can only drop database connections where the calling application has released the JDBC connection it holds in the pool, so it often turns out to be an issue higher up the code stack.This layer covers all yours JSP/JTT/JTF layer and 99% of the cases is responsible for connections leaks.

How to track leaked and locked connections?
Once you see that your instance has a problem of growing jdbc connections over a period of time all with inactive status for more than 24-48 hours, you will now start to debug and find the exact leaked and locked connections.To gather JDBC Connection statics , do following steps :
1) We need to first identify number of OACORE processes running on apps application server, then on each OACORE process we will check number of leaked locked jdbc connections in Apps instance.To note the number of OACORE processes , we need to do following in Unix box of each application server node :

(i) Login to Apps Application server unix box using ftp tool like putty.
(iii) adoacorectl.sh status

The output will look like :

Processes in Instance:
ias-component | process-type | pid | status
OC4JGroup:default_group | OC4J:oafm | 21894 | Alive
OC4JGroup:default_group | OC4J:forms | 21861 | Alive
OC4JGroup:default_group | OC4J:oacore | 21775 | Alive
OC4JGroup:default_group | OC4J:oacore | 21776 | Alive
HTTP_Server | HTTP_Server | 21699 | Alive

From this we can analyse that there are two OCCore processes running with process id
21775 and 21776 .

Please note in case of multi-node enviorment i.e. Application server with multiple nodes,you will have to run this command individually on each node to get the OACORE process id(s) for that node.

2) Enable FND Diagnostics profile at your user level.

3) Login into Apps instance, on home page click "About This Page" link.

4)Go to subtab "Java System Properties" and note "CLIENT_PROCESSID" from the
table, this will be one of what you got in step 1.

5) Now in the url type : http://< host >:< port >/OA_HTML/jsp/fnd/AoljDbcPoolStatus.jsp

6) This will display you the list of leaked and locked connections for the process id you got in step 3. For leaked connections if you will click the hyperlink, you will also get detailed stack for each leaked connection , which can furthur help you
in identifying the code layer where connection leak is happening.

You need to repeat step (3) and (4) again and again each time logging out and closing the browser to track all the CLIENT_PROCESSID locked and leaked connections.

How to get the root cause/code that is causing this connection leak ?
Actually AoljDbcPoolStatus.jsp is the key to resolve this problem and leaked connection stack will help us to figure out our problem.We can make following conclusions from the stack :

1) The first obvious thing you will want to look at is that whether or not your custom code is causing this problem. Thats easy just search the stack with ur custom top name eg- all your extensions lie in lets say xxabc folder under $JAVA_TOP, then you can search xxabc in leak connection stack, that will give you pointers of class and code where this connection is instantiated and later not released/closed.

2) For custom and standard jsp(s), search the stack with "_" as all jsp classes in apps are compiled as "_" pre-fixed to the jsp name. So this can give you idea where and which jsp are leaking connections. In my experience, this is the layer where most developers make mistakes.

3) Lastly if not custom class then standard classes are doing this (This is a rare case), and for this you anyways just have to raise an SR with Oracle to fix this. To identify search stacks and see its coming from which module i.e., the stack will we like oracle.apps.per.xxx.yyy or oracle.apps.icx.yyy.zzzz , this simple points you that per=> HRMS or ICX=> I-procurement module standard class is causing this problem. Its always good to decompile and look at the code of these classes because often developers do invaisive customization to Oracle code, directly changing the seeded Oracle file to take easy way to their customizations , without realising that this can later cause bigger problems.

Hopefully these analysis will help you to fix root cause of connection leakage problem.