This post is written for fellow SOA Suite Administrators. I will show how improper composite cleanup in Oracle SOA Suite affects the performance of the Java Virtual Machines it runs on.
SOA environments are always in motion, as business requirements grow and developers keep adding services and support for frontend and backend systems. Depending on the size and complexity of the middleware environment, this can result in a constant flow of new composites, ready for deployment.
Oracle Fusion Middleware SOA Suite offers a convenient mechanism for service lifecycle management: SOA Administrators can keep multiple revisions of one composite deployed at the same time. If, for example, a new version of a composite is deployed, the revision number is increased and the new composite becomes the ‘default’ version. Instances that were started on an older revision keep using that revision, while new instances automatically use the new (default) composite. If the new composite turns out to contain a problem, an older revision can be set as the default again.
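The revision mechanism described above can be pictured with a small toy model in Python. This is only a sketch of the behaviour as described in this post, not of how SOA Suite stores composites internally; the class name, its methods and the composite name are all made up for illustration.

```python
class CompositeRegistry:
    """Toy model of composite revisions (hypothetical, for illustration only)."""

    def __init__(self):
        self.revisions = {}  # composite name -> list of deployed revisions
        self.default = {}    # composite name -> revision used by new instances

    def deploy(self, name, revision):
        # A newly deployed revision becomes the default; older ones stay deployed.
        self.revisions.setdefault(name, []).append(revision)
        self.default[name] = revision

    def set_default(self, name, revision):
        # Fallback: point new instances at a revision that is still deployed.
        if revision not in self.revisions.get(name, []):
            raise ValueError(f"{name} revision {revision} is not deployed")
        self.default[name] = revision


registry = CompositeRegistry()
registry.deploy("OrderService", "1.0")
registry.deploy("OrderService", "2.0")       # 2.0 is now the default
registry.set_default("OrderService", "1.0")  # roll back after a problem in 2.0
print(registry.revisions["OrderService"], registry.default["OrderService"])
```

Note that after the rollback both revisions are still deployed. Nothing in this mechanism removes old revisions by itself, which is where the cleanup problem starts.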
Although this mechanism offers interesting lifecycle and fallback possibilities, it can become difficult for SOA Suite Administrators to keep track of all the composites that are deployed. After all, deploying new composites is always in the interest of the business, but the importance of proper composite cleanup is often underestimated or even forgotten. As we’ve seen with one of our customers, this can have a serious impact on the response time and throughput of service calls, because every deployed composite occupies memory in the Java Virtual Machine and that affects the way the JVM cleans up its memory.
The Java Virtual Machine (JVM for short) is the underlying engine of every WebLogic server. To run all the compiled Java code, it requires a block of memory inside the physical memory of the server. This block is called the heap and its size is often pre-set in the startup properties (using the -Xmx and -Xms arguments). The JVM uses about 25% of this heap space as a nursery(1). New memory allocations are placed in this part of the memory. As the need for memory grows, the space within the heap will run out. When this happens, the JVM will try to clean up parts of its memory by performing a Garbage Collection. At first, the JVM will try to perform this process as quickly as possible, minimizing the impact on applications running on the JVM. These attempts to clean memory quickly are called Young Collections and often don’t take more than a couple of milliseconds. In this step, the JVM checks whether the allocated memory in the nursery is still in use. If so, the memory is ‘promoted’ to the larger part of the heap. The rest is cleaned.
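The promotion step of a Young Collection can be sketched with a toy model in Python. This is a deliberate simplification and not how a real JVM is implemented: the heap is modelled as two plain lists, and the function name, the object names and the "still referenced" set are assumptions made for the example.

```python
def young_collection(nursery, old_space, is_live):
    """Toy Young Collection: promote live nursery objects, drop the rest."""
    survivors = [obj for obj in nursery if is_live(obj)]
    old_space.extend(survivors)            # 'promotion' to the larger part of the heap
    freed = len(nursery) - len(survivors)  # everything else is garbage
    nursery.clear()                        # the nursery is empty again
    return freed


nursery = ["request-1", "session-A", "request-2", "composite-cache"]
old_space = []
still_referenced = {"session-A", "composite-cache"}

freed = young_collection(nursery, old_space, lambda obj: obj in still_referenced)
print(f"freed {freed} objects, promoted {old_space}")
```

Only the nursery is inspected here, which is why this kind of collection is cheap. The promoted objects stay in the larger part of the heap until an Old Collection looks at them.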
When the heap can’t be cleaned with this ‘quick’ method, however, the JVM will try a more rigorous approach. A different kind of Garbage Collection is started, which is called an Old Collection (or Full Collection, depending on the JVM). In this process, the JVM will scan the entire heap to clear as much memory as possible: marking and sweeping to find removable objects, followed by a compaction stage to ‘defragment’ the heap. Because of these steps, the JVM will require a ‘stop-the-world’ moment, which pauses all Java activity. Depending on the size of the heap and the number of CPUs available, this can take up to several seconds. When the Old Collection is complete, only the memory that is still in use, and therefore cannot be cleaned, remains. This footprint is called the liveset.
As shown above, Young Collections are pretty harmless, but Old Collections can have a big impact on performance. To tune this process, the goal is to minimize the number and duration of Old Collections. One way to do this is to reduce the liveset. After all, the Old Collection will go through the entire heap to find free space, and a large liveset means that a high percentage of the heap cannot be cleaned. Every Old Collection will scan this part of the heap, pausing all Java activity, while nothing is gained. Additionally, the liveset determines the ‘baseline’ from which new memory is allocated. If this baseline is higher, less memory is available before the JVM runs out again, so Old Collections also happen more often.
For example, imagine a heap of 1024MB, of which the liveset is about 30%. This means that roughly 300MB of the heap is occupied by memory which cannot be cleaned. When the JVM runs out of memory (e.g. 1000MB is used already), an Old Collection is started. This will scan through the entire 1000MB of memory and about 700MB will be cleared. New allocations will start from the 300MB baseline.
Now imagine this same heap with a 50% liveset. This means 512MB of the heap will survive every Old Collection, even though the process still has to scan the same amount of memory. And because of the higher baseline, the JVM only has 488MB (1000MB minus 512MB) this time before it runs out again.
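The arithmetic of these two scenarios can be checked with a few lines of Python. The function name and its parameters are invented for this sketch; the 1000MB trigger point is the same assumption used in the example above, and the exact 30% figure (307MB) is what the text rounds to 300MB.

```python
def old_collection_budget(heap_mb, liveset_fraction, trigger_mb):
    """Return (liveset in MB, MB freed by an Old Collection started at trigger_mb)."""
    liveset_mb = heap_mb * liveset_fraction  # memory that survives every collection
    reclaimable_mb = trigger_mb - liveset_mb # room for new allocations afterwards
    return liveset_mb, reclaimable_mb


for fraction in (0.30, 0.50):
    liveset, reclaimable = old_collection_budget(1024, fraction, 1000)
    print(f"liveset {fraction:.0%}: baseline {liveset:.0f}MB, "
          f"room until the next Old Collection {reclaimable:.0f}MB")
```

With the same allocation rate, the 50% case reaches the trigger point sooner than the 30% case, so the larger liveset means both longer and more frequent pauses.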
In a managed WebLogic server running SOA Suite, this liveset basically consists of the soa-infra application and all composites deployed at that point in time. Because non-default composites are also loaded on startup, these remain in the liveset of the JVM the entire time. Reducing the number of deployed composites on a managed server is therefore an efficient way of reducing the liveset. As a result, your WebLogic server will have fewer pauses and, additionally, these pauses will be shorter.
At our customer, an ANT script was built to undeploy all non-default composites. This way SOA Administrators can easily clean up all the old composites before a new release is deployed. Within two weeks after a release is put in place, the older non-default revisions of the composites are removed. This way a ‘composite backup’ is retained for a short period, in which older running instances can still finish all their steps. Also, by keeping a maximum of two revisions of each composite, and only for a limited time, the liveset is reduced to a minimum. This procedure benefits all Java activity, including the throughput under load, and resulted in stable processing times with fewer outliers.
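The selection logic behind such a cleanup can be sketched as follows. This is not the customer's ANT script: the input format (a list of dictionaries with name, revision and default flag) and the composite names are assumptions for the example, and the actual undeploy call is left out on purpose.

```python
def revisions_to_undeploy(composites):
    """Return (name, revision) pairs for every non-default revision."""
    return sorted(
        (c["name"], c["revision"]) for c in composites if not c["default"]
    )


# Hypothetical inventory of what is deployed on a managed server.
deployed = [
    {"name": "OrderService", "revision": "1.0", "default": False},
    {"name": "OrderService", "revision": "1.1", "default": True},
    {"name": "InvoiceService", "revision": "2.0", "default": True},
]

for name, revision in revisions_to_undeploy(deployed):
    # In a real cleanup, the undeploy step would be triggered here.
    print(f"candidate for undeploy: {name} revision {revision}")
```

Before undeploying anything from such a list, it is worth checking that no running instances still depend on the old revision, which is the reason for the two-week grace period described above.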