Saturday 13 January 2018

Rolling garbage collector logs in JBOSS

In this post, I am going to explain  GC Logs Rotation configuration in JBoss.

Please check my previous blog post Enable GC logs in JBoss  to enable GC logs in Jboss.

Garbage Collection logs are essential artifacts to optimize application’s performance and trouble shoot complex memory problems. Garbage Collection logs can be generated in a particular file path by passing the “-Xloggc” JVM argument.

–Xloggc:/path/to/log/dir/gc.log

whenever the application is restarted, old GC log file will be over-ridden by the new GC log file as the file path is same.

Thus you wouldn’t be able to analyze the old GC logs that existed before restarting the application. Especially if the application has crashed or had certain performance problems then, you need old GC Logs for analysis.

Because of the heat of the production problem, most of the time, IT/DevOps team forgets to back up the old GC log file; A classic problem that happens again & again, that we all are familiar. Most human errors can be mitigated through automation and this problem is no exception to it.

A simple strategy to mitigate this challenge is to write new GC log contents in a different file location.   In this article 2 different strategies to do that are shared. 

1. Suffix timestamp to GC Log file :

If you can suffix the GC log file with the time stamp at which the JVM was restarted then, GC Log file locations will become unique. Then new GC logs will not over-ride the old GC logs. It can be achieved as shown below:

-Xloggc:/path/to/log/dir/gc.log-"`date +%Y-%m-%d-%H-%M`
This strategy has one minor drawback:

a. Growing file size

Suppose if you don’t restart your JVMs, then GC log file size can be growing to huge size.  Because in this strategy new GC log files are created only when you restart the JVM. But this is not a major concern in my opinion, because one GC event only occupies few bytes. So typically GC log file size will not grow beyond a manageable point.

2. Use -XX:+UseGCLogFileRotation

Another approach is to use the below JVM properties:

 -Xloggc:/path/to/log/dir/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=2M
When ‘-XX:-UseGCLogFileRotation’ is passed GC log rotation is enabled by the JVM itself.

‘-XX:NumberOfGClogFiles’ sets the number of files to use when rotating logs, must be >= 1. The rotated log files will use the following naming scheme, <filename>.0, <filename>.1, …, <filename>.n-1.

‘-XX:GCLogFileSize’ defines the size of the log file at which point the log will be rotated, must be >= 8K

But this strategy has few challenges:

a. Losing old GC Logs

Suppose if you had configured -XX:NumberOfGCLogFiles=5 then, over a period of time, 5 GC log files will be created:

gc.log.0 — oldest GC Log content
gc.log.1
gc.log.2
gc.log.3
gc.log.4 — latest GC Log content

Most recent GC log contents will be written to ‘gc.log.4’ and old GC log content will be present in ‘gc.log.0’.

When the application starts to generate more GC logs than the configured ‘-XX:NumberOfGCLogFiles’ in this case 5, then old GC log contents in gc.log.0 will be deleted. New GC events will be written to gc.log.0. It means you will end up not having all the generated GC logs. You will loose the full visibility of the events.

b. Mixed-up GC Logs

Suppose application has created 5 gc log files i.e.

gc.log.0
gc.log.1
gc.log.2
gc.log.3
gc.log.4


Then, let’s say you are restarting the application. Now new GC logs will be written to gc.log.0 file and old GC log content will be present in gc.log.1, gc.log.2, gc.log.3, gc.log.4 i.e.

gc.log.0 — GC log file content after restart
gc.log.1 — GC log file content before restart
gc.log.2 — GC log file content before restart
gc.log.3 — GC log file content before restart
gc.log.4 — GC log file content before restart


So your new GC log contents get mixed up with old GC logs. Thus to mitigate this problem you might have to move all the old GC logs to a different folder before you restart the application.

Friday 12 January 2018

Jboss AJP thread pool not released idle threads


In this post, I am going to explain the problem and the solution related to AJP connector executor leaking threads.

We may notice and detect that AJP thread pool with the thread counts will start increase continuously on application servers.  It looks like that the idle threads never recycle or reclaim even after keepalive-time. This behavior will happen continuously even if the system will remain idle for a long period of time (without any user request). The application leak connections because it is not closing the AJP connection and returning them to the pool.


In the case of unbounded-queue-thread-pool executor configured and the default maximum-connections value being used, a new thread will create for every new request and will remain idle forever. As the number of AJP requests increase, the response from Jboss EAP may get slow eventually.

The reason for this behavior is that the AJP thread-pool connections by default have no timeout (default value of "DEFAULT_CONNECTION_TIMEOUT" is -1 for AJP, which means it never recycle) and therefore, the thread connection will persist permanently once established. The same behavior detectable if we use httpd/mod_cluster in front of the JBoss which will invoke periodic pings to JBoss and fill its connection pool.

The solution is to add the following System property in standalone.xml or domain.xml to limit the keepalive timeout value for AJP:

<property name="org.apache.coyote.ajp.DEFAULT_CONNECTION_TIMEOUT" value="600000" boot-time="true"/>


Here value 600000 menace 10 minutes . After 10 minutes of idle time, JBoss kills the thread.


However, if there are no strong reasons to use "DEFAULT_CONNECTION_TIMEOUT" option, keeping AJP connections open between apache and EAP is a good idea. The default value of "DEFAULT_CONNECTION_TIMEOUT" is -1 for AJP, which means it never disconnect.

Thursday 11 January 2018

Which Java Thread Consumes High CPU.. ?

 In this blog post, I am going to explain How to Find which Java Thread Consumes high CPU..

Frequently we are facing a strange issue with one of our JBOSS server in Production. The monitors that we configured on these always showed degraded performance. We started to analyze the performance of the server by using the basic top command which gave us the highest CPU and Memory. High CPU may cause System Crash or Server Crash or Slow Responsiveness of Server.

High CPU utilization is a very very common thing for an Application Server. It is very important for an Administrator to find out what is causing the High CPU.

Fallow the below steps to find out The High CPU consumed thread.  

1. We use top to find out the process id (PID) of the java process consuming most of the CPU.

top -n1 | grep -m1 java | perl -pe 's/\e\[?.*?[\@-~] ?//g' | cut -f1 -d' '

2.  Next step is we started to find out the Thread ID that is causing the HIGH CPU usage in this process which gave us this.

printf "%x" $(top -n1 -H | grep -m1 java | perl -pe 's/\e\[?.*?[\@-~] ?//g' | cut -f1 -d' ')

3.  So we have a process ID of our busy JVM and Linux thread ID (most likely from that process) consuming our CPU. By using jstack we can be obtained Thread dump. Theses dumps/ Thread stack traces re very useful for find out root causes of issues.

  $JAVA_HOME/bin/jstack <PID> | grep -A500 <ThreadID> | grep -m1 "^$" -B 500

The above command simply dumps the JVM stack trace of the given PID and filters out (using grep) the thread which has matching Thread ID.

Shell script :

I have written shell scripts as below with all above commands.  This script make easy to find out High CPU consumed JAVA PID & Thread ID and simply dumps the JVM stack trace.  


#!/bin/bash
JAVA_HOME=/usr/java/jdk1.8.0_121/bin
PID=$(top -n1 | grep -m1 java | perl -pe 's/\e\[?.*?[\@-~] ?//g' | cut -f1 -d' ')
ThreadID=$(printf "%x" $(top -n1 -H | grep -m1 java | perl -pe 's/\e\[?.*?[\@-~] ?//g' | cut -f1 -d' '))
EAP=`ps -ef | grep java |grep -v grep | awk '{print $2 "," $9}' | grep $PID | awk -F, '{ print $2}' | sed -e 's/^-D//' | tr -d '[]' | sed -e 's/Server://'`
  if [ -z "$PID" ] || [ -z "$NID" ] || [ -z "$EAP" ]; then
       echo ""
       echo "No High CPU consumption Threads at this moment"
       exit 1
  else
       echo ""
       echo -e "****************************************************************"
       echo " High CPU consumed PID       :- " $PID
       echo " High CPU consumed EAP Name  :- " $EAP
       echo " High CPU consumed Thread ID :- " $ThreadID
       echo -e "****************************************************************"
       echo ""
       echo " High CPU consumption Thread Stack Trace"
echo"########################################################################"
       echo ""
       $JAVA_HOME/bin/jstack $PID | grep -A500 $ThreadID | grep -m1 "^$" -B 500
       echo ""
echo"########################################################################"
  fi

Copy the above Code and Save this with .sh extension and run it with sufficient permissions.

Let see how this script will work



[jboss@middleware ~]$ ./High_CPU_Consumption_Threads.sh
****************************************************************
High CPU consumed PID       :-  72520
High CPU consumed EAP Name  :-  Server-One
High CPU consumed Thread ID :-  6d08
****************************************************************

High CPU consumption Thread Stack Trace

################################################################

"HttpManagementService-threads - 2288" #10447 prio=5 os_prio=0 tid=0x0000000000891000 nid=0x6d08 waiting on condition [0x00007f80ba35a000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000000e0461778> (a java.util.concurrent.SynchronousQueue$TransferStack)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
        at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
        at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
        at org.jboss.threads.JBossThread.run(JBossThread.java:122)
################################################################


Running the script multiple times (or with watch, see below) will capture Busy thread in different places..


Hope this helps


Wednesday 10 January 2018

Monitor JVM memory usage in WAS


Now, here is the Jython script designed to help monitor the “HeapSize”  which can let us able to proactively measure and take action for Outofmemoryerror issue.

Copy the Below Code and  Save this with .py extension and run it with wsadmin.

appServer=raw_input("Enter application server name: ")
JVMName = AdminControl.completeObjectName('type=JVM,process='+ appServer +',*')
JVMObject = AdminControl.makeObjectName(JVMName)
perf = AdminControl.completeObjectName('type=Perf,process='+appServer+',*')
perfObject = AdminControl.makeObjectName(perf)
Obj = AdminControl.invoke_jmx(perfObject,"getStatsObject",[JVMObject,java.lang.Boolean('false')],['javax.management.ObjectName','java.lang.Boolean'])
current = Obj.getStatistic('HeapSize').getCurrent()
used = Obj.getStatistic('UsedMemory').getCount()
usage = float(used)/float(current)*100
uptime = float(Obj.getStatistic('UpTime').getCount())/60/60/24
print "--------------------------------------------"
print "ServerName      :", appServer
print "uptime(in days) :",  int(uptime)
print "--------------------------------------------"
print "CurrentUsage    :", current
print "Usedmemory      :", used
print "Usage in Percent:", int(usage)
print "--------------------------------------------"


Let see how this script will work

Here i saved above code as getHeapSize.py and run as below.

Syntax :   $WAS_HOME/bin/wsadmin.sh  -lang jython  -f getHeapSize.py



Remember that WebSphere allows you to set an Initial as well as a Maximum heap size - if they're not the same then the heap size could be anywhere between these two values.  The output above will show you the percentage use of the *current* heapsize, which may not be the maximum.  Some recommend setting these to the same value to remove the performance overhead of the JVM needed to allocate more memory as it grows the heap




Tuesday 2 January 2018

How to rotate the proccess-controller.log, host-controller.log log files in EAP 6?

In this blog post, I am going to explain the log rotation of host-controller.log and process-controller.log

Issue: Every time the host controller starts is overwritten the logs:  host-controller.log, process-controller.log.

Resolution:
Define the PeriodicRotatingFileHandler in $JBOSS_HOME/domain/configuration/logging.properties as following:

# File handler configuration
handler.BOOT_FILE=org.jboss.logmanager.handlers.PeriodicRotatingFileHandler
handler.BOOT_FILE.level=DEBUG
handler.BOOT_FILE.properties=autoFlush,append,fileName,suffix
handler.BOOT_FILE.autoFlush=true
handler.BOOT_FILE.append=true
handler.BOOT_FILE.suffix=.yyyy-MM-dd
handler.BOOT_FILE.fileName=${org.jboss.boot.log.file:boot.log}
handler.BOOT_FILE.formatter=PATTERN



The suffix property is responsible for rotating the log files based on a time interval that is determined by the content of the string which is passed in to this property:

handler.BOOT_FILE.suffix=.yyyy-MM-dd

Also note that you need to add append and suffix to handler.BOOT_FILE.properties. (handler.BOOT_FILE.properties=autoFlush,fileName by default.)

Validation:

After a day, verify the $JBOSS_HOME/domain/log path and the logs should be rotated with suffix of date.

[jboss@middleware log]$ ls -ltr  $JBOSS_HOME/domain/log
-rw-rw-r--. 1 jboss jboss   7542 Dec 27 23:50 process-controller.log.2017-12-23
-rw-rw-r--. 1 jboss jboss  14329 Dec 27 23:50 host-controller.log.2017-12-23
-rw-rw-r--. 1 jboss jboss   7572 Dec 28 21:11 process-controller.log
-rw-rw-r--. 1 jboss jboss  19940 Dec 28 21:14 host-controller.log
[jboss@middleware log]$ ls -ltr


EAP - logging.properties



You may have noticed that there is a logging.properties file in the  below directory

$JBOSS_HOME/standalone/configuration     --> standalone mode

$JBOSS_HOME/domain/servers/<SERVER_NAME>/data    --> Domain Mode

This logging configuration is used when the server boots up until the logging subsystem kicks in. If the logging subsystem is not included in your configuration, then this would act as the logging configuration for the entire server.

In the future this file may go away or be automatically generated from the logging subsystem. For most users this file should not be modified.

Note: If the logging.properties is not available during start there will be no logging until the logging subsystems kicks in, this is also the case if the configuration is damaged, the server might exit without any further message.

JBoss Deployment EAR Error on start-up - No content hash available

In this post, we are going to discuss the deployment problem that prevents JBoss Server to start properly.


 The below defined error usually occurs when we delete directory standalone\data or domain\data and then try to redeploy the same application war. The JBoss server unable to delete entries defined in standalone.xml/domain.xml with the same hash for the deployed application.


06:02:24,774 ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) JBAS014612: Operation ("add") failed - address: ([("deployment" => "abc.ear")]) - failure description: "JBAS018717: No deployment content with hash 545ac26ed6a31ef41d749d5228669eb6858bb345 is available in the deployment content repository for deployment 'abc.ear'. This is a fatal boot error. To correct the problem, either restart with the --admin-only switch set and use the CLI to install the missing content or remove it from the configuration, or remove the deployment from the xml configuration file and restart."

06:02:24,785 FATAL [org.jboss.as.server] (Controller Boot Thread) JBAS015957: Server boot has failed in an unrecoverable manner; exiting. See previous messages for details.


 The domain.xml looks like below:

In Deployments definition:

      <deployment name="abc.ear" runtime-name="abc.ear">
               <content sha1="545ac26ed6a31ef41d749d5228669eb6858bb345"/>
        </deployment>

Server –group Definition:

<server-group name="abc-group" profile="abc">
            <socket-binding-group ref="ha-sockets"/>
            <deployments>
                <deployment name="abc.ear" runtime-name="abc.ear"/>
                <deployment name="ojdbc6.jar" runtime-name="ojdbc6.jar"/>
            </deployments>
       </server-group>

Removing the entries manually for the application having defined hash from domain.xml and then restart the DC and application server.