Tuesday, 1 April 2014

Real time issues/questions

If heapdumps are generating frequently, how can we restrict the number of heapdumps in the environment?

One way to restrict the number of heapdumps is through wsadmin commands:
connect to the problematic node and run the Jython commands below.
1) AdminControl.getAttribute(jvm, "maxHeapDumpsOnDisk") shows the maximum number of heapdumps configured at present.
2) AdminControl.setAttribute(jvm, "maxHeapDumpsOnDisk", "<number of heapdumps to allow>") sets the limit.
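
For completeness, a minimal wsadmin (Jython) sketch of the whole sequence; node01 and server1 are example names, so substitute your own:

# Look up the JVM MBean of the problematic server (node01/server1 are example names)
jvm = AdminControl.completeObjectName('type=JVM,node=node01,process=server1,*')
# Show the maximum number of heapdumps currently kept on disk
print AdminControl.getAttribute(jvm, 'maxHeapDumpsOnDisk')
# Restrict the number of heapdumps kept on disk to, say, 4
AdminControl.setAttribute(jvm, 'maxHeapDumpsOnDisk', '4')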

I have done the change under Servers > Application Servers > server_name > Process Definition > Java Virtual Machine > Custom Properties > New, which disables automatic heapdumps on OutOfMemory events:
name: IBM_HEAPDUMP_OUTOFMEMORY
value: false
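
The same console change can also be scripted; a minimal wsadmin (Jython) sketch, assuming node01/server1 as example names (the server must be restarted for the property to take effect):

# Find the target server and its JVM config object (node01/server1 are example names)
server = AdminConfig.getid('/Node:node01/Server:server1/')
jvm = AdminConfig.list('JavaVirtualMachine', server)
# JVM custom properties are stored under the systemProperties attribute
AdminConfig.create('Property', jvm, [['name', 'IBM_HEAPDUMP_OUTOFMEMORY'], ['value', 'false']], 'systemProperties')
AdminConfig.save()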


How to clear the Java class cache and the OSGi cache in WAS?

Why do we need to clear the cache? There are different reasons; one of them is that when we upgrade the product, classes from the previous version may still be held in the cache. Before clearing the cache from the CLI, we need to stop the JVMs.

To clear the JVM class cache:

<ProductHOME>/bin/clearClassCache.sh

To clear the OSGi cache:

<ProductHOME>/profiles/profile_name/bin/osgiCfgInit.sh

What do we have to do if the web server is down, or the machine where the web server is running has crashed?

The web server basically routes static content; the plug-in does the WLM (workload management).

To maintain HA of the environment, we configure web servers on different machines and the requests are routed by a load balancer according to a routing policy; it may be at the hardware level or a tool available in the market (Edge Components, BIG-IP).

We configure WebSphere on distributed machines to maintain HA of the application and to avoid a single point of failure (SPOF) of the JVMs, using the services provided by the product.

So the web servers are installed on different machines, and we configure them with the application server profiles; when any of the web servers is down, the load balancer will send the requests to the web server that is still active.


I deployed two applications (India and USA) in two JVMs (one for each country).

The developer requested to print the country's time zone in the respective JVM's SystemOut.log for troubleshooting an issue.

Go to Servers > Server Types > WebSphere application servers > server_name > Java process management > Process definition > Environment entries and change the TZ variable. This should work since, as you said, the India app is in one JVM and the US app in the other; if both applications were deployed in the same JVM, then it would be difficult.

Another way we can configure it: set user.timezone=IST (the time zone of the country) as a JVM custom property, or configure it in the generic JVM arguments as -Duser.timezone=IST.
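
A minimal wsadmin (Jython) sketch of the generic-argument route, assuming node01/server1 as example names (a full zone ID such as Asia/Kolkata is less ambiguous than IST, and the JVM needs a restart to pick it up):

# Locate the JVM config object of the target server (node01/server1 are example names)
server = AdminConfig.getid('/Node:node01/Server:server1/')
jvm = AdminConfig.list('JavaVirtualMachine', server)
# Append -Duser.timezone to the existing generic JVM arguments and save
args = AdminConfig.showAttribute(jvm, 'genericJvmArguments')
AdminConfig.modify(jvm, [['genericJvmArguments', args + ' -Duser.timezone=IST']])
AdminConfig.save()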

When we change the log paths of the server processes (nodeagent, dmgr, appserver), we are able to change the path. But why does the startServer.log and stopServer.log path not change?

Cause
Changing the LOG_ROOT variable from the Administrative Console does not change the location of the startServer.log or stopServer.log files, because the startServer and stopServer commands do not read any Java™ Virtual Machine (JVM) parameters.

Resolving the problem
This is working as designed. You cannot change the location of the startServer.log and stopServer.log files from the Administrative Console, or in any XML files, because the startServer.log is generated by the startServer command and there is no JVM loaded by it. Also the startServer and stopServer commands do not read any JVM parameters.
To change the location of these logs, specify the -logfile parameter and provide the location where the logs need to be written when the script is run.

startServer.sh  server1 -logfile /logs/start.log


startServer.bat server1 -logfile C:\logs\start.log

Please explain what the core group service and the HA manager are, and what the advantages of enabling the HA manager are.


This is a huge topic... I have provided a few points below to help you understand.

A core group is a physical grouping of all processes (dmgr, nodeagent and application servers) in a cell, and it provides communication among all the processes through the DCS layer. If we restart any process, it will try to establish communication with all the processes in the cell. Every process has an internal HA manager, which also provides the HA coordinator service (we can configure it, or by default any process in the cell can be the HA coordinator) to maintain high availability of the servers.

To give one example: I am doing online shopping and my session request was routed to server1; suddenly the server stopped responding properly because of huge utilization. The session manager will identify on which server the session data was replicated and inform the HA coordinator, and the HA coordinator will route the request to that server; this can be achieved by configuring the Data Replication Service.

Advantage of the core group: you can maintain session replication, so sessions will not lapse.

Disadvantage of the core group: in a large-topology environment (more than 200 JVMs in the cell) it is very difficult to manage with the default core group, because when we start a process (JVM) it will try to communicate with all the other JVMs (199) in the cell, and it will take a huge amount of time to bring the JVMs up (we can mitigate this by tuning at the OS level).

How we can handle it: we can divide the processes into different core groups and configure the bridge settings among the core groups. For example, if I have 200 JVMs in the cell, I will segregate them into 10 core groups of 20 JVMs each and configure the bridge settings among all the core groups; a minimal wsadmin sketch follows.
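
A minimal wsadmin (Jython) sketch of that segregation; CoreGroup2, node01 and server1 are example names, and the bridge interfaces themselves are configured afterwards under the core group bridge settings:

# Create an additional core group (CoreGroup2 is an example name)
AdminTask.createCoreGroup('[-coreGroupName CoreGroup2]')
# Move a server out of the default core group into the new one
AdminTask.moveServerToCoreGroup('[-source DefaultCoreGroup -target CoreGroup2 -nodeName node01 -serverName server1]')
AdminConfig.save()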

An HA manager runs inside every process in an ND cell, including the dmgr,
node agents and application servers. The HA manager provides availability of the servers by facilitating other services (like session management, the Data Replication Service, etc.), and it ensures singleton services: if one instance goes down, it will route the session requests to another instance. On the basis of a one-of-N policy, the HA coordinator routes the request to the other server, and this policy can be used by many services like the SIBus, the transaction service, etc.


Suppose I log in to the console and am changing a few configurations, and at the same time another guy logged into the server stops the dmgr process.
The changes are stored in the preferences.xml file.

I can retrieve the configuration once the restart is done. But suppose the other guy who logged into the server has deleted wstemp.

How can I retrieve the changes I have done?


It depends on which version of WAS you are working with. In WAS 8.5 preferences.xml is stored in the consolepreference folder, not in wstemp.
When you have made some configuration changes in the admin console and somebody removes the wstemp directory content, he cannot delete your data until you log out of the current session or he stops the dmgr.
If he stops the dmgr and then removes the wstemp content, it is difficult to revert your changes unless you have a backup of the wstemp directory.
The temporary directory includes a file named preferences.xml for each user, which stores the console preferences configured by that user. If you delete the preferences.xml for a specific user, the administrative console loses the configured preferences, and the user must reconfigure them at the next login. To clean the temporary directory of unnecessary data but keep the preferences configuration file, each user must log out of the administrative console before closing the web browser.
If you have a backup then you can restore the preferences.xml file and start the dmgr; when you log in to the dmgr it will prompt the options below:
Work with the master configuration

Recover changes made in a prior session

The server is going down unexpectedly with an out-of-memory error... SRVE0068E: Uncaught exception thrown in one of the service methods of the servlet: getdocstream. Exception thrown: java.lang.OutOfMemoryError. I increased the max and min heap sizes but am still getting this error. How can I resolve it? Please help!
WAS version is 6.1

Solution 1: Please provide the fixpack level you are currently running. I agree with Swapnil: apply the latest WAS fixpack, as several known memory leak issues were fixed in recent fixpacks. I also want to know whether there were any application changes. Applying a fixpack is one solution to overcome known issues in WebSphere, but sometimes the memory leak comes from the application side as well; to analyse it, collect a javacore and heapdump.
Solution 2:
I don't think a fixpack is a straightforward solution for this... I'm guessing the application is trying to load a huge document/content/object and going OOM even though we have/had enough maximum memory. If it is happening during JVM/application startup, then it is probably due to the initial heap; try to set a higher value for the initial heap (-Xms/-Xmos) and check whether it can load (a minimal sketch follows). But if we have the same application running on other JVMs and we are observing this behaviour on only a single JVM at runtime or intermittently, then it is something we need to check further. I think it could also be something related to application packaging; it would be helpful to know what the application is trying to load through that servlet.
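
For reference, a minimal wsadmin (Jython) sketch for raising the initial and maximum heap; node01/server1 and the sizes are assumed example values, and the server needs a restart afterwards:

# Locate the JVM settings of the affected server (node01/server1 are example names)
server = AdminConfig.getid('/Node:node01/Server:server1/')
jvm = AdminConfig.list('JavaVirtualMachine', server)
# Raise the initial and maximum heap sizes (values are in MB; pick your own)
AdminConfig.modify(jvm, [['initialHeapSize', '512'], ['maximumHeapSize', '1536']])
AdminConfig.save()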


Due to this error, we found OOM and hung threads in the server logs.


As you know, WAS is a runtime and it throws the same error for "N" reasons; before the OOM or hang is thrown, we have to check whether the JVM was working fine or facing any other kind of issue.
You can check the below things:
1. Check the ulimit values as suggested by IBM.
2. Check the connection between the dmgr and the nodeagent at the time the issue occurred (see the sketch after this list).
3. If you have any core group bridges, check whether they are working properly.
4. Have you analyzed the javacore or heapdump? Check in the javacore where the threads are waiting.
5. Was there any network issue at the time the problem occurred?
6. Does the OOM happen for the dmgr or an app server?
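
For point 2, a quick check from wsadmin (Jython) connected to the dmgr:

# List the node agent MBeans currently visible to the dmgr; a node agent
# that is down or unreachable will simply be missing from this output
print AdminControl.queryNames('type=NodeAgent,*')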

It would be better to share the full log so we can work out a workaround.


What is the exact difference between full synchronization and partial synchronization, and when do they happen, i.e. what operations trigger them? (We can do it from the console, but that is not what I am asking.)


First, an auto-sync process takes place between the nodeagent and the DMGR at regular time intervals (the interval can be customized). Second, any changes made in the console are propagated to the nodes when you save with the synchronize option. Third is the Full Resynchronize option from the console.

We have something called a digest and an epoch. A digest is a random, unique number assigned to each file, and an epoch is a random, unique number for each folder. The DMGR maintains the digests and epochs of its repository.

During auto-sync / partial sync, the DMGR checks which folders' epoch and digest numbers have changed, and only those files are synced.

During a full resync from the admin console, or when using syncNode.sh/bat, the epochs and digests are calculated again and the entire repository is synced to the nodes (a minimal wsadmin sketch follows).
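
A minimal wsadmin (Jython) sketch for driving a sync programmatically through the node agent's NodeSync MBean; node01 is an example name:

# Locate the NodeSync MBean of the node agent (node01 is an example name)
ns = AdminControl.completeObjectName('type=NodeSync,node=node01,*')
# Check whether the node is in sync with the master repository
print AdminControl.invoke(ns, 'isNodeSynchronized')
# Trigger a synchronization right now
print AdminControl.invoke(ns, 'sync')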

Why do we need to restart the node agent after configuring a data source for the test connection to be successful?
(I have faced this situation in my environment and was a little surprised to hear from my IBM resource that we need to restart the node agent after configuring the datasource.)

The reason is that the "Test Connection" button you see on the admin console invokes the JDBC connection test from within the address space of the node agent.
There is no way for the J2C Alias information to propagate to the Node Agent without restarting it;
some configuration objects take effect in WebSphere as soon as you save the configuration to the master repository, and some only take effect on a restart.
J2C aliases take effect on restarts.
In a Network Deployment topology, you may have 20 server instances all controlled by the same Node Agent.
You may restart your server instances many times, but unless you restart the Node Agent itself, the "test connection" button will never work.
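
If you prefer not to shell into the box, the node agent can also be restarted via its MBean; a sketch from wsadmin (Jython), assuming node01 as the node name and the restart(syncFirst, restartServers) operation available on ND node agents:

# Locate the NodeAgent MBean (node01 is an example name)
na = AdminControl.completeObjectName('type=NodeAgent,node=node01,*')
# Restart the node agent: sync the config first, do not restart its servers
AdminControl.invoke(na, 'restart', 'true false', 'java.lang.Boolean java.lang.Boolean')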

I need to stop the IHS server, but my users should not lose their sessions while their connections are live. Which command should be used to stop the web server?

apachectl -k graceful-stop

We can use the above command, but we have to set the graceful shutdown time in the config: with a value of zero the parent waits for the child processes to finish routing the in-flight requests, otherwise it shuts down after the configured timeout (around 7-9 seconds in our setup) even if requests are still being routed by the child processes.
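
For reference, the directive involved on Apache-based IHS levels that support it is GracefulShutdownTimeout; a minimal httpd.conf excerpt, assuming you want the parent to wait for all in-flight requests:

# How long the parent waits for child processes during a graceful-stop;
# 0 means wait until all active requests have completed
GracefulShutdownTimeout 0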

