WebLogic Diagnostic Archive Bug
-
Hi,
We have repeatedly run into stuck threads in the WebLogic Diagnostic Framework’s archive file store. While researching this, we found that it is a known bug from WLS 12.1.2 onwards. Oracle’s suggested fix is to include
-Dweblogic.diagnostics.DisableDiagnosticRuntimeControlService=true
in the Java options. Please refer to:
STUCK threads at weblogic.diagnostics.archive.wlstore.PersistentStoreDataArchive.readRecord (Doc ID 2253017.1)
My question is: if we set this diagnostic parameter, would it affect any WLSDM functionality or data capture? Please let me know.
-
Hi,
We will reproduce the issue and get back to you with the result.
Meanwhile, could you please attach the full STUCK thread log message and the log file? It is very important that we get this information.
Also, please zip and upload the whole WLSDM/logs folder.
We need to reproduce it.
Regards…
-
Hi,
Is there an email or private upload section to share all the logs with you?
- Jeba,
-
-
Hi,
I have sent the WLSDM log file zip and also the JFR recording taken during the problematic time period. The JFR file has the full stack trace.
Let me know if that works.
Jeba.
-
Hi Jeba,
We have analyzed the problem you’re facing in detail. The STUCK thread is shown below; it differs from the STUCK thread documented in MOS Doc ID 2253017.1, although both originate in the same module, WLDF.
The JVM flag
-Dweblogic.diagnostics.DisableDiagnosticRuntimeControlService=true
is valid only for Oracle WebLogic release 12.1.2 and later. Please check this note:
NOTE: The flag is only relevant for WLS 12.1.2+. The RuntimeControl feature and built-ins were introduced in the 12.1.2 release.
Your WebLogic version is 12.1.1, which means this JVM flag is not going to help with this problem.
However, there is a workaround for the issue you’ve reported. It is caused by a different WebLogic bug or performance degradation related to querying WebLogic access logs through WLDF.
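The version rule in the note can be checked mechanically. The following is only a minimal sketch: the helper names `parse_version` and `flag_is_relevant` are hypothetical, and it assumes the version is given as a plain dotted numeric string such as the one shown in the Admin console.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '12.2.1.1.0' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def flag_is_relevant(wls_version: str) -> bool:
    """Return True if the version is 12.1.2 or later, as stated in the Oracle note."""
    # Tuples compare element by element, so (12, 2, 1, 1, 0) >= (12, 1, 2) holds.
    return parse_version(wls_version) >= (12, 1, 2)


if __name__ == "__main__":
    for candidate in ("12.1.1", "12.1.2", "12.2.1.1.0"):
        print(candidate, flag_is_relevant(candidate))
```

Comparing integer tuples rather than strings avoids mistakes such as treating "12.10" as older than "12.2".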
Here is the workaround:
- Check the access log folder of every WebLogic server and count the access log files, including rotated ones. Access logs are located at $DOMAIN_HOME/servers/$SERVER_NAME/logs/access.log (including rotated files).
- Move or delete all rotated access logs (access.log00001xxx) from that path. Apply this to all WebLogic managed servers.
- After the previous step there is no need to restart the managed servers. Check the STUCK threads again.
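The count-and-move steps of the workaround could be scripted roughly as below. This is a sketch under assumptions, not a supported tool: the function names and the `archive_root` destination are hypothetical, and it assumes rotated files are named `access.log` followed by a suffix (for example `access.log00001`) directly under `servers/<server>/logs` in the domain home. The live `access.log` is left in place.

```python
import shutil
from pathlib import Path
from typing import Dict, List


def find_rotated_access_logs(domain_home: Path) -> Dict[str, List[Path]]:
    """Map each server name to its rotated access logs (access.log plus a suffix)."""
    rotated: Dict[str, List[Path]] = {}
    for logs_dir in sorted(domain_home.glob("servers/*/logs")):
        # "access.log?*" requires at least one character after "access.log",
        # so the live access.log itself is never matched.
        files = sorted(p for p in logs_dir.glob("access.log?*") if p.is_file())
        if files:
            rotated[logs_dir.parent.name] = files
    return rotated


def move_rotated_access_logs(domain_home: Path, archive_root: Path) -> int:
    """Move rotated access logs into archive_root/<server>/ and return the count."""
    moved = 0
    for server, files in find_rotated_access_logs(domain_home).items():
        target_dir = archive_root / server
        target_dir.mkdir(parents=True, exist_ok=True)
        for path in files:
            shutil.move(str(path), str(target_dir / path.name))
            moved += 1
    return moved
```

Calling `find_rotated_access_logs` first gives the per-server counts without changing anything, so the number of files can be reviewed before `move_rotated_access_logs` is run.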
STUCK THREADs in Thread Dump
"[STUCK] ExecuteThread: '47' for queue: 'weblogic.kernel.Default (self-tuning)'" #14988 daemon prio=1 os_prio=0 tid=0x00007ff3f0007000 nid=0x6946 waiting for monitor entry [0x00007ff363c75000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at weblogic.utils.classloaders.GenericClassLoader.loadClass(GenericClassLoader.java:515)
    - waiting to lock <0x00000006c4f9ac90> (a java.lang.Object)
    at weblogic.utils.classloaders.GenericClassLoader.loadClass(GenericClassLoader.java:492)
    at weblogic.utils.classloaders.GenericClassLoader.loadClass(GenericClassLoader.java:469)
    at antlr.Utils.loadClass(Utils.java:18)
    at antlr.CharScanner.setTokenObjectClass(CharScanner.java:335)
    at antlr.CharScanner.<init>(CharScanner.java:49)
    at antlr.CharScanner.<init>(CharScanner.java:58)
    at weblogic.diagnostics.archive.filestore.AccessLogLexer.<init>(AccessLogLexer.java:62)
    at weblogic.diagnostics.archive.filestore.AccessLogLexer.<init>(AccessLogLexer.java:59)
    at weblogic.diagnostics.archive.filestore.AccessLogLexer.<init>(AccessLogLexer.java:56)
    at weblogic.diagnostics.archive.filestore.AccessLogRecordParser.parseRecord(AccessLogRecordParser.java:60)
    at weblogic.diagnostics.archive.filestore.RecordReader.getRecord(RecordReader.java:201)
    at weblogic.diagnostics.archive.filestore.FileRecordIterator.readRecords(FileRecordIterator.java:76)
    at weblogic.diagnostics.archive.filestore.FileRecordIterator.fill(FileRecordIterator.java:245)
    at weblogic.diagnostics.archive.RecordIterator.fetchMore(RecordIterator.java:157)
    at weblogic.diagnostics.archive.RecordIterator.hasNext(RecordIterator.java:130)
    at weblogic.diagnostics.collections.BulkIteratorImpl.hasNext(BulkIteratorImpl.java:61)
    at weblogic.diagnostics.collections.BulkIteratorImpl_WLSkel.invoke(Unknown Source)
    at weblogic.rmi.internal.BasicServerRef.invoke(BasicServerRef.java:645)
    at weblogic.rmi.internal.BasicServerRef$2.run(BasicServerRef.java:534)
    at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:368)
    at weblogic.security.service.SecurityManager.runAs(SecurityManager.java:163)
    at weblogic.rmi.internal.BasicServerRef.handleRequest(BasicServerRef.java:531)
    at weblogic.rmi.internal.wls.WLSExecuteRequest.run(WLSExecuteRequest.java:138)
    at weblogic.invocation.ComponentInvocationContextManager._runAs(ComponentInvocationContextManager.java:348)
    at weblogic.invocation.ComponentInvocationContextManager.runAs(ComponentInvocationContextManager.java:333)
    at weblogic.work.LivePartitionUtility.doRunWorkUnderContext(LivePartitionUtility.java:54)
    at weblogic.work.PartitionUtility.runWorkUnderContext(PartitionUtility.java:41)
    at weblogic.work.SelfTuningWorkManagerImpl.runWorkUnderContext(SelfTuningWorkManagerImpl.java:640)
    at weblogic.work.ExecuteThread.execute(ExecuteThread.java:406)
    at weblogic.work.ExecuteThread.run(ExecuteThread.java:346)
STUCK Thread in the MOS Document:
"[ACTIVE] ExecuteThread: '17' for queue: 'weblogic.kernel.Default (self-tuning)'" #78 daemon prio=5 os_prio=64 tid=0x0000000105dc1000 nid=0x7f in Object.wait() [0xffffffff343fe000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:502)
    at weblogic.common.CompletionRequest.getResult(CompletionRequest.java:115)
    - locked (a weblogic.common.CompletionRequest)
    at weblogic.diagnostics.archive.wlstore.PersistentStoreDataArchive.readRecord(PersistentStoreDataArchive.java:708)
    - locked (a weblogic.diagnostics.archive.wlstore.HarvestedPersistentStoreDataArchive)
    at weblogic.diagnostics.archive.wlstore.PersistentStoreDataArchive.readRecord(PersistentStoreDataArchive.java:681)
    at weblogic.diagnostics.archive.wlstore.PersistentStoreDataArchive.getWrapper(PersistentStoreDataArchive.java:1809)
    at weblogic.diagnostics.archive.wlstore.PersistentStoreDataArchive.removeGarbageInPage(PersistentStoreDataArchive.java:1855)
    - locked (a weblogic.diagnostics.archive.wlstore.HarvestedPersistentStoreDataArchive)
    at weblogic.diagnostics.archive.wlstore.PersistentStoreDataArchive.cleanupPages(PersistentStoreDataArchive.java:1736)
    at weblogic.diagnostics.archive.wlstore.PersistentStoreDataArchive.cleanup(PersistentStoreDataArchive.java:1684)
    at weblogic.diagnostics.archive.wlstore.PersistentStoreDataArchive.deleteDataRecords(PersistentStoreDataArchive.java:1393)
    at weblogic.diagnostics.archive.wlstore.PersistentStoreDataArchive.retireOldestRecords(PersistentStoreDataArchive.java:1240)
    at weblogic.diagnostics.archive.DataRetirementByQuotaTaskImpl.performDataRetirement(DataRetirementByQuotaTaskImpl.java:92)
    at weblogic.diagnostics.archive.DataRetirementByQuotaTaskImpl.run(DataRetirementByQuotaTaskImpl.java:49)
    at weblogic.diagnostics.archive.DataRetirementTaskImpl.run(DataRetirementTaskImpl.java:276)
    at weblogic.work.SelfTuningWorkManagerImpl$WorkAdapterImpl.run(SelfTuningWorkManagerImpl.java:643)
    at weblogic.invocation.ComponentInvocationContextManager._runAs(ComponentInvocationContextManager.java:348)
    at weblogic.invocation.ComponentInvocationContextManager.runAs(ComponentInvocationContextManager.java:333)
    at weblogic.work.LivePartitionUtility.doRunWorkUnderContext(LivePartitionUtility.java:54)
    at weblogic.work.PartitionUtility.runWorkUnderContext(PartitionUtility.java:41)
    at weblogic.work.SelfTuningWorkManagerImpl.runWorkUnderContext(SelfTuningWorkManagerImpl.java:617)
    at weblogic.work.ExecuteThread.execute(ExecuteThread.java:397)
    at weblogic.work.ExecuteThread.run(ExecuteThread.java:346)
"[STUCK] ExecuteThread: '16' for queue: 'weblogic.kernel.Default (self-tuning)'" #77 daemon prio=1 os_prio=64 tid=0x000000010829e000 nid=0x7e waiting for monitor entry [0xffffffff345fe000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at weblogic.diagnostics.archive.wlstore.PersistentStoreDataArchive.insertRecord(PersistentStoreDataArchive.java:853)
    - waiting to lock (a weblogic.diagnostics.archive.wlstore.HarvestedPersistentStoreDataArchive)
    at weblogic.diagnostics.archive.wlstore.PersistentStoreDataArchive.writeData(PersistentStoreDataArchive.java:1523)
    at weblogic.diagnostics.harvester.internal.MetricArchiver.archive(MetricArchiver.java:612)
    at weblogic.diagnostics.harvester.internal.HarvesterSamplesQueue$2.run(HarvesterSamplesQueue.java:80)
    - locked (a weblogic.diagnostics.harvester.internal.HarvesterSamplesQueue$2)
    at weblogic.work.SelfTuningWorkManagerImpl$WorkAdapterImpl.run(SelfTuningWorkManagerImpl.java:643)
    at weblogic.invocation.ComponentInvocationContextManager._runAs(ComponentInvocationContextManager.java:348)
    at weblogic.invocation.ComponentInvocationContextManager.runAs(ComponentInvocationContextManager.java:333)
    at weblogic.work.LivePartitionUtility.doRunWorkUnderContext(LivePartitionUtility.java:54)
    at weblogic.work.PartitionUtility.runWorkUnderContext(PartitionUtility.java:41)
    at weblogic.work.SelfTuningWorkManagerImpl.runWorkUnderContext(SelfTuningWorkManagerImpl.java:617)
    at weblogic.work.ExecuteThread.execute(ExecuteThread.java:397)
    at weblogic.work.ExecuteThread.run(ExecuteThread.java:346)
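The difference between the two cases is visible in the package names: the reported thread is stuck under weblogic.diagnostics.archive.filestore (access log parsing), while the MOS case is under weblogic.diagnostics.archive.wlstore. As a rough sketch, a thread dump could be sorted into these two groups as follows. The function name `classify_stuck_threads` is hypothetical, and the code assumes a standard multi-line dump in which each thread entry starts with its quoted name at the beginning of a line.

```python
from typing import Dict, List

# Package prefixes taken from the two stack traces in this thread.
SIGNATURES = {
    "weblogic.diagnostics.archive.filestore.": "access log archive (filestore)",
    "weblogic.diagnostics.archive.wlstore.": "persistent store archive (wlstore)",
}


def classify_stuck_threads(dump: str) -> Dict[str, str]:
    """Map each [STUCK] thread name to the WLDF archive module found in its stack."""
    # Group lines into per-thread blocks; a new block starts at a quoted thread name.
    blocks: List[List[str]] = []
    for line in dump.splitlines():
        if line.startswith('"'):
            blocks.append([line])
        elif blocks:
            blocks[-1].append(line)

    result: Dict[str, str] = {}
    for block in blocks:
        header = block[0]
        if not header.startswith('"[STUCK]'):
            continue
        name = header[1:header.index('"', 1)]
        text = "\n".join(block)
        # First matching prefix wins; anything else is reported as "other".
        result[name] = next(
            (label for prefix, label in SIGNATURES.items() if prefix in text),
            "other",
        )
    return result
```

Threads marked [ACTIVE] are skipped on purpose, so in the MOS example only ExecuteThread '16' would be reported, not the thread holding the lock.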
In summary:
- Always keep WebLogic logs tidy: move or delete them periodically.
- We have made some changes and performance improvements, and added safeguards against this issue. They will be available in the next WLSDM release, v3.2.3, which will be ready very soon.
Regards…
-
Hi,
Thank you for the suggestions. I will look into this and clear the logs.
Our WebLogic server version is 12.2.1.1.0. I uploaded a screen capture of the information from the Admin console. We went ahead and added the JVM parameter to the startup options, as per Oracle’s suggestion in Doc ID 2253017.1.
-
Hi Jeba,
Thank you for the additional information. Please clear the logs and then let us know the result. You can easily track STUCK threads by following the WLSDM stuck notifications.
Regards…
-
-
Hi,
Are you sure all access and diagnostic logs have been backed up?
Which WLSDM version are you using?
Please send your reply to the [email protected] email address.
Regards.