FULL GC有可能导致JVM暂停

http://hllvm.group.iteye.com/group/topic/28745

与另外个程序交互有个心跳检测, 15s/次, 31s没有收到心跳检测的返回消息就认为连接断了, 

现在遇到一个诡异问题, 不知什么原因 导致程序中1分多钟内日志无任何输出, 接着就是检测到超时, 可是接着连续输出多次发送keepalive. 

发送keepalive就是很简单的java.util.Timer, 速率固定 

启动参数 
-Xms16m -Xmx128m -XX:+HeapDumpOnOutOfMemoryError 

程序启动时间[15:55:10.225] [23.12.2011] 


第一次 
Java代码   收藏代码
  1. [09:51:16.288] [26.12.2011] [Thread-14][DEBUG] Keepalive receiver started or reseted.  
  2.   
  3. [09:53:26.819] [26.12.2011] [Timer-9197][ERROR] Keepalive timeout reached  
  4.   
  5. [09:53:26.819] [26.12.2011] [Timer-9][DEBUG] Keepalive sent  
  6.   
  7. [09:53:26.819] [26.12.2011] [Timer-9][DEBUG] Keepalive sent  
  8.   
  9. [09:53:26.819] [26.12.2011] [Timer-9][DEBUG] Keepalive sent  
  10.   
  11. [09:53:26.819] [26.12.2011] [Timer-9][DEBUG] Keepalive sent  
  12.   
  13. [09:53:26.819] [26.12.2011] [Timer-9][DEBUG] Keepalive sent  
  14.   
  15. [09:53:26.819] [26.12.2011] [Timer-9][DEBUG] Keepalive sent  
  16.   
  17. [09:53:26.819] [26.12.2011] [Timer-9][DEBUG] Keepalive sent  
  18.   
  19. [09:53:26.819] [26.12.2011] [Timer-9][DEBUG] Keepalive sent        
  20.   
  21. [09:53:26.819] [26.12.2011] [Timer-9197][DEBUG] Sent <Disconnect/>  


重连启动后第二次 
Java代码   收藏代码
  1. [09:57:17.053] [26.12.2011] [Thread-9207][DEBUG] Keepalive receiver started or reseted.  
  2.   
  3. [09:58:29.381] [26.12.2011] [Timer-9259][ERROR] Keepalive timeout reached  
  4.   
  5. [09:58:29.381] [26.12.2011] [Timer-9259][DEBUG] Sent <Disconnect/>  
  6.   
  7. [09:58:29.397] [26.12.2011] [Timer-9200][DEBUG] Keepalive sent  
  8.   
  9. [09:58:29.397] [26.12.2011] [Timer-9200][DEBUG] Keepalive sent  
  10.   
  11. [09:58:29.397] [26.12.2011] [Timer-9200][DEBUG] Keepalive sent  
  12.   
  13. [09:58:29.397] [26.12.2011] [Timer-9200][DEBUG] Keepalive sent  
qianhd 2011-12-30
另外 -Xloggc 这个jvm的gc日志能以压缩的形式输出吗? 或者过滤一些.
RednaxelaFX 2011-12-30
GC暂停超过1分钟的情况不是没有,但以-Xmx128m的配置来看这个很诡异。
请问能使用 -XX:+PrintSafepointStatistics 来看看VM暂停的统计信息么?
一共是这几个参数:

C++代码   收藏代码
  1. product(bool, PrintSafepointStatistics, false,                            \  
  2.         "print statistics about safepoint synchronization")               \  
  3.                                                                           \  
  4. product(intx, PrintSafepointStatisticsCount, 300,                         \  
  5.         "total number of safepoint statistics collected "                 \  
  6.         "before printing them out")                                       \  
  7.                                                                           \  
  8. product(intx, PrintSafepointStatisticsTimeout,  -1,                       \  
  9.         "print safepoint statistics only when safepoint takes"            \  
  10.         " more than PrintSafepointSatisticsTimeout in millis")            \  
qianhd 2011-12-30
应该可以 
但是这问题也就遇到过这一次  未必能重现 
这地方只要出现过一次的问题 都能报bug 

就连temp目录垃圾文件太多 导致程序里创建临时文件太慢 使得程序响应变慢 都能报个bug.
william_ai 2011-12-30
3分钟以内,但是频率不能太高了。
xiaoyu 2011-12-31
Threaddump 一下. 先看看有没有其他东西在block住... 对于2边网络, 也最好使用tcpdump看一下是否出现什么问题(是不是ARP失效等, 是不是发送, 接收缓冲区满了等等)...

x_root 2012-01-01
你可以将readTimeout时间设大点,这个问题我以前也遇到过,GC时间过长,导致服务端以为连接无效而关闭了。

不过你的设置-Xmx128m却导致GC要一分多钟却有点不可思议。
iaimstar 2012-01-04
Xmx128m 也能一分钟以上?你这fullgc是坑爹啊
moonese 2012-02-27
会不会是对方程序的问题? 最后查出原因了?
qianhd 2012-02-29
最终没有查出原因, 因为很难复现 
这破烂货一般来说只要开3天以上, 就会有很多莫名其妙的问题. 所以建议用户不用就关掉 
从上面的[Timer-9200]就可以看出代码质量有多少差 
当然这破烂货不是我写的.
jjshanwei 2012-03-07
Xmx128m 的fullgc 要一分钟? 应改不是gc本身的问题
moonese 2012-03-08
jjshanwei 写道
Xmx128m 的fullgc 要一分钟? 应改不是gc本身的问题


不知道是不是这台机器太老,内存太小,上面跑的程序太多的缘故。 

建议加上打印gc日志的参数 -verbose:gc -XX:+PrintGCDetails -Xloggc:timer_gc.log,分析gc的日志,就知道是不是 GC 导致的问题了。 

或是用 jconsole / jvisualvm 监控下程序的 heap 使用情况。 

怀疑是程序里面有死锁或是代码写的不好,有性能问题,这个也可以通过工具查出的。
qianhd 2012-08-21
RednaxelaFX 写道
GC暂停超过1分钟的情况不是没有,但以-Xmx128m的配置来看这个很诡异。 
请问能使用 -XX:+PrintSafepointStatistics 来看看VM暂停的统计信息么? 
一共是这几个参数: 

C++代码   收藏代码
  1. product(bool, PrintSafepointStatistics, false,                            \  
  2.         "print statistics about safepoint synchronization")               \  
  3.                                                                           \  
  4. product(intx, PrintSafepointStatisticsCount, 300,                         \  
  5.         "total number of safepoint statistics collected "                 \  
  6.         "before printing them out")                                       \  
  7.                                                                           \  
  8. product(intx, PrintSafepointStatisticsTimeout,  -1,                       \  
  9.         "print safepoint statistics only when safepoint takes"            \  
  10.         " more than PrintSafepointSatisticsTimeout in millis")            \  




有没有资料详细讲一下这个参数?  
噩梦的问题又来了~
RednaxelaFX 2012-08-21
例如这帖:http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2011-August/002968.html ?
你可以搜一下OpenJDK的hotspot-gc-dev和hotspot-gc-use这两个邮件列表的过往内容。这个参数不是“标准参数”所以也没啥公开文档描述它的意义的。
qianhd 2012-08-21
RednaxelaFX 写道
例如这帖:http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2011-August/002968.html ? 
你可以搜一下OpenJDK的hotspot-gc-dev和hotspot-gc-use这两个邮件列表的过往内容。这个参数不是“标准参数”所以也没啥公开文档描述它的意义的。



Orz 
你真是24小时在线~ 

如果我设置个 MaxGCPauseMillis这个参数 
有意义吗? 
RednaxelaFX 2012-08-21
qianhd 写道

如果我设置个 MaxGCPauseMillis这个参数 
有意义吗? 

不是完全没意义不过这参数我印象中只有在ParallelScavenge和G1中的作用才比较明显,CMS虽然也有用这个参数但我印象中它没啥用。你最好到hotspot-gc-use那边问问,靠谱些。
qianhd 2012-08-23
我加了3个参数 
-XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDateStamps -Xloggc:gclog.log 

今天早上一看, 果然断了, 晚上8点17的样子 
common GC竟然花了41秒 诧异~ 

另外,不是full GC才会stop all threads吗? 

再另外,我设定过最大和最小内存 -Xms64m -Xmx256m 
但是实际上最大时内存占用达到了400多m 
这是为啥? 

再再另外,有没有办法记录当时的JVM的一些状态信息? 

求解惑 

引用

2012-08-22T20:17:27.921+0800: 16489.272: [GC 191334K->84242K(299840K), 41.4177511 secs] 
Total time for which application threads were stopped: 41.4731411 seconds 
Total time for which application threads were stopped: 0.0489481 seconds 
Total time for which application threads were stopped: 0.1110761 seconds 
Total time for which application threads were stopped: 0.0007286 seconds 
Total time for which application threads were stopped: 0.0001268 seconds 
Total time for which application threads were stopped: 2.8581772 seconds 
Total time for which application threads were stopped: 0.0112389 seconds 
Total time for which application threads were stopped: 0.0001335 seconds 
Total time for which application threads were stopped: 0.0003246 seconds 
Total time for which application threads were stopped: 0.0003774 seconds 
Total time for which application threads were stopped: 0.0001439 seconds 
Total time for which application threads were stopped: 0.0000897 seconds 
Total time for which application threads were stopped: 0.0001408 seconds 
Total time for which application threads were stopped: 0.0001687 seconds 
Total time for which application threads were stopped: 0.0001380 seconds 
Total time for which application threads were stopped: 0.0001374 seconds 
Total time for which application threads were stopped: 0.0002028 seconds 
Total time for which application threads were stopped: 0.0001171 seconds 
Total time for which application threads were stopped: 0.0001277 seconds 
Total time for which application threads were stopped: 0.0002257 seconds 
Total time for which application threads were stopped: 0.0353492 seconds 
Total time for which application threads were stopped: 0.0004783 seconds 
Total time for which application threads were stopped: 0.0001612 seconds 
Total time for which application threads were stopped: 0.0075786 seconds 
Total time for which application threads were stopped: 0.0001754 seconds 
Total time for which application threads were stopped: 0.0002405 seconds 
Total time for which application threads were stopped: 0.0005079 seconds 
Total time for which application threads were stopped: 0.0001257 seconds 
Total time for which application threads were stopped: 0.0004294 seconds 
Total time for which application threads were stopped: 0.0001970 seconds 
Total time for which application threads were stopped: 0.0001310 seconds 
Total time for which application threads were stopped: 0.0001763 seconds 
Total time for which application threads were stopped: 0.0001685 seconds 
Total time for which application threads were stopped: 0.0004185 seconds 
Total time for which application threads were stopped: 0.0001067 seconds 
Total time for which application threads were stopped: 0.0005286 seconds 
Total time for which application threads were stopped: 0.0001140 seconds 
Total time for which application threads were stopped: 0.0002386 seconds 
Total time for which application threads were stopped: 0.0003646 seconds 
Total time for which application threads were stopped: 0.0001048 seconds 
Total time for which application threads were stopped: 0.0003132 seconds 
Total time for which application threads were stopped: 0.0001953 seconds 
Total time for which application threads were stopped: 0.0003942 seconds 
Total time for which application threads were stopped: 0.0004210 seconds 
Total time for which application threads were stopped: 0.0001009 seconds 
Total time for which application threads were stopped: 0.0001430 seconds 
Total time for which application threads were stopped: 0.0000816 seconds 
Total time for which application threads were stopped: 0.0297057 seconds 
Total time for which application threads were stopped: 0.0462548 seconds 
Total time for which application threads were stopped: 0.0003872 seconds 
Total time for which application threads were stopped: 0.0004657 seconds 
Total time for which application threads were stopped: 0.0004777 seconds 
Total time for which application threads were stopped: 0.0004568 seconds 
Total time for which application threads were stopped: 0.0003623 seconds 
Total time for which application threads were stopped: 0.0004906 seconds 
Total time for which application threads were stopped: 0.0003693 seconds 
Total time for which application threads were stopped: 0.0004129 seconds 
Total time for which application threads were stopped: 0.0003953 seconds 
Total time for which application threads were stopped: 0.0052699 seconds 
Total time for which application threads were stopped: 0.0002953 seconds 
Total time for which application threads were stopped: 0.0000514 seconds 
Total time for which application threads were stopped: 0.0000310 seconds 
Total time for which application threads were stopped: 0.0000603 seconds 
Total time for which application threads were stopped: 0.0000293 seconds 
Total time for which application threads were stopped: 0.0000777 seconds 
Total time for which application threads were stopped: 0.0001048 seconds 
Total time for which application threads were stopped: 0.0003676 seconds 
Total time for which application threads were stopped: 0.0003964 seconds 
Total time for which application threads were stopped: 0.0001201 seconds 
Total time for which application threads were stopped: 0.0003676 seconds 
Total time for which application threads were stopped: 0.0003995 seconds 
Total time for which application threads were stopped: 0.0004118 seconds 
Total time for which application threads were stopped: 0.0004026 seconds 
Total time for which application threads were stopped: 0.0003911 seconds 
Total time for which application threads were stopped: 0.0367284 seconds 
Total time for which application threads were stopped: 0.0003959 seconds 
Total time for which application threads were stopped: 0.0004009 seconds 
Total time for which application threads were stopped: 0.0003964 seconds 
Total time for which application threads were stopped: 0.0003727 seconds 
Total time for which application threads were stopped: 0.0002084 seconds 
Total time for which application threads were stopped: 0.0003738 seconds 
Total time for which application threads were stopped: 0.0003671 seconds 
Total time for which application threads were stopped: 0.0004165 seconds 
Total time for which application threads were stopped: 0.0004026 seconds 
Total time for which application threads were stopped: 0.0003674 seconds 
Total time for which application threads were stopped: 0.0003964 seconds 
Total time for which application threads were stopped: 0.0004165 seconds 
Total time for which application threads were stopped: 0.0003908 seconds 
Total time for which application threads were stopped: 0.0004109 seconds 
Total time for which application threads were stopped: 0.0004540 seconds 
Total time for which application threads were stopped: 0.0003730 seconds 
Total time for which application threads were stopped: 0.0003889 seconds 
Total time for which application threads were stopped: 0.0003495 seconds 
Total time for which application threads were stopped: 0.0003665 seconds 
Total time for which application threads were stopped: 0.0003805 seconds 
Total time for which application threads were stopped: 0.0004277 seconds 
Total time for which application threads were stopped: 0.0003864 seconds 
Total time for which application threads were stopped: 0.0003939 seconds 
Total time for which application threads were stopped: 0.0002249 seconds 
Total time for which application threads were stopped: 0.0003922 seconds 
Total time for which application threads were stopped: 0.0006319 seconds 
Total time for which application threads were stopped: 0.0003757 seconds 
Total time for which application threads were stopped: 0.0004816 seconds 
Total time for which application threads were stopped: 0.0004068 seconds 
Total time for which application threads were stopped: 0.0003878 seconds 
Total time for which application threads were stopped: 0.0004177 seconds 
Total time for which application threads were stopped: 0.0003771 seconds 
Total time for which application threads were stopped: 0.0003903 seconds 
Total time for which application threads were stopped: 0.0003964 seconds 
Total time for which application threads were stopped: 0.0004051 seconds 
Total time for which application threads were stopped: 0.0003984 seconds 
Total time for which application threads were stopped: 0.0003757 seconds 
Total time for which application threads were stopped: 0.0004886 seconds 
Total time for which application threads were stopped: 0.0004056 seconds 
Total time for which application threads were stopped: 0.0003822 seconds 
Total time for which application threads were stopped: 0.0003660 seconds 
Total time for which application threads were stopped: 0.0004520 seconds 
Total time for which application threads were stopped: 0.0003788 seconds 
Total time for which application threads were stopped: 0.0003671 seconds 
Total time for which application threads were stopped: 0.0003752 seconds 
Total time for which application threads were stopped: 0.0003964 seconds 
Total time for which application threads were stopped: 0.0002159 seconds 
Total time for which application threads were stopped: 0.0005065 seconds 
Total time for which application threads were stopped: 0.0003685 seconds 
Total time for which application threads were stopped: 0.0004872 seconds 
Total time for which application threads were stopped: 0.0006339 seconds 
Total time for which application threads were stopped: 0.0003903 seconds 
Total time for which application threads were stopped: 0.0004392 seconds 
qianhd 2012-08-24
引用

2012-08-24T10:47:26.359+0800: 62319.697: [Full GC62334.410: [SoftReference, 1591 refs, 0.0004489 secs]62334.411: [WeakReference, 7980 refs, 0.0011812 secs]62334.412: [FinalReference, 1683 refs, 0.0012770 secs]62334.413: [PhantomReference, 125 refs, 0.0000475 secs]62334.413: [JNI Weak Reference, 0.0003093 secs] [PSYoungGen: 7912K->0K(183360K)] [PSOldGen: 146039K->70830K(145024K)] 153952K->70830K(328384K) [PSPermGen: 42129K->42129K(65536K)], 17.9215166 secs] [Times: user=0.75 sys=0.25, real=17.92 secs] 



引用

2012-08-24T14:21:54.218+0800: 75187.848: [GC75197.407: [SoftReference, 0 refs, 0.0000056 secs]75197.407: [WeakReference, 102 refs, 0.0000212 secs]75197.407: [FinalReference, 1318 refs, 0.6698083 secs]75198.077: [PhantomReference, 1549 refs, 0.0088953 secs]75198.086: [JNI Weak Reference, 0.0315859 secs] [PSYoungGen: 170048K->1232K(171328K)] 240878K->72062K(316352K), 10.2911859 secs] [Times: user=2.01 sys=0.01, real=10.30 secs] 



从这个GC日志 能看出来17秒和10秒时间分别花在什么地方了吗
qianhd 2012-08-24
我发现一个现象, 长时间的GC的前面一般3,4个小时都没有发生过GC, 因为是个GUI程序,我都最下化, 这样内存占用会减少很多, 但是虚拟内存占用还是原来那么多. 
会不会在GC的时候,需要重新把虚拟内存的数据读回到内存, 这样就触发 先腾出内存,(某些进程的内容先暂存虚拟内存),然后把虚拟内存的数据读进来. 

会不会是这种原因导致的??
qianhd 2012-08-30
引用

2012-08-29T19:14:30.968+0800:  10069.800: [GC 10099.225: [SoftReference, 0 refs, 0.0000109 secs]10099.226: [WeakReference, 4072 refs, 0.0012099 secs]10099.227: [FinalReference, 984 refs, 1.5822450 secs]10100.809: [PhantomReference, 251 refs, 0.0001394 secs]10100.809: [JNI Weak Reference, 0.0994015 secs] [PSYoungGen: 175672K->8528K(167360K)] 251523K->100182K(353152K), 31.1580402 secs] [Times: user=0.61 sys=0.52, real=31.16 secs] 



其实GC执行, 并没有花很多时间, 从上面这段信息看, 开始准备GC在10069.800, 真正GC开始在10099.225, 中间花费了约30秒, 貌似时间大部分是花费在GC的准备上. 
这需要看看openJDK里面,执行GC前哪些操作比较耗时, 
我前面的猜测是可能和windows内存和虚拟内存有关, 这东西越来越底层了, 有点力不从心
qianhd 2012-09-28
搞定~ 

windows会在最小化GUI程序的时候trim这个进程的working set, 详见: 
http://support.microsoft.com/default.aspx?scid=kb;en-us;293215 

这样就导致了在下一次GC的时候, 需要重新从页面文件中reload数据, 这样就导致时间花费的很多. 

如何在最小化GUI程序的时候keep working set呢? MS给出了解决方案是针对VC应用程序的, java的话需要使用这个参数 -Dsun.awt.keepWorkingSetOnMinimize=true 

用Intellij的同学可能对这个参数有印象,"可以让IDEA最小化到任务栏时依然保持以占有的内存,当你重新回到IDEA,能够被快速显示,而不是由灰白的界面逐渐显现整个界面,加快回复到原界面的速度。" 

可惜我没用过Intellij,所以不知道有这参数, 
qianhd 2012-09-28
@RednaxelaFX 

能不能从openJDK源码的角度告诉我下, 这gclog的这些时间, 比如9622.177 9638.861 分别是表示什么时间点 

2012-09-26T19:00:49.468+0800:  9622.177: [GC  9638.861: [SoftReference, 0 refs, 0.0000050 secs]9638.861: [WeakReference, 3176 refs, 0.0005855 secs]9638.862: [FinalReference, 1005 refs, 0.0661215 secs]9638.928: [PhantomReference, 751 refs, 0.0058066 secs]9638.934: [JNI Weak Reference, 0.0040648 secs] [PSYoungGen: 147456K->19930K(154624K)] 207743K->80218K(312448K), 16.7716327 secs] [Times: user=2.31 sys=0.42, real=16.77 secs] 


对应的JVM参数 -XX:+PrintReferenceGC -XX:+PrintGCDetails -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC -Xloggc:gclog.log 

JVM(Java Virtual Machine)中,**Full GC**(Full Garbage Collection)是指对整个堆内存(包括年轻代、老年代)以及方法区(元空间或永久代)进行的垃圾回收。它通常耗时较长,会导致应用程序**“Stop-The-World”**(STW),即所有应用线程暂停,因此应尽量避免频繁发生。 --- ## 🔥 一、什么是 Full GC? - **Full GC** 是一次全局性的垃圾回收: - 回收 **年轻代(Young Generation)** - 回收 **老年代(Old Generation)** - 回收 **方法区 / 元空间(Metaspace)** - 它由不同的 GC 算法触发,常见的如 Serial Old、Parallel Old、CMS 的最终阶段、G1 的 Mixed GCFull GC 阶段等。 - 触发后会**暂停所有用户线程**,直到清理完成。 > ⚠️ 注意:虽然名字叫 "Full GC",但某些情况下可能只是 Major GC(仅老年代回收),但在日志中仍被标记为 Full GC。 --- ## 🧨 二、Full GC 触发的主要原因 以下是导致 JVM 执行 Full GC 的常见原因: --- ### 1. **老年代空间不足(最常见)** 当老年代没有足够空间容纳从年轻代晋升的对象时,会尝试触发一次 Full GC。 #### 常见场景: - 大量对象提前晋升到老年代(如 Survivor 区太小) - 长期存活对象过多 - 年轻代 GC 后有大批对象要进入老年代,但老年代放不下 ```text java.lang.OutOfMemoryError: Java heap space ``` 👉 此错误前往往伴随多次 Full GC。 --- ### 2. **System.gc() 被显式调用** Java 中调用 `System.gc()` 会建议 JVM 执行一次 Full GC(除非禁用)。 #### 示例代码: ```java System.gc(); // 显式请求 GC ``` #### 如何避免? - 使用 JVM 参数禁止响应此调用: ```bash -XX:+DisableExplicitGC ``` - 或者在生产环境中严格禁止使用 `System.gc()`。 --- ### 3. **元空间(Metaspace)空间不足** JDK 8+ 使用 **Metaspace** 替代了永久代(PermGen)。如果类加载过多(如动态生成类、反射、OSGi、热部署框架),可能导致 Metaspace 耗尽。 #### 触发条件: - Metaspace 占用接近阈值(`MaxMetaspaceSize`) - 默认情况下,Metaspace 扩容失败时会触发 Full GC 来尝试卸载类(通过 CMS 或 G1) #### 相关参数: ```bash -XX:MaxMetaspaceSize=256m # 设置上限 -XX:MetaspaceSize=64m # 初始大小,达到后触发首次 GC ``` --- ### 4. **CMS GC 时出现 “Concurrent Mode Failure”** 如果你使用的是 **CMS 收集器**(老年代),它本意是并发回收,不暂停应用线程。但如果在 CMS 执行过程中,老年代空间不足以容纳新晋升的对象,就会退化为 **Serial Old** 进行 Full GC。 #### 日志特征: ```text [GC (CMS Initial Mark) ... [Full GC (Concurrent Mode Failure) ``` #### 解决方案: - 增加老年代大小 - 提前启动 CMS:调整 `-XX:CMSInitiatingOccupancyFraction` - 减少对象晋升速度(优化对象生命周期) --- ### 5. **晋升失败(Promotion Failed)** 在 Minor GC 期间,如果 Survivor 区无法容纳存活对象,这些对象将尝试晋升到老年代。但如果老年代**连续空间不足**,即使总空间够,也可能因无法分配大对象而失败。 #### 原因: - 老年代碎片化严重(尤其使用 CMS 时) - 没有足够的连续空间存放大对象 此时 JVM 会触发 Full GC 尝试整理老年代空间。 --- ### 6. **堆内存分配担保失败(Allocation Failure)** JVM 在每次 Minor GC 前会检查老年代最大可用连续空间是否大于“年轻代所有对象总和”或“历次晋升平均大小”。如果不够,就提前触发 Full GC(称为“分配担保”机制)。 这属于一种预防性策略。 --- ### 7. **G1 GC 中的 Full GC(应极力避免)** G1 正常情况下不会执行 Full GC,但在以下情况会发生: - **并发标记阶段来不及完成**,而堆空间又快满了 - **晋升时老年代空间不足**,且 G1 无法及时完成 Mixed GC 清理 - **巨型对象(Humongous Object)** 分配失败 此时 G1 会退回到单线程 Full GC(通常是 **SATB + Mark-Sweep Compact**),非常慢。 #### 日志示例: ```text [Full GC (Ergonomics) ...] ``` #### 预防措施: - 调整 `-XX:InitiatingHeapOccupancyPercent` - 增加堆大小 - 控制 Humongous 对象分配 --- ### 8. **Direct Memory 导致的间接 Full GC(NIO 场景)** 使用 `ByteBuffer.allocateDirect()` 分配堆外内存时,不受堆限制,但其释放依赖于 `Cleaner`(基于 PhantomReference),而 Cleaner 的触发依赖于 **老年代 GC**。 如果长时间不触发老年代 GC,则 Direct Memory 无法及时释放 → 内存泄漏 → 最终 OOM → 可能触发紧急 Full GC。 --- ## 📊 三、如何监控和诊断 Full GC? ### 1. 开启 GC 日志(强烈推荐) ```bash -Xloggc:gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=10M ``` ### 2. 查看关键字段 在 GC 日志中查找: ```text [Full GC [GC (Allocation Failure) --> 可能引发 Full GC [GC (Metadata GC Threshold) --> Metaspace 触发 [Full GC (Ergonomics) --> JVM 自动决策 [Full GC (System.gc()) --> 显式调用 [Full GC (Concurrent Mode Failure) --> CMS 失败 ``` ### 3. 使用工具分析 - **jstat**: 实时查看 GC 统计 ```bash jstat -gcutil <pid> 1s ``` - **VisualVM / JConsole / Prometheus + Grafana** - **GCViewer / GCEasy.io**:上传日志文件自动分析 --- ## ✅ 四、如何减少或避免 Full GC? | 问题 | 解决方案 | |------|----------| | 老年代空间不足 | 增大 `-Xmx` 和 `-Xms`;优化对象生命周期 | | 对象过早晋升 | 增大 Survivor 区:`-XX:SurvivorRatio` | | Metaspace 不足 | 增大 `-XX:MaxMetaspaceSize`;检查类加载泄漏 | | System.gc() 调用 | 添加 `-XX:+DisableExplicitGC` | | CMS 并发失败 | 提前触发 CMS:`-XX:CMSInitiatingOccupancyFraction=70` | | G1 Full GC | 调整 IHOP:`-XX:InitiatingHeapOccupancyPercent=45` | | 堆外内存泄漏 | 限制 Direct Memory:`-XX:MaxDirectMemorySize`;手动 `clean()` | --- ## 🧪 五、实战排查步骤 1. **确认是否真的发生了 Full GC** 查看 GC 日志或使用 `jstat -gc <pid>`。 2. **定位触发原因** 看日志中的括号说明:`(System.gc())`, `(Metadata GC Threshold)` 等。 3. **分析内存分布** 使用 MAT(Memory Analyzer Tool)分析 heap dump,看是否有内存泄漏。 4. **优化代码与配置** - 减少大对象创建 - 避免长生命周期持有无用对象 - 合理设置堆参数 5. **选择合适的 GC 器** - 吞吐优先:Parallel GC - 低延迟:G1、ZGC、Shenandoah ---
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值