记JVM堆外内存泄漏Bug查找

发布时间:2019-11-19 发布网站:脚本宝典
脚本宝典收集整理的这篇文章主要介绍了记JVM堆外内存泄漏Bug查找脚本宝典觉得挺不错的,现在分享给大家,也给大家做个参考。

内存泄漏Bug现场

一个做BI数据展示的服务在一个晚上重启了5次,由于是通过k8s容器编排,服务挂了以后会自动重启,所以服务还能继续提供服务。

第一时间先上日志系统查看错误日志,发现如下报错:

java.lang.OutOfMemoryError    ERROR    java.lang.OutOfMemoryError: unable to create new native thread  at java.lang.Thread.start0(Native Method)  at java.lang.Thread.start(Thread.java:717)  at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase.start(CloseableHttpAsyncClientBase.java:83)  at org.elasticsearch.client.RestClientBuilder.build(RestClientBuilder.java:190)  at com.xiaohongshu.fls.jbds.handler.JbdsImpl.get_realtime_trend_data_from_es(JbdsImpl.java:637)  at com.xiaohongshu.fls.jbds.handler.JbdsImpl.get_seller_trend(JbdsImpl.java:316)  at com.xiaohongshu.fls.jbds.handler.JbdsImpl$$FastClassBySpringCGLIB$$bd2466f7.invoke(<generated>)  at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)  at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:738)  at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)  at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:85)  at com.xiaohongshu.fls.jbds.aspect.RpcMethodExceptionAspect.requestControllerLog(RpcMethodExceptionAspect.java:33)  at sun.reflect.GeneratedMethodAccessor54.invoke(Unknown Source)  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)  at java.lang.reflect.Method.invoke(Method.java:498)  at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:629)  at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:618)  at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:70)  at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)  at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)  at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)  at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:673)  at com.xiaohongshu.fls.jbds.handler.JbdsImpl$$EnhancerBySpringCGLIB$$d374b954.get_seller_trend(<generated>)  at sun.reflect.GeneratedMethodAccessor81.invoke(Unknown Source)  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)  at java.lang.reflect.Method.invoke(Method.java:498)  at com.xiaohongshu.infra.rpc.core.ThriftServerBaseProxy.call(ThriftServerBaseProxy.java:119)  at com.xiaohongshu.infra.rpc.core.ThriftServerCGlibProxy.intercept(ThriftServerCGlibProxy.java:27)  at com.xiaohongshu.fls.jbds.handler.JbdsImpl$$EnhancerByCGLIB$$19473b8f.get_seller_trend(<generated>)  at com.xiaohongshu.fls.rpc.jbds.JBusinessDataService$Processor$get_seller_trend.getResult(JBusinessDataService.java:1450)  at com.xiaohongshu.fls.rpc.jbds.JBusinessDataService$Processor$get_seller_trend.getResult(JBusinessDataService.java:1435)  at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)  at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)  at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:518)  at org.apache.thrift.server.Invocation.run(Invocation.java:18)  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)  at java.lang.Thread.run(Thread.java:748) 

发现是OOM的错误,并且有unable to create new native thread的错误信息,笔者的第一直觉是创建了大量线程从而导致堆外内存空间不足。

随即去CAT监控系统上去查看线程的活跃情况,如下图:

记JVM堆外内存泄漏Bug查找

发现服务的活跃线程数达到了1W左右,这显然有问题。服务本身是一个Thrift Server,开起的worker线程数为200,再加上一些其他IO线程,总的线程数不会超过300(服务内自己没有显示创建线程或者使用线程池)。

查找线索

笔者登录到线上的docker实例上,通过jmap -histo:live pid命令,查看JVM中存活的对象,输出如下内容:

num     #instances         #bytes  class name ----------------------------------------------    1:         19303      656995672  [B    2:        116670       13942144  [C    3:         15645        5298648  [I    4:         10098        3796848  java.lang.Thread    5:         75817        3639216  java.util.HashMap    6:        113889        2733336  java.lang.String    7:         27521        2421848  java.lang.reflect.Method    8:         68662        2197184  java.util.concurrent.ConcurrentHashMap$Node    9:         10041        1767032  [J   10:           372        1525568  [Ljava.nio.ByteBuffer;   11:         36807        1472280  java.util.LinkedHashMap$Entry   12:         16504        1352488  [Ljava.util.HashMap$Node;   13:         40687        1301984  java.lang.ThreadLocal$ThreadLocalMap$Entry   14:         11495        1277344  java.lang.Class   15:         29903        1196120  java.util.WeakHashMap$Entry   16:         70474        1127584  java.lang.Object   17:         15346         918592  [Ljava.lang.Object;   18:         51556         824896  java.util.HashSet   19:         25528         816896  java.util.HashMap$Node   20:          9989         812176  [Ljava.lang.ThreadLocal$ThreadLocalMap$Entry;   21:          9661         792032  [Ljava.util.WeakHashMap$Entry;   22:          9504         760320  org.apache.http.impl.nio.reactor.BaseIOReactor   23:         30551         733224  java.util.concurrent.ConcurrentLinkedQueue$Node   24:          9914         713808  sun.nio.ch.EPollArrayWrapper   25:          9914         713808  sun.nio.ch.EPollSelectorImpl   26:         29682         712368  java.util.concurrent.ConcurrentLinkedQueue   27:          3525         694672  [Ljava.util.concurrent.ConcurrentHashMap$Node;   28:         15664         501248  java.lang.ref.WeakReference   29:          8364         468384  java.util.LinkedHashMap   30:          9656         463488  java.util.WeakHashMap   31:         20064         436536  [Ljava.lang.Class;   32:          9267         370680  java.security.AccessControlContext   33:          4856         349632  java.lang.reflect.Field   34:         10278         328896  java.lang.ref.ReferenceQueue   35:          9914         317248  sun.nio.ch.AllocatedNativeObject   36:          4955         317120  java.util.concurrent.ConcurrentHashMap   37:          7597         303880  java.lang.ref.SoftReference   38:          9245         293568  [Ljava.security.ProtectionDomain;   39:         11484         275616  java.util.ArrayList   40:          3330         239760  org.springframework.core.annotation.AnnotationAttributes   41:          9989         239736  java.lang.ThreadLocal$ThreadLocalMap   42:          9951         238824  java.util.Collections$SynchronizedSet   43:          9922         238128  java.util.BitSet   44:          9504         228096  org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker   45:          5460         207944  [Ljava.lang.String;   46:          2459         196720  java.lang.reflect.Constructor   47:         12103         193648  java.util.HashMap$KeySet   48:          8003         192072  java.beans.MethodRef   49:          3657         175536  org.aspectj.weaver.reflect.ShadowMatchImpl   50:         10339         165424  java.util.concurrent.atomic.AtomicBoolean   51:         10281         164496  java.lang.ref.ReferenceQueue$Lock   52:         10203         163248  java.util.Collections$UnmodifiableSet   53:          2913         163128  java.beans.MethodDescriptor   54:          2547         161536  [Ljava.lang.reflect.Method;   55:          9914         158624  java.nio.channels.spi.AbstractSelector$1   56:          9913         158608  sun.nio.ch.Util$3   57:          5481         131544  org.springframework.core.MethodClassKey   58:          3707         118624  java.util.LinkedList   59:          3637         116384  org.aspectj.weaver.patterns.Exposedstate   60:          2047         114632  java.lang.Class$ReflectionData   61:          1155         110880  org.springframework.beans.GenericTypeAwarePropertyDescriptor   62:           399         109984  [Ljava.lang.Thread;   63:          4387         105288  sun.reflect.generics.tree.SimpleClassTypeSignature   64:          1388          99936  java.beans.PropertyDescriptor   65:          4087          90560  [Ljava.lang.reflect.Type;   66:          5361          85776  org.springframework.core.annotation.AnnotationUtils$DefaultValueHolder   67:          1296          82944  io.netty.buffer.PoolSubpage   68:          4387          82520  [Lsun.reflect.generics.tree.Typeargument;   69:           674          75488  org.springframework.boot.loader.jar.JarEntry   70:          1848          73920  java.util.TreeMap$Entry   71:          3061          73464  sun.reflect.annotation.AnnotationInvocationHandler   72:          2737          65688  java.util.LinkedList$Node   73:          3989          63824  sun.reflect.generics.tree.ClassTypeSignature   74:           858          61776  org.apache.ibatis.mapping.ResultMapping   75:           832          53248  org.springframework.core.MethodParameter   76:            78          51168  io.netty.util.internal.shaded.org.jctools.queues.MpscArrayQueue   77:          2120          50880  java.util.Collections$UnmodifiableRandomAccessList   78:          3165          50640  java.util.LinkedHashMap$LinkedKeySet   79:          2419          49728  [Lsun.reflect.generics.tree.FieldTypeSignature;

记JVM堆外内存泄漏Bug查找

标红的这个类org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker非常值得怀疑,有9504个实例,从类名看是一个实现了Reactor模式的http客户端,了解Reactor模式的同学知道,其包含了一个accept线程处理连接事件,NIO bounding的线程处理readwrite事件,和Mworker线程处理业务逻辑。

进一步查找

接下来需要查找哪里使用了这个http客户端,在build.gradle的文件里看到依赖了EShttp client

group: 'org.elasticsearch.client', name: 'elasticsearch-rest-client', version: '6.3.2'

通过全局搜索,发现了如下的代码:

private String get_realtime_data_from_es (String seller_id) throws IOException{     Date current_date = new Date( );     SimpleDateFormat sDF = new SimpleDateFormat ("yyyyMMdd");     String date=sdf.format(current_date);     RestClient restClient = RestClient.builder(             new HttpHost("xxxxxxxxx", 9200, "http"),             new HttpHost("xxxxxxxxx", 9201, "http")).build();     org.elasticsearch.client.Response response = restClient.performRequest("GET", "/seller/"+date+"/"+seller_id);     restClient.close();     return EntityUtils.toString(response.getEntity()); }

笔者发现这里调用ESRestClient的非常可疑,居然是每次调用创建一个对象,而不是使用单例模式。

使用过RestTemplate或者HttpClient的同学都知道,Http客户端一般都是通过单例的方式注入到Spring容器中,客户端都是线程安全的,多线程情况下不会出现串包的情况。(具体线程安全的实现机制,可参考笔者之前的博客:《Http请求连接池-HttpClient的AbstractConnPool源码分析》,地址:https://segmentfault.com/a/11...

码探究

直接进到RestClientBuilder类的build()方法中一探究竟,代码如下:

public RestClient build() {     if (failureListener == null) {         failureListener = new RestClient.FailureListener();     }     CloseableHttpAsyncClient httpClient = AccessController.doPrivileged(new PrivilegedAction<CloseableHttpAsyncClient>() {         @Override         public CloseableHttpAsyncClient run() {             return createHttpClient();         }     });     RestClient restClient = new RestClient(httpClient, maxRetryTimeout, defaultHeaders, hosts, pathPrefix, failureListener);     httpClient.start();     return restClient; }

build方法是实际创建RestClient的地方,设置了超时时间、host等参数。再点进start方法,他是抽象类CloseableHttpAsyncClient的一个抽象方法,具体实现在实现类CloseableHttpAsyncClientBase中,如下:

@Override public void start() {     if (this.status.COMpareAndSet(Status.INACTIVE, Status.ACTIVE)) {         if (this.reactorThread != null) {             this.reactorThread.start();         }     } }

其调用了成员变量reactorThreadstart方法,而成员变量reactorThread 是一个Thread类,它在构造方法中初始化,如下:

public CloseableHttpAsyncClientBase(         final NHttpClientConnectionManager connmgr,         final ThreadFactory threadFactory,         final NHttpClientEventHandler handler) {     super();     this.connmgr = connmgr;     if (threadFactory != null &amp;& handler != null) {         this.reactorThread = threadFactory.newThread(new Runnable() {              @Override             public void run() {                 try {                     final IOEventDispatch ioEventDispatch = new InternalIODispatch(handler);                     connmgr.execute(ioEventDispatch);                 } catch (final Exception ex) {                     log.error("I/O reactor terminated abnormally", ex);                 } finally {                     status.set(Status.STOPPED);                 }             }          });     } else {         this.reactorThread = null;     }     this.status = new AtomicReference<Status>(Status.INACTIVE); } 

可见,每创建一个CloseableHttpAsyncClient对象,就会创建一个reactorThread线程,而connmgr.execute(ioEventDispatch)是一个永久for循环执行的方法,所以线程的run方法不会主动退出,即,reactorThread线程不会销毁。

问题解决

找到bug所在后,结局方案简单,只需将ESRestClient通过单例的方式注入到服务中,即可解决堆外内存泄漏的问题。

总结

查找bug时,需要从多个方面去定位,通过尽可能多的现场信息去定量分析,监控、线上机器、源码的查看都需要,深入下去,会有不少收获。

原文链接

https://segmentfault.com/a/11...

脚本宝典总结

以上是脚本宝典为你收集整理的记JVM堆外内存泄漏Bug查找全部内容,希望文章能够帮你解决记JVM堆外内存泄漏Bug查找所遇到的问题。

如果觉得脚本宝典网站内容还不错,欢迎将脚本宝典推荐好友。

本图文内容来源于网友网络收集整理提供,作为学习参考使用,版权属于原作者。
如您有任何意见或建议可联系处理。小编QQ:384754419,请注明来意。