Most of this article is drawn from the following two posts:
http://blog.xebia.com/2007/10/04/leaking-memory-in-java/
http://www.iteye.com/topic/626801
First, an extreme example showing how String's substring method can cause an OutOfMemoryError:
- import java.util.ArrayList;
- public class TestGC {
- // 100,000 chars, roughly 200 KB per instance
- private String large = new String(new char[100000]);
- public String getSubString() {
- // Only the first 2 chars are needed
- return this.large.substring(0,2);
- }
- public static void main(String[] args) {
- ArrayList<String> subStrings = new ArrayList<String>();
- for (int i = 0; i < 1000000; i++) {
- // testGC itself becomes unreachable after each iteration
- TestGC testGC = new TestGC();
- subStrings.add(testGC.getSubString());
- }
- }
- }
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
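Why does this happen? The source of the substring method in the JDK's String class reveals the cause (the version shown has the offset and count fields, which String carried in JDK 6 and earlier):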
- public String substring(int beginIndex, int endIndex) {
- if (beginIndex < 0) {
- throw new StringIndexOutOfBoundsException(beginIndex);
- }
- if (endIndex > count) {
- throw new StringIndexOutOfBoundsException(endIndex);
- }
- if (beginIndex > endIndex) {
- throw new StringIndexOutOfBoundsException(endIndex - beginIndex);
- }
- return ((beginIndex == 0) && (endIndex == count)) ? this :
- new String(offset + beginIndex, endIndex - beginIndex, value);
- }
- // Package private constructor which shares value array for speed.
- String(int offset, int count, char value[]) {
- this.value = value;
- this.offset = offset;
- this.count = count;
- }
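To avoid a memory copy and improve performance, this method does not create a new char array. It reuses the original String's char[] and identifies the new string's content through a different offset and length. In other words, the small String returned by substring still points to the large String's char[], so that array cannot be garbage-collected while the small String is reachable. In the example, each 2-character substring keeps a 100,000-char array (about 200 KB) alive, and a million of them would need roughly 200 GB, hence the OutOfMemoryError.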
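The sharing described above can be reproduced on any JDK with a small model. The following is only a sketch: SharedSlice, slice and retainedChars are hypothetical names invented for illustration, not part of the JDK, and the class imitates the value/offset/count layout from the source shown earlier rather than the real String.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the String layout shown above: a view over a char[].
final class SharedSlice {
    private final char[] value;  // backing array, possibly shared with other views
    private final int offset;    // index of the first character of this view
    private final int count;     // number of characters in this view

    SharedSlice(char[] value) {
        this(value, 0, value.length);
    }

    private SharedSlice(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // Mirrors the sharing substring: no copy, only a new offset and count.
    SharedSlice slice(int beginIndex, int endIndex) {
        return new SharedSlice(value, offset + beginIndex, endIndex - beginIndex);
    }

    int length() {
        return count;
    }

    // Number of chars this view keeps reachable, regardless of its own length.
    int retainedChars() {
        return value.length;
    }

    public static void main(String[] args) {
        List<SharedSlice> slices = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            SharedSlice large = new SharedSlice(new char[100000]);
            // 'large' becomes unreachable after this iteration, its array does not
            slices.add(large.slice(0, 2));
        }
        long retained = 0;
        for (SharedSlice s : slices) {
            retained += s.retainedChars();
        }
        System.out.println("visible chars:  " + slices.size() * 2);
        System.out.println("retained chars: " + retained);
    }
}
```

With ten slices the model reports 20 visible characters but 1,000,000 retained ones, which is the same ratio that exhausts the heap in the TestGC example.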
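With the cause identified, change getSubString in the code above so that it wraps the result in new String(...). The String(String original) constructor, whose source follows the modified method, copies just the needed range when the backing array is larger than the string: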
- public String getSubString() {
- return new String(this.large.substring(0,2));
- }
- public String(String original) {
- int size = original.count;
- char[] originalValue = original.value;
- char[] v;
- if (originalValue.length > size) {
- // The array representing the String is bigger than the new
- // String itself. Perhaps this constructor is being called
- // in order to trim the baggage, so make a copy of the array.
- int off = original.offset;
- v = Arrays.copyOfRange(originalValue, off, off+size);
- } else {
- // The array representing the String is the same
- // size as the String, so no point in making a copy.
- v = originalValue;
- }
- this.offset = 0;
- this.count = size;
- this.value = v;
- }
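The copy-or-share decision in that constructor can be isolated in a few lines. This is a minimal sketch working on plain char arrays; TrimCopy and trim are hypothetical names, and the method only mirrors the branch structure of the constructor above.

```java
import java.util.Arrays;

class TrimCopy {
    // Returns an array holding the count chars that start at offset.
    // Copies only when the backing array is larger than the view,
    // which is the same condition the constructor above checks.
    static char[] trim(char[] value, int offset, int count) {
        if (value.length > count) {
            return Arrays.copyOfRange(value, offset, offset + count);
        }
        return value; // already exactly the right size, so share it
    }

    public static void main(String[] args) {
        char[] big = new char[100000];
        big[0] = 'a';
        big[1] = 'b';
        char[] small = trim(big, 0, 2);
        System.out.println(small.length);               // 2
        System.out.println(small != big);               // true: a fresh 2-char array
        char[] exact = {'o', 'k'};
        System.out.println(trim(exact, 0, 2) == exact); // true: no copy needed
    }
}
```

Once the small copy exists, nothing references the 100,000-char array any more, so the garbage collector is free to reclaim it.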
Besides substring, String's split method has the same problem. Its source is:
- public String[] split(String regex, int limit) {
- return Pattern.compile(regex).split(this, limit);
- }
As shown, String's split delegates to Pattern's split method, whose source is:
- public String[] split(CharSequence input, int limit) {
- int index = 0;
- boolean matchLimited = limit > 0;
- ArrayList<String> matchList = new ArrayList<String>();
- Matcher m = matcher(input);
- // Add segments before each match found
- while(m.find()) {
- if (!matchLimited || matchList.size() < limit - 1) {
- String match = input.subSequence(index, m.start()).toString();
- matchList.add(match);
- index = m.end();
- } else if (matchList.size() == limit - 1) { // last one
- String match = input.subSequence(index,
- input.length()).toString();
- matchList.add(match);
- index = m.end();
- }
- }
- // If no match was found, return this
- if (index == 0)
- return new String[] {input.toString()};
- // Add remaining segment
- if (!matchLimited || matchList.size() < limit)
- matchList.add(input.subSequence(index, input.length()).toString());
- // Construct result
- int resultSize = matchList.size();
- if (limit == 0)
- while (resultSize > 0 && matchList.get(resultSize-1).equals(""))
- resultSize--;
- String[] result = new String[resultSize];
- return matchList.subList(0, resultSize).toArray(result);
- }
When the input is a String, input.subSequence resolves to String's subSequence method, whose source is:
- public CharSequence subSequence(int beginIndex, int endIndex) {
- return this.substring(beginIndex, endIndex);
- }
The code shows that the call ends up in String's substring method, so the same problem applies: the small strings produced by split use the original String's char[] directly.
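The new String(...) fix used for substring can be applied to split results in the same way. The helper below is a sketch under that assumption; DetachedSplit and its split method are hypothetical names, and on JDK versions where substring already copies, the extra wrapping is merely redundant.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class DetachedSplit {
    // Splits input on regex, then wraps every piece in new String(...)
    // so that no piece keeps a reference to the input's backing array.
    static String[] split(String input, String regex) {
        String[] parts = Pattern.compile(regex).split(input);
        for (int i = 0; i < parts.length; i++) {
            parts[i] = new String(parts[i]);
        }
        return parts;
    }

    public static void main(String[] args) {
        String[] parts = split("a,b,c", ",");
        System.out.println(Arrays.toString(parts)); // [a, b, c]
    }
}
```

This costs one extra small allocation per piece, which is usually worthwhile only when the pieces outlive a much larger input string.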
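The substring methods of StringBuilder and StringBuffer, by contrast, do not have this problem. Their source, together with the String(char[], int, int) constructor they call, is: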
- public String substring(int start, int end) {
- if (start < 0)
- throw new StringIndexOutOfBoundsException(start);
- if (end > count)
- throw new StringIndexOutOfBoundsException(end);
- if (start > end)
- throw new StringIndexOutOfBoundsException(end - start);
- return new String(value, start, end - start);
- }
- public String(char value[], int offset, int count) {
- if (offset < 0) {
- throw new StringIndexOutOfBoundsException(offset);
- }
- if (count < 0) {
- throw new StringIndexOutOfBoundsException(count);
- }
- // Note: offset or count might be near -1>>>1.
- if (offset > value.length - count) {
- throw new StringIndexOutOfBoundsException(offset + count);
- }
- this.offset = 0;
- this.count = count;
- this.value = Arrays.copyOfRange(value, offset, offset+count);
- }
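Because that constructor always copies the requested range, a String obtained from StringBuilder.substring is independent of the builder's buffer. A short check, as a sketch (BuilderSubstring and headAfterMutation are hypothetical names):

```java
class BuilderSubstring {
    // Takes a substring, then mutates the builder; the returned String is unaffected.
    static String headAfterMutation(String text, int end) {
        StringBuilder sb = new StringBuilder(text);
        String head = sb.substring(0, end); // copies the chars out of the builder
        sb.setCharAt(0, 'J');               // changes only the builder's own buffer
        return head;
    }

    public static void main(String[] args) {
        System.out.println(headAfterMutation("hello world", 5)); // hello
    }
}
```

The copy is required here in any case: a builder's buffer is mutable, so sharing it with an immutable String would let later edits leak into the String.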