generational system? Ever been curious about what's in the permanent
generation? Are objects ever promoted into it? Ever promoted out?
Well, you're not alone. Here are some of the answers.
Java objects are instantiations of Java classes. Our JVM has an internal
representation of those Java objects and those internal representations
are stored in the heap (in the young generation or the tenured generation).
Our JVM also has an internal representation of the Java classes and those
are stored in the permanent generation. That relationship is shown in
the figure below.
The internal representation of a
Java object and an internal representation of a Java class are very
similar. From this point on let me just call them Java objects and Java
classes and you'll understand that I'm referring to their internal
representation. The Java objects and Java classes
are similar to the extent that during a garbage collection both
are viewed just as objects and are collected in exactly the same way. So
why store the Java classes in a separate permanent generation? Why not just
store the Java classes in the heap along with the Java objects?
Well, there is a philosophical reason and a technical reason. The
philosophical reason is that the classes are part of our JVM implementation
and we should not fill up the Java heap with our data structures. The
application writer has a hard enough time understanding the amount
of live data the application needs and we shouldn't confuse the issue
with the JVM's needs.
The technical reason comes in parts.
Firstly the origins of the permanent generation predate my joining the team
so I had to do some code archaeology to get the story straight (thanks
Steffen for the history lesson).
Originally there was no permanent generation. Objects and classes
were just stored together.
Back in those days classes were mostly static. Custom class loaders were
not widely used and so it was observed that
not much class unloading occurred. As a performance optimization
the permanent generation was created and classes were put into it.
The performance improvement was significant back then. With the amount
of class unloading that occurs with some applications, it's not clear that
it's always a win today.
It might be a nice simplification to not have a permanent generation, but
the recent implementation of the parallel collector for the tenured
generation (aka parallel old collector)
has made a separate permanent generation again desirable. The issue
with the parallel old collector has to do with the order in which
objects and classes are moved. If you're interested, I describe this
at the end.
So the Java classes are stored in the permanent generation. What all
does that entail? Besides the basic fields of a Java class there are:
- Methods of a class (including the bytecodes)
- Names of the classes (in the form of an object that points to a string also in the permanent generation)
- Constant pool information (data read from the class file; see chapter 4 of the JVM
  specification for all the details)
- Object arrays and type arrays associated with a class (e.g., an object array
  containing references to methods)
- Internal objects created by the JVM (java/lang/Object or java/lang/Exception,
  for instance)
- Information used for optimization by the compilers (JITs)
That's it for the most part. There are a few other bits of information that
end up in the permanent generation but nothing of consequence in terms of size. All these are allocated in the permanent generation and stay
in the permanent generation. So now you know.
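If you want to see how much space these areas take in a running JVM, the standard java.lang.management API reports each memory pool. The sketch below just lists whatever pools the JVM exposes. The class and method names (MemoryPoolLister, describePools) are mine, and pool names depend on the JVM version and the collector in use, so a pool for the permanent generation is not guaranteed to show up under any particular name.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class MemoryPoolLister {

    // Returns one descriptive line per memory pool reported by the running JVM.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String kind = pool.getType() == MemoryType.HEAP ? "heap" : "non-heap";
            // getUsage() returns null when the pool is no longer valid.
            MemoryUsage usage = pool.getUsage();
            long usedBytes = (usage == null) ? -1 : usage.getUsed();
            lines.add(pool.getName() + " (" + kind + "): " + usedBytes + " bytes used");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```

The heap versus non-heap label comes straight from the JVM, so it is a quick way to see which side of the line a given pool falls on.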
This last part is really, really extra credit.
During a collection the garbage collector needs
to have a description of a Java object (i.e., how big is it and what
does it contain). Say I have an object X and X has a class K.
I get to X in the collection and I need K to tell me what X
looks like. Where's K? Has it been moved already?
With a permanent generation, during a collection we move the
permanent generation first so we know that all the K's are in
their new location by the time we're looking at any X's.
How do the classes in the permanent generation get collected while the
classes are moving? Classes also have classes that describe their content.
To distinguish these classes from those classes we spell the former klasses.
The classes of klasses we spell klassKlasses. Yes, conversations around the
office can be confusing. Klasses are instantiations of klassKlasses so
the klassKlass KZ of klass Z has already been allocated before Z can be
allocated.
Garbage collections in the permanent generation
visit objects in allocation order and that allocation order is
always maintained during the collection. That is, if A is allocated
before B then A always
comes before B in the generation. Therefore if a Z is
being moved it's always the case that KZ has already been moved.
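The allocation-order argument can be sketched with a toy model. This is not the JVM's code: the Obj type, its fields, and the compact method are all hypothetical, and I assume the first klassKlass has no describing object of its own (null here) and that a live object's klass is also live. The loop visits objects in allocation order, slides the live ones down, and checks that each object's klass was moved before it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of a serial, order-preserving compaction. All names are hypothetical.
class SlidingCompactionSketch {

    static final class Obj {
        final String name;
        final Obj klass;     // object describing this one's layout; null for the toy root
        final boolean live;
        int newIndex = -1;   // -1 means "not moved yet"

        Obj(String name, Obj klass, boolean live) {
            this.name = name;
            this.klass = klass;
            this.live = live;
        }
    }

    // Visits the generation in allocation order and slides live objects toward
    // the start. Returns the names of the survivors in their new order.
    static List<String> compact(List<Obj> generation) {
        List<String> compacted = new ArrayList<>();
        for (Obj o : generation) {
            if (!o.live) {
                continue; // dead objects are simply skipped over
            }
            // The klass was allocated earlier, so it must have been visited earlier.
            if (o.klass != null && o.klass.newIndex < 0) {
                throw new IllegalStateException(o.name + " reached before its klass");
            }
            o.newIndex = compacted.size();
            compacted.add(o.name);
        }
        return compacted;
    }

    public static void main(String[] args) {
        Obj kz = new Obj("KZ", null, true);          // klassKlass of Z
        Obj dead = new Obj("deadKlass", kz, false);  // unreachable, will be dropped
        Obj z = new Obj("Z", kz, true);              // klass described by KZ
        System.out.println(compact(Arrays.asList(kz, dead, z))); // prints [KZ, Z]
    }
}
```

Because survivors keep their relative order, the check inside the loop never fires as long as every klass is allocated before the objects it describes.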
And why not use the same knowledge about allocation order to
eliminate the permanent generation even in the parallel old
collector case?
The parallel old collector does maintain allocation order of
objects, but objects are moved in parallel. When the collection
gets to X, we no longer know if K has been moved. It might be
in its new location (which is known) or it might be in its
old location (which is also known) or part of it might have
been moved (but not all of it). It is possible to keep track
of where K is exactly, but it would complicate the collector
and the extra work of keeping track of K might make it a performance
loser. So we take advantage of the fact that classes are kept in the permanent
generation by collecting the permanent generation before collecting
the tenured generation. And the permanent generation is currently collected serially.
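The ordering of the two phases can be sketched the same way. This is a toy model under my own assumptions, not the collector's code: Klass, Instance, and collect are hypothetical names, "moving" is reduced to setting a flag, and a fixed thread pool stands in for the parallel GC threads. The point is only that once the serial phase has finished, every parallel worker can rely on K being in its new location.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy model: serial permanent-generation pass, then a parallel tenured pass.
class TwoPhaseCollectionSketch {

    static final class Klass {
        final String name;
        boolean moved; // written only during the serial phase

        Klass(String name) {
            this.name = name;
        }
    }

    static final class Instance {
        final Klass klass;

        Instance(Klass klass) {
            this.klass = klass;
        }
    }

    // Returns how many instances found their klass already moved.
    static int collect(List<Klass> permGen, List<Instance> tenured, int workers) {
        // Phase 1: move every klass serially, before any instance is examined.
        for (Klass k : permGen) {
            k.moved = true;
        }

        // Phase 2: examine instances in parallel. Tasks are submitted after
        // phase 1 completes, so each worker sees the final state of every klass.
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (Instance x : tenured) {
                results.add(pool.submit(() -> x.klass.moved));
            }
            int alreadyMoved = 0;
            for (Future<Boolean> r : results) {
                if (r.get()) {
                    alreadyMoved++;
                }
            }
            return alreadyMoved;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

If the two phases were interleaved instead, a worker could read a klass that is only partly moved, which is exactly the bookkeeping problem described above.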