To study the usage of thread-safe libraries in the DaCapo benchmark suite, I want to instrument the applications.
Soot, although good at analyzing and instrumenting programs, seems to miss certain classes for some applications, so its output may not be correct.
Also, each Soot analysis runs for almost an hour and uses up to 14 GB of memory, which is annoying. Even more annoying, you must rerun the whole analysis even if you find only a tiny mistake in the code, e.g., a missing null check. That costs another hour.
To get rid of the incorrect and inefficient Soot, I want to use ASM to instrument the DaCapo applications.
Basically, what I am going to do is:
1) Identify the thread-safe classes.
2) Instrument such classes so that their instances contain extra fields holding our tracking information.
3) When a method of such a class is invoked, first forward the call to an instrumented method that checks whether the invoker is shared between threads.
The tasks involve adding fields, adding methods and invocations, and detecting thread-safe classes.
Before carrying out the tasks, we first introduce ASM.
Section{ASM primer}
The Bytecode Outline plugin is very helpful: it shows you the ASM code that produces a given piece of Java code.
Note that Eclipse 3.5 has a version incompatibility with the plugin; use the newest version from the author's webpage.
Section{how to define a class}
Dynamic use of the produced class: defineClass(name, byteArray, 0, byteArray.length). This method is protected in ClassLoader, so extend ClassLoader with a subclass that calls it and exposes it through a public method.
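The subclass loader described above can be sketched as follows. To keep the sketch free of dependencies, the bytes are hand-assembled for an empty interface by the helper emptyInterface; in the setting of these notes they would instead come from cw.toByteArray(). The names ByteArrayLoaderDemo, ByteArrayClassLoader, define, emptyInterface and demo.Generated are all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class ByteArrayLoaderDemo {

    /** Subclass that exposes the protected defineClass through a public method. */
    public static class ByteArrayClassLoader extends ClassLoader {
        public Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    /** Hand-assembles the class file of "public interface <name> {}" (stand-in for cw.toByteArray()). */
    public static byte[] emptyInterface(String name) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(0xCAFEBABE);                                // magic number
            out.writeShort(0);                                       // minor version
            out.writeShort(52);                                      // major version (Java 8)
            out.writeShort(5);                                       // constant pool count (entries + 1)
            out.writeByte(1); out.writeUTF(name.replace('.', '/'));  // #1 Utf8: internal class name
            out.writeByte(7); out.writeShort(1);                     // #2 Class -> #1
            out.writeByte(1); out.writeUTF("java/lang/Object");      // #3 Utf8
            out.writeByte(7); out.writeShort(3);                     // #4 Class -> #3
            out.writeShort(0x0601);                                  // public | interface | abstract
            out.writeShort(2);                                       // this_class  = #2
            out.writeShort(4);                                       // super_class = #4
            out.writeShort(0);                                       // no interfaces
            out.writeShort(0);                                       // no fields
            out.writeShort(0);                                       // no methods
            out.writeShort(0);                                       // no attributes
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);                       // not expected for an in-memory stream
        }
    }

    public static void main(String[] args) {
        byte[] bytes = emptyInterface("demo.Generated");
        Class<?> c = new ByteArrayClassLoader().define("demo.Generated", bytes);
        System.out.println(c.getName() + " isInterface=" + c.isInterface());
    }
}
```

The class is defined in the new loader only, so other loaders do not see it.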
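Copy a class to another file: suppose cr is a ClassReader and cw is a ClassWriter. cr.accept(cw, ...) copies the class.
Transform: cr.accept(cv, ...), where cv is a ClassVisitor wrapping cw. It intercepts the events produced by cr and forwards them to the inner cw, possibly transforming them on the way.
Optimization: pass cr to the constructor of cw (new ClassWriter(cr, flags)). The constant pool and the methods that are not transformed are then copied as is from the original bytes instead of being parsed and regenerated. Problems? Unused constant pool entries are not removed, so the result may be larger; this mode is meant for transformations that mostly add things.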
Note for dynamic use: the transformation defined above applies only to the classes loaded by that class loader. If you want it to apply to all classes, perform it in a ClassFileTransformer from the JDK's java.lang.instrument package.
Remove source info: override visitSource in cv and do not forward the call to cw.
Remove a field: override visitField and return null without forwarding the call. Note that a return value is required by the interface.
Add a field: make an extra call to visitField(). Note that you cannot make it in visit(), because visit() starts the class and may be followed by calls such as visitAnnotation, which are required to precede visitField(). You can, however, make it in visitEnd(), where all original fields have been seen and we can make sure no duplicate field is added.
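The three recipes above (drop source info, remove a field, add a field in visitEnd) can be combined into one adapter. The following is a minimal sketch, assuming the ASM core library (org.objectweb.asm, version 5 or later) is on the classpath. The names FieldEditDemo, FieldEditor, sampleClass, and the field names old, kept and trackingInfo are hypothetical; the input class is generated on the spot so the sketch does not depend on any existing class file.

```java
import java.util.ArrayList;
import java.util.List;
import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.FieldVisitor;
import org.objectweb.asm.Opcodes;

public class FieldEditDemo {

    /** The cv in the chain cr -> cv -> cw: drops source info, removes one field, adds one field. */
    public static class FieldEditor extends ClassVisitor {
        private final String removedField;
        private final String addedField;
        private final String addedDesc;
        private boolean alreadyPresent;

        public FieldEditor(ClassVisitor next, String removedField, String addedField, String addedDesc) {
            super(Opcodes.ASM5, next);
            this.removedField = removedField;
            this.addedField = addedField;
            this.addedDesc = addedDesc;
        }

        @Override
        public void visitSource(String source, String debug) {
            // Not forwarded, so the output class carries no source info.
        }

        @Override
        public FieldVisitor visitField(int access, String name, String desc, String signature, Object value) {
            if (name.equals(removedField)) {
                return null;               // not forwarded: the field disappears
            }
            if (name.equals(addedField)) {
                alreadyPresent = true;     // remember it so visitEnd does not add a duplicate
            }
            return super.visitField(access, name, desc, signature, value);
        }

        @Override
        public void visitEnd() {
            if (!alreadyPresent) {
                FieldVisitor fv = super.visitField(Opcodes.ACC_PRIVATE, addedField, addedDesc, null, null);
                if (fv != null) {
                    fv.visitEnd();
                }
            }
            super.visitEnd();
        }
    }

    /** Runs the chain cr -> FieldEditor -> cw and returns the transformed bytes. */
    public static byte[] transform(byte[] original) {
        ClassReader cr = new ClassReader(original);
        ClassWriter cw = new ClassWriter(0);
        cr.accept(new FieldEditor(cw, "old", "trackingInfo", "Ljava/lang/Object;"), 0);
        return cw.toByteArray();
    }

    /** Lists the field names of a class, in class-file order. */
    public static List<String> fieldNames(byte[] classBytes) {
        final List<String> names = new ArrayList<>();
        new ClassReader(classBytes).accept(new ClassVisitor(Opcodes.ASM5) {
            @Override
            public FieldVisitor visitField(int access, String name, String desc, String signature, Object value) {
                names.add(name);
                return null;
            }
        }, 0);
        return names;
    }

    /** Generates a tiny input class with the int fields "old" and "kept" and no methods. */
    public static byte[] sampleClass() {
        ClassWriter cw = new ClassWriter(0);
        cw.visit(Opcodes.V1_8, Opcodes.ACC_PUBLIC, "demo/Sample", null, "java/lang/Object", null);
        cw.visitSource("Sample.java", null);
        cw.visitField(Opcodes.ACC_PRIVATE, "old", "I", null, null).visitEnd();
        cw.visitField(Opcodes.ACC_PRIVATE, "kept", "I", null, null).visitEnd();
        cw.visitEnd();
        return cw.toByteArray();
    }
}
```

Calling FieldEditDemo.fieldNames(FieldEditDemo.transform(FieldEditDemo.sampleClass())) lists the fields after the transformation. Removing a field this way is only safe when no remaining method still accesses it.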
Section{method instrumentation}
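visitCode();                 // must be called first
visitInsn(); visitFrame();   // optional, repeated as needed
visitMaxs();                 // must be called after the last instruction
visitEnd();
The calls on two method visitors mv1 and mv2 can be interleaved; only the order of the calls on each individual visitor matters.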
Section{stack frame}
Stack map frames are stored in classes compiled for Java 6 or higher to speed up the verification process.
Map frame: [types of the locals] [types of the operand stack]
Special types are:
UNINITIALIZED(label) denotes the type of an object that has been allocated (by the NEW instruction at label) but whose constructor has not been called yet.
UNINITIALIZED_THIS is similar: it denotes the type of this inside a constructor, before the object has been initialized.
TOP denotes a slot that has not been assigned any value.
NULL denotes the type of null.
We do not need to store a map frame before every instruction; many of them can be inferred easily from the previous one.
We only store the map frames before the following instructions:
1) the target of any jump;
2) the instruction following an unconditional jump such as goto or athrow.
For such instructions, the frame cannot easily be inferred from the preceding instruction.
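Shortcut: frames are stored as a difference from the previous frame (the first one from the initial frame implied by the method signature). If the locals are identical and the operand stack is empty, we simply use F_SAME.
Note that one instruction can have multiple labels (placed consecutively),
but one label can designate only one instruction.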
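The two rules for where map frames are stored can be illustrated with a toy model. This is only a sketch under simplifying assumptions: instructions are plain objects holding an opcode name and an optional jump target index, not real bytecode, and exception handlers are ignored. The names FramePositions, Insn, framePositions and example are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class FramePositions {

    /** Toy instruction: an opcode name plus the index of its jump target, or -1 if it does not jump. */
    public static final class Insn {
        final String opcode;
        final int target;

        public Insn(String opcode) {
            this(opcode, -1);
        }

        public Insn(String opcode, int target) {
            this.opcode = opcode;
            this.target = target;
        }
    }

    /** Returns the indices of the instructions that need a stored map frame. */
    public static SortedSet<Integer> framePositions(List<Insn> code) {
        SortedSet<Integer> positions = new TreeSet<>();
        for (int i = 0; i < code.size(); i++) {
            Insn insn = code.get(i);
            if (insn.target >= 0) {
                positions.add(insn.target);          // rule 1: target of a jump
            }
            boolean unconditional = insn.opcode.equals("GOTO") || insn.opcode.equals("ATHROW");
            if (unconditional && i + 1 < code.size()) {
                positions.add(i + 1);                // rule 2: instruction after goto/athrow
            }
        }
        return positions;
    }

    /** A setter that throws on negative input: if (f >= 0) this.f = f; else throw ...; */
    public static List<Insn> example() {
        return Arrays.asList(
            new Insn("ILOAD"),            // 0
            new Insn("IFLT", 6),          // 1: jump to the throwing branch
            new Insn("ALOAD"),            // 2
            new Insn("ILOAD"),            // 3
            new Insn("PUTFIELD"),         // 4
            new Insn("GOTO", 10),         // 5: jump to the return
            new Insn("NEW"),              // 6
            new Insn("DUP"),              // 7
            new Insn("INVOKESPECIAL"),    // 8
            new Insn("ATHROW"),           // 9
            new Insn("RETURN"));          // 10
    }

    public static void main(String[] args) {
        System.out.println(framePositions(example()));
    }
}
```

In this example both rules select the same two instructions (indices 6 and 10), which shows why so few frames need to be stored.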
Section{add timer}
visitCode()
{
    mv.visitCode();
    // add code for startTimer = System.currentTimeMillis();
}
visitInsn(opcode)
{
    if ((opcode >= IRETURN && opcode <= RETURN) || opcode == ATHROW)
    {
        // add code for endTimer = System.currentTimeMillis();
    }
    mv.visitInsn(opcode); // forward the original instruction after the timer code
}
visitMaxs(maxStack, maxLocals)
{
    // figure out the maximal number of slots for the locals and the operand stack.
    mv.visitMaxs(maxStack + 4, maxLocals); // for example, maxStack needs 4 more slots to accommodate our new code.
}
Section{stateful transformation}
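It is actually very simple.
The visitor maintains a state machine: on seeing an instruction it may just move to a new state and remember the instruction, without emitting the code immediately. The pending code is emitted, replaced, or dropped later, once the following instructions show whether the pattern matches.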
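The idea can be sketched without ASM as a small state machine over opcode names. This is an illustrative model, not an ASM adapter: it removes every "ICONST_0 IADD" pair (adding zero is a no-op), and the names RemoveAddZero, State, visitInsn, visitEnd and transform are hypothetical. In a real adapter the same logic would presumably live in a MethodVisitor subclass, which would also have to flush the pending instruction on every other kind of visit call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RemoveAddZero {

    private enum State { SEEN_NOTHING, SEEN_ICONST_0 }

    private State state = State.SEEN_NOTHING;
    private final List<String> out = new ArrayList<>();

    /** Receives one instruction; may delay emitting it until the next one is known. */
    public void visitInsn(String opcode) {
        if (state == State.SEEN_ICONST_0) {
            if (opcode.equals("IADD")) {
                state = State.SEEN_NOTHING;   // pattern matched: drop both instructions
                return;
            }
            flush();                          // pattern broken: emit the delayed ICONST_0
        }
        if (opcode.equals("ICONST_0")) {
            state = State.SEEN_ICONST_0;      // change state only, emit nothing yet
            return;
        }
        out.add(opcode);
    }

    /** Must be called at the end so a trailing delayed instruction is not lost. */
    public void visitEnd() {
        flush();
    }

    private void flush() {
        if (state == State.SEEN_ICONST_0) {
            out.add("ICONST_0");
        }
        state = State.SEEN_NOTHING;
    }

    /** Convenience wrapper: feeds a whole instruction list through the state machine. */
    public static List<String> transform(List<String> code) {
        RemoveAddZero t = new RemoveAddZero();
        for (String opcode : code) {
            t.visitInsn(opcode);
        }
        t.visitEnd();
        return t.out;
    }

    public static void main(String[] args) {
        System.out.println(transform(Arrays.asList("ILOAD", "ICONST_0", "IADD", "ICONST_0", "IRETURN")));
    }
}
```

The second ICONST_0 in the usage example is kept because it is not followed by IADD; only the matched pair is dropped.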