Best way to count bytecodes executed for Java code
I've been trying to get timing data for various Java programs, and then do some regression analysis based on that timing data. Here are the two methods I used to collect the timing data:
- System.currentTimeMillis(): I used this initially, but I wanted the timing data to be consistent when the same program was run multiple times. In practice, the variation was huge, and when two instances of the same code were executed in parallel, the variation was even greater. So I gave up on it and started looking for profilers. (A minimal sketch of this approach is shown after this list.)
- -XX:+CountBytecodes flag in the HotSpot JVM: Since the variation in the timing data was huge, I thought about measuring the number of bytecodes executed when the code ran. This should have given a more stable count across runs of the same program. But this also showed variations: when the programs were executed sequentially, the variations were small, but when the same code was run in parallel, the variations were huge. I also tried running with -Xint (interpreter-only mode), but the results were similar.
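For reference, here is a minimal sketch of the first approach, assuming a hypothetical workload() method standing in for the measured program. A single wall-clock measurement like this is exactly what fluctuates between runs:

```java
public class TimingExample {
    // Hypothetical workload standing in for "the same program" from the question.
    static long workload() {
        long sum = 0;
        for (int i = 0; i < 10_000_000; i++) {
            sum += i % 7;
        }
        return sum;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        long result = workload();
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("result=" + result + ", elapsed=" + elapsed + " ms");
    }
}
```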
So I'm looking for a profiler that can count the number of bytecodes executed when the code runs. The count should remain constant (or very nearly so) across runs of the same program. Alternatively, is there some other metric I could base the timing data on that stays almost constant across multiple runs?
> I wanted the timing data to be consistent when the same program was run multiple times
This is not possible on a real machine unless it is designed as a hard real-time system, which your machine almost certainly is not.
> I'm looking for a profiler that can count the number of bytecodes executed when the code runs
Assuming you could do this, it wouldn't prove anything. For example, you wouldn't be able to see that ++ can be 90x cheaper than %, depending on the hardware you run it on. You wouldn't be able to see that a mispredicted branch on an if can be up to 100 times more expensive than a correctly speculated branch. You wouldn't be able to see that an access to an area of memory that triggers a TLB miss can be more expensive than copying 4 KB of data.
> if there is some other metric I could base the timing data on that stays almost constant across multiple runs
You can run the code many times and take the average. This smooths out outliers, gives you a good idea of the throughput, and can be a reproducible number for a given machine if the runs are long enough.
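As a rough illustration of that suggestion, here is a sketch using the same hypothetical workload() as above: the first runs are discarded as JIT warm-up, and the remaining runs are averaged.

```java
public class AveragedTiming {
    // Same hypothetical workload() as in the earlier sketch.
    static long workload() {
        long sum = 0;
        for (int i = 0; i < 10_000_000; i++) {
            sum += i % 7;
        }
        return sum;
    }

    public static void main(String[] args) {
        final int warmupRuns = 10;
        final int measuredRuns = 50;
        // Discard the first runs so JIT compilation does not distort the average.
        for (int i = 0; i < warmupRuns; i++) {
            workload();
        }
        long totalNanos = 0;
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            workload();
            totalNanos += System.nanoTime() - start;
        }
        System.out.println("average per run: "
                + (totalNanos / (double) measuredRuns) / 1_000_000.0 + " ms");
    }
}
```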