The Java Class File Constant Pool

Everything that is constant in a class file is stored in the constant pool. This includes not only string and numeric constants, but everything that does not change at runtime, e.g. variable and method names, method signatures, class names etc.

The information contained in the constant pool can be used to better understand the Java compiler or to do some static analysis.

In the fourth part of the series about the Java Class File Format I take a look at the constant pool.

Structure Of the Constant Pool

u2 constant_pool_count;
cp_info constant_pool[constant_pool_count-1];

Since the constant pool is of variable size, the first two bytes (u2) hold constant_pool_count. This value is not the number of entries but one more than the highest valid index, because index 0 is never used. That means if the count was e.g. 4, the valid indices would be 1, 2 and 3, and there would be no entry at index 0. This is a little bit strange and I’m not sure if this makes sense in any way.

After the count bytes all the entries follow, each starting with a one-byte tag that denotes the type of the entry.

cp_info {
 u1 tag;
 u1 info[];
}
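
To make the count arithmetic concrete, here is a minimal sketch that reads constant_pool_count from the raw bytes of a class file. It assumes the file starts with the 4-byte magic number followed by the two u2 version fields; the class and method names are made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class ConstantPoolHeader {

    /** Returns constant_pool_count, the u2 that follows magic, minor and major version. */
    public static int readConstantPoolCount(byte[] classFile) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classFile))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            in.readUnsignedShort(); // minor_version
            in.readUnsignedShort(); // major_version
            return in.readUnsignedShort(); // constant_pool_count
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Valid constant pool indices run from 1 to constant_pool_count - 1. */
    public static int highestValidIndex(byte[] classFile) {
        return readConstantPoolCount(classFile) - 1;
    }
}
```

For a class file with constant_pool_count 4, highestValidIndex returns 3.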

Below is a list of the tag values from the JVM spec.

Constant Type                 Value
CONSTANT_Class                7
CONSTANT_Fieldref             9
CONSTANT_Methodref            10
CONSTANT_InterfaceMethodref   11
CONSTANT_String               8
CONSTANT_Integer              3
CONSTANT_Float                4
CONSTANT_Long                 5
CONSTANT_Double               6
CONSTANT_NameAndType          12
CONSTANT_Utf8                 1
CONSTANT_MethodHandle         15
CONSTANT_MethodType           16
CONSTANT_InvokeDynamic        18

Types Of Constant Pool Entries

The list above contains 14 different types of Constant Pool entries (newer class file versions add a few more). Some of them are self-contained, some of them point to other constants, and one type points to the attributes table, which we will handle later in this series of blog posts.

CONSTANT_Utf8

The UTF-8 constant has a length and an array of bytes which contains a string encoded in (modified) UTF-8. This may be a string literal such as “Hello, World” but also everything else which is represented as a string, such as variable, class, field and method names, method signatures etc.
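
As a small sketch, a single UTF-8 entry can be decoded with DataInputStream.readUTF, which expects exactly this layout: a two-byte length followed by bytes in modified UTF-8. The class and method names below are only chosen for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class Utf8Entry {

    /** Decodes one CONSTANT_Utf8 entry: u1 tag (1), u2 length, then the encoded bytes. */
    public static String decode(byte[] entry) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(entry))) {
            int tag = in.readUnsignedByte();
            if (tag != 1) {
                throw new IllegalArgumentException("not a CONSTANT_Utf8 entry, tag=" + tag);
            }
            // readUTF reads the u2 length and then that many bytes of modified UTF-8
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Passing the bytes 1, 0, 5, 'H', 'e', 'l', 'l', 'o' yields the string Hello.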

CONSTANT_Integer, CONSTANT_Float

The integer & float constants contain a 32-bit representation of an int or float value. It is important to know that all other numeric types smaller than int, such as short, byte, char and also boolean, are handled as int values in Java bytecode.

So the following code

public static final boolean CONSTANT_BOOL = false;

leads to an integer constant pool entry with value 0.

CONSTANT_Long, CONSTANT_Double

The long & double constants contain a 64-bit (2 x 32-bit) representation of a long or double value.

Each constant of these types takes two indices in the constant pool. That means it creates a constant at a specific index and the following index is left unused. This was a design decision even the creators of the Java Class File Format regret. From the spec:

In retrospect, making 8-byte constants take two constant pool entries was a poor choice.

 – JVM Specification 
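
The two-slot rule matters as soon as you walk the pool yourself. The following sketch (class and method names are hypothetical) records the tag found at every index, using the entry sizes from the JVM spec for the 14 tags listed above, and skips one extra index after each Long or Double.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class ConstantPoolTags {

    /**
     * Returns an array indexed by constant pool index. Each element holds the
     * tag found at that index, or 0 for unused slots (index 0 and the slot
     * following a Long or Double).
     */
    public static int[] readTags(byte[] classFile) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classFile))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            in.readUnsignedShort(); // minor_version
            in.readUnsignedShort(); // major_version
            int count = in.readUnsignedShort();
            int[] tags = new int[count];
            for (int i = 1; i < count; i++) {
                int tag = in.readUnsignedByte();
                tags[i] = tag;
                switch (tag) {
                    case 1: // Utf8: u2 length + bytes
                        skip(in, in.readUnsignedShort());
                        break;
                    case 3: case 4: // Integer, Float
                        skip(in, 4);
                        break;
                    case 5: case 6: // Long, Double: 8 bytes and two indices
                        skip(in, 8);
                        i++;
                        break;
                    case 7: case 8: case 16: // Class, String, MethodType
                        skip(in, 2);
                        break;
                    case 9: case 10: case 11: case 12: case 18: // refs, NameAndType, InvokeDynamic
                        skip(in, 4);
                        break;
                    case 15: // MethodHandle: u1 kind + u2 index
                        skip(in, 3);
                        break;
                    default:
                        throw new IllegalArgumentException("unknown tag " + tag + " at index " + i);
                }
            }
            return tags;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void skip(DataInputStream in, int n) throws IOException {
        in.readFully(new byte[n]); // fails with EOFException on truncated input
    }
}
```

For a pool holding an Integer, a Long and a Utf8 entry, the result is the tags 3, 5, 0, 1 at indices 1 to 4: the Long sits at index 2 and index 3 stays unused.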

CONSTANT_Class

The class constant entry contains the index of a UTF-8 constant entry holding a fully qualified class name in internal form, such as java/lang/Object. This type of entry itself may be referenced e.g. by a Methodref constant.

CONSTANT_String

The String constant entry is used for constant java.lang.String values such as string literals. It does not hold the string itself but references a UTF-8 constant entry.

CONSTANT_Fieldref

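The Fieldref constant is used whenever code accesses a field via getfield, putfield, getstatic or putstatic. Reads of compile-time constants (static final fields with a constant initializer) are inlined by the compiler and therefore do not need a Fieldref. It references a Class and a NameAndType constant.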

CONSTANT_Methodref

This constant is used when invoking a method of a class via invokevirtual, invokespecial or invokestatic. Like Fieldref, it references a Class and a NameAndType constant.

CONSTANT_InterfaceMethodref

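This constant is used when invoking a method declared in an interface, either via invokeinterface or, since Java 8 (default and static interface methods), via invokespecial and invokestatic. It has the same layout as Methodref.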

CONSTANT_NameAndType

The NameAndType constant is used for methods and fields. It contains two references to UTF-8 constants: one holding the name and one holding the descriptor (the field type or method signature).

CONSTANT_MethodHandle

A MethodHandle consists of a kind and a reference to an item of the Constant Pool. The type of the referenced Constant Pool item depends on the kind of method handle.
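
The kind is a number from 1 to 9. The JDK uses the same numbering in java.lang.invoke.MethodHandleInfo, so a kind read from a class file can be turned into a readable name as sketched below. The describe helper is only an illustration and not part of any API.

```java
import java.lang.invoke.MethodHandleInfo;

public class MethodHandleKinds {

    /** Formats the two fields of a CONSTANT_MethodHandle entry for display. */
    public static String describe(int referenceKind, int referenceIndex) {
        // referenceKindToString maps 1..9 to names such as "getField" or "invokeStatic"
        return MethodHandleInfo.referenceKindToString(referenceKind) + " -> #" + referenceIndex;
    }
}
```

Kinds 1 to 4 (getField, getStatic, putField, putStatic) point to a Fieldref, the remaining kinds point to a Methodref or InterfaceMethodref.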

CONSTANT_MethodType

The MethodType constant represents a method type. It references a UTF-8 constant holding a method descriptor such as (Ljava/lang/String;)V.

CONSTANT_InvokeDynamic

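The InvokeDynamic constant is used by the invokedynamic bytecode instruction to specify a bootstrap method. It holds an index into the BootstrapMethods attribute of the class and a reference to a NameAndType constant.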

Analyze the Class Using the Constant Pool

So what do the entries of the Constant Pool tell us?

On the one hand, we can learn some things about the Java compiler. For example, it optimizes string concatenation by replacing it with a StringBuilder, so we can find a java/lang/StringBuilder class entry in the Constant Pool although we have not used this class explicitly. (Since Java 9, javac compiles string concatenation to invokedynamic by default, so this entry may be missing in newer class files.)

Another optimization we can see in the constant pool is concatenation of string literals or constants at compile time.

final String TABLE = "person";
String query = "SELECT * " +
 "FROM " + TABLE + " p ";

Instead of creating three UTF-8 constants, the Java compiler creates one UTF-8 constant for the TABLE constant and one concatenated entry containing "SELECT * FROM person p ".
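
The folding can also be observed without opening the class file. Strings that are compile-time constants are interned, so the folded expression is the very same object as the equivalent literal. The sketch below relies on that rule; the class and method names are invented for the example.

```java
public class ConstantFolding {

    /** TABLE is a constant variable, so the whole expression is folded by the compiler. */
    public static boolean foldedWithFinal() {
        final String TABLE = "person";
        String query = "SELECT * " +
                "FROM " + TABLE + " p ";
        // both sides refer to the same interned constant
        return query == "SELECT * FROM person p ";
    }

    /** Without final, the concatenation happens at runtime and creates a new String. */
    public static boolean foldedWithoutFinal() {
        String table = "person";
        String query = "SELECT * " +
                "FROM " + table + " p ";
        return query == "SELECT * FROM person p ";
    }
}
```

foldedWithFinal returns true, while foldedWithoutFinal returns false because the variable is no longer a compile-time constant.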

On the other hand, we can already do some static analysis of the class we are reading. We can filter for all Methodref and InterfaceMethodref constants to see which methods are called, or filter for all Class constants to see all classes related to the class.

So although we cannot say anything about the type hierarchy (that means which classes or interfaces our class extends or implements), we can see a kind of “network” of the types our class depends on.
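
The filtering described above could look roughly like the following sketch. It parses the constant pool into a few parallel arrays and then resolves Class and Methodref entries to readable names. The class name ConstantPoolRefs and its methods are made up for this post, and no validation beyond the magic number is attempted.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class ConstantPoolRefs {
    private int[] tags;     // tag per index, 0 for unused slots
    private String[] utf8;  // decoded Utf8 entries
    private int[] first;    // first u2 index of an entry, if any
    private int[] second;   // second u2 index of an entry, if any

    public ConstantPoolRefs(byte[] classFile) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classFile))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            in.readInt(); // minor_version and major_version
            int count = in.readUnsignedShort();
            tags = new int[count];
            utf8 = new String[count];
            first = new int[count];
            second = new int[count];
            for (int i = 1; i < count; i++) {
                int tag = tags[i] = in.readUnsignedByte();
                switch (tag) {
                    case 1: utf8[i] = in.readUTF(); break;
                    case 3: case 4: in.readInt(); break;
                    case 5: case 6: in.readLong(); i++; break; // takes two indices
                    case 7: case 8: case 16:
                        first[i] = in.readUnsignedShort();
                        break;
                    case 9: case 10: case 11: case 12: case 18:
                        first[i] = in.readUnsignedShort();
                        second[i] = in.readUnsignedShort();
                        break;
                    case 15: in.readUnsignedByte(); in.readUnsignedShort(); break;
                    default: throw new IllegalArgumentException("unknown tag " + tag);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Names of all classes referenced through CONSTANT_Class entries. */
    public List<String> classNames() {
        List<String> result = new ArrayList<>();
        for (int i = 1; i < tags.length; i++) {
            if (tags[i] == 7) {
                result.add(utf8[first[i]]);
            }
        }
        return result;
    }

    /** All Methodref and InterfaceMethodref entries as "owner.name" plus descriptor. */
    public List<String> calledMethods() {
        List<String> result = new ArrayList<>();
        for (int i = 1; i < tags.length; i++) {
            if (tags[i] == 10 || tags[i] == 11) {
                String owner = utf8[first[first[i]]]; // ref -> Class -> Utf8
                int nameAndType = second[i];          // ref -> NameAndType
                result.add(owner + "." + utf8[first[nameAndType]] + utf8[second[nameAndType]]);
            }
        }
        return result;
    }
}
```

To try it on a real class, the bytes can be loaded with java.nio.file.Files.readAllBytes and passed to the constructor.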

Author: Thomas Lemmé

I am a Java Developer currently working at one of the top APM vendors as a Java agent developer.
