As we saw in the last post, each Java class file starts with the so-called “magic” section. This has historical reasons and goes back to the very early days of the Java language. You may take a look here or here for more details about James Gosling’s decision.
We will continue here with our little ClassFileReader utility. In Step 0 in the last post we displayed all the bytes of a Java class file. Now we will not only print them, but start to process the class file as we read it.
Read & Validate the Java Class File Format’s Magic Number
Define the API
One of the hardest parts of writing code is defining an API that makes sense both for the one who writes the code and for the one who uses it. This is especially true if you want to accept files or similar resources. Many libraries provide multiple method overloads, e.g. for a String path, URI, InputStream, File etc.
For the first version of the ClassFileReader I will simply use InputStream because that is what we need to read bytes. So the ClassFile’s interface will look as follows:
public class ClassFile {
    // signature only; the body follows in the implementation step
    public static ClassFile read(InputStream in);
}
This is what makes sense and works well in my experience. Of course I could also pass the InputStream to the constructor, but I like to use factory methods and hide the constructor.
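To make the idea of a hidden constructor plus overloads concrete, here is a minimal sketch. The class name ClassFileSketch, the read(Path) overload and the firstByte() accessor are my own placeholders for illustration, not part of the ClassFileReader code, and the "parsing" is reduced to reading a single byte.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch, not the ClassFile from this series.
final class ClassFileSketch {
    private final int firstByte;

    // Hidden constructor: callers must go through a factory method.
    private ClassFileSketch(int firstByte) {
        this.firstByte = firstByte;
    }

    // The one factory method that does the real work.
    static ClassFileSketch read(InputStream in) throws IOException {
        // Placeholder "parsing": read() returns 0..255, or -1 on an empty stream.
        return new ClassFileSketch(in.read());
    }

    // Convenience overload: open the file, then delegate to read(InputStream).
    static ClassFileSketch read(Path path) throws IOException {
        try (InputStream in = Files.newInputStream(path)) {
            return read(in);
        }
    }

    int firstByte() {
        return firstByte;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};
        ClassFileSketch cf = ClassFileSketch.read(new ByteArrayInputStream(data));
        System.out.println(Integer.toHexString(cf.firstByte())); // prints "ca"
    }
}
```

Because every overload delegates to read(InputStream), further overloads can be added later without touching the constructor or the parsing code.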
Write the Tests
Of course we want to test the ClassFileReader. It makes sense to write tests before the implementation, if possible.
import static org.assertj.core.api.Assertions.assertThat;
import static org.junit.Assert.fail;

import java.io.IOException;
import java.io.InputStream;

import org.junit.Test;

public class TestMagic {

    @Test
    public void test() throws IOException {
        // GIVEN
        InputStream is = TestMagic.class.getResourceAsStream("/EmptyClass.class");
        // WHEN
        ClassFile classFile = ClassFile.read(is);
        // THEN
        assertThat(classFile).isNotNull();
    }

    @Test(expected = IllegalArgumentException.class)
    public void testNotAClass() throws IOException {
        // GIVEN
        InputStream is = TestMagic.class.getResourceAsStream("/notAClass.txt");
        // WHEN
        ClassFile classFile = ClassFile.read(is);
        // THEN
        fail("We must not reach this code.");
    }
}
So we have two tests. The first simply passes when given an InputStream of a class file. The second expects an IllegalArgumentException, because its InputStream does not come from a valid class file; it passes only if that exception is thrown.
Maybe you wonder where the assert statement assertThat(classFile).isNotNull() comes from. It is from the nice AssertJ library, which I prefer over the JUnit and Hamcrest matchers. The fluent AssertJ API is so intuitive and comfortable, it just feels right.
Implement It
The last step is the implementation. Although we accept an InputStream as parameter, internally we will use the DataInputStream. This is simply recommended by the JVM Specification:
“This chapter defines its own set of data types representing class file data: The types u1, u2, and u4 represent an unsigned one-, two-, or four-byte quantity, respectively. In the Java SE platform, these types may be read by methods such as readUnsignedByte, readUnsignedShort, and readInt of the interface java.io.DataInput.”
We will use the following methods to read data from the class file:
// read fixed amount of bytes
int u1 = input.readUnsignedByte();
int u2 = input.readUnsignedShort();
int u4 = input.readInt();

// read variable amount of bytes
int length = ...;
byte[] bytes = new byte[length];
input.readFully(bytes);
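One detail is easy to miss: readInt() returns a signed Java int, while u4 is an unsigned quantity. The following sketch reads the three fixed-size types from an in-memory byte array and widens the u4 value with Integer.toUnsignedLong. The class DataInputDemo and its method readUnits are names I made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical demo class; not part of the ClassFileReader sources.
final class DataInputDemo {

    // Reads one u1, one u2 and one u4 (in that order) and returns them as longs.
    static long[] readUnits(byte[] data) throws IOException {
        try (DataInputStream input = new DataInputStream(new ByteArrayInputStream(data))) {
            int u1 = input.readUnsignedByte();      // 0..255
            int u2 = input.readUnsignedShort();     // 0..65535, big-endian
            int raw = input.readInt();              // signed 32-bit, big-endian
            long u4 = Integer.toUnsignedLong(raw);  // reinterpret as unsigned
            return new long[] {u1, u2, u4};
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {
            (byte) 0xFF,                                     // u1
            0x01, 0x02,                                      // u2
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE  // u4
        };
        long[] units = readUnits(data);
        System.out.println(units[0]);    // prints 255
        System.out.println(units[1]);    // prints 258
        System.out.println(units[2]);    // prints 3405691582
        System.out.println(0xCAFEBABE);  // prints -889275714 (the same bits as a signed int)
    }
}
```

For a plain equality check against a constant, the signed value is good enough, because the int literal 0xCAFEBABE has exactly the same bit pattern. The unsigned conversion only matters when the number itself is used, for example as a length.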
As the magic number is of type u4, we read it with readInt().
private static void readMagic(DataInput in) throws IOException {
    int magic = in.readInt();
    if (magic != 0xCAFEBABE) {
        throw new IllegalArgumentException("Magic is expected to be 0xCAFEBABE. "
                + "Argument is not a Java Class File!");
    }
}
If the magic number is not equal to 0xCAFEBABE, we will throw an IllegalArgumentException, as we expect it in the test above.
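How the factory method and the magic check could fit together is sketched below. This is my own wiring under assumptions, not necessarily the code in the repo: the class name MagicCheck is a placeholder, and translating an EOFException into an IllegalArgumentException is an extra decision of mine. Without it, an input shorter than four bytes would make readInt() throw an EOFException instead of the exception the test expects.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for ClassFile; only the magic number is processed.
final class MagicCheck {

    private MagicCheck() {
        // hidden; use read(InputStream)
    }

    static MagicCheck read(InputStream in) throws IOException {
        DataInput data = new DataInputStream(in);
        try {
            readMagic(data);
        } catch (EOFException e) {
            // Assumption: treat a too-short input like any other non-class input.
            throw new IllegalArgumentException(
                    "Input ends before the magic number. Argument is not a Java Class File!", e);
        }
        return new MagicCheck();
    }

    private static void readMagic(DataInput in) throws IOException {
        int magic = in.readInt();
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("Magic is expected to be 0xCAFEBABE. "
                    + "Argument is not a Java Class File!");
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] valid = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};
        System.out.println(read(new ByteArrayInputStream(valid)) != null); // prints true

        try {
            read(new ByteArrayInputStream("plain text".getBytes()));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Using a ByteArrayInputStream here keeps the example independent of test resources such as EmptyClass.class.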
You can find the complete source code for this blog post tagged on the GitHub repo.
Summary
- I favor factory methods over constructors. The biggest advantage is that I can easily change and add methods without touching the constructor(s).
- I will try to write tests before implementation where possible. As I won’t understand many things upfront (I’m learning the Class File Format while developing the ClassFileReader), I will try to at least write tests for the parts I really understand.
- The class file will be consumed byte by byte using the methods of the DataInput interface.