Skip to content

Tag: Blob

Reading Git Blob objects in Java

In order to read the Git Blob objects, we need to understand that git uses zlib to compress the stored objects. We can use the Java zip utils to decompress the Git blob.

Code Snippets

Below are some methods of decompressing the Blob file. If you did some research online, you will find many examples showing Method 1.

But I recommend using Method 2 as it is does not assume the size of the decompressed file. This method is also used in Apache Commons library.

Make sure the binary file is not corrupted or you might encounter java.util.zip.ZipException

Method 1

The byte array size of result can be arbitrarily set with a specific size. But you will have problems if the decompressed file size is uncertain.

String file = "<PATH to Git blob>";
byte[] fileBytes = Files.readAllBytes(Paths.get(file));
Inflater decompresser = new Inflater();
decompresser.setInput(fileBytes, 0, fileBytes.length);

byte[] result = new byte[1024]; // Size need to be set

int resultLength = 0;
resultLength = decompresser.inflate(result);

decompresser.end();

Method 2: Checks if end of compressed data is reached

This method reads the content of the file and output to ByteArrayOutputStream object.

String file = "<PATH to Git blob>";
byte[] fileBytes = Files.readAllBytes(Paths.get(file));

Inflater inflater = new Inflater();
inflater.setInput(fileBytes);

ByteArrayOutputStream outputStream = new ByteArrayOutputStream(fileBytes.length);
byte[] buffer = new byte[1024];

while (!inflater.finished()) {
  int count = inflater.inflate(buffer);
  outputStream.write(buffer, 0, count);
}

outputStream.close();

byte[] result = outputStream.toByteArray();

References