Saturday 16 February 2013

Automated Large Object Swapping Part II


 

Here I carry on from Part I to discuss fast serialisation.

Note - this is a cross post from Nerds-Central

Writing large objects to disk takes a lot of time, so we might think that efficient serialisation is not required. However, this does not seem to be the case in practice. Java does not allow us to treat a block of memory which stores an array of floats (or any other non-byte type) as though it were a block of bytes. Java's read and write routines use byte arrays, which results in copying to and from byte arrays. The approach I took was to move the contents of Sonic Field's float arrays into a byte array as quickly as possible and then set the references to the float array to null, making it eligible for garbage collection.

Now, it is possible (I did try it) to serialise the data from the large objects to the swap file 'in place' without the intermediate byte array. However, methods like RandomAccessFile.writeFloat and DataOutputStream.writeFloat are so slow (even ObjectOutputStream was) that on a really fast drive the CPU would become the limiting factor. This is really a consequence of drives becoming so very fast these days. The SSD in my MacBook Pro Retina can easily write 300 megabytes per second, and that is not even that fast by some modern standards. Calling DataOutputStream.writeFloat (with a buffering system between it and the drive) takes around a third of a CPU core to write out at 80 megabytes per second to my external USB3 drive. So, if I were to use the SSD in my machine for swap (which I don't, as that is just SSD abuse), the CPU would be the limiting factor.

We need much faster serialisation than Java provides by default!

What is the fastest way to move several megabytes of float data into a byte array? The fastest way I have found is to use the (slightly naughty) sun.misc.Unsafe class. Here is an example piece of code:

setUpUnsafe();
payload = allocateBytes(getReference().getLength() * 4);
unsafe.copyMemory(
    getReference().getDataInternalOnly(), 
    floatArrayOffset, 
    payload, 
    byteArrayOffset,
    payload.length
);

What copyMemory is doing is copying bytes - raw memory with no type information - from one place to another. The first argument is the float array; the second is the position within the in-memory layout of a float[] object where the float data starts. The third is a byte array and the fourth is the data offset within a byte[] object. The final argument is the number of bytes to copy. The Unsafe class code itself works out all the tricky stuff like memory pinning, so from the Java point of view the raw data in the float array just turns up in the byte array very quickly indeed.
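As a self-contained illustration of the copy described above, here is a minimal sketch that moves a float[] into a byte[] and back with copyMemory. The class name RawFloatCopy and its method names are hypothetical and are not part of Sonic Field; the sketch also assumes a JVM that still exposes sun.misc.Unsafe (recent JDK releases have deprecated its memory-access methods, so expect warnings there).

```java
import java.lang.reflect.Field;
import java.util.Arrays;
import sun.misc.Unsafe;

/** Hypothetical helper (not Sonic Field code): raw float[] <-> byte[] copies. */
final class RawFloatCopy {
    private static final Unsafe UNSAFE = loadUnsafe();
    // Offset of element 0 from the start of the array object, per array type.
    private static final long FLOAT_BASE = UNSAFE.arrayBaseOffset(float[].class);
    private static final long BYTE_BASE = UNSAFE.arrayBaseOffset(byte[].class);

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null); // static field, so no instance is needed
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Copies the raw bits of every float into a new byte array (native byte order). */
    static byte[] toBytes(float[] src) {
        byte[] dst = new byte[src.length * Float.BYTES];
        UNSAFE.copyMemory(src, FLOAT_BASE, dst, BYTE_BASE, dst.length);
        return dst;
    }

    /** Reverses toBytes; the byte count must be a multiple of four. */
    static float[] toFloats(byte[] src) {
        if (src.length % Float.BYTES != 0) {
            throw new IllegalArgumentException("length is not a multiple of 4");
        }
        float[] dst = new float[src.length / Float.BYTES];
        UNSAFE.copyMemory(src, BYTE_BASE, dst, FLOAT_BASE, src.length);
        return dst;
    }

    public static void main(String[] args) {
        float[] samples = {0.0f, 0.5f, -1.0f};
        byte[] raw = toBytes(samples);
        float[] back = toFloats(raw);
        System.out.println(raw.length + " bytes, round trip equal: "
                + Arrays.equals(samples, back));
    }
}
```

Computing the offsets once in static initialisers is a design choice of this sketch, not necessarily how Sonic Field does it.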

It is worth noting that this is nothing like using byte buffers to move the information over. There is no attempt to change endianness or any other bit twiddling; this is just raw memory copying. Do not expect this sort of trick to work if the resulting byte array is going to be serialised and read into a different architecture (x86 to Itanium for example).
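To make the endianness point concrete, the following sketch encodes the same float under both byte orders using java.nio.ByteBuffer. This is purely an illustration and is not how Sonic Field copies data; ByteOrderDemo and its methods are hypothetical names. A raw memory copy produces whichever of these two layouts the host CPU uses natively, which is why the bytes cannot safely be read back on a machine with the other byte order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

/** Hypothetical demo (not Sonic Field code): one float, two byte layouts. */
final class ByteOrderDemo {
    /** Encodes one float using an explicit byte order. */
    static byte[] encode(float value, ByteOrder order) {
        return ByteBuffer.allocate(Float.BYTES).order(order).putFloat(value).array();
    }

    /** Decodes four bytes as a float using an explicit byte order. */
    static float decode(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getFloat();
    }

    public static void main(String[] args) {
        // 1.0f has the bit pattern 0x3F800000.
        byte[] big = encode(1.0f, ByteOrder.BIG_ENDIAN);       // 3F 80 00 00
        byte[] little = encode(1.0f, ByteOrder.LITTLE_ENDIAN); // 00 00 80 3F
        System.out.println("big endian:    " + Arrays.toString(big));
        System.out.println("little endian: " + Arrays.toString(little));
        System.out.println("this machine:  " + ByteOrder.nativeOrder());
        // Reading with the wrong order silently yields a different number.
        System.out.println("misread value: " + decode(little, ByteOrder.BIG_ENDIAN));
    }
}
```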

In Sonic Field, the byte array thus loaded with float data is stored in a RandomAccessFile:

ra.seek(position);
ra.writeInt(payload.length);
ra.writeLong(uniqueId);
ra.write(payload);
ra.writeLong(uniqueId);

Performing this call on my external drive at around 80 megabytes per second uses about 4 percent of one core. Scaling that up (80 / 0.04 = 2000), it would take around 2 gigabytes per second, or 16 gigabits per second, to saturate one core, which is more like it!

Reading the data back in is just the reverse.


ra.seek(position);
int payloadLen = ra.readInt();
long unid = ra.readLong();
if (unid != this.uniqueId) throw new IOException(/* message goes here*/);
payload = allocateBytes(payloadLen);
if (ra.read(payload) != payloadLen)
{
    throw new IOException(/* Message Goes Here */);
}
unid = ra.readLong();
if (unid != this.uniqueId) throw new IOException(/* message goes here*/);
ret = SFData.build(payload.length / 4);
unsafe.copyMemory(
    payload,
    byteArrayOffset,
    ret.getDataInternalOnly(),
    floatArrayOffset,
    payload.length
);

Note that the length of the data is recorded as an integer before the actual data block. I record the unique ID of the object the data came from both before and after the serialised data. This is a safeguard against corruption or algorithm failure elsewhere in the memory manager.
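The record layout just described (length, ID, payload, ID) can be tried in isolation with the sketch below. It is not the Sonic Field implementation: FramedRecord and its methods are hypothetical names, the payload is any byte array, and the sketch swaps in readFully where the code above checks the return value of read, since readFully keeps reading until the buffer is full or the file ends.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Arrays;

/** Hypothetical sketch (not Sonic Field code): an ID-guarded, length-prefixed record. */
final class FramedRecord {
    /** Writes [int length][long id][payload][long id] starting at position. */
    static void write(RandomAccessFile ra, long position, long id, byte[] payload)
            throws IOException {
        ra.seek(position);
        ra.writeInt(payload.length);
        ra.writeLong(id);
        ra.write(payload);
        ra.writeLong(id);
    }

    /** Reads a record back, failing if either copy of the ID does not match. */
    static byte[] read(RandomAccessFile ra, long position, long expectedId)
            throws IOException {
        ra.seek(position);
        int length = ra.readInt();
        if (length < 0) {
            throw new IOException("negative length at " + position);
        }
        if (ra.readLong() != expectedId) {
            throw new IOException("leading ID mismatch at " + position);
        }
        byte[] payload = new byte[length];
        ra.readFully(payload); // throws EOFException if the file ends early
        if (ra.readLong() != expectedId) {
            throw new IOException("trailing ID mismatch at " + position);
        }
        return payload;
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("framed-record", ".bin");
        file.deleteOnExit();
        try (RandomAccessFile ra = new RandomAccessFile(file, "rw")) {
            byte[] data = {1, 2, 3, 4};
            write(ra, 0L, 99L, data);
            System.out.println("round trip equal: " + Arrays.equals(data, read(ra, 0L, 99L)));
        }
    }
}
```

The trailing ID is what catches a payload that was only partly written or was overwritten by a neighbouring record; the leading ID alone would not notice either case.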

Setting Up Unsafe
Unsafe is not that easy to set up, especially if you do not also set up a security manager. Here is the code I use:
java.lang.reflect.Field theUnsafeInstance = Unsafe.class.getDeclaredField("theUnsafe"); //$NON-NLS-1$
theUnsafeInstance.setAccessible(true);
Unsafe unsafe = (Unsafe) theUnsafeInstance.get(Unsafe.class);

Also we need to get those offsets within classes:
// Lazy - eventually consistent initialization
    private static void setUpUnsafe()
    {
        if (byteArrayOffset == 0)
        {
            byteArrayOffset = unsafe.arrayBaseOffset(byte[].class);
        }
        if (longArrayOffset == 0)
        {
            longArrayOffset = unsafe.arrayBaseOffset(long[].class);
        }
        if (floatArrayOffset == 0)
        {
            floatArrayOffset = unsafe.arrayBaseOffset(float[].class);
        }
    }

Note that all code from Sonic Field (including all the code on this page) is AGPL3.0 licensed.
