Java

1.7 and up.

WonderDB uses NIO file channels to write cache blocks back to disk, hence the dependency on Java 1.7.
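To make the write-back idea concrete, here is a minimal sketch of writing and reading one fixed-size block through a file channel. It is not WonderDB's actual code: the class and method names are hypothetical, the 2048-byte block size is borrowed from the configuration notes, and using FileChannel.open (a call added in Java 7) is an assumption about which file-channel API is involved.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative only: not WonderDB's actual write-back code.
class BlockFileSketch {
    // Block size assumed from the configuration notes (2048 bytes per block).
    static final int BLOCK_SIZE = 2048;

    // Writes one full block at the file offset that corresponds to blockIndex.
    static void writeBlock(FileChannel channel, long blockIndex, byte[] block) throws IOException {
        if (block.length != BLOCK_SIZE) {
            throw new IllegalArgumentException("block must be " + BLOCK_SIZE + " bytes");
        }
        ByteBuffer buffer = ByteBuffer.wrap(block);
        long position = blockIndex * BLOCK_SIZE;
        // A positional write may be partial, so loop until the buffer is drained.
        while (buffer.hasRemaining()) {
            position += channel.write(buffer, position);
        }
    }

    // Reads one full block back from the same offset.
    static byte[] readBlock(FileChannel channel, long blockIndex) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(BLOCK_SIZE);
        long position = blockIndex * BLOCK_SIZE;
        while (buffer.hasRemaining()) {
            int n = channel.read(buffer, position);
            if (n < 0) {
                throw new IOException("unexpected end of file");
            }
            position += n;
        }
        return buffer.array();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("cache-sketch", ".bin");
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            byte[] block = new byte[BLOCK_SIZE];
            block[0] = 42;
            writeBlock(channel, 3, block);
            channel.force(false); // ask the OS to flush file content to the device
            System.out.println(readBlock(channel, 3)[0]);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Positional reads and writes leave the channel's own position untouched, which is why a block index can be mapped straight to a file offset.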

POM

<dependency>
    <groupId>org.wonderdb</groupId>
    <artifactId>wonderdb</artifactId>
    <version>0.2</version>
</dependency>

GitHub location

https://github.com/dreamdbvilas/wonderdb

Design and How it works

The WonderDB cache can be configured as a Btree+ index or a hash index. Please refer to my blog on Btree+ and Hash Index to understand how these two data structures work in general and how they are used in WonderDB.

Configuration

The default server.properties is configured for development machines, with minimal cache settings. You will need to change the values based on the available physical memory.

Property Description
primaryCache.maxSize Default value 50000. The primary cache uses this many blocks of JVM memory. Each block is 2048 bytes, so the default uses 50000*2048 bytes, about 100MB of JVM heap. As a rule of thumb, give the primary cache 50% of the JVM heap.
primaryCache.highWatermark Default value 49090. The level at which the primary cache eviction thread starts evicting entries from the cache. It should be about 1% below primaryCache.maxSize.
primaryCache.lowWatermark Default value 49050. The level at which the primary cache eviction thread stops evicting. It should be about 2% below primaryCache.maxSize.
secondaryCache.maxSize Default value 50000. The secondary cache uses this many blocks of machine physical memory (direct ByteBuffer). Each block is 2048 bytes, so the default uses 50000*2048 bytes, about 100MB of physical memory outside the JVM heap. Set it based on the available physical memory.
secondaryCache.highWatermark Default value 49050. The level at which the secondary cache eviction thread starts evicting entries from the cache. It should be about 1% below secondaryCache.maxSize.
secondaryCache.lowWatermark Default value 49050. The level at which the secondary cache eviction thread stops evicting. It should be about 2% below secondaryCache.maxSize.
cacheWriter.syncTime Default value 3000 (3 seconds). The value is in milliseconds. It sets how often the cache writer thread writes dirty blocks from the secondary cache back to disk.
disk.asyncWriterThreadPool.queueSize Default value 10. The number of parallel threads the cache writer uses to write secondary cache blocks to disk. Set it according to the type of disk and the number of CPUs on the node.
cache.storage Default value ./cache. Location of the disk file that stores cache entries.
cacheIndex.storage Default value ./cacheIndex. Location of the disk file that stores cache key entries.
cache.type Either btree or hash. Default is btree. If your queries are mostly primary-key (PK) lookups, use hash; otherwise leave it as btree. We have seen a performance improvement of over 100% with hash.
cache.buckets Used only when cache.type is hash. I am in the process of updating a document that explains how to calculate its optimal value. For now, leave it at its default value.
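The sizing arithmetic in the table can be worked through in a few lines. This is a rough sketch only: the class and method names are hypothetical, the 4 GiB heap is just an example, and reading the watermark rules as "1% and 2% below maxSize" is an assumption (the shipped defaults, 49090 and 49050, sit closer together than that reading gives).

```java
// Illustrative sizing helper; the class and method names are hypothetical.
class CacheSizingSketch {
    static final int BLOCK_SIZE = 2048; // bytes per cache block, per the table above

    // Number of whole blocks that fit in the given fraction of a memory budget.
    static long blocksFor(long budgetBytes, double fraction) {
        return (long) (budgetBytes * fraction) / BLOCK_SIZE;
    }

    // Watermark placed the given percentage below maxSize (an assumed reading of the rule).
    static long watermark(long maxSize, int percentBelow) {
        return maxSize - maxSize * percentBelow / 100;
    }

    public static void main(String[] args) {
        long heapBytes = 4L * 1024 * 1024 * 1024;  // example: 4 GiB JVM heap
        long maxSize = blocksFor(heapBytes, 0.5);   // 50% rule of thumb
        System.out.println("primaryCache.maxSize=" + maxSize);
        System.out.println("primaryCache.highWatermark=" + watermark(maxSize, 1));
        System.out.println("primaryCache.lowWatermark=" + watermark(maxSize, 2));
    }
}
```

The same helper applies to the secondary cache if the budget passed in is the physical memory you are willing to spend outside the JVM heap.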

Supported APIs

API Description
byte[] get(byte[] key) Gets entry from the cache.
void set(byte[] key, byte[] value) Set value into the cache.
void remove(byte[] key) Removes entry from the cache.
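Because the API deals only in byte arrays, callers usually need a small encoding layer for keys and values. The sketch below shows one way to do that. Everything in it is hypothetical: ByteCache simply mirrors the three calls listed above, InMemoryByteCache is a stand-in so the example runs without WonderDB on the classpath, and returning null on a miss is an assumption about the real cache's behavior.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface mirroring the three operations listed above.
interface ByteCache {
    byte[] get(byte[] key);
    void set(byte[] key, byte[] value);
    void remove(byte[] key);
}

// In-memory stand-in so the sketch runs without WonderDB on the classpath.
class InMemoryByteCache implements ByteCache {
    // byte[] compares by identity, so keys are wrapped in ByteBuffer for content equality.
    private final Map<ByteBuffer, byte[]> store = new HashMap<ByteBuffer, byte[]>();

    public byte[] get(byte[] key) { return store.get(ByteBuffer.wrap(key)); }
    public void set(byte[] key, byte[] value) { store.put(ByteBuffer.wrap(key.clone()), value.clone()); }
    public void remove(byte[] key) { store.remove(ByteBuffer.wrap(key)); }
}

class KeyCodecSketch {
    // Encodes an int key as 4 big-endian bytes.
    static byte[] intKey(int key) { return ByteBuffer.allocate(4).putInt(key).array(); }

    // Encodes and decodes string values as UTF-8.
    static byte[] utf8(String s) { return s.getBytes(StandardCharsets.UTF_8); }
    static String utf8(byte[] b) { return b == null ? null : new String(b, StandardCharsets.UTF_8); }

    public static void main(String[] args) {
        ByteCache cache = new InMemoryByteCache();
        cache.set(intKey(42), utf8("hello"));
        System.out.println(utf8(cache.get(intKey(42))));
        cache.remove(intKey(42));
        System.out.println(cache.get(intKey(42)) == null);
    }
}
```

Keeping the encoding in one place means every caller produces the same bytes for the same logical key, which matters for any byte-keyed cache.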

Using it in the code

Initialization

The cache must first initialize the primary and secondary caches, including the disk files and thread pools. Add the following line to your initialization code. It should be called only once.

WonderDBCacheService.getInstance().init(<server.properties file name>);

Accessing it in the code

To get a value from the cache, use the following line anywhere in your code. Here key and value are byte[] variables.

byte[] value = CacheManager.getInstance().get(key);

To set a value:

CacheManager.getInstance().set(key, value);

To remove an entry from the cache:

CacheManager.getInstance().remove(key);

Shutdown

Call the following line as the last step of shutdown. It cleans up all resources, such as thread pools, caches, and sockets. The JVM will not exit properly if this line is not called during shutdown.

WonderDBCacheService.getInstance().shutdown();
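One way to make sure shutdown always runs is to wrap the application body in try/finally. The sketch below shows the pattern with a hypothetical CacheLifecycle interface standing in for the init and shutdown calls above. In a real application the two methods would delegate to WonderDBCacheService.getInstance().init(...) and WonderDBCacheService.getInstance().shutdown().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the init/shutdown pair described above.
interface CacheLifecycle {
    void init(String propertiesFile);
    void shutdown();
}

class LifecycleSketch {
    // Runs the application body between init and shutdown.
    // The finally block guarantees shutdown runs even if the body throws.
    static void runWithCache(CacheLifecycle cache, String propertiesFile, Runnable body) {
        cache.init(propertiesFile);
        try {
            body.run();
        } finally {
            cache.shutdown();
        }
    }

    public static void main(String[] args) {
        final List<String> events = new ArrayList<String>();
        // Recording implementation so the call order is visible without a real cache.
        CacheLifecycle recording = new CacheLifecycle() {
            public void init(String propertiesFile) { events.add("init " + propertiesFile); }
            public void shutdown() { events.add("shutdown"); }
        };
        runWithCache(recording, "server.properties", new Runnable() {
            public void run() { events.add("work"); }
        });
        System.out.println(events);
    }
}
```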

Questions?

Post your question on the WonderDBCache Google group.

Performance Tests

Performance testing was done on an Amazon EC2 xlarge VM (4 CPUs, 13 GB RAM, and two 40GB standard SSDs with 3000 IOPS). Here are the results.

I first loaded 1 million entries into the cache, each with a 1000-byte value and a 4-byte key.

The load took 92 seconds, which is about 10869 records per second.

All tests were run against the get and set APIs, accessing a randomly generated int key between 0 and 1M. The results below are for the get API.
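For readers who want to run a similar measurement, a test of this shape can be sketched as follows. This is not the harness used for the numbers in this post. The Lookup interface and all names are hypothetical, and the no-op lookup in main is a stand-in that would be replaced by a call to the cache's get.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical lookup hook; in a real run it would delegate to the cache's get call.
interface Lookup {
    byte[] get(byte[] key);
}

class GetBenchmarkSketch {
    // Starts `threads` workers, each issuing `opsPerThread` gets on random
    // int keys in [0, keySpace). Returns the total number of completed gets.
    static long run(final Lookup lookup, int threads, final int opsPerThread, final int keySpace)
            throws InterruptedException {
        final AtomicLong completed = new AtomicLong();
        final CountDownLatch done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            new Thread(new Runnable() {
                public void run() {
                    try {
                        for (int i = 0; i < opsPerThread; i++) {
                            int key = ThreadLocalRandom.current().nextInt(keySpace);
                            // 4-byte key, matching the key size used in the load step
                            lookup.get(ByteBuffer.allocate(4).putInt(key).array());
                            completed.incrementAndGet();
                        }
                    } finally {
                        done.countDown();
                    }
                }
            }).start();
        }
        done.await(); // wait for every worker to finish
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        Lookup noop = new Lookup() {
            public byte[] get(byte[] key) { return key; } // stand-in, no real cache behind it
        };
        long start = System.nanoTime();
        long ops = run(noop, 4, 100000, 1000000);
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.println(ops + " gets, " + (long) (ops / seconds) + " QPS");
    }
}
```

Narrowing keySpace so that all keys fit in memory, or widening it so that some lookups go to disk, is how the hit ratio is varied between test cases.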

Cache settings with hash index (cache.type=hash, config settings)
100% access from cache

 

# of threads Queries per second (QPS) CPU
200K 25000 25%
400K 44000 50%
600K 54000 60%
800K 61500 70%
1M 52600 75-80%
1.2M 50000 75-80%

Cache settings with BTree+ index (cache.type=btree, config settings)
Case I – 100% access from cache

In this test, keys were generated between 0 and 100000 so that every key is served from the cache.

# of threads Queries per second (QPS) CPU
1 14285 20-25%
2 23529 25-50%
3 30000 50-75%
4 33000 75+%
5 33000 75+%

Case II – 80% from memory and 20% from disk (SSD)

In this test, keys were generated between 0 and 1M so that about 20% of lookups miss the cache.

# of threads Queries per second (QPS) CPU
1 7142 20-25%
2 10000 25-50%
3 12500 50-75%
4 13800 75+%
5 12500 75+%
