Java
1.7 and up.
WonderDB uses NIO file channels to write cache blocks back to disk, hence the dependency on Java 1.7.
POM
<dependency>
  <groupId>org.wonderdb</groupId>
  <artifactId>wonderdb</artifactId>
  <version>0.2</version>
</dependency>
Github location
https://github.com/dreamdbvilas/wonderdb
Design and How it works
The WonderDB cache can be configured to use either a B+ tree index or a hash index. Please refer to my blog posts on the B+ tree and the hash index to understand how both data structures work in general and how they are used in WonderDB.
Configuration
The default server.properties is configured for development machines, with minimal cache sizes. You will need to change the values based on the available physical memory.
Property | Description |
---|---|
primaryCache.maxSize | Default value 50000. The primary cache uses this many blocks of JVM memory. Each block is 2048 bytes, so the default uses 50000 * 2048 = 102,400,000 bytes (about 100 MB) of JVM heap. As a rule of thumb, give the primary cache 50% of the JVM heap. |
primaryCache.highWatermark | Default value 49090. The number of cached blocks at which the primary cache eviction thread starts evicting entries from the cache. It should be about 1% below primaryCache.maxSize. |
primaryCache.lowWatermark | Default value 49050. The number of cached blocks at which the primary cache eviction thread stops evicting. It should be about 2% below primaryCache.maxSize. |
secondaryCache.maxSize | Default value 50000. The secondary cache uses this many blocks of physical memory outside the JVM heap (direct ByteBuffers). Each block is 2048 bytes, so the default uses 50000 * 2048 = 102,400,000 bytes (about 100 MB). Set it based on the physical memory available on the machine. |
secondaryCache.highWatermark | Default value 49050. The number of cached blocks at which the secondary cache eviction thread starts evicting entries from the cache. It should be about 1% below secondaryCache.maxSize. |
secondaryCache.lowWatermark | Default value 49050. The number of cached blocks at which the secondary cache eviction thread stops evicting. It should be about 2% below secondaryCache.maxSize. |
cacheWriter.syncTime | Default value 3000 (3 seconds), specified in milliseconds. How often the cache writer thread writes dirty blocks from the secondary cache back to disk. |
disk.asyncWriterThreadPool.queueSize | Default value 10. The number of parallel threads the cache writer uses to write secondary cache blocks to disk. Set it according to the type of disk and the number of CPUs on the node. |
cache.storage | Default value ./cache. Location of the disk file that stores cache entries. |
cacheIndex.storage | Default value ./cacheIndex. Location of the disk file that stores cache key entries. |
cache.type | [btree, hash]. Default is btree. Read more about btree. If your queries are mostly primary key (PK) lookups, use hash; otherwise leave it as btree. We have seen a performance improvement of over 100% with hash. |
cache.buckets | Used only when cache.type is hash. I am updating a document that explains how to calculate its optimal value. For now, leave it at its default value. |
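
The sizing arithmetic in the table can be sketched in a few lines of Java. This is only an illustration of the block math and the 50% rule of thumb; the class and method names (`CacheSizing`, `bytesForBlocks`, `blocksForBudget`) are made up for this example and are not part of WonderDB.

```java
// Sketch only: CacheSizing and its methods are hypothetical helpers, not WonderDB APIs.
final class CacheSizing {
    // Block size stated in the configuration table.
    static final long BLOCK_SIZE_BYTES = 2048L;

    private CacheSizing() {}

    /** Memory consumed by a cache that holds the given number of blocks. */
    static long bytesForBlocks(long blocks) {
        return blocks * BLOCK_SIZE_BYTES;
    }

    /** Number of whole blocks that fit into the given fraction of a memory budget. */
    static long blocksForBudget(long budgetBytes, double fraction) {
        if (budgetBytes < 0 || fraction < 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("budget must be >= 0 and fraction in [0, 1]");
        }
        return (long) (budgetBytes * fraction) / BLOCK_SIZE_BYTES;
    }

    public static void main(String[] args) {
        // Default setting of 50000 blocks.
        System.out.println(bytesForBlocks(50000)); // 102400000 bytes, about 100 MB

        // Rule of thumb: give the primary cache 50% of the JVM heap.
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("primaryCache.maxSize=" + blocksForBudget(maxHeap, 0.5));
    }
}
```

For example, with a 4 GB heap the rule of thumb gives 1,048,576 blocks for primaryCache.maxSize.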
Supported APIs
API | Description |
---|---|
byte[] get(byte[] key) | Gets an entry from the cache. |
void set(byte[] key, byte[] value) | Sets a value in the cache. |
void remove(byte[] key) | Removes an entry from the cache. |
Using it in the code
Initialization
The primary and secondary caches, including the disk files and thread pools, must be initialized first. Add the following line to your initialization code. It should be called only once.
WonderDBCacheService.getInstance().init(<server.properties file name>);
Accessing it in the code
To get a value from the cache, use the following line anywhere in the code (key and value are byte[] arrays in all the calls below).
byte[] value = CacheManager.getInstance().get(key);
To set a value:
CacheManager.getInstance().set(key, value);
To remove an entry from the cache:
CacheManager.getInstance().remove(key);
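
Since the API only accepts byte[] keys and values, application code usually needs a small conversion layer. The helper below is one possible sketch, assuming UTF-8 for strings and 4-byte big-endian encoding for int keys; `CacheKeys` and its methods are hypothetical names, not part of WonderDB.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Sketch only: CacheKeys is a hypothetical helper, not a WonderDB class.
final class CacheKeys {
    private CacheKeys() {}

    /** Encodes a string key or value as UTF-8 bytes. */
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Decodes bytes that were produced by encode(String). */
    static String decodeString(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** Encodes an int key as 4 big-endian bytes. */
    static byte[] encodeInt(int key) {
        return ByteBuffer.allocate(4).putInt(key).array();
    }

    /** Decodes a 4-byte key that was produced by encodeInt(int). */
    static int decodeInt(byte[] bytes) {
        if (bytes.length != 4) {
            throw new IllegalArgumentException("expected 4 bytes, got " + bytes.length);
        }
        return ByteBuffer.wrap(bytes).getInt();
    }
}
```

A call would then look like `CacheManager.getInstance().set(CacheKeys.encode("user:42"), CacheKeys.encode("Alice"))`. How `get` signals a missing key is not described here, so check the returned array before decoding it.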
Shutdown
Call the following line last during shutdown. It cleans up all resources such as thread pools, caches, and sockets. The JVM will not exit properly if this line is not called during shutdown.
WonderDBCacheService.getInstance().shutdown();
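
Because the JVM will not exit cleanly without the shutdown call, it helps to make that call unconditional. A minimal sketch of the init, work, shutdown ordering with try/finally follows; `CacheLifecycle` is a hypothetical wrapper, and the three steps are passed in as Runnables so the example runs without WonderDB on the classpath.

```java
// Sketch only: CacheLifecycle is a hypothetical wrapper, not a WonderDB class.
final class CacheLifecycle {
    private CacheLifecycle() {}

    /**
     * Runs init once, then the application work, and always runs shutdown
     * afterwards, even when the work throws.
     *
     * In an application the steps would wrap the calls shown above, e.g.
     *   init:     WonderDBCacheService.getInstance().init("server.properties");
     *   shutdown: WonderDBCacheService.getInstance().shutdown();
     */
    static void run(Runnable init, Runnable work, Runnable shutdown) {
        init.run();
        try {
            work.run();
        } finally {
            shutdown.run();
        }
    }
}
```

If init itself fails, shutdown is not called in this sketch; whether a partially initialized service needs cleanup is not covered here.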
Questions?
Post your question on the WonderDBCache Google group.
Performance Tests
Performance testing was done on an Amazon EC2 xlarge VM (4 CPUs, 13 GB RAM, and two 40 GB standard SSDs with 3000 IOPS). Here are the results.
I first loaded 1 million entries into the cache, with a value size of 1000 bytes and a key size of 4 bytes.
The load took 92 seconds, about 10869 records per second.
All tests ran the get and set APIs against randomly generated int keys between 0 and 1M. The results below are for the get API.
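
For readers who want to run a comparable measurement, the sketch below shows the general shape of such a test: several threads issue gets with random 4-byte int keys, and throughput is total gets divided by elapsed time. It is not the harness used for the numbers in this post. `GetBenchmark` and `Lookup` are hypothetical names, and an in-memory map stands in for the cache so the example runs on its own.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: GetBenchmark and Lookup are hypothetical, not part of WonderDB.
final class GetBenchmark {
    /** Stand-in for the cache's get call. */
    interface Lookup {
        byte[] get(byte[] key);
    }

    private GetBenchmark() {}

    /** Encodes an int as a 4-byte key, matching the key size used in the tests. */
    static byte[] intKey(int k) {
        return ByteBuffer.allocate(4).putInt(k).array();
    }

    /** Runs the given number of threads, each doing opsPerThread random gets; returns total gets. */
    static long runGets(final Lookup cache, int threads, final int opsPerThread, final int keySpace) {
        final AtomicLong completed = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(new Runnable() {
                @Override
                public void run() {
                    ThreadLocalRandom rnd = ThreadLocalRandom.current();
                    for (int i = 0; i < opsPerThread; i++) {
                        cache.get(intKey(rnd.nextInt(keySpace)));
                        completed.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.MINUTES)) {
                throw new IllegalStateException("benchmark timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return completed.get();
    }

    /** Operations per second for the given operation count and elapsed time. */
    static double qps(long ops, long elapsedNanos) {
        return ops / (elapsedNanos / 1e9);
    }

    public static void main(String[] args) {
        // In-memory stand-in; keys are wrapped in ByteBuffer so they compare by content.
        final ConcurrentHashMap<ByteBuffer, byte[]> map = new ConcurrentHashMap<ByteBuffer, byte[]>();
        for (int k = 0; k < 10000; k++) {
            map.put(ByteBuffer.wrap(intKey(k)), new byte[1000]);
        }
        Lookup lookup = new Lookup() {
            @Override
            public byte[] get(byte[] key) {
                return map.get(ByteBuffer.wrap(key));
            }
        };
        long start = System.nanoTime();
        long ops = runGets(lookup, 4, 100000, 10000);
        System.out.printf("%.0f gets/s%n", qps(ops, System.nanoTime() - start));
    }
}
```

To measure the real cache, the `Lookup` implementation would delegate to `CacheManager.getInstance().get(key)` after the cache has been initialized and loaded.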
Cache settings with hash index (cache.type=hash, config settings)
100% access from cache
# of threads | Queries per second (QPS) | CPU |
---|---|---|
200K | 25000 | 25% |
400K | 44000 | 50% |
600K | 54000 | 60% |
800K | 61500 | 70% |
1M | 52600 | 75-80% |
1.2M | 50000 | 75-80% |
Cache settings with B+ tree index (cache.type=btree, config settings)
Case I – 100% access from cache
In this test, keys were generated between 0 and 100000 to make sure every key was served from the cache.
# of threads | Queries per second (QPS) | CPU |
---|---|---|
1 | 14285 | 20-25% |
2 | 23529 | 25-50% |
3 | 30000 | 50-75% |
4 | 33000 | 75+% |
5 | 33000 | 75+% |
Case II – 80% from memory and 20% from disk (SSD)
In this test, keys were generated between 0 and 1M so that about 20% of lookups missed the cache.
# of threads | Queries per second (QPS) | CPU |
---|---|---|
1 | 7142 | 20-25% |
2 | 10000 | 25-50% |
3 | 12500 | 50-75% |
4 | 13800 | 75+% |
5 | 12500 | 75+% |