Many years ago I designed and implemented a trading strategy container which included strategy persistence, and I used a toolkit that was ready to hand, familiar to me (Hibernate) -- and used the most commonly deployed SQL database at my employer. It's not quite the all-time winner for worst architectural choices I have made, but it's a contender: if I had to do it all over neither Hibernate nor a SQL database would figure in the design, thus reducing complexity and cost at the same time. I'd probably put JSON documents into a MongoDB and call it a day.
Nevertheless, relational databases have their uses, and the object-relational impedance mismatch has not magically healed itself in the intervening years. I wanted to pull down a bunch of data from the GDAX cryptocurrency exchange and store it for P&L reporting on the fills. Furthermore, I wanted cheap and cheerful, not complex and fussy, which led me to the popular SQLite embedded database and JDBI, one of a family of open source libraries like iBatis and Spring JDBC that focus on easing use of JDBC and SQL rather than hiding it. And along the way I found another reason to like IntelliJ IDEA as a tool: it includes some nice, lightweight features that ease interacting with databases while developing.
How to stop worrying and learn to love SQL
After years working with it I believe object-relational mapping is not a solved problem, at least not in statically-typed languages like Java, and attempts to solve it will take you up a curve of exponentially increasing frustration. I believe in most cases you are better off working closer to the metal with raw SQL, acknowledging that it is fundamentally different from your object model. You don't need to buy into my opinion to read the rest of the article, but I wanted to explain the thinking to motivate why you might want to look at minimalist libraries like JDBI before reaching for Java object-relational mapping of any kind, including what's built into these lighter libraries.
Optimal object models and relational models are never going to look the same. Here "optimal" is the key word: if you are willing to accept a weaker object model or a poorer relational data model you can save a lot of time using an object-relational framework -- and that's OK. But if you are not, here are some of the gaps you might see, depending on the framework chosen:
- you might need to add getter/setter methods, exposing fields that should stay hidden
- you will mix concerns by combining annotations coupled to the schema with business logic
- custom type mapping for string-like value types like URI, UUID, etc. will further couple you to the framework if they are not natively supported
- caching will have to be in the database or the framework
- more sophisticated data models like bi-temporal schemas might be beyond your reach
- to write tuned SQL you will first have to learn what the framework equivalent is, if one exists
The variety, extensibility and flexibility of object models is our enemy here: it's far too easy to come up with an object model which will break your framework of choice, and every elaboration to try and fit in more cases increases the framework's complexity and scope for bugs. It can also slow down your database layer dramatically.
What will you lose? A degree of cross-database SQL dialect portability is one of the big benefits of a good object-relational framework. You'll also probably end up hand-writing SQL and Java code which the framework would have generated, so your code will be more verbose, though there is scope to use OO techniques to reuse code in ways a framework might not allow. This will take time, and if you don't much care about one of the two models and control both code and database schema, just picking one as the "master" and generating the other will speed up initial development dramatically. You should just have eyes open about the trade-offs: the up-front cost saving does not come for free, and you might end up spending more time on the care-and-feeding of your object-relational layer over time.
SQLite
SQLite is a robust, widely-deployed, C-based library which offers a variety of cross-language bindings, including Java, and a rich implementation of SQL in a small runtime footprint. For Java, Xerial offers a JDBC driver pre-packaged with JNI libraries for common architectures. Importantly, its SQL implementation is complete enough that a later switch to a larger database is not out of the question.
The biggest issue with it is the limited range of SQL column data types. While on the surface it supports a wide range, under the hood they are all mapped to NULL, INTEGER, REAL, TEXT or BLOB storage types. This can lead to quirks in handling of types like DATETIME, e.g. if you want to support microsecond timestamps and your DATETIME column is actually stored as an INTEGER number of milliseconds since the UNIX epoch, you are out of luck. Similarly, issues with storing very large REAL values which might otherwise map to BigDecimal force you to store them as a TEXT type.
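One way around both quirks is to convert to TEXT at the application boundary. This is a sketch, not tied to any particular driver or to how the real schema does it; the class and method names are illustrative, but the round-trip properties of `DateTimeFormatter.ISO_INSTANT` and `BigDecimal`'s string forms are standard Java behavior:

```java
import java.math.BigDecimal;
import java.time.Instant;
import java.time.format.DateTimeFormatter;

// Illustrative helpers: store microsecond timestamps and arbitrary-precision
// decimals in SQLite TEXT columns, converting at the application boundary.
public class SqliteTextCodecs {

    // ISO-8601 text round-trips microsecond (even nanosecond) precision
    // losslessly, and UTC timestamps sort lexicographically in time order.
    static String encodeTimestamp(Instant ts) {
        return DateTimeFormatter.ISO_INSTANT.format(ts);
    }

    static Instant decodeTimestamp(String text) {
        return Instant.parse(text);
    }

    // toPlainString() / new BigDecimal(String) round-trip exactly, with none
    // of the precision loss a REAL (IEEE double) column would introduce.
    static String encodeDecimal(BigDecimal value) {
        return value.toPlainString();
    }

    static BigDecimal decodeDecimal(String text) {
        return new BigDecimal(text);
    }

    public static void main(String[] args) {
        Instant ts = Instant.parse("2017-12-01T10:15:30.123456Z");
        System.out.println(decodeTimestamp(encodeTimestamp(ts)).equals(ts)); // true

        BigDecimal px = new BigDecimal("16750.123456789012345");
        System.out.println(decodeDecimal(encodeDecimal(px)).equals(px)); // true
    }
}
```

The trade-off is that SQL-side arithmetic and range queries on these columns get more awkward, so it suits write-mostly audit data like fills better than heavily-queried columns.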
JDBI
JDBI, the Java DataBase Interface, is a modern, Java 8-friendly library that layers over JDBC. You start by initializing a Jdbi object, e.g. using a JDBC driver string for SQLite:
```java
Jdbi dbi = Jdbi.create("jdbc:sqlite:behemoth.db");
```
You can run a multi-line SQL script located on the CLASSPATH with just a couple of lines, e.g. to load up DDL and initialize the database with static reference data:
```java
String installScript = ClasspathSqlLocator
        .findSqlOnClasspath(InitDatabase.class.getName() + "_install");
dbi.useHandle(hnd -> {
    int totalExecuted = 0;
    int[] rowsExecuted = hnd.createScript(installScript).execute();
    for (int rows : rowsExecuted) {
        totalExecuted += rows;
    }
    System.out.println("Executed: " + totalExecuted);
});
```
But the real fun starts with queries. Just as useHandle takes a lambda, you can create a Query object from any string and turn it into a ResultIterable, which transforms the underlying JDBC ResultSet into an iterable stream of objects.
This next fragment of code creates a simple in-memory cache of "typecode"-like tables, ones that map an integer ID to a string code. Note that, as a nice Java 8 trick, the constructor MyTypeCode(Long, String) maps to a BiFunction<Long, String, T> via MyTypeCode::new, so most of these caches can be created given just a class, a table name and a method reference to a constructor:
```java
public class JdbiDynamicEnumCache<T extends DynamicEnum> extends AbstractDynamicEnumCache<T> {
    public JdbiDynamicEnumCache(Jdbi conn, Class<T> clazz, String tableName,
                                BiFunction<Long, String, T> typeCodeFactory) {
        super(clazz);
        conn.useHandle(handle -> {
            Query q = handle.createQuery("SELECT id, name FROM " + tableName);
            ResultIterable<T> tcIter = q.map((rs, ctx) ->
                    typeCodeFactory.apply(rs.getLong("id"), rs.getString("name")));
            tcIter.forEach(this::add);
        });
    }
}
```
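The constructor-reference trick is independent of JDBI; here is a minimal, stdlib-only illustration (MyTypeCode is the hypothetical class from the text, reduced to its essentials):

```java
import java.util.function.BiFunction;

// Stdlib-only demo: a (Long, String) constructor reference satisfies
// BiFunction<Long, String, T>, which is all the cache needs to build rows.
public class TypeCodeDemo {
    static class MyTypeCode {
        final long id;
        final String name;

        MyTypeCode(Long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    public static void main(String[] args) {
        // MyTypeCode::new is assignable to the factory type directly
        BiFunction<Long, String, MyTypeCode> factory = MyTypeCode::new;
        MyTypeCode buy = factory.apply(1L, "BUY");
        System.out.println(buy.id + " " + buy.name); // prints "1 BUY"
    }
}
```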
Want to do something sophisticated like bi-temporal table upserts without needing stored procedures? This is where JDBI really shines. An upsert requires atomicity (via a transaction); a comparison to see if a value is new (via equals, an object operation); and the ability to do both a SQL INSERT and SELECT. All of these operations are available to you with minimal code in JDBI.
Start by finding the latest value by name:
```java
private ResultIterable<BitemporalRow<ProductEntity>> findLatestByName(String name, Handle handle) {
    String whereClause = "WHERE name = :name AND end_time = :endTime";
    String sql = new ST(queryTemplate).add("where", whereClause).render();
    return handle.createQuery(sql)
            .bind("name", name)
            .bind("endTime", new java.sql.Date(ValidRange.END_TIME.toEpochMilli()))
            .map(getProductRowMapper());
}
```
where

```java
private static final String ST_GROUP = "cloudwall/tradedb/dbi/JdbiProductEntityDao.sql.stg";

ST queryTemplate = StringTemplateSqlLocator.findStringTemplate(ST_GROUP, "selectAll");
```
and the STG file contains the definition of selectAll:
```
selectAll(where) ::= <<
SELECT
    vid,
    eid,
    exchange_id,
    base_asset_id,
    quote_asset_id,
    name,
    min_order_size,
    max_order_size,
    min_order_increment,
    min_quote_increment,
    update_user,
    version,
    start_time,
    end_time
FROM product
<where>
>>
```
This uses a feature in JDBI, StringTemplate support, which lets you render dynamic SQL. Here I'm looking for the row with a given name where end_time is equal to a sentinel value for "latest", e.g. 9999-12-31 for those who care to have a Y10K problem.
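For the findLatestByName() case above, substituting the where fragment into the template yields SQL along these lines (the named parameters are then bound by JDBI):

```sql
SELECT
    vid,
    eid,
    exchange_id,
    base_asset_id,
    quote_asset_id,
    name,
    min_order_size,
    max_order_size,
    min_order_increment,
    min_quote_increment,
    update_user,
    version,
    start_time,
    end_time
FROM product
WHERE name = :name AND end_time = :endTime
```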
You can then execute the combination of findLatestByName(), UPDATE and INSERT with:
```java
private Jdbi conn;

@Override
public BitemporalRow<ProductEntity> upsert(ProductEntity product, String updateUser) {
    return conn.inTransaction(TransactionIsolationLevel.SERIALIZABLE, handle -> {
        // select, then UPDATE and INSERT, or just INSERT, depending on whether object exists
    });
}
```
You can see the full code in cloudwall.tradedb.dbi.JdbiProductEntityDao.
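The decision logic inside that transaction can also be sketched independently of JDBI. This is a deliberately simplified, in-memory model with hypothetical types -- the real DAO works against SQL rows with start_time/end_time columns -- but the branch structure (select, compare via equals, close out and insert, or insert fresh) is the same:

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Simplified, in-memory model of a bitemporal upsert decision: if the latest
// row's value differs (via equals), close it out and insert a new version;
// if nothing changed, do nothing.
public class BitemporalUpsertSketch {
    // Sentinel "end of time" meaning "current row"
    static final Instant END_TIME = Instant.parse("9999-12-31T00:00:00Z");

    static class Row {
        final String value;
        final int version;
        Instant endTime = END_TIME;   // open-ended until superseded

        Row(String value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    final Map<String, Row> latestByName = new HashMap<>();

    // Returns the current row after the upsert.
    Row upsert(String name, String value, Instant now) {
        Row latest = latestByName.get(name);   // SELECT ... WHERE end_time = sentinel
        if (latest != null && latest.value.equals(value)) {
            return latest;                     // unchanged: no-op
        }
        int nextVersion = (latest == null) ? 1 : latest.version + 1;
        if (latest != null) {
            latest.endTime = now;              // UPDATE: close out the old version
        }
        Row fresh = new Row(value, nextVersion);  // INSERT: new open-ended version
        latestByName.put(name, fresh);
        return fresh;
    }
}
```

Running the three branches in sequence: a first upsert creates version 1, repeating the same value is a no-op, and a changed value closes version 1 at the given time and inserts version 2.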
IntelliJ
IntelliJ Ultimate 2017.3 supports the Xerial JDBC driver out-of-the-box, so you can configure a database connection, get a browsable view of the schema, and even browse table contents.
There's also a basic SQL execution console which lets you write queries against your new embedded database, which is a great way to test them out before embedding them in code.