EPSG lookup failures caused by concurrent database shutdown in FactoryUsingHSQL finalizer

Description

Database shutdown in the FactoryUsingHSQL finalizer causes concurrent EPSG lookup via other FactoryUsingHSQL instances using the same database URL and thus the same org.hsqldb.Database instance to fail in, for example, GeoServer gs-main unit tests.

Note that ThreadedHsqlEpsgFactory.createDataSource() opens the database with "shutdown=true", so the database will be shut down when the last session is closed. This suggests that an acceptable fix might be to override FactoryUsingHSQL.shutdown() to disable it for use in ThreadedHsqlEpsgFactory. This would be backwards-compatible with all other uses.
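The shape of the proposed override can be sketched roughly as follows. This is a minimal illustration only: SharedDatabase, HsqlFactorySketch and ThreadedFactorySketch are hypothetical stand-ins, not the GeoTools classes, and the no-argument shutdown() signature is an assumption that may differ from the real FactoryUsingHSQL.shutdown().

```java
// Hypothetical stand-in for the JVM-wide database shared by several factories.
class SharedDatabase {
    private boolean open = true;

    synchronized void close() {
        open = false;
    }

    synchronized boolean isOpen() {
        return open;
    }
}

// Hypothetical stand-in for a factory whose disposal shuts the database down.
class HsqlFactorySketch {
    protected final SharedDatabase database;

    HsqlFactorySketch(SharedDatabase database) {
        this.database = database;
    }

    // Hook called on disposal; the default closes the shared database.
    protected void shutdown() {
        database.close();
    }

    final void dispose() {
        shutdown();
    }
}

// Hypothetical stand-in for the threaded factory with the shutdown disabled.
class ThreadedFactorySketch extends HsqlFactorySketch {
    ThreadedFactorySketch(SharedDatabase database) {
        super(database);
    }

    @Override
    protected void shutdown() {
        // Deliberately empty: the factory does not own the database, and a
        // database opened with shutdown=true closes when its last session does.
    }
}

class ShutdownOverrideDemo {
    public static void main(String[] args) {
        SharedDatabase database = new SharedDatabase();
        new ThreadedFactorySketch(database).dispose();
        System.out.println("open after threaded dispose: " + database.isOpen());
        new HsqlFactorySketch(database).dispose();
        System.out.println("open after default dispose: " + database.isOpen());
    }
}
```

Only the subclass changes behaviour in this sketch, which is why the approach would leave other users of the base class unaffected.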

There is one org.hsqldb.Database instance per JVM for each unique database URL. FactoryUsingHSQL really has no business shutting down a resource that it does not own. Furthermore, all synchronization on the parent DirectEpsgFactory dispose() and getConnection() is on the DirectEpsgFactory instance and does not protect against use by other DirectEpsgFactory instances of the same org.hsqldb.Database instance concurrent with its shutdown.
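The locking gap described above can be illustrated with a small self-contained sketch. FactorySketch is a hypothetical stand-in whose methods synchronize on the instance, as the text says DirectEpsgFactory does; none of these names come from GeoTools. While one instance holds its own monitor, a second instance is still free to enter its synchronized method and touch whatever the two share.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class PerInstanceLockDemo {
    // Hypothetical factory: both methods lock on "this", not on a shared object.
    static class FactorySketch {
        synchronized void holdLock(CountDownLatch entered, CountDownLatch release) {
            entered.countDown();
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        synchronized void touchSharedResource() {
            // Stands in for a lookup against the shared database.
        }
    }

    // Returns true if "second" entered its synchronized method while "first"
    // was still holding its own monitor.
    static boolean entersWhileOtherHoldsLock(FactorySketch first, FactorySketch second) {
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(1);
        Thread holder = new Thread(() -> first.holdLock(entered, release));
        Thread other = new Thread(() -> {
            second.touchSharedResource();
            done.countDown();
        });
        try {
            holder.start();
            entered.await(); // the holder now owns the monitor of "first"
            other.start();
            boolean enteredConcurrently = done.await(500, TimeUnit.MILLISECONDS);
            release.countDown(); // let the holder leave its synchronized method
            holder.join();
            other.join();
            return enteredConcurrently;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        FactorySketch a = new FactorySketch();
        FactorySketch b = new FactorySketch();
        System.out.println("different instances overlap: " + entersWhileOtherHoldsLock(a, b));
        System.out.println("same instance overlaps: " + entersWhileOtherHoldsLock(a, a));
    }
}
```

Excluding a concurrent shutdown would need a lock that all instances share, for example one associated with the database itself, rather than a per-instance monitor.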

Failures are readily seen in about 8% of gs-main "mvn clean install" builds on a 2 CPU machine (e.g. Linux booted with maxcpus=2). Overriding FactoryUsingHSQL.shutdown() to disable it for use in ThreadedHsqlEpsgFactory seems to reduce the total failure rate of gs-main builds to about 0.5%.

Occurrences of the finalizer and the related failure can be seen by rebuilding hsqldb 2.4.1 with a patch that prints stack traces in org.hsqldb.Database.close(), using ./gradlew -Dbuild.debug=true hsqldb. For example:

Environment

Linux booted with maxcpus=2 to reproduce GeoServer gs-main failures.

Activity

Ben Caradoc-Davies
June 30, 2018, 4:54 AM

For your debugging convenience, I have attached hsqldb-2.4.1.jar built with debugging and the patch to print stack traces in org.hsqldb.Database.close() applied.

Andrea Aime
June 30, 2018, 11:57 AM

The bit that I don't get is "using the same database URL and thus the same org.hsqldb.Database instance to fail in".
Is this a "feature" unique to HSQLDB? Normally using the same URL generates separate sets of connections anyways.
The "org.hsqldb.Database" class does not seem to be documented. I also checked http://hsqldb.org/doc/guide/running-chapt.html#rgc_inprocess but did not see any mention of the object being shared.

Ben Caradoc-Davies
June 30, 2018, 10:29 PM
Edited

My understanding comes from reading the source and observing the behaviour of HSQLDB in the debugger.

org.hsqldb.DatabaseManager, which is used to manage all org.hsqldb.Database instances obtained from org.hsqldb.jdbc.JDBCDriver, maintains a static HashMap of database keys to org.hsqldb.Database instances:
https://sourceforge.net/p/hsqldb/svn/HEAD/tree/base/tags/2.4.1/src/org/hsqldb/DatabaseManager.java#l75

Any request for a database returns an existing org.hsqldb.Database instance if the HashMap contains one with the same key:
https://sourceforge.net/p/hsqldb/svn/HEAD/tree/base/tags/2.4.1/src/org/hsqldb/DatabaseManager.java#l333

For file databases, the key is the canonical path of the file:
https://sourceforge.net/p/hsqldb/svn/HEAD/tree/base/tags/2.4.1/src/org/hsqldb/DatabaseManager.java#l546

This means that there can only be one org.hsqldb.Database instance per JVM for each canonical file path, and it is shared between all connections using the same file path. I am not sure what happens to query components of the URL but I think they might be handled in org.hsqldb.DatabaseURL.parseURL. In our case the URL is identical as we are dealing with multiple instances of ThreadedHsqlEpsgFactory.
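The sharing behaviour described in this comment can be approximated with a small sketch. It is not HSQLDB code: CanonicalPathRegistryDemo and DatabaseSketch are hypothetical names, and the only real API used is java.io.File.getCanonicalPath(). A static map keyed by canonical path hands the same object to every caller whose path resolves to the same file.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

class CanonicalPathRegistryDemo {
    // Hypothetical stand-in for a per-file database object.
    static final class DatabaseSketch {
        final String canonicalPath;

        DatabaseSketch(String canonicalPath) {
            this.canonicalPath = canonicalPath;
        }
    }

    // One static map per JVM (strictly, per class loader), keyed by canonical path.
    private static final Map<String, DatabaseSketch> DATABASES = new HashMap<>();

    // Returns the existing instance for the file, creating it on first request.
    static synchronized DatabaseSketch getDatabase(String path) {
        String key;
        try {
            key = new File(path).getCanonicalPath();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return DATABASES.computeIfAbsent(key, DatabaseSketch::new);
    }

    public static void main(String[] args) {
        DatabaseSketch first = getDatabase("epsg-db");
        DatabaseSketch second = getDatabase("./epsg-db"); // same file, different spelling
        System.out.println("shared instance: " + (first == second));
    }
}
```

In this model, any caller that closes the returned object closes it for every other holder, which is the hazard the issue describes.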

Ben Caradoc-Davies
July 1, 2018, 10:00 PM

Merged on master. I will backport to 19.x and 18.x when the builds for master are complete.

Ben Caradoc-Davies
July 2, 2018, 1:02 AM

Backported to 19.x and 18.x.

Assignee

Ben Caradoc-Davies

Reporter

Ben Caradoc-Davies

Triage

None

Priority

Medium