DataAccessFinder performance degradation under concurrency

Description

The performance of DataAccessFinder.getAllDataStores and getAvailableDataStores degrades badly under concurrency. Both methods can be called rather frequently: GeoServer, for instance, calls them when configuring a data source, which hinders the performance of its REST config API.

The problems seem to be mostly related to synchronization, but there is also going on, and probably other issues.

At 256 concurrent threads, the average call time of getAllDataStores rises to 2.8 ms, and that of getAvailableDataStores to 294 ms.

That wouldn’t be the end of the world, but some threads get really unlucky, to the point that getAllDataStores can block for about 820 milliseconds, and getAvailableDataStores for about 980 seconds! That’s an astonishing 16 minutes waiting for a single method call. Even at a “more reasonable” concurrency of 64 threads, it could block for more than 4 minutes.
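To make the distinction between average and worst-case call time concrete, here is a minimal sketch of a probe that runs the same call from many threads and records both figures. It is not the JMH benchmark used for the numbers in this report; the class name ConcurrentLatencyProbe, its measure method, and the synchronized stand-in workload are all hypothetical, chosen so the example runs without GeoTools on the classpath.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical helper; not part of GeoTools or of the benchmark project. */
class ConcurrentLatencyProbe {

    /**
     * Runs {@code call} from {@code threads} threads, {@code callsPerThread} times each.
     * Returns {averageNanos, maxNanos} over all calls.
     */
    static long[] measure(int threads, int callsPerThread, Runnable call) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        LongAdder totalNanos = new LongAdder();
        AtomicLong maxNanos = new AtomicLong();
        try {
            for (int t = 0; t < threads; t++) {
                pool.execute(() -> {
                    try {
                        start.await(); // hold every worker until all are ready
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    for (int i = 0; i < callsPerThread; i++) {
                        long begin = System.nanoTime();
                        call.run();
                        long elapsed = System.nanoTime() - begin;
                        totalNanos.add(elapsed);
                        // keep the slowest single call seen by any thread
                        maxNanos.accumulateAndGet(elapsed, Math::max);
                    }
                });
            }
            start.countDown(); // release all workers at once
            pool.shutdown();
            if (!pool.awaitTermination(10, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
        long calls = (long) threads * callsPerThread;
        return new long[] {totalNanos.sum() / calls, maxNanos.get()};
    }

    public static void main(String[] args) {
        Object lock = new Object();
        // Stand-in workload (an assumption): a contended synchronized block.
        // With GeoTools available, the lambda could call
        // DataAccessFinder.getAvailableDataStores() instead.
        long[] r = measure(64, 1_000, () -> {
            synchronized (lock) {
                Math.sqrt(42);
            }
        });
        System.out.printf("avg=%d ns, max=%d ns%n", r[0], r[1]);
    }
}
```

Tracking the maximum separately from the average is what exposes the "unlucky" threads: a contended lock can leave the mean moderate while a few calls wait far longer.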

Here are the figures; see below for more information about the JMH tests:

Baseline (master@eb9ee0bb)

(note the following charts use a logarithmic scale)

 

The JMH tests used to get these results have the following datastore implementations on the classpath: shapefile, property, jdbc-postgis, jdbc-h2, jdbc-oracle, jdbc-sqlserver, and jdbc-mysql.

They can be run with:

git clone git@github.com:groldan/geotools-benchmarks.git

cd geotools-benchmarks/library/main/ && mvn clean install -DskipTests

mvn test -o -Dtest=DataAccessFinderBenchmarkTest

Environment

None

Assignee

Gabriel Roldan

Reporter

Gabriel Roldan

Triage

None

Priority

Medium