StreamingRenderer.stopRendering() should hint feature sources to stop reading

Description

StreamingRenderer.stopRendering() currently interrupts the rendering thread, but the processing thread (the one that called paint(...) continues processing (reading features) until a check for renderingStopRequested is done. Problem is a feature being read may be blocking the processing thread for too long, even minutes, so (for example, GeoServer) gets stuck with several unfinished rendering tasks that bog down the server.

This is the case, for example, for a polygon shapefile where each feature has a lot of shells and holes. Reading a single such feature can take well over a minute to several minutes, proportionally to the number of shells and holes.

A proposed solution is for the stopRendering() method to hint the feature sources to abort reading by setting the interrupted status on the processing thread, and clearing out before returning from paint().

This way, feature source implementations can check for Thread.currentThread().isInterrupted() and abort reading.

A prototype implementation demonstrates that this way GeoServer WMS's max rendering time is obeyed without problems, whilst currently it won't be and has several threads awaiting for the feature reader to return, effectively being a DoS attack vector.

Environment

None

Activity

Show:
Gabriel Roldan
April 22, 2020, 3:59 PM

Hi Andrea, I’m preparing a pull request (need to create separate prs for different issues out of a prototype branch). Testing with shapefiles (see GEOT-6561), and no need to have a separate thread as far as I can tell.

The trick is to save the current thread as an instance variable in StreamingRenderer and interrupt it at stopRendering(), in case stopRendering is called from another thread than the one that called paint. Then the finally block in paint just clears out the interrupted status before returning.

It’s working pretty well because the shapefile reader can handle it whether it’s doing IO or not. For example, if doing IO, it’ll get a {{ClosedByInterruptException}} and can infer it should just abort and return, and if not doing IO and stuck processing (e.g. assigning holes to shells) it can check for Thread.currentThread().isInterrupted(), which I actually implemented as a provided {{java.util.function.BooleanSupplier}} that {{ShapefileFeatureReader}} provides to {{ShapefileReader}} and this to {{ShapeHandler}}, so the lower level stuff is agnostic of the abort checking method.

I guess all file based datastores can do the same, whilst for jdbc ones you’re right, it could be more involved. Hence I’m thinking of it as a hint that required no API change at all. It’s not crazy that a reader would check for whether the thread has been interrupted.

Gabriel Roldan
April 22, 2020, 4:13 PM
Edited

Thinking about it, in JDBC there’s {{Statement.cancel()}} and {{Statement.setQueryTimeout(int seconds)}}

Possible situations are:

  • The reader is stuck waiting for the statement execution call to return. The renderer could provide a hint in the {{Query}} so the reader can set a query timeout, catch the {{SQLTimeoutException}}, and abort cleanly. This would be better than requiring a separate thread polling the thread interrupted status to call {{Statement.cancel()}}.

  • The reader is processing a resultset, in which case it can periodically check the current thread interrupted status and abort cleanly

Am I missing something here? may be worth a try.

 

Andrea Aime
April 22, 2020, 4:20 PM

Makes sense, but I have a question then… what kind of shapefile do you have, that might “is a feature being read may be blocking the processing thread for too long, even minutes” :-D

Is it really another type of store? The ones that can block executions for such a long time are the JDBC ones normally.

Gabriel Roldan
April 22, 2020, 4:25 PM
Edited

For a system where users publish their datasets as shapefiles, there are some with very complex polygons, like the ones mentioned in

That one polygon alone takes 2.5 minutes to be built on my machine. The problem is not IO/parsing it at all, but composing it. And there are several.

So far got a wms request that took 4.5 minutes with only 5 features, down to 1.5 minutes, but the requirement is that the WMS max rendering time is obeyed nonetheless.

Gabriel Roldan
April 22, 2020, 9:55 PM

Hey , please take a look at this commit:

Looks like the Statement.setQueryTimeout idea might work. So bad the postgres jdbc driver throws a org.postgresql.util.PSQLException instead of a java.sql.SQLTimeoutException so I’m not sure the idea of throwing a (new) org.geotools.data.QueryTimeoutException will be feasible. At least not with a bunch more of hacking, given JDBCDataStore is final and subclassing JDBCFeatureReader would also be complicated since it’s created directly by JDBCDataStore. If the postgis implementation were a little bit more in charge it could check for the 57014 “query cancelled” error code in PSQLException and wrap it in a SQLTimeoutException.

 

Assignee

Gabriel Roldan

Reporter

Gabriel Roldan

Triage

None

Components

Priority

Medium
Configure