Update: The root cause of this issue is actually exceptions thrown from Dispatcher.fireFinishedCallback(). Any exception thrown from this method is silently handled by higher-level Dispatcher classes and is not logged by GeoServer. This can cause failures to clean up request resources, and premature EOF errors when reading the OWS response message.
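To illustrate the failure mode described in the update, here is a minimal sketch of a callback loop that logs each failure and still runs the remaining callbacks. It is not GeoServer code: FinishedCallback, fireFinished and GuardedCallbackDemo are hypothetical names invented for this example, and the real Dispatcher may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

class GuardedCallbackDemo {
    private static final Logger LOGGER =
            Logger.getLogger(GuardedCallbackDemo.class.getName());

    /** Hypothetical callback type for illustration; not GeoServer's interface. */
    interface FinishedCallback {
        void finished(String requestId);
    }

    /**
     * Invokes every callback. A failing callback is logged and counted
     * instead of aborting the loop, so later cleanup callbacks still run.
     */
    public static int fireFinished(List<FinishedCallback> callbacks, String requestId) {
        int failures = 0;
        for (FinishedCallback callback : callbacks) {
            try {
                callback.finished(requestId);
            } catch (RuntimeException e) {
                failures++;
                LOGGER.log(Level.WARNING,
                        "Finished callback failed for request " + requestId, e);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> cleaned = new ArrayList<>();
        List<FinishedCallback> callbacks = new ArrayList<>();
        // First callback fails; second one stands in for resource cleanup.
        callbacks.add(id -> { throw new IllegalStateException("boom"); });
        callbacks.add(cleaned::add);

        int failures = fireFinished(callbacks, "req-1");
        System.out.println(failures + " failure(s), cleaned=" + cleaned);
    }
}
```

Without the try/catch inside the loop, the first exception would skip the second callback entirely, which matches the symptom of request resources not being cleaned up.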
Previous Title: WPS Thread pool recycling not clearing ThreadLocals
This issue does not affect core GeoServer - it only seems to crop up when using certain downstream GeoServer extensions that mess with thread pooling.
Sometimes, when running a WPS request, I get an error response:
If I run a number of requests in succession, this error gradually becomes more common, until it is the only response I see (restarting GeoServer fixes this temporarily).
When I stop GeoServer, I also see the following error:
I have traced the issue through GeoServer, and have ascertained that WPSResourceManager is using a ThreadLocal to store the unique execution id for the WPS process. Tomcat is reusing these threads, and when one of these threads gets reused by a different WPS process, the old execution id is still stored in the ThreadLocal. Then, when the WPSExecutionManager tries to submit the new request, it grabs the ExecutionStatus from the old request, tries to submit it as a new request, and gets the error "Cannot switch process status from FAILED [old request] to QUEUED [new request]".
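The thread-reuse behaviour described above can be reproduced outside GeoServer with a one-thread pool. The sketch below is only an illustration under assumptions: EXECUTION_ID and ThreadLocalLeakDemo are hypothetical stand-ins, not the actual WPSResourceManager fields, and the pool stands in for Tomcat's request threads.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadLocalLeakDemo {
    // Hypothetical stand-in for the per-thread execution id.
    private static final ThreadLocal<String> EXECUTION_ID = new ThreadLocal<>();

    /**
     * Runs two tasks back to back on the same pooled thread and returns the
     * id the second task finds in the ThreadLocal before setting its own.
     */
    public static String idSeenBySecondTask(boolean cleanUp) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            Runnable firstRequest = () -> {
                EXECUTION_ID.set("old-request");
                try {
                    // ... the process would run here ...
                } finally {
                    if (cleanUp) {
                        // Clear the value before the thread returns to the pool.
                        EXECUTION_ID.remove();
                    }
                }
            };
            Callable<String> secondRequest = () -> EXECUTION_ID.get();

            pool.submit(firstRequest).get();
            return pool.submit(secondRequest).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("without cleanup: " + idSeenBySecondTask(false));
        System.out.println("with cleanup: " + idSeenBySecondTask(true));
    }
}
```

Without cleanup the second task sees "old-request"; with remove() in a finally block it sees null. Putting the remove() call in finally means the value is cleared even when the process itself throws.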
Even though this does not seem to affect core GeoServer, WPS should really be doing proper cleanup of its ThreadLocal variables.