GMLFilterFeature mishandles character content delivered in multiple chunks

Description

In GMLFilterFeature, the method characters(char[], int, int) is implemented incorrectly, which occasionally produces features with wrong attribute values when parsing GML documents.

As described in the JavaDoc of org.xml.sax.ContentHandler#characters(char[] ch, int start, int length), the SAX parser may pass character content to the content handler in multiple chunks:
"...The Parser will call this method to report each chunk of character data. SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks; ..."

The GMLFilterFeature implementation, however, only works correctly when the text of each XML element is passed in as a single chunk (that is, with only one call to this method). On rare occasions this leads to errors when parsing features in a format similar to this:
<gml:featureMember><gml:ATTRIBUTE1>value</gml:ATTRIBUTE1></gml:featureMember>

Examples:

  • If "value" is passed in as one chunk "value", the resulting feature is correct. If "value" is passed in as a chunk "val" and a chunk "ue", the resulting feature has the attribute value "val ue", with an extra space.
  • If 100 is passed in as one chunk "100", the resulting feature is correct. If 100 is passed in as a chunk "1" and a chunk "00", the resulting feature has the attribute value 0.
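
A common way to make a SAX content handler robust against chunking is to only accumulate text in characters() and to interpret the accumulated text once endElement() is called. The following is a minimal sketch of that pattern, not the GMLFilterFeature code and not the proposed patch; the class ChunkBufferingHandler and its helper parseChunks are hypothetical names used only for illustration, and the chunks are fed in by hand to simulate a parser that splits the text.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only; this is not the GMLFilterFeature source.
class ChunkBufferingHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private final List<String> values = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // A new element starts: discard any text that belonged to the parent.
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per text node, so only append here.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // All chunks have arrived; the text can now be interpreted as a whole.
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            values.add(value);
        }
        text.setLength(0);
    }

    List<String> getValues() {
        return values;
    }

    // Simulates a parser that delivers one element's text in the given chunks.
    static String parseChunks(String... chunks) {
        ChunkBufferingHandler handler = new ChunkBufferingHandler();
        handler.startElement("", "ATTRIBUTE1", "gml:ATTRIBUTE1", new AttributesImpl());
        for (String chunk : chunks) {
            char[] buffer = chunk.toCharArray();
            handler.characters(buffer, 0, buffer.length);
        }
        handler.endElement("", "ATTRIBUTE1", "gml:ATTRIBUTE1");
        return handler.getValues().get(0);
    }

    public static void main(String[] args) {
        System.out.println(parseChunks("val", "ue"));                 // prints "value"
        System.out.println(Integer.parseInt(parseChunks("1", "00"))); // prints 100
    }
}
```

With this approach the result does not depend on how the parser splits the text: "val" followed by "ue" yields "value", and "1" followed by "00" yields 100.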

I have attached a unit test that reproduces the problem. I discovered the problem while reading GML from GeoServer. I will try to provide a patch to fix it.

Environment

None

Status

Assignee

Unassigned

Reporter

codehaus

Triage

None

Components

Fix versions

Affects versions

Priority

High