fid filter may be ambiguous if there's an actual attribute named ID
CodeHaus Comment From: mauro - Time: Tue, 6 Jul 2010 02:47:57 -0500
---------------------
Solved for Windows and Mac too.
CodeHaus Comment From: jgarnett - Time: Mon, 28 Jun 2010 18:47:00 -0500
---------------------
The resulting build fails on osx and win32. We spent an hour on IRC and were unable to sort out what the problem was.
Since it has now been one day, and we have not been able to find you for a debugging session, it is time to back out your patch.
Your work has been removed as of -r25776; I will attach a reverse patch (or you can simply refer to this revision number to recover your work).
CodeHaus Comment From: mauro - Time: Mon, 24 May 2010 07:54:03 -0500
---------------------
The ECQL grammar is context-free (it is an LL grammar). There is no overlap between the two rules. Now ID is a keyword.
About "@id": if there is a rule or convention in the community or domain that forbids using @id as an attribute name, then we could use it. But I think the answer is "no". So, if GML3 is not acceptable as a reference, I think Jody's option, "IN ( ...)", could be a possible solution.
CodeHaus Comment From: groldan - Time: Mon, 24 May 2010 05:16:34 -0500
---------------------
"In the ECQL context ID is feature id."
Which obviously makes the grammatical structures for the two filter types in question overlap, which is not allowed in a context-free grammar.
" We selected this symbol as a keyword to match the ECQL syntax with GML 3.1.0 <attribute name="id" type="ID"/>. There is no confusion in the ECQL context and GML3 context what the ID symbol means."
The ID symbol is arbitrarily chosen to represent the identifier concept, present in various (if not most) information models. Moreover, it seems from the ECQL discussions on the geotools mailing list that ECQL was designed after the GeoAPI Filter subsystem, which allows for a richer composition of filter terms than the OGC Filter 1.1 and CSW Common Query Language. Neither of them limits its applicability to the GML 3.1.0 specification. In fact, the GeoAPI Filter subsystem is object and information model agnostic.
" Obviously, in the FeatureType context the ID symbol has not the same meaning as GML3 and ECQL."
There's no such thing as an ID symbol in the GeoAPI FeatureType context, but the concept of feature identifier by means of the Feature.getIdentifier():FeatureId and Attribute.getIdentifier():Identifier properties, which map to the gml:id attribute defined in the GML spec when encoding to it, hence making the Feature/FeatureType, GML and ECQL contexts compatible. Actually, ECQL being tied to GeoAPI Filter makes the ECQL query language compatible with any information model in the Java2 language that supports the concepts of Object, property, and Object Identifier.
" Thus, if the FeatureType is compatible with GML3 its schema should have ID attribute and it is the Feature ID and it will be compatible with ECQL."
The way the Feature, Attribute and Geometry identifiers are derived from the underlying data structures is implementation dependent (in the GeoTools library depending on the actual DataStore implementation and how it handles identity). There's no mandate for an underlying data structure to contain a property called ID nor to map such a property to the object identifier.
" In conclusion: in compatible contexts there is no confusion."
I would rather say that this conclusion is wrong, as it is based on false premises. The exception would be if it is acceptable for the ECQL query language to provide only a subset of the functionality it was originally intended for, by prohibiting a given valid property name (as per the language's production rules) from being used in the "IN" predicate. In that case it would be better either to update the "IN predicate" production rules to account for the limitation, or to document the ECQL grammar in some way other than EBNF to avoid confusion.
" I think that it is a designer obligation to maintain this consistency."
Agreed, but it seems wrong that the "ID" symbol is deliberately excluded from being part of an information model, because that would imply assuming that the only possible meaning for such a property is "Object Identifier", tying the query language to an abbreviation in the English language that by itself may assume different meanings depending on the context: http://en.wikipedia.org/wiki/Id
" But if this is a real problem, we could change the new "ID" symbol back to the old "FID"?."
That would present the same problem as using "ID". If the production of a GeoAPI org.opengis.filter.Id filter is going to be defined by a production rule like <id predicate> ::= "ID" [ "NOT" ] "IN" "(" <id> {"," <id> } ")", then the "ID" terminal symbol needs to be replaced by one that does not overlap with the production of <attribute-name> in the "IN predicate", such as, for example, @id.
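To make the overlap concrete, here is a minimal sketch in plain Java. It is not the ECQL parser and does not use any GeoTools API; the class `IdTokenSketch`, the `Kind` enum and the `classify` method are hypothetical names invented for this illustration, and a regular expression stands in for the real grammar. It only shows that when the feature-id terminal is "ID", an attribute named ID can never reach the attribute IN rule, whereas a terminal such as "@id" leaves ID free as an attribute name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IdTokenSketch {
    // Hypothetical classification of an IN predicate; not a GeoTools type.
    enum Kind { FEATURE_ID_FILTER, ATTRIBUTE_IN_FILTER, NOT_AN_IN_PREDICATE }

    // Head of an IN predicate: a name (optionally prefixed with '@'),
    // an optional NOT, then IN followed by an opening parenthesis.
    private static final Pattern HEAD = Pattern.compile(
            "^\\s*(@?[A-Za-z_][A-Za-z0-9_]*)\\s+(?:NOT\\s+)?IN\\s*\\(",
            Pattern.CASE_INSENSITIVE);

    /**
     * Classifies a predicate, treating idToken (for example "ID" or "@id")
     * as the reserved terminal that introduces a feature-id filter.
     */
    static Kind classify(String predicate, String idToken) {
        Matcher m = HEAD.matcher(predicate);
        if (!m.find()) {
            return Kind.NOT_AN_IN_PREDICATE;
        }
        return m.group(1).equalsIgnoreCase(idToken)
                ? Kind.FEATURE_ID_FILTER
                : Kind.ATTRIBUTE_IN_FILTER;
    }

    public static void main(String[] args) {
        // With "ID" reserved, an attribute called ID is shadowed.
        System.out.println(classify("ID IN (1, 2, 3)", "ID"));
        // With "@id" reserved, the same text is an ordinary attribute filter.
        System.out.println(classify("ID IN (1, 2, 3)", "@id"));
        System.out.println(classify("@id IN ('fid.1', 'fid.2')", "@id"));
    }
}
```

Whether "@id" is itself acceptable depends on whether "@" may ever start an attribute name, which the sketch assumes it may not.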
Just my 2 cents, I don't actually have a need to get this working for any of my ongoing work assignments.
CodeHaus Comment From: mauro - Time: Mon, 24 May 2010 04:17:27 -0500
---------------------
"... Let us try a different approach:
- should we modify the grammar to say IN ('fid.1','fid.2',...). That is IN with nothing in front of it is a test against feature id"
RE:
I think ECQL does not need a new token for the fid value. Taking SQL as inspiration, you can use any of the defined types to identify a row in a table. If the fid is a string in the relation we should use the string token, if the fid is an integer we should use the integer token, and so on. So my proposal would be to extend the syntax to support all types (right now you can only use strings).
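As a rough illustration of that proposal, the sketch below accepts both string and integer literals in the id list. It is plain Java, not ECQL or GeoTools code: `IdLiteralSketch` and `parseIdList` are hypothetical names, and the naive comma split is an assumption that would break on string literals containing commas.

```java
import java.util.ArrayList;
import java.util.List;

public class IdLiteralSketch {
    /**
     * Parses the comma-separated body of IN ( ... ) into id values:
     * single-quoted literals become String, integer literals become Long.
     * Assumption: string literals contain no commas or escaped quotes.
     */
    static List<Object> parseIdList(String body) {
        List<Object> ids = new ArrayList<>();
        for (String raw : body.split(",")) {
            String token = raw.trim();
            if (token.length() >= 2 && token.startsWith("'") && token.endsWith("'")) {
                // String literal: strip the surrounding quotes.
                ids.add(token.substring(1, token.length() - 1));
            } else if (token.matches("-?\\d+")) {
                // Integer literal.
                ids.add(Long.parseLong(token));
            } else {
                throw new IllegalArgumentException("Unsupported id literal: " + token);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(parseIdList("'fid.1', 'fid.2'"));
        System.out.println(parseIdList("1, 2, 3"));
    }
}
```

Accepting integers here would let the text ID IN (1,2,3) pass as a feature-id filter, but it does not by itself address the case of an attribute that is also named ID.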
" Is the "IN" syntax defined in CQL or is it something we just made up for ECQL? ..."
RE: No, it isn't. It was defined in ECQL, inspired by GML3.
The ECQL grammar does not allow creating an IN filter on an attribute called ID: ID IN (1, 2, 3) is interpreted as a fid filter.
Some time ago it was proposed that the fid filter be created through a special token, @id, instead: http://www.mail-archive.com/geotools-devel@lists.sourceforge.net/msg07018.html
The following test case exemplifies it:
sample: length IN (4100001)
@ CQLException
*/
and produces the parse error:
org.geotools.filter.text.cql2.CQLException: Encountered "ID IN ( 1" at line 1, column 1.
Was expecting one of:
<NOT> ...
"(" ...
"[" ...
<IDENTIFIER> ...
"-" ...
<INTEGER_LITERAL> ...
<FLOATING_LITERAL> ...
<STRING_LITERAL> ...
"true" ...
"false" ...
"point" ...
"linestring" ...
"polygon" ...
"multipoint" ...
"multilinestring" ...
"multipolygon" ...
"geometrycollection" ...
"envelope" ...
"id" ...
"id" <NOT> ...
"id" "in" "(" <STRING_LITERAL> ...
"include" ...
"exclude" ...
. Parsing : ID IN (1,2,3). Current Token : "null"
at org.geotools.filter.text.ecql.ECQLCompiler.compileFilter(ECQLCompiler.java:96)
...