Quadratic dependence of time to load shape file on number of features


In trying to load a large shape file, I noticed that the time required was still significant, even after the hashcode() fix made by Ian Schneider. When running the profiler, I discovered that the number of calls to GeometryCollection.isFlattenedShape() was 35976403 = 8482*8483 / 2 (the number of features in the shape file was 8482). This quadratic dependence arises from the line

if (flattened) {
// May changes from 'true' to 'false'.
flattened = checkFlattenedShape();

in GeometryCollection.addImpl()




April 10, 2015, 3:32 PM

CodeHaus Comment From: desruisseaux - Time: Sat, 31 Jan 2004 08:34:45 -0600
The quadratic dependence of time has been fixed. Actually, it was an easy fix and it is strange that this bug existed in the first place. The other part (flattening only the geometries that need it) will need more though.

April 10, 2015, 3:32 PM

CodeHaus Comment From: - Time: Mon, 2 Feb 2004 06:50:58 -0600
Thanks very much for the fix. After friday, another thought occurred

to me, which is this: the current JTS geometries don't appear to allow for representing curved line segments, so the stuff about flattened shapes presumably doesn't apply currently. Will the ISO 19107 specification allow curved segments?

April 10, 2015, 3:32 PM

CodeHaus Comment From: desruisseaux - Time: Mon, 2 Feb 2004 07:31:47 -0600
ISO 19107 allow curve segments. See for example:

<a href="http://geoapi.sourceforge.net/javadoc/org/opengis/gm/primitive/CurveSegment.html">http://geoapi.sourceforge.net/javadoc/org/opengis/gm/primitive/CurveSegment.html</a>

But such CurveSegment would probably not be rendered using org.geotools.renderer.j2d.Geometry anyway, because j2d.Geometry has a limited support for curves. The important thing is that we are not forced to use j2d.Geometry; we uses it only when it make sense. j2d.Geometry is designed only for big geometries made from straight lines only, with a lot of points. In other words: countries, coast, rivers and stuff like that. My assumptions are:

&nbsp;&nbsp;&nbsp;- If a geometry has a lot of points, then it usually doesn&#39;t have

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;curves. It doesn&#39;t make much sense to bother about curves if the

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;length of each line segment is about 2 pixels. However, if the

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;geometry has a lot of points, then it need some more agressive

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;optimizations likes generalization, clipping, caching, etc. in

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;order to get decent performance. j2d.Geometry is there exactly

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for that.

&nbsp;&nbsp;&nbsp;- If a geometry has curves, then it probably have few points since

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;curves make sense only for long distances (e.g. 50 pixels or

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;more). Example are administrative frontiers. If a geometry has

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;few points, then it doesn&#39;t need all the optimization found in

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;j2d.Geometry, which have a limited support for curves anyway.

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Such geometry should be rendered using the normal Java2D API.

More about curve support on j2d.Geometry: this framework internally represent Polylines or Polygons as a linked list of PointArrays. Each PointArray may be interpreted in two differents way:

&nbsp;&nbsp;&nbsp;&nbsp;1) As an array of geographic feature (lot of points)

&nbsp;&nbsp;&nbsp;&nbsp;2) As an array with few points making the connection between

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;two arrays of geographic features (1), e.g. the border of

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;a map.

See for example the figure there:

<a href="http://modules.geotools.org/renderer/apidocs/org/geotools/renderer/geom/Polyline.html">http://modules.geotools.org/renderer/apidocs/org/geotools/renderer/geom/Polyline.html</a>

The black lines (the coast line between the green land and the blue sea) are handled like PointArray #1. The orange lines on the border of the map (making the connection between coast lines in place of the part that are out of the widget area) are handled like PointArray #2. The orange dots represents the points stored in &quot;border&quot; PointArrays, which are much less numerous than the points stored in the &quot;coast line&quot; PointArrays.

Curves are supported only in the &quot;border&quot; PointArrays. They are not supported in the &quot;coast line&quot; PointArray for performance reason, and because I assume that the difference would not be visible anyway (except of very close zoom, when the zoom is closer than what the map resolution can provides).

Border may be initially straight lines and become curved after a projection. The projection algorithm in the renderer check for curves in &quot;border&quot; PointArray, but not in &quot;coast line&quot; PointArray, again for performance reason.

In summary, for a typical usage, curves may appears in a j2d.Geometry only after a reprojection. (it is technically possible to build j2d.Geometry with curved border before reprojection, but I assume that it will not be used very often; curves support were added in &quot;border&quot; PointArray mostly for better suport of reprojection). For curved geometries with a reasonably small amount of points, the rendering chain should be different (a different RenderedGeometries subclass) and make use directly of Java2D API. Different tradeoff, different RenderedGeometries subclass.

April 10, 2015, 3:32 PM

CodeHaus Comment From: desruisseaux - Time: Mon, 2 Feb 2004 07:43:20 -0600
In the above post, I mean &quot;different tradeoff, different RenderedLayer subclass&quot;. It would probably not extends RenderedGeometries since curved geometries may not use j2d.Geometry.

It raise the issue about how to support mouse selection (e.g. highlighting) for arbitrary RenderedLayer subclasses, not just an RenderedGeometries subclass. One approach may be to defines a common sub-class with basic functionality. An other approach is to always use j2d.Geometries, even for curved geometries. It is technically possible, but I wonder if it would introduce an unnecessary level of complexity for shapes that doesn&#39;t need clipping, generalization, etc.

April 10, 2015, 3:32 PM

CodeHaus Comment From: desruisseaux - Time: Tue, 3 Feb 2004 07:07:29 -0600
Closing as fixed, since the quadratic dependence of time has been fixed. The other issue (using Flattening only the geometries that need it) still unresolved, but should not hurt in most cases. We will open a new task if it still a problem after the refactoring of renderer.