Shapefile w/ Large (~100 MB) Dataset

Description

I am using large datasets (200,000 features, ~100 MB) loaded through the shapefile datasource.

It works quite well on smaller datasets, but it dies with an IOException when I go large. Any ideas on why this might be happening, and is there anything I can do to get around it?

Here is the stack trace. It dies in New I/O (java.nio), while memory-mapping the file:

java.io.IOException: Not enough storage is available to process this command
at sun.nio.ch.FileChannelImpl.map0(Native Method)
at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:705)
at org.geotools.data.shapefile.shp.ShapefileReader.init(ShapefileReader.java:190)
at org.geotools.data.shapefile.shp.ShapefileReader.<init>(ShapefileReader.java:106)
at org.geotools.data.shapefile.ShapefileDataStore.openShapeReader(ShapefileDataStore.java:228)
at org.geotools.data.shapefile.ShapefileDataStore.readAttributes(ShapefileDataStore.java:318)
at org.geotools.data.shapefile.ShapefileDataStore.getSchema(ShapefileDataStore.java:304)
at org.geotools.data.shapefile.ShapefileDataStore.getFeatureSource(ShapefileDataStore.java:805)
at org.vfny.geoserver.action.validation.ValidationTestDoIt.runTransactions(ValidationTestDoIt.java:91)
at org.vfny.geoserver.action.validation.ValidationTestDoIt.execute(ValidationTestDoIt.java:58)
at

Environment

None

Activity

codehaus
April 10, 2015, 3:03 PM

CodeHaus Comment From: ianschneider - Time: Thu, 6 May 2004 14:37:05 -0500
---------------------
The error occurs in native code and is mapped from the underlying OS.

Quoted from http://msdn.microsoft.com/library/default.asp?url=/library/en-us/w2kmsgs/3714.asp

[begin quote]

Error Message:

Not enough storage is available to process this command.

User Action:

Do one of the following, then retry the command: (1) reduce the number of running programs; (2) remove unwanted files from the disk the paging file is on and restart the system; (3) check the paging file disk for an I/O error; or (4) install additional memory in your system.

[endquote]

codehaus
April 10, 2015, 3:03 PM

CodeHaus Comment From: jgarnett - Time: Thu, 6 May 2004 16:47:09 -0500
---------------------
Nevertheless, we have to figure something out. Right now they are partitioning the data (the machine has 1.5 GB of memory, and JUMP manages to load this data).

Bleck, I noticed your bug reference was for Windows 2000. Would a different operating system help, such as Windows XP or Linux?

codehaus
April 10, 2015, 3:03 PM

CodeHaus Comment From: aaime - Time: Fri, 7 May 2004 02:09:05 -0500
---------------------
This thread on the Sun Java forum may be of interest. It seems that this error is not easy to reproduce, even on Windows.

http://forum.java.sun.com/thread.jsp?forum=4&thread=437539

codehaus
April 10, 2015, 3:03 PM

CodeHaus Comment From: - Time: Fri, 7 May 2004 05:54:45 -0500
---------------------
I remember having a problem similar to this.

If I remember correctly (it was a while back), we were running out of virtual memory address space on NT4 and Windows 2000 boxes. The MappedByteBuffer was not being released correctly, so the address space could never be reclaimed. If we opened and closed one of our shapefiles (300 MB+) a few times, it would barf with that error. The only thing we could do was make sure we didn't reload shapefiles and only loaded 2 GB at most. In the end we scrapped MappedByteBuffers for normal ones. Never tested on Linux.

Turning on aggressive heap can fix other problems associated with memory allocation for large files.
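
For illustration, reading through a normal heap buffer instead of a MappedByteBuffer might look roughly like the sketch below. It is not the code from this comment or from GeoTools; the class and method names (PlainBufferRead, read) are hypothetical, and it assumes a modern JDK (Java 7+) rather than the 1.4-era NIO in the stack trace. The point is that FileChannel.read copies bytes into heap memory, so no virtual address space is reserved for a file mapping.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: reads a slice of a file without FileChannel.map().
final class PlainBufferRead {

    /** Reads up to {@code length} bytes starting at {@code position} into a heap buffer. */
    static ByteBuffer read(Path file, long position, int length) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(length); // heap memory, not a mapping
            long pos = position;
            while (buffer.hasRemaining()) {
                int n = channel.read(buffer, pos); // positional read, may be partial
                if (n < 0) {
                    break; // end of file reached before the buffer was filled
                }
                pos += n;
            }
            buffer.flip(); // switch the buffer from writing to reading
            return buffer;
        }
    }
}
```

Reading the file in fixed-size chunks this way trades some speed for predictable memory use, since the heap buffer is reclaimed by the garbage collector like any other object.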

codehaus
April 10, 2015, 3:03 PM

CodeHaus Comment From: jgarnett - Time: Fri, 7 May 2004 19:39:45 -0500
---------------------
IanS has kindly fixed this issue by providing a parameter that allows the user to disable the use of MappedByteBuffer. This parameter shows up in the GeoServer DataStore definition screen allowing end users to deal with the problem.
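
For code that creates the DataStore programmatically rather than through the GeoServer screen, the connection parameters might be built roughly as below. This is a sketch under assumptions: the comment does not name the parameter, so the key "memory mapped buffer" is taken from later GeoTools releases of ShapefileDataStoreFactory and may differ in the version discussed here. The class name ShapefileParams is hypothetical.

```java
import java.net.URL;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper that builds shapefile connection parameters.
final class ShapefileParams {

    /** Returns parameters asking the datastore not to use MappedByteBuffer. */
    static Map<String, Object> withoutMemoryMapping(URL shapefile) {
        Map<String, Object> params = new HashMap<>();
        params.put("url", shapefile);
        // Assumed key name, see the note above; FALSE selects ordinary buffers.
        params.put("memory mapped buffer", Boolean.FALSE);
        return params;
    }
}
```

With GeoTools on the classpath, a map like this would typically be handed to DataStoreFinder.getDataStore(params).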

The only further thing we could do is "detect" when we need to switch behind the scenes, although some of the comments indicate that this is a time-dependent thrashing of the heap.
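
One way such detection could work is to attempt the mapping and fall back to an ordinary read if the OS refuses it. The sketch below is only an illustration of that idea, not what GeoTools does; MapOrRead and its open method are hypothetical names, and it assumes Java 7+ and a file smaller than 2 GB.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: memory-map when possible, otherwise read into the heap.
final class MapOrRead {

    static ByteBuffer open(Path file, boolean useMemoryMapping) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = channel.size();
            if (size > Integer.MAX_VALUE) {
                throw new IOException("File too large for a single buffer: " + size);
            }
            if (useMemoryMapping) {
                try {
                    // The mapping stays valid after the channel is closed.
                    return channel.map(FileChannel.MapMode.READ_ONLY, 0, size);
                } catch (IOException mapFailed) {
                    // e.g. "Not enough storage is available"; fall through to a plain read
                }
            }
            ByteBuffer buffer = ByteBuffer.allocate((int) size);
            while (buffer.hasRemaining() && channel.read(buffer) >= 0) {
                // keep reading until the buffer is full or EOF is hit
            }
            buffer.flip();
            return buffer;
        }
    }
}
```

A fallback like this only helps when the failure is raised at map time. If the address space is exhausted gradually by mappings that were never released, as an earlier comment suggests, the explicit switch remains the more reliable option.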

Assignee

Unassigned

Reporter

codehaus

Triage

None

Components

Fix versions

Affects versions

Priority

High