Обсуждение: process large tables
Hi,
I am trying to process a large table. Unfortunately, using select * from table gives me a ResultSet that is too large.
The java runs out of memory even if I boost the vm memory.
Is there any way I can programmatically (in java) retrieve say 10,000 records at a time without knowing anything specific about the table? Then, when I am done with those records, retrieve the next 10,000, etc?
Thank you in advance for any help you can spare.
Kristina
From the documentation (http://jdbc.postgresql.org/documentation/80/query.html#query-with-cursor) "By default the driver collects all the results for the query at once. This can be inconvenient for large data sets so the JDBC driver provides a means of basing a ResultSet on a database cursor and only fetching a small number of rows. ..." With a cursor you fetch rows by pieces. It is well explained in the documentation. Bye Nelson Arapé El Jue 14 Abr 2005 16:37, Kristina Magwood escribió: > Hi, > I am trying to process a large table. Unfortunately, using select * from > table gives me a ResultSet that is too large. > The java runs out of memory even if I boost the vm memory. > Is there any way I can programmatically (in java) retrieve say 10,000 > records at a time without knowing anything specific about the table? Then, > when I am done with those records, retrieve the next 10,000, etc? > > Thank you in advance for any help you can spare. > Kristina
Kristina Magwood wrote: > > Hi, > I am trying to process a large table. Unfortunately, using select * > from table gives me a ResultSet that is too large. > The java runs out of memory even if I boost the vm memory. > Is there any way I can programmatically (in java) retrieve say 10,000 > records at a time without knowing anything specific about the table? > Then, when I am done with those records, retrieve the next 10,000, etc? > > Thank you in advance for any help you can spare. > Kristina The following should work but was written mostly from memory, so it might not compile. Make sure you're only pulling the columns you need. There are three important bits (the details of which might have changed from pg7 to pg8): 1) the auto commit must be turned off. 2) the fetch direction must be FETCH_FORWARD 3) the query can be only a *single* statement. So no trailing ";" is allowed, as "select * from foo;" is actually three statements (assuming I understand the discussion that happened some time back). You can tell it's working if you see cursors being created on the server, or by using truss/strace type tools on the client...and if your memory footprint is smaller. This was all discussed on the list a few months back, so you might look through the mailing list for more details. Good luck. -- Alan String sql = null; boolean commitState = true; try { Connection conn = getConnectionSomehow(); Statement st = conn.createStatement(); commitState = conn.getAutoCommit(); conn.setAutoCommit(false); // keep the commit state st.setFetchDirection(ResultSet.FETCH_FORWARD); st.setFetchSize(nRows); // number of rows to pull in one chunk. ResultSet rs = st.executeQuery(sql = "select * from someTable"); while (rs.next()) processRow(); } catch (Exception e) { System.err.println("error in processing statement: " + sql); e.printStackTrace(); } finally { // restore the commit state if the connection is going back to a pool... if (conn != null) try {conn.setAutoCommit(commitState); } catch (Exception e) {} if (st != null) try {st.close(); } catch (Exception e) {} if (conn != null) try {conn.close(); } catch (Exception e) {} }