Thread: Does anyone use in ram postgres database?


Does anyone use in ram postgres database?

From: Chris Barnes
 
 
We are testing an in-memory Postgres database and have questions about
configuring the RAM mount point and whether there are significant gains
in setting it up this way. Are there any considerations for Postgres?

If you have experience, can you please give us some ideas on how you
have accomplished this?

Cheers,

Chris Barnes



Re: Does anyone use in ram postgres database?

From: Merlin Moncure
On Thu, Mar 25, 2010 at 2:42 PM, Chris Barnes
<compuguruchrisbarnes@hotmail.com> wrote:
>   We are testing an in-memory Postgres database and have questions about
> configuring the RAM mount point and whether there are significant gains
> in setting it up this way. Are there any considerations for Postgres?

There is very little reason to do this.  Both Postgres and the
operating system already cache frequently used pages in memory, and
they are pretty smart about it -- this leaves more memory for
temporary demands like sorts, indexes, large result sets, etc.  It's
usually just as good to simply set fsync=off on the database in
scenarios where you are OK with data loss following a crash and the
system is performance critical.
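
For illustration, a minimal sketch of the postgresql.conf lines for such
a durability-optional setup (the parameter names are the real settings;
the combination is an assumption about your workload, not a tested
recommendation -- a crash can corrupt or lose data with these on):

    fsync = off                # don't force WAL to disk at commit
    synchronous_commit = off   # let COMMIT return before WAL is flushed
    full_page_writes = off     # skip full-page images in the WAL

Reload or restart the server after editing for these to take effect.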

merlin

Re: Does anyone use in ram postgres database?

From: Scott Marlowe
On Thu, Mar 25, 2010 at 12:42 PM, Chris Barnes
<compuguruchrisbarnes@hotmail.com> wrote:
>
>   We are testing an in-memory Postgres database and have questions about
> configuring the RAM mount point and whether there are significant gains
> in setting it up this way. Are there any considerations for Postgres?
>
>   If you have experience, can you please give us some ideas on how you
> have accomplished this?

These questions always prompt the same question in return: what are you
trying to accomplish?  Different objectives have different answers.

A common answer to these issues is to put some small but heavily used
tables into RAM and leave the rest on the hard drive.  Easy enough
(as the postgres unix user):

mkdir /dev/shm/pgram
psql postgres
CREATE TABLESPACE test LOCATION '/dev/shm/pgram';
SET temp_tablespaces = test;

And now all temp tables created in this session go to RAM.
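
To pin an ordinary table into the same RAM tablespace, a minimal sketch
(the table name and columns are purely illustrative; everything under
/dev/shm vanishes on reboot, so only put data there you can recreate):

CREATE TABLE hot_lookup (
    id  integer PRIMARY KEY,
    val text
) TABLESPACE test;

An existing table can likewise be moved with:

ALTER TABLE hot_lookup SET TABLESPACE test;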

Now, if your pg_xlog directory is the bottleneck, then you either need
bigger, faster hard drives, or -- if your data is transient in nature and
you can recreate it -- you can put the whole db into RAM.

Note that the query planner wasn't designed with RAM as the storage
space for pg, so it might make some bad decisions until you adjust
postgresql.conf to compensate.  And even then it might still make some
bad decisions.
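
As a sketch, the planner-related postgresql.conf lines one would
typically adjust when the storage is RAM (real parameter names, but the
values are illustrative starting points, not tuned recommendations):

    seq_page_cost = 0.1        # reading a page from "disk" is now cheap
    random_page_cost = 0.1     # random access costs about the same as sequential
    effective_cache_size = 4GB # roughly the RAM available for caching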

Re: Does anyone use in ram postgres database?

From: Alan McKay
On Thu, Mar 25, 2010 at 4:04 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> There is very little reason to do this.  Both Postgres and the
> operating system already cache frequently used pages in memory, and
> they are pretty smart about it -- this leaves more memory for
> temporary demands like sorts, indexes, large result sets, etc.  It's
> usually just as good to simply set fsync=off on the database in
> scenarios where you are OK with data loss following a crash and the
> system is performance critical.

(I work with the OP)

We've found that writes to the RAM-based DB are about 3x faster than
disk-based (with fsync turned ON), but we were expecting them to be a
LOT faster than that and are wondering what we might be doing wrong.


--
“Don't eat anything you've ever seen advertised on TV”
         - Michael Pollan, author of "In Defense of Food"

Re: Does anyone use in ram postgres database?

From: Alan McKay
On Thu, Mar 25, 2010 at 4:15 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> These questions always prompt the same question in return: what are you
> trying to accomplish?  Different objectives have different answers.

We have a real-time application that processes data as it comes in.
Doing some simple math tells us that a disk-based DB cannot possibly
perform fast enough to allow us to process the data.

> Now, if your pg_xlog directory is the bottleneck, then you either need
> bigger, faster hard drives, or -- if your data is transient in nature and
> you can recreate it -- you can put the whole db into RAM.

When we saw only a 3x improvement in speed with the RAM-based DB, we
were still seeing a fair bit of disk activity but were not sure what
was going on.  Then we thought about pg_xlog and moved it to RAM as
well, but as I recall that still wasn't a great improvement.

We are trying a test right now where "initdb" was run against
/ramdisk/data so that absolutely everything should be in there.  Will
report back with results.
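
For anyone reproducing this, a minimal sketch of such an all-in-RAM
cluster (the path here is illustrative, and everything under it is lost
on reboot):

    mkdir /dev/shm/pgdata
    initdb -D /dev/shm/pgdata
    pg_ctl -D /dev/shm/pgdata -l /dev/shm/pgdata/logfile start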

We are also about to try another test with a regular disk-based DB and
fsync turned OFF.

> Note that the query planner wasn't designed with RAM as the storage
> space for pg, so it might make some bad decisions until you adjust
> postgresql.conf to compensate.  And even then it might still make some
> bad decisions.

What things might need adjusting?

thanks,
-Alan


--
“Don't eat anything you've ever seen advertised on TV”
         - Michael Pollan, author of "In Defense of Food"

Re: Does anyone use in ram postgres database?

From: Ozz Nixon
On 3/26/10 10:06 AM, Alan McKay wrote:
> On Thu, Mar 25, 2010 at 4:15 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>
>> These questions always prompt the same question in return: what are you
>> trying to accomplish?  Different objectives have different answers.
>>
> We have a real-time application that processes data as it comes in.
> Doing some simple math tells us that a disk-based DB cannot possibly
> perform fast enough to allow us to process the data.
>

I have to ask the obvious question... we develop solutions which must
process 100,000 queries a second, and in those cases we use a combination
of hash tables and linked lists. There are times when SQL is not the right
choice: it is great for simplifying indexing and locking, but prior to SQL
*we* had to write that code ourselves, guys... and it sounds like you too
need to go back to old-school programming techniques.

O.

Re: Does anyone use in ram postgres database?

From: Alan McKay
On Fri, Mar 26, 2010 at 10:14 AM, Ozz Nixon <ozznixon@gmail.com> wrote:
> I have to ask the obvious question... we develop solutions which must
> process 100,000 queries a second, and in those cases we use a combination
> of hash tables and linked lists. There are times when SQL is not the right
> choice: it is great for simplifying indexing and locking, but prior to SQL
> *we* had to write that code ourselves, guys... and it sounds like you too
> need to go back to old-school programming techniques.

Oh, I hear ya!  I'm on the IT team, and I told the SW designers some 6
or 7 weeks ago that I didn't think they should be doing this with a DB.


--
“Don't eat anything you've ever seen advertised on TV”
         - Michael Pollan, author of "In Defense of Food"

Re: Does anyone use in ram postgres database?

From: Gordan Bobic
Alan McKay wrote:
> On Thu, Mar 25, 2010 at 4:15 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>> These questions always prompt the same question in return: what are you
>> trying to accomplish?  Different objectives have different answers.
>
> We have a real-time application that processes data as it comes in.
> Doing some simple math tells us that a disk-based DB cannot possibly
> perform fast enough to allow us to process the data.
>
>> Now, if your pg_xlog directory is the bottleneck, then you either need
>> bigger, faster hard drives, or -- if your data is transient in nature and
>> you can recreate it -- you can put the whole db into RAM.
>
> When we saw only a 3x improvement in speed with the RAM-based DB, we
> were still seeing a fair bit of disk activity but were not sure what
> was going on.  Then we thought about pg_xlog and moved it to RAM as
> well, but as I recall that still wasn't a great improvement.

Have you considered using one of these:
http://www.acard.com/english/fb01-product.jsp?idno_no=270&prod_no=ANS-9010&type1_title=Solid State Drive&type1_idno=13

Gordan

Re: Does anyone use in ram postgres database?

From: John Gage
As a kind of [very?] dumb question, is this where SQLite has been
used?  I am just curious.


On Mar 26, 2010, at 3:14 PM, Ozz Nixon wrote:

> On 3/26/10 10:06 AM, Alan McKay wrote:
>> On Thu, Mar 25, 2010 at 4:15 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>>
>>> These questions always prompt the same question in return: what are you
>>> trying to accomplish?  Different objectives have different answers.
>>>
>> We have a real-time application that processes data as it comes in.
>> Doing some simple math tells us that a disk-based DB cannot possibly
>> perform fast enough to allow us to process the data.
>>
>>
>
> I have to ask the obvious question... we develop solutions which must
> process 100,000 queries a second, and in those cases we use a combination
> of hash tables and linked lists. There are times when SQL is not the right
> choice: it is great for simplifying indexing and locking, but prior to SQL
> *we* had to write that code ourselves, guys... and it sounds like you too
> need to go back to old-school programming techniques.
>
> O.
>


Re: Does anyone use in ram postgres database?

From: Merlin Moncure
On Fri, Mar 26, 2010 at 10:06 AM, Alan McKay <alan.mckay@gmail.com> wrote:
> We are trying a test right now where "initdb" was run against
> /ramdisk/data so that absolutely everything should be in there.  Will
> report back with results.
>
> We are also about to try another test with a regular disk-based DB and
> fsync turned OFF.
>
>> Note that the query planner wasn't designed with RAM as the storage
>> space for pg, so it might make some bad decisions until you adjust
>> postgresql.conf to compensate.  And even then it might still make some
>> bad decisions.
>
> What things might need adjusting?

If you are chasing TPS on volatile, CPU-bound problems, you can try:
*) disable fsync
*) full_page_writes = off
*) disable ALL statistics (track_activities, track_counts, etc.; see the
   sketch after this list)

On the application side:
*) using libpq? consider moving to binary-format queries, and
*) prepare your queries (also sketched below), or
*) if possible, batch queries together, perhaps stacking data into arrays
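
A minimal sketch of the statistics settings and a prepared query (the
postgresql.conf parameter names are real; the table and query in the
PREPARE example are purely illustrative):

    # postgresql.conf -- stop collecting run-time statistics
    track_activities = off
    track_counts = off

    -- SQL: parse and plan once, execute many times
    PREPARE get_item (integer) AS SELECT * FROM items WHERE id = $1;
    EXECUTE get_item(42);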

Well, if you are looking at extreme TPS rates beyond what Postgres can
handle on your hardware (there are some hard CPU limits besides what
the disks are doing), you may have to start looking at an 'in process'
cache management library.  Of course, you have to balance this against
the loss of backend flexibility that Postgres provides.  Also beware of
this route if you have any type of concurrency requirements.

At one point, out of curiosity, I hacked a test into the backend by
inserting queries directly into the protocol-handling switch in tcop.c
(bypassing the protocol completely) and was able to observe TPS rates
that are simply impossible in the current architecture (with the
backend processing humming along at 100% CPU from a single client).
Until Postgres supports this type of mode of operation (which is not
likely to happen anytime soon), it is going to remain relatively
unsuited for super-high-TPS, low-concurrency 'cache' type applications
where the real strengths of the database don't come into play (all that
said, you may still get it to work for your own stuff).

merlin

Re: Does anyone use in ram postgres database?

From: Ozz Nixon
On 3/26/10 11:12 AM, John Gage wrote:
> As a kind of [very?] dumb question, is this where SQLite has been
> used?  I am just curious.
All questions are good ones, as that is how we all learn. ;-)

SQLite is useful for small-footprint environments, along with simpler
solutions like XBase (dBase) files. They tend to be quick and easy to
implement and use, but not robust enough for enterprise multi-user
systems. (Not trying to start a flame war, just the facts.)

Enterprise engines are great for day-to-day transactional data flow: a
few thousand reads with fewer writes. When your writes start to exceed
your reads, you need to decide -- are those writes for audit and archive,
or are those writes compounding the results of the reads?

     If they are archive/history and audit-as-needed data, this is where
partitionable databases come to mind, or even simplistic text files
(encrypted if needed).
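
     As a sketch of what such partitioning can look like in PostgreSQL
(using the inheritance-based approach; the table and column names are
purely illustrative):

CREATE TABLE audit_log (logged_at timestamptz NOT NULL, detail text);
CREATE TABLE audit_log_2010_03 (
    CHECK (logged_at >= '2010-03-01' AND logged_at < '2010-04-01')
) INHERITS (audit_log);

Old months can then be archived or dropped wholesale without touching
current data.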

     If they are compounding your reads, then the fork in the road
appears... there are questions you have to ask yourself about the 'now'
and the '3 years from now' of your data. For example, the original
statement was that running the SQL engine in RAM mode handled only 3
times more data requests, and that is not enough (I assume). There are
probably database designs and query techniques that could improve your
performance -- but does that answer the 'now' need or the '3 years from
now' need? We spend hours on each of our database designs and queries,
and sometimes the queries force us to redesign the schema so we can milk
out a few hundred more queries in our window of measurement (minutes,
seconds, or hours).

     We had an existing solution in place which was capable of
processing 10,000 queries a minute. At the time of design, that was
more than our customer expected to need. Eight months later, they were
starting to see their processes waiting on our solution. I spent the
next 2 days redesigning it as a simple socket listener with the data in
RAM, using linked lists and hashes, returning results as XML. I
introduced 5 additional queries to improve the quality of the results,
and the delivered solution now handles over 100,000 queries a second.

     So with that said, the questions become:

What does your schema look like now?

What are you writing into the database?

How often are you writing?

What are you searching for?

How often are you searching?

How large is the result set that is flowing across the ether?

     There are times when, in answering these questions, it becomes
easier to see that the problem is not the technology you are trying to
leverage, but how you are using it. Then there are times where you are
simply trying to use the wrong technology. Answering the questions above
will allow me and the PostgreSQL gurus to help you out.

* I use a wide range of SQL engines, depending upon budget, needs, etc.,
along with developing custom solutions when the DB route is not tailored
enough for a need. Hope that helps, and shows that your needs for now
and for 36 months from now play a big role in designs and redesigns.

O.

Re: Does anyone use in ram postgres database?

From: John Gage
Thanks very, very much for this reply.  It is extremely useful.

So far, I have not run into anything remotely resembling a performance
barrier in Postgres.   I'm still looking :-)



On Mar 26, 2010, at 4:43 PM, Ozz Nixon wrote:

> [snip -- Ozz's full reply, quoted verbatim above]


Re: Does anyone use in ram postgres database?

From: John R Pierce
Chris Barnes wrote:
>
>   We are testing an in-memory Postgres database and have questions about
> configuring the RAM mount point and whether there are significant gains
> in setting it up this way. Are there any considerations for Postgres?
>
>   If you have experience, can you please give us some ideas on how you
> have accomplished this?
>

You might look into TimesTen... Oracle bought them a couple of years ago.
They have an SQL database that's heavily optimized for memory rather than
block-oriented disk, and it can optionally use a disk as a persistent
backing store.  Of course, the entire database has to fit in RAM, and
they charge proportionally to database size.  It's extremely fast.





Re: Does anyone use in ram postgres database?

From: Chris Barnes
 
I just looked into TimesTen: 46K for a perpetual license, or 10K yearly
plus support.

Is there anything else available? LOL

Chris
 
> Date: Fri, 26 Mar 2010 10:39:37 -0700
> From: pierce@hogranch.com
> To: compuguruchrisbarnes@hotmail.com
> CC: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Does anyone use in ram postgres database?
>
> [snip -- John R Pierce's reply, quoted verbatim above]

