Discussion: best practice in archiving CDR data


best practice in archiving CDR data

From: Juan Backson
Date:
Hi,

I am using Postgres to store CDR data for VoIP switches. The data size quickly grows to a few TBs.

What I would like to do is regularly archive the oldest data, so that only the most recent 6 months of data is available.

All that old data should be stored in a format from which it can be retrieved back into either a DB table or flat files.

Does anyone know how I should go about doing that? Is there an existing tool that can already do that?

thanks,
jb

Re: best practice in archiving CDR data

From: "A. Kretschmer"
Date:
In response to Juan Backson :
> Hi,
>
> I am using Postgres to store CDR data for VoIP switches. The data size
> quickly grows to a few TBs.
>
> What I would like to do is regularly archive the oldest data, so that
> only the most recent 6 months of data is available.
>
> All that old data should be stored in a format from which it can be
> retrieved back into either a DB table or flat files.
>
> Does anyone know how I should go about doing that? Is there an existing
> tool that can already do that?

Sounds like table partitioning: create, for instance, a table for each
month and DROP the old tables after 6 months or so.

http://www.postgresql.org/docs/current/static/ddl-partitioning.html
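As a rough sketch of what the linked documentation describes (inheritance-based partitioning; the table and column names here are made up for illustration):

```sql
-- Parent table: holds no rows itself, queries go through it.
CREATE TABLE cdr (
    call_id    bigint,
    started_at timestamptz NOT NULL,
    src        text,
    dst        text,
    duration   interval
);

-- One child table per month; the CHECK constraint lets the planner
-- (with constraint_exclusion enabled) skip irrelevant partitions.
CREATE TABLE cdr_2010_03 (
    CHECK (started_at >= DATE '2010-03-01'
       AND started_at <  DATE '2010-04-01')
) INHERITS (cdr);

CREATE TABLE cdr_2010_04 (
    CHECK (started_at >= DATE '2010-04-01'
       AND started_at <  DATE '2010-05-01')
) INHERITS (cdr);

-- Retiring the oldest month is then just a metadata operation:
--   DROP TABLE cdr_2009_09;
```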


Regards, Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431  2EB0 389D 1DC2 3172 0C99

Re: best practice in archiving CDR data

From: Juan Backson
Date:
Hi,

Instead of dropping the table, I would like to archive the old table in a format that can be read and retrieved.

Can I run pg_dump on each child table?

What is the best way to do it? Run pg_dump and turn the data into CSV and then tar.gz it, or back it up in pg_dump's archive format?

thanks,
jb

On Mon, Mar 29, 2010 at 9:33 PM, A. Kretschmer <andreas.kretschmer@schollglas.com> wrote:
In response to Juan Backson :
> Hi,
>
> I am using Postgres to store CDR data for VoIP switches. The data size
> quickly grows to a few TBs.
>
> What I would like to do is regularly archive the oldest data, so that
> only the most recent 6 months of data is available.
>
> All that old data should be stored in a format from which it can be
> retrieved back into either a DB table or flat files.
>
> Does anyone know how I should go about doing that? Is there an existing
> tool that can already do that?

Sounds like table partitioning: create, for instance, a table for each
month and DROP the old tables after 6 months or so.

http://www.postgresql.org/docs/current/static/ddl-partitioning.html


Regards, Andreas

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: best practice in archiving CDR data

From: Edgardo Portal
Date:
On 2010-03-29, Juan Backson <juanbackson@gmail.com> wrote:
>
> Hi,
>
> I am using Postgres to store CDR data for VoIP switches. The data size
> quickly grows to a few TBs.
>
> What I would like to do is regularly archive the oldest data, so that
> only the most recent 6 months of data is available.
>
> All that old data should be stored in a format from which it can be
> retrieved back into either a DB table or flat files.
>
> Does anyone know how I should go about doing that? Is there an existing
> tool that can already do that?
>
> thanks,
> jb

FWIW, I partition by ISO week, use INSERT RULEs to route CDRs to the correct
partition (keeping about 3 partitions "open" to new CDRs at any one time),
use pg_dump to archive partition tables to off-line storage, and
DROP TABLE to keep the main DBs to about 40 weeks of data. I used
to use monthly partitioning, but the file sizes got a bit awkward
to deal with.

When I need to restore old CDRs (e.g. to service a subpoena) I
use pg_restore to load the needed CDRs to a throwaway database
and process as necessary.
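The dump/drop/restore cycle described above might look something like this (the database and partition names are hypothetical):

```
# Dump one closed weekly partition in pg_dump's custom format
# (compressed; restorable selectively with pg_restore).
pg_dump -Fc -t cdr_2009_w40 -f cdr_2009_w40.dump cdrdb

# After verifying the dump, reclaim the space in the live database:
psql -d cdrdb -c 'DROP TABLE cdr_2009_w40;'

# Months later, restore into a throwaway database for processing:
createdb cdr_scratch
pg_restore -d cdr_scratch cdr_2009_w40.dump
```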

Re: best practice in archiving CDR data

From: David Fetter
Date:
On Mon, Mar 29, 2010 at 02:08:23PM +0000, Edgardo Portal wrote:
> On 2010-03-29, Juan Backson <juanbackson@gmail.com> wrote:
> >
> > Hi,
> >
> > I am using Postgres to store CDR data for VoIP switches. The data
> > size quickly grows to a few TBs.
> >
> > What I would like to do is regularly archive the oldest data, so
> > that only the most recent 6 months of data is available.
> >
> > All that old data should be stored in a format from which it can
> > be retrieved back into either a DB table or flat files.
> >
> > Does anyone know how I should go about doing that? Is there an
> > existing tool that can already do that?
> >
>
> FWIW, I partition by ISO week, use INSERT RULEs to route CDRs

Just generally, triggers are much better than RULEs for this kind of
thing.  The underlying functions can be made quite efficient.
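A routing trigger might look roughly like this (a sketch only: it assumes the monthly cdr_YYYY_MM layout with a started_at timestamp column, and relies on EXECUTE ... USING, available since PostgreSQL 8.4):

```sql
CREATE OR REPLACE FUNCTION cdr_route() RETURNS trigger AS $$
BEGIN
    -- Build the child table name from the row's timestamp and
    -- insert the whole row there.
    EXECUTE 'INSERT INTO cdr_' || to_char(NEW.started_at, 'YYYY_MM')
         || ' SELECT ($1).*' USING NEW;
    RETURN NULL;  -- suppress the insert into the parent table
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER cdr_route_trg
    BEFORE INSERT ON cdr
    FOR EACH ROW EXECUTE PROCEDURE cdr_route();
```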

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate