Re: Storing files: 2.3TBytes, 17M file count

From: Mike Sofen
Subject: Re: Storing files: 2.3TBytes, 17M file count
Msg-id: 00f201d249da$e142b830$a3c82890$@runbox.com
In reply to: Storing files: 2.3TBytes, 17M file count  (Thomas Güttler <guettliml@thomas-guettler.de>)
Responses: Re: Storing files: 2.3TBytes, 17M file count  (Thomas Güttler <guettliml@thomas-guettler.de>)
List: pgsql-general

From: Thomas Güttler   Sent: Monday, November 28, 2016 6:28 AM

...I have 2.3TBytes of files. File count is 17M

Since we already store our structured data in postgres, I am thinking about storing the files in PostgreSQL, too.

Is it feasible to store files in PostgreSQL?

-------

I am doing something similar, but in reverse.  The legacy MySQL databases that I’m converting into a modern Postgres data model have very large genomic strings stored in 3 separate columns.  Out of the 25 TB of legacy data storage (in 800 dbs across 4 servers, about 22 billion rows), those 3 columns consume 90% of the total space, and they are used only for reference, never in searches or calculations.  The strings range from 1 KB to several MB.

Since I am collapsing all 800 dbs into a single PG db, being very smart about storage was critical.  Since we’re also migrating everything to AWS, we’re placing those 3 strings (per row) into a single JSON document and storing the document in S3 buckets, with the pointer to the file being the globally unique PK for the row.  Super simple.  The app tier knows to fetch the data from the db and the large-string JSON from S3.  Retrieval is surprisingly fast, and this is all real-time web app work.
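
To make the layout concrete, here is a minimal Python sketch of that pattern, not the actual code from this project. The function names, the JSON field names, and the InMemoryObjectStore stand-in are all hypothetical. The stand-in only mimics the put_object/get_object call shape of a boto3 S3 client so the sketch runs without AWS credentials; a real deployment would presumably pass in boto3.client("s3") instead.

```python
import io
import json


def object_key(row_pk):
    """Use the row's globally unique primary key as the object key."""
    return str(row_pk)


def store_large_strings(client, bucket, row_pk, strings):
    """Bundle one row's large strings into a JSON document and upload it."""
    body = json.dumps(strings).encode("utf-8")
    client.put_object(Bucket=bucket, Key=object_key(row_pk), Body=body)


def fetch_large_strings(client, bucket, row_pk):
    """Download and decode the JSON document that belongs to one row."""
    response = client.get_object(Bucket=bucket, Key=object_key(row_pk))
    return json.loads(response["Body"].read().decode("utf-8"))


class InMemoryObjectStore:
    """Hypothetical stand-in that mimics the two S3 client calls used above."""

    def __init__(self):
        self._objects = {}

    def put_object(self, Bucket, Key, Body):
        self._objects[(Bucket, Key)] = Body

    def get_object(self, Bucket, Key):
        # Like the S3 client, return the payload as a readable stream.
        return {"Body": io.BytesIO(self._objects[(Bucket, Key)])}


if __name__ == "__main__":
    store = InMemoryObjectStore()
    # Field names are illustrative; the post does not name the 3 columns.
    row = {"seq_a": "ACGT", "seq_b": "TTGA", "seq_c": "GGCC"}
    store_large_strings(store, "genomic-strings", "row-0001", row)
    print(fetch_large_strings(store, "genomic-strings", "row-0001"))
```

The database row keeps only the structured columns and its PK; the app tier derives the object key from the PK, so no separate pointer column is needed.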

This is a model that could work for anyone dealing with large objects (text or binary).  The nice part is that the original 25 TB of data storage drops to 5 TB, a much more manageable number that leaves room for the significant growth on the horizon.

Mike Sofen  (Synthetic Genomics USA)
