Loading 500m json files to database

From: pinker
Subject: Loading 500m json files to database
Date:
Msg-id: 1584959088557-0.post@n3.nabble.com
Replies: Re: Loading 500m json files to database  (Ertan Küçükoğlu <ertan.kucukoglu@1nar.com.tr>)
Re: Loading 500m json files to database  (Christopher Browne <cbbrowne@gmail.com>)
Re: Loading 500m json files to database  (Rob Sargent <robjsargent@gmail.com>)
Re: Loading 500m json files to database  (Adrian Klaver <adrian.klaver@aklaver.com>)
Re: Loading 500m json files to database  ("David G. Johnston" <david.g.johnston@gmail.com>)
Re: Loading 500m json files to database  (Reid Thompson <Reid.Thompson@omnicell.com>)
List: pgsql-general
Hi, do you maybe have an idea how to make the loading process faster?

I have 500 million JSON files (one JSON document per file) that I need to load
into the database.
My test set is "only" 1 million files.

What I came up with now is:

time for i in datafiles/*; do
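  # one backgrounded psql process, and hence one COPY and one transaction, per file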
  psql -c "\copy json_parts(json_data) FROM '$i'" &
done

which is the fastest approach so far, but it's not what I expected: loading 1m files
takes me ~3h, so loading 500 times as much is just unacceptable.
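
For comparison, here is a rough sketch of streaming many files through a single
psql/COPY invocation instead of one per file. It is only an illustration and
assumes every file holds exactly one single-line JSON document with no tabs or
backslashes (which COPY's text format would otherwise mangle):

# sketch only: one psql session and one COPY for many files,
# with the data streamed on psql's standard input (pstdin)
find datafiles -type f -print0 \
  | xargs -0 cat \
  | psql -c "\copy json_parts(json_data) FROM pstdin"

The point of the sketch is that connection setup and per-statement overhead are
paid once rather than a million times.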

Some facts:
* the target DB is in the cloud, so there is no option for tricks like turning
fsync off
* Postgres version is 11
* I can spin up a huge Postgres instance if necessary in terms of CPU/RAM
* I already tried hash partitioning (writing to 10 different tables instead
of 1); a rough DDL sketch follows below
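
The sketch below is only an illustration of that partitioned layout, assuming
Postgres 11 declarative hash partitioning; json_parts and json_data are taken
from the \copy command above, and the real setup may have used 10 independent
tables instead:

-- sketch only: 10 hash partitions keyed on a generated id
CREATE TABLE json_parts (
    id        bigserial,
    json_data jsonb
) PARTITION BY HASH (id);

CREATE TABLE json_parts_0 PARTITION OF json_parts
    FOR VALUES WITH (MODULUS 10, REMAINDER 0);
-- ... json_parts_1 through json_parts_8 follow the same pattern ...
CREATE TABLE json_parts_9 PARTITION OF json_parts
    FOR VALUES WITH (MODULUS 10, REMAINDER 9);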


Any ideas?



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html


