Discussion: Growth planning


Growth planning

From
Israel Brewster
Date:
A couple of months ago I was tasked with setting up a new database to hold the results of some new data processing scripts a colleague of mine was developing. As I knew this would be a decent amount of data, I did my best to set up a system that would perform well, assigning the machine 20 processor cores to allow for parallel retrieval and processing workflows, 128GB of RAM, and 6TB of NVMe storage. On this I installed the latest PostgreSQL (13), did some performance tuning of the settings according to various guides I found online, set up a table to hold the data, and created a number of indexes that seemed appropriate for the queries I anticipated.
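For reference, the kind of settings those guides suggest for a machine of this size looks roughly like the following - the specific values here are illustrative of the typical recommendations rather than the exact figures I ended up with:

-- Illustrative only: typical guide recommendations for a 20-core / 128GB,
-- NVMe-backed server; not necessarily the exact values in use here.
ALTER SYSTEM SET shared_buffers = '32GB';               -- ~25% of RAM (needs a restart)
ALTER SYSTEM SET effective_cache_size = '96GB';         -- planner hint, ~75% of RAM
ALTER SYSTEM SET maintenance_work_mem = '2GB';
ALTER SYSTEM SET work_mem = '64MB';                     -- per sort/hash node, per backend
ALTER SYSTEM SET max_parallel_workers = 20;
ALTER SYSTEM SET max_parallel_workers_per_gather = 10;
ALTER SYSTEM SET random_page_cost = 1.1;                -- fast NVMe storage
ALTER SYSTEM SET effective_io_concurrency = 200;        -- fast NVMe storage
SELECT pg_reload_conf();                                -- most of these take effect on reload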

At this point, performance is fine, though not spectacular, with a “normal” query of the currently 400GB database taking a couple of seconds. “Normal” in this case being defined as the most common SELECT query run against the data to plot the current and historical trends. Users are few in number (and will likely remain that way), and it can keep up with the insert rate without difficulty - data is processed and dumped to the database in 10 minute “chunks”, and generally the run only takes around 2 minutes, so there is a fair amount of headroom there.

However, this database is growing at a rate of 14GB/day (as measured via df stats), and the script owner is talking about wanting to process and pull in “all the historical data we have access to”, which would go back several years, not to mention the probable desire to keep things running into the foreseeable future. This amounts to a volume of data that is WAY beyond anything I have experience with, especially since I am primarily a software engineer, not a DB admin (though administering a DB is often part of my role, as the software I engineer often needs a data source). As such, I am looking for advice from people who have dealt with such large volumes of data on how I should architect things now, before it gets too out of hand, to best ensure optimal performance in the future.
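(For scale: 14GB/day works out to roughly 5TB/year - 14 x 365 ≈ 5,100GB - so even before any historical backfill, the current 6TB volume would fill up in a bit over a year.)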

Some additional data that may or may not be relevant:

- The workload is largely insert intensive. Every 10 minutes, 1Hz records are inserted for up to three channels from about 118 stations - so up to around 212,000 new records inserted every 10 minutes. In practice, the number is generally somewhat lower as not all stations actually have three channels of data, but the majority do.

- The largest SELECT workflow currently is a script that pulls all available data for ONE channel of each station (currently, I suspect that will change to all channels in the near future), and runs some post-processing machine learning algorithms on it. This script (written in R, if that makes a difference) currently takes around half an hour to run, and is run once every four hours. I would estimate about 50% of the run time is data retrieval and the rest doing its own thing. I am only responsible for integrating this script with the database, what it does with the data (and therefore how long that takes, as well as what data is needed), is up to my colleague. I have this script running on the same machine as the DB to minimize data transfer times.

- Other than the above processing script, workload is fairly light, with only one or two users occasionally pulling up graphs of the data for a handful of channels on a handful of stations at most (single station/channel at a time). The time range of data needed for these graphs may vary from a few seconds (looking at a specific event, which may or may not be recent) to several years (looking at historical trends). As such, full-resolution data needs to be available quickly - that is, on a user-friendly time scale - for any period of time for which we have data.


- Some stats from the pg_stat_user_tables:

volcano_seismology=# SELECT relname, last_vacuum, last_analyze,last_autovacuum,last_autoanalyze, autovacuum_count, autoanalyze_count, n_dead_tup,idx_scan,idx_tup_fetch,n_tup_ins,n_mod_since_analyze,n_ins_since_vacuum FROM pg_stat_user_tables WHERE relname='data';
-[ RECORD 1 ]-------+------------------------------
relname             | data
last_vacuum         | 2021-09-29 17:33:19.269922+00
last_analyze        | 2021-09-29 17:33:32.281416+00
last_autovacuum     | 2021-10-04 12:28:38.250069+00
last_autoanalyze    | 2021-10-04 15:05:29.745062+00
autovacuum_count    | 30
autoanalyze_count   | 37
n_dead_tup          | 122031
idx_scan            | 1584854
idx_tup_fetch       | 44873856136
n_tup_ins           | 245597916
n_mod_since_analyze | 1901231
n_ins_since_vacuum  | 5958840

- and the data table definition:

     Column     |           Type           | Collation | Nullable |             Default              
----------------+--------------------------+-----------+----------+----------------------------------
 id             | bigint                   |           | not null | nextval('data_id_seq'::regclass)
 datetime       | timestamp with time zone |           | not null | 
 freq_max1      | double precision         |           |          | 
 freq_max5      | double precision         |           |          | 
 freq_max10     | double precision         |           |          | 
 freq_max20     | double precision         |           |          | 
 freq_max30     | double precision         |           |          | 
 freq_max40     | double precision         |           |          | 
 freq_max50     | double precision         |           |          | 
 freq_max100    | double precision         |           |          | 
 sd_freq_max5   | double precision         |           |          | 
 sd_freq_max10  | double precision         |           |          | 
 sd_freq_max20  | double precision         |           |          | 
 sd_freq_max30  | double precision         |           |          | 
 sd_freq_max40  | double precision         |           |          | 
 sd_freq_max50  | double precision         |           |          | 
 sd_freq_max100 | double precision         |           |          | 
 ssa_max1       | double precision         |           |          | 
 ssa_max5       | double precision         |           |          | 
 ssa_max10      | double precision         |           |          | 
 ssa_max20      | double precision         |           |          | 
 ssa_max30      | double precision         |           |          | 
 ssa_max40      | double precision         |           |          | 
 ssa_max50      | double precision         |           |          | 
 ssa_max100     | double precision         |           |          | 
 sd_ssa_max5    | double precision         |           |          | 
 sd_ssa_max10   | double precision         |           |          | 
 sd_ssa_max20   | double precision         |           |          | 
 sd_ssa_max30   | double precision         |           |          | 
 sd_ssa_max40   | double precision         |           |          | 
 sd_ssa_max50   | double precision         |           |          | 
 sd_ssa_max100  | double precision         |           |          | 
 station        | smallint                 |           | not null | 
 channel        | character varying(6)     |           | not null | 
 epoch          | integer                  |           |          | 
 rsam           | double precision         |           |          | 
 sd_rsam        | double precision         |           |          | 
Indexes:
    "data_pkey" PRIMARY KEY, btree (id)
    "date_station_channel_idx" UNIQUE, btree (datetime, station, channel)
    "station_channel_epoch_idx" UNIQUE, btree (station, channel, epoch)
    "data_station_channel_idx" btree (station, channel)
    "station_data_idx" btree (station)
    "station_date_idx" btree (station, datetime)
Foreign-key constraints:
    "data_station_fkey" FOREIGN KEY (station) REFERENCES stations(id)
Triggers:
    update_epoch BEFORE INSERT OR UPDATE OF datetime ON data FOR EACH ROW EXECUTE FUNCTION store_epoch()

---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145

Re: Growth planning

From
Rob Sargent
Date:


On Oct 4, 2021, at 10:22 AM, Israel Brewster <ijbrewster@alaska.edu> wrote:


- and the data table definition:

     Column     |           Type           | Collation | Nullable |             Default              
----------------+--------------------------+-----------+----------+----------------------------------
 id             | bigint                   |           | not null | nextval('data_id_seq'::regclass)
 datetime       | timestamp with time zone |           | not null | 
 freq_max1      | double precision         |           |          | 
 freq_max5      | double precision         |           |          | 
 freq_max10     | double precision         |           |          | 
 freq_max20     | double precision         |           |          | 
 freq_max30     | double precision         |           |          | 
 freq_max40     | double precision         |           |          | 
 freq_max50     | double precision         |           |          | 
 freq_max100    | double precision         |           |          | 
 sd_freq_max5   | double precision         |           |          | 
 sd_freq_max10  | double precision         |           |          | 
 sd_freq_max20  | double precision         |           |          | 
 sd_freq_max30  | double precision         |           |          | 
 sd_freq_max40  | double precision         |           |          | 
 sd_freq_max50  | double precision         |           |          | 
 sd_freq_max100 | double precision         |           |          | 
 ssa_max1       | double precision         |           |          | 
 ssa_max5       | double precision         |           |          | 
 ssa_max10      | double precision         |           |          | 
 ssa_max20      | double precision         |           |          | 
 ssa_max30      | double precision         |           |          | 
 ssa_max40      | double precision         |           |          | 
 ssa_max50      | double precision         |           |          | 
 ssa_max100     | double precision         |           |          | 
 sd_ssa_max5    | double precision         |           |          | 
 sd_ssa_max10   | double precision         |           |          | 
 sd_ssa_max20   | double precision         |           |          | 
 sd_ssa_max30   | double precision         |           |          | 
 sd_ssa_max40   | double precision         |           |          | 
 sd_ssa_max50   | double precision         |           |          | 
 sd_ssa_max100  | double precision         |           |          | 
 station        | smallint                 |           | not null | 
 channel        | character varying(6)     |           | not null | 
 epoch          | integer                  |           |          | 
 rsam           | double precision         |           |          | 
 sd_rsam        | double precision         |           |          | 
Indexes:
    "data_pkey" PRIMARY KEY, btree (id)
    "date_station_channel_idx" UNIQUE, btree (datetime, station, channel)
    "station_channel_epoch_idx" UNIQUE, btree (station, channel, epoch)
    "data_station_channel_idx" btree (station, channel)
    "station_data_idx" btree (station)
    "station_date_idx" btree (station, datetime)
Foreign-key constraints:
    "data_station_fkey" FOREIGN KEY (station) REFERENCES stations(id)
Triggers:
    update_epoch BEFORE INSERT OR UPDATE OF datetime ON data FOR EACH ROW EXECUTE FUNCTION store_epoch()

Guessing the “sd” is “standard deviation”?  Any chance those stddevs are easily calculable from base data?  Could cut your table size in half (and put those 20 cores to work on the reporting).  And I wonder if the last three indices are strictly necessary? They take disk space too.
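If they are, something along these lines might reproduce them at read time - purely a sketch, since I'm guessing at both the grouping and which base column each sd_* pairs with:

-- Sketch only: recompute a standard deviation at query time instead of storing it.
-- Assumes sd_freq_max10 is simply the stddev of freq_max10 over some time bucket;
-- the hourly bucket here is a guess.
SELECT station,
       channel,
       date_trunc('hour', datetime) AS bucket,
       stddev_samp(freq_max10)      AS sd_freq_max10
FROM data
GROUP BY station, channel, bucket;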

But my bet is you’re headed for partitioning on datetime or perhaps station.

Re: Growth planning

From
Israel Brewster
Date:
On Oct 4, 2021, at 8:46 AM, Rob Sargent <robjsargent@gmail.com> wrote:

On Oct 4, 2021, at 10:22 AM, Israel Brewster <ijbrewster@alaska.edu> wrote:
Guessing the “sd” is "standard deviation”?  Any chance those stddevs are easily calculable from base data?  Could cut your table size in half (and put those 20 cores to work on the reporting).

Possible - I’d have to dig into that with the script author. I was just handed an R script (I don’t work with R…) and told here’s the data it needs, here’s the output we need stored in the DB. I then spent just enough time with the script to figure out how to hook up the I/O. The schema is pretty much just a raw dump of the output - I haven’t really spent any resources figuring out what, exactly, the data is. Maybe I should :-)

 And I wonder if the last three indices are strictly necessary? They take disc space too.

Not sure. Here’s the output from pg_stat_all_indexes:

volcano_seismology=# select * from pg_stat_all_indexes where relname='data';
 relid | indexrelid | schemaname | relname |       indexrelname        | idx_scan | idx_tup_read | idx_tup_fetch 
-------+------------+------------+---------+---------------------------+----------+--------------+---------------
 19847 |      19869 | public     | data    | data_pkey                 |        0 |            0 |             0
 19847 |      19873 | public     | data    | date_station_channel_idx  |   811884 |  12031143199 |    1192412952
 19847 |      19875 | public     | data    | station_channel_epoch_idx |        8 |       318506 |        318044
 19847 |      19876 | public     | data    | station_data_idx          |     9072 |         9734 |          1235
 19847 |      19877 | public     | data    | station_date_idx          |   721616 |  10927533403 |   10908912092
 19847 |      20479 | public     | data    | data_station_channel_idx  |    47293 | 194422257262 |    6338753379
(6 rows)

so they *have* been used (although not the station_data_idx so much), but this doesn’t tell me when it was last used, so some of those may be queries I was experimenting with to see what was fastest, but are no longer in use. Maybe I should keep an eye on this for a while, see which values are increasing.
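Something like this would probably do for that (a sketch - the snapshot table is just a scratch table I'd create for the purpose):

-- Snapshot the index-usage counters now...
CREATE TABLE idx_usage_snapshot AS
SELECT now() AS taken_at, indexrelname, idx_scan
FROM pg_stat_all_indexes
WHERE relname = 'data';

-- ...then come back in a week or two and see which indexes actually moved:
SELECT cur.indexrelname,
       cur.idx_scan - snap.idx_scan AS scans_since_snapshot
FROM pg_stat_all_indexes cur
JOIN idx_usage_snapshot snap USING (indexrelname)
WHERE cur.relname = 'data'
ORDER BY scans_since_snapshot;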


But my bet is you’re headed for partitioning on datetime or perhaps station.

While datetime partitioning seems to be the most common, I’m not clear on how that would help here, as the most intensive queries need *all* the datetimes for a given station, and even the smaller queries would be getting an arbitrary time range potentially spanning several, if not all, partitions. Partitioning on station, however, seems to make sense - there are over 100 of those, and pretty much any query will only deal with a single station at a time. Perhaps, if more partitioning would be better, partition by both station and channel? The queries that need to be fastest will only be looking at a single channel of a single station.

I’ll look into this a bit more, maybe try some experimenting while I still have *relatively* little data. My main hesitation here is that in the brief look I’ve given partitioning so far, it looks to be a royal pain to get set up. Any tips for making that easier?

Thanks for the suggestion!
---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145

Re: Growth planning

From
Rob Sargent
Date:
On 10/4/21 11:09 AM, Israel Brewster wrote:
On Oct 4, 2021, at 8:46 AM, Rob Sargent <robjsargent@gmail.com> wrote:

On Oct 4, 2021, at 10:22 AM, Israel Brewster <ijbrewster@alaska.edu> wrote:
Guessing the “sd” is "standard deviation”?  Any chance those stddevs are easily calculable from base data?  Could cut your table size in half (and put those 20 cores to work on the reporting).

Possible - I’d have to dig into that with the script author. I was just handed an R script (I don’t work with R…) and told here’s the data it needs, here’s the output we need stored in the DB. I then spent just enough time with the script to figure out how to hook up the I/O. The schema is pretty much just a raw dump of the output - I haven’t really spent any resources figuring out what, exactly, the data is. Maybe I should :-)

 And I wonder if the last three indices are strictly necessary? They take disc space too.

Not sure. Here’s the output from pg_stat_all_indexes:

volcano_seismology=# select * from pg_stat_all_indexes where relname='data';
 relid | indexrelid | schemaname | relname |       indexrelname        | idx_scan | idx_tup_read | idx_tup_fetch 
-------+------------+------------+---------+---------------------------+----------+--------------+---------------
 19847 |      19869 | public     | data    | data_pkey                 |        0 |            0 |             0
 19847 |      19873 | public     | data    | date_station_channel_idx  |   811884 |  12031143199 |    1192412952
 19847 |      19875 | public     | data    | station_channel_epoch_idx |        8 |       318506 |        318044
 19847 |      19876 | public     | data    | station_data_idx          |     9072 |         9734 |          1235
 19847 |      19877 | public     | data    | station_date_idx          |   721616 |  10927533403 |   10908912092
 19847 |      20479 | public     | data    | data_station_channel_idx  |    47293 | 194422257262 |    6338753379
(6 rows)

so they *have* been used (although not the station_data_idx so much), but this doesn’t tell me when it was last used, so some of those may be queries I was experimenting with to see what was fastest, but are no longer in use. Maybe I should keep an eye on this for a while, see which values are increasing.


But my bet is you’re headed for partitioning on datetime or perhaps station.

While datetime partitioning seems to be the most common, I’m not clear on how that would help here, as the most intensive queries need *all* the datetimes for a given station, and even the smaller queries would be getting an arbitrary time range potentially spanning several, if not all, partitions. Partitioning on station, however, seems to make sense - there are over 100 of those, and pretty much any query will only deal with a single station at a time. Perhaps, if more partitioning would be better, partition by both station and channel? The queries that need to be fastest will only be looking at a single channel of a single station.

I’ll look into this a bit more, maybe try some experimenting while I still have *relatively* little data. My main hesitation here is that in the brief look I’ve given partitioning so far, it looks to be a royal pain to get set up. Any tips for making that easier?


If no queries address multiple stations you could do a table per station.  Doesn't smell good but you have a lot of data and well, speed kills.

I think the date-station-channel could "take over" for the station-date.  Naturally the latter is chosen if you give just the two fields, but I would be curious to see how well the former performs given just its first two fields(when station-date doesn't exist).

Re: Growth planning

From
Israel Brewster
Date:
On Oct 4, 2021, at 9:22 AM, Rob Sargent <robjsargent@gmail.com> wrote:

On 10/4/21 11:09 AM, Israel Brewster wrote:
On Oct 4, 2021, at 8:46 AM, Rob Sargent <robjsargent@gmail.com> wrote:

On Oct 4, 2021, at 10:22 AM, Israel Brewster <ijbrewster@alaska.edu> wrote:
Guessing the “sd” is "standard deviation”?  Any chance those stddevs are easily calculable from base data?  Could cut your table size in half (and put those 20 cores to work on the reporting).

Possible - I’d have to dig into that with the script author. I was just handed an R script (I don’t work with R…) and told here’s the data it needs, here’s the output we need stored in the DB. I then spent just enough time with the script to figure out how to hook up the I/O. The schema is pretty much just a raw dump of the output - I haven’t really spent any resources figuring out what, exactly, the data is. Maybe I should :-)

 And I wonder if the last three indices are strictly necessary? They take disc space too.

Not sure. Here’s the output from pg_stat_all_indexes:

volcano_seismology=# select * from pg_stat_all_indexes where relname='data';
 relid | indexrelid | schemaname | relname |       indexrelname        | idx_scan | idx_tup_read | idx_tup_fetch 
-------+------------+------------+---------+---------------------------+----------+--------------+---------------
 19847 |      19869 | public     | data    | data_pkey                 |        0 |            0 |             0
 19847 |      19873 | public     | data    | date_station_channel_idx  |   811884 |  12031143199 |    1192412952
 19847 |      19875 | public     | data    | station_channel_epoch_idx |        8 |       318506 |        318044
 19847 |      19876 | public     | data    | station_data_idx          |     9072 |         9734 |          1235
 19847 |      19877 | public     | data    | station_date_idx          |   721616 |  10927533403 |   10908912092
 19847 |      20479 | public     | data    | data_station_channel_idx  |    47293 | 194422257262 |    6338753379
(6 rows)

so they *have* been used (although not the station_data_idx so much), but this doesn’t tell me when it was last used, so some of those may be queries I was experimenting with to see what was fastest, but are no longer in use. Maybe I should keep an eye on this for a while, see which values are increasing.


But my bet is you’re headed for partitioning on datetime or perhaps station.

While datetime partitioning seems to be the most common, I’m not clear on how that would help here, as the most intensive queries need *all* the datetimes for a given station, and even the smaller queries would be getting an arbitrary time range potentially spanning several, if not all, partitions. Partitioning on station, however, seems to make sense - there are over 100 of those, and pretty much any query will only deal with a single station at a time. Perhaps, if more partitioning would be better, partition by both station and channel? The queries that need to be fastest will only be looking at a single channel of a single station.

I’ll look into this a bit more, maybe try some experimenting while I still have *relatively* little data. My main hesitation here is that in the brief look I’ve given partitioning so far, it looks to be a royal pain to get set up. Any tips for making that easier?


If no queries address multiple stations you could do a table per station.  Doesn't smell good but you have a lot of data and well, speed kills.

Indeed. Table per station as opposed to partitioning? The *most* I can reasonably envision needing is to query two stations, i.e. I could see potentially wanting to compare station a to some “baseline” station b. In general, though, the stations are independent, and it seems unlikely that we will need any multi-station queries. Perhaps query one station, then a second query for a second to display graphs for both side-by-side to look for correlations or something, but nothing like that has been suggested at the moment.


I think the date-station-channel could "take over" for the station-date.  Naturally the latter is chosen if you give just the two fields, but I would be curious to see how well the former performs given just its first two fields(when station-date doesn't exist).

Ah, that makes sense. I’ll try to run some benchmarks later today/tomorrow.

Thanks again!

---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145

Re: Growth planning

From
Alban Hertroys
Date:
> On 4 Oct 2021, at 18:22, Israel Brewster <ijbrewster@alaska.edu> wrote:

(…)

> the script owner is talking about wanting to process and pull in “all the historical data we have access to”, which would go back several years, not to mention the probable desire to keep things running into the foreseeable future.

(…)

> - The largest SELECT workflow currently is a script that pulls all available data for ONE channel of each station (currently, I suspect that will change to all channels in the near future), and runs some post-processing machine learning algorithms on it. This script (written in R, if that makes a difference) currently takes around half an hour to run, and is run once every four hours. I would estimate about 50% of the run time is data retrieval and the rest doing its own thing. I am only responsible for integrating this script with the database, what it does with the data (and therefore how long that takes, as well as what data is needed), is up to my colleague. I have this script running on the same machine as the DB to minimize data transfer times.

I suspect that a large portion of time is spent on downloading this data to the R script. Would it help to rewrite it in PL/R and do (part of) the ML calculations on the DB side?
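A minimal sketch of the shape that takes, assuming the plr extension is installed (the function and its R body are made up purely for illustration - the point is that only the small result crosses the wire, not the raw rows):

CREATE EXTENSION IF NOT EXISTS plr;

-- Made-up example: run the R computation inside the database and return
-- only the aggregate result.
CREATE OR REPLACE FUNCTION r_mean(vals float8[]) RETURNS float8 AS $$
  mean(vals)
$$ LANGUAGE plr;

SELECT station, channel, r_mean(array_agg(freq_max10))
FROM data
GROUP BY station, channel;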

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.




Re: Growth planning

From
Ron
Date:
On 10/4/21 12:36 PM, Israel Brewster wrote:
[snip]
Indeed. Table per station as opposed to partitioning? The *most* I can reasonably envision needing is to query two stations, i.e. I could see potentially wanting to compare station a to some “baseline” station b. In general, though, the stations are independent, and it seems unlikely that we will need any multi-station queries. Perhaps query one station, then a second query for a second to display graphs for both side-by-side to look for correlations or something, but nothing like that has been suggested at the moment.


Postgresql partitions are tables.  What if you partition by station (or range of stations)?
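Roughly like this, as a sketch (table and partition names invented; this assumes your stations table's id column holds the station numbers):

-- Parent table, list-partitioned by station (measurement columns abbreviated).
CREATE TABLE data_part (
    datetime   timestamptz NOT NULL,
    station    smallint    NOT NULL,
    channel    varchar(6)  NOT NULL,
    freq_max10 double precision
    -- ... remaining measurement columns ...
) PARTITION BY LIST (station);

-- Generate one partition per station; in psql, \gexec runs each generated statement.
SELECT format('CREATE TABLE data_s%s PARTITION OF data_part FOR VALUES IN (%s)', id, id)
FROM stations
ORDER BY id
\gexec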

--
Angular momentum makes the world go 'round.

Re: Growth planning

From
Israel Brewster
Date:
On Oct 4, 2021, at 12:46 PM, Ron <ronljohnsonjr@gmail.com> wrote:

On 10/4/21 12:36 PM, Israel Brewster wrote:
[snip]
Indeed. Table per station as opposed to partitioning? The *most* I can reasonably envision needing is to query two stations, i.e. I could see potentially wanting to compare station a to some “baseline” station b. In general, though, the stations are independent, and it seems unlikely that we will need any multi-station queries. Perhaps query one station, then a second query for a second to display graphs for both side-by-side to look for correlations or something, but nothing like that has been suggested at the moment.


Postgresql partitions are tables.  What if you partition by station (or range of stations)?

Yeah, that’s what I thought, but Rob had said “Table per station”, so I wasn’t sure if he was referring to *not* using partitioning, but just making “plain” tables.

Regardless, I intend to try partitioning by station sometime this week, to see how performance compares to the “one big table” I currently have. Also to figure out how to get it set up, which from what I’ve seen appears to be a bit of a pain point.
---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145

--
Angular momentum makes the world go 'round.

Re: Growth planning

From
Rob Sargent
Date:
On 10/4/21 3:09 PM, Israel Brewster wrote:
On Oct 4, 2021, at 12:46 PM, Ron <ronljohnsonjr@gmail.com> wrote:

On 10/4/21 12:36 PM, Israel Brewster wrote:
[snip]
Indeed. Table per station as opposed to partitioning? The *most* I can reasonably envision needing is to query two stations, i.e. I could see potentially wanting to compare station a to some “baseline” station b. In general, though, the stations are independent, and it seems unlikely that we will need any multi-station queries. Perhaps query one station, then a second query for a second to display graphs for both side-by-side to look for correlations or something, but nothing like that has been suggested at the moment.


Postgresql partitions are tables.  What if you partition by station (or range of stations)?

Yeah, that’s what I thought, but Rob had said “Table per station”, so I wasn’t sure if he was referring to *not* using partitioning, but just making “plain” tables.

Regardless, I intend to try partitioning by station sometime this week, to see how performance compares to the “one big table” I currently have. Also to figure out how to get it set up, which from what I’ve seen appears to be a bit of a pain point.
---

My "strict" table per station suggestion was meant as an option to avoid the partitioning pain point entirely if it wasn't going to buy you anything. Namely querying more than one station's data.

In a write-once scenario such as this,  would a "clustered index" on datetime be stable, performant?  Seems a read-for-export could put the head down at time point A and just go?


Re: Growth planning

From
Israel Brewster
Date:
On Oct 4, 2021, at 1:21 PM, Rob Sargent <robjsargent@gmail.com> wrote:

On 10/4/21 3:09 PM, Israel Brewster wrote:
On Oct 4, 2021, at 12:46 PM, Ron <ronljohnsonjr@gmail.com> wrote:

On 10/4/21 12:36 PM, Israel Brewster wrote:
[snip]
Indeed. Table per station as opposed to partitioning? The *most* I can reasonably envision needing is to query two stations, i.e. I could see potentially wanting to compare station a to some “baseline” station b. In general, though, the stations are independent, and it seems unlikely that we will need any multi-station queries. Perhaps query one station, then a second query for a second to display graphs for both side-by-side to look for correlations or something, but nothing like that has been suggested at the moment.


Postgresql partitions are tables.  What if you partition by station (or range of stations)?

Yeah, that’s what I thought, but Rob had said “Table per station”, so I wasn’t sure if he was referring to *not* using partitioning, but just making “plain” tables.

Regardless, I intend to try partitioning by station sometime this week, to see how performance compares to the “one big table” I currently have. Also to figure out how to get it set up, which from what I’ve seen appears to be a bit of a pain point.
---

My "strict" table per station suggestion was meant as an option to avoid the partitioning pain point entirely if it wasn't going to buy you anything. Namely querying more than one station's data.

Ah, so in theory making “strict” tables for each would be easier than creating partitions for each? Something to consider for sure if so.


In a write-once scenario such as this,  would a "clustered index" on datetime be stable, performant?  Seems a read-for-export could put the head down at time point A and just go?

That’s beyond my level of DB admin knowledge, unfortunately :) I can certainly read up on it and give it a try though!

---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145  

Re: Growth planning

From
Israel Brewster
Date:
On Oct 4, 2021, at 9:22 AM, Rob Sargent <robjsargent@gmail.com> wrote:

I think the date-station-channel could "take over" for the station-date.  Naturally the latter is chosen if you give just the two fields, but I would be curious to see how well the former performs given just its first two fields(when station-date doesn't exist).


Interesting result here. Technically it appears you are correct - the date-station-channel index *can* “take over” for the station-date index. Unfortunately, it is about 6x slower (see the EXPLAIN ANALYZE output for the station_date_idx here: https://explain.depesz.com/s/COfy vs the one for the date-station-channel index here: https://explain.depesz.com/s/hgBt) - using the station_date_idx takes around 2.5 seconds while the date-station-channel index is over 12 seconds, even though it has an apparently simpler execution plan. Perhaps something about the different sizes of the indexes?

The query I used in both cases was this:

SELECT
to_char(datetime AT TIME ZONE 'UTC','YYYY-MM-DD"T"HH24:MI:SS"Z"') as text_date,
freq_max10,
sd_freq_max10,
rsam
FROM
data
WHERE datetime>='2021-09-27' 
AND station=27
AND channel='BHZ'

Which actually includes all three columns (which makes it even more interesting to me that the two column, non-UNIQUE index is preferable), and I ran the query several times both with and without the station-date index to (hopefully) make sure there were no caching issues.
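(For what it's worth, this kind of comparison is cheap to repeat, because DROP INDEX is transactional in Postgres - though it does hold an exclusive lock on the table until the rollback:)

BEGIN;
DROP INDEX station_date_idx;   -- blocks other access to "data" until ROLLBACK
EXPLAIN (ANALYZE, BUFFERS)
SELECT to_char(datetime AT TIME ZONE 'UTC','YYYY-MM-DD"T"HH24:MI:SS"Z"') as text_date,
       freq_max10, sd_freq_max10, rsam
FROM data
WHERE datetime >= '2021-09-27'
  AND station = 27
  AND channel = 'BHZ';
ROLLBACK;                      -- the index comes back untouched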

---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145

Re: Growth planning

From
Rob Sargent
Date:
On 10/4/21 3:37 PM, Israel Brewster wrote:
On Oct 4, 2021, at 1:21 PM, Rob Sargent <robjsargent@gmail.com> wrote:

My "strict" table per station suggestion was meant as an option to avoid the partitioning pain point entirely if it wasn't going to buy you anything. Namely querying more than one station's data.

Ah, so in theory making “strict” tables for each would be easier than creating partitions for each? Something to consider for sure if so.


In a write-once scenario such as this,  would a "clustered index" on datetime be stable, performant?  Seems a read-for-export could put the head down at time point A and just go?

That’s beyond my level of DB admin knowledge, unfortunately :) I can certainly read up on it and give it a try though!


I was hoping one of the smart people would chime in;)

Re: Growth planning

From
Ryan Booz
Date:
As for clustering, unfortunately, it's a one-time operation in Postgres (as far as I'm aware), so you'd have to "cluster" the index every time after an insert or update of data. If it is partitioned, I presume it can be run on the index of each partition table individually - but I'm not sure.
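For reference, the one-time form against the current table would be something like:

-- One-time physical reorder of the table by an existing index; the ordering
-- is not maintained for later inserts, and CLUSTER holds an ACCESS EXCLUSIVE
-- lock for the duration.
CLUSTER data USING date_station_channel_idx;
ANALYZE data;   -- refresh stats (e.g. correlation) after the reorder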

On Mon, Oct 4, 2021 at 6:05 PM Rob Sargent <robjsargent@gmail.com> wrote:
On 10/4/21 3:37 PM, Israel Brewster wrote:
On Oct 4, 2021, at 1:21 PM, Rob Sargent <robjsargent@gmail.com> wrote:

My "strict" table per station suggestion was meant as an option to avoid the partitioning pain point entirely if it wasn't going to buy you anything. Namely querying more than one station's data.

Ah, so in theory making “strict” tables for each would be easier than creating partitions for each? Something to consider for sure if so.


In a write-once scenario such as this,  would a "clustered index" on datetime be stable, performant?  Seems a read-for-export could put the head down at time point A and just go?

That’s beyond my level of DB admin knowledge, unfortunately :) I can certainly read up on it and give it a try though!


I was hoping one of the smart people would chime in;)

Re: Growth planning

From
Israel Brewster
Date:
Just a quick update to this topic from my testing: I whipped up a quick python script to create the partition tables for me, which went smoothly enough, and created a table LIST partitioned on station. Once populated with my current data, this table proved marginally faster than the unpartitioned table, especially on the initial select (~4 seconds vs ~6 seconds, speeding up to around 2.4 seconds on subsequent queries). Of course, it is entirely possible that the performance advantage over the unpartitioned table will hold, or even grow, as the row count grows.

Then I tried partitioning by station, with the station tables sub-partitioned by channel, on the logic that most queries (especially the ones that need to be fast) are only interested in a single channel on a single station. This made a HUGE improvement (relatively speaking). Initial query time dropped to ~2.5 seconds, with subsequent queries coming in at closer to 1 second!
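Structurally it ended up looking something like this (names illustrative; the parent is the same LIST-partition-on-station arrangement sketched earlier in the thread, with each station partition itself LIST partitioned on channel):

-- Each station partition is itself partitioned, so every leaf table holds
-- exactly one station/channel combination.
CREATE TABLE data_s27 PARTITION OF data_part
    FOR VALUES IN (27) PARTITION BY LIST (channel);

CREATE TABLE data_s27_bhz PARTITION OF data_s27
    FOR VALUES IN ('BHZ');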

I’ll have to think about the maintenance side - what happens if a new station/channel comes online that I don’t have a partition for? I’m thinking I’ll try to catch the error in my python code when I try to insert such a record, create the relevant table(s), then try the INSERT again, but I’ll have to investigate more to figure out if this is an option (what sort of error do I get?), or if there is a better one.
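(If I just let the INSERT fail, I believe the error is along the lines of "no partition of relation ... found for row". One option I've seen mentioned that might sidestep the error handling entirely is a DEFAULT partition at each level, which catches any rows that don't match an existing partition and can be emptied into a proper partition later - a sketch, names as above:)

-- Catch-alls for stations/channels that don't have their own partition yet.
CREATE TABLE data_default PARTITION OF data_part DEFAULT;
CREATE TABLE data_s27_default PARTITION OF data_s27 DEFAULT;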

I guess time will tell if this is a sustainable/good schema, but at least for the 1,171,575,191 rows I currently have, this gives much better performance than the non-partitioned table, and presumably will continue to do so into the future.

Of course, if anyone else has any other suggestions other than simple partitioning, I’m all ears! Like I said, this is far outside my experience in terms of sheer data size (I will be talking to the timescaledb people tomorrow)!

Thanks again!
---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145

On Oct 4, 2021, at 12:46 PM, Ron <ronljohnsonjr@gmail.com> wrote:

On 10/4/21 12:36 PM, Israel Brewster wrote:
[snip]
Indeed. Table per station as opposed to partitioning? The *most* I can reasonably envision needing is to query two stations, i.e. I could see potentially wanting to compare station a to some “baseline” station b. In general, though, the stations are independent, and it seems unlikely that we will need any multi-station queries. Perhaps query one station, then a second query for a second to display graphs for both side-by-side to look for correlations or something, but nothing like that has been suggested at the moment.


Postgresql partitions are tables.  What if you partition by station (or range of stations)?

--
Angular momentum makes the world go 'round.