Thread: Database setup for pgarchives

Database setup for pgarchives

From
Sahil Harpal
Date:
Hello everyone,
I am working on improving the pgarchives project as part of GSoC.
I need help with database initialization. Does anyone have a database initialization script that creates all the required tables and fills them with sample data? From what I have observed, the simple migration does not actually create all the tables. For example, the table list_months is not created during the migration process.
Also, can I get some sample real-world data dumped from the current database? I tried inserting some sample data, but, maybe due to internal constraints or dependencies on other tables, it does not work properly and gives an error when accessing mail threads.
It would be really helpful to get a database initialization script that creates all the required tables, plus some sample real-world data that I can use for testing.
I am planning to open the new design for discussion by posting a link to a test server, but without data nothing will be visible.
I would also love to know of any other possible method or technique for this.

Thanks,
Sahil Harpal

Re: Database setup for pgarchives

From
Jacob Champion
Date:
On 7/20/22 14:12, Sahil Harpal wrote:
> Hello everyone,
> I am working on the improvement of pgarchives project as a part of GSoC.
> I need help in database initialization. Do anyone have a database
> initialization script which will create all the required tables and
> fills it with sample data? Because what I observed the simple migration
> is not actually creating all the tables. Like table list_months is not
> getting created during migration process.

Hi Sahil, I've also been playing with a local pgarchives setup recently.
I ended up using loader/sql/schema.sql in that repository to create some
of the missing tables, and then the migration scripts to fill the rest
in. Hopefully someone knows of an easier or more straightforward way.
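
For illustration, the order described above (loader/sql/schema.sql first, then the Django migrations) could be sketched as a small Python helper. This is only a sketch under assumptions: the database name `archives` and the location of `manage.py` are hypothetical and not confirmed anywhere in this thread, and by default the helper only prints the commands instead of running them.

```python
import shlex
import subprocess


def bootstrap_commands(dbname, schema_sql="loader/sql/schema.sql", manage_py="manage.py"):
    """Return the two commands in the order described: schema.sql, then migrations.

    The manage.py path is an assumption; adjust it to the actual checkout.
    """
    return [
        # ON_ERROR_STOP makes psql stop at the first SQL error instead of continuing.
        ["psql", "-v", "ON_ERROR_STOP=1", "-d", dbname, "-f", schema_sql],
        ["python", manage_py, "migrate"],
    ]


def run_bootstrap(dbname, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in bootstrap_commands(dbname):
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_bootstrap("archives")  # hypothetical database name; dry run only prints
```

Running it with `dry_run=False` would execute both steps and stop on the first failure.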

> Also can I get some sample real word data that can be dump from the
> current database? I tried inserting some sample data but may be due to
> some internal constraints/dependency with other tables/info it is
> not working properly and giving error on accessing mail threads.
> It would be realy helpful if I get database initialization script which
> will create all required tables and some sample real world data that I
> can use for testing.
> I am planning to make the new design open for the discussion by puting
> link of test server but without data nothing will be visible.
> Also would love to know any other possible method/technique for this.

I am also interested in the answer to this, but unfortunately I don't
have much advice to offer. My local setup contains manually inserted
rows as well. I tried to use loader/pglister_sync.py as a bit of a
guide. The Django admin interface also helped a little bit (for example
with list creation), but in the end I had to construct queries for the
remaining pieces.

--Jacob



Re: Database setup for pgarchives

From
Sahil Harpal
Date:
> I am also interested in the answer to this, but unfortunately I don't
> have much advice to offer. My local setup contains manually inserted
> rows as well. I tried to use loader/pglister_sync.py as a bit of a
> guide. The Django admin interface also helped a little bit (for example
> with list creation), but in the end I had to construct queries for the
> remaining pieces.

Are you able to visit all the pages for the inserted data, especially this one: /message-id/id_of_thread/?
If so, could you please dump your database entries to JSON and share them with the community?
I also tried inserting data, but I am missing something and therefore get an error when accessing this page.
It would also be great if you could summarize the initialization steps.

Thanks,
Sahil

Re: Database setup for pgarchives

From
Magnus Hagander
Date:
On Wed, Jul 20, 2022 at 11:13 PM Sahil Harpal <sahilharpal1234@gmail.com> wrote:
>
> Hello everyone,
> I am working on the improvement of pgarchives project as a part of GSoC.
> I need help in database initialization. Do anyone have a database initialization script which will create all the
> required tables and fills it with sample data? Because what I observed the simple migration is not actually creating all
> the tables. Like table list_months is not getting created during migration process.

There is very much a backlog on fixing this. This:
https://www.postgresql.org/message-id/12eb75f0-3fc2-14f3-0931-4f29e145f182%40cmatte.me
may be a good starting point. It's been on my list for far too long to
review that submission and I haven't gotten around to it, but it can
hopefully help set you on the right track.

The core problem being that some items are created manually using the
scripts in loader/sql and some are in the django models, which is...
Not very good.


> Also can I get some sample real word data that can be dump from the current database? I tried inserting some sample
> data but may be due to some internal constraints/dependency with other tables/info it is not working properly and giving
> error on accessing mail threads.

Download a mbox file for one month of say pgsql-hackers from the
website, and load that using loader/load_message.py. That's what I
usually do to inject test data. The only thing needed before that is
to create the list pgsql-hackers (well, you can call the list whatever
you want, but it has to be *a* list that you specify with -l to the
loader script). If you get a whole month's worth of it, it pretty much
always contains a big enough mix of threads to make it useful for
testing.
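
Before loading, it can help to check that a downloaded mbox file really contains a usable mix of threads. The following is a minimal sketch using only Python's standard `mailbox` module. It is independent of loader/load_message.py, the function name is hypothetical, and treating a message without In-Reply-To and References headers as a thread start is a simplifying assumption.

```python
import mailbox
import sys
from collections import Counter


def summarize_mbox(path):
    """Return (message_count, thread_start_count, most_common_subjects)."""
    box = mailbox.mbox(path, create=False)  # create=False: fail if the file is missing
    try:
        total = 0
        thread_starts = 0
        subjects = Counter()
        for msg in box:
            total += 1
            # Assumption: a message with neither header starts a new thread.
            if not msg.get("In-Reply-To") and not msg.get("References"):
                thread_starts += 1
            subjects[str(msg.get("Subject", "")).strip()] += 1
        return total, thread_starts, subjects.most_common(5)
    finally:
        box.close()


if __name__ == "__main__" and len(sys.argv) == 2:
    count, starts, top = summarize_mbox(sys.argv[1])
    print(f"{count} messages, {starts} thread starts")
    for subject, n in top:
        print(f"{n:4d}  {subject}")
```

The script takes the path of the downloaded mbox file as its only argument.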

> It would be really helpful if I get a database initialization script which will create all required tables and some
> sample real world data that I can use for testing.

Yeah, just getting a single-step setup of an initial, if empty,
database would be good.

--
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/



Re: Database setup for pgarchives

From
Sahil Harpal
Date:
I tried this on WSL. There, executing schema.sql fails with the following error: psql:schema.sql:94: ERROR:  text search parser "tsparser" does not exist. What would be the solution for this?
I also created a database in WSL, but I do not understand how to add the pgsql-hackers list to the database, which is required before loading the data from the mbox file.
Could you please share the SQL queries I need to use to insert the list name and list group?

Re: Database setup for pgarchives

From
Sahil Harpal
Date:
> Also I created a database in the WSL but I'm not getting how can I add the pgsql-hackers list in the database which is required to load the data using mbox.
> Could you please share the SQL query that I need to use for the list name and listgroup insertion?

The insertion part is done using the following queries:
  • INSERT INTO listgroups (groupid, groupname, sortkey) VALUES (1, 'Developer lists', 1);
  • INSERT INTO lists (listid, listname, shortdesc, description, active, groupid, subscriber_access) VALUES (1, 'pgsql-hackers', 'pgsql-hackers', 'The PostgreSQL developers team lives here. Discussion of current development issues, problems and bugs, and proposed new features. If your question cannot be answered by people in the other lists, and it is likely that only a developer will know the answer, you may re-post your question in this list. You must try elsewhere first!', True, 1, True); 
Now executing load_message.py gives the following error:
Failed to parse mbox:
b'/bin/sh: 1: formail: not found\n'
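
The message shows that the shell could not find the `formail` program, which is typically shipped with the procmail package. As a minimal sketch, a pre-flight check like the following could be run before the loader. The function name is hypothetical, and the thread only confirms `formail` as a required external tool, so any other entries in the list would be assumptions.

```python
import shutil


def missing_tools(tools=("formail",)):
    """Return the names of the given external programs that are not on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing external tools: " + ", ".join(missing))
    else:
        print("All required external tools were found on PATH.")
```

If `formail` is reported as missing, installing procmail through the system package manager should normally provide it.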