Thread: Perl modules for testing/viewing/corrupting/repairing your heap files

Perl modules for testing/viewing/corrupting/repairing your heap files

From: Mark Dilger
Date:

Hackers,

Recently, as part of testing something else, I had need of a tool to create
surgically precise corruption within heap pages.  I wanted to make the
corruption from within TAP tests, so I wrote the tool as a set of perl modules.

The modules allow you to "tie" a perl array to a heap file, in essence treating
the file as an array of heap pages.  Each page within the file manifests as
a tied perl hash, where each of the page header fields is an element in the
hash, and the tuples in the page are an array of tied hashes, with each field
in the tuple header as a field in that tied hash.

This is all done in pure perl.  There is no XS (eXternal Subroutine) component
to this.
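
For readers unfamiliar with the mechanism, here is a minimal, self-contained
sketch of tying a perl array to a file of fixed-size pages.  It is an
illustration only and not the modules described here: the package name
`PageFile` is hypothetical, pages come back as raw byte strings rather than
tied hashes, and keeping read-only stores in an in-memory copy is an assumption
modeled on the O_RDONLY behavior described in this message.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical illustration only; NOT the HeapFile module from this thread.
package PageFile;
use Tie::Array;
use Fcntl qw(O_RDONLY O_RDWR SEEK_SET);
our @ISA = ('Tie::Array');

# tie @pages, 'PageFile', path => ..., pagesize => ..., mode => O_RDONLY or O_RDWR
sub TIEARRAY {
    my ($class, %arg) = @_;
    my $mode = defined $arg{mode} ? $arg{mode} : O_RDONLY;
    sysopen(my $fh, $arg{path}, $mode) or die "cannot open $arg{path}: $!";
    binmode($fh);
    return bless { fh => $fh, pagesize => $arg{pagesize} || 8192,
                   writable => ($mode == O_RDWR), copy => {} }, $class;
}

# Number of whole pages in the file.
sub FETCHSIZE {
    my ($self) = @_;
    return int((-s $self->{fh}) / $self->{pagesize});
}

# Return the raw bytes of page $i (from the in-memory copy if one exists).
sub FETCH {
    my ($self, $i) = @_;
    return $self->{copy}{$i} if exists $self->{copy}{$i};
    sysseek($self->{fh}, $i * $self->{pagesize}, SEEK_SET) or die "seek: $!";
    my $n = sysread($self->{fh}, my $buf, $self->{pagesize});
    die "short read on page $i" unless defined $n && $n == $self->{pagesize};
    return $buf;
}

# Write page $i back to disk, or only to an in-memory copy when read-only.
sub STORE {
    my ($self, $i, $bytes) = @_;
    die "wrong page length" unless length($bytes) == $self->{pagesize};
    unless ($self->{writable}) { $self->{copy}{$i} = $bytes; return; }
    sysseek($self->{fh}, $i * $self->{pagesize}, SEEK_SET) or die "seek: $!";
    syswrite($self->{fh}, $bytes) == $self->{pagesize} or die "write: $!";
}

sub UNTIE { close $_[0]{fh} }

package main;
use Fcntl qw(O_RDWR);
use File::Temp qw(tempfile);

# Build a two-page scratch file so the sketch runs without a data directory.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode($tmp);
print {$tmp} ("\0" x 8192) . ("\1" x 8192);
close $tmp;

tie my @pages, 'PageFile', path => $path, pagesize => 8192, mode => O_RDWR;
printf "pages: %d\n", scalar @pages;

# Overwrite the first two bytes of page 1, then store the page back.
my $page = $pages[1];
substr($page, 0, 2) = "\xff\xff";
$pages[1] = $page;
printf "first bytes of page 1: %s\n", unpack('H4', $pages[1]);
untie @pages;
```

Decoding each page into a hash of header fields would layer on top of FETCH
and STORE in the same way, using a tied hash per page.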

The body of each tuple (stuff beyond the tuple header) is thought of merely as
binary data.  I haven't done any work to decode it into perl data structures
equivalent to integer, text, timestamp, etc., nor have I needed that
functionality as yet.  That seems doable as an extension of this work, at least
if the caller passes tuple descriptor type information into the `tie @file`
command.

Stuff like the following example works in the implementation already completed.
Note in particular that the file is bound in O_RDWR mode.  That means it all
gets written back to the underlying file and truly updates (corrupts) your
data.  It all also works in O_RDONLY mode, in which case the updates are made
to a copy of the data in perl's memory, but none of it goes back to disk.  Of course,
nothing forces you to update anything.  You could use this to read the fields from
the file/page/tuple without making modifications.

    #!/usr/bin/perl

    use HeapTuple;
    use HeapPage;
    use HeapFile;
    use Fcntl;

    my @file;
    tie @file, 'HeapFile', path => 'base/12925/3599', pagesize => 8192, mode => O_RDWR;
    for my $page (@file)
    {
        $page->{pd_lsn_xrecoff}++;
        print $page->{pd_checksum}, "\n";
        for (@{$page->{'tuples'}})
        {
            $_->{HEAP_COMBOCID} = 1 if ($_->{HEAP_HASNULL});
            $_->{t_xmin} = $_->{t_xmax} if $_->{HEAP_XMAX_COMMITTED};
        }
    }
    untie @file;

In my TAP test usage of these modules, I tend to fall into the pattern of:

    use PostgresNode;
    use HeapFile;
    use Fcntl;

    my $node = get_new_node('master');
    $node->init;
    $node->start;    # the server must be running before safe_psql can connect
    my $pgdata = $node->data_dir;
    $node->safe_psql('postgres', 'create table public.test (bar text)');
    my $path = join('/', $pgdata, $node->safe_psql(
        'postgres', "SELECT pg_relation_filepath('public.test')"));
    $node->stop;

    my @file;
    tie @file, 'HeapFile', path => $path, pagesize => 8192, mode => O_RDWR;
    # do some corruption
    untie @file;     # release the file before restarting, as in the example above

    $node->start;
    # do some queries against the corrupt table, see what happens

For kicks, I just ran this one-liner and got many screenfuls of data.  I'll just include
the tail end:

    perl -e 'use HeapFile; tie @file, "HeapFile", path => "pgdata/base/12925/1255"; print(scalar(%$_)) for(@file);'

BODY AS HEX               ===>  PRINTABLE ASCII
ff 0f 06 00 00 00 00 00   ===>  . . . . . . . .
47 20 00 00 46 06 46 43   ===>  q 2 . . p l p g
49 47 06 05 3f 3d 06 06   ===>  s q l _ c a l l
05 44 3d 06 40 06 41 48   ===>  _ h a n d l e r
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
00 00 50 03 00 00 00 00   ===>  . . . ? . . . .
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
42 00 00 00 00 4c 4b 00   ===>  f . . . . v u .
00 00 00 00 00 08 00 00   ===>  . . . . . . . .
3c 00 00 00 01 00 00 00   ===>  ` . . . . . . .
00 00 00 00 01 00 00 00   ===>  . . . . . . . .
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
02 46 06 46 43 49 47 06   ===>  + p l p g s q l
05 3f 3d 06 06 05 44 3d   ===>  _ c a l l _ h a
06 40 06 41 48 15 18 06   ===>  n d l e r ! $ l
45 3e 40 45 48 02 46 06   ===>  i b d i r / p l
46 43 49 47 06            ===>  p g s q l
b6 01 00 00            t_xmin: 438
00 00 00 00            t_xmax: 0
02 00 00 00          t_field3: 2
00 00                   bi_hi: 0
50 00                   bi_lo: 80
06 00                ip_posid: 6
1d 00             t_infomask2: 29
                        Natts: 29
            HEAP_KEYS_UPDATED: 0
             HEAP_HOT_UPDATED: 0
              HEAP_ONLY_TUPLE: 0
03 0b              t_infomask: 2819
                 HEAP_HASNULL: 1
             HEAP_HASVARWIDTH: 1
             HEAP_HASEXTERNAL: 0
              HEAP_HASOID_OLD: 0
        HEAP_XMAX_KEYSHR_LOCK: 0
                HEAP_COMBOCID: 0
          HEAP_XMAX_EXCL_LOCK: 0
          HEAP_XMAX_LOCK_ONLY: 0
          HEAP_XMIN_COMMITTED: 1
            HEAP_XMIN_INVALID: 1
          HEAP_XMAX_COMMITTED: 0
            HEAP_XMAX_INVALID: 1
           HEAP_XMAX_IS_MULTI: 0
                 HEAP_UPDATED: 0
               HEAP_MOVED_OFF: 0
                HEAP_MOVED_IN: 0
20                     t_hoff: 32
ffff0f06        NULL_BITFIELD: 11111111111111111111000001100
                      OID_OLD:

BODY AS HEX               ===>  PRINTABLE ASCII
ff 0f 06 00 00 00 00 00   ===>  . . . . . . . .
48 20 00 00 46 06 46 43   ===>  r 2 . . p l p g
49 47 06 05 45 06 06 45   ===>  s q l _ i n l i
06 41 05 44 3d 06 40 06   ===>  n e _ h a n d l
41 48 00 00 00 00 00 00   ===>  e r . . . . . .
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
00 00 50 03 00 00 00 00   ===>  . . . ? . . . .
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
42 00 00 01 00 4c 4b 00   ===>  f . . . . v u .
01 00 00 00 00 08 00 00   ===>  . . . . . . . .
46 00 00 00 01 00 00 00   ===>  p . . . . . . .
00 00 00 00 01 00 00 00   ===>  . . . . . . . .
01 00 00 00 00 00 00 00   ===>  . . . . . . . .
00 08 00 00 02 46 06 46   ===>  . . . . / p l p
43 49 47 06 05 45 06 06   ===>  g s q l _ i n l
45 06 41 05 44 3d 06 40   ===>  i n e _ h a n d
06 41 48 15 18 06 45 3e   ===>  l e r ! $ l i b
40 45 48 02 46 06 46 43   ===>  d i r / p l p g
49 47 06                  ===>  s q l
b6 01 00 00            t_xmin: 438
00 00 00 00            t_xmax: 0
03 00 00 00          t_field3: 3
00 00                   bi_hi: 0
50 00                   bi_lo: 80
07 00                ip_posid: 7
1d 00             t_infomask2: 29
                        Natts: 29
            HEAP_KEYS_UPDATED: 0
             HEAP_HOT_UPDATED: 0
              HEAP_ONLY_TUPLE: 0
03 0b              t_infomask: 2819
                 HEAP_HASNULL: 1
             HEAP_HASVARWIDTH: 1
             HEAP_HASEXTERNAL: 0
              HEAP_HASOID_OLD: 0
        HEAP_XMAX_KEYSHR_LOCK: 0
                HEAP_COMBOCID: 0
          HEAP_XMAX_EXCL_LOCK: 0
          HEAP_XMAX_LOCK_ONLY: 0
          HEAP_XMIN_COMMITTED: 1
            HEAP_XMIN_INVALID: 1
          HEAP_XMAX_COMMITTED: 0
            HEAP_XMAX_INVALID: 1
           HEAP_XMAX_IS_MULTI: 0
                 HEAP_UPDATED: 0
               HEAP_MOVED_OFF: 0
                HEAP_MOVED_IN: 0
20                     t_hoff: 32
ffff0f06        NULL_BITFIELD: 11111111111111111111000001100
                      OID_OLD:

BODY AS HEX               ===>  PRINTABLE ASCII
ff 0f 06 00 00 00 00 00   ===>  . . . . . . . .
49 20 00 00 46 06 46 43   ===>  s 2 . . p l p g
49 47 06 05 4c 3d 06 45   ===>  s q l _ v a l i
40 3d 4a 06 48 00 00 00   ===>  d a t o r . . .
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
00 00 50 03 00 00 00 00   ===>  . . . ? . . . .
00 00 00 00 00 00 00 00   ===>  . . . . . . . .
42 00 00 01 00 4c 4b 00   ===>  f . . . . v u .
01 00 00 00 00 08 00 00   ===>  . . . . . . . .
46 00 00 00 01 00 00 00   ===>  p . . . . . . .
00 00 00 00 01 00 00 00   ===>  . . . . . . . .
01 00 00 00 00 00 00 00   ===>  . . . . . . . .
01 00 00 00 19 46 06 46   ===>  . . . . % p l p
43 49 47 06 05 4c 3d 06   ===>  g s q l _ v a l
45 40 3d 4a 06 48 15 18   ===>  i d a t o r ! $
06 45 3e 40 45 48 02 46   ===>  l i b d i r / p
06 46 43 49 47 06         ===>  l p g s q l
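
The flag lines in the dump above can be reproduced from the two raw t_infomask
bytes.  The following is a small sketch, assuming the little-endian on-disk
layout seen in the dump and the flag bit values from
src/include/access/htup_details.h; `decode_infomask` is a hypothetical helper
and not a function of the modules.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Flag names in bit order, lowest bit first (0x0001 through 0x8000).
my @INFOMASK_FLAGS = qw(
    HEAP_HASNULL HEAP_HASVARWIDTH HEAP_HASEXTERNAL HEAP_HASOID_OLD
    HEAP_XMAX_KEYSHR_LOCK HEAP_COMBOCID HEAP_XMAX_EXCL_LOCK HEAP_XMAX_LOCK_ONLY
    HEAP_XMIN_COMMITTED HEAP_XMIN_INVALID HEAP_XMAX_COMMITTED HEAP_XMAX_INVALID
    HEAP_XMAX_IS_MULTI HEAP_UPDATED HEAP_MOVED_OFF HEAP_MOVED_IN
);

# Hypothetical helper: turn the two on-disk bytes into a number and a flag hash.
sub decode_infomask {
    my ($raw) = @_;
    my $mask = unpack('v', $raw);    # 'v' = unsigned 16-bit, little-endian
    my %flags;
    $flags{ $INFOMASK_FLAGS[$_] } = ($mask >> $_) & 1 for 0 .. $#INFOMASK_FLAGS;
    return ($mask, \%flags);
}

# The bytes "03 0b" shown next to t_infomask in the dump.
my ($mask, $flags) = decode_infomask("\x03\x0b");
print "t_infomask: $mask\n";
printf "%30s: %d\n", $_, $flags->{$_} for @INFOMASK_FLAGS;

# The attribute count is the low 11 bits of t_infomask2 (bytes "1d 00").
my $natts = unpack('v', "\x1d\x00") & 0x07FF;
print "Natts: $natts\n";
```

Both HEAP_XMIN_COMMITTED and HEAP_XMIN_INVALID being set, as in these catalog
tuples, is the combination PostgreSQL uses to mark xmin as frozen.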



Is there any interest in this stuff, and if so, where should it live?  I'm happy to
reorganize this a bit if there is general interest in such a submission.


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Perl modules for testing/viewing/corrupting/repairing your heap files

From: Mark Dilger
Date:

Not having received any feedback on this, I've dusted the modules off for submission as-is.



—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Attachments

Re: Perl modules for testing/viewing/corrupting/repairing your heap files

From: Peter Geoghegan
Date:

On Wed, Apr 8, 2020 at 3:51 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
> Recently, as part of testing something else, I had need of a tool to create
> surgically precise corruption within heap pages.  I wanted to make the
> corruption from within TAP tests, so I wrote the tool as a set of perl modules.

There is also pg_hexedit:

https://github.com/petergeoghegan/pg_hexedit

-- 
Peter Geoghegan



Re: Perl modules for testing/viewing/corrupting/repairing your heap files

From: Mark Dilger
Date:

> On Apr 14, 2020, at 6:17 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Wed, Apr 8, 2020 at 3:51 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>> Recently, as part of testing something else, I had need of a tool to create
>> surgically precise corruption within heap pages.  I wanted to make the
>> corruption from within TAP tests, so I wrote the tool as a set of perl modules.
>
> There is also pg_hexedit:
>
> https://github.com/petergeoghegan/pg_hexedit

I steered away from software released under the GPL, such as pg_hexedit, owing to difficulties in getting anything I
develop accepted.  (That's a hard enough problem without licensing issues.)  I'm not taking a political stand for or
against the GPL here, just a pragmatic position that I wouldn't be able to integrate pg_hexedit into a postgres
submission.

(Thanks for writing pg_hexedit, BTW.  I'm not criticizing it.)

The purpose of these perl modules is not the viewing of files, but the intentional and targeted corruption of files
from within TAP tests.  There are limited examples of tests in the postgres source tree that intentionally corrupt
files, and as I read them, they employ a blunt force trauma approach:

In src/bin/pg_basebackup/t/010_pg_basebackup.pl:

> # induce corruption
> system_or_bail 'pg_ctl', '-D', $pgdata, 'stop';
> open $file, '+<', "$pgdata/$file_corrupt1";
> seek($file, $pageheader_size, 0);
> syswrite($file, "\0\0\0\0\0\0\0\0\0");
> close $file;
> system_or_bail 'pg_ctl', '-D', $pgdata, 'start';

In src/bin/pg_checksums/t/002_actions.pl:
>     # Time to create some corruption
>     open my $file, '+<', "$pgdata/$file_corrupted";
>     seek($file, $pageheader_size, 0);
>     syswrite($file, "\0\0\0\0\0\0\0\0\0");
>     close $file;

These blunt force trauma tests are fine, as far as they go.  But I wanted to be able to do things like

        # Corrupt the tuple to look like it has lots of attributes, some of
        # them null.  This falsely creates the impression that the t_bits
        # array is longer than just one byte, but t_hoff still says otherwise.
        $tup->{HEAP_HASNULL} = 1;
        $tup->{HEAP_NATTS_MASK} = 0x3FF;
        $tup->{t_bits} = 0xAA;

or

        # Same as above, but this time t_hoff plays along
        $tup->{HEAP_HASNULL} = 1;
        $tup->{HEAP_NATTS_MASK} = 0x3FF;
        $tup->{t_bits} = 0xAA;
        $tup->{t_hoff} = 32;

That's hard to do from a TAP test without modules like this, as you have to calculate by hand the offsets where you're
going to write the corruption, and the bit pattern you are going to write to that location.  Even if you do all that,
nobody else is likely going to be able to read and maintain your tests.
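
To make the hand calculation concrete, here is a rough sketch of setting HEAP_HASNULL on one tuple with nothing but
seek/read/write.  It assumes a little-endian build, 8192-byte pages, the 24-byte page header, 4-byte line pointers, and
t_infomask at byte 20 of the tuple header; `build_page` and `set_hasnull_by_hand` are hypothetical names, and the
scratch page fills in only the bytes the example reads, so it is not a valid heap page.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use File::Temp qw(tempfile);

use constant {
    PAGE_SIZE        => 8192,
    PAGE_HEADER_SIZE => 24,        # fixed part of the page header
    ITEMID_SIZE      => 4,         # one line pointer
    INFOMASK_OFFSET  => 20,        # t_infomask within the tuple header
    HEAP_HASNULL     => 0x0001,
};

# Hypothetical helper: a scratch page with one line pointer and one tuple.
sub build_page {
    # t_xmin, t_xmax, t_field3, bi_hi, bi_lo, ip_posid, t_infomask2,
    # t_infomask, t_hoff, one pad byte, then a 4-byte body.
    my $tuple = pack('V V V v v v v v C x', 438, 0, 0, 0, 0, 1, 1, 0x0802, 24) . "abcd";
    my $lp_off = PAGE_SIZE - length($tuple);
    # Line pointer bits: lp_off (15), lp_flags (2, 1 = normal), lp_len (15).
    my $itemid = pack('V', $lp_off | (1 << 15) | (length($tuple) << 17));
    my $page = "\0" x PAGE_SIZE;
    substr($page, PAGE_HEADER_SIZE, ITEMID_SIZE) = $itemid;
    substr($page, $lp_off, length($tuple)) = $tuple;
    return $page;
}

# Hypothetical helper: set HEAP_HASNULL on tuple $offnum (1-based) of block $blkno.
sub set_hasnull_by_hand {
    my ($path, $blkno, $offnum) = @_;
    open(my $fh, '+<', $path) or die "open: $!";
    binmode($fh);

    # Step 1: locate the line pointer and take lp_off from its low 15 bits.
    my $lp_pos = $blkno * PAGE_SIZE + PAGE_HEADER_SIZE + ($offnum - 1) * ITEMID_SIZE;
    sysseek($fh, $lp_pos, SEEK_SET) or die "seek: $!";
    sysread($fh, my $raw, ITEMID_SIZE) == ITEMID_SIZE or die "read: $!";
    my $lp_off = unpack('V', $raw) & 0x7FFF;

    # Step 2: read t_infomask, set the bit, and write the two bytes back.
    my $mask_pos = $blkno * PAGE_SIZE + $lp_off + INFOMASK_OFFSET;
    sysseek($fh, $mask_pos, SEEK_SET) or die "seek: $!";
    sysread($fh, $raw, 2) == 2 or die "read: $!";
    my $mask = unpack('v', $raw) | HEAP_HASNULL;
    sysseek($fh, $mask_pos, SEEK_SET) or die "seek: $!";
    syswrite($fh, pack('v', $mask)) == 2 or die "write: $!";
    close $fh;
    return $mask;
}

my ($tmp, $path) = tempfile(UNLINK => 1);
binmode($tmp);
print {$tmp} build_page();
close $tmp;

my $new_mask = set_hasnull_by_hand($path, 0, 1);
printf "t_infomask is now 0x%04x\n", $new_mask;
```

All of that bookkeeping stands in for the single assignment `$tup->{HEAP_HASNULL} = 1;` shown above.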

I'd like an easy way from within TAP tests to selectively corrupt files, to test whether various parts of the system
fail gracefully in the presence of corruption.  What happens when a child partition is corrupted?  Does that impact
queries that only access other partitions?  What kinds of corruption cause pg_upgrade to fail? ...to expand the scope of
the corruption?  What happens to logical replication when there is corruption on the primary? ...on the standby?  What
kinds of corruption cause a query to return data from neighboring tuples that the querying role has no permission to
view?  What happens when a NAS is only intermittently corrupt?

The modules I've submitted thus far are incomplete for this purpose.  They don't yet handle toast tables, btree, hash,
gist, gin, fsm, or vm, and I might be forgetting a few other things in the list.  Before I go and implement all of that,
I thought perhaps others would express preferences about how this should all work, even stuff like, "Don't bother
implementing that in perl, as I'm reimplementing the entire testing structure in COBOL", or similarly unexpected
feedback.


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Perl modules for testing/viewing/corrupting/repairing your heap files

From: Peter Geoghegan
Date:

On Wed, Apr 15, 2020 at 7:22 AM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
> I steered away from software released under the GPL, such as pg_hexedit, owing to difficulties in getting anything I
> develop accepted.  (That's a hard enough problem without licensing issues.)  I'm not taking a political stand for or
> against the GPL here, just a pragmatic position that I wouldn't be able to integrate pg_hexedit into a postgres
> submission.
>
> (Thanks for writing pg_hexedit, BTW.  I'm not criticizing it.)

The only reason that pg_hexedit is under the GPL is that it's derived
from pg_filedump, which was and is also GPL 2. Note that pg_filedump
is hosted on community resources, and is something that index access
methods know about and try not to break (grep for pg_filedump in the
Postgres source code). pg_hexedit supports all index access methods
with the core distribution, including even the unpopular ones, like
SP-GiST.

> That's hard to do from a TAP test without modules like this, as you have to calculate by hand the offsets where
> you're going to write the corruption, and the bit pattern you are going to write to that location.  Even if you do all
> that, nobody else is likely going to be able to read and maintain your tests.

Logical corruption is almost inherently a once-off thing. I think that
a tool like pg_hexedit is useful for seeing how the system behaves
with certain novel kinds of logical corruption, which it will tolerate
to varying degrees and with diverse symptoms. Pretty much for
investigating on a once-off basis.

I have occasionally wished for an SQL-like interface to bufpage.c
routines like PageIndexTupleDelete(), PageRepairFragmentation(), etc.
That would probably be a great deal more maintainable than what you
propose to do. It's not really equivalent, of course, but it would
give tests a way to dynamically manipulate/damage pages at the
"logical level". That seems like the thing that's hard to simulate
right now.

--
Peter Geoghegan