Обсуждение: Bitmask trickiness

Поиск
Список
Период
Сортировка

Bitmask trickiness

От
Howard Rogers
Дата:
Suppose 1=Red, 2=Yellow, 4=Green and 8=Orange.

Now suppose the following data structures and rows exist:

create table coloursample (recid integer, colour integer, descript varchar);
insert into coloursample values (1,2,'Yellow only');
insert into coloursample values (2,10,'Yellow and Orange');
insert into coloursample values (3,11,'Red, Yellow and Orange');
insert into coloursample values (4,12,'Green and Orange');
insert into coloursample values (5,13,'Red, Green and Orange');

Selecting things which have some yellow in them somewhere is easy:

select * from coloursample where colour & 2>0;

It's also easy to find records which have either some yellow or some
orange (or both) in them:

select * from coloursample where colour & 10>0;

But how do I find records which are ONLY yellow and orange, and
exclude records which have some other colour mixed in, in one simple
query without a lot of 'not this, not that' additions, and without
using multiple separate AND tests to nail it down?

I thought to do

select * from coloursample where colour & 10 = 10;

...but that's not right, because it finds the third record is a match.

(The question comes about because I'm trying to test for up to 15
attributes per record. One record has a bitmask value of 21205, say.
That should mean the '1' bit is set (because 21205 =
1+4+16+64+128+512+4096+16384), but when I do the above queries for
21205 & 4098 (which is 4096 + 2, and therefore should not be finding
records with the '1' bit set), the record is being returned because
the 4096 bit is being detected (correctly) as 'on'. I want the query
to only return records where both bits are true, but I don't want to
have to test the records 15 times to find out!).

I suspect the answer is really simple... but I'm having writer's block
today! All help appreciated.

Regards
HJR

Re: Bitmask trickiness

От
Scott Marlowe
Дата:
On Tue, Jul 20, 2010 at 9:59 PM, Howard Rogers <hjr@diznix.com> wrote:
> Suppose 1=Red, 2=Yellow, 4=Green and 8=Orange.
>
> Now suppose the following data structures and rows exist:
>
> create table coloursample (recid integer, colour integer, descript varchar);
> insert into coloursample values (1,2,'Yellow only');
> insert into coloursample values (2,10,'Yellow and Orange');
> insert into coloursample values (3,11,'Red, Yellow and Orange');
> insert into coloursample values (4,12,'Green and Orange');
> insert into coloursample values (5,13,'Red, Green and Orange');

Is there a reason you're avoiding bit strings?
http://www.postgresql.org/docs/8.3/interactive/datatype-bit.html
http://www.postgresql.org/docs/8.3/interactive/functions-bitstring.html

Re: Bitmask trickiness

От
Yeb Havinga
Дата:
Howard Rogers wrote:
> insert into coloursample values (2,10,'Yellow and Orange');
>
> But how do I find records which are ONLY yellow and orange
what about

select * from coloursample where colour = 10;

regards,
Yeb Havinga

Re: Bitmask trickiness

От
Stephen Cook
Дата:
On 7/20/2010 11:59 PM, Howard Rogers wrote:
> But how do I find records which are ONLY yellow and orange, and
> exclude records which have some other colour mixed in, in one simple
> query without a lot of 'not this, not that' additions, and without
> using multiple separate AND tests to nail it down?
> ...
> I suspect the answer is really simple... but I'm having writer's block
> today! All help appreciated.

Just use a regular ol' equals.


Re: Bitmask trickiness

От
Howard Rogers
Дата:
On Wed, Jul 21, 2010 at 9:17 PM, Mathieu De Zutter <mathieu@dezutter.org> wrote:
> On Wed, Jul 21, 2010 at 5:59 AM, Howard Rogers <hjr@diznix.com> wrote:
>> It's also easy to find records which have either some yellow or some
>> orange (or both) in them:
>>
>> select * from coloursample where colour & 10>0;
>>
>> But how do I find records which are ONLY yellow and orange, and
>> exclude records which have some other colour mixed in, in one simple
>> query without a lot of 'not this, not that' additions, and without
>> using multiple separate AND tests to nail it down?
>
> What about:
> WHERE colour & ~10 = 0
>
> Kind regards,
> Mathieu
>

Hi Mathieu:

Yes, that works for the simple case I gave by way of illustration (as
does the bitwise XOR -#- function), so thanks. But neither work very
obviously for the more realistic example I gave. If the stored value
is 21205 and I'm 'probing' it with 4098, the record should not be
returned, because 21205 implies a '1' bit is set, not the '2' bit. But
the # and ~ functions cause a non-negative result to be returned, so I
am none the wiser as to how to spot when or when not to return the
record:

In other words, this one is 'wrong' and should not be in the resultset:

ims=# select 21205 & ~ 4098;
 ?column?
----------
   17109
(1 row)

But this one is 'right' and should be returned:

ims=# select 21205 & ~ 4097;
 ?column?
----------
   17108
(1 row)

But looking at the outcome of both queries, there's nothing that
particularly leaps out at me that screams '17108' is right and '17109'
is wrong.

If I was 'probing' with a simple value (1,2,4,8 etc), then I can just
test for a non-zero return: if it's non-zero, the probe value is
implied by the stored value and the record should be returned. So,
right:

ims=# select 21205 & 1;
 ?column?
----------
       1
(1 row)

And wrong:

ims=# select 21205 & 2;
 ?column?
----------
       0
(1 row)

The minute you start probing with a complex value, however, (that is,
a probe value which is made up of multiple basic values, for example
4098, which is 4096 + 2) interpreting the output of the bitwise
operations becomes more than my head can cope with!

I fear I'm going to have to decompose the probe value supplied by a
user and perform multiple simple probes that match their meaning...
but with 15 possible attributes to deal with, that could get a lot
slower than I was hoping.

Re: Bitmask trickiness

От
Scott Marlowe
Дата:
On Wed, Jul 21, 2010 at 4:41 AM, Howard Rogers <hjr@diznix.com> wrote:
> On Wed, Jul 21, 2010 at 6:08 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>> On Tue, Jul 20, 2010 at 9:59 PM, Howard Rogers <hjr@diznix.com> wrote:
>>> Suppose 1=Red, 2=Yellow, 4=Green and 8=Orange.
>>>
>>> Now suppose the following data structures and rows exist:
>>>
>>> create table coloursample (recid integer, colour integer, descript varchar);
>>> insert into coloursample values (1,2,'Yellow only');
>>> insert into coloursample values (2,10,'Yellow and Orange');
>>> insert into coloursample values (3,11,'Red, Yellow and Orange');
>>> insert into coloursample values (4,12,'Green and Orange');
>>> insert into coloursample values (5,13,'Red, Green and Orange');
>>
>> Is there a reason you're avoiding bit strings?
>> http://www.postgresql.org/docs/8.3/interactive/datatype-bit.html
>> http://www.postgresql.org/docs/8.3/interactive/functions-bitstring.html
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>>
>
>
> Why on Earth would I want to store this sort of stuff in a bit string?!

Because you are manipulating bits and not integers?  I guess there are
10 kinds of people, those who like think in binary and those who
don't.

> I don't know about you, but I find looking at 21205 a darn'd site
> easier than staring blankly at 101001011010101!!

If the fifth bit means one thing, and the 7th bit means something
else, quick which of the following have the fifth bit set and the 7th
bit off:

01001101 (base2)
or
77 (base 10)


> And, fundamentally,
> they mean precisely the same thing.

Of course.  But that wasn't my point, and by my example above, one is
much easier to figure out than the other if you're interested in bit
twiddling.

> And the '&' function works as
> nicely with boring old decimals as it does with long-winded binaries,
> so I really don't see the point of making it more complicated than it
> needs to be -but I'm open to be enlightened on the matter!

I fail to see how storing a binary as a binary and showing it as a
binary makes things more complicated.  But I'm open to someone showing
me how that's true.  At least hex or octal have direct and simple
conversions where each hex or octal digit represents 4 or 3 bits
respectively.  Decimal does not.

> Thanks for the second link though. I hadn't realised that PostgreSQL
> was so richly-endowed with bitwise functions. Specifically, it's got
> the bitwise XOR I was thinking it would be nice to have in these sorts
> of situations:
>
> ims=# select * from coloursample where colour # 10 = 0;
>  recid | colour |     descript
> -------+--------+-------------------
>     1 |     10 | Yellow and Orange
> (1 row)
>
> Not quite sure how to apply that to my more realistic example just
> yet, but I think this will be very helpful, so thank you!

Note you can cast integer to bitstring, but there may be some odd
behaviour for sign bits and such.  Which is again why I'd use the
right type for the job, bit string.  But it's your project.

Re: Bitmask trickiness

От
Scott Marlowe
Дата:
On Wed, Jul 21, 2010 at 9:02 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> If the fifth bit means one thing, and the 7th bit means something
> else, quick which of the following have the fifth bit set and the 7th
> bit off:

That should be fifth bit off and 7th bit on up there ^^^

>
> 01001101 (base2)
> or
> 77 (base 10)

Re: Bitmask trickiness

От
Howard Rogers
Дата:
On Thu, Jul 22, 2010 at 1:02 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:


>>
>> Why on Earth would I want to store this sort of stuff in a bit string?!
>
> Because you are manipulating bits and not integers?  I guess there are
> 10 kinds of people, those who like think in binary and those who
> don't.
>
>> I don't know about you, but I find looking at 21205 a darn'd site
>> easier than staring blankly at 101001011010101!!
>
> If the fifth bit means one thing, and the 7th bit means something
> else, quick which of the following have the fifth bit set and the 7th
> bit off:
>
> 01001101 (base2)
> or
> 77 (base 10)
>
>
>> And, fundamentally,
>> they mean precisely the same thing.
>
> Of course.  But that wasn't my point, and by my example above, one is
> much easier to figure out than the other if you're interested in bit
> twiddling.
>
>> And the '&' function works as
>> nicely with boring old decimals as it does with long-winded binaries,
>> so I really don't see the point of making it more complicated than it
>> needs to be -but I'm open to be enlightened on the matter!
>
> I fail to see how storing a binary as a binary and showing it as a
> binary makes things more complicated.  But I'm open to someone showing
> me how that's true.  At least hex or octal have direct and simple
> conversions where each hex or octal digit represents 4 or 3 bits
> respectively.  Decimal does not.
>
>> Thanks for the second link though. I hadn't realised that PostgreSQL
>> was so richly-endowed with bitwise functions. Specifically, it's got
>> the bitwise XOR I was thinking it would be nice to have in these sorts
>> of situations:
>>
>> ims=# select * from coloursample where colour # 10 = 0;
>>  recid | colour |     descript
>> -------+--------+-------------------
>>     1 |     10 | Yellow and Orange
>> (1 row)
>>
>> Not quite sure how to apply that to my more realistic example just
>> yet, but I think this will be very helpful, so thank you!
>
> Note you can cast integer to bitstring, but there may be some odd
> behaviour for sign bits and such.  Which is again why I'd use the
> right type for the job, bit string.  But it's your project.
>

Quoting...

> Because you are manipulating bits and not integers?  I guess there are
> 10 kinds of people, those who like think in binary and those who
> don't.

Er, no. 21205 is not an integer. It's an encoded bit of magic.


> quick which of the following have the fifth bit set and the 7th
>bit off:

>01001101 (base2)
>or
>77 (base 10)

I'll give you that one (except that the fifth bit isn't set and the
7th bit is on!!). Now repeat for 10 million records and see how
you get on.

Really, I don't have to visually inspect a record to work this stuff
out! So what is easier on your eye for one record is completely
irrelevant as far as my code is concerned!

>one is
>much easier to figure out than the other if you're interested in bit
>twiddling.

As I say, take an incredibly simple example and everything looks, er,
simple. Point of fact, I happen to know that the '1' bit is set in
21205 simply by looking at the last digit and spotting that it's not
even. Fat lot of good that does me when fetching 350,000 records that
happen to match 'insurance claim'.

>I fail to see how storing a binary as a binary and showing it as a
>binary makes things more complicated

Because it's NOT binary. It's an encoding. Whether that encoding is
displayed in binary, fluent hebrew or klingon or imaginary numbers is
really irrelevant to me. It happens that I can look at a decimal
number and work out most things (if the number is 29438, it's a fair
chance the 16384 bit is set, for example; if it's 4098, I know it's
4096 + 2. And so on.). If you're happier working with fancifully long
strings of 1s and 0s, good on you: but it doesn't alter the fact that
I'm working with encoded meanings, not binary digits and I prefer a
nice, compact display of that encoding which doesn't involve
hieroglyphics.

>But it's your project.

Indeed.

Doesn't mean I can do it all on my own, of course. But if it's simply
a question of personal preference, I'll take mine over yours, for
projects I work on, if that's OK. No hard feelings! :-)

Regards
HJR

Re: Bitmask trickiness

От
Peter Hunsberger
Дата:
On Wed, Jul 21, 2010 at 11:13 PM, Howard Rogers <hjr@diznix.com> wrote:
>
> On Thu, Jul 22, 2010 at 1:02 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>
>
> >>
> >> Why on Earth would I want to store this sort of stuff in a bit string?!
> >
> > Because you are manipulating bits and not integers?  I guess there are
> > 10 kinds of people, those who like think in binary and those who
> > don't.
> >
> >> I don't know about you, but I find looking at 21205 a darn'd site
> >> easier than staring blankly at 101001011010101!!

<snip> lots of stuff</snip>

> >
> > Note you can cast integer to bitstring, but there may be some odd
> > behaviour for sign bits and such.  Which is again why I'd use the
> > right type for the job, bit string.  But it's your project.
> >
>
> Quoting...
>
> > Because you are manipulating bits and not integers?  I guess there are
> > 10 kinds of people, those who like think in binary and those who
> > don't.
>
> Er, no. 21205 is not an integer. It's an encoded bit of magic.
>

In that case your database design is fundamentally broken.  A database
should have content fields that map to the needs of the application.
As you describe your application requirements, that is a bit string
and not an integer.  Use bit strings and your application logic is
transparent, obvious and easy to maintain.  Use integers and you have
to resort to "magic".  As you say, it's your choice, but you came here
looking for advice and the advice you were given is very good....

--
Peter Hunsberger

Re: Bitmask trickiness

От
Peter Hunsberger
Дата:
On Thu, Jul 22, 2010 at 4:09 PM, Howard Rogers <hjr@diznix.com> wrote:
>
> On Fri, Jul 23, 2010 at 12:35 AM, Peter Hunsberger
> <peter.hunsberger@gmail.com> wrote:
> > On Wed, Jul 21, 2010 at 11:13 PM, Howard Rogers <hjr@diznix.com> wrote:
> >>
> there's a room-full of users who can look
> at code '4097' and know precisely what it means and would be mortified
> if I suddenly started displaying exactly the same meanings in what, to
> them, would look like utter gibberish.
>

In that case, you shouldn't be describing the column as "some encoded
bit of magic" here.  It clearly has some some semantic meaning which
gives you a reason to want to keep it that way.  Though why your users
are dealing with the raw values as stored in the database may be
another issue to deal with: Personally, I'd say store it in the way
that is easiest for your application logic to deal with, display it in
the form that is easiest for your users to deal with.  The are often
two completely different things...

--
Peter Hunsberger

Re: Bitmask trickiness

От
Howard Rogers
Дата:
On Fri, Jul 23, 2010 at 12:35 AM, Peter Hunsberger
<peter.hunsberger@gmail.com> wrote:
> On Wed, Jul 21, 2010 at 11:13 PM, Howard Rogers <hjr@diznix.com> wrote:
>>
>> On Thu, Jul 22, 2010 at 1:02 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>>
>>
>> >>
>> >> Why on Earth would I want to store this sort of stuff in a bit string?!
>> >
>> > Because you are manipulating bits and not integers?  I guess there are
>> > 10 kinds of people, those who like think in binary and those who
>> > don't.
>> >
>> >> I don't know about you, but I find looking at 21205 a darn'd site
>> >> easier than staring blankly at 101001011010101!!
>
> <snip> lots of stuff</snip>
>
>> >
>> > Note you can cast integer to bitstring, but there may be some odd
>> > behaviour for sign bits and such.  Which is again why I'd use the
>> > right type for the job, bit string.  But it's your project.
>> >
>>
>> Quoting...
>>
>> > Because you are manipulating bits and not integers?  I guess there are
>> > 10 kinds of people, those who like think in binary and those who
>> > don't.
>>
>> Er, no. 21205 is not an integer. It's an encoded bit of magic.
>>
>
> In that case your database design is fundamentally broken.  A database
> should have content fields that map to the needs of the application.
> As you describe your application requirements, that is a bit string
> and not an integer.  Use bit strings and your application logic is
> transparent, obvious and easy to maintain.  Use integers and you have
> to resort to "magic".  As you say, it's your choice, but you came here
> looking for advice and the advice you were given is very good....
>
> --
> Peter Hunsberger

Hi Peter:

It wasn't, as the original poster pointed out, 'advice' that was given
so much as personal preference. Had someone said, 'ah, but you see
storing your 15 meanings in decimal uses up 5 bytes, whereas a
bitstring only requires 15 bits, and over 10,000,000 records, the
saving of 3 bytes per record adds up...', then that would be technical
advice I could listen to, assess and make a call on.

But simply saying "your design is broken... wooooo!" might well scare
the children, but doesn't really do anything for me, because I know
for a certainty that it's not broken at all.

It comes down to this: I can do Boyce-Codd normal form in my sleep
(...and falling asleep happens quite frequent when doing it, strangely
enough), and have been doing so since 1987. I'm certainly not perfect,
but I reckon I can tell from a mile away when one of my designs is
"broken", as you put it -and this one isn't. I haven't even begun to
describe a scintilla of a percentage point of the design decisions
this thing has to deal with, nor the fact that it's been running quite
happily in this manner for a good couple of years... so you'll just
have to take it from me that there's a room-full of users who can look
at code '4097' and know precisely what it means and would be mortified
if I suddenly started displaying exactly the same meanings in what, to
them, would look like utter gibberish.

Unless you, or someone else, can come up with some hard, *technical*
facts as to why working with bitstring encodings of meaning is so much
better than working in decimal, we're sticking with the decimal
representation. I'll buy "you're forever doing implicit casts which
are poor performers" or "implicit casts might break in a future
release" or "it's costing you three bytes per record" ...or anything
else in that vein. But matters of transparency and ease of maintenance
are entirely subjective things (about which I sought no advice at all,
incidentally), and what works for you on those scores doesn't work for
me.

Regards
HJR

Re: Bitmask trickiness

От
Scott Marlowe
Дата:
On Thu, Jul 22, 2010 at 3:09 PM, Howard Rogers <hjr@diznix.com> wrote:
> On Fri, Jul 23, 2010 at 12:35 AM, Peter Hunsberger
> <peter.hunsberger@gmail.com> wrote:
>> On Wed, Jul 21, 2010 at 11:13 PM, Howard Rogers <hjr@diznix.com> wrote:
>>>
>>> On Thu, Jul 22, 2010 at 1:02 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>>>
>>>
>>> >>
>>> >> Why on Earth would I want to store this sort of stuff in a bit string?!
>>> >
>>> > Because you are manipulating bits and not integers?  I guess there are
>>> > 10 kinds of people, those who like think in binary and those who
>>> > don't.
>>> >
>>> >> I don't know about you, but I find looking at 21205 a darn'd site
>>> >> easier than staring blankly at 101001011010101!!
>>
>> <snip> lots of stuff</snip>
>>
>>> >
>>> > Note you can cast integer to bitstring, but there may be some odd
>>> > behaviour for sign bits and such.  Which is again why I'd use the
>>> > right type for the job, bit string.  But it's your project.
>>> >
>>>
>>> Quoting...
>>>
>>> > Because you are manipulating bits and not integers?  I guess there are
>>> > 10 kinds of people, those who like think in binary and those who
>>> > don't.
>>>
>>> Er, no. 21205 is not an integer. It's an encoded bit of magic.
>>>
>>
>> In that case your database design is fundamentally broken.  A database
>> should have content fields that map to the needs of the application.
>> As you describe your application requirements, that is a bit string
>> and not an integer.  Use bit strings and your application logic is
>> transparent, obvious and easy to maintain.  Use integers and you have
>> to resort to "magic".  As you say, it's your choice, but you came here
>> looking for advice and the advice you were given is very good....
>>
>> --
>> Peter Hunsberger
>
> Hi Peter:
>
> It wasn't, as the original poster pointed out, 'advice' that was given
> so much as personal preference. Had someone said, 'ah, but you see
> storing your 15 meanings in decimal uses up 5 bytes, whereas a
> bitstring only requires 15 bits, and over 10,000,000 records, the
> saving of 3 bytes per record adds up...', then that would be technical
> advice I could listen to, assess and make a call on.

You do realize the first page I linked to told you that, right?  It's
not a particularly big page.  I had made the erroneous assumption
you'd read the link I posted.

> But simply saying "your design is broken... wooooo!" might well scare
> the children, but doesn't really do anything for me, because I know
> for a certainty that it's not broken at all.

I asked if there was a reason you were avoiding bit strings.  Hardly a
"your design is broken" point.  You've now said why you are not using
the type that was designed to handle bit strings for bit strings.

I personally would store them as bit strings and change representation
for users.  There are some issues that come up if your bit strings are
long enough to get close to the last bit in an integer (also mentioned
on the links I posted that didn't get read).  But other than that it
should work fine.

Re: Bitmask trickiness

От
Scott Marlowe
Дата:
On Thu, Jul 22, 2010 at 10:27 PM, Howard Rogers <hjr@diznix.com> wrote:
> On Fri, Jul 23, 2010 at 8:37 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>> On Thu, Jul 22, 2010 at 3:09 PM, Howard Rogers <hjr@diznix.com> wrote:
>>> On Fri, Jul 23, 2010 at 12:35 AM, Peter Hunsberger
>>> <peter.hunsberger@gmail.com> wrote:
>>>> On Wed, Jul 21, 2010 at 11:13 PM, Howard Rogers <hjr@diznix.com> wrote:
>>>>>
>>>>> On Thu, Jul 22, 2010 at 1:02 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>>>>>
>>>>>
>>>>> >>
>>>>> >> Why on Earth would I want to store this sort of stuff in a bit string?!
>>>>> >
>>>>> > Because you are manipulating bits and not integers?  I guess there are
>>>>> > 10 kinds of people, those who like think in binary and those who
>>>>> > don't.
>>>>> >
>>>>> >> I don't know about you, but I find looking at 21205 a darn'd site
>>>>> >> easier than staring blankly at 101001011010101!!
>>>>
>>>> <snip> lots of stuff</snip>
>>>>
>>>>> >
>>>>> > Note you can cast integer to bitstring, but there may be some odd
>>>>> > behaviour for sign bits and such.  Which is again why I'd use the
>>>>> > right type for the job, bit string.  But it's your project.
>>>>> >
>>>>>
>>>>> Quoting...
>>>>>
>>>>> > Because you are manipulating bits and not integers?  I guess there are
>>>>> > 10 kinds of people, those who like think in binary and those who
>>>>> > don't.
>>>>>
>>>>> Er, no. 21205 is not an integer. It's an encoded bit of magic.
>>>>>
>>>>
>>>> In that case your database design is fundamentally broken.  A database
>>>> should have content fields that map to the needs of the application.
>>>> As you describe your application requirements, that is a bit string
>>>> and not an integer.  Use bit strings and your application logic is
>>>> transparent, obvious and easy to maintain.  Use integers and you have
>>>> to resort to "magic".  As you say, it's your choice, but you came here
>>>> looking for advice and the advice you were given is very good....
>>>>
>>>> --
>>>> Peter Hunsberger
>>>
>>> Hi Peter:
>>>
>>> It wasn't, as the original poster pointed out, 'advice' that was given
>>> so much as personal preference. Had someone said, 'ah, but you see
>>> storing your 15 meanings in decimal uses up 5 bytes, whereas a
>>> bitstring only requires 15 bits, and over 10,000,000 records, the
>>> saving of 3 bytes per record adds up...', then that would be technical
>>> advice I could listen to, assess and make a call on.
>>
>> You do realize the first page I linked to told you that, right?  It's
>> not a particularly big page.  I had made the erroneous assumption
>> you'd read the link I posted.
>
> If you mean, did I read the bit in the doco where it said nothing at
> all in the 'these are great advantages' style I've just described, but
> instead makes the fairly obvious point that a bit string takes 8 bits
> to store a group of 8 bits (well, stone me!!)

Wow, I'm surprised you get any help with your attitude.  I posted a
link and asked a question and right up front got my head handed to me.

To quote:  "Why on Earth would I want to store this sort of stuff in a
bit string?!

I don't know about you, but I find looking at 21205 a darn'd site
easier than staring blankly at 101001011010101!!"

Like I'd somehow bitten your hand when I asked my question.

> PLUS has extra overhead,
> then yes, I did read that part of your first link... and nevertheless
> concluded that, overall, there is... er, some extra overhead in
> storing bitstrings.

Well, your initial answer certainly didn't give ANY idea that you'd
read that page.

> So what precisely about that first article, which I did indeed read,
> would you have expected to lead me to the conclusion that I'd SAVE
> significant amounts of space or find some other technically-compelling
> reason for switching?

I didn't expect such.  I asked why you weren't using them, and gave
you some links to read on it.  It clearly states that bit strings use
a bit per bit, plus some overhead.  Now, I had no idea if you were
dealing with bigints and 60 bit strings or 5 bit strings.  In fact,
you did little to really describe your project and preferences in your
post.  Which is why my response was short and concise, I had little to
go on.

> My point is that there's nothing much in it, storage-wise, either way.

Well, there is the fact that bit strings can restrict the size of the
entry so you don't accidentally get an int stored that's got more bits
than your model can handle.  There's also the issue that if / when you
ever get close to the last bit in an int bitstring may behave oddly
because of sign issues.

> So there's no compelling technical reason to switch.

I never said there was.  I simply asked a question, and got my hand bitten.

> And without a
> technically-compelling reason, the rest of the post I was referring to
> simply boiled down, as far as I could tell, to a matter of personal
> preference. No less valid for that, of course. But ultimately, not
> something that would hold much sway with me.

Sure, fine, whatever you want.  I wasn't trying to convince you either
way.  I do think using the right type for the job makes more sense,
but again, it's personal preference.

>>> But simply saying "your design is broken... wooooo!" might well scare
>>> the children, but doesn't really do anything for me, because I know
>>> for a certainty that it's not broken at all.
>>
>> I asked if there was a reason you were avoiding bit strings.  Hardly a
>> "your design is broken" point.
>
> I'm getting a bit fed up of this thread now. It wasn't YOU that ever
> said 'the design is broken', and I never suggested it was. That was
> Peter Hunsberger, about three posts up in the thread, who wrote "In
> that case your database design is fundamentally broken."

Well, pardon me for misunderstanding.  I'll stop the conversation
after this message too.   you do realize that paragraph immediately
followed the one referring to my post.

> If you're going to take umbrage at something, please take umbrage at
> things that were actually directed at you in the first place!

Wait a second, do I work for you or something?  Cause I could swear
this was a forum where equals discussed issues with each other.  I'll
try to do better from now on when I provide free advice to people.  I
wouldn't want to disappoint you.

>> You've now said why you are not using
>> the type that was designed to handle bit strings for bit strings.
>>
>> I personally would store them as bit strings and change representation
>> for users.
>
> I'm a user, too. I get to see this stuff every time I do a select
> statement. At the command line. Which I use a lot.

I guess 20 years of teaching and working on digital electronics and
assembly code left me quite a bit more comfortable with binary
strings.  I'd be quite comfortable with them on the command line.

>> There are some issues that come up if your bit strings are
>> long enough to get close to the last bit in an integer (also mentioned
>> on the links I posted that didn't get read).
>
> Don't make false assumptions about other people, please. You don't
> know what I read or didn't read.

No I didn't.  But it certainly seemed to me like you hadn't.  From now
on I'll only comment on exactly what you've posted.  I'll go one
better and just stop after this post.

> Just because you didn't make a
> compelling technical argument in favour of bitstrings doesn't mean I
> didn't read the article you linked to ...that also didn't make a
> compelling technical argument in favour of bitstrings.

Who's job is it to understand this software and make compelling
arguments?  Again, I'm not a paid consultant, I don't spend hours each
day carefully crafting arguments to make you happy.  I asked one
freakin question, gave two links and got nothing but attitude.

I'd be ever so happy if you took your attitude and went back to Oracle use.

>> But other than that it
>> should work fine.
>
> Yes, I know. I've only been using this technique for five years on
> Oracle! I would be very surprised indeed if it wasn't transferrable to
> PostgreSQL.

Me too.  But I think you'd be way happier on Oracle.

> Still doesn't answer the precise, specific technical question I
> actually asked, though, does it?!

Which was answered by Stephen Cook was it not?  I.e. use plain old equals?

> And since there is indeed no technical content in these continued
> to-and-fro posts, I'll be leaving it there, if that's OK.

That would be great.  Hope to not be seeing you around.

> Thanks for your contributions to date. I did read them (including both
> of the links you included), and the second one especially was
> enlightening, for which I am grateful.

great.

Re: Bitmask trickiness

От
Howard Rogers
Дата:
On Fri, Jul 23, 2010 at 8:37 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Thu, Jul 22, 2010 at 3:09 PM, Howard Rogers <hjr@diznix.com> wrote:
>> On Fri, Jul 23, 2010 at 12:35 AM, Peter Hunsberger
>> <peter.hunsberger@gmail.com> wrote:
>>> On Wed, Jul 21, 2010 at 11:13 PM, Howard Rogers <hjr@diznix.com> wrote:
>>>>
>>>> On Thu, Jul 22, 2010 at 1:02 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>>>>
>>>>
>>>> >>
>>>> >> Why on Earth would I want to store this sort of stuff in a bit string?!
>>>> >
>>>> > Because you are manipulating bits and not integers?  I guess there are
>>>> > 10 kinds of people, those who like think in binary and those who
>>>> > don't.
>>>> >
>>>> >> I don't know about you, but I find looking at 21205 a darn'd site
>>>> >> easier than staring blankly at 101001011010101!!
>>>
>>> <snip> lots of stuff</snip>
>>>
>>>> >
>>>> > Note you can cast integer to bitstring, but there may be some odd
>>>> > behaviour for sign bits and such.  Which is again why I'd use the
>>>> > right type for the job, bit string.  But it's your project.
>>>> >
>>>>
>>>> Quoting...
>>>>
>>>> > Because you are manipulating bits and not integers?  I guess there are
>>>> > 10 kinds of people, those who like think in binary and those who
>>>> > don't.
>>>>
>>>> Er, no. 21205 is not an integer. It's an encoded bit of magic.
>>>>
>>>
>>> In that case your database design is fundamentally broken.  A database
>>> should have content fields that map to the needs of the application.
>>> As you describe your application requirements, that is a bit string
>>> and not an integer.  Use bit strings and your application logic is
>>> transparent, obvious and easy to maintain.  Use integers and you have
>>> to resort to "magic".  As you say, it's your choice, but you came here
>>> looking for advice and the advice you were given is very good....
>>>
>>> --
>>> Peter Hunsberger
>>
>> Hi Peter:
>>
>> It wasn't, as the original poster pointed out, 'advice' that was given
>> so much as personal preference. Had someone said, 'ah, but you see
>> storing your 15 meanings in decimal uses up 5 bytes, whereas a
>> bitstring only requires 15 bits, and over 10,000,000 records, the
>> saving of 3 bytes per record adds up...', then that would be technical
>> advice I could listen to, assess and make a call on.
>
> You do realize the first page I linked to told you that, right?  It's
> not a particularly big page.  I had made the erroneous assumption
> you'd read the link I posted.

If you mean, did I read the bit in the doco where it said nothing at
all in the 'these are great advantages' style I've just described, but
instead makes the fairly obvious point that a bit string takes 8 bits
to store a group of 8 bits (well, stone me!!) PLUS has extra overhead,
then yes, I did read that part of your first link... and nevertheless
concluded that, overall, there is... er, some extra overhead in
storing bitstrings.

So what precisely about that first article, which I did indeed read,
would you have expected to lead me to the conclusion that I'd SAVE
significant amounts of space or find some other technically-compelling
reason for switching?

My point is that there's nothing much in it, storage-wise, either way.
So there's no compelling technical reason to switch. And without a
technically-compelling reason, the rest of the post I was referring to
simply boiled down, as far as I could tell, to a matter of personal
preference. No less valid for that, of course. But ultimately, not
something that would hold much sway with me.

>> But simply saying "your design is broken... wooooo!" might well scare
>> the children, but doesn't really do anything for me, because I know
>> for a certainty that it's not broken at all.
>
> I asked if there was a reason you were avoiding bit strings.  Hardly a
> "your design is broken" point.

I'm getting a bit fed up of this thread now. It wasn't YOU that ever
said 'the design is broken', and I never suggested it was. That was
Peter Hunsberger, about three posts up in the thread, who wrote "In
that case your database design is fundamentally broken."

If you're going to take umbrage at something, please take umbrage at
things that were actually directed at you in the first place!

> You've now said why you are not using
> the type that was designed to handle bit strings for bit strings.
>
> I personally would store them as bit strings and change representation
> for users.

I'm a user, too. I get to see this stuff every time I do a select
statement. At the command line. Which I use a lot.

> There are some issues that come up if your bit strings are
> long enough to get close to the last bit in an integer (also mentioned
> on the links I posted that didn't get read).

Don't make false assumptions about other people, please. You don't
know what I read or didn't read. Just because you didn't make a
compelling technical argument in favour of bitstrings doesn't mean I
didn't read the article you linked to ...that also didn't make a
compelling technical argument in favour of bitstrings.

> But other than that it
> should work fine.

Yes, I know. I've only been using this technique for five years on
Oracle! I would be very surprised indeed if it wasn't transferrable to
PostgreSQL.

Still doesn't answer the precise, specific technical question I
actually asked, though, does it?!

And since there is indeed no technical content in these continued
to-and-fro posts, I'll be leaving it there, if that's OK.

Thanks for your contributions to date. I did read them (including both
of the links you included), and the second one especially was
enlightening, for which I am grateful.

Regards
HJR

Re: Bitmask trickiness

От
Howard Rogers
Дата:
On Fri, Jul 23, 2010 at 3:02 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>> If you mean, did I read the bit in the doco where it said nothing at
>> all in the 'these are great advantages' style I've just described, but
>> instead makes the fairly obvious point that a bit string takes 8 bits
>> to store a group of 8 bits (well, stone me!!)
>
> Wow, I'm surprised you get any help with your attitude.  I posted a
> link and asked a question and right up front got my head handed to me.
>
> To quote:  "Why on Earth would I want to store this sort of stuff in a
> bit string?!
>
> I don't know about you, but I find looking at 21205 a darn'd site
> easier than staring blankly at 101001011010101!!"
>
> Like I'd somehow bitten your hand when I asked my question.

That is not attitude, Scott. That is me asking a question riddled with
rhetorical exclamations.

You asked me why I wasn't using bitstrings. I asked why on Earth I
would. That's all. The only "attitude" you could reasonably detect
there, I think, is surprise that when I ask a precise technical
question, someone would feel it fair game to question the entire basis
of a database design about which they know extremely little (which is
not their -or your- fault, because you only get to know the details I
include in the question, after all. Fact remains, on such slim
foundations, I wouldn't build a mountain of questioning the reasons
for or behind someone else's work).

>> PLUS has extra overhead,
>> then yes, I did read that part of your first link... and nevertheless
>> concluded that, overall, there is... er, some extra overhead in
>> storing bitstrings.
>
> Well, your initial answer certainly didn't give ANY idea that you'd
> read that page.

It neither gave the impression I had, nor gave any indication that I
hadn't! That's the point: you've assumed something you needn't have.

>> So what precisely about that first article, which I did indeed read,
>> would you have expected to lead me to the conclusion that I'd SAVE
>> significant amounts of space or find some other technically-compelling
>> reason for switching?
>
> I didn't expect such.  I asked why you weren't using them, and gave
> you some links to read on it.  It clearly states that bit strings use
> a bit per bit, plus some overhead.  Now, I had no idea if you were
> dealing with bigints and 60 bit strings or 5 bit strings.  In fact,
> you did little to really describe your project and preferences in your
> post.  Which is why my response was short and concise, I had little to
> go on.

Exactly. Now, I grant you it's difficult when you can't see me; when
there's only the written word to go on; when you don't know my
personal modes of expression, nor I yours. But that's precisely why I
wouldn't answer a narrowly-scoped technical question with even a hint
of a suggestion that the entire basis of the question was suspect: I
didn't give you any details of my project or preferences, because I
was asking a precise, narrowly-focussed question about getting one
specific result with one specific query structure.

Honestly, when I want general consulting, I pay for it. You really
don't have to try and give it away for free.

>> My point is that there's nothing much in it, storage-wise, either way.
>
> Well, there is the fact that bit strings can restrict the size of the
> entry so you don't accidentally get an int stored that's got more bits
> than your model can handle.

Well, that's fortunately not something I have to worry about.
Somewhere around the 192th bit, I'll start panicking. When I've only
got 15 of the blighters to worry about, I think I'll cope.

>There's also the issue that if / when you
> ever get close to the last bit in an int bitstring may behave oddly
> because of sign issues.

And also something I don't have to worry about.

>> So there's no compelling technical reason to switch.
>
> I never said there was.  I simply asked a question, and got my hand bitten.

No, you got a question asked back at you: *WHY* would you think I
should be using a bitstring? It was an honest question, basically
wondering if there's something about bitstrings that make them such a
great idea. It's not a data type Oracle users are greatly familiar
with, you see.

>> And without a
>> technically-compelling reason, the rest of the post I was referring to
>> simply boiled down, as far as I could tell, to a matter of personal
>> preference. No less valid for that, of course. But ultimately, not
>> something that would hold much sway with me.
>
> Sure, fine, whatever you want.  I wasn't trying to convince you either
> way.  I do think using the right type for the job makes more sense,
> but again, it's personal preference.

Indeed. Although in this case, it's not even the right type for the job.

>>>> But simply saying "your design is broken... wooooo!" might well scare
>>>> the children, but doesn't really do anything for me, because I know
>>>> for a certainty that it's not broken at all.
>>>
>>> I asked if there was a reason you were avoiding bit strings.  Hardly a
>>> "your design is broken" point.
>>
>> I'm getting a bit fed up of this thread now. It wasn't YOU that ever
>> said 'the design is broken', and I never suggested it was. That was
>> Peter Hunsberger, about three posts up in the thread, who wrote "In
>> that case your database design is fundamentally broken."
>
> Well, pardon me for misunderstanding.  I'll stop the conversation
> after this message too.   you do realize that paragraph immediately
> followed the one referring to my post.

Well, the line *I* wrote that "simply saying "your design is broken...
wooooo!" might well scare the children..." and so on, was the SECOND
paragraph of a reply I made to a post made by Peter. It came
immediately after a paragraph *Peter* had written (the first line of
which was 'your design is broken'.  And that came immediately after a
line I'd written that said "It's not an integer. It's an encoded bit
of magic". So I make that at least four paragraphs of content before
your stuff even gets close...

Maybe we read this stuff in different readers that thread differently?

>
>> If you're going to take umbrage at something, please take umbrage at
>> things that were actually directed at you in the first place!
>
> Wait a second, do I work for you or something?  Cause I could swear
> this was a forum where equals discussed issues with each other.

It would be nice if it was. Instead, it seems to be a place where you
get to question all sorts of things you self-confessedly know not much
about and then blame me for having 'attitude' when I ever-so-gently
try to point out that learning about your personal preferences was not
why I was writing here!

>I'll
> try to do better from now on when I provide free advice to people.  I
> wouldn't want to disappoint you.

All I said was, please try to only take offence at things which you
could genuinely take offence at! The comment about 'design is broken'
was made by Peter, not you. My firm -but hopefully polite- rebuttal of
the idea the design is broken was aimed at him, not you. Therefore,
for you to pick up on the point as say "I hardly meant your design was
broken" was something of a complete non sequiteur on your part, I
felt.

>>> You've now said why you are not using
>>> the type that was designed to handle bit strings for bit strings.
>>>
>>> I personally would store them as bit strings and change representation
>>> for users.
>>
>> I'm a user, too. I get to see this stuff every time I do a select
>> statement. At the command line. Which I use a lot.
>
> I guess 20 years of teaching and working on digital electronics and
> assembly code left me quite a bit more comfortable with binary
> strings.  I'd be quite comfortable with them on the command line.

Oh, come now. A veiled appeal to authority... I'm sure you can do
better than that. I've been teaching and working with databases since
1987 if we're trading 'years of experience'.

>>> There are some issues that come up if your bit strings are
>>> long enough to get close to the last bit in an integer (also mentioned
>>> on the links I posted that didn't get read).
>>
>> Don't make false assumptions about other people, please. You don't
>> know what I read or didn't read.
>
> No I didn't.  But it certainly seemed to me like you hadn't.

Which is an assumption on your part. As I said...

From now
> on I'll only comment on exactly what you've posted.  I'll go one
> better and just stop after this post.

An excellent idea for us both, I think.

>> Just because you didn't make a
>> compelling technical argument in favour of bitstrings doesn't mean I
>> didn't read the article you linked to ...that also didn't make a
>> compelling technical argument in favour of bitstrings.
>
> Who's job is it to understand this software and make compelling
> arguments?  Again, I'm not a paid consultant, I don't spend hours each
> day carefully crafting arguments to make you happy.  I asked one
> freakin question, gave two links and got nothing but attitude.

You asked one question, got an answer, and have leapt to conclusions
ever since.

> I'd be ever so happy if you took your attitude and went back to Oracle use.

Your right and privilege, naturally.

>>> But other than that it
>>> should work fine.
>>
>> Yes, I know. I've only been using this technique for five years on
>> Oracle! I would be very surprised indeed if it wasn't transferrable to
>> PostgreSQL.
>
> Me too.  But I think you'd be way happier on Oracle.

I am. It's because I'm learning, and that's always a tricky
experience, and one is generally happier with what one is comfortable
with. But I actually think PostgreSQL is a spectacularly good RDBMS,
and I'll persist, if that's OK by you. When I need help, I'll ask for
it. If I'm given it, I'll be grateful for it. When I'm offered
personal opinion on things I don't need assistance with, I'll try and
deflect it as nicely as I know how. Whether someone will take the hint
or not is another matter, I guess.

>> Still doesn't answer the precise, specific technical question I
>> actually asked, though, does it?!
>
> Which was answered by Stephen Cook was it not?  I.e. use plain old equals?

Maybe I should assume you haven't read the thread, then?! God knows
what that answer even actually meant, but hopefully you read my reply
where I pointed out that it's no answer at all. 21205 & 4098 = what,
precisely? Never mind: another rhetorical question. Plain old equals
doesn't come close.

>> And since there is indeed no technical content in these continued
>> to-and-fro posts, I'll be leaving it there, if that's OK.
>
> That would be great.  Hope to not be seeing you around.

As I say, it's your right and privilege to feel that way. And if it
makes you feel any happier, please feel free to ignore any future
posts I might make. Regrettable, really, since I had nothing but
gratitude for your original post, whatever you might think.

Regards
HJR

Re: Bitmask trickiness

От
Stephen Cook
Дата:
On 7/23/2010 2:38 AM, Howard Rogers wrote:
>>> Still doesn't answer the precise, specific technical question I
>>> >>  actually asked, though, does it?!
>> >
>> >  Which was answered by Stephen Cook was it not?  I.e. use plain old equals?
> Maybe I should assume you haven't read the thread, then?! God knows
> what that answer even actually meant, but hopefully you read my reply
> where I pointed out that it's no answer at all. 21205&  4098 = what,
> precisely? Never mind: another rhetorical question. Plain old equals
> doesn't come close.
>

Hate to interrupt your flame war, and I apologize for not being precise
in my meaning first try... You don't need any bitwise anything to
compare two bitmasks-hiding-in-integers, just check for equality.

Instead of "select * from coloursample where colour & 10 = 10;" just try
"select * from coloursample where colour = 10;".

If you want to probe for two values, that MUST be in there, and WITHOUT
anything else, bitwise OR them together as the probe value and use plain
old equals there too. You only need the bitwise AND stuff for checking
for a value that MUST be in there, regardless of whether or not other
values are in there as well.

Hope I was clearer this time. Originally I just fired off a quickie
email to get you past your coder's block.

-- Stephen

Re: Bitmask trickiness

От
Alban Hertroys
Дата:
> I thought to do
>
> select * from coloursample where colour & 10 = 10;
>
> ...but that's not right, because it finds the third record is a match.


What's not entirely clear to me is whether you only want to find colours that have BOTH Yellow and Orange set and
nothingelse, or colours that have EITHER Yellow and Orange set and nothing else. 

The first case has been answered by Stephen (use a straight 'equals'). The other case is a bit more complicated.

That 11 matches using "& 10" is because you filtered out all the other bits in your comparison by anding them with '0',
whilethey /are/ relevant: they aren't allowed to be '1' after all. You probably need to look at the inverted versions
ofthese numbers to get what you need. 

My bit-foo is a bit rusty, but this looks like what you need (I used bit-strings for my own convenience):

development=> select (~ '01010'::bit(5)) | '01010'::bit(5) = '11111'::bit(5);
 ?column?
----------
 t
(1 row)

development=> select (~ '01011'::bit(5)) | '01010'::bit(5) = '11111'::bit(5);
 ?column?
----------
 f
(1 row)

development=> select (~ '01110'::bit(5)) | '01010'::bit(5) = '11111'::bit(5);
 ?column?
----------
 f
(1 row)

development=> select (~ '11010'::bit(5)) | '01010'::bit(5) = '11111'::bit(5);
 ?column?
----------
 f
(1 row)

development=> select (~ '00010'::bit(5)) | '01010'::bit(5) = '11111'::bit(5);
 ?column?
----------
 t
(1 row)

development=> select (~ '01000'::bit(5)) | '01010'::bit(5) = '11111'::bit(5);
 ?column?
----------
 t
(1 row)


Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


!DSPAM:737,4c49503f286213027486771!



Re: Bitmask trickiness

От
Stephen Cook
Дата:
On 7/23/2010 5:33 AM, Howard Rogers wrote:
> ...so select * from table where 21205 | 4097 = 21205 would correctly
> grab that record. So I'm assuming you mean the 'stored value' should
> be on both sides of the equals test. If so, that would indeed seem to
> be the ultimate answer to the question (though I wouldn't myself call
> it a 'plain old equals' :-) )
>
>> Hope I was clearer this time. Originally I just fired off a quickie email to
>> get you past your coder's block.
>
> I do indeed think the magic of "BIT OR" is the missing ingredient I
> was looking for, and I very much appreciate your help leading me to
> it. My apologies for being too dense to spot what you were talking
> about before.

I think I misunderstood you the whole time actually, or maybe was
injecting some of my other thoughts into your problem. I figured you
meant you wanted to find records where your probe value has exactly the
same bit pattern as your stored value (probe bits, and only probe bits,
set; hence the "plain old equals"). Rather (and I just confirmed this
looking at the OP) you want any records where the stored value has all
of the probe value's bits set, regardless of the other bits in the
stored value.

So yeah, check if ORing the stored and probe values equals the stored value.

Oh well, even if I misread, glad to help you stumble upon what you
wanted eventually.

-- Stephen

Re: Bitmask trickiness

От
Howard Rogers
Дата:
On Fri, Jul 23, 2010 at 7:57 PM, Stephen Cook <sclists@gmail.com> wrote:
> On 7/23/2010 5:33 AM, Howard Rogers wrote:
>>
>> ...so select * from table where 21205 | 4097 = 21205 would correctly
>> grab that record. So I'm assuming you mean the 'stored value' should
>> be on both sides of the equals test. If so, that would indeed seem to
>> be the ultimate answer to the question (though I wouldn't myself call
>> it a 'plain old equals' :-) )
>>
>>> Hope I was clearer this time. Originally I just fired off a quickie email
>>> to
>>> get you past your coder's block.
>>
>> I do indeed think the magic of "BIT OR" is the missing ingredient I
>> was looking for, and I very much appreciate your help leading me to
>> it. My apologies for being too dense to spot what you were talking
>> about before.
>
> I think I misunderstood you the whole time actually, or maybe was injecting
> some of my other thoughts into your problem. I figured you meant you wanted
> to find records where your probe value has exactly the same bit pattern as
> your stored value (probe bits, and only probe bits, set; hence the "plain
> old equals"). Rather (and I just confirmed this looking at the OP) you want
> any records where the stored value has all of the probe value's bits set,
> regardless of the other bits in the stored value.
>
> So yeah, check if ORing the stored and probe values equals the stored value.
>
> Oh well, even if I misread, glad to help you stumble upon what you wanted
> eventually.
>
> -- Stephen
>


No worries. We got there in the end!

Thanks again,
HJR

Re: Bitmask trickiness

От
Howard Rogers
Дата:
On Fri, Jul 23, 2010 at 6:17 PM, Alban Hertroys
<dalroi@solfertje.student.utwente.nl> wrote:
>> I thought to do
>>
>> select * from coloursample where colour & 10 = 10;
>>
>> ...but that's not right, because it finds the third record is a match.
>
>
> What's not entirely clear to me is whether you only want to find colours that have BOTH Yellow and Orange set and
nothingelse, or colours that have EITHER Yellow and Orange set and nothing else. 
>
> The first case has been answered by Stephen (use a straight 'equals'). The other case is a bit more complicated.
>
> That 11 matches using "& 10" is because you filtered out all the other bits in your comparison by anding them with
'0',while they /are/ relevant: they aren't allowed to be '1' after all. You probably need to look at the inverted
versionsof these numbers to get what you need. 
>
> My bit-foo is a bit rusty,

Hehe. Mine too, it would seem!

>but this looks like what you need (I used bit-strings for my own convenience):
>
> development=> select (~ '01010'::bit(5)) | '01010'::bit(5) = '11111'::bit(5);
>  ?column?
> ----------
>  t
> (1 row)
>
> development=> select (~ '01011'::bit(5)) | '01010'::bit(5) = '11111'::bit(5);
>  ?column?
> ----------
>  f
> (1 row)
>
> development=> select (~ '01110'::bit(5)) | '01010'::bit(5) = '11111'::bit(5);
>  ?column?
> ----------
>  f
> (1 row)
>
> development=> select (~ '11010'::bit(5)) | '01010'::bit(5) = '11111'::bit(5);
>  ?column?
> ----------
>  f
> (1 row)
>
> development=> select (~ '00010'::bit(5)) | '01010'::bit(5) = '11111'::bit(5);
>  ?column?
> ----------
>  t
> (1 row)
>
> development=> select (~ '01000'::bit(5)) | '01010'::bit(5) = '11111'::bit(5);
>  ?column?
> ----------
>  t
> (1 row)
>
>
> Alban Hertroys


Thanks Alban. Steve, too, has joined in again above: the bit-wise OR,
together with an equality test on the stored value, would seem to be
what's called for.

Appreciate the contribution.

I think we can all go home now!!  :-)

Regards
HJR

Re: Bitmask trickiness

От
Howard Rogers
Дата:
> Hate to interrupt your flame war, and I apologize for not being precise in
> my meaning first try... You don't need any bitwise anything to compare two
> bitmasks-hiding-in-integers, just check for equality.
>
> Instead of "select * from coloursample where colour & 10 = 10;" just try
> "select * from coloursample where colour = 10;".

Which works for that example. But please consider the more realistic
one I also posted. The stored value is 21205 (I think... it was all a
long time ago). I probe that with 4098. I do NOT want that returned,
because the '2' bit is not set in the 21205 value.

So select * from table where 21205=4098 won't cut it, will it?

> If you want to probe for two values, that MUST be in there, and WITHOUT
> anything else, bitwise OR them together as the probe value and use plain old
> equals there too. You only need the bitwise AND stuff for checking for a
> value that MUST be in there, regardless of whether or not other values are
> in there as well.

A Bitwise OR? Ah ha: I think that might be the key:

ims=# select 21205 | 4098;
 ?column?
----------
    21207
(1 row)

So again, what exactly am I supposed to test for here? I mean, select
* from table where 21205 | 4098 = 21205 would do it, I suppose,
because that would correctly reject the row. But so would select *
from table where 21205 | 4098 = 4098. However, if I stumble on a bit
further, I do manage this:

ims=# select 21205 | 4097;
 ?column?
----------
    21205
(1 row)

...so select * from table where 21205 | 4097 = 21205 would correctly
grab that record. So I'm assuming you mean the 'stored value' should
be on both sides of the equals test. If so, that would indeed seem to
be the ultimate answer to the question (though I wouldn't myself call
it a 'plain old equals' :-) )

> Hope I was clearer this time. Originally I just fired off a quickie email to
> get you past your coder's block.

I do indeed think the magic of "BIT OR" is the missing ingredient I
was looking for, and I very much appreciate your help leading me to
it. My apologies for being too dense to spot what you were talking
about before.

Regards
HJR

Re: Bitmask trickiness

От
Greg Smith
Дата:
Howard Rogers wrote:
> That's the point: you've assumed something you needn't have.
>

You seem to have assumed that Scott was trying to be a jerk here, when
he was just trying to help you out by suggesting a feature in PostgreSQL
you may not have been familiar with, one that makes this particular sort
of job significantly easier to do.

> Honestly, when I want general consulting, I pay for it. You really
> don't have to try and give it away for free.
>

If you don't want general suggestions going beyond what you might have
specifically asked about, I'm afraid you are on the wrong set of mailing
lists.  People pop up here every day asking very specific things that
suggest they are in fact going about something in completely the wrong
way.  That makes it pretty common for one's general motive to be asked
about.  Also, PostgreSQL is so large that it's easy for people to not be
aware of major pieces to it yet, so there's usually some questions to
try and feel that out too.

The attitude you should have here is "if I want to be able to tell
people exactly what they should and shouldn't say, I'll have to pay for
that".  Community PostgreSQL support is not going to shut up and assume
you know what you're doing unless you prove that with your comments.

P.S. This little "I've been doing this for X long" pissing game is going
to end making everyone look like n00bs when Tom gets back.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


Re: Bitmask trickiness

От
Scott Marlowe
Дата:
On Fri, Jul 23, 2010 at 10:04 AM, Greg Smith <greg@2ndquadrant.com> wrote:
> P.S. This little "I've been doing this for X long" pissing game is going to
> end making everyone look like n00bs when Tom gets back.

No pissing match on my end.  I honestly feel more comfortable working
with these kinds of things in binary than decimal, and that's all my
comment meant.

Re: Bitmask trickiness

От
John R Pierce
Дата:

the universal solution is a AND with one mask (which has a 1 in every
position you wish to test for and a zero in each position you wish to
ignore) and an XOR with another mask (that has a 1 in each position that
you want to test for a 1 and a zero in each position that you wish to
test for a 0)), then a test if the result is == zero.

in a language like SQL, this could be simplified to a AND with MASK1 and
a COMPARE for equality with MASK2, with the same result.

-jrp
  (old assembler programmer)