--- Michael Fuhr <mike@fuhr.org> wrote:
> On Mon, Aug 15, 2005 at 01:48:00PM -0700, CSN wrote:
> > db=>select ascii('');
> > ascii
> > -------
> > 226
> >
> > db=>select id from news where body ilike '%%';
> > (0 rows)
> >
> > db=>select id from news where body ilike '%' ||
> > chr(226) || '%';
> > db'>
> > db'>^C
> > db=>
>
> What's going on with the last query? The prompt
> change suggests
> that psql is confused with quoting, and the ^C looks
> like you hit
> Control-C to get the regular prompt back. Did you
> ever run this
> query? If it produced no rows then you could widen
> the search.
Hmm, I'm on another computer and I just tried that
last query and it worked without psql thinking it
needed another single quote. Appears the chr code is
146 not 226 (turns out chr(226) is â - why that
doesn't cause problems with iso-8859-1/utf-8 xml and
the single/double quotes and dashes do I don't know).
Anyhow, I ended up doing this:
update news set body=replace(body,chr(146),''''); --
left single quote
update news set body=replace(body,chr(145),''''); --
right single quote
update news set body=replace(body,chr(147),'"'); --
left double quote
update news set body=replace(body,chr(148),'"'); --
right double quote
update news set body=replace(body,chr(150),'-'); -- en
dash
update news set body=replace(body,chr(151),'-'); -- em
dash
and that seems to do the trick. Most places I found
online listed different chars for these codes, but
http://www.webopedia.com/quick_ref/asciicode.asp lists
them. Jeez, I'm so confused with encodings, charsets,
etc. now. :(
Thanks,
CSN
> Example:
>
> SELECT id FROM news WHERE body ~ '[\200-\377]';
>
> You could use the "string from pattern" variant of
> substring() to
> extract characters in a specific range. If you have
> PL/Perl then
> it would be trivial to extract all of and only the
> special characters
> along with their ASCII codes:
>
> CREATE FUNCTION special_chars(text) RETURNS text AS
> '
> return join(" ", map {"$_:" . ord($_)} $_[0] =~
> /[\200-\377]/g);
> ' LANGUAGE plperl IMMUTABLE STRICT;
>
> SELECT id, special_chars(body) FROM news WHERE body
> ~ '[\200-\377]';
>
> --
> Michael Fuhr
>
__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com