Discussion: pgagent unicode support
Currently pgagent doesn't handle unicode correctly.

The CharToWString function corrupts multibyte characters because it processes
the string one byte at a time:

    std::string s = std::string(cstr);
    std::wstring wsTmp(s.begin(), s.end());

The WStringToChar function does not take into account that wcstombs can
emit multibyte characters, and it creates a buffer sized from wcslen:

    int wstr_length = wcslen(wchar_str);
    char *dst = new char[wstr_length + 10];

Also, pgagent does not set up the locale with setlocale(); without it, none of
the wcs/mbs functions can handle multibyte strings.

For example:

=== step code ===
select 'это проверка кириллицы в теле запроса pgagent'
=================

=== postgres log ===
2021-02-05 23:19:05 UTC [15600-1] postgres@postgres ERROR: unterminated quoted string at or near "'" at character 8
2021-02-05 23:19:05 UTC [15600-2] postgres@postgres STATEMENT: select '
====================

Please see the attached patch.
I have only tested it on GNU/Linux and can't test it on Windows, sorry.

--
Sergey Burladyan
Attachments
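The points raised in the report can be illustrated with a minimal sketch. This is not the attached patch and not pgAgent's actual code: the function names `UseEnvironmentLocale`, `WidenBytes`, and `MultiByteToWide` are hypothetical, chosen only for this example. It assumes a platform where `mbstowcs` accepts a null destination to count characters (true on POSIX systems) and where the process locale has been set to one that matches the input encoding.

```cpp
#include <cassert>
#include <clocale>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: adopt the locale from the environment (LANG, LC_ALL, ...).
// Without a call like this the process stays in the "C" locale.
bool UseEnvironmentLocale()
{
    return std::setlocale(LC_ALL, "") != nullptr;
}

// The pattern quoted in the report: every input byte becomes one wchar_t,
// so a multibyte character is split into several unrelated wide characters.
std::wstring WidenBytes(const char *cstr)
{
    assert(cstr != nullptr);
    std::string s(cstr);
    return std::wstring(s.begin(), s.end());
}

// Locale-aware decoding: mbstowcs interprets the bytes according to LC_CTYPE.
std::wstring MultiByteToWide(const char *cstr)
{
    assert(cstr != nullptr);
    // With a null destination, mbstowcs only counts the wide characters needed.
    const std::size_t len = std::mbstowcs(nullptr, cstr, 0);
    if (len == static_cast<std::size_t>(-1))
        return std::wstring();  // byte sequence is invalid in the current locale
    std::vector<wchar_t> buf(len + 1);  // +1 for the terminating L'\0'
    std::mbstowcs(buf.data(), cstr, buf.size());
    return std::wstring(buf.data(), len);
}
```

In a UTF-8 locale, the six bytes of "это" decode to three wide characters through `MultiByteToWide`, whereas `WidenBytes` always produces six.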
Hi
On Sat, Feb 6, 2021 at 5:00 AM Sergey Burladyan <eshkinkot@gmail.com> wrote:
> Currently pgagent doesn't handle unicode correctly.
>
> The CharToWString function corrupts multibyte characters because it processes
> the string one byte at a time:
>
>     std::string s = std::string(cstr);
>     std::wstring wsTmp(s.begin(), s.end());
>
> The WStringToChar function does not take into account that wcstombs can
> emit multibyte characters, and it creates a buffer sized from wcslen:
>
>     int wstr_length = wcslen(wchar_str);
>     char *dst = new char[wstr_length + 10];
>
> Also, pgagent does not set up the locale with setlocale(); without it, none of
> the wcs/mbs functions can handle multibyte strings.
>
> For example:
>
> === step code ===
> select 'это проверка кириллицы в теле запроса pgagent'
> =================
>
> === postgres log ===
> 2021-02-05 23:19:05 UTC [15600-1] postgres@postgres ERROR: unterminated quoted string at or near "'" at character 8
> 2021-02-05 23:19:05 UTC [15600-2] postgres@postgres STATEMENT: select '
> ====================
>
> Please see the attached patch.
> I have only tested it on GNU/Linux and can't test it on Windows, sorry.

Thanks for the patch! Neel/Ashesh; can you take a look please? It looks OK to me, but then I'm not overly familiar with multibyte string handling. What, if anything, needs to be done on Windows?
Thanks, Sergey, for the patch.
Sure, Dave.
There are some compilation warnings on Linux. I will fix those, test pgAgent on Windows, and update the thread.
On Mon, Feb 8, 2021 at 2:55 PM Dave Page <dpage@pgadmin.org> wrote:
> Hi
> Thanks for the patch! Neel/Ashesh; can you take a look please? It looks OK to me, but then I'm not overly familiar with multibyte string handling. What, if anything, needs to be done on Windows?
Hi Sergey,
Thank you for the patch. It looks good to me, apart from the following.
We have modified the patch to fix a memory leak (a review comment from Ashesh) and to fix the compilation warnings.
Could you please review it and let us know?
Thanks,
Neel Patel
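The thread does not show where the leak was or how the revised patch fixes it, so the following is only an illustration of one common approach, not the committed code. The name `WideToMultiByte` is hypothetical. The sketch sizes the output from `wcstombs` itself, which addresses the undersized `wcslen`-based buffer described in the original report, and returns a `std::string`, so there is no `new[]` buffer for a caller to forget to release. It assumes a platform where `wcstombs` accepts a null destination to report the byte count (true on POSIX systems) and that the locale has already been set.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: encode a wide string in the current locale's multibyte encoding.
std::string WideToMultiByte(const std::wstring &ws)
{
    // With a null destination, wcstombs reports how many bytes the result needs.
    // This can exceed the number of wide characters when characters are multibyte.
    const std::size_t bytes = std::wcstombs(nullptr, ws.c_str(), 0);
    if (bytes == static_cast<std::size_t>(-1))
        return std::string();  // a character cannot be represented in this locale
    std::vector<char> buf(bytes + 1);  // +1 for the terminating '\0'
    std::wcstombs(buf.data(), ws.c_str(), buf.size());
    // The vector and the returned string own their memory, so nothing leaks.
    return std::string(buf.data(), bytes);
}
```

In a UTF-8 locale, the three wide characters of "это" encode to six bytes, which a buffer sized only from `wcslen` would not account for in general.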
On Mon, Feb 15, 2021 at 6:15 PM Neel Patel <neel.patel@enterprisedb.com> wrote:
> Thanks Sergey for the patch.
> Sure Dave.
> There is some compilation warning in linux, I will fix those and test pgAgent in windows and update the thread.
Attachments
On Fri, Feb 26, 2021 at 12:06 PM Neel Patel <neel.patel@enterprisedb.com> wrote:
> Hi Dave/Ashesh,
> Do you have any further comments?
Apologies for the late response. Committed the patch.
Dave,
I've also updated pgAgent's patch version (new version: 4.2.1) and the copyright information.
-- Thanks, Ashesh
> Thanks,
> Neel Patel
> On Wed, Feb 24, 2021 at 8:14 PM Sergey Burladyan <eshkinkot@gmail.com> wrote:
>> Neel Patel <neel.patel@enterprisedb.com> writes:
>>> Thanks for the review. Please find the attached updated patch.
>>> Do review it and let me know in case of any comments.
>> Looks good, thanks!
>> --
>> Sergey Burladyan
On Tue, Mar 30, 2021 at 1:45 PM Ashesh Vashi <ashesh.vashi@enterprisedb.com> wrote:
> Apologies for late response. Committed the patch.
> Dave,
> I've also updated the pgAgent's patch version (new version: 4.2.1), and copyright information.
Thanks!