UTF-8 column names and error messages are octet-streams, not strings [rt.cpan.org #120141]

Migrated from [rt.cpan.org#120141](https://rt.cpan.org/Ticket/Display.html?id=120141) (status was 'open')

Requestors:
* tanabe@fa2.so-net.ne.jp

Attachments:
* [utf8ColunnMsg.pl](https://rt.cpan.org/Ticket/Attachment/1703568/915044/utf8ColunnMsg.pl)
* [0001-Fix-decoding-UTF-8-field-names-and-tables-names-when.patch](https://rt.cpan.org/Ticket/Attachment/1704547/915640/0001-Fix-decoding-UTF-8-field-names-and-tables-names-when.patch)
* [0002-Fix-decoding-UTF-8-warning-and-error-messages-when-m.patch](https://rt.cpan.org/Ticket/Attachment/1704547/915641/0002-Fix-decoding-UTF-8-warning-and-error-messages-when-m.patch)


From tanabe@fa2.so-net.ne.jp on 2017-02-08 04:07:52:
```
Hello,

Column names and error messages should be treated as strings, but
they are octet-streams in DBD-mysql-4.041.

The attached code creates a table with a column whose name
contains a non-ASCII character.  After issuing a SELECT statement
and calling fetchrow_hashref, it tries to get a value using the
column name at (1), but the result is undef.  If you use the octet
stream for the column name as a key, you get the value, at (2).

Also, when you enable Japanese error messages by adding the line
	lc_messages=ja_JP
to the [mysqld] section of my.ini, the messages are not decoded in
DBD::mysql.  As a result, messages are unreadable in (3) and (4).
We could explicitly decode a caught message as in (5), but this
cannot be applied to (3).  Of course, it can be avoided by not
using automatic encoding for STDERR at (6), but then we would need
to manually encode all other strings, which is a nightmare.

Finally, I noticed that when error messages are in Japanese, make
test of DBD-mysql fails.  It may be difficult to avoid (I do not
know), but a warning message (lc_messages should not be changed)
in make test would help.

DBD::mysql version: 4.041
Strawberry perl 64bit, v5.22.1
MariaDB
   $dbh->{mysql_clientinfo, mysql_clientversion, mysql_serverversion}  
returns:
   5.1.44, 50144, 50505, respectively.
Windows 7 Pro Service Pack 1

Regards,
Tanabe Yoshinori

```
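
The key mismatch at (1) and (2) and the unreadable messages at (3) and (4) can be reproduced without a database. The following is a minimal sketch that uses only the core Encode module; the column name, the value 42, and the message text are made up for illustration and are not taken from the attached script.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical column name (one CJK character) and its UTF-8 octets,
# which is what a driver returns when it does not decode names.
my $name_chars  = "\x{5217}";
my $name_octets = encode('UTF-8', $name_chars);

# Simulates a fetchrow_hashref result whose keys were never decoded.
my %row = ($name_octets => 42);

my $by_chars  = $row{$name_chars};   # case (1): the key is a different string
my $by_octets = $row{$name_octets};  # case (2): the octet key matches

# Decoding each key once makes the character-string lookup work.
my %fixed;
for my $key (keys %row) {
    $fixed{ decode('UTF-8', $key) } = $row{$key};
}

# Error messages: an :encoding(UTF-8) layer encodes whatever it receives,
# so undecoded octets are encoded a second time on output.
my $msg_octets = encode('UTF-8', "\x{30A8}\x{30E9}\x{30FC}");
my $double     = encode('UTF-8', $msg_octets);                   # raw octets through the layer
my $round_trip = encode('UTF-8', decode('UTF-8', $msg_octets));  # decoded first, as in (5)

printf "lookup by characters: %s\n", defined $by_chars ? $by_chars : 'undef';
printf "lookup by octets:     %s\n", $by_octets;
printf "lookup after decode:  %s\n", $fixed{$name_chars};
printf "bytes written, undecoded vs decoded: %d vs %d\n",
    length($double), length($round_trip);
```

The sketch only models the symptom on the Perl side; where the decoding belongs inside the driver is what the rest of this ticket discusses.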

From pali@cpan.org on 2017-02-08 10:32:43:
```
Hello, please try the development version 4.041_01 of DBD-mysql. That one has fixed UTF-8 support for passing statements and parameters.
```

From tanabe@fa2.so-net.ne.jp on 2017-02-08 11:20:34:
```
Hello,

I have just installed 4.041_01 ("print $DBD::mysql::VERSION" shows the 
number) and run the script again.  The results are the same as in my
first report.

Thank you.
Tanabe
```

From pali@cpan.org on 2017-02-12 12:52:30:
```
Hi! Can you try compiling DBD::mysql (either 4.041_01 or from git master) with these two attached patches? They should fix wide Unicode characters in column names and error messages. Note that DBI itself has broken Unicode messages prior to version 1.635 (see https://rt.cpan.org/Public/Bug/Display.html?id=102404).
```

From pali@cpan.org on 2017-02-12 12:54:03:
```
Trying to attach patches again...
```

From tanabe@fa2.so-net.ne.jp on 2017-02-13 02:34:37:
```
Hello,  I have confirmed that the problems are gone after applying the
patches (and upgrading DBI to a later version).  Thank you very much for
the quick fix.
One concern is that the fix could break code that is currently running.
Best regards,
Tanabe
```
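
The concern about existing code can be illustrated with a caller that decodes driver output by hand. This is a sketch under assumptions: `legacy_decode` is a hypothetical stand-in for such application code, and the two input values simulate what an unpatched and a patched driver would hand back. Only the core Encode module is used.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Simulated driver output for the same one-character column name.
my $from_unpatched = encode('UTF-8', "\x{5217}");      # raw UTF-8 octets
my $from_patched   = decode('UTF-8', $from_unpatched); # already a character string

# Hypothetical application code written as a workaround for the
# undecoded output: it decodes everything it receives.
sub legacy_decode {
    my ($value) = @_;
    return decode('UTF-8', $value);
}

# With undecoded input the workaround yields the expected character.
my $old_ok = legacy_decode($from_unpatched) eq "\x{5217}" ? 1 : 0;

# With already-decoded input containing a wide character, Encode
# refuses to decode again and throws an exception.
my $new_ok = eval { legacy_decode($from_patched); 1 } ? 1 : 0;
my $error  = $new_ok ? '' : $@;

print "workaround with unpatched driver output: ", ($old_ok ? "ok" : "failed"), "\n";
print "workaround with patched driver output:   ", ($new_ok ? "ok" : "failed"), "\n";
```

Code of this shape would need its manual decode step removed, or guarded, once the driver returns character strings.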

From pali@cpan.org on 2017-02-13 08:26:46:
```
Thank you for testing. I will reuse your script to create tests for this issue.

Unicode support has been broken in DBD::mysql for a long time, and the proper way forward is to fix the current code.
```

From pali@cpan.org on 2017-07-01 09:12:29:
```
Reopening, fix was reverted in 4.043.
```

