Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow setting the collation in auth handshake #860

Merged
merged 13 commits into from
Apr 30, 2024

Conversation

dvilaverde
Copy link
Contributor

This PR is to allow the collation to be set during the auth handshake in order to avoid requiring calls to SET NAMES post connection.

Copy link
Collaborator

@dveeden dveeden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks ok, but might need a few more tests. I'll have a full review next week

@dvilaverde
Copy link
Contributor Author

Looks ok, but might need a few more tests. I'll have a full review next week

Please let me know of an additional changes or tests you might need.

client/auth.go Outdated Show resolved Hide resolved
client/auth.go Outdated
return fmt.Errorf("invalid collation name %s", collationName)
}

data[12] = byte(collation.ID)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would overflow for some collations.

  • 255 / 0xff / utf8mb4_0900_ai_ci would be fine
  • 309 / 0x0135 / utf8mb4_0900_bin would not be

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weirdly enough the protocol says this is 1 byte and that only the low 8-bits are put in this field. Not sure how that's going to work.

https://dev.mysql.com/doc/dev/mysql-server/latest/page_protocol_connection_phase_packets_protocol_handshake_response.html

image

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A MySQL 8.0 Client only allows the user to set the charset, not the collation:

  --default-character-set=name 
                      Set the default character set.

And as all default collations are in the 0-255 range this works with the protocol.

mysql> SELECT MAX(ID) FROM information_schema.COLLATIONS WHERE IS_DEFAULT='Yes';
+---------+
| MAX(ID) |
+---------+
|     255 |
+---------+
1 row in set (0.00 sec)

@dveeden
Copy link
Collaborator

dveeden commented Apr 29, 2024

So this needs testing for collations that have an id that's >255.

Also what is the reason for doing this? Is it connection setup performance? Are you always using an UTF-8 collation or something else?

@dveeden
Copy link
Collaborator

dveeden commented Apr 29, 2024

mysql> SELECT *,ID&0x00ff FROM information_schema.COLLATIONS WHERE COLLATION_NAME IN('utf8mb4_ja_0900_as_cs','latin1_bin') ORDER BY ID;
+-----------------------+--------------------+-----+------------+-------------+---------+---------------+-----------+
| COLLATION_NAME        | CHARACTER_SET_NAME | ID  | IS_DEFAULT | IS_COMPILED | SORTLEN | PAD_ATTRIBUTE | ID&0x00ff |
+-----------------------+--------------------+-----+------------+-------------+---------+---------------+-----------+
| latin1_bin            | latin1             |  47 |            | Yes         |       1 | PAD SPACE     |        47 |
| utf8mb4_ja_0900_as_cs | utf8mb4            | 303 |            | Yes         |       0 | NO PAD        |        47 |
+-----------------------+--------------------+-----+------------+-------------+---------+---------------+-----------+
2 rows in set (0.00 sec)

So if only 1 byte gets send, both these collations will have the same value.

In [24]: c = mysql.connector.connect(host='127.0.0.1',port=3306,user="test",password="test",collation="utf8mb4_unicode_520_ci", ssl_disabled=True)

In [25]: cur = c.cursor()

In [26]: cur.execute("SHOW SESSION VARIABLES LIKE '%collation%'");

In [27]: cur.fetchall()
Out[27]: 
[('collation_connection', 'utf8mb4_unicode_520_ci'),
 ('collation_database', 'utf8mb4_0900_ai_ci'),
 ('collation_server', 'utf8mb4_0900_ai_ci'),
 ('default_collation_for_utf8mb4', 'utf8mb4_0900_ai_ci')]

In [28]: cur.close()
Out[28]: True

In [29]: c.close()

In [30]: c = mysql.connector.connect(host='127.0.0.1',port=3306,user="test",password="test",collation="utf8mb4_ja_0900_as_cs", ssl_disabled=True)

In [31]: cur = c.cursor()

In [32]: cur.execute("SHOW SESSION VARIABLES LIKE '%collation%'");

In [33]: cur.fetchall()
Out[33]: 
[('collation_connection', 'utf8mb4_ja_0900_as_cs'),
 ('collation_database', 'utf8mb4_0900_ai_ci'),
 ('collation_server', 'utf8mb4_0900_ai_ci'),
 ('default_collation_for_utf8mb4', 'utf8mb4_0900_ai_ci')]

In [34]: cur.close()
Out[34]: True

In [35]: c.close()
In [40]: c = mysql.connector.connect(host='127.0.0.1',port=3306,user="test",password="test",collation="latin1_bin", ssl_disabled=True)

In [41]: cur = c.cursor()

In [42]: cur.execute("SHOW SESSION VARIABLES LIKE '%collation%'");

In [43]: cur.fetchall()
Out[43]: 
[('collation_connection', 'latin1_bin'),
 ('collation_database', 'utf8mb4_0900_ai_ci'),
 ('collation_server', 'utf8mb4_0900_ai_ci'),
 ('default_collation_for_utf8mb4', 'utf8mb4_0900_ai_ci')]

image

image

So, the protocol docs are wrong and Wireshark is also not decoding this properly. I'll follow up on both.

This can actually be two bytes (see the "Unused", which isn't all zeros in one of the cases)

@dveeden
Copy link
Collaborator

dveeden commented Apr 29, 2024

mysql/mysql-server#541

@dveeden
Copy link
Collaborator

dveeden commented Apr 29, 2024

@dvilaverde
Copy link
Contributor Author

So this needs testing for collations that have an id that's >255.

Also what is the reason for doing this? Is it connection setup performance? Are you always using an UTF-8 collation or something else?

Hi yes our reason to do this is we have an application, using latin1_general_ci and MariaDB 5.5.x, that is sensitive to latency and we setup and tear down connections a a lot, primarily because there is a large number of MariaDB instances that application needs to interact with. Ideally we'd like reduce the connection setup time as much as possible.

@dveeden
Copy link
Collaborator

dveeden commented Apr 29, 2024

Maybe check with Wireshark if there are any more roundtrips you can avoid

@dvilaverde
Copy link
Contributor Author

I believe I've addressed the > 255 collation ID and added a test to assert the bytes are correctly set in the auth handshake. I am hoping I've understood your comments and feedback correctly on this PR and addressed all the issues. Please let me know if I've misunderstood and you were looking for something else.

Co-authored-by: Daniël van Eeden <[email protected]>
@dvilaverde
Copy link
Contributor Author

Let me know if you prefer to follow the documentation for the protocol and limit the use of collation in the auth handshake to collations with an ID < 255. This would allow users to set the mysql 8.x utf8mb4_0900_ai_ci default if needed while remaining true to the protocol.

client/auth.go Outdated
// the 23 bytes of filler is used to send the upper 8 bits of the collation id.
// see https://github.com/mysql/mysql-server/pull/541
data[12] = byte(collation.ID & 0xff)
if collation.ID > 255 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be possible always encode the collation.ID to a 2 byte integer, if the right endianness is used. That might simplify the code a bit

client/auth.go Outdated Show resolved Hide resolved
@dveeden
Copy link
Collaborator

dveeden commented Apr 29, 2024

@lance6716 PTAL

@lance6716
Copy link
Collaborator

I'll take a look tomorrow

Copy link
Collaborator

@lance6716 lance6716 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rest lgtm

client/auth_test.go Outdated Show resolved Hide resolved
client/auth_test.go Show resolved Hide resolved
client/client_test.go Show resolved Hide resolved
client/conn.go Outdated Show resolved Hide resolved
@dvilaverde
Copy link
Contributor Author

rest lgtm

I've addressed the feedback from the review.

@lance6716 lance6716 merged commit 8551be2 into go-mysql-org:master Apr 30, 2024
13 checks passed
dvilaverde added a commit to dvilaverde/go-mysql that referenced this pull request Apr 30, 2024
* Allow connect with context in order to provide configurable connect timeouts
* support collations IDs greater than 255 on the auth handshake
---------

Co-authored-by: dvilaverde <[email protected]>
lance6716 pushed a commit that referenced this pull request Jun 6, 2024
…outs (#885)

* allow setting the collation in auth handshake

* Allow connect with context in order to provide configurable connect timeouts

* add driver arguments

* check for empty ssl value when setting conn options

* allow setting the collation in auth handshake (#860)
* Allow connect with context in order to provide configurable connect timeouts
* support collations IDs greater than 255 on the auth handshake
---------

Co-authored-by: dvilaverde <[email protected]>

* refactored and added more driver args

* revert change to Makefile

* added tests for timeouts

* adding more tests

* fixing linting issues

* avoiding panic on test complete

* revert returning set readtimeout error in binlogsyncer

* fixing nil violation when connection with timeout from binlogsyncer

* Update README.md

Co-authored-by: Daniël van Eeden <[email protected]>

* addressing pull request feedback

* revert rename driver arg ssl to tls

* addressing PR feedback

* write compressed packet using writeWithTimeout

* updated README.md

---------

Co-authored-by: dvilaverde <[email protected]>
Co-authored-by: Daniël van Eeden <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants