Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Unknown encoding when undefined is used #3252

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

mcandeia
Copy link

The behavior of buffer.toString(encoding) is not compliant with nodejs API. it throws an error UNKNOWN_ENCODING: null (or undefined) because it's being passed as a string "undefined" or "null"

@mcandeia mcandeia requested review from a team as code owners December 17, 2024 13:14
Copy link

github-actions bot commented Dec 17, 2024

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

Signed-off-by: Marcos Candeia <[email protected]>
Signed-off-by: Marcos Candeia <[email protected]>
@mcandeia
Copy link
Author

I have read the CLA Document and I hereby sign the CLA

@mcandeia
Copy link
Author

@anonrig fixed a test that was failing due to this expected behavior: https://github.com/cloudflare/workerd/blob/main/src/workerd/api/node/tests/buffer-nodejs-test.js#L5803

@mcandeia mcandeia force-pushed the fix/encoding-as-string branch from a7af939 to cab07c8 Compare December 17, 2024 17:30
@mcandeia
Copy link
Author

mcandeia commented Dec 17, 2024

@anonrig can you help me to decide one thing?

The NodeJs implementation does not fallback for null encoding - leading into a TypeError as this test covers (https://github.com/cloudflare/workerd/blob/main/src/workerd/api/node/tests/buffer-nodejs-test.js#L5823)

This test was failing before because of the wrong conversion to null (as a string) but now it isn't failing anymore because it is well-covered by the encoding?.toString().

The nodejs implementation on the other hand does the following:

  1. Buffer.prototype.toString implementation uses getEncodingOpts when encoding is null
  2. https://github.com/nodejs/node/blob/a73c41c51e7711086b9801afd5e91ae2f96a7694/lib/buffer.js#L722C10-L722C24
  3. it concats null with '' https://github.com/nodejs/node/blob/a73c41c51e7711086b9801afd5e91ae2f96a7694/lib/buffer.js#L723
  4. which results into a null as a string value with length of 4
image
  1. The switch case will not match with any option https://github.com/nodejs/node/blob/a73c41c51e7711086b9801afd5e91ae2f96a7694/lib/buffer.js#L725
  2. then it returns undefined and only then throws an error: https://github.com/nodejs/node/blob/a73c41c51e7711086b9801afd5e91ae2f96a7694/lib/buffer.js#L863

The options we have:

  1. Remove the test case and let undefined and null have the same behavior
  2. make sure null encoding is not allowed and explicitly throw an error

@anonrig
Copy link
Member

anonrig commented Dec 18, 2024

Hi @mcandeia,

Thank you for the awesome comment. I love it! Here's a suggestion to fix the issue:

diff --git a/src/node/internal/internal_buffer.ts b/src/node/internal/internal_buffer.ts
index cfde61ea..f57ba153 100644
--- a/src/node/internal/internal_buffer.ts
+++ b/src/node/internal/internal_buffer.ts
@@ -28,7 +28,10 @@ import {
   isUint8Array,
 } from 'node-internal:internal_types';
 
-import { normalizeEncoding } from 'node-internal:internal_utils';
+import {
+  normalizeEncoding,
+  getEncodingOps,
+} from 'node-internal:internal_utils';
 
 import { validateString } from 'node-internal:validators';
 
@@ -648,7 +651,7 @@ Buffer.prototype.toString = function toString(
     return '';
   }
 
-  const normalizedEncoding = normalizeEncoding(encoding?.toString());
+  const normalizedEncoding = getEncodingOps(encoding);
   if (normalizedEncoding === undefined) {
     throw new ERR_UNKNOWN_ENCODING(`${encoding}`);
   }
diff --git a/src/node/internal/internal_utils.ts b/src/node/internal/internal_utils.ts
index 7290f4f5..3a80b921 100644
--- a/src/node/internal/internal_utils.ts
+++ b/src/node/internal/internal_utils.ts
@@ -42,10 +42,12 @@ export function normalizeEncoding(enc?: string): Encoding | undefined {
     enc === 'UTF-8'
   )
     return UTF8;
-  return slowCases(enc);
+  return getEncodingOps(enc);
 }
 
-export function slowCases(enc: unknown): Encoding | undefined {
+export function getEncodingOps(enc: unknown): Encoding | undefined {
+  // eslint-disable-next-line @typescript-eslint/restrict-plus-operands
+  enc += '';
   // @ts-expect-error TS18046 TS complains about unknown can not have length.
   switch (enc.length) {
     case 4:
diff --git a/src/workerd/api/node/tests/buffer-nodejs-test.js b/src/workerd/api/node/tests/buffer-nodejs-test.js
index 3dc88c3a..09c56f80 100644
--- a/src/workerd/api/node/tests/buffer-nodejs-test.js
+++ b/src/workerd/api/node/tests/buffer-nodejs-test.js
@@ -5815,7 +5815,8 @@ export const toStringRange = {
       },
       {
         name: 'TypeError',
-      }
+      },
+      'toString() with 0 and null as the encoding should have thrown'
     );
 
     throws(
@@ -5824,7 +5825,8 @@ export const toStringRange = {
       },
       {
         name: 'TypeError',
-      }
+      },
+      'toString() with null encoding should have thrown'
     );
   },
 };

@anonrig
Copy link
Member

anonrig commented Dec 18, 2024

By the way, I forgot. Can you add a test so we don't regress in the future? Thank you.

Signed-off-by: Marcos Candeia <[email protected]>
Signed-off-by: Marcos Candeia <[email protected]>
@anonrig
Copy link
Member

anonrig commented Dec 20, 2024

It seems there is a formatting issue on [warn] src/workerd/api/node/tests/buffer-nodejs-test.js

Signed-off-by: Marcos Candeia <[email protected]>
@mcandeia
Copy link
Author

fixed @anonrig

Signed-off-by: Marcos Candeia <[email protected]>
@mcandeia
Copy link
Author

ok, had to fix it again @anonrig

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants