src: fix generation of path objects in Windows #56696

Open
wants to merge 5 commits into
base: main
from

Conversation

yamachu

@yamachu yamachu commented Jan 22, 2025

fix: #56650
ref: #56657

This PR fixes a segmentation fault in module resolution that occurred when creating a require function from a path containing multibyte characters in a non-English environment.

Since the issue occurred when constructing a std::filesystem::path object, this PR patches the areas that were changed in the following commit.

a7dad43

Since the issue reproduces only under other locales, such as ja-JP on Windows, I changed the locale in the coverage-windows CI workflow to run the tests.
https://github.com/yamachu/node/pull/2/files#diff-29094741d50149aa772b3e577ad509116bad722ad2de85689b6cb2c01e806a46

.github/workflows/coverage-windows.yml

+      - name: Change locale ja-JP for testing on SJIS environment
+        run: Set-WinSystemLocale -SystemLocale "ja-JP"
+      # to avoid configure, nobuild and noprojgen is needed
+      - name: Test on SJIS environment
+        run: ./vcbuild.bat nobuild noprojgen test-ci-js; node -e 'process.exit(0)'
+        env:
+          NODE_V8_COVERAGE: ./coverage/tmp

The previous PR did not solve the problem for Unicode paths, so this PR uses a separate code path on Windows to construct the path object.

With the changes in this PR, the places where std::filesystem::path is used are fixed.
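
As a rough illustration of the platform split described above, here is a minimal sketch. It is not the code from this PR: `PathFromNarrow` is a hypothetical name, and the choice of `GetACP()` as the code page is an assumption taken from the later review comment in this thread. On non-Windows platforms the narrow string is passed through unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (illustrative name, not part of Node.js).
// Builds a std::filesystem::path from a narrow string, converting to a wide
// string first on Windows.
std::filesystem::path PathFromNarrow(const std::string& str) {
#ifdef _WIN32
  if (str.empty()) return std::filesystem::path();
  // Assumption: the bytes are encoded in the active ANSI code page.
  const UINT cp = GetACP();
  const int len = static_cast<int>(str.size());
  // First call: ask how many wide characters the result needs.
  const int needed = MultiByteToWideChar(cp, 0, str.data(), len, nullptr, 0);
  if (needed <= 0) return std::filesystem::path();  // conversion failed
  std::wstring wide(static_cast<std::size_t>(needed), L'\0');
  // Second call: perform the conversion into the sized buffer.
  const int written =
      MultiByteToWideChar(cp, 0, str.data(), len, wide.data(), needed);
  assert(written == needed);
  (void)written;  // avoid an unused-variable warning when NDEBUG is defined
  return std::filesystem::path(wide);
#else
  // POSIX paths are plain byte strings, so no conversion is needed.
  return std::filesystem::path(str);
#endif
}
```

Unlike the helper quoted later in this thread, the sketch returns an empty path for empty input or a failed conversion; that is a design choice of the example, not something the PR states.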

The results of the test conducted in a Japanese environment can be seen below.
https://github.com/yamachu/node/actions/runs/12903928231/job/35980111577#step:12:16844

@nodejs-github-bot nodejs-github-bot added the c++ and needs-ci labels Jan 22, 2025

codecov bot commented Jan 22, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.21%. Comparing base (bf59539) to head (0697f02).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #56696      +/-   ##
==========================================
- Coverage   89.21%   89.21%   -0.01%     
==========================================
  Files         662      662              
  Lines      191934   191936       +2     
  Branches    36945    36942       -3     
==========================================
+ Hits       171227   171228       +1     
- Misses      13543    13550       +7     
+ Partials     7164     7158       -6     
Files with missing lines Coverage Δ
src/node_modules.cc 78.91% <100.00%> (-0.19%) ⬇️

... and 38 files with indirect coverage changes

Member

@anonrig anonrig left a comment


A similar fix already landed last August: #54653

We should use ToU8StringView() rather than introducing a new API.

@yamachu
Author

yamachu commented Jan 23, 2025

@anonrig

Thanks for the review!
In short, does this mean I should use std::u8string?

However, there is a problem with the u8string approach.
As shown in the description, the test-require-unicode.js test fails on Windows when the code uses u8string, both under an SJIS locale and under en-US (for example, the default locale of the GitHub Actions Windows runner), so all Windows environments are affected.

Therefore, I added an API that handles wstrings and used it instead of ToU8StringView.

@anonrig
Member

anonrig commented Jan 23, 2025

I recommend improving the existing solution to fit both cases rather than implementing a new one.

@yamachu
Author

yamachu commented Jan 23, 2025

What are the "both cases" you are referring to?

I am sure that ToU8StringView is not used in the current codebase.
Since it is not used, it does not currently solve the problem.
Reintroducing it is not a solution either, because I know that using it will cause problems.

I don't think that forcing u8string into the path construction is the better way to go.

@yamachu
Author

yamachu commented Jan 23, 2025

The base branch is still bf59539, so the associated test (experimental flag) seems to fail.
Should I rebase the branch and force push?

@lpinca
Member

lpinca commented Jan 23, 2025

> Should I rebase the branch and force push?

Yes, thank you.

@yamachu yamachu force-pushed the fix-windows-path-string branch from 0697f02 to cba8830 Compare January 23, 2025 12:57
@lpinca lpinca added the request-ci label Jan 23, 2025
@github-actions github-actions bot removed the request-ci label Jan 23, 2025

@yamachu
Author

yamachu commented Jan 23, 2025

I had to rush to write my comments before work.
I apologize if any of my comments offended you.

I have rebased and force-pushed the branch, so please run CI.

I should have done this beforehand, but I have now looked over the entire codebase.
There I found the following helper function.

node/src/node_file.cc

Lines 3149 to 3160 in d978610

std::wstring ConvertToWideString(const std::string& str) {
  int size_needed = MultiByteToWideChar(
      CP_UTF8, 0, &str[0], static_cast<int>(str.size()), nullptr, 0);
  std::wstring wstrTo(size_needed, 0);
  MultiByteToWideChar(CP_UTF8,
                      0,
                      &str[0],
                      static_cast<int>(str.size()),
                      &wstrTo[0],
                      size_needed);
  return wstrTo;
}

This was exactly what I was looking for.
I rewrote my changes in this PR to use it, added the code page setting, and the test passed (in an SJIS Windows environment).

…aths

Refactoring will be needed in the future, because this change copies similar code.
@@ -326,6 +330,22 @@ const BindingData::PackageConfig* BindingData::TraverseParent(
  return nullptr;
}

#ifdef _WIN32
std::wstring ConvertToWideString(const std::string& str) {
  auto cp = GetACP();
Author


This section was copied from node_file.cc, but the GetACP call is the important difference.
As shown in the description, when the tests run with a changed Windows locale, leaving the original CP_UTF8 code page here means the strings are not handled properly.
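
To make the code page point concrete, here is a small portable sketch. It assumes, following the comment above, that on a ja-JP system the path bytes arrive in the active code page (CP932, i.e. Shift_JIS). `HasUtf8Structure`, the two constants, and `CheckExamples` are hypothetical names introduced for illustration. The same text has different bytes in each encoding, and the CP932 bytes are not well-formed UTF-8, so decoding them as CP_UTF8 cannot recover the characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper for illustration only (not part of Node.js).
// Returns true when every lead byte in `s` is followed by the expected number
// of UTF-8 continuation bytes. Overlong forms and surrogates are not
// rejected; the structural check is enough for this demonstration.
bool HasUtf8Structure(const std::string& s) {
  std::size_t i = 0;
  while (i < s.size()) {
    const unsigned char c = static_cast<unsigned char>(s[i]);
    std::size_t extra = 0;
    if (c < 0x80) {
      extra = 0;  // ASCII
    } else if ((c & 0xE0) == 0xC0) {
      extra = 1;  // 2-byte sequence
    } else if ((c & 0xF0) == 0xE0) {
      extra = 2;  // 3-byte sequence
    } else if ((c & 0xF8) == 0xF0) {
      extra = 3;  // 4-byte sequence
    } else {
      return false;  // stray continuation byte or invalid lead byte
    }
    if (extra > s.size() - i - 1) return false;  // truncated sequence
    for (std::size_t k = 1; k <= extra; ++k) {
      const unsigned char next = static_cast<unsigned char>(s[i + k]);
      if ((next & 0xC0) != 0x80) return false;  // not a continuation byte
    }
    i += extra + 1;
  }
  return true;
}

// The two characters U+65E5 U+672C ("Nihon") in each encoding.
const std::string kNihonCp932 = "\x93\xFA\x96\x7B";
const std::string kNihonUtf8 = "\xE6\x97\xA5\xE6\x9C\xAC";

void CheckExamples() {
  assert(HasUtf8Structure(kNihonUtf8));
  assert(!HasUtf8Structure(kNihonCp932));  // 0x93 cannot start a sequence
}
```

ASCII-only paths are byte-identical in both encodings, which is consistent with the report that the failure shows up only for paths containing multibyte characters.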

Successfully merging this pull request may close these issues.

Segmentation Fault When Passing Paths with Japanese Characters to createRequire in Node.js 22+
5 participants