Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix!: correctly detect if file is outside base path on Windows #59

Merged
merged 26 commits into from
Oct 24, 2024

Conversation

fasttime
Copy link
Member

@fasttime fasttime commented Jun 17, 2024

Prerequisites checklist

What is the purpose of this pull request?

Fix logic that detects when a file is outside the base path so it works well with Windows-style paths. Also fixed an edge case with paths that start with ".." but are not external like "..file.js".

What changes did you make? (Give an overview)

Windows paths have some unique features not shared by Unix paths. For one, absolute local paths start with a drive letter followed by a colon like "C:". The fact that the drive letter must be the first character in a local path means that a relative path cannot specify a drive. In Node.js, path.relative() will just return the absolute path of the second argument when it cannot calculate a relative path across different drives, so for example with path.relative("C:\\dir", "D:\\file.js").

Besides local paths, Windows uses UNC paths in the form \\Server\Volume\dir1\dir2\file to specify files on a network. Those paths are always absolute.

Both config-array and ESLint use relative paths to check if a file is inside a base directory: if the file's relative path from the directory starts with "..", then the file is not in the directory. This check is currently failing when the base directory and the file were located on different drives or servers, because as previously mentioned path.relative() doesn't handle those cases.

The fix I've come up with consists in calculating what Node.js calls namespace-prefixed paths of the base directory and the file and calling path.relative() on those two. Namespaced paths are prefixed with \\?\ for local paths or with \\?\UNC for network paths. When the provided arguments are namespaced paths on different drives, path.relative() will return a path in the form ..\..\C:\dir1\dir2\file.js. While such a path is not valid as it contains a drive letter in middle position, it can be readily recognized as for being outside the base path because of the leading .. and discarded as external.

Related Issues

Is there anything you'd like reviewers to focus on?

@eslint-github-bot eslint-github-bot bot added the bug Something isn't working label Jun 17, 2024
@fasttime fasttime changed the title fix: correctly detect is file is outside base path on Windows fix: correctly detect if file is outside base path on Windows Jun 17, 2024
eslint.config.js Outdated Show resolved Hide resolved
Comment on lines -355 to -358
if (relativeFilePath.startsWith("..")) {
return true;
}
Copy link
Member Author

@fasttime fasttime Jun 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is already performed in getConfigWithStatus().


//-----------------------------------------------------------------------------
// Helpers
//-----------------------------------------------------------------------------

// calculate base path using import.meta
const basePath = path.dirname(new URL(import.meta.url).pathname);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Windows, this was giving a path like /c:/foo/bar/rewrite/packages/config-array/tests, where the c: is considered a top-level directory rather than a drive specifier. This was producing incorrect results with path.resolve(basePath) like C:\c:\foo\bar\rewrite\packages\config-array\tests.

@@ -2860,6 +3010,30 @@ describe("ConfigArray", () => {
);
});

it("should return true when the directory is the parent of the base path", () => {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to ensure that a relative path like ".." without a slash is recognized as external.

packages/config-array/src/config-array.js Outdated Show resolved Hide resolved
packages/config-array/src/config-array.js Outdated Show resolved Hide resolved
packages/config-array/src/config-array.js Outdated Show resolved Hide resolved
@fasttime fasttime marked this pull request as ready for review June 23, 2024 20:18
packages/config-array/src/config-array.js Outdated Show resolved Hide resolved
packages/config-array/src/config-array.js Outdated Show resolved Hide resolved
packages/config-array/src/to-namespaced-path.js Outdated Show resolved Hide resolved
}

// On non-Windows systems, `toNamespacedPath` returns the argument as is.
export default sep === "\\" ? win32ToNamespacedPath : arg => arg;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem like a safe check. Again, because we aren't going to know what format a string is in when it's passed in a browser environment, I don't think we can assume path.sep is accurate for the data we'll be getting.

It seems safer to check for the first instance of \ and / in the string? Something like:

const isWinPath = Boolean(filePath.indexOf("/") - file.Path.indexOf("\\"));

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The input file path could be relative, so there may not be enough indication to determine if it's a Unix or a Windows path. Another complication is that path.relative and path.dirname have different implementations for Unix and Windows (the path separator / vs. \ is not the only difference). These functions will return different results depending on the platform where Node.js runs even for the same input.

If ConfigArray needs to work with both Windows and Unix paths on any platform including browsers, and not depend on cwd at all, it would be better to avoid using node:path functions to manipulate paths and implement the whole logic internally.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If ConfigArray needs to work with both Windows and Unix paths on any platform including browsers, and not depend on cwd at all, it would be better to avoid using node:path functions to manipulate paths and implement the whole logic internally.

This was actually the intent. Originally, the only reference to path I had was path.relative, which I intended to replace with an inline function in order to avoid that dependency. My goal going forward was to make sure we weren't starting to use even more of path to ensure everything would work in browsers correctly.

Whatever we do, I just don't want to add even deeper dependency on the path module. If we need to rethink things a bit, then let's do that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've started working on generic path handling helpers to handle both Unix and Windows style paths regardless of the current platform. I'm planning to show a prototype in this PR to get some feedback.

@nzakas
Copy link
Member

nzakas commented Aug 8, 2024

We may want to consider using the Deno standard path library, which is not runtime or OS specific:
https://jsr.io/@std/path

@fasttime
Copy link
Member Author

We may want to consider using the Deno standard path library, which is not runtime or OS specific: https://jsr.io/@std/path

It looks like that library contains different implementations for windows and posix, and by default, it selects the implementation for the current platform. That is similar to how Node.js node:path works.

It would be useful to find a library that can be installed via npm so we don't need to copy the source files manually.

@nzakas
Copy link
Member

nzakas commented Aug 13, 2024

npx jsr add @std/path works. 😄

@fasttime
Copy link
Member Author

Finally made it work as expected using only the Deno standard library for path handling and without considering the current directory. This fixes the way external files are matched on Windows by including drive letters or UNC server names when resolving relative paths.

An important note: external files are files located outside the base path, but the base path is an optional parameter. If no base path is specified, files will never be considered external. This changes the current behavior where the base path defaults to be the current directory if not specified (in ESLint, the base path is always specified).

@fasttime fasttime requested a review from nzakas August 19, 2024 08:14
@nzakas
Copy link
Member

nzakas commented Aug 20, 2024

An important note: external files are files located outside the base path, but the base path is an optional parameter. If no base path is specified, files will never be considered external. This changes the current behavior where the base path defaults to be the current directory if not specified (in ESLint, the base path is always specified).

This is actually the way I thought it would work when I first implemented it. 😄 I don't think this will affect users because we always pass basePath, but we may want to mark this as breaking for this package.

nzakas
nzakas previously approved these changes Aug 20, 2024
Copy link
Member

@nzakas nzakas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. As I mentioned, I think we should mark this as breaking.

Would like @mdjermanovic to review before merging.

@mdjermanovic mdjermanovic changed the title fix: correctly detect if file is outside base path on Windows fix!: correctly detect if file is outside base path on Windows Aug 25, 2024
@@ -1 +1,2 @@
package-lock = false
@jsr:registry=https://npm.jsr.io
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would end users need to add this to their .npmrc files?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried using npm pack-ed version of config-array from this branch in eslint/eslint, and npm install fails with:

$ npm i
npm ERR! code E404
npm ERR! 404 Not Found - GET https://registry.npmjs.org/@jsr%2fstd__path - Not found
npm ERR! 404
npm ERR! 404  '@jsr/std__path@^1.0.2' is not in this registry.
npm ERR! 404
npm ERR! 404 Note that you can also install from a
npm ERR! 404 tarball, folder, http url, or git url.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can expect users to change their npm settings so that they can download packages from JSR. We could distribute @std/path in a separate directory inside config-array and import it from there. Or we could use the URL of the tarball from JSR to specify the dependecy in package.json:

-    "@jsr/std__path": "^1.0.2",
+    "@jsr/std__path": "https://npm.jsr.io/~/11/@jsr/std__path/1.0.2.tgz",

Either way, we will lose the magic of the caret ^ to pick newer versions automatically. I checked the JSR docs at https://jsr.io/docs/using-packages but I didn't find any recommendations on how to use their packages with npm apart from registering JSR in .npmrc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can one of you open an issue on JSR for this? I can't imagine they didn't encounter this before, but before we spin our wheels, it would be nice to verify if we're doing things the right way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened a discussion here: jsr-io/jsr#701

@nzakas
Copy link
Member

nzakas commented Sep 3, 2024

I'd say let's just use the URL for now. We can always update the approach if/when they respond to your question.

@fasttime
Copy link
Member Author

fasttime commented Sep 4, 2024

The build is now failing because the generated CommonJS module is trying to require() @jsr/std__path, which is an ESM-only package. In packages/config-array/dist/cjs/index.cjs I can see:

var posixPath = require('@jsr/std__path/posix');
var windowsPath = require('@jsr/std__path/windows');

I'm not sure why there was no build error in my previous commit. The error doesn't seem connected to the way @jsr/std__path is declared in package.json. Maybe we need to bundle @jsr/std__path in the CommonJS module so it gets transpiled.

@mdjermanovic
Copy link
Member

Maybe we need to bundle @jsr/std__path in the CommonJS module so it gets transpiled.

If we're going to bundle @jsr/std__path in CJS, then we could bundle it in ESM as well, to avoid having a URL depedency (which I think is not a common practice for packages published to npm).

@fasttime
Copy link
Member Author

fasttime commented Sep 8, 2024

Maybe we need to bundle @jsr/std__path in the CommonJS module so it gets transpiled.

If we're going to bundle @jsr/std__path in CJS, then we could bundle it in ESM as well, to avoid having a URL depedency (which I think is not a common practice for packages published to npm).

Thanks! I tried to bundle @jsr/std__path in dist/cjs/index.cjs and dist/esm/index.js, but then tsc started to complain because of the checkJs option in tsconfig.json:

dist/esm/index.js:316:46 - error TS1313: The body of an 'if' statement cannot be the empty statement.

316       if (lastSlash === i - 1 || dots === 1) ; else if (lastSlash !== i - 1 && dots === 2) {
                                                 ~

dist/esm/index.js:516:15 - error TS2339: Property 'Deno' does not exist on type 'typeof globalThis'.

516       const { Deno } = globalThis;
                  ~~~~

dist/esm/index.js:1148:5 - error TS2353: Object literal may only specify known properties, and 'extended' does not exist in type '{ globstar?: boolean; }'.

1148     extended,
         ~~~~~~~~

dist/esm/index.js:1590:46 - error TS1313: The body of an 'if' statement cannot be the empty statement.

1590       if (lastSlash === i - 1 || dots === 1) ; else if (lastSlash !== i - 1 && dots === 2) {
                                                  ~

dist/esm/index.js:1970:13 - error TS2339: Property 'Deno' does not exist on type 'typeof globalThis'.

1970     const { Deno } = globalThis;
                 ~~~~

dist/esm/index.js:2742:5 - error TS2353: Object literal may only specify known properties, and 'extended' does not exist in type '{ globstar?: boolean; }'.

2742     extended,
         ~~~~~~~~


Found 6 errors in the same file, starting at: dist/esm/index.js:316

So I decided to bundle @jsr/std__path/posix and @jsr/std__path/windows in separate modules that tsc would not check. This seems to be working well and results in the following files being generated:

dist/esm/types.d.ts
dist/esm/index.js
dist/esm/types.ts
dist/esm/std__path/posix.js
dist/esm/std__path/windows.js
dist/esm/index.d.ts
dist/cjs/index.d.cts
dist/cjs/types.ts
dist/cjs/index.cjs
dist/cjs/std__path/posix.cjs
dist/cjs/std__path/windows.cjs

I will update this PR shortly with the new solution, and hopefully the CI build will pass this time.

nzakas
nzakas previously approved these changes Oct 21, 2024
Copy link
Member

@nzakas nzakas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Would like @mdjermanovic to verify before merging.

@mdjermanovic
Copy link
Member

LGTM. Would like @mdjermanovic to verify before merging.

#59 (comment) still needs to be addressed by disallowing empty string?

@fasttime
Copy link
Member Author

#59 (comment) still needs to be addressed by disallowing empty string?

I pushed a commit to disallow empty strings: 4ed3c1c. Also updated the readme file to reflect the changes: 8c30779.

@@ -606,7 +606,7 @@ export class ConfigArray extends Array {
* @property basePath
* @type {string}
*/
this.basePath = basePath || "/";
this.basePath = basePath;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we validate here so we can throw a more specific error, like "basePath must be a non-empty string"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in fb712b1, thanks.

Comment on lines -518 to +548
const filename = path.resolve(basePath, "foo.js");
const filename = "foo.js";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After this change, most of the tests will be with relative paths?

Copy link
Member

@nzakas nzakas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending @mdjermanovic's comment.

Copy link
Member

@mdjermanovic mdjermanovic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks! We can revisit testing at some later point.

@mdjermanovic mdjermanovic merged commit f93aa4c into main Oct 24, 2024
14 checks passed
@mdjermanovic mdjermanovic deleted the fix-external-paths branch October 24, 2024 19:01
@github-actions github-actions bot mentioned this pull request Oct 24, 2024
fasttime added a commit to eslint/eslint that referenced this pull request Oct 28, 2024
nzakas pushed a commit to eslint/eslint that referenced this pull request Nov 6, 2024
* fix: update for eslint/rewrite#59

* revert unnecessary changes with `basePath` defaulting to `"/"`

* update to `config-array` v0.19.x

* add tests for `Linter` and `RuleTester`

* add tests for `lintFile` on a different drive

* remove unused fallback logic for relative paths

* add a test for `isPathIgnored` on a different drive

* add more tests for `lintFiles`

* add a test for `lintText`

* remove unused param from `Linter#verify` calls

* added comments to `lintFiles` test hooks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted breaking bug Something isn't working
Projects
Status: Complete
Development

Successfully merging this pull request may close these issues.

3 participants