File.match? has an issue with paths #15319
The behaviour of the patterns with double wildcard is very clearly an error:
On the other hand
There could be some argument that
Implementations such as Ruby and Python seem to allow substring matching, while Golang behaves similarly to Crystal:
import fnmatch
fnmatch.fnmatch("a/b/c.x", "*.x") # => True
package main
import (
"fmt"
"path/filepath"
)
func main() {
fmt.Println(filepath.Match("*.x", "a/b/c.x")) // => false <nil>
}
Note: |
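The Python result quoted above can be explained by how the standard `fnmatch` module treats `*`. The following is a minimal sketch, not part of the original comment: it uses `fnmatch.translate` to turn the pattern into a regular expression and shows that the translated wildcard is not separator-aware, so it also spans `/`. The variable names are chosen for illustration only.

```python
import fnmatch
import re

# fnmatch.translate turns a shell-style pattern into a regular expression.
regex = re.compile(fnmatch.translate("*.x"))

# The translated "*" has no notion of path separators, so it matches
# a flat file name and a nested path alike.
flat_match = regex.match("a.x") is not None
nested_match = regex.match("a/b/c.x") is not None

print(flat_match, nested_match)
```

This is why `fnmatch.fnmatch` looks like substring matching on paths: it is a plain string matcher rather than a path matcher.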
And
#include <stdio.h>
#include <stdlib.h>
#include <fnmatch.h>
int main() {
printf("*.x: %d\n", fnmatch("*.x", "a/b/c.x", FNM_PATHNAME)); // => 1
printf("**/*.x: %d\n", fnmatch("**/*.x", "a/b/c.x", FNM_PATHNAME)); // => 1
printf("**.x: %d\n", fnmatch("**.x", "a/b/c.x", FNM_PATHNAME)); // => 1
} |
Looks like everybody is doing something different...
p File.fnmatch?("*.x", "a.x", File::FNM_PATHNAME) # true
p File.fnmatch?("*.x", "a/b/c.x", File::FNM_PATHNAME) # false
p File.fnmatch?("**/*.x", "a/b/c.x", File::FNM_PATHNAME) # true
p File.fnmatch?("**.x", "a/b/c.x", File::FNM_PATHNAME) # false
I find this (standardized...) option overly complex. Regexp matches don't care about extra prefixes or suffixes, which is what I somehow expected from this method as well.
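To make the separator-aware semantics in the Ruby results above more concrete, here is a rough sketch in Python. It is an assumption-laden illustration, not Ruby's or Crystal's actual implementation: `glob_to_regex` and `matches` are hypothetical helpers, only `*`, `?` and a leading-segment `**/` are handled, and character classes and escapes are ignored.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a small glob subset into a separator-aware regex (sketch)."""
    parts = []
    i = 0
    while i < len(pattern):
        at_segment_start = i == 0 or pattern[i - 1] == "/"
        if at_segment_start and pattern.startswith("**/", i):
            # "**/" at the start of a segment: zero or more directories.
            parts.append(r"(?:[^/]+/)*")
            i += 3
        elif pattern[i] == "*":
            # A single "*" never crosses a path separator.
            parts.append(r"[^/]*")
            i += 1
        elif pattern[i] == "?":
            parts.append(r"[^/]")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def matches(pattern: str, path: str) -> bool:
    # The whole path has to match, not just a prefix or suffix.
    return glob_to_regex(pattern).fullmatch(path) is not None


for pattern, path in [("*.x", "a.x"), ("*.x", "a/b/c.x"),
                      ("**/*.x", "a/b/c.x"), ("**.x", "a/b/c.x")]:
    print(pattern, path, matches(pattern, path))
```

Under these assumptions the four pattern/path pairs come out the same way as the Ruby calls with `File::FNM_PATHNAME` listed above.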
Could you give me an example? I struggled with this explanation in the docs already. Thanks! |
@straight-shoota Python's |
The only comparable method in Python is https://docs.python.org/3/library/pathlib.html#pathlib.PurePath.full_match |
I've made a script to compare all programming languages I could think of :D
The script
#!/bin/bash
data='
"*.x", "a.x"
"*.x", "a/b/c.x"
"**/*.x", "a/b/c.x"
"**.x", "a/b/c.x"
'
pr --merge --omit-header -s'|' <(echo) <(
echo
echo "---"
while IFS= read -r line; do
test -n "$line" && echo "\`$line\`"
done <<< "$data"
) <(
echo "Crystal"
echo "---"
(
while IFS= read -r line; do
test -n "$line" && echo "p File.match?($line)"
done <<< "$data"
) | crystal eval
) <(
echo "Python"
echo "---"
(
echo '
import pathlib
def matches(pattern, path):
    return pathlib.PurePath(path).full_match(pattern)
'
while IFS= read -r line; do
test -n "$line" && echo "print('true' if matches($line) else 'false')"
done <<< "$data"
) | python
) <(
echo "C"
echo "---"
(
echo '
#include <stdio.h>
#include <stdlib.h>
#include <fnmatch.h>
int main() {
'
while IFS= read -r line; do
test -n "$line" && echo "printf(\"%s\n\", fnmatch($line, FNM_PATHNAME) == 0 ? \"true\" : \"false\");"
done <<< "$data"
echo '}'
) > test.c && gcc test.c -o test_c && ./test_c
) <(
echo "Rust"
echo "---"
echo '
[package]
name = "test"
edition = "2021"
[dependencies]
glob = "0.3.2"
[[bin]]
name = "test"
' > Cargo.toml
mkdir -p src/bin/test
(
echo '
extern crate glob;
use glob::Pattern;
fn matches(pattern: &str, path: &str) -> Option<bool> {
return Some(Pattern::new(pattern).ok()?.matches(path));
}
fn main() {
'
while IFS= read -r line; do
test -n "$line" && echo "println!(\"{}\", match matches($line) { Some(true) => \"true\", Some(false) => \"false\", None => \"error\" });"
done <<< "$data"
echo '}'
) > src/bin/test/main.rs
cargo run --quiet
) <(
echo "Ruby"
echo "---"
(
while IFS= read -r line; do
test -n "$line" && echo "p File.fnmatch?($line)"
done <<< "$data"
) | ruby
) <(
echo "Go"
echo "---"
(
echo '
package main
import (
"fmt"
"path/filepath"
)
func matches(pattern, path string) string {
if v, e := filepath.Match(pattern, path); e != nil {
return "error"
} else {
return fmt.Sprintf("%v", v)
}
}
func main() {
'
while IFS= read -r line; do
test -n "$line" && echo "fmt.Println(matches($line))"
done <<< "$data"
echo '}'
) > test.go && go run test.go
) <(echo)
I think the Python example is the only one that makes sense |
Thanks for putting together that comparison. I agree that Python's behaviour seems to be the most reasonable. And I believe that Crystal is actually intended to work the same but the implementation is buggy. |
Oops, deleted that example |
Essentially, |
thanks, so like this
|
I have found more reasonable implementations for Go, Rust (external libs) and Ruby (pass
The script
#!/bin/bash
lines=(
'"*.x", "a.x"'
'"a/b/*.x", "a/b/c.x"'
'"a/b/**", "a/b/c.x"'
'"a/**", "a/b/c.x"'
'"a/**/*", "a/b/c/d.x"'
'"a/**/d.x", "a/b/c/d.x"'
'"**/*.x", "a/b/c.x"'
'"a/b**/d.x", "a/bb/c/d.x"'
'"a/**b/d.x", "a/bb/c/d.x"'
'"a/b**/*", "a/bb/c/d.x"'
'"**.x", "a/b/c.x"'
'"*.x", "a/b/c.x"'
'"c.x", "a/b/c.x"'
'"b/*.x", "a/b/c.x"'
)
pr --merge --omit-header -s'|' <(echo) <(
echo 'Pattern|Path'
echo '-------|----'
for line in "${lines[@]}"; do
(IFS=', '; printf '`%s`|`%s`\n' $line)
done
) <(
echo 'Crystal'
echo '-----'
(
for line in "${lines[@]}"; do
echo "p File.match?($line)"
done
) | crystal eval
) <(
echo 'Go[^1]'
echo '-----'
(
touch go.mod
go mod edit -module=test
go get github.com/gobwas/glob
echo '
package main
import (
"fmt"
"github.com/gobwas/glob"
)
func matches(pattern, path string) string {
if g, e := glob.Compile(pattern, '"'/'"'); e != nil {
return "error"
} else {
return fmt.Sprintf("%v", g.Match(path))
}
}
func main() {
'
for line in "${lines[@]}"; do
echo "fmt.Println(matches($line))"
done
echo '}'
) > test.go && go run test.go
) <(
echo 'Python'
echo '-----'
(
echo '
import pathlib
def matches(pattern, path):
    return pathlib.PurePath(path).full_match(pattern)
'
for line in "${lines[@]}"; do
echo "print(matches($line))"
done
) | python
) <(
echo 'Rust[^2]'
echo '-----'
echo '
[package]
name = "test"
edition = "2021"
[dependencies]
glob-match = "0.2.1"
[[bin]]
name = "test"
' > Cargo.toml
mkdir -p src/bin/test
(
echo '
extern crate glob_match;
use glob_match::glob_match;
fn main() {
'
for line in "${lines[@]}"; do
echo "println!(\"{}\", glob_match($line));"
done
echo '}'
) > src/bin/test/main.rs
cargo run --quiet
) <(
echo 'Ruby'
echo '-----'
(
for line in "${lines[@]}"; do
echo "p File.fnmatch?($line, File::FNM_PATHNAME | File::FNM_NOESCAPE)"
done
) | ruby
) <(
echo 'C'
echo '-----'
(
echo '
#include <stdio.h>
#include <stdlib.h>
#include <fnmatch.h>
int main() {
int matched;
'
for line in "${lines[@]}"; do
echo " matched = fnmatch($line, FNM_PATHNAME | FNM_NOESCAPE);"
echo ' printf("%s\n", matched == 0 ? "true" : matched == FNM_NOMATCH ? "false" : "error");'
done
echo '}'
) > test.c && gcc test.c -o test_c && ./test_c
) <(echo) | sed -e 's/\btrue\b/✅/Ig' -e 's/\bfalse\b/❌/Ig' -e 's/\berror\b/🚨/Ig'
echo
echo '[^1]: https://github.com/gobwas/glob'
echo '[^2]: https://github.com/devongovett/glob-match'
Footnotes |
The algorithm for
This needs some more research to figure out how (or even if) we can implement double wildcards correctly with this algorithm, or if we need a different solution. |
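As one possible direction for a different solution, here is a minimal sketch of segment-wise matching in Python. It is not Crystal's current algorithm and not a proposal from the thread: `match_segments` and `glob_match` are hypothetical names, `**` is only treated specially when it forms a whole path segment, and per-segment matching is delegated to the standard `fnmatch.fnmatchcase`.

```python
from fnmatch import fnmatchcase


def match_segments(pattern_parts, path_parts):
    """Match lists of path segments; "**" may span several segments (sketch)."""
    if not pattern_parts:
        # The pattern is exhausted: succeed only if the path is too.
        return not path_parts
    head, rest = pattern_parts[0], pattern_parts[1:]
    if head == "**":
        # Try letting "**" consume zero, one, ... up to all remaining segments.
        return any(match_segments(rest, path_parts[skip:])
                   for skip in range(len(path_parts) + 1))
    if not path_parts:
        return False
    # Ordinary segment: wildcards apply within this single segment only.
    return (fnmatchcase(path_parts[0], head)
            and match_segments(rest, path_parts[1:]))


def glob_match(pattern, path):
    return match_segments(pattern.split("/"), path.split("/"))


print(glob_match("a/**/d.x", "a/b/c/d.x"))
print(glob_match("*.x", "a/b/c.x"))
```

Splitting on `/` first keeps single wildcards from crossing directory boundaries by construction, at the cost of backtracking over the segments that `**` can consume.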
How about just reusing (parts of)
# Returns an array of all files that match against any of *patterns*.
#
# ```
# Dir.glob "path/to/folder/*.txt" # Returns all files in the target folder that end in ".txt".
# Dir.glob "path/to/folder/**/*" # Returns all files in the target folder and its subfolders.
# ```
# The pattern syntax is similar to shell filename globbing, see `File.match?` for details. |
I think I find the Python behaviour the most reasonable in the comparison, or raising an error when |
This is how Crystal behaves:
In comparison to Ruby: