implement URLPattern #785

Open
wants to merge 148 commits into base: main
Changes from 117 commits
b963c09
implement URLPattern skeleton
anonrig Nov 30, 2024
668f950
use correct value for clang-format
anonrig Nov 30, 2024
f0cfe7b
fix build errors
anonrig Nov 30, 2024
9772f44
create url_pattern-inl.h
anonrig Nov 30, 2024
d0abcff
add canonicalize methods
anonrig Dec 4, 2024
8e7ac7c
add ada::parse_url_pattern function
anonrig Dec 4, 2024
62eba64
add more comments
anonrig Dec 4, 2024
f2a1b1d
implement getters
anonrig Dec 4, 2024
ee48e30
add has_regexp_groups()
anonrig Dec 4, 2024
615f1ef
start implementing tokenizer & tokenize
anonrig Dec 4, 2024
5a27a07
add initial parser_url_pattern method
anonrig Dec 4, 2024
6634c09
add todos and remove redundant qualifiers
anonrig Dec 4, 2024
e1ea3a1
implement escape pattern
anonrig Dec 4, 2024
a5ffd39
add CompileComponentOptions
anonrig Dec 7, 2024
9899579
minor fixes for add-url-pattern (#800)
lemire Dec 8, 2024
7c40686
rename commits
anonrig Dec 8, 2024
ce624cd
add more parse_url_pattern
anonrig Dec 9, 2024
d1b3af1
rename url_pattern class
anonrig Dec 9, 2024
b8b2d4d
complete parse_url_pattern implementation
anonrig Dec 9, 2024
770a221
add `_component` suffix to components
anonrig Dec 9, 2024
66d2a67
remove unnecessary void
anonrig Dec 9, 2024
af903a1
implement generate regular expression methods
anonrig Dec 10, 2024
2df15e8
continue working on parser
anonrig Dec 10, 2024
22295d4
fix build error
anonrig Dec 12, 2024
5a66288
implement constructor string parser
anonrig Dec 12, 2024
f72fb07
implement all of tokenizer's functions
anonrig Dec 12, 2024
b424fd7
fix build errors
anonrig Dec 12, 2024
62f17fa
fix warnings
anonrig Dec 12, 2024
e18f51e
complete tokenizer
anonrig Dec 12, 2024
ab16828
implement escape_regexp_string
anonrig Dec 12, 2024
95c15f5
implement generate_pattern_string
anonrig Dec 12, 2024
a036101
fix compiler warnings
anonrig Dec 12, 2024
2e4b005
semi-implement match
anonrig Dec 13, 2024
0117444
complete one more todo
anonrig Dec 13, 2024
3bdb85c
simplify create_component_match_result
anonrig Dec 13, 2024
f5f408d
simplify
anonrig Dec 13, 2024
72a7527
use correct inputs for match/exec/test
anonrig Dec 13, 2024
6c0a11b
rename wpt_tests to wpt_url_tests
anonrig Dec 13, 2024
3c913be
add wpt_urlpattern_tests skeleton
anonrig Dec 13, 2024
69252c6
add first test
anonrig Dec 14, 2024
c8f4349
Build fixes (#801)
lemire Dec 14, 2024
d8d9225
fix 2 bugs
anonrig Dec 14, 2024
4191ee2
fix linter issues
anonrig Dec 14, 2024
fdf3b7a
fix 2 more bugs
anonrig Dec 14, 2024
47e6821
more progress on missing features
anonrig Dec 16, 2024
a584048
move url_pattern_helpers to separate file
anonrig Dec 16, 2024
c9edd50
fix build errors
anonrig Dec 16, 2024
a6d05ca
use url_pattern_encoding_callback
anonrig Dec 17, 2024
1193220
fix url pattern constructor error
anonrig Dec 17, 2024
c41af3f
fix more issues
anonrig Dec 17, 2024
71a5cd6
add initial version of wpt test runner
anonrig Dec 17, 2024
9c8035c
simplify json logic (#802)
lemire Dec 17, 2024
03a3406
add fuzzer
anonrig Dec 17, 2024
fdf74ba
removing the reset
lemire Dec 17, 2024
22fefea
update ada idna
anonrig Dec 18, 2024
616072b
use ada idna method for valid name code point
anonrig Dec 18, 2024
15da643
fix add part implementation
anonrig Dec 18, 2024
2d29dae
fix invalid access errors
anonrig Dec 18, 2024
53c28fe
implement tests correctly
anonrig Dec 18, 2024
f975719
improve test runner
anonrig Dec 18, 2024
ca16819
add url_pattern_init to_string() method
anonrig Dec 19, 2024
b692611
update WPT tests
anonrig Dec 19, 2024
600f981
fix last remaining todo
anonrig Dec 19, 2024
50d5ca5
simplify test runner
anonrig Dec 19, 2024
8456d16
minor fixes
lemire Dec 19, 2024
faffe57
some reworking
lemire Dec 19, 2024
8c153d7
make sure to skip invalid tests
anonrig Dec 19, 2024
4986212
remove std::ranges::iota due to clang
anonrig Dec 20, 2024
f57c131
add more fuzzing coverage
anonrig Dec 20, 2024
afb5c6b
try to fix windows issues
anonrig Dec 20, 2024
32e67d6
remove unnecessary copy
anonrig Dec 20, 2024
26075ab
start testing the validity of the correct responses
anonrig Dec 20, 2024
46133ec
fix couple of bugs
anonrig Dec 20, 2024
f1f36ab
fix invalid ascii checks
anonrig Dec 20, 2024
d74c6fe
make pattern generation more verbose
anonrig Dec 20, 2024
aff51cf
fix regex error
anonrig Dec 20, 2024
0062fd2
remove semicolon due to -Werror,-Wextra-semi
anonrig Dec 20, 2024
776f670
guarding regex call (#805)
lemire Dec 20, 2024
5a2e4ee
add more logging
anonrig Dec 23, 2024
fe256d7
change ada_idna to char32_t
anonrig Dec 23, 2024
1c092f6
remove try/catch
anonrig Dec 23, 2024
c0a22da
make canonicalize_ methods more flexible
anonrig Dec 23, 2024
384f035
fix change_state
anonrig Dec 23, 2024
65aa110
fix invalid substr call
anonrig Dec 23, 2024
8b1021e
fix generate_pattern_string impl
anonrig Dec 23, 2024
b01a0be
fix more small issues
anonrig Dec 23, 2024
b8392e4
improve url_pattern_init::process
anonrig Dec 23, 2024
ae728f1
correctly computing the next code point (#808)
lemire Dec 23, 2024
2d594ea
use std string view to avoid copy
anonrig Dec 23, 2024
5063bfa
use next_index instead of index
anonrig Dec 23, 2024
e772c8c
adding checks
lemire Dec 23, 2024
0e018e8
Merge branch 'yagiz/add-url-pattern' of https://github.com/ada-url/ad…
lemire Dec 23, 2024
8e284ef
highlight the error message
anonrig Dec 23, 2024
13b2f79
better decoding
lemire Dec 23, 2024
8586b04
I think that the test is in error (#810)
lemire Dec 23, 2024
d17f000
remove invalid WPT test data
anonrig Dec 24, 2024
666d41e
remove invalid assertion
anonrig Dec 24, 2024
e8a1b23
fix ipv6 address canonicalize
anonrig Dec 24, 2024
41ec170
fix canonicalize_ipv6_hostname
anonrig Dec 24, 2024
cf42ab6
simplify test runner
anonrig Dec 24, 2024
b35eb1a
fix test runner
anonrig Dec 24, 2024
22103f4
add a todo
anonrig Dec 24, 2024
0eb3e3c
remove invalid test case
anonrig Dec 24, 2024
8744150
add tests for expected object
anonrig Dec 24, 2024
86c06bf
fix hostname tests
anonrig Dec 25, 2024
fbf9a4b
complete match implementation
anonrig Dec 25, 2024
7393441
fix empty component tests
anonrig Dec 26, 2024
99da6e7
revert some wpt changes
anonrig Dec 26, 2024
2730b70
add some optional result logging (#812)
lemire Dec 26, 2024
ae17a77
Merge branch 'main' into yagiz/add-url-pattern
lemire Dec 26, 2024
5ef18a3
lint
lemire Dec 26, 2024
d9e8097
fixing logging
lemire Dec 26, 2024
b6b2ec9
removing diagram printout
Dec 27, 2024
bd8ac45
fix asan build errors
anonrig Dec 28, 2024
f3d65c0
simpler version of the yagiz/add-url-pattern branch (#815)
lemire Dec 28, 2024
e7c580d
simplify implementation
anonrig Dec 29, 2024
5ced1da
improve url_pattern_part emplace_back calls
anonrig Dec 29, 2024
674c0bb
fix url_pattern_component constructor
anonrig Dec 31, 2024
4ed57c3
remove the usage of ada.h inside src
anonrig Dec 31, 2024
3b51a9d
move all helper methods to url_pattern.cpp
anonrig Dec 31, 2024
d5b8cfc
fix urlpatterntestdata.json
anonrig Dec 31, 2024
3aad757
fix build errors
anonrig Dec 31, 2024
eebb5d4
add missing check
anonrig Dec 31, 2024
8a3d871
more tests (#817)
lemire Dec 31, 2024
7985050
fix assertion error
anonrig Dec 31, 2024
c0db9c8
don't move function calls
anonrig Dec 31, 2024
6b8ab43
fix token reference asan error
anonrig Dec 31, 2024
821a65a
another test (#818)
lemire Dec 31, 2024
4386553
simplify parser and tests
anonrig Jan 1, 2025
ea7e886
remove unnecessary duplicate_name method
anonrig Jan 1, 2025
ae39532
convert Token to class
anonrig Jan 1, 2025
7fd8c31
minor cleanups
anonrig Jan 1, 2025
3ab54e8
remove invalid std::move
anonrig Jan 1, 2025
21f1779
simplify parser
anonrig Jan 1, 2025
92d0838
remove invalid pathname WPT
anonrig Jan 1, 2025
64c7810
leave some todos for WPT
anonrig Jan 1, 2025
6b79301
complete inputs parsing
anonrig Jan 1, 2025
01ccd73
removed duplicated code
anonrig Jan 3, 2025
223cf5d
merge error enums
anonrig Jan 3, 2025
f2b423e
fix a boolean operation
anonrig Jan 3, 2025
ab605c6
update urlpatterntestdata.json
anonrig Jan 3, 2025
6e23ffa
remove unnecessary assertions
anonrig Jan 3, 2025
f310eec
removing GLIBCXX debug
Jan 3, 2025
c856379
updating macos ci
Jan 3, 2025
0a84167
indent
Jan 3, 2025
90e7ff0
keeping only static
Jan 3, 2025
603a619
improve wpt runner
anonrig Jan 3, 2025
8a44b81
fix match
anonrig Jan 3, 2025
2 changes: 1 addition & 1 deletion .clang-format
@@ -1,2 +1,2 @@
BasedOnStyle: Google
-SortIncludes: false
+SortIncludes: Never
2 changes: 2 additions & 0 deletions .editorconfig
@@ -3,3 +3,5 @@ root = true
[*]
end_of_line = lf
insert_final_newline = true
indent_size = 2
indent_style = space
3 changes: 2 additions & 1 deletion .gitignore
@@ -25,6 +25,7 @@ benchmarks/competitors/servo-url/target

#ignore VScode
.vscode/
.idea

# bazel output
-bazel-*
+bazel-*
8 changes: 8 additions & 0 deletions fuzz/build.sh
@@ -37,6 +37,14 @@ $CXX $CFLAGS $CXXFLAGS \
$CXX $CFLAGS $CXXFLAGS $LIB_FUZZING_ENGINE url_search_params.o \
-o $OUT/url_search_params

$CXX $CFLAGS $CXXFLAGS \
-std=c++20 \
-I build/singleheader \
-c fuzz/url_pattern.cc -o url_pattern.o

$CXX $CFLAGS $CXXFLAGS $LIB_FUZZING_ENGINE url_pattern.o \
-o $OUT/url_pattern

$CXX $CFLAGS $CXXFLAGS \
-std=c++20 \
-I build/singleheader \
44 changes: 44 additions & 0 deletions fuzz/url_pattern.cc
@@ -0,0 +1,44 @@
#include <fuzzer/FuzzedDataProvider.h>

#include <memory>
#include <string>

#include "ada.cpp"
#include "ada.h"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
FuzzedDataProvider fdp(data, size);
std::string source = fdp.ConsumeRandomLengthString(256);
std::string base_source = fdp.ConsumeRandomLengthString(256);

// Without base or options
auto result = ada::parse_url_pattern(source, nullptr, nullptr);
(void)result;

// Testing with base_url
std::string_view base_source_view(base_source.data(), base_source.length());
auto result_with_base =
ada::parse_url_pattern(source, &base_source_view, nullptr);
(void)result_with_base;

// Testing with base_url and options
ada::url_pattern_options options{.ignore_case = true};
auto result_with_base_and_options =
ada::parse_url_pattern(source, &base_source_view, &options);
(void)result_with_base_and_options;

// Testing with url_pattern_init and base url.
ada::url_pattern_init init{.protocol = source,
.username = source,
.password = source,
.hostname = source,
.port = source,
.pathname = source,
.search = source,
.hash = source};
auto result_with_init =
ada::parse_url_pattern(init, &base_source_view, nullptr);
(void)result_with_init;

return 0;
}
5 changes: 5 additions & 0 deletions include/ada.h
@@ -26,9 +26,14 @@
#include "ada/url_aggregator-inl.h"
#include "ada/url_search_params.h"
#include "ada/url_search_params-inl.h"
#include "ada/url_pattern.h"
#include "ada/url_pattern-inl.h"
#include "ada/url_pattern_helpers.h"
#include "ada/url_pattern_helpers-inl.h"

// Public API
#include "ada/ada_version.h"
#include "ada/implementation.h"
#include "ada/implementation-inl.h"

#endif // ADA_H
25 changes: 24 additions & 1 deletion include/ada/ada_idna.h
@@ -1,4 +1,4 @@
-/* auto-generated on 2024-12-05 20:44:13 -0500. Do not edit! */
+/* auto-generated on 2024-12-18 09:44:34 -0500. Do not edit! */
/* begin file include/idna.h */
#ifndef ADA_IDNA_H
#define ADA_IDNA_H
@@ -141,6 +141,29 @@ std::string to_unicode(std::string_view input);

#endif // ADA_IDNA_TO_UNICODE_H
/* end file include/ada/idna/to_unicode.h */
/* begin file include/ada/idna/identifier.h */
#ifndef ADA_IDNA_IDENTIFIER_H
#define ADA_IDNA_IDENTIFIER_H

#include <string>
#include <string_view>

namespace ada::idna {

// Access the first code point of the input string.
// Verify if it is valid name code point given a Unicode code point and a
// boolean first: If first is true return the result of checking if code point
// is contained in the IdentifierStart set of code points. Otherwise return the
// result of checking if code point is contained in the IdentifierPart set of
// code points. Returns false if the input is empty or the code point is not
// valid. There is minimal Unicode error handling: the input should be valid
// UTF-8. https://urlpattern.spec.whatwg.org/#is-a-valid-name-code-point
bool valid_name_code_point(char32_t input, bool first);

} // namespace ada::idna

#endif
/* end file include/ada/idna/identifier.h */

#endif
/* end file include/idna.h */
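The comment on `valid_name_code_point` describes a two-mode check: the first code point of a name must be in IdentifierStart, and later ones in IdentifierPart. A minimal sketch of that rule, restricted to ASCII, might look like the following. The helper names (`is_ascii_letter`, `is_ascii_digit`, `ascii_valid_name_code_point`) are hypothetical and not part of ada; the real function also has to consult Unicode property tables, which this sketch deliberately leaves out by rejecting every non-ASCII code point.

```cpp
// ASCII-only approximation of the "is a valid name code point" rule.
// Assumption: these names are illustrative and do not exist in ada.

constexpr bool is_ascii_letter(char32_t c) {
  return (c >= U'a' && c <= U'z') || (c >= U'A' && c <= U'Z');
}

constexpr bool is_ascii_digit(char32_t c) { return c >= U'0' && c <= U'9'; }

constexpr bool ascii_valid_name_code_point(char32_t c, bool first) {
  // Non-ASCII input needs Unicode tables; this sketch rejects it outright.
  if (c > 0x7F) return false;
  // Letters, '$' and '_' are accepted in both positions.
  if (is_ascii_letter(c) || c == U'$' || c == U'_') return true;
  // Digits are accepted only after the first code point.
  return !first && is_ascii_digit(c);
}

// Compile-time usage: ":id1" names a group, ":1id" does not start one.
static_assert(ascii_valid_name_code_point(U'i', true), "letter may start");
static_assert(!ascii_valid_name_code_point(U'1', true), "digit may not start");
static_assert(ascii_valid_name_code_point(U'1', false), "digit may continue");
static_assert(!ascii_valid_name_code_point(U'-', false), "hyphen is rejected");
```

Taking a `char32_t` rather than a string keeps UTF-8 decoding in the caller, which matches the signature shown in the hunk above.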
7 changes: 7 additions & 0 deletions include/ada/common_defs.h
@@ -250,4 +250,11 @@ namespace ada {
#define ada_lifetime_bound
#endif

#ifdef __has_include
#if __has_include(<format>)
#include <format>
#define ADA_HAS_FORMAT 1
#endif
#endif

#endif // ADA_COMMON_DEFS_H
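For illustration, here is one way a header-detection guard like the one above could gate a call to `std::format` with a fallback. This is a sketch under stated assumptions: the function `describe` is hypothetical, and the extra check on the standard feature-test macro `__cpp_lib_format` is an addition of this example, not something the diff does. The extra check is there because the presence of the `<format>` header does not by itself guarantee that `std::format` is usable in the active language mode.

```cpp
#include <string>
#include <string_view>

// Same detection pattern as the hunk above: only include <format> if present.
#ifdef __has_include
#if __has_include(<format>)
#include <format>
#endif
#endif

// Hypothetical helper: renders "name=value" with std::format when the
// library advertises support, and with plain concatenation otherwise.
std::string describe(std::string_view name, int value) {
#if defined(__cpp_lib_format)
  return std::format("{}={}", name, value);
#else
  std::string out(name);
  out += '=';
  out += std::to_string(value);
  return out;
#endif
}
```

Both branches produce the same string, so callers do not need to know which one was compiled in.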
2 changes: 1 addition & 1 deletion include/ada/helpers.h
@@ -139,7 +139,7 @@ ada_really_inline std::pair<size_t, bool> get_host_delimiter_location(
* Removes leading and trailing C0 control and whitespace characters from
* string.
*/
-ada_really_inline void trim_c0_whitespace(std::string_view& input) noexcept;
+void trim_c0_whitespace(std::string_view& input) noexcept;

/**
* @private
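The doc comment for `trim_c0_whitespace` says it removes leading and trailing C0 control and whitespace characters. A minimal sketch of that behavior, assuming "C0 control or space" means the byte range 0x00 to 0x20 inclusive, could be written as below. The names `is_c0_control_or_space` and `trim_c0_whitespace_sketch` are hypothetical, and this is not ada's implementation.

```cpp
#include <string_view>

// Assumption: C0 control or space covers bytes 0x00 through 0x20.
inline bool is_c0_control_or_space(char c) noexcept {
  return static_cast<unsigned char>(c) <= 0x20;
}

// Shrinks the view in place; the underlying buffer is never modified.
inline void trim_c0_whitespace_sketch(std::string_view& input) noexcept {
  while (!input.empty() && is_c0_control_or_space(input.front())) {
    input.remove_prefix(1);
  }
  while (!input.empty() && is_c0_control_or_space(input.back())) {
    input.remove_suffix(1);
  }
}
```

Because the function only adjusts a `std::string_view`, it allocates nothing, and interior whitespace is left untouched.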