@@ -10,6 +10,9 @@ Ada is a fast and spec-compliant URL parser written in C++.
1010The specification for the URL parser can be found on the
1111[WHATWG](https://url.spec.whatwg.org/#url-parsing) website.
1212
13+ The Ada library also includes a [URLPattern](https://url.spec.whatwg.org/#urlpattern) implementation
14+ that is compatible with the [web-platform tests](https://github.com/web-platform-tests/wpt/tree/master/urlpattern).
15+
1316The Ada library passes the full range of tests from the specification,
1417across a wide range of platforms (e.g., Windows, Linux, macOS). It fully
1518supports the relevant [Unicode Technical Standard](https://www.unicode.org/reports/tr46/#ToUnicode).
@@ -19,16 +22,11 @@ The WHATWG URL specification has been adopted by most browsers. Other tools, su
1922standard libraries, follow RFC 3986. The following table illustrates possible differences in practice
2023(encoding of the host, encoding of the path):
2124
22- | string source | string value |
23- | :--------------|:--------------|
24- | input string | https://www.7‑Eleven.com/Home/Privacy/Montréal |
25+ | string source | string value |
26+ | :------------------------ | :------------------------------------------------------------ |
27+ | input string | https://www.7‑Eleven.com/Home/Privacy/Montréal |
2528| ada's normalized string | https://www.xn--7eleven-506c.com/Home/Privacy/Montr%C3%A9al |
26- | curl 7.87 | (returns the original unchanged) |
27-
28- ### Requirements
29-
30- The project is otherwise self-contained and it has no dependency.
31- A recent C++ compiler supporting C++20. We test GCC 12 or better, LLVM 12 or better and Microsoft Visual Studio 2022.
29+ | curl 7.87 | (returns the original unchanged) |
3230
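The path column of the table can be reproduced with a small, self-contained sketch. This is not Ada's implementation: the function name `percent_encode_non_ascii` is hypothetical, and the sketch only applies the WHATWG "C0 control percent-encode set" (C0 controls and bytes above `0x7E`), which is a subset of what a real URL parser encodes in a path. Host normalization (the `xn--` form in the table) is not covered here.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper, for illustration only (not part of Ada).
// Percent-encodes every byte that is a C0 control or above 0x7E.
// The input is assumed to be valid UTF-8, so each byte of a
// multi-byte sequence is encoded separately.
std::string percent_encode_non_ascii(std::string_view input) {
  static constexpr char hex[] = "0123456789ABCDEF";
  std::string out;
  out.reserve(input.size());
  for (unsigned char byte : input) {
    if (byte < 0x20 || byte > 0x7E) {
      out += '%';
      out += hex[byte >> 4];    // high nibble
      out += hex[byte & 0x0F];  // low nibble
    } else {
      out += static_cast<char>(byte);
    }
  }
  return out;
}
```

Called on the path `/Home/Privacy/Montréal` (where `é` is the UTF-8 byte pair `C3 A9`), this sketch yields `/Home/Privacy/Montr%C3%A9al`, matching the path in the normalized row above.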
3331## Ada is fast.
3432
@@ -50,9 +48,12 @@ Ada has improved the performance of the popular JavaScript environment Node.js:
5048
5149The Ada library is used by important systems besides Node.js such as Redpanda, Kong, Telegram and Cloudflare Workers.
5250
51+ [![the ada library](http://img.youtube.com/vi/tQ-6OWRDsZg/0.jpg)](https://www.youtube.com/watch?v=tQ-6OWRDsZg) <br />
5352
53+ ### Requirements
5454
55- [![the ada library](http://img.youtube.com/vi/tQ-6OWRDsZg/0.jpg)](https://www.youtube.com/watch?v=tQ-6OWRDsZg) <br />
55+ Ada requires a recent C++ compiler supporting C++20. We test GCC 12 or better, LLVM 14 or better and Microsoft Visual Studio 2022.
56+ The project is otherwise self-contained and has no dependencies.
5657
5758## Installation
5859
@@ -67,8 +68,8 @@ Linux or macOS users might follow the following instructions if they have a rece
6768
68691. Pull the library into a directory
6970 ```
70- wget https://github.com/ada-url/ada/releases/download/v2.6.10/ada.cpp
71- wget https://github.com/ada-url/ada/releases/download/v2.6.10/ada.h
71+ wget https://github.com/ada-url/ada/releases/download/v3.0.0/ada.cpp
72+ wget https://github.com/ada-url/ada/releases/download/v3.0.0/ada.h
7273 ```
73742. Create a new file named `demo.cpp` with this content:
7475 ``` C++
@@ -131,7 +132,7 @@ components (path, host, and so forth).
131132- Parse and validate a URL from an ASCII or a valid UTF-8 string.
132133
133134```cpp
134- auto url = ada::parse("https://www.google.com");
135+ auto url = ada::parse<ada::url_aggregator>("https://www.google.com");
135136if (url) { /* URL is valid */ }
136137```
137138
@@ -140,89 +141,45 @@ accessing it when you are not sure that it will succeed. The following
140141code is unsafe:
141142
142143``` cpp
143- auto url = ada::parse(" some bad url" );
144+ auto url = ada::parse<ada::url_aggregator>("some bad url");
144145url->get_href();
145146```
146147
147- You should do...
148-
149- ``` cpp
150- auto url = ada::parse(" some bad url" );
151- if (url) {
152- // next line is now safe:
153- url->get_href();
154- } else {
155- // report a parsing failure
156- }
157- ```
158-
159148For simplicity, in the examples below, we skip the check because
160149we know that parsing succeeds. All strings are assumed to be valid
161150UTF-8 strings.
162151
163- ### Examples
152+ ## Examples
164153
165- - Get/Update credentials
154+ ### URL Parser
166155
167- ``` cpp
168- auto url = ada::parse(" https://www.google.com" );
169- url->set_username ("username");
156+ ``` c++
157+ auto url = ada::parse<ada::url_aggregator>("https://www.google.com");
158+
159+ url->set_username("username"); // Update credentials
170160url->set_password("password");
171161// url->get_href() will return "https://username:password@www.google.com/"
172- ```
173162
174- - Get/Update Protocol
175-
176- ```cpp
177- auto url = ada::parse("https://www.google.com");
178- url->set_protocol("wss");
163+ url->set_protocol("wss"); // Update protocol
179164// url->get_protocol() will return "wss:"
180- // url->get_href() will return "wss://www.google.com/"
181- ```
182165
183- - Get/Update host
184-
185- ``` cpp
186- auto url = ada::parse(" https://www.google.com" );
187- url->set_host ("github.com");
166+ url->set_host("github.com"); // Update host
188167// url->get_host() will return "github.com"
189- // you can use ` url.set_hostname ` depending on your usage.
190- ```
191168
192- - Get/Update port
193-
194- ```cpp
195- auto url = ada::parse("https://www.google.com");
196- url->set_port("8080");
169+ url->set_port("8080"); // Update port
197170// url->get_port() will return "8080"
198- ```
199171
200- - Get/Update pathname
201-
202- ``` cpp
203- auto url = ada::parse(" https://www.google.com" );
204- url->set_pathname ("/my-super-long-path")
172+ url->set_pathname("/my-super-long-path"); // Update pathname
205173// url->get_pathname() will return "/my-super-long-path"
206- ```
207-
208- - Get/Update search/query
209174
210- ```cpp
211- auto url = ada::parse("https://www.google.com");
212- url->set_search("target=self");
175+ url->set_search("target=self"); // Update search
213176// url->get_search() will return "?target=self"
214- ```
215-
216- - Get/Update hash/fragment
217177
218- ``` cpp
219- auto url = ada::parse(" https://www.google.com" );
220- url->set_hash ("is-this-the-real-life");
178+ url->set_hash("is-this-the-real-life"); // Update hash/fragment
221179// url->get_hash() will return "#is-this-the-real-life"
222180```
223- For more information about command-line options, please refer to the [CLI documentation](docs/cli.md).
224181
225- - URL search params
182+ ### URL Search Params
226183
227184```cpp
228185ada::url_search_params search_params("a=b&c=d&e=f");
@@ -236,6 +193,40 @@ while (keys.has_next()) {
236193}
237194```
238195
196+ ### URLPattern
197+
198+ Our implementation does not ship a regex engine; the choice of engine is left to the user.
199+ This is a security measure: the default `std::regex` engine is not safe and is vulnerable to denial-of-service (ReDoS) attacks.
200+ Runtimes like Node.js and Cloudflare Workers use the V8 regex engine, which is safe and performant.
201+
202+ ``` cpp
203+ // Define a regex engine that conforms to the following interface
204+ // For example we will use v8 regex engine
205+
206+ class v8_regex_provider {
207+ public:
208+ v8_regex_provider() = default;
209+   using regex_type = v8::Global<v8::RegExp>;
210+   static std::optional<regex_type> create_instance(std::string_view pattern,
211+                                                    bool ignore_case);
212+   static std::optional<std::vector<std::optional<std::string>>> regex_search(
213+       std::string_view input, const regex_type& pattern);
214+   static bool regex_match(std::string_view input, const regex_type& pattern);
215+ };
216+
217+ // Define a URLPattern
218+ auto pattern = ada::parse_url_pattern<v8_regex_provider>("/books/:id(\\d+)", "https://example.com");
219+
220+ // Check validity
221+ if (!pattern) { return EXIT_FAILURE; }
222+
223+ // Match a URL
224+ auto match = pattern->match("https://example.com/books/123");
225+
226+ // Test a URL
227+ auto matched = pattern->test("https://example.com/books/123");
228+ ```
229+
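To make the provider interface above more concrete, here is a minimal sketch of a class with the same three static functions, backed by `std::regex`. It is for illustration only: as noted above, `std::regex` is not a safe choice for untrusted patterns. The class name `std_regex_provider_sketch` is hypothetical, and the choice to return only the capture groups (skipping the full match at index 0) is an assumption rather than something the text above specifies.

```cpp
#include <cstddef>
#include <optional>
#include <regex>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical provider, for illustration only. Not safe for untrusted input.
class std_regex_provider_sketch {
 public:
  using regex_type = std::regex;

  // Compiles the pattern; returns std::nullopt if the pattern is invalid.
  static std::optional<regex_type> create_instance(std::string_view pattern,
                                                   bool ignore_case) {
    auto flags = std::regex::ECMAScript;
    if (ignore_case) flags |= std::regex::icase;
    try {
      return std::regex(pattern.data(), pattern.size(), flags);
    } catch (const std::regex_error&) {
      return std::nullopt;
    }
  }

  // Searches the input; on success returns one entry per capture group
  // (assumption: the full match at index 0 is skipped).
  static std::optional<std::vector<std::optional<std::string>>> regex_search(
      std::string_view input, const regex_type& pattern) {
    std::match_results<std::string_view::const_iterator> m;
    if (!std::regex_search(input.begin(), input.end(), m, pattern)) {
      return std::nullopt;
    }
    std::vector<std::optional<std::string>> groups;
    for (std::size_t i = 1; i < m.size(); ++i) {
      if (m[i].matched) {
        groups.emplace_back(m[i].str());
      } else {
        groups.emplace_back(std::nullopt);  // group did not participate
      }
    }
    return groups;
  }

  // Returns true only if the whole input matches the pattern.
  static bool regex_match(std::string_view input, const regex_type& pattern) {
    return std::regex_match(input.begin(), input.end(), pattern);
  }
};
```

A class shaped like this could, in principle, be passed as the template argument where `v8_regex_provider` appears above; a production deployment would likely prefer an engine with guarantees against catastrophic backtracking.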
239230### C wrapper
240231
241232See the file `include/ada_c.h` for our C interface. We expect ASCII or UTF-8 strings.
@@ -298,23 +289,21 @@ c++ demo.o ada.o -o cdemo
298289./cdemo
299290```
300291
292+ ### Command-line interface
293+
294+ For more information about command-line options, please refer to the [CLI documentation](docs/cli.md).
295+
301296### CMake dependency
302297
303298See the file `tests/installation/CMakeLists.txt` for an example of how you might use ada from your own
304299CMake project, after having installed ada on your system.
305300
306- ## Installation
307-
308- ### Homebrew
309-
310- Ada is available through [Homebrew](https://formulae.brew.sh/formula/ada-url#default).
311- You can install Ada using `brew install ada-url`.
312-
313301## Contributing
314302
315303### Building
316304
317- Ada uses cmake as a build system. It's recommended you to run the following commands to build it locally.
305+ Ada uses CMake as its build system, but also supports Bazel. We recommend running the following
306+ commands to build it locally.
318307
319308Without tests:
320309
@@ -325,16 +314,13 @@ With tests (requires git):
325314- **Build**: `cmake -B build -DADA_TESTING=ON && cmake --build build`
326315- **Test**: `ctest --output-on-failure --test-dir build`
327316
328-
329317With tests (requires available local packages):
330318
331319- **Build**: `cmake -B build -DADA_TESTING=ON -D CPM_USE_LOCAL_PACKAGES=ON && cmake --build build`
332320- **Test**: `ctest --output-on-failure --test-dir build`
333321
334322Windows users need additional flags to specify the build configuration, e.g. `--config Release`.
335323
336-
337-
338324The project can also be built via Docker, using the repository's default Dockerfile, with the following command.
339325
340326`docker build -t ada-builder . && docker run --rm -it -v ${PWD}:/repo ada-builder`
@@ -352,5 +338,4 @@ Our tests include third-party code and data. The benchmarking code includes thir
352338
353339### Further reading
354340
355-
356341* Yagiz Nizipli, Daniel Lemire, [Parsing Millions of URLs per Second](https://doi.org/10.1002/spe.3296), Software: Practice and Experience 54(5) May 2024.