@@ -10,6 +10,9 @@ Ada is a fast and spec-compliant URL parser written in C++.
  The specification for the URL parser can be found on the
  [WHATWG](https://url.spec.whatwg.org/#url-parsing) website.

+ The Ada library also includes a [URLPattern](https://url.spec.whatwg.org/#urlpattern) implementation
+ that is compatible with the [web-platform tests](https://github.com/web-platform-tests/wpt/tree/master/urlpattern).
+
  The Ada library passes the full range of tests from the specification,
  across a wide range of platforms (e.g., Windows, Linux, macOS). It fully
  supports the relevant [Unicode Technical Standard](https://www.unicode.org/reports/tr46/#ToUnicode).
@@ -19,16 +22,11 @@ The WHATWG URL specification has been adopted by most browsers. Other tools, su
  standard libraries, follow the RFC 3986. The following table illustrates possible differences in practice
  (encoding of the host, encoding of the path):

- | string source | string value |
- | :--------------|:--------------|
- | input string | https://www.7‑Eleven.com/Home/Privacy/Montréal |
+ | string source | string value |
+ | :------------------------ | :------------------------------------------------------------ |
+ | input string | https://www.7‑Eleven.com/Home/Privacy/Montréal |
  | ada's normalized string | https://www.xn--7eleven-506c.com/Home/Privacy/Montr%C3%A9al |
- | curl 7.87 | (returns the original unchanged) |
-
- ### Requirements
-
- The project is otherwise self-contained and it has no dependency.
- A recent C++ compiler supporting C++20. We test GCC 12 or better, LLVM 12 or better and Microsoft Visual Studio 2022.
+ | curl 7.87 | (returns the original unchanged) |
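
The path difference in the table above comes down to percent-encoding: each byte of the UTF-8 encoding of `é` (0xC3 0xA9) is written as `%XX`. The following is a minimal sketch of that idea only. `percent_encode_non_ascii` is a hypothetical helper written for illustration, not part of Ada's API, and it escapes only bytes outside the ASCII range rather than implementing the full WHATWG percent-encode sets.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper (not part of Ada): percent-encode every non-ASCII byte
// of a UTF-8 string and leave ASCII bytes untouched.
std::string percent_encode_non_ascii(std::string_view input) {
  static constexpr char hex[] = "0123456789ABCDEF";
  std::string out;
  out.reserve(input.size());
  for (unsigned char byte : input) {
    if (byte < 0x80) {
      // ASCII byte: copied as-is in this simplified sketch.
      out.push_back(static_cast<char>(byte));
    } else {
      // Non-ASCII byte: emit '%' followed by two uppercase hex digits.
      out.push_back('%');
      out.push_back(hex[byte >> 4]);
      out.push_back(hex[byte & 0x0F]);
    }
  }
  return out;
}
```

Calling `percent_encode_non_ascii("/Home/Privacy/Montr\xC3\xA9" "al")` yields `/Home/Privacy/Montr%C3%A9al`, the path shown in the normalized row. The host is transformed by a different mechanism (the Unicode processing referenced earlier), which this sketch does not cover.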

  ## Ada is fast.

@@ -50,9 +48,12 @@ Ada has improved the performance of the popular JavaScript environment Node.js:

  The Ada library is used by important systems besides Node.js such as Redpanda, Kong, Telegram and Cloudflare Workers.

+ [![the ada library](http://img.youtube.com/vi/tQ-6OWRDsZg/0.jpg)](https://www.youtube.com/watch?v=tQ-6OWRDsZg)<br />

+ ### Requirements

- [![the ada library](http://img.youtube.com/vi/tQ-6OWRDsZg/0.jpg)](https://www.youtube.com/watch?v=tQ-6OWRDsZg)<br />
+ The project is otherwise self-contained and it has no dependency.
+ A recent C++ compiler supporting C++20. We test GCC 12 or better, LLVM 14 or better and Microsoft Visual Studio 2022.

  ## Installation

@@ -67,8 +68,8 @@ Linux or macOS users might follow the following instructions if they have a rece

  1. Pull the library in a directory
     ```
-    wget https://github.com/ada-url/ada/releases/download/v2.6.10/ada.cpp
-    wget https://github.com/ada-url/ada/releases/download/v2.6.10/ada.h
+    wget https://github.com/ada-url/ada/releases/download/v3.0.0/ada.cpp
+    wget https://github.com/ada-url/ada/releases/download/v3.0.0/ada.h
     ```
  2. Create a new file named `demo.cpp` with this content:
     ```C++
@@ -131,7 +132,7 @@ components (path, host, and so forth).
  - Parse and validate a URL from an ASCII or a valid UTF-8 string.

  ```cpp
- auto url = ada::parse("https://www.google.com");
+ auto url = ada::parse<ada::url_aggregator>("https://www.google.com");
  if (url) { /* URL is valid */ }
  ```

@@ -140,89 +141,45 @@ accessing it when you are not sure that it will succeed. The following
  code is unsafe:

  ```cpp
- auto url = ada::parse("some bad url");
+ auto url = ada::parse<ada::url_aggregator>("some bad url");
  url->get_href();
  ```

- You should do...
-
- ```cpp
- auto url = ada::parse("some bad url");
- if (url) {
-   // next line is now safe:
-   url->get_href();
- } else {
-   // report a parsing failure
- }
- ```
-
  For simplicity, in the examples below, we skip the check because
  we know that parsing succeeds. All strings are assumed to be valid
  UTF-8 strings.

- ### Examples
+ ## Examples

- - Get/Update credentials
+ ### URL Parser

- ```cpp
- auto url = ada::parse("https://www.google.com");
- url->set_username("username");
+ ```c++
+ auto url = ada::parse<ada::url_aggregator>("https://www.google.com");
+
+ url->set_username("username"); // Update credentials
  url->set_password("password");
  // url->get_href() will return "https://username:password@www.google.com/"
- ```

- - Get/Update Protocol
-
- ```cpp
- auto url = ada::parse("https://www.google.com");
- url->set_protocol("wss");
+ url->set_protocol("wss"); // Update protocol
  // url->get_protocol() will return "wss:"
- // url->get_href() will return "wss://www.google.com/"
- ```

- - Get/Update host
-
- ```cpp
- auto url = ada::parse("https://www.google.com");
- url->set_host("github.com");
+ url->set_host("github.com"); // Update host
  // url->get_host() will return "github.com"
- // you can use `url.set_hostname` depending on your usage.
- ```

- - Get/Update port
-
- ```cpp
- auto url = ada::parse("https://www.google.com");
- url->set_port("8080");
+ url->set_port("8080"); // Update port
  // url->get_port() will return "8080"
- ```

- - Get/Update pathname
-
- ```cpp
- auto url = ada::parse("https://www.google.com");
- url->set_pathname("/my-super-long-path")
+ url->set_pathname("/my-super-long-path"); // Update pathname
  // url->get_pathname() will return "/my-super-long-path"
- ```
-
- - Get/Update search/query

- ```cpp
- auto url = ada::parse("https://www.google.com");
- url->set_search("target=self");
+ url->set_search("target=self"); // Update search
  // url->get_search() will return "?target=self"
- ```
-
- - Get/Update hash/fragment

- ```cpp
- auto url = ada::parse("https://www.google.com");
- url->set_hash("is-this-the-real-life");
+ url->set_hash("is-this-the-real-life"); // Update hash/fragment
  // url->get_hash() will return "#is-this-the-real-life"
  ```
- For more information about command-line options, please refer to the [CLI documentation](docs/cli.md).

- - URL search params
+ ### URL Search Params

  ```cpp
  ada::url_search_params search_params("a=b&c=d&e=f");
@@ -236,6 +193,40 @@ while (keys.has_next()) {
  }
  ```

+ ### URLPattern
+
+ Our implementation doesn't provide a regex engine and leaves the decision of choosing the right engine to the user.
+ This is done as a security measure since the default std::regex engine is not safe and is vulnerable to denial-of-service (ReDoS) attacks.
+ Runtimes like Node.js and Cloudflare Workers use the V8 regex engine, which is safe and performant.
+
+ ```cpp
+ // Define a regex engine that conforms to the following interface
+ // For example, we will use the V8 regex engine
+
+ class v8_regex_provider {
+  public:
+   v8_regex_provider() = default;
+   using regex_type = v8::Global<v8::RegExp>;
+   static std::optional<regex_type> create_instance(std::string_view pattern,
+                                                    bool ignore_case);
+   static std::optional<std::vector<std::optional<std::string>>> regex_search(
+       std::string_view input, const regex_type& pattern);
+   static bool regex_match(std::string_view input, const regex_type& pattern);
+ };
+
+ // Define a URLPattern
+ auto pattern = ada::parse_url_pattern<v8_regex_provider>("/books/:id(\\d+)", "https://example.com");
+
+ // Check validity
+ if (!pattern) { return EXIT_FAILURE; }
+
+ // Match a URL
+ auto match = pattern->match("https://example.com/books/123");
+
+ // Test a URL
+ auto matched = pattern->test("https://example.com/books/123");
+ ```
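
To make the shape of the provider interface above more concrete, here is a minimal sketch of a class exposing the same three static functions, backed by `std::regex`. The name `std_regex_provider` and the choice to return only capture groups (excluding the full match) from `regex_search` are assumptions made for this illustration, not a statement of what Ada requires. As the text above warns, `std::regex` is not safe for untrusted patterns, so this is only suitable for experimentation.

```cpp
#include <cstddef>
#include <optional>
#include <regex>
#include <string>
#include <string_view>
#include <vector>

// Illustrative provider (assumed name) with the same static functions as the
// v8_regex_provider declaration, implemented on top of std::regex.
class std_regex_provider {
 public:
  std_regex_provider() = default;
  using regex_type = std::regex;

  // Compile the pattern; return std::nullopt if it is not a valid regex.
  static std::optional<regex_type> create_instance(std::string_view pattern,
                                                   bool ignore_case) {
    auto flags = std::regex::ECMAScript;
    if (ignore_case) { flags |= std::regex::icase; }
    try {
      return std::regex(pattern.data(), pattern.size(), flags);
    } catch (const std::regex_error&) {
      return std::nullopt;
    }
  }

  // Return the capture groups of the first match (group 0 excluded, by
  // assumption), or std::nullopt when nothing matches.
  static std::optional<std::vector<std::optional<std::string>>> regex_search(
      std::string_view input, const regex_type& pattern) {
    std::string subject(input);
    std::smatch match;
    if (!std::regex_search(subject, match, pattern)) { return std::nullopt; }
    std::vector<std::optional<std::string>> groups;
    for (std::size_t i = 1; i < match.size(); ++i) {
      if (match[i].matched) {
        groups.emplace_back(match[i].str());
      } else {
        groups.emplace_back(std::nullopt);  // group did not participate
      }
    }
    return groups;
  }

  // True when the whole input matches the pattern.
  static bool regex_match(std::string_view input, const regex_type& pattern) {
    return std::regex_match(input.begin(), input.end(), pattern);
  }
};
```

With this sketch, `std_regex_provider::create_instance("/books/(\\d+)", false)` produces a regex for which `regex_match("/books/123", ...)` is true and `regex_search` returns a single group holding `123`.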
+
  ### C wrapper

  See the file `include/ada_c.h` for our C interface. We expect ASCII or UTF-8 strings.
@@ -298,23 +289,21 @@ c++ demo.o ada.o -o cdemo
  ./cdemo
  ```

+ ### Command-line interface
+
+ For more information about command-line options, please refer to the [CLI documentation](docs/cli.md).
+
  ### CMake dependency

  See the file `tests/installation/CMakeLists.txt` for an example of how you might use ada from your own
  CMake project, after having installed ada on your system.

- ## Installation
-
- ### Homebrew
-
- Ada is available through [Homebrew](https://formulae.brew.sh/formula/ada-url#default).
- You can install Ada using `brew install ada-url`.
-
  ## Contributing

  ### Building

- Ada uses cmake as a build system. It's recommended you to run the following commands to build it locally.
+ Ada uses CMake as a build system, but also supports Bazel. We recommend running the following
+ commands to build it locally.

  Without tests:

@@ -325,16 +314,13 @@ With tests (requires git):
  - **Build**: `cmake -B build -DADA_TESTING=ON && cmake --build build`
  - **Test**: `ctest --output-on-failure --test-dir build`

-
  With tests (requires available local packages):

  - **Build**: `cmake -B build -DADA_TESTING=ON -D CPM_USE_LOCAL_PACKAGES=ON && cmake --build build`
  - **Test**: `ctest --output-on-failure --test-dir build`

  Windows users need additional flags to specify the build configuration, e.g. `--config Release`.

-
-
  The project can also be built with Docker, using the repository's default Dockerfile, with the following command.

  `docker build -t ada-builder . && docker run --rm -it -v ${PWD}:/repo ada-builder`
@@ -352,5 +338,4 @@ Our tests include third-party code and data. The benchmarking code includes thir

  ### Further reading

-
  * Yagiz Nizipli, Daniel Lemire, [Parsing Millions of URLs per Second](https://doi.org/10.1002/spe.3296), Software: Practice and Experience 54(5) May 2024.