
Commit 0c7f3f3

Nick-Nuon and lemire authored
Command line improvements (#371)
Co-authored-by: Daniel Lemire <[email protected]>
1 parent c93877b commit 0c7f3f3

10 files changed: +616 -79 lines changed

.gitignore

+3
@@ -22,3 +22,6 @@ singleheader/singleheader.zip
2222

2323
benchmarks/competitors/servo-url/debug
2424
benchmarks/competitors/servo-url/target
25+
26+
#ignore VScode
27+
.vscode/

README.md

+1-1
@@ -123,7 +123,7 @@ ada::result<ada::url_aggregator> url = ada::parse<ada::url_aggregator>("https://
123123
url->set_hash("is-this-the-real-life");
124124
// url->get_hash() will return "#is-this-the-real-life"
125125
```
126-
126+
For more information about command-line options, please refer to the [CLI documentation](docs/cli.md).
127127
### C wrapper
128128
129129
See the file `include/ada_c.h` for our C interface.

docs/cli.md

+146
@@ -0,0 +1,146 @@
1+
### Command line options
2+
3+
- Options:
4+
- `-d`, `--diagram`: Print a diagram of the result
5+
- `-u`, `--url`: URL Parameter (required)
6+
- `-h`, `--help`: Print usage
7+
- `-g`, `--get`: Get a specific part of the URL (e.g., 'origin', 'host'), as shown in the examples below
8+
- `-b`, `--benchmark`: Run benchmark for piped file functions
9+
- `-p`, `--path`: Process all the URLs in a given file
10+
- `-o`, `--output`: Output the results of the parsing to a file
11+
12+
### Usage/Examples:
13+
14+
Well-formatted URL:
15+
16+
```bash
17+
./buildbench/tools/adaparse "http://www.google.com"
18+
```
19+
Output:
20+
21+
```
22+
http://www.google.com
23+
```
24+
25+
Ill-formatted URL:
26+
27+
```bash
28+
./buildbench/tools/adaparse "h^tp:ws:/www.g00g.com"
29+
```
30+
Output:
31+
32+
```
33+
Invalid URL: h^tp:ws:/www.g00g.com
34+
```
35+
36+
37+
Diagram flag:
38+
39+
```bash
40+
$ ./buildbench/tools/adaparse -d http://www.google.com/bal\?a\=\=11\#fddfds
41+
```
42+
43+
Output:
44+
45+
```
46+
http://www.google.com/bal?a==11#fddfds [38 bytes]
47+
     | |             |   |     |
48+
     | |             |   |     `------ hash_start
49+
     | |             |   `------------ search_start 25
50+
     | |             `---------------- pathname_start 21
51+
     | |             `---------------- host_end 21
52+
     | `------------------------------ host_start 7
53+
     | `------------------------------ username_end 7
54+
     `-------------------------------- protocol_end 5
55+
```
56+
57+
58+
59+
### Piping Example
60+
61+
The `adaparse` tool can read URLs from piped input, one per line, which makes it easy to combine with other command-line tools. For example, to pipe the output of another command into it:
62+
63+
```bash
64+
cat dragonball_url.txt | ./buildbench/tools/adaparse
65+
```
66+
67+
Output:
68+
```
69+
http://www.goku.com
70+
http://www.vegeta.com
71+
http://www.gohan.com
72+
73+
```
74+
75+
Flags passed on the command line apply to every URL in the piped input:
76+
77+
```bash
78+
cat dragonball_url.txt | ./buildbench/tools/adaparse -g host
79+
```
80+
81+
Output:
82+
```
83+
www.goku.com
84+
www.vegeta.com
85+
www.gohan.com
86+
```
87+
88+
The benchmark flag can be used to output the time it takes to process piped input:
89+
90+
```bash
91+
cat wikipedia_100k.txt | ./buildbench/tools/adaparse -b
92+
```
93+
94+
```bash
95+
(---snip---)
96+
file:///opt
97+
file:///Users
98+
file:///Users/lemire
99+
file:///Users/lemire/tmp
100+
file:///Users/lemire/tmp/linuxdump
101+
file:///Users/lemire/tmp/linuxdump/linuxfiles.txt
102+
file:///.dockerenv
103+
read 10124906 bytes in 3071328453 ns using 169312 lines
104+
0.003296588481153891 GB/s
105+
```
106+
107+
The `-o` flag writes the results to a file on disk:
108+
109+
```bash
110+
111+
cat wikipedia_100k.txt | ./buildbench/tools/adaparse -o wiki_output.txt
112+
```
113+
114+
The `-p` flag reads URLs from a file on disk without going through `cat`:
115+
116+
```bash
117+
./buildbench/tools/adaparse -p wikipedia_top100.txt
118+
```
119+
120+
You may also combine flags. For example, to extract only the host from the URLs stored in wikipedia_top100.txt, write the results to test_write.txt, and report the timing:
121+
122+
```bash
123+
/build/tools/adaparse -p wikipedia_top100.txt -o test_write.txt -g host -b
124+
```
125+
126+
Console output:
127+
```bash
128+
read 5209265 bytes in 26737131 ns using 100000 lines, total_bytes is 5209265 used 160 loads
129+
0.19483260937757307 GB/s
130+
```
131+
132+
Content of test_write.txt:
133+
```bash
134+
(---snip---)
135+
en.wikipedia.org
136+
en.wikipedia.org
137+
en.wikipedia.org
138+
en.wikipedia.org
139+
en.wikipedia.org
140+
en.wikipedia.org
141+
en.wikipedia.org
142+
en.wikipedia.org
143+
en.wikipedia.org
144+
en.wikipedia.org
145+
(---snip---)
146+
```

tools/CMakeLists.txt

+1-8
@@ -1,8 +1 @@
1-
add_executable(adaparse adaparse.cpp)
2-
target_link_libraries(adaparse PRIVATE ada)
3-
target_include_directories(adaparse PUBLIC "$<BUILD_INTERFACE:${PROJECT_SOURCE_DIR}/include>")
4-
5-
include(${PROJECT_SOURCE_DIR}/cmake/import.cmake)
6-
import_dependency(cxxopts jarro2783/cxxopts eb78730)
7-
add_dependency(cxxopts)
8-
target_link_libraries(adaparse PRIVATE cxxopts::cxxopts)
1+
add_subdirectory(cli)

tools/adaparse.cpp

-70
This file was deleted.

tools/cli/CMakeLists.txt

+22
@@ -0,0 +1,22 @@
1+
add_executable(adaparse adaparse.cpp line_iterator.h)
2+
target_link_libraries(adaparse PRIVATE ada)
3+
target_include_directories(adaparse PUBLIC "$<BUILD_INTERFACE:${PROJECT_SOURCE_DIR}/include>")
4+
5+
include(${PROJECT_SOURCE_DIR}/cmake/import.cmake)
6+
import_dependency(cxxopts jarro2783/cxxopts eb78730)
7+
add_dependency(cxxopts)
8+
import_dependency(fmt fmtlib/fmt a337011)
9+
add_dependency(fmt)
10+
target_link_libraries(adaparse PRIVATE cxxopts::cxxopts fmt::fmt)
11+
12+
if(MSVC OR MINGW)
13+
target_compile_definitions(adaparse PRIVATE _CRT_SECURE_NO_WARNINGS _CRT_NONSTDC_NO_DEPRECATE)
14+
endif()
15+
16+
install(
17+
TARGETS
18+
adaparse
19+
ARCHIVE DESTINATION ${CMAKE_INSTALL_LIBDIR}
20+
LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR}
21+
RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR}
22+
)
