Skip to content

Commit 2eebe61

Browse files
committed
Attempt at checking for license (rust-lang#209)
I'm not quite sure how best to handle loading the license template from a path -- I mean obviously I know *how* to do it, but I'm not sure where to fit it in the codebase :) So this first attempt puts the license template directly into the config file. These are my misgivings about the license template config option as a path to a file (I'd love feedback if some of these are wrong or can be easily circumvented!): 1. I thought the obvious choice for the type of `license_template` in `create_config!` should be `PathBuf`, but `PathBuf` doesn't implement `FromStr` (yet? see rust-lang/rust#44431), so it would have to be wrapped in a tuple struct, and I went down that road for a little while but then it seemed like too much ceremony for too little gain. 2. So a plain `String` then (which, mind you, also means the same `doc_hint()`, i.e. `<string>`, not `<path>` or something like that). The fact that it's a valid path will be checked once we try to read the file. 3. But where in the code should the license template be read? The obvious choice for me would be somewhere in `Config::from_toml()`, but since `Config` is defined via the `create_config!` macro, that would mean tight coupling between the macro invocation (which defines the configuration option `license_template`) and its definition (which would rely on the existence of that option to run the template loading code). 4. `license_template` could also be made a special option which is hardwired into the macro. This gets rid of the tight coupling, but special-casing one of the config options would make the code harder to navigate. 5. Instead, the macro could maybe be rewritten to allow for config options that load additional resources from files when the config is being parsed, but that's beyond my skill level I'm afraid (and probably overengineering the problem if it's only ever going to be used for this one option). 6. Finally, the file can be loaded at some later point in time, e.g. in `format_lines()`, right before `check_license()` is called. But to face a potential *IO* error at so late a stage, when the source files have already been parsed... I don't know, it doesn't feel right. BTW I don't like that I'm actually parsing the license template as late as inside `check_license()` either, but for much the same reasons, I don't know where else to put it. If the `Config` were hand-rolled instead of a macro, I'd just define a custom `license_template` option and load and parse the template in the `Config`'s init. But the way things are, I'm a bit at a loss. However, if someone more familiar with the project would kindly provide a few hints as to how the path approach can be done in a way that is as clean as possible in the context of the codebase, I'll be more than happy to implement it! :)
1 parent 5025a53 commit 2eebe61

File tree

2 files changed

+106
-4
lines changed

2 files changed

+106
-4
lines changed

src/config/mod.rs

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -50,6 +50,7 @@ create_config! {
5050
comment_width: usize, 80, false,
5151
"Maximum length of comments. No effect unless wrap_comments = true";
5252
normalize_comments: bool, false, true, "Convert /* */ comments to // comments where possible";
53+
license_template: String, String::default(), false, "Check for license";
5354

5455
// Single line expressions and items.
5556
empty_item_single_line: bool, true, false,

src/lib.rs

Lines changed: 105 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -43,6 +43,7 @@ use syntax::ast;
4343
use syntax::codemap::{CodeMap, FilePathMapping};
4444
pub use syntax::codemap::FileName;
4545
use syntax::parse::{self, ParseSess};
46+
use regex::{Regex, RegexBuilder};
4647

4748
use checkstyle::{output_footer, output_header};
4849
use comment::{CharClasses, FullCodeCharKind};
@@ -99,6 +100,10 @@ pub enum ErrorKind {
99100
TrailingWhitespace,
100101
// TO-DO or FIX-ME item without an issue number
101102
BadIssue(Issue),
103+
// License check has failed
104+
LicenseCheck,
105+
// License template could not be parsed
106+
ParsingLicense,
102107
}
103108

104109
impl fmt::Display for ErrorKind {
@@ -111,6 +116,8 @@ impl fmt::Display for ErrorKind {
111116
),
112117
ErrorKind::TrailingWhitespace => write!(fmt, "left behind trailing whitespace"),
113118
ErrorKind::BadIssue(issue) => write!(fmt, "found {}", issue),
119+
ErrorKind::LicenseCheck => write!(fmt, "license check failed"),
120+
ErrorKind::ParsingLicense => write!(fmt, "parsing regex in license template failed"),
114121
}
115122
}
116123
}
@@ -127,7 +134,10 @@ pub struct FormattingError {
127134
impl FormattingError {
128135
fn msg_prefix(&self) -> &str {
129136
match self.kind {
130-
ErrorKind::LineOverflow(..) | ErrorKind::TrailingWhitespace => "error:",
137+
ErrorKind::LineOverflow(..)
138+
| ErrorKind::TrailingWhitespace
139+
| ErrorKind::LicenseCheck
140+
| ErrorKind::ParsingLicense => "error:",
131141
ErrorKind::BadIssue(_) => "WARNING:",
132142
}
133143
}
@@ -405,8 +415,39 @@ fn should_report_error(
405415
}
406416
}
407417

418+
fn check_license(text: &str, license_template: &str) -> Result<bool, regex::Error> {
419+
let mut template_re = String::from("^");
420+
// the template is parsed as a series of pairs of capture groups of (1) lazy whatever, which
421+
// will be matched literally, followed by (2) a {}-delimited block, which will be matched as a
422+
// regex
423+
let template_parser = RegexBuilder::new(r"(.*?)\{(.*?)\}")
424+
.dot_matches_new_line(true)
425+
.build()
426+
.unwrap();
427+
// keep track of the last matched offset and ultimately append the tail of the template (if any)
428+
// after the last {} block
429+
let mut last_matched_offset = 0;
430+
for caps in template_parser.captures_iter(license_template) {
431+
if let Some(mat) = caps.get(0) {
432+
last_matched_offset = mat.end()
433+
}
434+
if let Some(mat) = caps.get(1) {
435+
template_re.push_str(&regex::escape(mat.as_str()))
436+
}
437+
if let Some(mat) = caps.get(2) {
438+
let mut re = mat.as_str();
439+
if re.is_empty() {
440+
re = ".*?";
441+
}
442+
template_re.push_str(re)
443+
}
444+
}
445+
template_re.push_str(&regex::escape(&license_template[last_matched_offset..]));
446+
let template_re = Regex::new(&template_re)?;
447+
Ok(template_re.is_match(text))
448+
}
449+
408450
// Formatting done on a char by char or line by line basis.
409-
// FIXME(#209) warn on bad license
410451
// FIXME(#20) other stuff for parity with make tidy
411452
fn format_lines(
412453
text: &mut String,
@@ -415,7 +456,6 @@ fn format_lines(
415456
config: &Config,
416457
report: &mut FormatReport,
417458
) {
418-
// Iterate over the chars in the file map.
419459
let mut trims = vec![];
420460
let mut last_wspace: Option<usize> = None;
421461
let mut line_len = 0;
@@ -428,6 +468,33 @@ fn format_lines(
428468
let mut format_line = config.file_lines().contains_line(name, cur_line);
429469
let allow_issue_seek = !issue_seeker.is_disabled();
430470

471+
// Check license.
472+
if config.was_set().license_template() {
473+
match check_license(text, &config.license_template()) {
474+
Ok(check) => {
475+
if !check {
476+
errors.push(FormattingError {
477+
line: cur_line,
478+
kind: ErrorKind::LicenseCheck,
479+
is_comment: false,
480+
is_string: false,
481+
line_buffer: String::new(),
482+
});
483+
}
484+
}
485+
Err(_) => {
486+
errors.push(FormattingError {
487+
line: cur_line,
488+
kind: ErrorKind::ParsingLicense,
489+
is_comment: false,
490+
is_string: false,
491+
line_buffer: String::new(),
492+
});
493+
}
494+
}
495+
}
496+
497+
// Iterate over the chars in the file map.
431498
for (kind, (b, c)) in CharClasses::new(text.chars().enumerate()) {
432499
if c == '\r' {
433500
continue;
@@ -853,7 +920,7 @@ pub fn run(input: Input, config: &Config) -> Summary {
853920

854921
#[cfg(test)]
855922
mod test {
856-
use super::{format_code_block, format_snippet, Config};
923+
use super::{check_license, format_code_block, format_snippet, Config};
857924

858925
#[test]
859926
fn test_no_panic_on_format_snippet_and_format_code_block() {
@@ -939,4 +1006,38 @@ false,
9391006
};";
9401007
assert!(test_format_inner(format_code_block, code_block, expected));
9411008
}
1009+
1010+
#[test]
1011+
fn test_check_license() {
1012+
assert!(check_license("literal matching", "literal matching").unwrap());
1013+
assert!(!check_license("literal no match", "literal matching").unwrap());
1014+
assert!(
1015+
check_license(
1016+
"Regex start and end: 2018",
1017+
r"{[Rr]egex} start {} end: {\d+}"
1018+
).unwrap()
1019+
);
1020+
assert!(!check_license(
1021+
"Regex start and end no match: 2018",
1022+
r"{[Rr]egex} start {} end: {\d+}"
1023+
).unwrap());
1024+
assert!(
1025+
check_license(
1026+
"Regex in the middle: 2018 (tm)",
1027+
r"Regex {} middle: {\d+} (tm)"
1028+
).unwrap()
1029+
);
1030+
assert!(!check_license(
1031+
"Regex in the middle no match: 2018 (tm)",
1032+
r"Regex {} middle: {\d+} (tm)"
1033+
).unwrap());
1034+
assert!(!check_license("default doesn't match\nacross lines", "default {} lines").unwrap());
1035+
assert!(check_license("", "this is not a valid {[regex}").is_err());
1036+
assert!(
1037+
check_license(
1038+
"can't parse nested delimiters with regex",
1039+
r"can't parse nested delimiters with regex{\.{3}}"
1040+
).is_err()
1041+
);
1042+
}
9421043
}

0 commit comments

Comments
 (0)