Skip to content
This repository has been archived by the owner on Feb 13, 2025. It is now read-only.

MWPW-147877 Validator Improvements #42

Merged
merged 5 commits into from
Jun 3, 2024

Conversation

Brandon32
Copy link
Collaborator

@Brandon32 Brandon32 commented May 29, 2024

  • Creates two reports, standard report with anomalies, and deep report will observations and anomalies.
    • Checking observations for ASCII Corruption, Multiple Hashtags, Empty Links, and Whitespace Anomalies.
  • Outputs a list of all page links for pages that do not match 1:1 on deep report.
  • Updated readme's for downloader and validator.
  • Replace XLSX with ExcelJS, fix missing data in report, will need to add await when saving.

Resolves: MWPW-147877

Example Report:

@Brandon32 Brandon32 force-pushed the bmarshal/validator-improvements branch 3 times, most recently from 9562db1 to 2861d37 Compare May 29, 2024 18:44
Copy link
Collaborator

@JasonHowellSlavin JasonHowellSlavin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice stuff. A lot of very good work here. Wanted to mention a few things about the files in general:

  • Seems like this could have been broken down into smaller PRs,
    -- Replacing xlsx with ExcelJS
    -- Download Markdown Changes
    -- Validator v2
  • Doc blocks would be nice for link-validator.js

Also, wondering about the comparison variable throughout. I understand it serves a few purposes as it/they a) can be passed in as an arg with the node CLI call, b) serves as part of the path of the MD directory, and c) get relayed into log communication within template strings.

That being said, the most common use case I've seen is the ['source', 'updated'] form. Though it removes some flexibility from the input dirs that this code can use, it might be worth it for clarity are readability to remove the option to add the comparison strings, and instead reflect the expected MD pattern in the README? You'd likely save a lot of param passing and add a bit of readability to the places where comparison is used for logging, i.e ${comparison[0]} Text. Food for thought.

Good stuff again.

@@ -22,142 +18,160 @@ describe('ExcelReporter', () => {
});
});

describe('Check sheets js library is called', () => {
describe('Check reporter and excel js library is called', () => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may want to reword this describe block? The it statements are seem to be testing the functionality of the reporter. Maybe something like "check instantiation and methods" or something along those lines?

reporter.log('topic', 'status', 'message', 'arg1', 'arg2');

expect(xlsx.utils.book_append_sheet.calledTwice).to.be.true;
expect(reporter.log.calledOnce).to.be.true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you may be able to remove this expect as you call reporter.log above on line 44 so we can verify visually, unless you are afraid reporter.log would be called more than once?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds reasonable, we shouldn't be concerned with log being called.

expect(xlsx.utils.sheet_add_aoa.calledOnce).to.be.true;
});
});
expect(reporter.log.callCount).to.equal(5);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above, we see you call log 5 times, maybe can remove.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.

const strTwo = 'sitting';
const similarity = lSimilarity(strOne, strTwo);

expect(similarity).to.equal(0.5714285714285714);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrmm, may want to cap out this float for readability. parseFloat(similarity).toFixed(4), but within the method of course.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, floating point numbers can be tricky and can potentially change depending on what machine runs the calculation, at least with other languages. toFixed creates a string, '0.5714', and I'll compare against that.


expect(observations.url).to.contain({
[SIMILARITY_PERCENTAGE]: '84%',
[SIMILARITY]: 'High Similarity',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe good to have [PATHHNAME_MATCH]: false here, as that is the main difference?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, that surfaced an issue with observeUrl so thank you!

Comment on lines 216 to 217
const comparison = ['source', 'updated'];
const entry = '/entry';
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe save these consts higher in scope so they can be reused in multiple tests?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, thanks

Comment on lines 114 to 123
/**
* Observes the changes between two text values.
*
* @param {string} oldText - The old text value.
* @param {string} newText - The new text value.
* @returns {Object} - The observed changes between the old and new text values.
*/
export function observeText(oldText = '', newText = '') {
return observe(oldText, newText);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this function not be omitted in favor of composition in place? i.e const observeText = observe(oldText, newText). Adds some length to the code, creates an additional scope.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was expecting to have different observations for URLs and for Texts but all the text ones are shared in observe.
We can omit this function and bring it back if/when that becomes true.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think it makes more sense to rename observe to observeText and have observeUrl call observeText instead of the generic observe function.

export function observeLinks(oldUrl, newUrl, oldText, newText) {
const observations = { oldUrl, newUrl, oldText, newText };

observations.text = observeText(oldText, newText);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
observations.text = observeText(oldText, newText);
observations.text = observe(oldText, newText);

@Brandon32 Brandon32 force-pushed the bmarshal/validator-improvements branch from a8bf4ba to f9cdc27 Compare June 3, 2024 17:44
@Brandon32 Brandon32 force-pushed the bmarshal/validator-improvements branch from f9cdc27 to 438cb3e Compare June 3, 2024 22:23
@Brandon32 Brandon32 force-pushed the bmarshal/validator-improvements branch from 438cb3e to 71471d9 Compare June 3, 2024 23:41
@Brandon32 Brandon32 merged commit 530a8b6 into main Jun 3, 2024
4 checks passed
@Brandon32 Brandon32 deleted the bmarshal/validator-improvements branch June 3, 2024 23:42
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants