Commit 4a3c5c9

Merge pull request #737 from microlinkhq/profiling
feat: add profiling support
2 parents 522b1bc + e88d2b1 commit 4a3c5c9

88 files changed (+428 -161 lines changed)
CHANGELOG.md

Lines changed: 10 additions & 0 deletions
@@ -3,6 +3,16 @@
 All notable changes to this project will be documented in this file.
 See [Conventional Commits](https://conventionalcommits.org) for commit guidelines.

+# [5.46.0-beta.0](https://github.com/microlinkhq/metascraper/compare/v5.45.29...v5.46.0-beta.0) (2025-01-10)
+
+### Bug Fixes
+
+* load dependency ([6344788](https://github.com/microlinkhq/metascraper/commit/6344788ddbfc27a03f3ce12b2a842cd438574cc5))
+
+### Features
+
+* add profiling support ([9370e3c](https://github.com/microlinkhq/metascraper/commit/9370e3cdde056e86dcc2d189b3b22dd01a310372))
+
 ## [5.45.29](https://github.com/microlinkhq/metascraper/compare/v5.45.28...v5.45.29) (2025-01-07)

 ### Bug Fixes

CONTRIBUTING.md

Lines changed: 26 additions & 8 deletions
@@ -42,7 +42,7 @@ A set of rules under the same namespace runs in series and only the value return
 You can associate a `test` function with your rule bundle:

 ```js
-rules.test = ({ url }) => getVideoInfo(url).service === 'youtube'))
+rules.test = ({ url }) => getVideoInfo(url).service === 'youtube'
 ```

 The `test` function will receive the same arguments as a rule. This is useful for skipping all rules that doesn't target a specific URL.
@@ -52,12 +52,31 @@ A good practice is to use a memoize function to prevent unnecessary CPU cycles f
 ```js
 const { memoizeOne } = require('@metascraper/helpers')

-const test = memoizeOne(url => getVideoInfo(url).service === 'youtube'))
+const test = memoizeOne(url => getVideoInfo(url).service === 'youtube')

 const rules = []
-rules.test ({ url }) => test(url)
+rules.test = ({ url }) => test(url)
 ```

+### Defining `pkgName` property
+
+Additionally you can define `pkgName` property associated with your rules:
+
+```js
+const { memoizeOne } = require('@metascraper/helpers')
+
+const rules = []
+rules.pkgName = 'metascraper-module'
+```
+
+This is using for printing debug logs, see debugging section to know how to use it.
+
+## Debugging your Rules
+
+In case you need to see what's happening under the hood, you can set `DEBUG='metascraper*'.
+
+This is useful for verifying rule precedence and detecting slow rules.
+
 ## Testing your Rules

 Since the order of the rules is important, testing it is also an important thing in order to be sure more popular rules are executed first over less popular rules.
@@ -74,7 +93,6 @@ const metascraper = require('metascraper')([
   require('metascraper-logo')()
 ])

-
 describe('metascraper-logo', () => {
   it('creates an absolute favicon url if the logo is not present', async () => {
     const html = `
@@ -92,8 +110,8 @@ describe('metascraper-logo', () => {
     </body>
     </html>
     `
-    const meta = await metascraper({ html, url }))
-    should(meta.log).be.equal("open graph value")
+    const meta = await metascraper({ html, url })
+    should(meta.log).be.equal('open graph value')
   })
 })
 ```
@@ -129,8 +147,8 @@ const metascraper = require('metascraper')([
 describe('metascraper-logo', () => {
   it('it resolves logo value', async () => {
     const html = fs.readFileSync('index.html', 'utf-8')
-    const meta = await metascraper({ html, url }))
-    should(meta.logo).be.equal("https://metascraper.js.org/static/logo.png")
+    const meta = await metascraper({ html, url })
+    should(meta.logo).be.equal('https://metascraper.js.org/static/logo.png')
   })
 })
 ```
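
The `test` and `pkgName` guidance added to CONTRIBUTING.md in this diff can be combined into a single rule bundle. The following is a minimal sketch, not code from this commit: the bundle name `metascraper-example`, the `title` rule, and the hostname check are hypothetical, and `memoizeLast` is a small stand-in for `memoizeOne` from `@metascraper/helpers` (assumed here to cache the result for the most recent argument) so that the snippet runs without dependencies.

```javascript
// Stand-in for `memoizeOne` from '@metascraper/helpers' (assumption: it caches
// the result for the most recent argument). Inlined to keep the sketch dependency-free.
const memoizeLast = fn => {
  let called = false
  let lastArg
  let lastResult
  return arg => {
    if (!called || arg !== lastArg) {
      lastResult = fn(arg)
      lastArg = arg
      called = true
    }
    return lastResult
  }
}

// Hypothetical check: this bundle only targets example.com URLs.
const isExample = memoizeLast(url => new URL(url).hostname === 'example.com')

const createRules = () => {
  const rules = {
    // Hypothetical rule: read the Open Graph title from the parsed document.
    title: [({ htmlDom: $ }) => $('meta[property="og:title"]').attr('content')]
  }

  // Skip the whole bundle when the URL is not a target.
  rules.test = ({ url }) => isExample(url)

  // Label for debug output, following the pattern this commit applies to each package.
  rules.pkgName = 'metascraper-example'

  return rules
}
```

According to the guide text in the diff, running a script that loads such a bundle with `DEBUG='metascraper*'` set in the environment should then print debug logs that use this label.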

lerna.json

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
   "packages": [
     "packages/*"
   ],
-  "version": "5.45.29",
+  "version": "5.46.0-beta.2",
   "command": {
     "bootstrap": {
       "npmClientArgs": [

packages/metascraper-amazon/CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -3,6 +3,10 @@
 All notable changes to this project will be documented in this file.
 See [Conventional Commits](https://conventionalcommits.org) for commit guidelines.

+# [5.46.0-beta.0](https://github.com/microlinkhq/metascraper/compare/v5.45.29...v5.46.0-beta.0) (2025-01-10)
+
+**Note:** Version bump only for package metascraper-amazon
+
 ## [5.45.28](https://github.com/microlinkhq/metascraper/compare/v5.45.27...v5.45.28) (2025-01-01)

 **Note:** Version bump only for package metascraper-amazon

packages/metascraper-amazon/package.json

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
   "name": "metascraper-amazon",
   "description": "Metascraper integration with Amazon",
   "homepage": "https://github.com/microlinkhq/metascraper/packages/metascraper-amazon",
-  "version": "5.45.28",
+  "version": "5.46.0-beta.2",
   "types": "src/index.d.ts",
   "main": "src/index.js",
   "author": {

packages/metascraper-amazon/src/index.js

Lines changed: 2 additions & 0 deletions
@@ -60,5 +60,7 @@ module.exports = () => {

   rules.test = ({ url }) => test(url)

+  rules.pkgName = 'metascraper-amazon'
+
   return rules
 }
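
The commit title mentions profiling, and the contributing guide in this diff says the debug output helps detect slow rules. As a rough illustration of why a `pkgName` label is useful for that, the sketch below times each rule of a bundle and tags every measurement with the label. This is not metascraper's implementation: `profileRules`, the shape of the timing records, and the sample bundle are hypothetical, and only the `pkgName` and `test` properties come from the diff. It also assumes each rule property holds an array of rule functions.

```javascript
// Hypothetical helper: run every rule in a bundle and record how long each took.
const profileRules = async (rules, args) => {
  const timings = []
  for (const [propName, propRules] of Object.entries(rules)) {
    // `test` and `pkgName` are metadata attached to the bundle, not rule lists.
    if (!Array.isArray(propRules)) continue
    for (const [index, rule] of propRules.entries()) {
      const start = performance.now()
      const value = await rule(args)
      timings.push({
        pkgName: rules.pkgName || 'unknown',
        propName,
        index,
        value,
        duration: performance.now() - start
      })
    }
  }
  return timings
}

// Usage with a hypothetical bundle shaped like the ones in this commit.
const sampleRules = { title: [() => 'hello', async () => 'world'] }
sampleRules.test = () => true
sampleRules.pkgName = 'metascraper-example'

profileRules(sampleRules, {}).then(timings => {
  for (const { pkgName, propName, index, duration } of timings) {
    console.log(`${pkgName} ${propName}[${index}] ${duration.toFixed(2)}ms`)
  }
})
```

Without the label, a slow `title` rule could belong to any loaded package; with it, each measurement points back to the package that defined the rule.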

packages/metascraper-audio/CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -3,6 +3,10 @@
 All notable changes to this project will be documented in this file.
 See [Conventional Commits](https://conventionalcommits.org) for commit guidelines.

+# [5.46.0-beta.0](https://github.com/microlinkhq/metascraper/compare/v5.45.29...v5.46.0-beta.0) (2025-01-10)
+
+**Note:** Version bump only for package metascraper-audio
+
 ## [5.45.28](https://github.com/microlinkhq/metascraper/compare/v5.45.27...v5.45.28) (2025-01-01)

 **Note:** Version bump only for package metascraper-audio

packages/metascraper-audio/package.json

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
   "name": "metascraper-audio",
   "description": "Get audio property from HTML markup",
   "homepage": "https://github.com/microlinkhq/metascraper/packages/metascraper-audio",
-  "version": "5.45.28",
+  "version": "5.46.0-beta.2",
   "types": "src/index.d.ts",
   "main": "src/index.js",
   "author": {

packages/metascraper-audio/src/index.js

Lines changed: 5 additions & 1 deletion
@@ -78,7 +78,7 @@ const _getIframe = (url, $, { src }) =>
   loadIframe(url, $.load(`<iframe src="${src}"></iframe>`))

 module.exports = ({ getIframe = _getIframe } = {}) => {
-  return {
+  const rules = {
     audio: audioRules.concat(
       async ({ htmlDom: $, url }) => {
         const srcs = [
@@ -110,4 +110,8 @@ module.exports = ({ getIframe = _getIframe } = {}) => {
       }
     )
   }
+
+  rules.pkgName = 'metascraper-audio'
+
+  return rules
 }

packages/metascraper-author/CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -3,6 +3,10 @@
 All notable changes to this project will be documented in this file.
 See [Conventional Commits](https://conventionalcommits.org) for commit guidelines.

+# [5.46.0-beta.0](https://github.com/microlinkhq/metascraper/packages/metascraper-author/compare/v5.45.29...v5.46.0-beta.0) (2025-01-10)
+
+**Note:** Version bump only for package metascraper-author
+
 ## [5.45.28](https://github.com/microlinkhq/metascraper/packages/metascraper-author/compare/v5.45.27...v5.45.28) (2025-01-01)

 **Note:** Version bump only for package metascraper-author

packages/metascraper-author/package.json

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
   "name": "metascraper-author",
   "description": "Get author property from HTML markup",
   "homepage": "https://metascraper.js.org",
-  "version": "5.45.28",
+  "version": "5.46.0-beta.2",
   "types": "src/index.d.ts",
   "main": "src/index.js",
   "author": {

packages/metascraper-author/src/index.js

Lines changed: 29 additions & 23 deletions
@@ -24,27 +24,33 @@ const strict = rule => $ => {
   return REGEX_STRICT.test(value) && value
 }

-module.exports = () => ({
-  author: [
-    toAuthor($jsonld('author.name')),
-    toAuthor($jsonld('brand.name')),
-    toAuthor($ => $('meta[name="author"]').attr('content')),
-    toAuthor($ => $('meta[property="article:author"]').attr('content')),
-    toAuthor($ => $filter($, $('[itemprop*="author" i] [itemprop="name"]'))),
-    toAuthor($ => $filter($, $('[itemprop*="author" i]'))),
-    toAuthor($ => $filter($, $('[rel="author"]'))),
-    strict(toAuthor($ => $filter($, $('a[class*="author" i]')))),
-    strict(toAuthor($ => $filter($, $('[class*="author" i] a')))),
-    strict(toAuthor($ => $filter($, $('a[href*="/author/" i]')))),
-    toAuthor($ => $filter($, $('a[class*="screenname" i]'))),
-    strict(toAuthor($ => $filter($, $('[class*="author" i]')))),
-    strict(
-      toAuthor($ =>
-        $filter($, $('[class*="byline" i]'), el => {
-          const value = $filter.fn(el)
-          return !date(value) && value
-        })
+module.exports = () => {
+  const rules = {
+    author: [
+      toAuthor($jsonld('author.name')),
+      toAuthor($jsonld('brand.name')),
+      toAuthor($ => $('meta[name="author"]').attr('content')),
+      toAuthor($ => $('meta[property="article:author"]').attr('content')),
+      toAuthor($ => $filter($, $('[itemprop*="author" i] [itemprop="name"]'))),
+      toAuthor($ => $filter($, $('[itemprop*="author" i]'))),
+      toAuthor($ => $filter($, $('[rel="author"]'))),
+      strict(toAuthor($ => $filter($, $('a[class*="author" i]')))),
+      strict(toAuthor($ => $filter($, $('[class*="author" i] a')))),
+      strict(toAuthor($ => $filter($, $('a[href*="/author/" i]')))),
+      toAuthor($ => $filter($, $('a[class*="screenname" i]'))),
+      strict(toAuthor($ => $filter($, $('[class*="author" i]')))),
+      strict(
+        toAuthor($ =>
+          $filter($, $('[class*="byline" i]'), el => {
+            const value = $filter.fn(el)
+            return !date(value) && value
+          })
+        )
       )
-    )
-  ]
-})
+    ]
+  }
+
+  rules.pkgName = 'metascraper-author'
+
+  return rules
+}

packages/metascraper-clearbit/CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -3,6 +3,10 @@
 All notable changes to this project will be documented in this file.
 See [Conventional Commits](https://conventionalcommits.org) for commit guidelines.

+# [5.46.0-beta.0](https://github.com/microlinkhq/metascraper/compare/v5.45.29...v5.46.0-beta.0) (2025-01-10)
+
+**Note:** Version bump only for package metascraper-clearbit
+
 ## [5.45.28](https://github.com/microlinkhq/metascraper/compare/v5.45.27...v5.45.28) (2025-01-01)

 **Note:** Version bump only for package metascraper-clearbit

packages/metascraper-clearbit/package.json

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
   "name": "metascraper-clearbit",
   "description": "Metascraper integration with Clearbit Logo API",
   "homepage": "https://github.com/microlinkhq/metascraper/packages/metascraper-clearbit",
-  "version": "5.45.28",
+  "version": "5.46.0-beta.2",
   "types": "src/index.d.ts",
   "main": "src/index.js",
   "author": {

packages/metascraper-clearbit/src/index.js

Lines changed: 5 additions & 1 deletion
@@ -43,8 +43,12 @@ module.exports = opts => {
   const clearbit = createClearbit(opts)
   const getClearbit = composeRule(($, url) => clearbit(parseUrl(url).domain))

-  return {
+  const rules = {
     logo: getClearbit({ from: 'logo' }),
     publisher: getClearbit({ from: 'name', to: 'publisher' })
   }
+
+  rules.pkgName = 'metascraper-clearbit'
+
+  return rules
 }

packages/metascraper-date/CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -3,6 +3,10 @@
 All notable changes to this project will be documented in this file.
 See [Conventional Commits](https://conventionalcommits.org) for commit guidelines.

+# [5.46.0-beta.0](https://github.com/microlinkhq/metascraper/compare/v5.45.29...v5.46.0-beta.0) (2025-01-10)
+
+**Note:** Version bump only for package metascraper-date
+
 ## [5.45.28](https://github.com/microlinkhq/metascraper/compare/v5.45.27...v5.45.28) (2025-01-01)

 **Note:** Version bump only for package metascraper-date

packages/metascraper-date/package.json

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
   "name": "metascraper-date",
   "description": "Get date property from HTML markup",
   "homepage": "https://github.com/microlinkhq/metascraper/packages/metascraper-date",
-  "version": "5.45.28",
+  "version": "5.46.0-beta.2",
   "types": "src/index.d.ts",
   "main": "src/index.js",
   "author": {

packages/metascraper-date/src/index.js

Lines changed: 6 additions & 4 deletions
@@ -43,17 +43,19 @@ module.exports = (
     dateModified: false
   }
 ) => {
-  const result = {
+  const rules = {
     date: dateModifiedRules().concat(datePublishedRules(), dateRules())
   }

   if (datePublished) {
-    result.datePublished = datePublishedRules()
+    rules.datePublished = datePublishedRules()
   }

   if (dateModified) {
-    result.dateModified = dateModifiedRules()
+    rules.dateModified = dateModifiedRules()
   }

-  return result
+  rules.pkgName = 'metascraper-date'
+
+  return rules
 }

packages/metascraper-description/CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -3,6 +3,10 @@
 All notable changes to this project will be documented in this file.
 See [Conventional Commits](https://conventionalcommits.org) for commit guidelines.

+# [5.46.0-beta.0](https://github.com/microlinkhq/metascraper/compare/v5.45.29...v5.46.0-beta.0) (2025-01-10)
+
+**Note:** Version bump only for package metascraper-description
+
 ## [5.45.28](https://github.com/microlinkhq/metascraper/compare/v5.45.27...v5.45.28) (2025-01-01)

 **Note:** Version bump only for package metascraper-description

packages/metascraper-description/package.json

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
   "name": "metascraper-description",
   "description": "Get description property from HTML markup",
   "homepage": "https://github.com/microlinkhq/metascraper/packages/metascraper-description",
-  "version": "5.45.28",
+  "version": "5.46.0-beta.2",
   "types": "src/index.d.ts",
   "main": "src/index.js",
   "author": {

packages/metascraper-description/src/index.js

Lines changed: 5 additions & 1 deletion
@@ -5,7 +5,7 @@ const { $jsonld, toRule, description } = require('@metascraper/helpers')
 module.exports = opts => {
   const toDescription = toRule(description, opts)

-  return {
+  const rules = {
     description: [
       toDescription($ => $('meta[property="og:description"]').attr('content')),
       toDescription($ => $('meta[name="twitter:description"]').attr('content')),
@@ -18,4 +18,8 @@ module.exports = opts => {
       toDescription($jsonld('description'))
     ]
   }
+
+  rules.pkgName = 'metascraper-description'
+
+  return rules
 }

packages/metascraper-feed/CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -3,6 +3,10 @@
 All notable changes to this project will be documented in this file.
 See [Conventional Commits](https://conventionalcommits.org) for commit guidelines.

+# [5.46.0-beta.0](https://github.com/microlinkhq/metascraper/compare/v5.45.29...v5.46.0-beta.0) (2025-01-10)
+
+**Note:** Version bump only for package metascraper-feed
+
 ## [5.45.28](https://github.com/microlinkhq/metascraper/compare/v5.45.27...v5.45.28) (2025-01-01)

 **Note:** Version bump only for package metascraper-feed

packages/metascraper-feed/package.json

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
   "name": "metascraper-feed",
   "description": "Get RSS/Atom feed URL from HTML markup",
   "homepage": "https://github.com/microlinkhq/metascraper/packages/metascraper-description",
-  "version": "5.45.28",
+  "version": "5.46.0-beta.2",
   "types": "src/index.d.ts",
   "main": "src/index.js",
   "author": {

packages/metascraper-feed/src/index.js

Lines changed: 5 additions & 1 deletion
@@ -5,11 +5,15 @@ const { toRule, url } = require('@metascraper/helpers')
 const toUrl = toRule(url)

 module.exports = () => {
-  return {
+  const rules = {
     feed: [
       toUrl($ => $('link[type="application/rss+xml"]').attr('href')),
       toUrl($ => $('link[type="application/feed+json"]').attr('href')),
       toUrl($ => $('link[type="application/atom+xml"]').attr('href'))
     ]
   }
+
+  rules.pkgName = 'metascraper-feed'
+
+  return rules
 }
