lru0612/BlogCollector
🚀 BlogCollector

BlogCollector is a simple AI/tech blog aggregator that supports both RSS feeds and web scraping, well suited to personal knowledge tracking and information monitoring.


📑 Table of Contents

  • Features
  • Quick Start
  • Project Structure
  • Deploy to the Web (Render + GitHub Pages)
  • Add / Modify Data Sources
  • FAQ
  • Contributing & License

Features

  • Multiple sources: RSS / Atom feeds & any webpage via custom CSS selectors
  • Categories & filtering: Organization / Individual tags, one-click filtering plus search

Quick Start

1. Clone the repo

$ git clone https://github.com/<yourname>/BlogCollector.git
$ cd BlogCollector

2. Start the backend

$ cd backend
$ npm install
$ npm start     # default PORT=3000, can be overridden
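
How the overridable port presumably works inside `backend/server.js` can be sketched like this (the helper name `resolvePort` is illustrative, not taken from the repo; check the actual file):

```javascript
// Resolve the listen port: the PORT env var wins, otherwise fall back
// to 3000 (sketch; the real server.js may read it differently).
function resolvePort(env) {
  const n = Number(env.PORT);
  return Number.isInteger(n) && n > 0 ? n : 3000;
}

console.log(resolvePort(process.env)); // 3000 unless PORT is set in your shell
```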

3. Preview the frontend locally

$ cd docs                   # static assets now live in docs/
$ npx serve -l 8080 .       # or: python3 -m http.server 8080

Then visit http://localhost:8080.

💡 VS Code users can also install Live Server and choose Open with Live Server.
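
When previewing locally against a locally running backend, point the frontend at it. A sketch of the relevant line in docs/script.js (assumption: the constant name matches the one used in the deploy instructions later in this README; adjust if your copy differs):

```javascript
// docs/script.js — point the frontend at the local backend started in
// step 2 (sketch; revert to your deployed API URL before publishing).
const API_BASE_URL = 'http://localhost:3000/api';
```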


Project Structure

BlogCollector/
├─ backend/            # Node.js / Express backend
│  ├─ server.js
│  └─ ...
├─ docs/               # static frontend (published via GitHub Pages)
│  ├─ index.html
│  ├─ script.js
│  └─ style.css
└─ README.md

Deploy to the Web (Render + GitHub Pages)

| Part | Platform | Steps |
| --- | --- | --- |
| Backend | Render | Connect the repo → New Web Service → root dir `backend` → Build `npm install` / Start `npm start` → get `https://<app>.onrender.com` |
| Frontend | GitHub Pages | Settings → Pages → Source `main` / Folder `/docs` → Save → access `https://<user>.github.io/<repo>/` |

Update docs/script.js:

const API_BASE_URL = 'https://<app>.onrender.com/api';

Now anyone can open the GitHub Pages URL and the site will call your Render API.
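
Under the hood, the frontend presumably builds request URLs from `API_BASE_URL`. A hedged sketch of that step (the endpoint name `articles` and the helper `apiUrl` are illustrative guesses, not copied from the repo — check docs/script.js for the real paths):

```javascript
const API_BASE_URL = 'https://<app>.onrender.com/api';

// Join the base URL and an endpoint path without doubling slashes.
function apiUrl(base, path) {
  return `${base.replace(/\/+$/, '')}/${path.replace(/^\/+/, '')}`;
}

// Sketch of a typical fetch against the deployed backend.
async function fetchArticles() {
  const res = await fetch(apiUrl(API_BASE_URL, 'articles'));
  if (!res.ok) throw new Error(`API error: ${res.status}`);
  return res.json(); // expected: an array of article objects
}
```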


Add / Modify Data Sources

1. RSS sources

Append entries to the rssSources array in backend/server.js:

const rssSources = [
  { name: 'OpenAI', url: 'https://openai.com/blog/rss.xml', category: 'organization' },
  // new source
  { name: 'Example Blog', url: 'https://example.com/rss.xml', category: 'individual' },
];
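
Each entry needs a `name`, a feed `url`, and a `category`. A small sanity-check you could run before restarting the backend (sketch; `validateSources` is a hypothetical helper, not part of BlogCollector):

```javascript
// Validate rssSources entries: returns a list of problems, empty if
// everything looks sane (sketch for illustration only).
function validateSources(sources) {
  const problems = [];
  for (const s of sources) {
    if (!s.name) problems.push(`missing name: ${JSON.stringify(s)}`);
    try {
      const u = new URL(s.url);
      if (!/^https?:$/.test(u.protocol)) problems.push(`${s.name}: non-HTTP url`);
    } catch {
      problems.push(`${s.name}: invalid url "${s.url}"`);
    }
    if (!['organization', 'individual'].includes(s.category)) {
      problems.push(`${s.name}: unknown category "${s.category}"`);
    }
  }
  return problems;
}
```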

2. Scraping targets

Append entries to the scrapingTargets array in server.js:

const scrapingTargets = [
  {
    name: 'Lilian Weng',
    url: 'https://lilianweng.github.io/',
    category: 'individual',
    selectors: {
      articleContainer: 'article.post-entry',
      title: '.entry-header h2',
      link: 'a.entry-link',
      description: 'section.entry-content p',
      time: 'footer.entry-footer',
    },
  },
  // new source example
  {
    name: 'Karpathy',
    url: 'https://karpathy.bearblog.dev/blog/',
    category: 'individual',
    selectors: {
      articleContainer: 'ul.blog-posts li',
      title: 'a',
      link: 'a',
      description: '',      // this site has no summary
      time: 'time',
    },
  },
];
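
The `selectors` map tells the scraper where to find each field inside `articleContainer`. Scraped links are often relative (e.g. `/posts/foo/`), so the backend presumably resolves them against the target's `url`. A sketch of that normalization step (the raw field values would come from applying the CSS selectors, e.g. via cheerio; `toArticle` itself is a hypothetical helper for illustration):

```javascript
// Sketch: normalize one scraped record into an article object.
function toArticle(target, raw) {
  return {
    source: target.name,
    category: target.category,
    title: (raw.title || '').trim(),
    // Resolve relative hrefs against the target URL.
    link: new URL(raw.link, target.url).href,
    description: (raw.description || '').trim(), // may be '' (no summary)
    time: (raw.time || '').trim(),
  };
}
```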

After editing sources, restart the backend (stop the running `npm start` process with Ctrl-C first; `npm restart` alone won't free port 3000 unless a stop script is defined):

$ cd backend
$ npm start

FAQ

| Issue | Solution |
| --- | --- |
| Port already in use | Change the `PORT` env var (e.g. `PORT=4000 npm start`) or free port 3000 |
| CORS error | CORS is enabled globally; update the whitelist if you serve the frontend from a CDN |
| Scrape fails | Check for anti-bot measures and verify your CSS selectors |

Contributing & License

  • Pull requests, issues and stars are welcome! 🌟
  • Released under the MIT License — free for personal & commercial use.
