Improving Self-Hosting and Removing 3rd Party dependencies. #4513

Merged 101 commits on Jan 27, 2025

Podginator
Contributor

The intent of this PR is to improve the self-hosting documentation and to provide a working setup for running Omnivore with Docker and Docker Compose. As far as possible, it removes third-party dependencies and reliance on external infrastructure providers such as GCP.

The aim is to establish feature parity, or near feature parity, with the previously hosted service. This includes RSS support, webhook support, email newsletters, and PDF support.

The list of changes to date is below:

  • Create Dockerfile for Queue processing, which is used for parsing articles, alongside asynchronous tasks.

  • Update and expose ImageProxy and use the latest version with ARM64 support.

  • Create new docker-compose file in self-hosting/docker-compose.

  • Provide a minimal .env file to be able to run the service using docker-compose.

  • Created a guide for using Cloudflare Tunnels as a way to integrate with a device at your home.

  • Create a NGINX configuration for those looking to use NGINX Reverse Proxying for the service.

  • Replace use of Google Cloud Storage with MinIO, an open-source storage layer compatible with the S3 API that can run on-device.

    • This also allows other services, such as R2 and S3, to be the storage provider, if wanted.
  • Improvements to content-fetching to minimise instances where articles refused to parse.

    • Some articles are now parsed from raw HTML instead of going through Puppeteer.
  • Overhaul the way email works, to ensure that there is an open source version. Three options are provided here.

    • Docker Mailserver: A production-ready fullstack but simple containerized mail server. This allows incoming emails to be received, parsed, and then added to Omnivore.
    • Amazon Simple Email Service: A service provided by AWS that has a free tier. It allows emails to be received at a domain. A setup guide is in the self-hosting readme.
    • Zapier: Used as a way to forward Gmail messages to the self-hosted instance. This could realistically also be achieved using the Gmail APIs.
  • Replace pspdfkit, which required a license and would display the following when viewing PDFs: [image]

    • Add an option to use the native browser PDF viewer for PDF files. This removes the highlight functionality, but is stable.
    • Create a new PDF viewer using PDF.js, an open-source PDF library that backs the PDF viewer in Firefox. This option has near feature parity (highlights, reading progress) with pspdfkit, but may have some bugs.
  • Add some additional fixes to parsing articles, such as a Medium parser and a Wired parser.

  • Updated Docker images and software to the latest LTS version of Node (20.12)
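
As a rough illustration of the storage change in the list above, the sketch below shows how a single S3-style configuration could point at MinIO, R2, or S3 by swapping only the endpoint. It is not taken from this PR: `StorageConfig`, `storage_config`, the `minio:9000` host, and the bucket name are hypothetical, and the R2 and S3 URLs only follow the general endpoint patterns those providers document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageConfig:
    """Hypothetical settings an S3-compatible client would need."""
    endpoint_url: str
    bucket: str
    path_style: bool  # path-style addressing is commonly used with MinIO


def storage_config(provider: str, bucket: str, *, region: str = "us-east-1",
                   account_id: str = "",
                   minio_host: str = "minio:9000") -> StorageConfig:
    """Build an endpoint configuration for the chosen provider.

    The provider names and defaults are assumptions for illustration only.
    """
    if provider == "minio":
        # Assumes MinIO runs as a compose service reachable at `minio_host`.
        return StorageConfig(f"http://{minio_host}", bucket, True)
    if provider == "r2":
        if not account_id:
            raise ValueError("R2 needs a Cloudflare account id")
        return StorageConfig(
            f"https://{account_id}.r2.cloudflarestorage.com", bucket, False)
    if provider == "s3":
        return StorageConfig(f"https://s3.{region}.amazonaws.com", bucket, False)
    raise ValueError(f"unknown storage provider: {provider!r}")


if __name__ == "__main__":
    print(storage_config("minio", "omnivore"))
```

Because every provider speaks the S3 API, the rest of the upload code would not need to know which one is in use; only this configuration step differs.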

To-Do:

  • Re-enable YouTube features, such as extraction of transcripts.
    • Allow both an AI-based feature for this, and a less formatted version.
  • Provide a guide on how to get up and running using Kubernetes.
  • Provide a guide on how to get up and running with Tailscale.
  • Provide a guide on getting email to work with Gmail without the use of an external server.
  • Attempt to provide a lighter-weight queuing system, and removal of Redis/Caching for single-user hosting.

@tubit

tubit commented Jan 28, 2025

Awesome! Thanks for all your effort, I was just able to deploy a running omnivore instance following the guide you provided.

Some things still broken:

  • iOS app crashes when logging in to a self-hosted instance, with no clear indication why
  • No success setting up email: the Zapier and AWS examples both fail with endpoint errors:
    • invalid request URL: malformed URL "/mail/mail": must provide absolute remote URL
    • invalid request URL: malformed URL "/mail/sns": must provide absolute remote URL
  • some unclear documentation when using nginx (e.g. you need to add /api to the SERVER_BASE_URL, /images to the IMAGE_PROXY_URL, etc.)
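
The two points above (relative URLs being rejected, and path prefixes being needed behind nginx) could be checked with a small pre-flight script along these lines. This is only a sketch under assumptions: `is_absolute_url`, `with_prefix`, `nginx_env`, and the `omnivore.example.com` domain are hypothetical, and only the `SERVER_BASE_URL` and `IMAGE_PROXY_URL` names and their `/api` and `/images` suffixes come from the comment itself.

```python
from urllib.parse import urlsplit


def is_absolute_url(url: str) -> bool:
    """True when the URL has an http(s) scheme and a host."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def with_prefix(base_url: str, prefix: str) -> str:
    """Join a base URL and a path prefix with exactly one slash."""
    return base_url.rstrip("/") + "/" + prefix.strip("/")


def nginx_env(base_url: str) -> dict:
    """Derive the two values mentioned above from one public base URL."""
    if not is_absolute_url(base_url):
        raise ValueError(f"expected an absolute URL, got {base_url!r}")
    return {
        "SERVER_BASE_URL": with_prefix(base_url, "api"),
        "IMAGE_PROXY_URL": with_prefix(base_url, "images"),
    }


if __name__ == "__main__":
    # A path such as "/mail/mail" has no scheme or host, so it is not absolute.
    print(is_absolute_url("/mail/mail"))
    for name, value in nginx_env("https://omnivore.example.com").items():
        print(f"{name}={value}")
```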

I may try to improve documentation and provide a PR.

@thiswillbeyourgithub

Can we write somewhere obvious (a roadmap?) the state of the following questions?

  • ability to import from the files that were exported from omnivore
  • ability to import highlights
  • ability to import pdfs / pdf highlights

@Podginator
Contributor Author

@tubit
Hi Tubit, the mail stuff wasn't working because of a dumb copy-and-paste error on my part. When changing the docker-compose file over to the newly published images, I didn't point the mail service at the correct image. I am planning on fixing that tonight and should push it in a few hours.

Sorry about that!

@tubit

tubit commented Jan 29, 2025

No worries, @Podginator, no need to say sorry. Looking forward to testing this again.

