Commit ef103dd

Fix robots.txt (#239)
* Fix robots.txt so that it is generated automatically for Production.
  Validate on staging site and fix staging yaml code to work correctly with GitHub Pages.
  Update GitHub Action, replacing PeaceIris actions with vanilla GitHub actions; align with Hugo's documented deployment process.
* Update Readme.
  Add section on Staging.

Signed-off-by: Bill Stumbo <[email protected]>
1 parent 186752d commit ef103dd

9 files changed: +168 -40 lines changed

.github/workflows/gh-pages.yml

+51 -16

```diff
@@ -25,13 +25,26 @@ on:
   push:
     branches:
       - main # Set a branch to deploy
+
   schedule:
     - cron: "0 3 * * *"
 
+  workflow_dispatch:
+
+permissions:
+  contents: read
+  pages: write
+  id-token: write
+
+defaults:
+  run:
+    shell: bash
+
 env:
   # ----------------------------------------------------------------------------
   # Specify the deployment environment: staging or production
-  hugoEnvironment: production
+  HUGO_ENVIRONMENT: production
+  HUGO_VERSION: 0.133.1
 
 jobs:
   # ----------------------------------------------------------------------------
@@ -42,6 +55,7 @@ jobs:
     outputs:
       zoteroVersion: ${{ fromJson(steps.zoteroVersion.outputs.headers).last-modified-version }}
       cacheHit: ${{ steps.cache-zotero-bib.outputs.cache-hit }}
+
     runs-on: ubuntu-latest
     concurrency:
       group: ${{ github.workflow }}-${{ github.ref }}
@@ -66,15 +80,18 @@ jobs:
             ${{ fromJson(steps.zoteroVersion.outputs.headers).last-modified-version }}
 
   # ----------------------------------------------------------------------------
-  # Deploy the website. This job is conditional, we will always run it on a
+  # Build the website. This job is conditional, we will always run it on a
   # push or if on a scheduled run the cache was determined to be out of date.
   #
-  deploy:
+  build:
     needs: check
     runs-on: ubuntu-latest
     if: github.event_name == 'push' || needs.check.outputs.cacheHit != 'true'
     steps:
       - uses: actions/checkout@v4
+        with:
+          submodules: recursive
+          fetch-depth: 0
 
       - name: Cache Zotero Bibliography
         id: cache-zotero-bib
@@ -102,13 +119,17 @@ jobs:
             ./update_bibliography.sh
             sudo cp --recursive ${GITHUB_WORKSPACE}/static/data ~/data
           fi
-        shell: bash
 
-      - name: Setup Hugo
-        uses: peaceiris/actions-hugo@v3
-        with:
-          hugo-version: '0.127.0'
-          extended: true
+      # Install Hugo Extended
+      #
+      - name: Install Hugo CLI
+        run: |
+          wget -O ${{ runner.temp }}/hugo.deb https://github.com/gohugoio/hugo/releases/download/v${HUGO_VERSION}/hugo_extended_${HUGO_VERSION}_linux-amd64.deb \
+          && sudo dpkg -i ${{ runner.temp }}/hugo.deb
+
+      - name: Setup Pages
+        id: pages
+        uses: actions/configure-pages@v5
 
       - name: Setup Node
         uses: actions/setup-node@v4
@@ -120,11 +141,25 @@ jobs:
       - run: npm install --verbose
 
       - name: Build
-        run: hugo -e $hugoEnvironment
-
-      - name: Deploy
-        uses: peaceiris/actions-gh-pages@v4
-        if: github.ref == 'refs/heads/main'
+        env:
+          HUGO_CACHEDIR: ${{ runner.temp }}/hugo_cache
+          TZ: America/New_York
+        run: hugo --cleanDestinationDir -e $HUGO_ENVIRONMENT
+
+      - name: Upload artifact
+        uses: actions/upload-pages-artifact@v3
         with:
-          github_token: ${{ secrets.GITHUB_TOKEN }}
-          publish_dir: ./public
+          path: ./public
+
+  deploy:
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+    runs-on: ubuntu-latest
+    needs: build
+
+    steps:
+      - name: Deploy to GitHub Pages
+        id: deployment
+        uses: actions/deploy-pages@v4
+
```
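The `if:` expression on the `build` job decides whether a run produces a new site at all. The following is a minimal Python sketch of that expression, not part of the workflow itself; the function name `should_build` is hypothetical, and it assumes job outputs arrive as strings (so a cache miss is anything other than the literal `'true'`).

```python
def should_build(event_name: str, cache_hit: str) -> bool:
    """Mirror the workflow expression:

        github.event_name == 'push' || needs.check.outputs.cacheHit != 'true'

    `cache_hit` is compared as a string because workflow outputs are strings.
    """
    return event_name == "push" or cache_hit != "true"


if __name__ == "__main__":
    # Enumerate the trigger/cache combinations the workflow can see.
    for event in ("push", "schedule", "workflow_dispatch"):
        for hit in ("true", ""):
            print(f"{event:18} cacheHit={hit!r:7} -> build={should_build(event, hit)}")
```

Read this way, the expression treats a manual `workflow_dispatch` run like a scheduled run: with a warm bibliography cache the `build` job is skipped. Because `deploy` declares `needs: build` and has no condition of its own, it would normally be skipped along with it.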

README.md

+99 -17

```diff
@@ -86,35 +86,117 @@ reflection of the bibliographic material related to Medley and Interlisp.
 
 Building the website is driven by a GitHub workflow.
 
-The workflow is trigger by one of two events, a `push` to main, representing updates
+The workflow is triggered by one of two events, a `push` to main, representing updates
 to the Interlisp.org website or a scheduled execution of the workflow. The
 workflow is scheduled to run on a regular basis to ensure the bibliography remains
 consistent with the online Zotero catalog.
 
-The workflow consists of two jobs. The first job, `check`, uses Zotero's REST
-interface to query for the latest version of the group bibliography. The call
-made is a `GET` call to `https://api.zotero.org/groups/2914042/items`. This call
+The GitHub Action workflow can also be initiated from the Action panel within
+the Interlisp.github.io repository. This option allows manual execution when
+necessary.
+
+The workflow consists of three jobs. The first job, `check`, uses Zotero's REST
+interface to query for the latest version of the group bibliography. A `GET` call
+is made to `https://api.zotero.org/groups/2914042/items`. It
 returns a collection of metadata and information describing the current state of
 items within the catalog. We are interested in a specific header, `Last-Modified-Version`.
 The value returned with this header is incremented every time the Zotero Interlisp
-catalog is updated. The value returned is used as a cache-key for a cached
-version of the json file of the bibliography we create. The first job completes
-by providing the current Zotero version and whether the cache needs to be updated.
+catalog is updated. We use the value returned as a cache-key for the bibliography.
+If the cache-key matches one in the current GitHub Action cache we use the saved
+bibliography information and save the overhead of building it.
 
-The second job, `deploy`, starts by determining if a deploy needs to occur. If
-the workflow was initiated by a `push` a deploy will always be done. However,
-if the workflow was started by a scheduled execution, if the Zotero bibliography
-cache is consistent with the online Zotero catalog, the deploy is skipped.
+The second job, `build`, starts by determining if a build and deploy need to occur.
+If the workflow was initiated by a `push` a deploy will always be done. However,
+if the workflow was started by a scheduled execution and the Zotero bibliography
+cache is consistent with the online Zotero catalog, the build and deploy are skipped.
 
-A deploy starts by checking out the `Interlisp.github.io` repository. Then,
-if the cache is valid, the contents are copied into the `data` file within the
+A build starts by checking out the `Interlisp.github.io` repository. Then,
+if the Zotero cache is valid, its contents are copied into the `data` file within the
 repository directory structure. If the cache is invalid, the `update_bibliography.sh`
-shell script is run to download a new copy of the bibliography as a `json` file.
+shell script runs and downloads a new copy of the bibliography as a `json` file.
 Once downloaded the script does some additional processing to complete the
 formatting of the file.
 
-Once this work is completed, Hugo is setup and the website is deployed.
+After this work is completed, Hugo is set up and run to build the website. We use
+Hugo extended to build our site. The version of Hugo currently being used is
+defined by the environment variable, `HUGO_VERSION`.
+
+We run Hugo with two flags:
+
+- `-e $HUGO_ENVIRONMENT` to specify whether we are building a production or staging site. If the website is being built to deploy to Interlisp.org, it should be built with `HUGO_ENVIRONMENT` set to production. Deployment to any other site should set the environment flag to staging.
+- `--cleanDestinationDir` clears the destination directory, `./public`, on each build. This will ensure we do not have any unneeded artifacts in our deployment.
+
+The last part of the build activity is to save the created artifact, the information
+in the `./public` directory. We use the GitHub composite action `upload-pages-artifact`
+for this. It packages the contents of the directory and stores it in the appropriate
+format for deployment to GitHub Pages.
+
+The last job in the tool chain is `deploy`. This job simply takes the output
+of the build step and formally deploys it to GitHub Pages using the GitHub `deploy-pages`
+action.
+
+### Deploying a Staging Site
+
+Successfully deploying a Staging Site requires you to configure your GitHub
+repository to enable GitHub Pages. The following steps will accomplish this task:
+
+1. Clone the Interlisp.github.io repository into your GitHub site
+2. In GitHub go to the cloned repository, in my case https://github.com/stumbo/InterlispDraft.github.io, and select Settings
+3. Under Settings, find Pages and select it
+4. Under **Build and deployment** set Source to *GitHub Actions*
+
+Once the repository is cloned and GitHub Pages has been set up, you can deploy a
+staging site to validate changes prior to creating a Pull Request to merge your
+changes back into the main site.
+
+When creating a staging site we want to do a couple of things to ensure we do not
+interfere with the production site: first, we want to disable Google Analytics,
+and second, we want to ensure the site is not crawled and indexed.
+
+#### Setup Your Repository
+
+A best practice for updating your clone of the repository is to create a branch
+and make the following required changes on the branch you created.
+
+The appropriate settings for this are all enabled by setting the `HUGO_ENVIRONMENT`
+variable in `.github/workflows/gh-pages.yml` to *staging*.
+
+You also need to set `baseURL` to match the GitHub site you are deploying to
+in the `config/staging/hugo.yaml` file. The file currently looks like:
+
+```yaml
+baseURL: https://stumbo.github.io/InterlispDraft.github.io/
+
+languageCode: en-us
+
+# title
+# Insert Staging Environment onto every page to make clear
+# this is not the production site
+title: 'Staging Environment'
+```
+
+Make sure the `baseURL` reflects the complete path of your repository. Failure
+to do this will either cause the deployment to fail or leave URLs within your built
+site incorrectly set, resulting in 404s or missing resources.
+
+With these changes the cloned repository is ready to be deployed to a staging site.
+
+Commit the changes you made and push the new branch to your cloned repository.
+At this point, create a Pull Request to merge the changes you made into your
+repository's main branch. Complete the operation by merging the pull request.
+
+Once the merge occurs, the GitHub Actions should fire off and your site will be
+built and deployed.
+
+Once you have successfully completed this operation and your staging site is
+deployed and operational you can experiment with adding new content or
+functionality to the Interlisp site.
+
+#### Develop a Feature
 
+To develop new pages or functionality, create a new branch for your work. Once
+you have completed development and testing on your staging site, you can create
+a PR to merge the content into the Interlisp site.
 
 ### Running Hugo and Docsy Locally
 
@@ -233,7 +315,7 @@ that have components specific to `Interlisp.github.io` are as follows:
 - `documentation` - contains the pdf files referenced in the document section of the home page
 - `favicons` - contains `favicon.png` a small icon that browsers can use when referencing the website
 - `Resources` - contains the current `Interlisp-D` logo, used on the home page, and another instance of `favicon.png`
-- `CNAME` - a oneline text file that provides support for using a [custom domain](https://gohugo.io/hosting-and-deployment/hosting-on-github/#use-a-custom-domain)
+- `CNAME` - a one-line text file that provides support for using a [custom domain](https://gohugo.io/hosting-and-deployment/hosting-on-github/#use-a-custom-domain)
 
 ## Search
 
@@ -256,7 +338,7 @@ layout as being `search`.
 
 ### Updating Search
 
-Modfying the websites that are searched requires updating the Google Custom
+Modifying the websites that are searched requires updating the Google Custom
 Search engine settings. This is done via logging into Google's Programmable Search
 Engine Dashboard at: [https://programmablesearchengine.google.com](https://programmablesearchengine.google.com)
 
```
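The README's description of the `check` job (query the Zotero group, read `Last-Modified-Version`, use it as a cache key) can be sketched in Python as follows. This is an illustration only: the workflow does this with GitHub Actions steps, not Python, and the function names and the `zotero-bib-` key prefix are hypothetical, since the actual key format is not shown in this diff.

```python
import urllib.request
from typing import Optional

# URL taken from the README text.
ZOTERO_ITEMS_URL = "https://api.zotero.org/groups/2914042/items"


def fetch_library_version(url: str = ZOTERO_ITEMS_URL, timeout: float = 30.0) -> Optional[str]:
    """Issue a GET request and return the Last-Modified-Version header, if present."""
    request = urllib.request.Request(url, method="GET")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Header lookup on the response is case-insensitive.
        return response.headers.get("Last-Modified-Version")


def cache_key(version: Optional[str], prefix: str = "zotero-bib-") -> str:
    """Build a cache key from the library version (the prefix is an assumed convention)."""
    if not version:
        raise ValueError("Zotero response carried no Last-Modified-Version header")
    return f"{prefix}{version}"


if __name__ == "__main__":
    # Requires network access; prints the key a cache lookup would use.
    print(cache_key(fetch_library_version()))
```

Because the version number changes only when the catalog changes, a key built from it stays stable between scheduled runs until someone edits the bibliography, which is what lets the workflow skip the rebuild.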

config/_default/hugo.yaml

+1 -1

```diff
@@ -17,7 +17,7 @@ canonifyURLs: false
 # Default to false and set to true for production in config/production/hugo.yaml
 enableRobotsTXT: false
 
-# aasetDir: Location where Hugo looks for assets
+# assetDir: Location where Hugo looks for assets
 assetDir: static
 
 # Enable .GitInfo object for each page. This will give values to .Lastmod etc.
```

config/_default/markup.yaml

+1 -1

```diff
@@ -14,7 +14,7 @@ goldmark:
 # See: https://gohugo.io/getting-started/configuration-markup/#highlight
 # for additional information on configuration options.
 # https://www.docsy.dev/docs/adding-content/lookandfeel/#code-highlighting-with-chroma
-# contains Docsy specfiic code highlighting information
+# contains Docsy specific code highlighting information
 highlight:
   # See a complete list of available styles at https://xyproto.github.io/splash/docs/all.html
   #style: solarized-dark
```

config/_default/params.yaml

+1 -1

```diff
@@ -53,7 +53,7 @@ github_branch: main
 github_subdir:
 github_project_repo: https://github.com/interlisp/medley
 
-# Google custom seach engine configuration
+# Google custom search engine configuration
 # gcs_engine_id: search engine
 gcs_engine_id: 33ef4cbe0703b4f3a
 
```

config/_default/privacy.yaml

+1 -1

```diff
@@ -5,7 +5,7 @@
 # See: https://gohugo.io/about/hugo-and-gdpr/
 #
 # googleAnalytics:
-# ananymizIP: Enable anonymiation of IP addresses
+# anonymizeIP: Enable anonymization of IP addresses
 # disable: Set to true to disable googleAnalytics
 # respectDoNotTrack: Check for Do Not Track in headers
 # useSessionStorage: Store session information in storage and
```

config/staging/hugo.yaml

+6 -2

```diff
@@ -1,4 +1,8 @@
-baseURL: https://wasm.interlisp.org
+baseURL: https://stumbo.github.io/InterlispDraft.github.io/
+
 languageCode: en-us
+
+# title
+# Insert Staging Environment onto every page to make clear
+# this is not the production site
 title: 'Staging Environment'
-publishDir: stage
```
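The staging `baseURL` has to match the address GitHub Pages serves the repository from. The sketch below derives that address from an owner and repository name. It assumes default GitHub Pages addressing with no custom domain, and `pages_base_url` is a hypothetical helper, not something in this repository.

```python
def pages_base_url(owner: str, repo: str) -> str:
    """Return the default GitHub Pages URL for a repository (no custom domain).

    A repository named `<owner>.github.io` is served from the site root;
    any other repository is served from a sub-path named after it.
    """
    host = f"{owner.lower()}.github.io"
    if repo.lower() == host:
        return f"https://{host}/"
    return f"https://{host}/{repo}/"


if __name__ == "__main__":
    # The staging repository used in this commit is a project site, so the
    # repository name appears as a path segment in baseURL.
    print(pages_base_url("stumbo", "InterlispDraft.github.io"))
```

For the staging fork in this commit the result matches the `baseURL` set above, repository name included. The production site uses a custom domain through `CNAME`, so this derivation does not apply to it.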

layouts/robots.txt

+4

```diff
@@ -0,0 +1,4 @@
+# Production robots.txt
+#
+User-agent: *
+Allow: /
```

static/robots.txt

+4 -1

```diff
@@ -1,2 +1,5 @@
+# Placeholder file. Overwritten in production by setting enableRobotsTXT to true
+# See config/production/hugo.yaml
+#
 User-agent: *
-Allow: /
+Disallow: /
```
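One way to check the intended effect of the two robots.txt files is to feed them to Python's standard `urllib.robotparser`. This is a sketch, not part of the commit: the two strings copy the file bodies from the diffs above, and `crawling_allowed` is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# Generated from layouts/robots.txt when enableRobotsTXT is true (production).
PRODUCTION_ROBOTS = """\
# Production robots.txt
#
User-agent: *
Allow: /
"""

# static/robots.txt placeholder, served as-is on staging builds.
STAGING_ROBOTS = """\
# Placeholder file. Overwritten in production by setting enableRobotsTXT to true
# See config/production/hugo.yaml
#
User-agent: *
Disallow: /
"""


def crawling_allowed(robots_txt: str, path: str = "/", agent: str = "Googlebot") -> bool:
    """Parse a robots.txt body and report whether `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    print("production crawlable:", crawling_allowed(PRODUCTION_ROBOTS))
    print("staging crawlable:   ", crawling_allowed(STAGING_ROBOTS))
```

The same helper could be pointed at the text fetched from a deployed staging site to confirm the placeholder file, not the production template, is the one being served.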
