Skip to content

Commit

Permalink
Support human readable slugs for non-English locales
Browse files Browse the repository at this point in the history
This PR:

- No longer sets `slug` to `nil` when the locale is non-English.
  Every locale is considered "sluggable".
- This also means we no longer need the `sluggable_locale?`,
  `slug_eligible?` methods and ASCII checks. (The latter has
  caused a number of Zendesk tickets, whereby even English HTML
  Attachments have been given a content-ID based URL, because
  they contain a non-ASCII character like a fancy single quote).
- `sluggable_string` doesn't appear to have been in use, so it
  was deleted. (`sluggable_string` is used in the Document class:
  `friendly_id :sluggable_string`. But the Attachment class uses
  `friendly_id :title`, so overriding the `sluggable_string`
  method had no effect).
- Adds a test for friendly_id's native "uniqueness" feature
  (which we've opted into via `use: :scoped, scope: :attachable`
  on the Attachment class), so in theory we should be protected
  from clashes in the attachment slug namespace. Therefore the
  issue alluded to in #5321 (which reverts the last time this
  was attempted) should not happen again.
- Adds a basic safeguard that the slug-generation results in a
  slug that contains at least 4 a-z or A-Z characters. This is
  to protect against generating a slug like `-`, which could
  happen from sluggifying a title like `{chinese}-{chinese}`.
- That said, if we somehow DO get into a state whereby an
  attachment has what we'd call an invalid slug (<4 azAZ chars),
  we don't want to fix it if the attachment is already live.
  So we check for `safely_resluggable?`, introduced in #5942.

Result:

- Most HTML attachments should now have human-readable slugs.
  Only languages that are non-ASCII and cannot be converted to
  ASCII, such as Chinese, will continue to fall back to the
  content-ID based URLs.

Trello: https://trello.com/c/YGsEG9YV/3418-improve-html-attachment-slug-creation
  • Loading branch information
ChrisBAshton committed Feb 5, 2025
1 parent 5dc1afa commit 5308c32
Show file tree
Hide file tree
Showing 2 changed files with 72 additions and 61 deletions.
31 changes: 9 additions & 22 deletions app/models/html_attachment.rb
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,8 @@ class HtmlAttachment < Attachment
autosave: true,
inverse_of: :html_attachment

before_validation :clear_slug_if_non_english_locale
before_create :ensure_slug_is_valid
before_save :ensure_slug_is_valid

validates :govspeak_content, presence: true

Expand Down Expand Up @@ -61,9 +62,7 @@ def url(options = {})
end

def should_generate_new_friendly_id?
return false unless sluggable_locale?

slug.nil? || attachable.nil? || safely_resluggable?
slug.nil? || slug.blank? || attachable.nil? || safely_resluggable?
end

def deep_clone
Expand All @@ -83,9 +82,7 @@ def translated_locales
end

def identifier
return slug if slug_eligible?

content_id
slug || content_id
end

def base_path
Expand Down Expand Up @@ -123,21 +120,11 @@ def public_url(options = {})
root + public_path(options)
end

def sluggable_locale?
locale.blank? || (locale == "en")
end

def sluggable_string
sluggable_locale? ? title : nil
end

def slug_eligible?
title.ascii_only? && sluggable_locale?
end
# Sense-check that the title converted to ASCII contains more than just hyphens!
def ensure_slug_is_valid
return unless slug.present? && slug.count("A-Za-z") < 4

def clear_slug_if_non_english_locale
if locale_changed? && !sluggable_locale?
self.slug = nil
end
# Force FriendlyId to regenerate the slug, but only if attachment isn't live already
self.slug = nil if safely_resluggable?
end
end
102 changes: 63 additions & 39 deletions test/unit/app/models/html_attachment_test.rb
Original file line number Diff line number Diff line change
Expand Up @@ -46,18 +46,6 @@ class HtmlAttachmentTest < ActiveSupport::TestCase
assert_equal expected, actual
end

test "#url returns absolute path to the draft stack when previewing for non-english locale" do
edition = create(:draft_publication, :with_html_attachment)
attachment = edition.attachments.first
attachment.update!(locale: "fi")

expected = "https://draft-origin.test.gov.uk/government/publications/"
expected += "#{edition.slug}/#{attachment.content_id}?preview=#{attachment.id}"
actual = attachment.url(preview: true, full_url: true)

assert_equal expected, actual
end

test "#url returns absolute path to the live site when not previewing" do
edition = create(:published_publication, :with_html_attachment)
attachment = edition.attachments.first
Expand All @@ -69,31 +57,12 @@ class HtmlAttachmentTest < ActiveSupport::TestCase
assert_equal expected, actual
end

test "#url returns absolute path to the live site when not previewing for non-english locale" do
edition = create(:published_publication, :with_html_attachment)
attachment = edition.attachments.first
attachment.update!(locale: "fi")

expected = "https://www.test.gov.uk/government/publications/"
expected += "#{edition.slug}/#{attachment.content_id}"
actual = attachment.url(full_url: true)

assert_equal expected, actual
end

test "#url returns relative path by default" do
edition = create(:published_publication, :with_html_attachment)
attachment = edition.attachments.first
assert_equal "/government/publications/#{edition.slug}/#{attachment.slug}", attachment.url
end

test "#url returns relative path by default for non-english locale" do
edition = create(:published_publication, :with_html_attachment)
attachment = edition.attachments.first
attachment.update!(locale: "fi")
assert_equal "/government/publications/#{edition.slug}/#{attachment.content_id}", attachment.url
end

test "#url works when an attachment with english locale has special characters in title" do
edition = create(:published_publication, :with_html_attachment)
attachment = edition.attachments.first
Expand All @@ -111,7 +80,7 @@ class HtmlAttachmentTest < ActiveSupport::TestCase
statistics = create(:published_national_statistics)
attachment = statistics.attachments.last
attachment.update!(locale: "fi")
assert_equal "/government/statistics/#{statistics.slug}/#{attachment.content_id}", attachment.url
assert_equal "/government/statistics/#{statistics.slug}/#{attachment.slug}", attachment.url
end

test "#url works with consultation outcomes" do
Expand All @@ -124,7 +93,7 @@ class HtmlAttachmentTest < ActiveSupport::TestCase
consultation = create(:consultation_with_outcome_html_attachment)
attachment = consultation.outcome.attachments.first
attachment.update!(locale: "fi")
assert_equal "/government/consultations/#{consultation.slug}/outcome/#{attachment.content_id}", attachment.url
assert_equal "/government/consultations/#{consultation.slug}/outcome/#{attachment.slug}", attachment.url
end

test "#url works with consultation public feedback" do
Expand All @@ -137,7 +106,7 @@ class HtmlAttachmentTest < ActiveSupport::TestCase
consultation = create(:consultation_with_public_feedback_html_attachment)
attachment = consultation.public_feedback.attachments.first
attachment.update!(locale: "fi")
assert_equal "/government/consultations/#{consultation.slug}/public-feedback/#{attachment.content_id}", attachment.url
assert_equal "/government/consultations/#{consultation.slug}/public-feedback/#{attachment.slug}", attachment.url
end

test "#url works with call for evidence outcomes" do
Expand Down Expand Up @@ -232,7 +201,7 @@ class HtmlAttachmentTest < ActiveSupport::TestCase
assert_equal "an-html-attachment", second_draft_attachment.slug
end

test "slug is not created for non-english attachments" do
test "slug is not created for non-english attachments if conversion to non-ASCII isn't possible" do
# Additional attachment to ensure the duplicate detection behaviour isn't triggered
create(:html_attachment, locale: "fr")
attachment = create(:html_attachment, locale: "ar", title: "المملكة المتحدة والمملكة العربية السعودية")
Expand All @@ -249,11 +218,66 @@ class HtmlAttachmentTest < ActiveSupport::TestCase
assert_equal expected_slug, attachment.to_param
end

test "slug is cleared when changing from english to non-english" do
attachment = create(:html_attachment, locale: "en")
test "slug is created for non-english attachments where conversion to ASCII is possible" do
attachment = create(:html_attachment, locale: "cs", title: "toto je náhodný název")

attachment.update!(locale: "fr")
assert attachment.slug.blank?
expected_slug = "toto-je-nahodny-nazev"
assert_equal expected_slug, attachment.slug
assert_equal expected_slug, attachment.to_param
end

test "slug requires minimum 3 azAZ characters" do
attachment = create(:html_attachment, locale: "cs", title: "tot")
assert_equal nil, attachment.slug
end

test "slug reverts to nil, if document has never been published, if title is changed such that minimum 3 azAZ characters is not reached" do
attachment = build(:html_attachment, title: "an-html-attachment")

create(:draft_publication, attachments: [attachment])

attachment.title = "foo"
attachment.save!
attachment.reload

assert_equal nil, attachment.slug
end

test "even if slug is considered too short NOW, slug on legacy old attachment is not updated when the title is changed if document is published" do
edition = create(
:published_publication,
attachments: [
build(:html_attachment, title: "an-html-attachment"),
],
)
# Bypass validations and callbacks to manually set slug
attachment = edition.attachments.first
attachment.update_column(:slug, "foo")
assert_equal "foo", attachment.slug

#  now check slug not overridden
draft = edition.create_draft(create(:writer))
attachment = draft.attachments.first
attachment.title = "a-new-title"
attachment.save!
attachment.reload
assert_equal "foo", attachment.slug
end

# This behaviour is built into the friendly_id gem
test "slug is made unique if there is an existing clash" do
edition = create(
:published_publication,
attachments: [
build(:html_attachment, locale: "en", title: "food"),
build(:html_attachment, locale: "cs", title: "food"),
],
)
attachment_1 = edition.attachments.first
attachment_2 = edition.attachments.last

assert_equal "food", attachment_1.slug
assert_equal "food--2", attachment_2.slug
end

test "#translated_locales lists only the attachment's locale" do
Expand Down

0 comments on commit 5308c32

Please sign in to comment.