Adding a controlled vocabulary field

adam malantonio edited this page Nov 20, 2018 · 8 revisions

This is a guide to (hopefully) ease your pain as you go through the Rube Goldberg-ian steps needed to add a controlled vocabulary field to a work model. I think things are easier than they previously were, but it could always get better.

adding an authority

  • add local authority data
  • add remote authority data
  • update records/edit_fields partial
  • update form handling (optional)
  • update indexing (optional)

add local authority data

  • add authority yaml file

This is by far the easiest way to add a local authority, but you are responsible for maintaining it. See the questioning_authority guide for info about how to set this up.
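As a rough illustration of the file format: questioning_authority's file-based local authorities read a list of terms, each with an id, a term (the label) and an optional active flag. The sketch below parses such a document with Ruby's standard YAML library. The languages vocabulary and its values are made up for the example, and in the application QA does the loading for you.

```ruby
require 'yaml'

# Hypothetical contents of a local authority file such as
# config/authorities/languages.yml (the name and terms are examples only).
LANGUAGES_YAML = <<~'YAML'
  terms:
    - id: en
      term: English
      active: true
    - id: fr
      term: French
      active: true
    - id: la
      term: Latin
      active: false
YAML

# Returns the active terms as { id => label }.
# A term without an `active` key is treated as active here (an assumption).
def active_terms(yaml)
  YAML.safe_load(yaml)
      .fetch('terms')
      .select { |term| term.fetch('active', true) }
      .to_h { |term| [term['id'], term['term']] }
end

puts active_terms(LANGUAGES_YAML).inspect
```

Keeping the id separate from the term is what later lets a field store a short code while displaying a full label.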

  • host remote authority locally

questioning_authority provides a way to use a local database table for vocabularies, which is helpful for cutting down request times when querying a third-party service. The downside is that you are then responsible for ensuring that the data is up-to-date.

To load a vocabulary, use the following rake task:

bundle exec rails 'spot:rdf:import[name-of-vocabulary,http://example.org/path/to/vocab.nt]'

Note: The RDF authority parser being used is a custom one that will only load items with an @en tagged value.
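To get a feel for what that restriction means in practice, here is a minimal sketch that keeps only N-Triples statements whose object is an @en literal. It is not the application's parser: the regular expression, the method name and the sample data are assumptions for illustration, and it only recognizes the exact tag @en (not, say, @en-US).

```ruby
# Matches an N-Triples statement whose object is an English-tagged literal:
#   <subject> <predicate> "label"@en .
EN_LITERAL = /\A<([^>]+)>\s+<([^>]+)>\s+"((?:[^"\\]|\\.)*)"@en\s*\.\s*\z/

# Returns [subject, label] pairs for statements with an @en literal.
# Statements with other language tags, or no tag at all, are skipped.
def english_labels(ntriples)
  ntriples.each_line.filter_map do |line|
    match = EN_LITERAL.match(line.strip)
    [match[1], match[3]] if match
  end
end

# Made-up sample data.
SAMPLE = <<~'NT'
  <http://example.org/vocab/1> <http://www.w3.org/2004/02/skos/core#prefLabel> "Cat"@en .
  <http://example.org/vocab/1> <http://www.w3.org/2004/02/skos/core#prefLabel> "Chat"@fr .
  <http://example.org/vocab/2> <http://www.w3.org/2004/02/skos/core#prefLabel> "Dog" .
NT

english_labels(SAMPLE).each { |subject, label| puts "#{subject} #{label}" }
```

If a vocabulary appears to import with fewer terms than expected, untagged or differently tagged literals like the second and third statements here are a likely cause.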

You'll then have to tell QA how to access this vocabulary when being called from the JSON api path (/authorities/local/search/:authority). To do so, you'll need to append the following to the config/initializers/local_authorities.rb file:

Qa::Authorities::Local.register_subauthority(
  <authority_name>, 
  'Qa::Authorities::Local::TableBasedAuthority'
)

add remote authority data

  • questioning_authority out of the box

Out of the box, questioning_authority supports:

  • CrossRef
    • funders
    • journals
  • GeoNames
  • Getty
    • aat
    • tgn
    • ulan
  • Library of Congress
    • actionsGranted
    • agentType
    • childrensSubjects
    • classification
    • countries
    • ethnographicTerms
    • genreForms
    • geographicAreas
    • graphicMaterials
    • iso639-1
    • iso639-2
    • iso639-5
    • languages
    • names
    • organizations
    • performanceMediums
    • preservation
    • relators
    • subjects
  • MeSH
  • OCLC FAST
  • OCLC TS
  • TGN languages

To extend that list with another source, follow the instructions on the questioning_authority readme.

update records/edit_fields partial

The following instructions are for setting up an autocomplete text input for fields. For fields with 15 or fewer terms, a <select> style input may be preferable; see "Using a <select> input" below.

For the most part, the same partial will be used for any of the below types of inputs. This looks like:

<%# app/views/records/edit_fields/_your_field.html.erb %>
<%=
  f.input key,
  as: :input_type,
  input_html: {
    class: 'form-control',
    data: {
      'autocomplete-url' => '/authorities/search/<name>',
      'autocomplete' => key,
    },
  },
  wrapper_html: {
    data: {
      'autocomplete-url' => '/authorities/search/<name>',
      'field-name' => key
    }
  },
  required: f.object.required?(key)
%>

The options for as: are as follows:

key                           | uses
:controlled_vocabulary        | for vocabularies that return RDF URIs as ids
:local_controlled_vocabulary  | for local vocabularies whose ids differ from their labels
:multi_value                  | for plain text values that need a simple autocomplete
:controlled_vocabulary

These inputs are handy for external vocabularies whose id values are RDF URIs. An example payload from this kind of endpoint would look like:

[
  {
    "id": "http://id.loc.gov/vocabulary/languages/fre",
    "label": "French"
  }
]

To use this input, you'll need to do some setup on the WorkModel:

  1. give the property a class_name that knows how to produce a Solr-izable label.
  2. call accepts_nested_attributes_for so that ActiveFedora::Base handles nested attributes for the property

# app/models/work.rb
class Work < ActiveFedora::Base
  # ...
  # 1)
  property :keyword, predicate: 'http://some/uri',
                     class_name: Spot::ControlledVocabularies::Base do |index|
    index.as :symbol
  end

  # ...
  # 2)
  id_blank = proc { |attributes| attributes[:id].blank? }
  accepts_nested_attributes_for :keyword, reject_if: id_blank, allow_destroy: true
end
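The reject_if proc exists so that empty rows in the form (inputs the user left blank) are dropped before they reach the model. The following is a simplified, plain-Ruby approximation of that filtering, not ActiveFedora's implementation: blank_id? stands in for ActiveSupport's blank?, and partition_rows is a made-up name.

```ruby
# Values treated as "destroy this entry" (an assumption modelled on
# how Rails commonly interprets the _destroy flag).
TRUTHY = [true, 'true', '1', 1].freeze

# Plain-Ruby stand-in for `attributes[:id].blank?`.
def blank_id?(attributes)
  attributes[:id].to_s.strip.empty?
end

# Drops rows without an id, then splits the rest into ids to add
# and ids to remove based on the _destroy flag.
def partition_rows(rows)
  kept = rows.reject { |row| blank_id?(row) }
  removed, added = kept.partition { |row| TRUTHY.include?(row[:_destroy]) }
  { add: added.map { |row| row[:id] }, remove: removed.map { |row| row[:id] } }
end

rows = [
  { id: 'http://example.org/a' },
  { id: '' },
  { id: 'http://example.org/b', _destroy: 'true' }
]
puts partition_rows(rows).inspect
```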

Spot::ControlledVocabularies::Base chooses the first @en label among the following properties:

  • RDF::Vocab::SKOS.prefLabel
  • RDF::Vocab::DC.title
  • RDF::RDFS.label
  • RDF::Vocab::SKOS.altLabel
  • RDF::Vocab::SKOS.hiddenLabel
  • RDF::Vocab::GEONAMES.name
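The selection can be sketched in plain Ruby as below. This assumes the list above is a priority order (the first property that has an @en value wins), which is an interpretation rather than something confirmed here. The predicates are written out as URI strings, and the Literal struct and method name are made up so the example runs without the RDF gems.

```ruby
# Predicates in the order listed above, as plain URI strings.
LABEL_PREDICATES = %w[
  http://www.w3.org/2004/02/skos/core#prefLabel
  http://purl.org/dc/terms/title
  http://www.w3.org/2000/01/rdf-schema#label
  http://www.w3.org/2004/02/skos/core#altLabel
  http://www.w3.org/2004/02/skos/core#hiddenLabel
  http://www.geonames.org/ontology#name
].freeze

# Minimal stand-in for a language-tagged RDF literal.
Literal = Struct.new(:value, :language)

# statements: { predicate_uri => [Literal, ...] }
# Returns the first English value found, walking the predicates in order,
# or nil when no predicate has an @en value.
def preferred_english_label(statements)
  LABEL_PREDICATES.each do |predicate|
    literal = statements.fetch(predicate, []).find { |l| l.language == :en }
    return literal.value if literal
  end
  nil
end

statements = {
  'http://www.w3.org/2004/02/skos/core#prefLabel' => [Literal.new('Chat', :fr)],
  'http://www.w3.org/2000/01/rdf-schema#label'    => [Literal.new('Cat', :en)],
  'http://www.w3.org/2004/02/skos/core#altLabel'  => [Literal.new('Kitty', :en)]
}
puts preferred_english_label(statements)
```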

This label is then cached in the application database as an RdfLabel, which should help prevent multiple trips to fetch the same label. Because this is a plain-ol' ActiveRecord::Base object with timestamps, you can pretty easily clear out old items:

RdfLabel.where('created_at < ?', 30.days.ago).destroy_all

The next time an item is indexed, its label will be fetched from the source again.
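The caching behaviour can be pictured with a small in-memory stand-in. This is only a sketch of the idea: LabelCache and its methods are hypothetical, whereas the application stores RdfLabel rows in the database.

```ruby
# In-memory illustration of a label cache with timestamps.
class LabelCache
  Entry = Struct.new(:label, :created_at)

  # The block is called with a URI whenever a label is not cached yet.
  def initialize(&fetcher)
    @fetcher = fetcher
    @entries = {}
  end

  # Returns the cached label, fetching and storing it on first use.
  def label_for(uri)
    (@entries[uri] ||= Entry.new(@fetcher.call(uri), Time.now)).label
  end

  # Removes entries created before the cutoff so they are fetched again.
  def expire_older_than(cutoff)
    @entries.delete_if { |_uri, entry| entry.created_at < cutoff }
  end
end

fetches = 0
cache = LabelCache.new do |uri|
  fetches += 1
  "label for #{uri}"
end

cache.label_for('http://example.org/1')
cache.label_for('http://example.org/1') # served from the cache
puts fetches
```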

:local_controlled_vocabulary

This type of input was created for locally controlled authorities (see: yaml-based qa ones) whose values are not the same as their labels. The practical example is the one this was designed to handle: storing ISO639-1 2-character values for language, but displaying the full-string label. ("en" => "English")

To add handling for this, you'll need to:

  1. update the form to expect this property to have nested_attributes
  2. tell the indexer how to translate the value to a label (optional)

To update the form, add the field name + '_attributes' to the .build_permitted_params method and add the field to the .transform_nested_fields! method call within .model_attributes:

class WorkForm < Hyrax::Forms::WorkForm
  # ...
  class << self
    def build_permitted_params
      super.tap do |params|
        params << { keyword_attributes: [:id, :_destroy] }
      end
    end

    def model_attributes(form_params)
      super.tap do |params|
        transform_nested_fields!(params, :keyword)
      end
    end
  end
end
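To make the intent of that transformation concrete, here is a guess at its general shape in plain Ruby. It is not the application's transform_nested_fields!: the method name flatten_nested_ids!, the handling of _destroy and the sample params are all assumptions. The idea is that the nested <field>_attributes hash submitted by the form is collapsed into a flat list of ids on the field itself.

```ruby
# Hypothetical stand-in for the nested-field transformation.
# Moves ids from params[:<field>_attributes] to params[<field>],
# skipping rows that are blank or flagged for removal.
def flatten_nested_ids!(params, field)
  nested = params.delete(:"#{field}_attributes")
  return params if nested.nil?

  params[field] = nested.values
                        .reject { |row| row[:id].to_s.empty? || row[:_destroy].to_s == 'true' }
                        .map { |row| row[:id] }
  params
end

form_params = {
  title: ['An example'],
  keyword_attributes: {
    '0' => { id: 'en' },
    '1' => { id: 'fr', _destroy: 'true' },
    '2' => { id: '' }
  }
}
puts flatten_nested_ids!(form_params, :keyword).inspect
```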

If the value you're getting back is the same as the value you're storing, then you're all set at this point. If you need to translate the value back to the label for the indexer, you'll need to define a method to do that. Usually this method will just call a local service built for that purpose. See language label indexing and our local iso639-1 service for examples.
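A value-to-label translation of this kind might look roughly like the following. LanguageLabelService and language_labels are hypothetical names, the hard-coded mapping stands in for the authority data, and falling back to the raw value for unknown codes is an assumption.

```ruby
# Hypothetical lookup service; in the application the mapping would
# come from the local authority rather than a constant.
class LanguageLabelService
  LABELS = { 'en' => 'English', 'fr' => 'French' }.freeze

  # Returns the label for a stored code, or the code itself when unknown.
  def self.label_for(code)
    LABELS.fetch(code, code)
  end
end

# What an indexer method could call to turn stored values into labels.
def language_labels(codes)
  codes.map { |code| LanguageLabelService.label_for(code) }
end

puts language_labels(%w[en fr]).inspect
```

Falling back to the stored value keeps indexing from failing when a record holds a code that is missing from the vocabulary.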

:multi_value

As the three types go, :multi_value inputs require the least amount of setup. Simply create a QA endpoint, add the URL to the partial (it'll look like /authorities/local/search/<name>) and you're all set. This is good for inputs where we want to suggest terms but not exactly require them (ex. keywords). It's also very useful for vocabularies whose values are the same as the labels we want to use.

update hyrax/autocomplete.es6

The default for the autocomplete widgets is to use the jquery-ui autocomplete, which presents as a text input that adds a simple dropdown. Personally, I prefer the look of the Select2 input that is only used for based_near in Hyrax out of the box. :controlled_vocabulary and :local_controlled_vocabulary inputs are built specifically for this layout, so this step is important for those.

In app/assets/javascripts/hyrax/autocomplete.es6, you'll need to add the field names onto the switch statement that returns a new LinkedData element:

  export default class Autocomplete {
    setup (element, fieldName, url) {
      switch(fieldName) {
+       case 'keyword':
        case 'based_near':
          new LinkedData(element, url)
        // ...
      }
    }
  }

Using a <select> input

For authorities with 15 or fewer values, it may be preferable to present a <select> input for these options. To add this input, use the following partial instead:

<%# app/views/records/edit_fields/_division.html.erb %>
<% service = DivisionService.new %>
<%= f.input key,
            as: :multi_value_select,
            collection: service.select_active_options,
            include_blank: true,
            input_html: { class: 'form-control' }
%>

In this case, DivisionService is a class that simply inherits from Hyrax::QaSelectService, which turns the YAML values into options that the form builder can render as a <select> box:

class DivisionService < Hyrax::QaSelectService
  def initialize(_auth_name = nil)
    super('division')
  end
end
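For a sense of what the collection: option receives, the sketch below mimics the shape of select_active_options in plain Ruby: a list of [label, id] pairs for the active terms. SketchSelectService and the sample terms are made up for illustration; the real service reads its terms from the authority named in the constructor.

```ruby
# Plain-Ruby illustration of the data a select service hands to the form.
class SketchSelectService
  # terms: array of hashes with :id, :label and an optional :active flag
  def initialize(terms)
    @terms = terms
  end

  # [label, id] pairs for active terms, the shape a form builder
  # typically expects for a select collection.
  def select_active_options
    @terms.select { |term| term.fetch(:active, true) }
          .map { |term| [term[:label], term[:id]] }
  end
end

# Made-up division terms.
service = SketchSelectService.new([
  { id: 'humanities', label: 'Humanities', active: true },
  { id: 'engineering', label: 'Engineering', active: true },
  { id: 'retired', label: 'Retired Division', active: false }
])
puts service.select_active_options.inspect
```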