Commit 390d761

Stop indexing from DataSpace (#746)
* Stop indexing from DataSpace
* Fix typo
* Increase solr_writer thread pool
  As suggested by an error message on the server
* Remove change to solr_writer.thread_pool
  If we need it, it should be in a separate PR
1 parent 1fbe77c commit 390d761

2 files changed: +3 -20 lines changed

README.md

Lines changed: 3 additions & 3 deletions
@@ -1,7 +1,7 @@
 # pdc_discovery
 
 
-A discovery portal for Princeton research data. Initially it will provide a better browsing experience for the research data contained in [DataSpace](https://dataspace.princeton.edu).
+A discovery portal for Princeton research data.
 
 Please note: While this is open-source software, we would disourage anyone from trying to just check it out and run it. Princeton specifics, from styling to authentication and authorization, are hard coded, and we have not invested any time in the kind of configurabily that would be needed for use at another institution. Instead it should be taken as an example of breaking a monolithic project into separate components, and developing iteratively in response to local user feedback.
 
@@ -59,9 +59,9 @@ We utilize Rubocop for our Ryby code and Prettier for our JavaScript
 
 To create a tagged release use the [steps in the RDSS handbook](https://github.com/pulibrary/rdss-handbook/blob/main/release_process.md)
 
-## Indexing research data from DataSpace and PDC Describe
+## Indexing research data from PDC Describe
 
-PDC Discovery indexes data from both DataSpace and from PDC Describe via the following rake task:
+PDC Discovery indexes data from PDC Describe via the following rake task:
 
 ```ruby
 rake index:research_data

lib/tasks/index.rake

Lines changed: 0 additions & 17 deletions
@@ -9,22 +9,12 @@ namespace :index do
 
     Rails.logger.info "Indexing: Fetching PDC Describe records"
     Rake::Task['index:pdc_describe_research_data'].invoke
-    Rails.logger.info "Indexing: Fetching DataSpace records"
-    Rake::Task['index:dspace_research_data'].invoke
     Rails.logger.info "Indexing: Fetching completed"
 
     Indexing::SolrCloudHelper.update_solr_alias!
     Rails.logger.info "Indexing: Updated Solr to read from the new collection: #{Indexing::SolrCloudHelper.alias_url} -> #{Indexing::SolrCloudHelper.collection_reader_url}"
   end
 
-  desc 'Index all DSpace research data collections'
-  task dspace_research_data: :environment do
-    Rails.logger.info "Indexing: Harvesting and indexing DataSpace research data collections started"
-    DspaceResearchDataHarvester.harvest(false)
-    Indexing::SolrCloudHelper.collection_writer_commit!
-    Rails.logger.info "Indexing: Harvesting and indexing DataSpace research data collections completed"
-  end
-
   desc 'Index all PDC Describe data'
   task pdc_describe_research_data: :environment do
     Rails.logger.info "Indexing: Harvesting and indexing PDC Describe data started"
@@ -40,13 +30,6 @@ namespace :index do
     Blacklight.default_index.connection.commit
   end
 
-  desc 'Fetches the most recent community information from DataSpace and saves it to a file.'
-  task cache_dataspace_communities: :environment do
-    cache_file = ENV['COMMUNITIES_FILE'] || './spec/fixtures/files/dataspace_communities.json'
-    communities = DataspaceCommunities.new
-    File.write(cache_file, JSON.pretty_generate(communities.tree))
-  end
-
   desc 'Prints to console the current Solr URLs and how they are configured'
   task print_solr_urls: :environment do
     puts "Solr alias.: #{Indexing::SolrCloudHelper.alias_url}"
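
For readers unfamiliar with the pattern in `lib/tasks/index.rake`, the following is a minimal, standalone sketch of a parent rake task that invokes a single sub-task, which is the shape `index:research_data` appears to have after this commit. It is not the project's code: the `LOG` array is a hypothetical stand-in for `Rails.logger`, the `:environment` prerequisite and the Solr calls are omitted so the sketch runs without Rails, and the task bodies are placeholders.

```ruby
require 'rake'

# Make `namespace` and `task` available outside a Rakefile.
extend Rake::DSL

# Hypothetical stand-in for Rails.logger (assumption, for illustration only).
LOG = []

namespace :index do
  # Parent task: after this commit it delegates to one sub-task only.
  task :research_data do
    LOG << 'Fetching PDC Describe records'
    Rake::Task['index:pdc_describe_research_data'].invoke
    LOG << 'Fetching completed'
    # The real task updates the Solr alias at this point; here we only record it.
    LOG << 'Updated Solr alias'
  end

  # Placeholder for the real harvesting task.
  task :pdc_describe_research_data do
    LOG << 'Harvesting PDC Describe data'
  end
end

Rake::Task['index:research_data'].invoke
puts LOG
```

Note that `invoke` runs a given task at most once per process unless the task is re-enabled with `reenable`, so a sub-task that has already run is skipped on later `invoke` calls.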
