This repository has been archived by the owner on Feb 21, 2024. It is now read-only.

Debug harvest service

thomas loubrieu edited this page Jul 5, 2022 · 2 revisions

This page explains how to debug the scalable harvest service by running it in an IDE while docker compose deploys the other components of the architecture that it depends on.

Local network configuration

Add the following entries to /etc/hosts so that the service hostnames resolve to the local machine:

127.0.0.1       localhost
127.0.0.1       elasticsearch
127.0.0.1       rabbit-mq
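
Before going further, it can help to confirm that the entries are actually in place. The following is a minimal sketch, not part of the project: check_hosts is a hypothetical helper that looks for the two service names mapped to 127.0.0.1 in a hosts file (/etc/hosts by default).

```shell
# check_hosts is a hypothetical helper, not part of registry-harvest-service.
# It verifies that each hostname used in this guide is mapped to 127.0.0.1.
check_hosts() {
  hosts_file="${1:-/etc/hosts}"
  missing=0
  for name in elasticsearch rabbit-mq; do
    # Match "127.0.0.1 <optional other names> <name>" on a non-comment line.
    if grep -Eq "^[[:space:]]*127\.0\.0\.1[[:space:]]+(.*[[:space:]])?${name}([[:space:]]|\$)" "$hosts_file"; then
      echo "ok: $name"
    else
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling check_hosts with no argument inspects /etc/hosts; it returns a non-zero status if either entry is missing.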

Data directory configuration in docker containers

In the .env file, set CONTAINER_HARVEST_DATA_DIR to the absolute path of the directory that holds the data to harvest. The containers must see the data under the same path as the debugged component, which runs outside a container (in an IDE) on the host.

CONTAINER_HARVEST_DATA_DIR=/Users/loubrieu/git/registry/docker/test-data/registry-harvest-data/
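
A relative or mistyped path here tends to surface only later, as files that cannot be found. As a rough sketch, the hypothetical helper check_data_dir below reads the variable from an .env file and checks that it is an absolute path to an existing directory. It assumes the value is written unquoted, as in the example above.

```shell
# check_data_dir is a hypothetical helper, not part of the project.
# It assumes the value in the .env file is unquoted.
check_data_dir() {
  env_file="${1:-.env}"
  # Keep the last assignment if the variable appears more than once.
  dir=$(sed -n 's/^CONTAINER_HARVEST_DATA_DIR=//p' "$env_file" | tail -n 1)
  case "$dir" in
    /*) ;;  # absolute path, keep going
    *) echo "not an absolute path: '$dir'"; return 1 ;;
  esac
  [ -d "$dir" ] || { echo "directory not found: $dir"; return 1; }
  echo "ok: $dir"
}
```

Run check_data_dir from the directory that contains the .env file, or pass the file path as the first argument.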

Start the required components with docker compose

Start the components with the int-registry-service-loader profile:

% docker compose --profile int-registry-service-loader up

Stop the component to debug

List the running containers to find the ID of the harvest service container:

% docker ps

Then stop that container, for example:

% docker stop 136188e4372f
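
Container IDs change on every run, so looking the container up by name can be more convenient. The sketch below wraps the two commands above in a hypothetical helper, stop_by_name. The name fragment you pass is an assumption: check the NAMES column of docker ps for the name used by your compose setup.

```shell
# stop_by_name is a hypothetical helper, not part of the project.
# It stops every running container whose name contains the given fragment.
stop_by_name() {
  ids=$(docker ps --quiet --filter "name=$1")
  if [ -z "$ids" ]; then
    echo "no running container matches '$1'"
    return 1
  fi
  # Intentionally unquoted: one argument per container ID.
  docker stop $ids
}
```

For example, stop_by_name harvest would stop all running containers with "harvest" in their name, which may be more than one.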

Start the debugged component

Start the same component in your IDE.

In the debug configuration:

Main tab:

Project: registry-harvest-service
Main class: gov.nasa.pds.harvest.HarvestServerMain

Arguments tab:

-c src/main/resources/conf/harvest-server.cfg

Environment:

HARVEST_HOME=/Users/loubrieu/git/registry-harvest-service/src/main/resources

In the harvest-server.cfg file, use elasticsearch and rabbit-mq as the hostnames of the OpenSearch and RabbitMQ services.
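
If the service fails to connect at startup, a quick port probe from the host can tell a hostname problem apart from a container that is not running. This is a minimal sketch using nc; check_port is a hypothetical helper. Port 9200 is taken from the registry URL used later on this page, while 5672 is RabbitMQ's default AMQP port and only an assumption about how the compose file publishes it.

```shell
# check_port is a hypothetical helper, not part of the project.
# `nc -z` only probes the port and sends no data; `-w 2` sets a 2 second timeout.
check_port() {
  if nc -z -w 2 "$1" "$2" >/dev/null 2>&1; then
    echo "reachable: $1:$2"
  else
    echo "unreachable: $1:$2"
    return 1
  fi
}
```

For example, run check_port elasticsearch 9200 and check_port rabbit-mq 5672 once docker compose has finished starting the containers.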

Then start the service in debug mode.

Trigger ingestion of data in the registry

Trigger an ingestion with registry-harvest-cli:

./harvest-client harvest -j harvest-job-config.xml -c harvest-client.cfg -overwrite

harvest-job-config.xml:

<harvest nodeName="PDS_ENG">
  <directories>
    <path>/Users/loubrieu/git/registry/docker/test-data/registry-harvest-data/test-data</path>
  </directories>
  <registry url="https://elasticsearch:9200" index="registry" auth="/etc/es-auth.cfg" />
  <fileInfo>
    <!-- UPDATE with your own local path and the base URL where the PDS4 archive is published -->
    <fileRef replacePrefix="/Users/loubrieu/git/registry/docker/test-data/registry-harvest-data" with="http://localhost:81/archive" />
  </fileInfo>
  <autogenFields/>
</harvest>
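
To illustrate the fileRef element, the sketch below shows how a local file path would map to a published URL, assuming fileRef performs a plain prefix substitution as its attribute names suggest. The prefix and base URL are copied from the job file above; the file name example-product.xml is hypothetical.

```shell
# Values copied from the fileRef element above; the file name is hypothetical.
prefix="/Users/loubrieu/git/registry/docker/test-data/registry-harvest-data"
base_url="http://localhost:81/archive"
file="$prefix/test-data/example-product.xml"

# Strip the local prefix from the path and prepend the base URL.
file_url="${base_url}${file#"$prefix"}"
echo "$file_url"
# prints http://localhost:81/archive/test-data/example-product.xml
```

If the replacePrefix value does not match the start of the harvested paths, nothing is stripped, so it is worth keeping it consistent with the path element and with CONTAINER_HARVEST_DATA_DIR.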