Debug harvest service
This page describes how to debug the scalable harvest service in an IDE while docker compose deploys the other required components of the architecture.
Add the following entries to your hosts file (/etc/hosts):
127.0.0.1 localhost
127.0.0.1 elasticsearch
127.0.0.1 rabbit-mq
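A quick way to confirm the entries above are in place is sketched below. It is only an illustration: the function name missing_hosts is hypothetical, and it assumes the usual hosts file layout of an address followed by one or more names per line.

```shell
# Print each required hostname that has no entry in the given hosts file.
# Usage: missing_hosts HOSTS_FILE NAME...
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Ignore comment lines, then look for the name among the fields
        # that follow the address (field 1). Stop at an inline comment.
        if ! awk -v n="$name" '
            /^[[:space:]]*#/ { next }
            {
                for (i = 2; i <= NF; i++) {
                    if ($i ~ /^#/) break
                    if ($i == n) found = 1
                }
            }
            END { exit !found }' "$hosts_file"; then
            printf '%s\n' "$name"
        fi
    done
}
```

Running `missing_hosts /etc/hosts localhost elasticsearch rabbit-mq` prints nothing when all three names are defined, and otherwise lists the names that still need an entry.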
In the .env file, set CONTAINER_HARVEST_DATA_DIR to the absolute path of the directory where the data is harvested.
The containers must see this path exactly as it is seen by the debugged component, which runs outside a container (in an IDE) on the host. For example:
CONTAINER_HARVEST_DATA_DIR=/Users/loubrieu/git/registry/docker/test-data/registry-harvest-data/
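Because a relative or mistyped path only shows up later as missing data, it can help to check the setting before starting anything. The sketch below is one possible check, not part of the project: the function name check_data_dir is hypothetical, and it assumes the value is written unquoted in the .env file, as in the example above.

```shell
# Read CONTAINER_HARVEST_DATA_DIR from an env file and check that it is an
# absolute path to an existing directory. Prints the path on success.
# Assumption: the value is not wrapped in quotes.
check_data_dir() {
    env_file=$1
    # Keep the last assignment of the variable, without the "NAME=" prefix.
    dir=$(sed -n 's/^CONTAINER_HARVEST_DATA_DIR=//p' "$env_file" | tail -n 1)
    case $dir in
        /*) ;;
        *)
            echo "CONTAINER_HARVEST_DATA_DIR must be an absolute path, got: '$dir'" >&2
            return 1
            ;;
    esac
    if [ ! -d "$dir" ]; then
        echo "directory does not exist: $dir" >&2
        return 1
    fi
    printf '%s\n' "$dir"
}
```

For instance, `check_data_dir .env` run from the directory that holds the .env file either prints the configured path or explains on standard error why it was rejected.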
Start the required components with docker compose
Start the services of the int-registry-service-loader profile:
% docker compose --profile int-registry-service-loader up
Stop the component to debug
Find the container ID of the component you want to debug:
% docker ps
Then stop that container, for example:
% docker stop 136188e4372f
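When many containers are running, picking the right ID by eye is error-prone. The helper below is a small sketch that filters the output of `docker ps --format '{{.ID}} {{.Names}}'` by container name. The function name container_ids_matching is hypothetical, and so is the assumption that the harvest container has "harvest" in its name; check the real name in the `docker ps` output first.

```shell
# Read "ID NAME" lines on standard input and print the ID of every
# container whose name contains the given text.
container_ids_matching() {
    awk -v pat="$1" 'index($2, pat) > 0 { print $1 }'
}

# Typical use (commented out because it needs a running docker daemon):
#   docker ps --format '{{.ID}} {{.Names}}' | container_ids_matching harvest
```

The printed ID can then be passed to `docker stop` as shown above. Docker's own `docker ps --filter "name=..."` option offers a similar selection without a helper.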
Start the same component in your IDE.
In the debug configuration, set:
Project: registry-harvest-service
Main class: gov.nasa.pds.harvest.HarvestServerMain
Program arguments: -c src/main/resources/conf/harvest-server.cfg
Environment variables: HARVEST_HOME=/Users/loubrieu/git/registry-harvest-service/src/main/resources
In the harvest-server.cfg file, use elasticsearch and rabbit-mq as the host names of the OpenSearch and RabbitMQ services.
Then start the service in debug mode.
Trigger an ingestion with registry-harvest-cli:
% ./harvest-client harvest -j harvest-job-config.xml -c harvest-client.cfg -overwrite
harvest-job-config.xml:
<harvest nodeName="PDS_ENG">
  <directories>
    <path>/Users/loubrieu/git/registry/docker/test-data/registry-harvest-data/test-data</path>
  </directories>
  <registry url="https://elasticsearch:9200" index="registry" auth="/etc/es-auth.cfg" />
  <fileInfo>
    <!-- UPDATE with your own local path and base url where pds4 archive is published -->
    <fileRef replacePrefix="/Users/loubrieu/git/registry/docker/test-data/registry-harvest-data" with="http://localhost:81/archive" />
  </fileInfo>
  <autogenFields/>
</harvest>
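Since the paths in the job configuration must exist on the host where the debugged service runs, a quick check before triggering the ingestion can save a debugging round trip. The sketch below is illustrative only: the function names job_paths and missing_job_paths are hypothetical, and the text-based extraction assumes each <path> element sits on a single line, as in the example above. It is not a general XML parser.

```shell
# Print the content of every <path> element that sits on a single line.
job_paths() {
    sed -n 's:.*<path>\(.*\)</path>.*:\1:p' "$1"
}

# Print the configured paths that are not existing directories on this host.
missing_job_paths() {
    job_paths "$1" | while IFS= read -r p; do
        [ -d "$p" ] || printf '%s\n' "$p"
    done
}
```

Running `missing_job_paths harvest-job-config.xml` prints nothing when every configured directory is present, and otherwise lists the paths to fix.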