This tool can be used as a package or as a CLI tool. It serves as a data extractor for ELT pipelines or any other process that requires a heavy data dump.
The package lets the user dump data into multiple JSON files.
The ping command pings the database and returns a connection check.
mongoextract ping --conn-uri "$MONGO_CONN_URI"
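A connection check like this is often used as a guard before a long-running dump. The sketch below shows one way to do that. It assumes that mongoextract ping exits with a non-zero status when the check fails, which the text above does not confirm. The function name require_connection and the MONGOEXTRACT_BIN variable are hypothetical and exist only for illustration.

```shell
# Hypothetical helper: succeed only if the connection check passes.
# Assumption: `mongoextract ping` exits non-zero when the check fails.
# MONGOEXTRACT_BIN is an illustrative override, not a documented variable.
require_connection() {
  if "${MONGOEXTRACT_BIN:-mongoextract}" ping --conn-uri "$1"; then
    return 0
  fi
  echo "connection check failed" >&2
  return 1
}
```

A pipeline step could then call require_connection "$MONGO_CONN_URI" and stop early if it returns a failure status.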
The collxst command pings the database and returns a connection check.
mongoextract collxst \
--conn-uri "$MONGO_CONN_URI" \
--db-name "$MONGO_DBNAME" \
--collection "$MONGO_COLLECTION" \
--app-name "$APPNAME"
The extract-batch command iterates over a MongoDB cursor and dumps chunks of data into JSON files.
mongoextract extract-batch \
--conn-uri "$MONGO_CONN_URI" \
--db-name "$MONGO_DBNAME" \
--collection "$MONGO_COLLECTION" \
--app-name "$APPNAME" \
--mapping "$ID_NAME" \
--query '{"latitude":{"$gte":30}}' \
--output-path "./data" \
--output-prefix "$ID_NAME" \
--chunk-size 100 \
--num-concurrent-files 10
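To get a rough idea of how many files a batch extraction produces, the arithmetic can be sketched as below. This assumes that --chunk-size is the number of documents written to each file, which is an interpretation rather than something stated above. The function name expected_files is hypothetical.

```shell
# Estimate the number of output files for a batch extraction.
# Assumption: --chunk-size is the number of documents written per file.
expected_files() {
  total_docs=$1
  chunk_size=$2
  # Integer ceiling division: a partial last chunk still needs its own file.
  echo $(( (total_docs + chunk_size - 1) / chunk_size ))
}

expected_files 1050 100   # prints 11
```

Under that assumption, 1050 matching documents with a chunk size of 100 would yield ten full files and one partial file.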
The extract command fetches a MongoDB cursor and dumps all of the data into a single JSON file.
mongoextract extract \
--conn-uri "$MONGO_CONN_URI" \
--db-name "$MONGO_DBNAME" \
--collection "$MONGO_COLLECTION" \
--app-name "$APPNAME" \
--mapping some_mapping_name \
--query '{"latitude":{"$gte":30}}'
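MongoDB query operators begin with a dollar sign (for example $gte), so shell quoting matters when passing a query on the command line. The sketch below shows two ways to build the same query string. The variable names query, query_dq, and min_latitude are illustrative only.

```shell
# Single quotes keep "$gte" literal: the shell performs no expansion inside them.
query='{"latitude":{"$gte":30}}'
printf '%s\n' "$query"

# Inside double quotes the shell would try to expand $gte as a variable,
# so the dollar sign has to be escaped. This form allows interpolating values.
min_latitude=30
query_dq="{\"latitude\":{\"\$gte\":${min_latitude}}}"
printf '%s\n' "$query_dq"
```

Both forms produce the same string. The single-quoted form is simpler for fixed queries, while the double-quoted form is useful when a value comes from a shell variable.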
All releases are documented in the CHANGELOG; see it for more details.
All releases are containerized and available as Docker images. To pull the latest version, run the following command:
docker pull victorfaro/mongoextract:latest
All images use the CLI as their entrypoint; see the examples above for how to use it.
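Because the CLI is the image entrypoint, arguments placed after the image name in docker run should be passed to it as the subcommand and its flags. A minimal wrapper might look like the sketch below. The function name mongoextract_docker and the DOCKER_BIN variable are hypothetical and not part of the tool.

```shell
# Hypothetical wrapper: forward all arguments to the containerized CLI.
# Assumption: the image entrypoint is the CLI, so everything after the
# image name is treated as the subcommand and its flags.
# DOCKER_BIN is an illustrative override, not part of the tool.
mongoextract_docker() {
  "${DOCKER_BIN:-docker}" run --rm victorfaro/mongoextract:latest "$@"
}
```

With this in place, mongoextract_docker ping --conn-uri "$MONGO_CONN_URI" would run the connection check inside a container. Commands that write files would likely also need a volume mount (docker's -v flag) so the output is visible on the host; the appropriate path inside the container is not documented here.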