Staging to Prod #135

Merged
merged 12 commits into from
Feb 26, 2025
14 changes: 7 additions & 7 deletions README.md
@@ -14,17 +14,17 @@ self-service update.
# Workflow

To deploy an updated data version to the Agora development database
-1. Increment `data-version` in `data-manifest.json` on the `develop` branch.
+1. Increment `data_version` in `data-manifest.json` on the `develop` branch.
2. Commit the change
3. The Github action CI system automatically updates the dev DB


To deploy an updated data version to the Agora staging database:
-1. Merge the data-version update from the dev branch to the staging branch.
+1. Merge the data_version update from the dev branch to the staging branch.
2. The Github action CI system automatically updates the dev DB

To deploy an updated data version to the Agora production database:
-1. Merge the data-version update from the staging branch to the production branch.
+1. Merge the data_version update from the staging branch to the production branch.
2. The Github action CI system automatically updates the dev DB
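
The first workflow step above, incrementing `data_version`, could be scripted roughly as follows. This is a minimal sketch and not part of the repository: the function name `bump_data_version` is hypothetical, and it assumes the version is a quoted integer on its own line, as in the `data-manifest.json` shown in this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): increments the quoted
# integer stored under "data_version" in a manifest file and prints the new value.
bump_data_version() {
  local manifest="$1"
  local current next
  # Extract the current value with the same style of pipeline import-data.sh uses.
  current=$(grep '"data_version"' "$manifest" | head -1 | awk -F: '{ print $2 }' | sed 's/[",]//g' | tr -d '[:space:]')
  # Refuse to continue if the value is missing or not a plain integer.
  case "$current" in
    ''|*[!0-9]*) echo "no numeric data_version in $manifest" >&2; return 1 ;;
  esac
  next=$((current + 1))
  # Rewrite only the data_version entry; -i.bak works with both GNU and BSD sed.
  sed -i.bak "s/\"data_version\":[[:space:]]*\"${current}\"/\"data_version\": \"${next}\"/" "$manifest"
  rm -f "${manifest}.bak"
  echo "$next"
}
```

Running `bump_data_version data-manifest.json` on the `develop` branch and committing the result would then trigger the CI update described above.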


@@ -54,21 +54,21 @@ Context specific secrets for each environment that corresponds to a git branch (

## Self hosted runners

-[agora2-infra] repository deploys a bastian host in AWS for each environment which have access to
+[agora-infra-v3] repository deploys a bastian host in AWS for each environment which have access to
the databases. We manually configure a [Github self-hosted runner](https://docs.github.com/en/actions/hosting-your-own-runners)
for each bastian host, a label is applied to each runner to match the corresponding git branch name (develop/staging/prod).

Each runner corresponds to an environment which corresponds to a git branch. The update is
executed from these runners. When a push happens on a branch (i.e. develop), the update
-is executed on the `agora-bastian-develop` runner which in turn updates the development database.
+is executed on the self-hosted runner with the `develop` label, which in turn updates the development database.


![alt text][self_hosted_runners]


### Setup self hosted runners

-Github self hosted runners are deployed with a [Sceptre template config file])(https://github.com/Sage-Bionetworks/agora2-infra/blob/main/config/agoradev/develop/agora-bastian.yaml).
+Github self hosted runners are deployed with [Cloudformation](https://github.com/Sage-Bionetworks-IT/agora-infra-v3/blob/dev/src/bastion_stack.py).

Self Hosted Runner setup:
* Deploy the template to the Agora AWS account.
@@ -121,5 +121,5 @@ Enter name of work folder: [press Enter for _work]
[db_update]: agora-db-update.drawio.png "update diagram"
[github_secrets]: github_secrets.png "github secrets screen"
[self_hosted_runners]: self-hosted-runners.png "self hosted runners"
-[agora2-infra]: https://github.com/Sage-Bionetworks/agora2-infra "agora2-infra repository"
+[agora-infra-v3]: https://github.com/Sage-Bionetworks-IT/agora-infra-v3 "agora-infra-v3 repository"
[Github self-hosted runners]: https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/about-self-hosted-runners#about-self-hosted-runners
6 changes: 3 additions & 3 deletions data-manifest.json
@@ -1,5 +1,5 @@
{
-"data-version": "71",
-"data-manifest-id": "syn13363290",
-"team-images-id": "syn12861877"
+"data_version": "72",
+"data_file": "syn13363290",
+"team_images_id": "syn12861877"
}
21 changes: 16 additions & 5 deletions import-data.sh
@@ -19,13 +19,13 @@ TEAM_IMAGES_DIR=$DATA_DIR/team_images
mkdir -p $TEAM_IMAGES_DIR

# Version key/value should be on his own line
-DATA_VERSION=$(cat $WORKING_DIR/data-manifest.json | grep data-version | head -1 | awk -F: '{ print $2 }' | sed 's/[",]//g' | tr -d '[[:space:]]')
-DATA_MANIFEST_ID=$(cat $WORKING_DIR/data-manifest.json | grep data-manifest-id | head -1 | awk -F: '{ print $2 }' | sed 's/[",]//g' | tr -d '[[:space:]]')
-TEAM_IMAGES_ID=$(cat $WORKING_DIR/data-manifest.json | grep team-images-id | head -1 | awk -F: '{ print $2 }' | sed 's/[",]//g' | tr -d '[[:space:]]')
-echo "$BRANCH branch, DATA_VERSION = $DATA_VERSION, manifest id = $DATA_MANIFEST_ID"
+DATA_VERSION=$(cat $WORKING_DIR/data-manifest.json | grep data_version | head -1 | awk -F: '{ print $2 }' | sed 's/[",]//g' | tr -d '[[:space:]]')
+DATA_FILE=$(cat $WORKING_DIR/data-manifest.json | grep data_file | head -1 | awk -F: '{ print $2 }' | sed 's/[",]//g' | tr -d '[[:space:]]')
+TEAM_IMAGES_ID=$(cat $WORKING_DIR/data-manifest.json | grep team_images_id | head -1 | awk -F: '{ print $2 }' | sed 's/[",]//g' | tr -d '[[:space:]]')
+echo "$BRANCH branch, DATA_VERSION = $DATA_VERSION, manifest id = $DATA_FILE"

# Download the manifest file from synapse
-synapse -p $SYNAPSE_PASSWORD get --downloadLocation $DATA_DIR -v $DATA_VERSION $DATA_MANIFEST_ID
+synapse -p $SYNAPSE_PASSWORD get --downloadLocation $DATA_DIR -v $DATA_VERSION $DATA_FILE

# Ensure there's a newline at the end of the manifest file; otherwise the last listed file will not be downloaded
# echo >> $DATA_DIR/data_manifest.csv
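
The three manifest lookups in the hunk above repeat one pipeline with a different key each time. A minimal sketch of factoring that into a function follows; `manifest_value` is a hypothetical name that does not appear in `import-data.sh`, and the approach still assumes a flat file with one key/value pair per line and no colons inside values.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads one value from a flat, one-key-per-line JSON file
# using the same grep/awk/sed/tr pipeline as import-data.sh.
manifest_value() {
  local key="$1" file="$2"
  # Matching the key together with its quotes avoids hitting a longer key
  # that merely contains it as a substring.
  grep "\"${key}\"" "$file" | head -1 | awk -F: '{ print $2 }' | sed 's/[",]//g' | tr -d '[:space:]'
}
```

With this helper, a lookup would read `DATA_VERSION=$(manifest_value data_version "$WORKING_DIR/data-manifest.json")`.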
@@ -44,6 +44,14 @@ ls -al $WORKING_DIR
ls -al $DATA_DIR
ls -al $TEAM_IMAGES_DIR

+# Check if dataversion exists and handle different data format
+DATAVERSION_PATH="${DATA_DIR}/dataversion.json"
+DATAVERSION_FLAG="--jsonArray"
+if [ ! -f "${DATAVERSION_PATH}" ]; then
+DATAVERSION_PATH="${WORKING_DIR}/data-manifest.json"
+DATAVERSION_FLAG=""
+fi

# Import synapse data to database
# Not using --mode upsert for now because we don't have unique indexes properly set for the collections
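
The fallback added in the hunk above chooses both a source file and an import flag. The script passes `--jsonArray` only for `dataversion.json`, which suggests that file holds a JSON array while `data-manifest.json` is a single object; that reading is an inference from the diff, not something the PR states. A minimal sketch of the same selection as a reusable function, with the hypothetical name `select_dataversion_source`:

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the fallback above: prints the file to import
# and the mongoimport flag to use, separated by a tab.
select_dataversion_source() {
  local data_dir="$1" working_dir="$2"
  if [ -f "${data_dir}/dataversion.json" ]; then
    # Preferred source: the downloaded dataversion.json, imported with --jsonArray.
    printf '%s\t%s\n' "${data_dir}/dataversion.json" "--jsonArray"
  else
    # Fallback: the repository's own manifest, imported without the flag.
    printf '%s\t%s\n' "${working_dir}/data-manifest.json" ""
  fi
}
```

A caller could split the result with `IFS="$(printf '\t')" read -r path flag`, then pass `$flag` unquoted so that an empty value expands to no argument, as the script does with `$DATAVERSION_FLAG`.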

@@ -64,6 +72,9 @@ mongoimport -h $DB_HOST -d agora -u $DB_USER -p $DB_PASS --authenticationDatabas
mongoimport -h $DB_HOST -d agora -u $DB_USER -p $DB_PASS --authenticationDatabase admin --collection genesbiodomains --jsonArray --drop --file $DATA_DIR/genes_biodomains.json
mongoimport -h $DB_HOST -d agora -u $DB_USER -p $DB_PASS --authenticationDatabase admin --collection biodomaininfo --jsonArray --drop --file $DATA_DIR/biodomain_info.json

+echo "Importing dataversion from ${DATAVERSION_PATH}"
+mongoimport -h $DB_HOST -d agora -u $DB_USER -p $DB_PASS --authenticationDatabase admin --collection dataversion $DATAVERSION_FLAG --drop --file $DATAVERSION_PATH

mongosh --host $DB_HOST -u $DB_USER -p $DB_PASS --authenticationDatabase admin $WORKING_DIR/create-indexes.js

pushd $TEAM_IMAGES_DIR