Commit b623ad3

Allow labeling all existing resources; useful on first deployment
1 parent 5bbda7b commit b623ad3

8 files changed, +197 -96 lines changed

README.md

+3-4
Original file line number | Diff line number | Diff line change
@@ -25,9 +25,9 @@ Note that Iris is designed to serve the organization. It is not designed around
2525

2626
Iris does not *add* information, only *copy* values that already exist. For example, it can label a VM instance with its zone; but it cannot add a "business unit" label because it does not know a resource's business unit. For that, you should label all resources when creating them, e.g., in your Terraform scripts. (Indeed, iris can be made extraneous in this way.)
2727

28-
## Existing resources are not all labeled (by default)
28+
## Labeling resources that already exist when you deploy Iris
2929

30-
If you want to label lots of virtual machines,PubSub topics etc. that *already exist* when you install Iris, see section "[Labeling existing resources](#labeling-existing-resources)" below.
30+
If you want to label lots of virtual machines, PubSub topics, etc. that *already exist* when you deploy Iris, see section "[Labeling existing resources](#labeling-existing-resources)" below.
3131

3232
# Open source
3333

@@ -51,8 +51,7 @@ You can also disable the scheduled labeling. See Deployment below or run `./dep
5151
## Labeling existing resources
5252

5353
* When you first use Iris, you may want to label all existing resources. Iris does not do this by default.
54-
* To do this, deploy Iris with `label_all_on_cron: True` and wait for the next scheduled run, or manually trigger a run through Cloud Scheduler.
55-
* Then, you may want to then **redeploy** Iris with `label_all_on_cron: False`, to avoid the resource consumption of relabeling all resources with the same label every day forever.
54+
* To do this, publish a PubSub message (the content doesn't matter) to `iris_label_all_topic`, for example with `gcloud pubsub topics publish iris_label_all_topic --message=labelall --project $PROJECT_ID` (substituting the project ID where Iris is deployed.) Of course, you will need to have permissions to publish that message.
5655

5756
# Supported Google Cloud resources
5857
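The `gcloud` command in the new README bullet could also be issued from a script. The following is a minimal sketch, assuming `gcloud` is installed and authenticated with permission to publish. The helper names `build_label_all_command` and `trigger_label_all` are hypothetical and not part of Iris; only the topic name and the flags come from the README text.

```python
import subprocess
from typing import List

# Topic name as given in the README; everything else here is illustrative.
LABEL_ALL_TOPIC = "iris_label_all_topic"


def build_label_all_command(project_id: str, message: str = "labelall") -> List[str]:
    """Build the argv for the gcloud publish command (hypothetical helper)."""
    if not project_id:
        raise ValueError("project_id must be non-empty")
    return [
        "gcloud", "pubsub", "topics", "publish", LABEL_ALL_TOPIC,
        f"--message={message}",  # the content does not matter to Iris
        "--project", project_id,
    ]


def trigger_label_all(project_id: str) -> None:
    """Run the command; raises CalledProcessError if gcloud exits non-zero."""
    subprocess.run(build_label_all_command(project_id), check=True)
```

Separating command construction from execution keeps the argument list easy to inspect before anything is published.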

TODO.md

-2
@@ -4,8 +4,6 @@
44

55
* P2 Memory consumption: Even an empty AppEngine app (not Iris, just a Hello World with 3 lines of code in total) crashes on out-of-memory for the smallest AppEngine instance. Google has confirmed this. See if there is a workaround. This will save money.
66

7-
8-
97
* P3 Label immediately after an event in certain cases, as opposed to using a daily cron as is now done.
108
* Cloud SQL Instances
119
* See above re Cloud Tasks

main.py

+49-24
@@ -28,7 +28,7 @@
2828

2929
from functools import lru_cache
3030

31-
from typing import Dict, Type
31+
from typing import Dict, Type, List
3232

3333
import time
3434

@@ -91,34 +91,53 @@ def warmup():
9191
return "", 200, {}
9292

9393

94+
@app.route("/label_all", methods=["POST"])
95+
@log_time
96+
def label_all():
97+
with gae_memory_logging("label_all"):
98+
logging.info("Start label_all() invocation")
99+
increment_invocation_count("label_all")
100+
__check_pubsub_jwt()
101+
return __label_multiple_types(True)
102+
103+
94104
@app.route("/schedule", methods=["GET"])
95105
@log_time
96106
def schedule():
97-
"""
98-
Send out a message per-plugin per-project to label all objects of that type and project.
99-
"""
100-
# Not validated with JWT because validated with cron header (see below)
101-
increment_invocation_count("schedule")
102107
with gae_memory_logging("schedule"):
103-
try:
104-
logging.info("Start schedule() invocation")
108+
logging.info("Start schedule() invocation")
109+
increment_invocation_count("schedule")
110+
# Not validated with JWT because validated with cron header
111+
is_cron = flask.request.headers.get("X-Appengine-Cron")
112+
if not is_cron:
113+
return "Access Denied: No Cron header found", 403
114+
label_all_types = config_utils.label_all_on_cron()
115+
return __label_multiple_types(label_all_types)
105116

106-
is_cron = flask.request.headers.get("X-Appengine-Cron")
107-
if not is_cron:
108-
return "Access Denied: No Cron header found", 403
109117

110-
enabled_projects = __get_enabled_projects()
111-
__send_pubsub_per_projectplugin(enabled_projects)
112-
# All errors are actually caught before this point,
113-
# since most errors are unrecoverable.
114-
return "OK", 200
115-
except Exception:
116-
logging.exception("In schedule()")
117-
return "Error", 500
118+
def __label_multiple_types(label_all_types: bool):
119+
"""
120+
Send out a message per-plugin per-project; each msg labels objects of a given type and project.
121+
:param label_all_types: if false, only label those that are
122+
not labeled on creation (CloudSQL) or those that must be relabeled on cron (Disks),
123+
but if true, label all types;
124+
"""
125+
assert label_all_types is not None
126+
try:
127+
enabled_projects = __get_enabled_projects()
128+
__send_pubsub_per_projectplugin(
129+
enabled_projects, label_all_types=label_all_types
130+
)
131+
# All errors are actually caught, logged and ignored *before* this point,
132+
# since most errors are unrecoverable.
133+
return "OK", 200
134+
except Exception:
135+
logging.exception("In __label_multiple_types()")
136+
return "Error", 500
118137

119138

120139
@lru_cache(maxsize=1)
121-
def __get_enabled_projects():
140+
def __get_enabled_projects() -> List:
122141
configured_as_enabled = config_utils.enabled_projects()
123142
if configured_as_enabled:
124143
enabled_projs = configured_as_enabled
@@ -142,14 +161,15 @@ def __get_enabled_projects():
142161
return enabled_projs
143162

144163

145-
def __send_pubsub_per_projectplugin(configured_projects):
164+
def __send_pubsub_per_projectplugin(configured_projects: List, label_all_types: bool):
146165
msg_count = 0
147166
for project_id in configured_projects:
148167
for plugin_cls in PluginHolder.plugins:
168+
149169
if (
150170
not plugin_cls.is_labeled_on_creation()
151171
or plugin_cls.relabel_on_cron()
152-
or config_utils.label_all_on_cron()
172+
or label_all_types
153173
):
154174
pubsub_utils.publish(
155175
msg=json.dumps(
@@ -205,6 +225,11 @@ def label_one():
205225
for supported_method in method_names:
206226
if supported_method.lower() in method_from_log.lower():
207227
if plugin_cls.is_labeled_on_creation():
228+
logging.info(
229+
"plugin_cls %s, with method %s",
230+
plugin_cls.__name__,
231+
method_from_log,
232+
)
208233
__label_one_0(data, plugin_cls)
209234

210235
plugins_found.append(
@@ -254,7 +279,7 @@ def __check_pubsub_jwt():
254279
logging.error(f"Email verified was {is_email_verif}")
255280
return False
256281

257-
logging.info("Claims: " + str(claim))
282+
# logging.info("Claims: " + str(claim))
258283
if (
259284
email := claim.get("email")
260285
) != f"iris-msg-sender@{current_project_id()}.iam.gserviceaccount.com":
@@ -263,7 +288,7 @@ def __check_pubsub_jwt():
263288
except Exception as e:
264289
logging.exception(f"Invalid JWT token: {e}")
265290
return False
266-
logging.info("JWT Passed")
291+
# logging.info("JWT Passed")
267292

268293
return True
269294
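The selection rule in `__send_pubsub_per_projectplugin` can be illustrated in isolation. This is a sketch only: the `Fake*` classes and `plugins_to_label` are hypothetical stand-ins, and just the three-way condition and the CloudSQL/Disks examples are taken from the diff and its docstring.

```python
from typing import List, Type


class FakePlugin:
    """Stand-in for an Iris plugin class (hypothetical, not the real base class)."""

    @classmethod
    def is_labeled_on_creation(cls) -> bool:
        return True

    @classmethod
    def relabel_on_cron(cls) -> bool:
        return False


class FakeInstances(FakePlugin):
    """Labeled on creation and never relabeled on cron."""


class FakeDisks(FakePlugin):
    """Labeled on creation, but must be relabeled on cron."""

    @classmethod
    def relabel_on_cron(cls) -> bool:
        return True


class FakeCloudSql(FakePlugin):
    """Not labeled on creation, so it depends on the scheduled run."""

    @classmethod
    def is_labeled_on_creation(cls) -> bool:
        return False


def plugins_to_label(
    plugins: List[Type[FakePlugin]], label_all_types: bool
) -> List[Type[FakePlugin]]:
    """Apply the same condition as the diff: any one of the three suffices."""
    return [
        p
        for p in plugins
        if not p.is_labeled_on_creation() or p.relabel_on_cron() or label_all_types
    ]
```

With `label_all_types=False` only the disk-like and CloudSQL-like plugins are selected; with `True`, every plugin is, which is what the `/label_all` handler requests.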

plugin.py

+7-7
@@ -26,9 +26,9 @@
2626
PLUGINS_MODULE = "plugins"
2727

2828

29-
# TODO Since subclasses are already singletons, and we are already using
30-
# a lot of classmethods and staticmethods, , could convert this to
31-
# never use instance methods
29+
# Since subclasses are already singletons, and we are already using
30+
# a lot of classmethods and staticmethods, could convert this code to
31+
# never use instance methods, maybe only staticmethods
3232
class Plugin(metaclass=ABCMeta):
3333
# Underlying API max is 1000; avoid off-by-one errors
3434
# We send a batch when _BATCH_SIZE or more tasks are in it, or at the end of a label_all
@@ -244,12 +244,12 @@ def load_plugin_class(name) -> Type:
244244
assert cls.plugins, "No plugins defined"
245245

246246
@classmethod
247-
def get_plugin_instance(cls, plugin_cls):
247+
def get_plugin_instance(cls, plugin_cls: Type[Plugin]):
248248
"""Lazy-initialize the instance. The classes are loaded in init()"""
249249
with cls.__lock:
250-
assert plugin_cls in cls.plugins, plugin_cls + " " + cls.plugins
251-
plugin_instance = cls.plugins[plugin_cls]
252-
250+
plugin_instance: Plugin = cls.plugins[plugin_cls]
251+
# Note: We initialized all keys in cls.plugins
252+
# with None values.
253253
assert not plugin_instance or isinstance(
254254
plugin_instance, (Plugin, plugin_cls)
255255
)
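The lazy initialization in `get_plugin_instance` relies on every plugin class already being a key in `cls.plugins` with a `None` value. A minimal sketch of that pattern follows, assuming a no-argument constructor. `Holder`, `register`, and this bare `Plugin` class are hypothetical simplifications, not the code in `plugin.py`.

```python
import threading
from typing import Dict, Optional, Type


class Plugin:
    """Minimal stand-in for the abstract Plugin base class."""


class Holder:
    """Hypothetical holder: keys are plugin classes, values start as None."""

    _lock = threading.Lock()
    plugins: Dict[Type[Plugin], Optional[Plugin]] = {}

    @classmethod
    def register(cls, plugin_cls: Type[Plugin]) -> None:
        # Mirrors "we initialized all keys with None values" from the diff comment.
        cls.plugins[plugin_cls] = None

    @classmethod
    def get_plugin_instance(cls, plugin_cls: Type[Plugin]) -> Plugin:
        with cls._lock:
            instance = cls.plugins[plugin_cls]  # KeyError if never registered
            if instance is None:
                instance = plugin_cls()  # created once, on first request
                cls.plugins[plugin_cls] = instance
            return instance
```

Because the dictionary lookup raises `KeyError` for an unregistered class, a separate membership assertion is not needed to catch that case.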

scripts/_deploy-project.sh

+54-23
@@ -7,15 +7,18 @@
77
# - Pass the project as the first command line argument.
88
#
99

10+
1011
#set -x
1112
set -u
1213
set -e
1314

1415
SCHEDULELABELING_TOPIC=iris_schedulelabeling_topic
16+
LABEL_ALL_TOPIC=iris_label_all_topic
1517
DEADLETTER_TOPIC=iris_deadletter_topic
1618
DEADLETTER_SUB=iris_deadletter
1719
DO_LABEL_SUBSCRIPTION=do_label
1820
LABEL_ONE_SUBSCRIPTION=label_one
21+
LABEL_ALL_SUBSCRIPTION=label_all
1922

2023
ACK_DEADLINE=60
2124
MAX_DELIVERY_ATTEMPTS=10
@@ -28,11 +31,12 @@ if [[ ! -f "config-test.yaml" ]] && [[ ! -f "config.yaml" ]]; then
2831
exit 1
2932
fi
3033

34+
#Next line duplicate of our Python func gae_url_with_multiregion_abbrev
3135
appengineHostname=$(gcloud app describe --project $PROJECT_ID | grep defaultHostname |cut -d":" -f2 | awk '{$1=$1};1' )
3236
if [[ -z "$appengineHostname" ]]; then
3337
echo >&2 "App Engine is not enabled in $PROJECT_ID.
34-
To do this, please deploy a simple \"Hello World\" default service to enable App Engine.
35-
In doing so, select the App Engine region that you prefer. It is immutable."
38+
To do this, please enable it with \"gcloud app create [--region=REGION]\",
39+
and then deploy a simple \"Hello World\" default service to enable App Engine."
3640

3741
exit 1
3842
fi
@@ -41,7 +45,8 @@ gae_svc=$(grep "service:" app.yaml | awk '{print $2}')
4145

4246

4347
LABEL_ONE_SUBSCRIPTION_ENDPOINT="https://${gae_svc}-dot-${appengineHostname}/label_one"
44-
DO_LABEL_SUBSCRIPTION_ENDPOINT="https://${gae_svc}-dot-${appengineHostname}/do_label?token"
48+
DO_LABEL_SUBSCRIPTION_ENDPOINT="https://${gae_svc}-dot-${appengineHostname}/do_label"
49+
LABEL_ALL_SUBSCRIPTION_ENDPOINT="https://${gae_svc}-dot-${appengineHostname}/label_all"
4550

4651
declare -A enabled_services
4752
while read -r svc _; do
@@ -59,6 +64,7 @@ required_svcs=(
5964
storage-component.googleapis.com
6065
sql-component.googleapis.com
6166
sqladmin.googleapis.com
67+
bigquery.googleapis.com
6268
)
6369
for svc in "${required_svcs[@]}"; do
6470
if ! [ ${enabled_services["$svc"]+_} ]; then
@@ -100,6 +106,9 @@ fi
100106

101107
project_number=$(gcloud projects describe $PROJECT_ID --format json|jq -r '.projectNumber')
102108
PUBSUB_SERVICE_ACCOUNT="service-${project_number}@gcp-sa-pubsub.iam.gserviceaccount.com"
109+
# The following line is only needed on first deployment, and so slows things
110+
down unnecessarily otherwise. The same is true for enabling the services, above.
111+
gcloud beta services identity create --project $PROJECT_ID --service pubsub
103112

104113
msg_sender_sa_name=iris-msg-sender
105114

@@ -113,30 +122,18 @@ set -e
113122

114123
MSGSENDER_SERVICE_ACCOUNT=${msg_sender_sa_name}@${PROJECT_ID}.iam.gserviceaccount.com
115124

116-
gcloud projects add-iam-policy-binding ${PROJECT_ID} \
117-
--member="serviceAccount:${PUBSUB_SERVICE_ACCOUNT}"\
118-
--role='roles/iam.serviceAccountTokenCreator'
119125

120-
set +e
126+
127+
121128
# Allow Pubsub to publish into the deadletter topic
122-
BINDING_ERR_OUTPUT=$(gcloud pubsub topics add-iam-policy-binding $DEADLETTER_TOPIC \
129+
gcloud pubsub topics add-iam-policy-binding $DEADLETTER_TOPIC \
123130
--member="serviceAccount:$PUBSUB_SERVICE_ACCOUNT"\
124-
--role="roles/pubsub.publisher" --project $PROJECT_ID 2>&1 )
125-
if [[ $? -ne 0 && $BINDING_ERR_OUTPUT == *"gcp-sa-pubsub.iam.gserviceaccount.com does not exist."* ]]; then
126-
# Sometimes the PubSub svc account ([email protected])
127-
# does not exist on new projects. I don't know why.
128-
# Note that this svc account is NOT in the user project;
129-
# it can be thought of as existing in a hidden Google-controlled project.
130-
gcloud beta services identity create --project $PROJECT_ID --service pubsub
131+
--role="roles/pubsub.publisher" --project $PROJECT_ID 2>&1
131132

132-
set -e
133-
#Redo the above binding command
134-
gcloud pubsub topics add-iam-policy-binding $DEADLETTER_TOPIC \
135-
--member="serviceAccount:$PUBSUB_SERVICE_ACCOUNT"\
136-
--role="roles/pubsub.publisher" --project $PROJECT_ID >/dev/null
137133

138-
fi
139-
set -e
134+
gcloud projects add-iam-policy-binding ${PROJECT_ID} \
135+
--member="serviceAccount:${PUBSUB_SERVICE_ACCOUNT}"\
136+
--role='roles/iam.serviceAccountTokenCreator'
140137

141138
# Create PubSub subscription receiving commands from the /schedule handler that is triggered from cron
142139
# If the subscription exists, it will not be changed.
@@ -220,7 +217,41 @@ else
220217
--role="roles/pubsub.subscriber" --project $PROJECT_ID >/dev/null
221218
fi
222219

220+
gcloud pubsub topics describe "$LABEL_ALL_TOPIC" --project="$PROJECT_ID" &>/dev/null ||
221+
gcloud pubsub topics create $LABEL_ALL_TOPIC --project="$PROJECT_ID" --quiet >/dev/null
222+
223+
set +e
224+
gcloud pubsub subscriptions describe "$LABEL_ALL_SUBSCRIPTION" --project="$PROJECT_ID" &>/dev/null
225+
label_all_subsc_exists=$?
226+
set -e
227+
228+
if [[ $label_all_subsc_exists -eq 0 ]]; then
229+
gcloud pubsub subscriptions update "$LABEL_ALL_SUBSCRIPTION" \
230+
--project="$PROJECT_ID" \
231+
--push-endpoint "$LABEL_ALL_SUBSCRIPTION_ENDPOINT" \
232+
--push-auth-service-account $MSGSENDER_SERVICE_ACCOUNT \
233+
--ack-deadline=$ACK_DEADLINE \
234+
--max-delivery-attempts=$MAX_DELIVERY_ATTEMPTS \
235+
--dead-letter-topic=$DEADLETTER_TOPIC \
236+
--min-retry-delay=$MIN_RETRY \
237+
--max-retry-delay=$MAX_RETRY \
238+
--quiet >/dev/null
239+
else
240+
gcloud pubsub subscriptions create "$LABEL_ALL_SUBSCRIPTION" \
241+
--topic "$LABEL_ALL_TOPIC" --project="$PROJECT_ID" \
242+
--push-endpoint "$LABEL_ALL_SUBSCRIPTION_ENDPOINT" \
243+
--push-auth-service-account $MSGSENDER_SERVICE_ACCOUNT \
244+
--ack-deadline=$ACK_DEADLINE \
245+
--max-delivery-attempts=$MAX_DELIVERY_ATTEMPTS \
246+
--dead-letter-topic=$DEADLETTER_TOPIC \
247+
--min-retry-delay=$MIN_RETRY \
248+
--max-retry-delay=$MAX_RETRY \
249+
--quiet >/dev/null
250+
fi
223251

252+
gcloud pubsub subscriptions add-iam-policy-binding $LABEL_ALL_SUBSCRIPTION \
253+
--member="serviceAccount:$PUBSUB_SERVICE_ACCOUNT" \
254+
--role="roles/pubsub.subscriber" --project $PROJECT_ID >/dev/null
224255

225256
if [[ "$LABEL_ON_CRON" == "true" ]]; then
226257
cp cron_full.yaml cron.yaml
@@ -229,6 +260,6 @@ else
229260
cp cron_empty.yaml cron.yaml
230261
fi
231262

232-
gcloud app deploy --project $PROJECT_ID --quiet app.yaml cron.yaml
263+
gcloud app deploy --project "$PROJECT_ID" --quiet app.yaml cron.yaml
233264

234265
rm cron.yaml # In this script, cron.yaml is a temp file, a copy of cron_full.yaml or cron_empty.yaml
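In the script above, the `update` and `create` branches for the `label_all` subscription differ only in the verb and in `--topic`, which is passed on creation only. The sketch below makes that explicit, under stated assumptions: `push_endpoint` and `subscription_command` are hypothetical helpers, the service and host names in the usage are invented, and the retry and dead-letter flags are omitted for brevity.

```python
from typing import List


def push_endpoint(gae_service: str, appengine_hostname: str, handler: str) -> str:
    """Build the push URL in the same shape as the script's *_ENDPOINT variables."""
    return f"https://{gae_service}-dot-{appengine_hostname}/{handler}"


def subscription_command(
    exists: bool,
    subscription: str,
    topic: str,
    project_id: str,
    endpoint: str,
    sender_service_account: str,
) -> List[str]:
    """Return the gcloud argv: 'update' if the subscription exists, else 'create'."""
    verb = "update" if exists else "create"
    cmd = [
        "gcloud", "pubsub", "subscriptions", verb, subscription,
        f"--project={project_id}",
        "--push-endpoint", endpoint,
        "--push-auth-service-account", sender_service_account,
    ]
    if not exists:
        # The topic is only specified when the subscription is created.
        cmd += ["--topic", topic]
    return cmd
```

For example, `push_endpoint("iris3", "example.appspot.com", "label_all")` yields a URL of the form used for `LABEL_ALL_SUBSCRIPTION_ENDPOINT`; the service name `iris3` here is only an assumed placeholder.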
