Skip to content

Commit 6e88aec

Browse files
stelcodesaustb
authored andcommitted
(PDB-5351) Fix reports table partition gc bug
This commit fixes a bug that causes reports to get garbage collected even though they were created within the 'reports-ttl' timeframe. This was due to a comparision of these two datetime values: - the 'reports-ttl' derived "cut-off" datetime: exactly 'reports-ttl' ago. - the table partition's datetime: the very beginning of the day in which the reports within were created. In puppetlabs.puppetdb.scf.storage/prune-daily-partitions, a table partition is deemed expired if the latter datetime is *before* the former. When GC'ing at 00:05 with a 'reports-ttl' of '1d', the partition with all of yesterday's reports is deemed expired. The user is left with reports from only the last 5 minutes. This fix does a "floor" operation on the cutoff datetime generated from the 'reports-ttl' value before comparing it to the partition's date. This means that the "cut-off" datetime is rolled back to 0:00 of its selected day. This way, in order for a partition to get purged, the "cut-off" datetime has to be on a day before the partition's datetime. This approach is already how we handle the 'resource-events-ttl' so this fix meshes nicely with what's already in place. The puppetlabs.puppetdb.integration.reports/reports-ttl integration test is being removed because it is difficult to refactor into a useful test. It was written before the reports table was partitioned by date and does not take account how garbage collection currently works (dropping whole expired table partitions). This test was difficult to modify because the puppet agent service cannot easily create reports from the past. The puppetlabs.puppetdb.cli.services-test/correctly-sweep-reports test is sufficient to verify reports-ttl gc related functionality.
1 parent e25e25e commit 6e88aec

File tree

3 files changed

+32
-33
lines changed

3 files changed

+32
-33
lines changed

src/puppetlabs/puppetdb/cli/services.clj

+2-1
Original file line numberDiff line numberDiff line change
@@ -214,8 +214,9 @@
214214
{:pre [(map? db)
215215
(period? report-ttl)]}
216216
(let [rounded-events-ttl (some-> resource-events-ttl rounded-date)
217+
rounded-report-ttl (rounded-date report-ttl)
217218
update-lock-status (partial update-db-lock-status db-lock-status)
218-
del-opts (merge {:report-ttl (ago report-ttl)
219+
del-opts (merge {:report-ttl rounded-report-ttl
219220
:incremental? incremental?
220221
:update-lock-status update-lock-status
221222
:db db}

test/puppetlabs/puppetdb/cli/services_test.clj

+30
Original file line numberDiff line numberDiff line change
@@ -573,3 +573,33 @@
573573
(with-redefs [jdbc/pooled-datasource validation-fn]
574574
(testing "should use connection user"
575575
(init-write-dbs databases))))))
576+
577+
(deftest correctly-sweep-reports
578+
(with-test-db
579+
(let [config (-> (create-temp-config)
580+
(assoc :database *db*)
581+
(assoc-in [:database :gc-interval] "0.01"))
582+
store-report #(sync-command-post (svc-utils/pdb-cmd-url)
583+
example-certname
584+
"store report"
585+
cmd-consts/latest-report-version
586+
(change-report-time example-report %))
587+
db-lock-status (svcs/database-lock-status)]
588+
(call-with-puppetdb-instance
589+
config
590+
(fn []
591+
(testing "A report from 24 hours ago won't be gc'ed with a report-ttl of 1d"
592+
(store-report (-> 1 time/days time/ago time/to-string))
593+
(svcs/sweep-reports! *db* {:incremental true
594+
:report-ttl (time/parse-period "1d")
595+
:resource-events-ttl (time/parse-period "1d")
596+
:db-lock-status db-lock-status})
597+
(is (= 1 (count (jdbc/query ["SELECT * FROM reports"]))))
598+
(jdbc/do-commands "DELETE FROM reports"))
599+
(testing "A report from 48 hours ago will be gc'ed with a report-ttl of 1d"
600+
(store-report (-> 2 time/days time/ago time/to-string))
601+
(svcs/sweep-reports! *db* {:incremental true
602+
:report-ttl (time/parse-period "1d")
603+
:resource-events-ttl (time/parse-period "1d")
604+
:db-lock-status db-lock-status})
605+
(is (empty? (jdbc/query ["SELECT * FROM reports"])))))))))

test/puppetlabs/puppetdb/integration/reports.clj

-32
Original file line numberDiff line numberDiff line change
@@ -260,35 +260,3 @@
260260
(defn read-gc-count-metric []
261261
;; metrics are global, so...
262262
(counters/value (:report-purges @pdb-services/admin-metrics)))
263-
264-
(deftest ^:integration report-ttl
265-
(with-open [pg (int/setup-postgres)]
266-
(with-open [pdb (int/run-puppetdb pg {})
267-
ps (int/run-puppet-server [pdb] {})]
268-
(testing "Run agent once to populate database"
269-
(int/run-puppet-as "ttl-agent" ps pdb "notify { 'irrelevant manifest': }"))
270-
271-
(testing "Verify we have a report"
272-
(is (= 1 (count (int/pql-query pdb "reports { certname = 'ttl-agent' }"))))))
273-
274-
(testing "Sleep for one second to make sure we have a ttl to exceed"
275-
(Thread/sleep 1000))
276-
277-
(let [initial-gc-count (counters/value (:report-purges @pdb-services/admin-metrics))]
278-
(with-open [pdb (int/run-puppetdb pg {:database {:report-ttl "1s"}})]
279-
(let [start-time (System/currentTimeMillis)]
280-
(loop []
281-
(cond
282-
(> (- (System/currentTimeMillis) start-time) tu/default-timeout-ms)
283-
(throw (ex-info "Timeout waiting for pdb gc to happen" {}))
284-
285-
(> (read-gc-count-metric) initial-gc-count)
286-
true ;; gc happened
287-
288-
:default
289-
(do
290-
(Thread/sleep 250)
291-
(recur)))))
292-
293-
(testing "Verify that the report has been deleted"
294-
(is (= 0 (count (int/pql-query pdb "reports { certname = 'ttl-agent' }")))))))))

0 commit comments

Comments
 (0)