Releases: neondatabase/autoscaling

v0.24.0

09 Feb 23:56
79b0ddd
Large release - lots of fixes, with a side of new features thrown in to
spice it up. This is actually a re-attempt of the same release; the first
attempt ran into issues and we noticed a critical bug, all of which have
now been fixed.

Features:

- neonvm-controller: Make max concurrency configurable via CLI (#773)
  - This was previously released as v0.23.2.
- neonvm: Add .status.restartCount (#754)
- neonvm-controller: VM startup metrics (#774)

No breaking changes.

No protocol changes.

Fixes:

- neonvm-controller: Fix overwriting runner version (#753)
- neonvm-runner: Don't log i/o timeout (#768)
- neonvm: Use crictl to change container CPU, ditch cgroup (#738)
  - This is a BIG change! It's disabled by default, can be enabled with
    a neonvm-controller flag.
- neonvm-runner: Fix "dnsmasq: failed to create inotify .." errors (#786)
- agent: Fix unregistered metric lastSendDuration (#787)
- neonvm-controller: Don't retry update on conflict (#796)
- neonvm-controller: Update VM's .Status.PodName immediately on API server (#797)

Other changes:

- neonvm-runner: Skip QEMU powerdown if already exited (#526)
- plugin: Unify reserve and unreserve logic (#666)
- neonvm: Use container statuses, not pod phase, to trigger restart (#749)
- neonvm-controller: Use --concurrency-limit=128 (#783)

Upgrade path from v0.23.x:

- No ordering requirements.

v0.23.2

30 Jan 21:41
abc09c2
Backport of #773 to allow configuring the neonvm-controller's max
concurrency.

Now that we have neonvm-controller metrics, we discovered that a couple
regions were sometimes saturating the concurrency limit for extended
periods of time.

Refer to #773 for more details.

v0.23.1

25 Jan 21:35
1f5aacc
Enables SSH access to VMs by default — see #766 and #726 for more.

v0.23.0

25 Jan 04:05
1178783
Substantial release with many bugfixes and quality-of-life improvements.
Notable inclusions: SSH support for VMs, opt-in higher log throughput,
tech debt resolution inside the scheduler.

Features:

- neonvm: Clock synchronization using kvm_ptp (#732)
- neonvm: Enable SSH access into the VMs (#726)
- neonvm-controller: Expose default set of metrics (#739)
- neonvm-controller: Custom reconciler metrics (#757)

No breaking changes (kind of; see "Protocol changes").

Protocol changes:

- agent,plugin: Refer to memory quantities in bytes, not memory slots (#653)
- agent: Send compute unit in requests to plugin (#744)
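
To illustrate the unit change in #653: a memory-slot count only has meaning alongside the slot size, whereas a byte count stands on its own. The following is a minimal sketch of the conversion, not the project's implementation. The 1 GiB slot size and the names `SLOT_SIZE_BYTES`, `slots_to_bytes` and `bytes_to_slots` are assumptions made up for this example.

```python
# Hypothetical slot size, for illustration only; the real value is not
# stated in these release notes.
SLOT_SIZE_BYTES = 1 << 30  # 1 GiB


def slots_to_bytes(slots: int, slot_size_bytes: int = SLOT_SIZE_BYTES) -> int:
    """Convert a memory-slot count into an absolute byte quantity."""
    if slots < 0 or slot_size_bytes <= 0:
        raise ValueError("slots must be >= 0 and slot size must be > 0")
    return slots * slot_size_bytes


def bytes_to_slots(num_bytes: int, slot_size_bytes: int = SLOT_SIZE_BYTES) -> int:
    """Convert a byte quantity back into whole slots, rejecting non-multiples."""
    if num_bytes < 0 or slot_size_bytes <= 0:
        raise ValueError("bytes must be >= 0 and slot size must be > 0")
    if num_bytes % slot_size_bytes != 0:
        raise ValueError("byte quantity is not a whole number of slots")
    return num_bytes // slot_size_bytes
```

Expressing quantities in bytes means the receiver does not need to know the sender's slot size to interpret a request.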

Fixes:

- neonvm-runner: Use the right file extension for ISO images (#735)
- agent: Add small random amount to plugin request tick (#745)
- agent: Sleep for random delay before first metrics request (#746)
- neonvm-runner: Pass logs through virtio-serial (#724)
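
The two agent fixes above (#745 and #746) both add randomness to request timing. A common reason for doing this is to keep many clients from sending requests in lockstep, though these notes do not state the motivation. The following is a rough sketch of the general technique, not the agent's code. The function names and parameters are hypothetical.

```python
import random


def jittered_interval(base_seconds: float, max_jitter_seconds: float, rng=random) -> float:
    """Return the base interval plus a small random amount of extra delay."""
    return base_seconds + rng.uniform(0.0, max_jitter_seconds)


def initial_delay(period_seconds: float, rng=random) -> float:
    """Pick a random delay between 0 and one period before the first request."""
    return rng.uniform(0.0, period_seconds)


if __name__ == "__main__":
    # Hypothetical values: a 5s tick with up to 0.5s of jitter.
    rng = random.Random()
    print("first request after", initial_delay(5.0, rng), "seconds")
    print("next tick after", jittered_interval(5.0, 0.5, rng), "seconds")
```

Passing `rng` explicitly keeps the helpers easy to test with a seeded `random.Random`.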

No other changes.

Upgrade path from v0.22.0:

- The neonvm deployment must be updated BEFORE making use of the new
  vm-builder version, else ssh access and clock synchronization will not
  work.
- The new `spec.enableSSH` field on VMs is ignored by previous versions
  of the scheduler and autoscaler-agents; no ordering requirement there.
- The scheduler MUST be updated before autoscaler-agents, due to the
  protocol changes between them.

v0.22.0

10 Jan 18:04
a23780d
Larger-than-normal release, been a while since the last one because of
holidays. Contains a bunch of substantial bugfixes, alongside some other
smaller improvements.

No new features.

No breaking changes (kind of; see "Protocol changes").

Protocol changes:

- agent,plugin: Don't send ComputeUnit from plugin (#707)

Fixes:

- neonvm-runner: Fix iptables rules for traffic from localhost (#701)
- neonvm-controller: Fix runner pod cgroup cpu scaling (#702)
- neonvm-controller: Update VM resources status once the scaling phase is done (#708)
- neonvm-controller: Unify up/down memory scaling (#704)

Other changes:

- agent: Add project ID label to per VM metrics (#699)
- neonvm: Add manual QMP access (#703)
- agent: Send monitor requests in 1 CU increments (#713)
- neonvm-controller: Extract QMP commands for memory scaling (#704)

Upgrade path from v0.21.0:

- Scheduler must be upgraded before autoscaler-agent (and, if rolling
  back, rolled back after), due to protocol change in #707.
- The per-VM metrics have a new label added; this *may* inadvertently
  break certain usage.

v0.21.0

15 Dec 21:42
2d9b252
Relatively small release, with significant changes to per-VM metrics,
and some other minor improvements elsewhere.

Features:

- agent: Include autoscaling bounds annotation in per-VM metrics (#695)

Breaking changes:

- agent: Separate per-VM memory and cpu metrics (#684)
  - The `autoscaling_vm_resources` metric is now split into
    `autoscaling_vm_cpu_cores` and `autoscaling_vm_memory_bytes`, with
    the "resource" label removed.
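
For anyone updating dashboards or tooling, the rename can be thought of as moving the old "resource" label into the metric name. The sketch below is illustrative only: the label values `"cpu"` and `"memory"` are assumptions (check what your own data uses), and `migrate_sample` is a hypothetical helper, not part of the autoscaler-agent.

```python
OLD_METRIC = "autoscaling_vm_resources"

# Assumed values of the old "resource" label; these are not confirmed by
# the release notes.
NEW_METRIC_BY_RESOURCE = {
    "cpu": "autoscaling_vm_cpu_cores",
    "memory": "autoscaling_vm_memory_bytes",
}


def migrate_sample(name: str, labels: dict) -> tuple:
    """Map an old-style (name, labels) pair onto the new metric name and labels.

    Samples of any other metric are returned unchanged (labels are copied).
    """
    if name != OLD_METRIC:
        return name, dict(labels)
    new_labels = dict(labels)
    resource = new_labels.pop("resource")  # the label no longer exists
    return NEW_METRIC_BY_RESOURCE[resource], new_labels
```

Any query that previously filtered on the "resource" label would need to select the corresponding new metric name instead.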

No protocol changes.

Fixes:

- neonvm/whereabouts: Require amd64 to fix issues with ARM nodes (#691)
  - Previously released as v0.20.0-patch1
- plugin: Fix nil deref on failed ExtractVmInfo on VM update (#690)
  - Previously released as v0.20.0-patch1
- agent: Clarify some logs about monitor requests (#687)
- Revert "neonvm/runner: Treat guest-side disk flush requests as no-ops" (#628)
  - This was previously released as v0.20.0-patch2.

Other changes:

- vm-builder: Set vector scrape interval from 15s to 1s (#650)

Upgrade path from v0.20.0:

- Components can be deployed in any order.
- The per-VM metrics exposed by the autoscaler-agent have changed in a
  backwards-incompatible way. Any usage must be updated.

v0.20.0-patch2

14 Dec 20:56
5e754ec
Reverts the neonvm/runner 'cache=unsafe' change. See #694 for more.

v0.20.0-patch1

11 Dec 20:23
8cf814b
Small release with a couple bugfixes:
- neonvm/whereabouts: Require amd64 to fix issues with ARM nodes (#691)
  - This was a regression from v0.19.x because v0.20.0 unintentionally
    removed the affinity selector.
- plugin: Fix nil deref on failed ExtractVmInfo on VM update (#690)
  - Not a regression, but occurs more on staging after v0.20.0 because
    of unrelated changes.

v0.20.0

08 Dec 23:51
436edea
A larger release, focused on stability and cleaning up internals.

Features:

- neonvm/controller: Add pprof endpoint (#670)

Breaking changes:

- plugin: Remove config nodeOverrides (#654)
- plugin: Make compute unit config global (#655)

Protocol changes:

- agent,plugin: Requests include agent's last permit (#649)

Fixes:

- agent: Fix plugin approved resource metrics (#647)
- util/watch: Fix triggered relist (#667)

Other changes:

- neonvm: Whereabouts CNI updated v0.6.1 → v0.6.2 (#636)
- util/watch: Remove resourceVersion from List calls (#672)
- neonvm/runner: Treat guest-side disk flush requests as no-ops (#628)
  - Sets "cache=unsafe" for various disk mounts
- agent/billing: Log URL on billing requests (#681)
- agent: Log config on startup (#682)
- agent: Replace schedwatch/trackcurrent with global value (#675)

Upgrade path from v0.19.x:

- Upgraded scheduler MUST be released before upgraded autoscaler-agents.
  If you need to roll back, the autoscaler-agents MUST be completely
  rolled back before the scheduler.
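
The ordering rule above can be written down as a small helper for deployment scripts. This is only a sketch of the stated constraint; the component names are taken from the text, and `rollout_order` is a hypothetical function, not part of any release tooling.

```python
# Order in which components must be upgraded, per the upgrade path above.
UPGRADE_ORDER = ["scheduler", "autoscaler-agent"]


def rollout_order(rolling_back: bool = False) -> list:
    """Return the order in which to deploy components.

    Upgrades go scheduler first; rollbacks go in the reverse order, so the
    autoscaler-agents are fully rolled back before the scheduler.
    """
    if rolling_back:
        return list(reversed(UPGRADE_ORDER))
    return list(UPGRADE_ORDER)
```

In both directions, each step should be fully complete before the next one starts.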

v0.19.1-patch1

22 Nov 19:52
c6dc97b
Adds #643 - forgot to change kernel version in release workflow.