EOL of upgrade-manager-v0 and make upgrade-manager-v1 the default. (#319)

* Delete README.md

Signed-off-by: Eytan Avisror <[email protected]>

* delete all

Signed-off-by: Eytan Avisror <[email protected]>

* scaffolding

Signed-off-by: Eytan Avisror <[email protected]>

* add API

Signed-off-by: Eytan Avisror <[email protected]>

* initial code

Signed-off-by: Eytan Avisror <[email protected]>

* add more scaffolding

Signed-off-by: Eytan Avisror <[email protected]>

* Add kubernetes API calls

Signed-off-by: Eytan Avisror <[email protected]>

* aws API calls

Signed-off-by: Eytan Avisror <[email protected]>

* AWS API calls & Drift detection

Signed-off-by: Eytan Avisror <[email protected]>

* initial rotation logic

Signed-off-by: Eytan Avisror <[email protected]>

* Implemented RollingUpgrade object validation. (#176)

* Validation step to check Nodes and ASG launch configs

Signed-off-by: shreyas-badiger <[email protected]>

* Validating launch definition after a rolling upgrade

Signed-off-by: shreyas-badiger <[email protected]>

* Fix all the "make vet" errors in Controller V2 branch. (#177)

* Validation step to check Nodes and ASG launch configs

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Validating launch definition after a rolling upgrade

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Resolve error log message and return statement

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Adding Functional Test (#113)

* Adding BDD, workflow and badge

* Changing CI workflow job name

* Adding make manifests

* Clarifying cron time zone comment

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* release 0.13 (#115)

* release 0.13

* Update CHANGELOG.md

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* bump version (#116)

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Repo selection for CI and BDD workflows & CI step for releases (#117)

* CI-BDD not on forks & Step for releases (#2)

* Testing CI-BDD not on forks & Step for releases

* Adding step for image with tag git-tag

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Terminate unjoined nodes

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Resolving PR comments

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Set version and update CHANGELOG for version 0.14. (#121)

Co-authored-by: Shri Javadekar <[email protected]>
Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Bump version to 0.15-dev.

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Fix typo in README.md. (#125)

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Ignore the terminated instance during upgrade

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Added WARNING prefix in the logging

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Apply suggestions from code review

Co-authored-by: Kevin Downey <[email protected]>
Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Capitalize sprintf to Sprintf

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Upgrade to Go 1.15 (#128)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Fix few typos and simplify error returns, remove redundant types (#131)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Readiness gates implementation for eager mode (#130)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Adding Functional Test (#113)

* Adding BDD, workflow and badge

* Changing CI workflow job name

* Adding make manifests

* Clarifying cron time zone comment

Signed-off-by: sbadiger <[email protected]>

* Validation step to check Nodes and ASG launch configs (#112)

* Validation step to check Nodes and ASG launch configs

* Validating launch definition after a rolling upgrade

* Resolve error log message and return statement

Co-authored-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* release 0.13 (#115)

* release 0.13

* Update CHANGELOG.md

Signed-off-by: sbadiger <[email protected]>

* bump version (#116)

Signed-off-by: sbadiger <[email protected]>

* Repo selection for CI and BDD workflows & CI step for releases (#117)

* CI-BDD not on forks & Step for releases (#2)

* Testing CI-BDD not on forks & Step for releases

* Adding step for image with tag git-tag

Signed-off-by: sbadiger <[email protected]>

* Terminate unjoined nodes (#120)

* Validation step to check Nodes and ASG launch configs

* Validating launch definition after a rolling upgrade

* Resolve error log message and return statement

* Terminate unjoined nodes

* Resolving PR comments

Co-authored-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Set version and update CHANGELOG for version 0.14. (#121)

Co-authored-by: Shri Javadekar <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Bump version to 0.15-dev.

Signed-off-by: sbadiger <[email protected]>

* Fix bug when switching to launch templates (#136)

* Update rollingupgrade_controller.go

* Update rollingupgrade_controller.go

Signed-off-by: Eytan Avisror <[email protected]>

* spacing fixes

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Extract script runner to a separate type; fix work with env. variables (#132)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Set version and update CHANGELOG for version v0.15 (#137)

Signed-off-by: Shri Javadekar <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Bump version to v0.16-dev.

Signed-off-by: Shri Javadekar <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Propagate parent env variables to allow to talk with API Server (#144)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Bump Golang CI action to fix failed CI run (#146)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Simplify (#145)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Add Expiration to cache and do not refresh ASG if cache is not expired (#143)

Signed-off-by: Oleg Atamanenko <[email protected]>

Co-authored-by: Shri Javadekar <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Fix documentation for uniform across AZ Update strategy and fix typos (#147)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Move cluster state from package level to a cluster state impl (#148)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Simplify work with intstr type. (#149)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* If instance is in standby mode already, just return (#138)

Signed-off-by: Oleg Atamanenko <[email protected]>

Co-authored-by: Shri Javadekar <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Handle terminated instances gracefully. (#150)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Template version comparison fix (#155)

* get template version

Signed-off-by: Eytan Avisror <[email protected]>

* fix tests

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* release 0.16 (#157)

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* bump version to 0.17-dev (#158)

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Don't uncordon node on failure to run postDrain script when IgnoreDrainFailures set (#151)

* Don't uncordon node on failure to run postDrain script when IgnoreDrainFailures set

Signed-off-by: Adam Malcontenti-Wilson <[email protected]>

* Test node uncordon when postDrain / postDrainWait script fails

Signed-off-by: Adam Malcontenti-Wilson <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Abort on strategy failure instead of continuing (#152)

* Abort on strategy failure instead of continuing

Signed-off-by: Adam Malcontenti-Wilson <[email protected]>

* Remove unformatted error message placeholder

Signed-off-by: Adam Malcontenti-Wilson <[email protected]>

* Explicitly specify strategy for tests

Signed-off-by: Adam Malcontenti-Wilson <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* use NamespacedName (#160)

Signed-off-by: Eytan Avisror <[email protected]>

Co-authored-by: Shri Javadekar <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Set version and update CHANGELOG for version v0.17 (#161)

Signed-off-by: Shri Javadekar <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Bump version to v0.18-dev (#162)

Signed-off-by: Shri Javadekar <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Move constants to types so that they can be reused (#167)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Remove separate module for pkg/log (#168)

Signed-off-by: Oleg Atamanenko <[email protected]>

Co-authored-by: Shri Javadekar <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Bump dependencies. (#169)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* use standard fmt.Errorf to format error message; unify error format (#171)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Fix namespaced name order (#170)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Add instance id to the logs (#173)

Signed-off-by: Oleg Atamanenko <[email protected]>

Co-authored-by: Shri Javadekar <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Bump golang and busybox (#172)

Signed-off-by: Oleg Atamanenko <[email protected]>

Co-authored-by: Shri Javadekar <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Expose template list and other execution errors to logs (#166)

* Log and return wrapped launchtemplate error

Signed-off-by: Adam Malcontenti-Wilson <[email protected]>

* Expose execution error in logs

Signed-off-by: Adam Malcontenti-Wilson <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* output can contain other messages from API Server, so be more relaxed (#174)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Delete README.md

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* delete all

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* scaffolding

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* add API

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* initial code

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* add more scaffolding

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Add kubernetes API calls

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* aws API calls

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* AWS API calls & Drift detection

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* validate() function

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* modified validate()

Signed-off-by: sbadiger <[email protected]>

* modified validate()

Signed-off-by: sbadiger <[email protected]>

* initial rotation logic

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* basic script_runner without any modifications

Signed-off-by: sbadiger <[email protected]>

* Fix all the vet related errors

Signed-off-by: sbadiger <[email protected]>

Co-authored-by: Alfredo Garo <[email protected]>
Co-authored-by: Eytan Avisror <[email protected]>
Co-authored-by: Shri Javadekar <[email protected]>
Co-authored-by: Shri Javadekar <[email protected]>
Co-authored-by: Shri Javadekar <[email protected]>
Co-authored-by: Craig Robson <[email protected]>
Co-authored-by: Kevin Downey <[email protected]>
Co-authored-by: Oleg Atamanenko <[email protected]>
Co-authored-by: Shreyas Badiger <[email protected]>
Co-authored-by: Adam Malcontenti-Wilson <[email protected]>
Co-authored-by: Adam Malcontenti-Wilson <[email protected]>
Co-authored-by: Eytan Avisror <[email protected]>

* Controller v2: Implementation of Instance termination (#178)

* fix make vet errors.

Signed-off-by: sbadiger <[email protected]>

* Terminate instances and run v2 for first time.

Signed-off-by: sbadiger <[email protected]>

* Addressing review comments

Signed-off-by: sbadiger <[email protected]>

* addressing more review comments

Signed-off-by: sbadiger <[email protected]>

* Log error message

Signed-off-by: sbadiger <[email protected]>

* error handling for instance tagging

Signed-off-by: sbadiger <[email protected]>

* Migrate Script Runner (#179)

* Basic script runner

Signed-off-by: Eytan Avisror <[email protected]>

* Update upgrade.go

Signed-off-by: Eytan Avisror <[email protected]>

* Implemented node drain. (#181)

* Eager mode implementation (#183)

* Eager mode implementation

Signed-off-by: sbadiger <[email protected]>

* Metrics features (#189)

Signed-off-by: xshao <[email protected]>

* Process the batch rotation in parallel (#192)

* Process the batch rotation in parallel

Signed-off-by: sbadiger <[email protected]>

* addressing review comments

Signed-off-by: sbadiger <[email protected]>

* Move the DrainManager within ReplaceBatch(), to access one per RollingUpgrade CR (#195)

Signed-off-by: sbadiger <[email protected]>

* Refine metrics implementation to support goroutines (#196)

* Refine the metrics status

Signed-off-by: xshao <[email protected]>

* Refine the metrics status

Signed-off-by: xshao <[email protected]>

* Fix test case error

Signed-off-by: xshao <[email protected]>

* Use group instead of ASG

Signed-off-by: xshao <[email protected]>

* Ignore generated code (#201)

* Refine the metrics status

Signed-off-by: xshao <[email protected]>

* Refine the metrics status

Signed-off-by: xshao <[email protected]>

* Fix test case error

Signed-off-by: xshao <[email protected]>

* Use group instead of ASG

Signed-off-by: xshao <[email protected]>

* Ignore generated code

Signed-off-by: xshao <[email protected]>

* Ignore generated code

Signed-off-by: xshao <[email protected]>

* Fix bug in deleting the entry in syncMap (#203)

Signed-off-by: sbadiger <[email protected]>

* Unit tests for controller-v2 (#215)

* Unit tests

Signed-off-by: sbadiger <[email protected]>

* minor change in accessing the namespace name

Signed-off-by: sbadiger <[email protected]>

* move helper functions to a different file

Signed-off-by: sbadiger <[email protected]>

* #2285: rollup CR statistic metrics in v2 (#218)

* #2285: rollup CR statistic metrics in v2

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>

* #2285: renamed some methods related to metrics (#224)

Signed-off-by: sbadla1 <[email protected]>

* #2286: removed version from metric namespace (#227)

Signed-off-by: sbadla1 <[email protected]>

* Create RollingUpgradeContext (#234)

* #2285: rollup CR statistic metrics in v2 (#218)

* #2285: rollup CR statistic metrics in v2

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* log cloud discovery failure

Signed-off-by: sbadiger <[email protected]>

* Create RollingUpgrade Context

Signed-off-by: sbadiger <[email protected]>

* rollingupgrade context

Signed-off-by: sbadiger <[email protected]>

Co-authored-by: Sahil Badla <[email protected]>

* Resolve compile errors caused by merge conflict. (#235)

* #2285: rollup CR statistic metrics in v2 (#218)

* #2285: rollup CR statistic metrics in v2

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* log cloud discovery failure

Signed-off-by: sbadiger <[email protected]>

* Create RollingUpgrade Context

Signed-off-by: sbadiger <[email protected]>

* rollingupgrade context

Signed-off-by: sbadiger <[email protected]>

* resolve compile errors due to merge conflict

Signed-off-by: sbadiger <[email protected]>

Co-authored-by: Sahil Badla <[email protected]>

* upgrade-manager-v2: Move DrainManager back to Reconciler (#236)

* #2285: rollup CR statistic metrics in v2 (#218)

* #2285: rollup CR statistic metrics in v2

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* log cloud discovery failure

Signed-off-by: sbadiger <[email protected]>

* Create RollingUpgrade Context

Signed-off-by: sbadiger <[email protected]>

* rollingupgrade context

Signed-off-by: sbadiger <[email protected]>

* #2285: rollup CR statistic metrics in v2 (#218)

* #2285: rollup CR statistic metrics in v2

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* #2285: renamed some methods related to metrics (#224)

Signed-off-by: sbadla1 <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* #2286: removed version from metric namespace (#227)

Signed-off-by: sbadla1 <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* resolve compile errors due to merge conflict

Signed-off-by: sbadiger <[email protected]>

* move drain-manager to reconciler

Signed-off-by: sbadiger <[email protected]>

* initialize RollingUpgrade object

Signed-off-by: sbadiger <[email protected]>

* use bool instead of count for standby function

Signed-off-by: sbadiger <[email protected]>

* refactor in-progress and standby code

Signed-off-by: sbadiger <[email protected]>

* rename instance standby function

Signed-off-by: sbadiger <[email protected]>

* DrainManager changes in unit test files

Signed-off-by: sbadiger <[email protected]>

Co-authored-by: Sahil Badla <[email protected]>

* V2 controller metrics concurrency fix (#231)

* Refine the metrics status

Signed-off-by: xshao <[email protected]>

* Refine the metrics status

Signed-off-by: xshao <[email protected]>

* Fix test case error

Signed-off-by: xshao <[email protected]>

* Use group instead of ASG

Signed-off-by: xshao <[email protected]>

* Ignore generated code

Signed-off-by: xshao <[email protected]>

* Ignore generated code

Signed-off-by: xshao <[email protected]>

* Fix the concurrent issue

Signed-off-by: xshao <[email protected]>

* Fix the concurrent issue

Signed-off-by: xshao <[email protected]>

* Move metrics related functions into RollingUpgradeContext

Signed-off-by: xshao <[email protected]>

* Move metrics related functions into RollingUpgradeContext

Signed-off-by: xshao <[email protected]>

* Move metrics related functions into upgrade_metrics.go

Signed-off-by: xshao <[email protected]>

* Move metrics related functions into metrics.go

Signed-off-by: xshao <[email protected]>

* add missing parenthesis (#239)

* metricsMutex should be initialized (#240)

Signed-off-by: xshao <[email protected]>

* upgrade-manager-v2: Load test fixes (#245)

* upgrade-manager-v2: Move DrainManager back to Reconciler (#236)

* #2285: rollup CR statistic metrics in v2 (#218)

* #2285: rollup CR statistic metrics in v2

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* log cloud discovery failure

Signed-off-by: sbadiger <[email protected]>

* Create RollingUpgrade Context

Signed-off-by: sbadiger <[email protected]>

* rollingupgrade context

Signed-off-by: sbadiger <[email protected]>

* #2285: rollup CR statistic metrics in v2 (#218)

* #2285: rollup CR statistic metrics in v2

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* #2285: renamed some methods related to metrics (#224)

Signed-off-by: sbadla1 <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* #2286: removed version from metric namespace (#227)

Signed-off-by: sbadla1 <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* resolve compile errors due to merge conflict

Signed-off-by: sbadiger <[email protected]>

* move drain-manager to reconciler

Signed-off-by: sbadiger <[email protected]>

* initialize RollingUpgrade object

Signed-off-by: sbadiger <[email protected]>

* use bool instead of count for standby function

Signed-off-by: sbadiger <[email protected]>

* refactor in-progress and standby code

Signed-off-by: sbadiger <[email protected]>

* rename instance standby function

Signed-off-by: sbadiger <[email protected]>

* DrainManager changes in unit test files

Signed-off-by: sbadiger <[email protected]>

Co-authored-by: Sahil Badla <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* V2 controller metrics concurrency fix (#231)

* Refine the metrics status

Signed-off-by: xshao <[email protected]>

* Refine the metrics status

Signed-off-by: xshao <[email protected]>

* Fix test case error

Signed-off-by: xshao <[email protected]>

* Use group instead of ASG

Signed-off-by: xshao <[email protected]>

* Ignore generated code

Signed-off-by: xshao <[email protected]>

* Ignore generated code

Signed-off-by: xshao <[email protected]>

* Fix the concurrent issue

Signed-off-by: xshao <[email protected]>

* Fix the concurrent issue

Signed-off-by: xshao <[email protected]>

* Move metrics related functions into RollingUpgradeContext

Signed-off-by: xshao <[email protected]>

* Move metrics related functions into RollingUpgradeContext

Signed-off-by: xshao <[email protected]>

* Move metrics related functions into upgrade_metrics.go

Signed-off-by: xshao <[email protected]>

* Move metrics related functions into metrics.go

Signed-off-by: xshao <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* add missing parenthesis

Signed-off-by: sbadiger <[email protected]>

* load test fixes

Signed-off-by: sbadiger <[email protected]>

* handle scaling group not found

Signed-off-by: sbadiger <[email protected]>

* Update upgrade.go

Signed-off-by: sbadiger <[email protected]>

* log one level up

* remove double logging

Signed-off-by: sbadiger <[email protected]>

* final push before RC release. (#254)

* support IgnoreDrainFailures flag

Signed-off-by: sbadiger <[email protected]>

* add else condition

Signed-off-by: sbadiger <[email protected]>

* set min for maxUnavailable

Signed-off-by: sbadiger <[email protected]>

* calculateMaxUnavailable function

Signed-off-by: sbadiger <[email protected]>

* add a new column (completePercentage)

Signed-off-by: sbadiger <[email protected]>

* disable debug logs by default

Signed-off-by: sbadiger <[email protected]>

* Fix metrics collecting issue (#249)

* metricsMutex should be initialized

Signed-off-by: xshao <[email protected]>

* Use InProcessingNode instead of Stringp[] so that it can have the status of steps

Signed-off-by: xshao <[email protected]>

* Revert "Fix metrics collecting issue (#249)" (#256)

This reverts commit f5dd1cb5f76f2b78cb15c53daed14032a2a4c6ec.

* Fix metrics calculation issue (#258)

* metricsMutex should be initialized

Signed-off-by: xshao <[email protected]>

* Use InProcessingNode instead of Stringp[] so that it can have the status of steps

Signed-off-by: xshao <[email protected]>

* Make the change backward compatible

Signed-off-by: xshao <[email protected]>

* Make the change backward compatible

Signed-off-by: xshao <[email protected]>

* Add mutex for InProcessingNode deleting

Signed-off-by: xshao <[email protected]>

* Add a mock for test and update version in Makefile (#262)

Signed-off-by: sbadiger <[email protected]>

* and CR end time (#264)

Signed-off-by: sbadiger <[email protected]>

* upgrade-manager-v2: expose totalProcessing time and other metrics (#265)

* and CR end time

Signed-off-by: sbadiger <[email protected]>

* expose totalProcessing time and other metrics

Signed-off-by: sbadiger <[email protected]>

* addressing review comments

Signed-off-by: sbadiger <[email protected]>

* upgrade-manager-v2: remove function duplicate declaration. (#266)

* and CR end time

Signed-off-by: sbadiger <[email protected]>

* expose totalProcessing time and other metrics

Signed-off-by: sbadiger <[email protected]>

* addressing review comments

Signed-off-by: sbadiger <[email protected]>

* remove function duplication

Signed-off-by: sbadiger <[email protected]>

* Carry the metrics status in RollingUpgrade CR (#267)

* Update metrics status at same time

Signed-off-by: xshao <[email protected]>

* Update metrics status when terminating instance

Signed-off-by: xshao <[email protected]>

* Add terminated step

Signed-off-by: xshao <[email protected]>

* Add terminated step

Signed-off-by: xshao <[email protected]>

* Add terminated step

Signed-off-by: xshao <[email protected]>

* move cloud discovery after nodeInterval / drainInterval wait (#270)

Signed-off-by: sbadiger <[email protected]>

* upgrade-manager-v2: Add nodeEvents handler instead of a watch handler (#272)

* upgrade-manager-v2: remove function duplicate declaration. (#266)

* and CR end time

Signed-off-by: sbadiger <[email protected]>

* expose totalProcessing time and other metrics

Signed-off-by: sbadiger <[email protected]>

* addressing review comments

Signed-off-by: sbadiger <[email protected]>

* remove function duplication

Signed-off-by: sbadiger <[email protected]>

* Carry the metrics status in RollingUpgrade CR (#267)

* Update metrics status at same time

Signed-off-by: xshao <[email protected]>

* Update metrics status when terminating instance

Signed-off-by: xshao <[email protected]>

* Add terminated step

Signed-off-by: xshao <[email protected]>

* Add terminated step

Signed-off-by: xshao <[email protected]>

* Add terminated step

Signed-off-by: xshao <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* move cloud discovery after nodeInterval / drainInterval wait

Signed-off-by: sbadiger <[email protected]>

* Add watch event for cluster nodes instead of API calls

Signed-off-by: sbadiger <[email protected]>

* upon node deletion, remove it from syncMap as well

Signed-off-by: sbadiger <[email protected]>

* Add nodeEvents handler instead of watch handler

Signed-off-by: sbadiger <[email protected]>

* Ignore Reconciles on nodeEvents

Signed-off-by: sbadiger <[email protected]>

* Add comments

Signed-off-by: sbadiger <[email protected]>

Co-authored-by: Sheldon Shao <[email protected]>

* upgrade-manager-v2: Process next batch while waiting on nodeInterval period. (#273)

* upgrade-manager-v2: remove function duplicate declaration. (#266)

* and CR end time

Signed-off-by: sbadiger <[email protected]>

* expose totalProcessing time and other metrics

Signed-off-by: sbadiger <[email protected]>

* addressing review comments

Signed-off-by: sbadiger <[email protected]>

* remove function duplication

Signed-off-by: sbadiger <[email protected]>

* Carry the metrics status in RollingUpgrade CR (#267)

* Update metrics status at same time

Signed-off-by: xshao <[email protected]>

* Update metrics status when terminating instance

Signed-off-by: xshao <[email protected]>

* Add terminated step

Signed-off-by: xshao <[email protected]>

* Add terminated step

Signed-off-by: xshao <[email protected]>

* Add terminated step

Signed-off-by: xshao <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* move cloud discovery after nodeInterval / drainInterval wait

Signed-off-by: sbadiger <[email protected]>

* Add watch event for cluster nodes instead of API calls

Signed-off-by: sbadiger <[email protected]>

* upon node deletion, remove it from syncMap as well

Signed-off-by: sbadiger <[email protected]>

* Add nodeEvents handler instead of watch handler

Signed-off-by: sbadiger <[email protected]>

* Ignore Reconciles on nodeEvents

Signed-off-by: sbadiger <[email protected]>

* Add comments

Signed-off-by: sbadiger <[email protected]>

* Set nextbatch to standBy while waiting for terminate

* Avoid parallel reconcile operation per ASG

* add default requeue time

Co-authored-by: Sheldon Shao <[email protected]>

* upgrade-manager-v2: Fix unit tests (#275)

* Delete README.md

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* delete all

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* scaffolding

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* add API

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* initial code

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* add more scaffolding

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Add kubernetes API calls

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* aws API calls

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* AWS API calls & Drift detection

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* initial rotation logic

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Implemented RollingUpgrade object validation. (#176)

* Validation step to check Nodes and ASG launch configs

Signed-off-by: shreyas-badiger <[email protected]>

* Validating launch definition after a rolling upgrade

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Fix all the "make vet" errors in Controller V2 branch. (#177)

* Validation step to check Nodes and ASG launch configs

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Validating launch definition after a rolling upgrade

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Resolve error log message and return statement

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Adding Functional Test (#113)

* Adding BDD, workflow and badge

* Changing CI workflow job name

* Adding make manifests

* Clarifying cron time zone comment

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* release 0.13 (#115)

* release 0.13

* Update CHANGELOG.md

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* bump version (#116)

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Repo selection for CI and BDD workflows & CI step for releases (#117)

* CI-BDD not on forks & Step for releases (#2)

* Testing CI-BDD not on forks & Step for releases

* Adding step for image with tag git-tag

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Terminate unjoined nodes

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Resolving PR comments

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Set version and update CHANGELOG for version 0.14. (#121)

Co-authored-by: Shri Javadekar <[email protected]>
Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Bump version to 0.15-dev.

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Fix typo in README.md. (#125)

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Ignore the terminated instance during upgrade

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Added WARNING prefix in the logging

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Apply suggestions from code review

Co-authored-by: Kevin Downey <[email protected]>
Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Capitalize sprintf to Sprintf

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Upgrade to Go 1.15 (#128)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Fix few typos and simplify error returns, remove redundant types (#131)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Readiness gates implementation for eager mode (#130)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Adding Functional Test (#113)

* Adding BDD, workflow and badge

* Changing CI workflow job name

* Adding make manifests

* Clarifying cron time zone comment

Signed-off-by: sbadiger <[email protected]>

* Validation step to check Nodes and ASG launch configs (#112)

* Validation step to check Nodes and ASG launch configs

* Validating launch definition after a rolling upgrade

* Resolve error log message and return statement

Co-authored-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* release 0.13 (#115)

* release 0.13

* Update CHANGELOG.md

Signed-off-by: sbadiger <[email protected]>

* bump version (#116)

Signed-off-by: sbadiger <[email protected]>

* Repo selection for CI and BDD workflows & CI step for releases (#117)

* CI-BDD not on forks & Step for releases (#2)

* Testing CI-BDD not on forks & Step for releases

* Adding step for image with tag git-tag

Signed-off-by: sbadiger <[email protected]>

* Terminate unjoined nodes (#120)

* Validation step to check Nodes and ASG launch configs

* Validating launch definition after a rolling upgrade

* Resolve error log message and return statement

* Terminate unjoined nodes

* Resolving PR comments

Co-authored-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Set version and update CHANGELOG for version 0.14. (#121)

Co-authored-by: Shri Javadekar <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Bump version to 0.15-dev.

Signed-off-by: sbadiger <[email protected]>

* Fix bug when switching to launch templates (#136)

* Update rollingupgrade_controller.go

* Update rollingupgrade_controller.go

Signed-off-by: Eytan Avisror <[email protected]>

* spacing fixes

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Extract script runner to a separate type; fix work with env. variables (#132)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Set version and update CHANGELOG for version v0.15 (#137)

Signed-off-by: Shri Javadekar <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Bump version to v0.16-dev.

Signed-off-by: Shri Javadekar <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Propagate parent env variables to allow to talk with API Server (#144)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Bump Golang CI action to fix failed CI run (#146)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Simplify (#145)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Add Expiration to cache and do not refresh ASG if cache is not expired (#143)

Signed-off-by: Oleg Atamanenko <[email protected]>

Co-authored-by: Shri Javadekar <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Fix documentation for uniform across AZ Update strategy and fix typos (#147)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Move cluster state from package level to a cluster state impl (#148)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Simplify work with intstr type. (#149)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* If instance is in standby mode already, just return (#138)

Signed-off-by: Oleg Atamanenko <[email protected]>

Co-authored-by: Shri Javadekar <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Handle terminated instances gracefully. (#150)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Template version comparison fix (#155)

* get template version

Signed-off-by: Eytan Avisror <[email protected]>

* fix tests

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* release 0.16 (#157)

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* bump version to 0.17-dev (#158)

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Don't uncordon node on failure to run postDrain script when IgnoreDrainFailures set (#151)

* Don't uncordon node on failure to run postDrain script when IgnoreDrainFailures set

Signed-off-by: Adam Malcontenti-Wilson <[email protected]>

* Test node uncordon when postDrain / postDrainWait script fails

Signed-off-by: Adam Malcontenti-Wilson <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Abort on strategy failure instead of continuing (#152)

* Abort on strategy failure instead of continuing

Signed-off-by: Adam Malcontenti-Wilson <[email protected]>

* Remove unformatted error message placeholder

Signed-off-by: Adam Malcontenti-Wilson <[email protected]>

* Explicitly specify strategy for tests

Signed-off-by: Adam Malcontenti-Wilson <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* use NamespacedName (#160)

Signed-off-by: Eytan Avisror <[email protected]>

Co-authored-by: Shri Javadekar <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Set version and update CHANGELOG for version v0.17 (#161)

Signed-off-by: Shri Javadekar <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Bump version to v0.18-dev (#162)

Signed-off-by: Shri Javadekar <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Move constants to types so that they can be reused (#167)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Remove separate module for pkg/log (#168)

Signed-off-by: Oleg Atamanenko <[email protected]>

Co-authored-by: Shri Javadekar <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Bump dependencies. (#169)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* use standard fmt.Errorf to format error message; unify error format (#171)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Fix namespaced name order (#170)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Add instance id to the logs (#173)

Signed-off-by: Oleg Atamanenko <[email protected]>

Co-authored-by: Shri Javadekar <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Bump golang and busybox (#172)

Signed-off-by: Oleg Atamanenko <[email protected]>

Co-authored-by: Shri Javadekar <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Expose template list and other execution errors to logs (#166)

* Log and return wrapped launchtemplate error

Signed-off-by: Adam Malcontenti-Wilson <[email protected]>

* Expose execution error in logs

Signed-off-by: Adam Malcontenti-Wilson <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* output can contain other messages from API Server, so be more relaxed (#174)

Signed-off-by: Oleg Atamanenko <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Delete README.md

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* delete all

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* scaffolding

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* add API

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* initial code

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* add more scaffolding

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Add kubernetes API calls

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* aws API calls

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* AWS API calls & Drift detection

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* validate() function

Signed-off-by: shreyas-badiger <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* modified validate()

Signed-off-by: sbadiger <[email protected]>

* modified validate()

Signed-off-by: sbadiger <[email protected]>

* initial rotation logic

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* basic script_runner without any modifications

Signed-off-by: sbadiger <[email protected]>

* Fix all the vet related errors

Signed-off-by: sbadiger <[email protected]>

Co-authored-by: Alfredo Garo <[email protected]>
Co-authored-by: Eytan Avisror <[email protected]>
Co-authored-by: Shri Javadekar <[email protected]>
Co-authored-by: Shri Javadekar <[email protected]>
Co-authored-by: Shri Javadekar <[email protected]>
Co-authored-by: Craig Robson <[email protected]>
Co-authored-by: Kevin Downey <[email protected]>
Co-authored-by: Oleg Atamanenko <[email protected]>
Co-authored-by: Shreyas Badiger <[email protected]>
Co-authored-by: Adam Malcontenti-Wilson <[email protected]>
Co-authored-by: Adam Malcontenti-Wilson <[email protected]>
Co-authored-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Controller v2: Implementation of Instance termination (#178)

* fix make vet errors.

Signed-off-by: sbadiger <[email protected]>

* Terminate instances and run v2 for first time.

Signed-off-by: sbadiger <[email protected]>

* Addressing review comments

Signed-off-by: sbadiger <[email protected]>

* addressing more review comments

Signed-off-by: sbadiger <[email protected]>

* Log error message

Signed-off-by: sbadiger <[email protected]>

* error handling for instance tagging

Signed-off-by: sbadiger <[email protected]>

* Migrate Script Runner (#179)

* Basic script runner

Signed-off-by: Eytan Avisror <[email protected]>

* Update upgrade.go

Signed-off-by: Eytan Avisror <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Implemented node drain. (#181)

Signed-off-by: sbadiger <[email protected]>

* Eager mode implementation (#183)

* Eager mode implementation

Signed-off-by: sbadiger <[email protected]>

* Metrics features (#189)

Signed-off-by: xshao <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Process the batch rotation in parallel (#192)

* Process the batch rotation in parallel

Signed-off-by: sbadiger <[email protected]>

* addressing review comments

Signed-off-by: sbadiger <[email protected]>

* Move the DrainManager within ReplaceBatch(), to access one per RollingUpgrade CR (#195)

Signed-off-by: sbadiger <[email protected]>

* Refine metrics implementation to support goroutines (#196)

* Refine the metrics status

Signed-off-by: xshao <[email protected]>

* Refine the metrics status

Signed-off-by: xshao <[email protected]>

* Fix test case error

Signed-off-by: xshao <[email protected]>

* Use group instead of ASG

Signed-off-by: xshao <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Ignore generated code  (#201)

* Refine the metrics status

Signed-off-by: xshao <[email protected]>

* Refine the metrics status

Signed-off-by: xshao <[email protected]>

* Fix test case error

Signed-off-by: xshao <[email protected]>

* Use group instead of ASG

Signed-off-by: xshao <[email protected]>

* Ignore generated code

Signed-off-by: xshao <[email protected]>

* Ignore generated code

Signed-off-by: xshao <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Fix bug in deleting the entry in syncMap (#203)

Signed-off-by: sbadiger <[email protected]>

* Unit tests for controller-v2 (#215)

* Unit tests

Signed-off-by: sbadiger <[email protected]>

* minor change in accessing the namespace name

Signed-off-by: sbadiger <[email protected]>

* move helper functions to a different file

Signed-off-by: sbadiger <[email protected]>

* #2285: rollup CR statistic metrics in v2 (#218)

* #2285: rollup CR statistic metrics in v2

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* #2285: renamed some methods related to metrics (#224)

Signed-off-by: sbadla1 <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* #2286: removed version from metric namespace (#227)

Signed-off-by: sbadla1 <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Create RollingUpgradeContext (#234)

* #2285: rollup CR statistic metrics in v2 (#218)

* #2285: rollup CR statistic metrics in v2

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* log cloud discovery failure

Signed-off-by: sbadiger <[email protected]>

* Create RollingUpgrade Context

Signed-off-by: sbadiger <[email protected]>

* rollingupgrade context

Signed-off-by: sbadiger <[email protected]>

Co-authored-by: Sahil Badla <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Resolve compile errors caused by merge conflict. (#235)

* #2285: rollup CR statistic metrics in v2 (#218)

* #2285: rollup CR statistic metrics in v2

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* log cloud discovery failure

Signed-off-by: sbadiger <[email protected]>

* Create RollingUpgrade Context

Signed-off-by: sbadiger <[email protected]>

* rollingupgrade context

Signed-off-by: sbadiger <[email protected]>

* resolve compile errors due to merge conflict

Signed-off-by: sbadiger <[email protected]>

Co-authored-by: Sahil Badla <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* upgrade-manager-v2: Move DrainManager back to Reconciler (#236)

* #2285: rollup CR statistic metrics in v2 (#218)

* #2285: rollup CR statistic metrics in v2

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* log cloud discovery failure

Signed-off-by: sbadiger <[email protected]>

* Create RollingUpgrade Context

Signed-off-by: sbadiger <[email protected]>

* rollingupgrade context

Signed-off-by: sbadiger <[email protected]>

* #2285: rollup CR statistic metrics in v2 (#218)

* #2285: rollup CR statistic metrics in v2

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* #2285: renamed some methods related to metrics (#224)

Signed-off-by: sbadla1 <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* #2286: removed version from metric namespace (#227)

Signed-off-by: sbadla1 <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* resolve compile errors due to merge conflict

Signed-off-by: sbadiger <[email protected]>

* move drain-manager to reconciler

Signed-off-by: sbadiger <[email protected]>

* initialize RollingUpgrade object

Signed-off-by: sbadiger <[email protected]>

* use bool instead of count for standby function

Signed-off-by: sbadiger <[email protected]>

* refactor in-progress and standby code

Signed-off-by: sbadiger <[email protected]>

* rename instance standby function

Signed-off-by: sbadiger <[email protected]>

* DrainManager changes in unit test files

Signed-off-by: sbadiger <[email protected]>

Co-authored-by: Sahil Badla <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* V2 controller metrics concurrency fix (#231)

* Refine the metrics status

Signed-off-by: xshao <[email protected]>

* Refine the metrics status

Signed-off-by: xshao <[email protected]>

* Fix test case error

Signed-off-by: xshao <[email protected]>

* Use group instead of ASG

Signed-off-by: xshao <[email protected]>

* Ignore generated code

Signed-off-by: xshao <[email protected]>

* Ignore generated code

Signed-off-by: xshao <[email protected]>

* Fix the concurrent issue

Signed-off-by: xshao <[email protected]>

* Fix the concurrent issue

Signed-off-by: xshao <[email protected]>

* Move metrics related functions into RollingUpgradeContext

Signed-off-by: xshao <[email protected]>

* Move metrics related functions into RollingUpgradeContext

Signed-off-by: xshao <[email protected]>

* Move metrics related functions into upgrade_metrics.go

Signed-off-by: xshao <[email protected]>

* Move metrics related functions into metrics.go

Signed-off-by: xshao <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* add missing parenthesis (#239)

Signed-off-by: sbadiger <[email protected]>

* metricsMutex should be initialized (#240)

Signed-off-by: xshao <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* upgrade-manager-v2: Load test fixes (#245)

* upgrade-manager-v2: Move DrainManager back to Reconciler (#236)

* #2285: rollup CR statistic metrics in v2 (#218)

* #2285: rollup CR statistic metrics in v2

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* log cloud discovery failure

Signed-off-by: sbadiger <[email protected]>

* Create RollingUpgrade Context

Signed-off-by: sbadiger <[email protected]>

* rollingupgrade context

Signed-off-by: sbadiger <[email protected]>

* #2285: rollup CR statistic metrics in v2 (#218)

* #2285: rollup CR statistic metrics in v2

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>

* #2285: updated metric flags

Signed-off-by: sbadla1 <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* #2285: renamed some methods related to metrics (#224)

Signed-off-by: sbadla1 <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* #2286: removed version from metric namespace (#227)

Signed-off-by: sbadla1 <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* resolve compile errors due to merge conflict

Signed-off-by: sbadiger <[email protected]>

* move drain-manager to reconciler

Signed-off-by: sbadiger <[email protected]>

* initialize RollingUpgrade object

Signed-off-by: sbadiger <[email protected]>

* use bool instead of count for standby function

Signed-off-by: sbadiger <[email protected]>

* refactor in-progress and standby code

Signed-off-by: sbadiger <[email protected]>

* rename instance standby function

Signed-off-by: sbadiger <[email protected]>

* DrainManager changes in unit test files

Signed-off-by: sbadiger <[email protected]>

Co-authored-by: Sahil Badla <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* V2 controller metrics concurrency fix (#231)

* Refine the metrics status

Signed-off-by: xshao <[email protected]>

* Refine the metrics status

Signed-off-by: xshao <[email protected]>

* Fix test case error

Signed-off-by: xshao <[email protected]>

* Use group instead of ASG

Signed-off-by: xshao <[email protected]>

* Ignore generated code

Signed-off-by: xshao <[email protected]>

* Ignore generated code

Signed-off-by: xshao <[email protected]>

* Fix the concurrent issue

Signed-off-by: xshao <[email protected]>

* Fix the concurrent issue

Signed-off-by: xshao <[email protected]>

* Move metrics related functions into RollingUpgradeContext

Signed-off-by: xshao <[email protected]>

* Move metrics related functions into RollingUpgradeContext

Signed-off-by: xshao <[email protected]>

* Move metrics related functions into upgrade_metrics.go

Signed-off-by: xshao <[email protected]>

* Move metrics related functions into metrics.go

Signed-off-by: xshao <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* add missing parenthesis

Signed-off-by: sbadiger <[email protected]>

* load test fixes

Signed-off-by: sbadiger <[email protected]>

* handle scaling group not found

Signed-off-by: sbadiger <[email protected]>

* Update upgrade.go

Signed-off-by: sbadiger <[email protected]>

* log one level up

* remove double logging

Signed-off-by: sbadiger <[email protected]>

* final push before RC release. (#254)

* support IgnoreDrainFailures flag

Signed-off-by: sbadiger <[email protected]>

* add else condition

Signed-off-by: sbadiger <[email protected]>

* set min for maxUnavailable

Signed-off-by: sbadiger <[email protected]>

* calculateMaxUnavailable function

Signed-off-by: sbadiger <[email protected]>

* add a new column (completePercentage)

Signed-off-by: sbadiger <[email protected]>

* disable debug logs by default

Signed-off-by: sbadiger <[email protected]>

* Fix metrics collecting issue (#249)

* metricsMutex should be initialized

Signed-off-by: xshao <[email protected]>

* Use InProcessingNode instead of Stringp[] so that it can have the status of steps

Signed-off-by: xshao <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* Revert "Fix metrics collecting issue (#249)" (#256)

This reverts commit f5dd1cb5f76f2b78cb15c53daed14032a2a4c6ec.

Signed-off-by: sbadiger <[email protected]>

* Fix metrics calculation issue (#258)

* metricsMutex should be initialized

Signed-off-by: xshao <[email protected]>

* Use InProcessingNode instead of Stringp[] so that it can have the status of steps

Signed-off-by: xshao <[email protected]>

* Make the change backward compatible

Signed-off-by: xshao <[email protected]>

* Make the change backward compatible

Signed-off-by: xshao <[email protected]>

* Add mutex for InProcessingNode deleting

Signed-off-by: xshao <[email protected]>
Signed-off-by: sbadiger <[email protected]>

* …
17 people authored Feb 23, 2022
1 parent eaea1ab commit 167e10b
Showing 100 changed files with 4,777 additions and 7,647 deletions.
114 changes: 78 additions & 36 deletions .github/CHANGELOG.md
@@ -1,42 +1,84 @@
# Change Log
All notable changes to this project will be documented in this file.

## [v0.21] - 2021-06-16

1e430f3 additional check in setting maxUnavailable (#255)

## [v0.20] - 2021-06-14

3c70352 Upgrade to go 1.16 (#184)
42cf8d4 Improve warning logs (#242)
fce274e fix: status update failing due to unomitted fields (#251)

## [v0.19] - 2021-05-18

5860eb7 #2285: renamed some methods related to metrics (#223)
4da0472 feat: add go-env support (#207)
cc1d2f6 fix: documentation and minor naming changes (#208)
3c26418 fix: log testenv errors (#209)
67cbab6 fix: logging improvements (#211)
708d88e fix: documentation improvements (#210)
95e28cd #2285: updated metric namespace for consistency with others (#219)
30b9ad9 #2285: added CR status metrics (#217)
51e3ff8 Issue 2108: step duration metrics to v1 (#216)
a1af90b Cache the ASG before nodes are rotated in a loop (#212)

## [v0.18] - 2021-03-23

8b2d320 Fix for Launch definition validation. Consider only the "InService" instances. (#197)
42f810c Fail the CR for drain failures, when IgnoreDrainFailures isn't set. (#185)
f5c9457 output can contain other messages from API Server, so be more relaxed (#174)
391b2fb Expose template list and other execution errors to logs (#166)
757b669 Bump golang and busybox (#172)
b8f69e8 Add instance id to the logs (#173)
ac7be6b Fix namespaced name order (#170)
51f469d use standard fmt.Errorf to format error message; unify error format (#171)
b552c69 Bump dependencies. (#169)
36a2784 Remove separate module for pkg/log (#168)
237f93d Move constants to types so that they can be reused (#167)
## [v1.0.4] - 2021-10-04
995b81b controller flags for ignoreDrainFailures and drainTimeout (#307)


## [v1.0.3] - 2021-09-03
6252725 revert #300 (#305)
df08ab0 Set Instances to StandBy in batches (#303)
e77431c fix: fix panic when using MixedInstancesPolicy (#298)
1e6d29d Add ignoreDrainFailure and DrainTimeout as controller arguements (#300)


## [v1.0.2] - 2021-08-05
d73da1b replace launchTemplate latest string with version number (#296)

## [v1.0.1] - 2021-08-05
52d80d9 check for ASG's launch template version instead latest. (#293)
c35445d Controller v2: fix BDD template and update Dockerfile with bash (#292)
db54e0b Controller v2: fix BDD template (#291)
b698dd6 Controller v2: remove cleaning up ruObject as BDD already does. (#290)
86412d5 Controller v2: increase memory/CPU limit and update args (#289)
2d8651c Controller v2: update args (#288)
835fd0d V2 bdd (#286)
998de0d V2 bdd (#285)
3841cc7 #2122: bdd changes for v2 (#284)
93626b4 Controller v2: BDD cron update (#283)
1be8190 Controller v2: BDD cron update (#282)
62c2255 Controller v2: BDD cron update (#280)
42abe52 Controller v2: BDD cron update (#279)
5bdc134 Controller v2 bdd changes (#278)

## [v1.0.0] - 2021-07-21
7a4766d (HEAD -> controller-v2, origin/controller-v2) upgrade-manager-v2: Add CI github action, fix lint errors. (#276)
00f7e89 upgrade-manager-v2: Fix unit tests (#275)
0e64929 upgrade-manager-v2: Process next batch while waiting on nodeInterval period. (#273)
b2b39a0 upgrade-manager-v2: Add nodeEvents handler instead of a watch handler (#272)
c0a163b move cloud discovery after nodeInterval / drainInterval wait (#270)
b15838e Carry the metrics status in RollingUpgrade CR (#267)
610f454 upgrade-manager-v2: remove function duplicate declaration. (#266)
a4e0e84 upgrade-manager-v2: expose totalProcessing time and other metrics (#265)
2390ea0 and CR end time (#264)
79db022 (tag: v1.0.0-RC1) Add a mock for test and update version in Makefile (#262)
3eafd00 Fix metrics calculation issue (#258)
376657f Revert "Fix metrics collecting issue (#249)" (#256)
f5dd1cb Fix metrics collecting issue (#249)
066731d final push before RC release. (#254)
18e0e75 upgrade-manager-v2: Load test fixes (#245)
1fc5847 metricsMutex should be initialized (#240)
a9ac50f add missing parenthesis (#239)
6fef5fd V2 controller metrics concurrency fix (#231)
a490333 upgrade-manager-v2: Move DrainManager back to Reconciler (#236)
b659e0f Resolve compile errors caused by merge conflict. (#235)
b664fdd Create RollingUpgradeContext (#234)
b8d0e72 #2286: removed version from metric namespace (#227)
c445af9 #2285: renamed some methods related to metrics (#224)
1f0f075 #2285: rollup CR statistic metrics in v2 (#218)
d5935e3 Unit tests for controller-v2 (#215)
665c64b Fix bug in deleting the entry in syncMap (#203)
77f985c Ignore generated code (#201)
71b310a Refine metrics implementation to support goroutines (#196)
668c5d8 Move the DrainManager within ReplaceBatch(), to access one per RollingUpgrade CR (#195)
728dae9 Process the batch rotation in parallel (#192)
14e950e Metrics features (#189)
11d3ae6 Eager mode implementation (#183)
57df5a5 Implemented node drain. (#181)
dd6a332 Migrate Script Runner (#179)
2c1d8e7 Controller v2: Implementation of Instance termination (#178)
7cb15b0 Fix all the "make vet" errors in Controller V2 branch. (#177)
59e9b0d Implemented RollingUpgrade object validation. (#176)
5cb9efb initial rotation logic
6b8dad5 AWS API calls & Drift detection
335fb4f aws API calls
41bd571 Add kubernetes API calls
8f33f1e add more scaffolding
25644a6 initial code
87afbd6 add API
2816490 scaffolding
3ad13b8 delete all
6ce7953 Delete README.md

## [v0.17] - 2020-12-11

22 changes: 9 additions & 13 deletions .github/CONTRIBUTING.md
@@ -1,18 +1,14 @@
# How to contribute

## Development
- Open an issue and discuss the feature / bug you want
- Fork (<https://github.com/keikoproj/upgrade-manager/fork>)
- Clone your fork
- Install [kubebuilder](https://book.kubebuilder.io/quick-start.html)
- `make test` to ensure everything is working
- Create your feature branch (`git checkout -b feature/fooBar`)
- Implement your change and add tests
- `make test` to ensure everything is working
- Commit your changes (`git commit -am 'Add some fooBar'`)
- Add a "Testing Done" section in the commit message. Be as explicit as possible about all the manual and automated tests performed.
- Push to the branch (`git push origin feature/fooBar`)
- Create a new Pull Request
1. Fork it (<https://github.com/keikoproj/upgrade-manager/fork>)
2. Add your fork as a git remote named, say `reviews`.
2. Open an issue and discuss the feature / bug
3. Create your feature branch in your fork (`git checkout -b feature/fooBar`)
4. Commit your changes (`git commit -am 'Add some fooBar'`)
5. Add a "Testing Done" section in the commit message. Be as explicit as possible about all the manual and automated tests performed.
6. Push to the branch (`git push reviews feature/fooBar`)
7. Make sure unit tests and any static-analysis (linting) tests are passing
8. Create a new Pull Request

## How to report a bug

8 changes: 5 additions & 3 deletions .github/workflows/bdd.yaml
@@ -1,8 +1,9 @@
name: BDD

on:
schedule:
- cron: '0 7 * * *' # UTC is being used, 07:00 am would be 12:00am in PT
push:
branches:
- master

jobs:
build:
@@ -41,4 +42,5 @@ jobs:
$HOME/go/bin/godog
- name: Cleanup
run: kubectl delete deployment upgrade-manager-controller-manager -n upgrade-manager-system
run: |
kubectl delete deployment upgrade-manager-controller-manager -n upgrade-manager-system
2 changes: 1 addition & 1 deletion .github/workflows/ci.yaml
@@ -14,6 +14,7 @@ on:
jobs:
build:
name: CI # Lint, Test, Codecov, Docker build & Push
if: github.repository == 'keikoproj/upgrade-manager'
runs-on: ubuntu-latest
steps:

@@ -39,7 +40,6 @@ jobs:
mv kubebuilder_${version}_linux_${arch} kubebuilder && sudo mv kubebuilder /usr/local/
# update your PATH to include /usr/local/kubebuilder/bin
export PATH=$PATH:/usr/local/kubebuilder/bin
- name: Run Tests
run: make test

3 changes: 2 additions & 1 deletion .gitignore
@@ -12,10 +12,11 @@ bin

# Output of the go coverage tool, specifically when used with LiteIDE
*.out
coverage.txt

# Kubernetes Generated files - skip generated files, except for vendored files

!vendor/**/zz_generated.*
!api/**/zz_generated.*

# editor and IDE paraphernalia
.idea
1 change: 0 additions & 1 deletion .go-version

This file was deleted.

13 changes: 4 additions & 9 deletions Dockerfile
@@ -1,13 +1,12 @@
# Build the manager binary
FROM golang:1.16 as builder
FROM golang:1.15 as builder

WORKDIR /workspace
# Copy the Go Modules manifests
COPY go.mod go.mod
COPY go.sum go.sum
# cache deps before building and copying source so that we don't need to re-download as much
# and so that source changes don't invalidate our downloaded layer
COPY pkg pkg
RUN go mod download

# Copy the go source
@@ -18,19 +17,15 @@ COPY controllers/ controllers/
# Build
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 GO111MODULE=on go build -a -o manager main.go

# Add kubectl
RUN curl -L https://storage.googleapis.com/kubernetes-release/release/v1.14.10/bin/linux/amd64/kubectl -o /usr/local/bin/kubectl
RUN chmod +x /usr/local/bin/kubectl

# Add busybox
FROM busybox:1.32.1 as shelladder

# Use distroless as minimal base image to package the manager binary
# Refer to https://github.com/GoogleContainerTools/distroless for more details
FROM gcr.io/distroless/static:latest
FROM gcr.io/distroless/static:nonroot
WORKDIR /
COPY --from=builder /workspace/manager .
USER 65532:65532

COPY --from=shelladder /bin/sh /bin/sh
COPY --from=builder /workspace/manager .
COPY --from=builder /usr/local/bin/kubectl /usr/local/bin/kubectl
ENTRYPOINT ["/manager"]

0 comments on commit 167e10b
