Update to Version 3.7
Prerequisites
CloudOps for Kubernetes 3.7.x has the following prerequisites:
- CloudOps for Kubernetes 3.5.x
Important: You can skip up to one CloudOps for Kubernetes release when upgrading.
For example, if you are using CloudOps for Kubernetes v3.5.x, you can update directly to v3.7.x without first upgrading to v3.6.x. This follows a new upgrade policy, effective as of release 3.7.x, that allows skipping one release.
Supported upgrade paths:
- Upgrade from your current 3.5.x patch level to 3.7.x
- Upgrade from your current 3.6.x patch level to 3.7.x
Complete the upgrade to CloudOps for Kubernetes release 3.5.x or higher before upgrading to release 3.7.x.
- For instructions on upgrading to release 3.6.x, see Update to Version 3.6.
- For instructions on upgrading to release 3.5.x, see Update to Version 3.5.
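The skip-one-release rule above can be expressed as a small check. This is a minimal sketch, not an official tool: the function names are hypothetical, and it assumes release numbers follow a major.minor.patch pattern with consecutive minor releases.

```python
import re


def parse_release(version: str) -> tuple[int, int]:
    """Return (major, minor) from strings such as '3.5.2', 'v3.6.x' or '3.7'."""
    match = re.match(r"v?(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def can_upgrade_directly(current: str, target: str = "3.7.x") -> bool:
    """True if `current` is one or two minor releases behind `target`.

    Assumption: skipping "up to one release" means the gap between minor
    versions may be 1 (no skip) or 2 (one release skipped).
    """
    cur_major, cur_minor = parse_release(current)
    tgt_major, tgt_minor = parse_release(target)
    if cur_major != tgt_major:
        return False
    return 1 <= tgt_minor - cur_minor <= 2
```

Under these assumptions, `can_upgrade_directly("3.5.4")` and `can_upgrade_directly("v3.6.x")` are both true, while a 3.4.x cluster would first need an intermediate upgrade.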
What Has Changed
For a list of the changes in the release, see Release Notes.
Changes that impact the upgrade process
CloudOps for Kubernetes release 3.7.x contains the following changes that impact the upgrade process.
- The Kubernetes version is updated to 1.35. Additional upgrade steps due to this change are noted in the section Additional Upgrade Steps below.
- The Terraform version is updated from 0.14.11 to 1.8.2. Additional upgrade steps due to this change are noted below.
- PodDisruptionBudget resources will be created for ep-cortex and ep-searchslave by default (and optionally for ep-integration) in every Commerce environment, during the next Terraform apply. These budgets protect Self-Managed Commerce availability during node-group rebuilds and other node-drain events by ensuring at least one replica of each service remains schedulable while pods are being evicted. Twelve new Terraform variables control this behaviour. The stock dev.tfvars and prod-small.tfvars profiles already declare all twelve variables with appropriate values.
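To illustrate how a budget limits evictions during a node drain, the sketch below mirrors the arithmetic Kubernetes uses for a maxUnavailable policy, as I understand it: the number of allowed disruptions is the count of healthy pods minus the number that must stay healthy. The function name and the example replica counts are hypothetical and are not taken from the shipped profiles.

```python
def disruptions_allowed(expected_pods: int, healthy_pods: int, max_unavailable: int) -> int:
    """Evictions a maxUnavailable budget would currently permit.

    expected_pods:   replicas the workload is supposed to run
    healthy_pods:    replicas that are ready right now
    max_unavailable: the budget's maxUnavailable value (absolute number)
    """
    # Pods that must remain healthy for the budget to be satisfied.
    desired_healthy = max(expected_pods - max_unavailable, 0)
    # Only the surplus above that floor may be evicted.
    return max(healthy_pods - desired_healthy, 0)


# Two replicas, both healthy, maxUnavailable 1: one pod may be evicted.
first_eviction = disruptions_allowed(expected_pods=2, healthy_pods=2, max_unavailable=1)

# While that pod is being rescheduled only one is healthy, so the second
# eviction has to wait until the replacement is ready.
second_eviction = disruptions_allowed(expected_pods=2, healthy_pods=1, max_unavailable=1)
```

With these inputs the first call yields 1 and the second yields 0, which is why evictions proceed one pod at a time under a maxUnavailable value of 1.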
Additional Upgrade Steps
- If you created custom Terraform code or Terraform modules in your CloudOps for Kubernetes project or in any Extensions projects, review that code before starting the upgrade. Ensure that your custom Terraform modules are compatible with Terraform 1.8.2. All out-of-the-box Terraform modules have been properly updated.
- Review any additional tools that you may have installed in the Kubernetes cluster, such as monitoring or security tools, and ensure that they are compatible with Kubernetes 1.35. If they are not compatible, plan to update them as part of the upgrade process.
- Rebuild the Kubernetes nodegroups, as documented in Upgrading CloudOps for Kubernetes, as part of the upgrade process. This is required for the Kubernetes 1.35 upgrade.
- In docker-compose.override.yml, set TF_VAR_rebuild_nodegroups to "true" for this upgrade:
  TF_VAR_rebuild_nodegroups: "true"
  This replaces the existing EKS node groups with node groups that use the Kubernetes 1.35-compatible Amazon Machine Image (AMI).
- If you have been using a custom Amazon Machine Image (AMI) for your Kubernetes nodes, ensure it is compatible with Kubernetes 1.35. One way to determine if you have been using a custom AMI is to view your docker-compose.override.yml file and see if a value was specified for parameter aws_eks_ami.
- If you have custom resourcing profile files in terraform/commerce-stack/env-file/, add the twelve new EP_*_PDB_* variables to each file.
  Important: This step is especially important for deployments with EP_CORTEX_MIN_REPLICAS or EP_SEARCH_SLAVE_MIN_REPLICAS set to 2 or higher. Without explicit values, Terraform falls back to the module default of maxUnavailable: 1. This serialises pod evictions during nodegroup rebuilds and extends the drain window and the CloudOps for Kubernetes upgrade. See Tuning Pod Disruption Budgets for guidance on choosing the right policy for your environment.
- Optional and recommended: after the upgrade completes, and after updating any custom resourcing profile files as noted above, run the deploy-or-delete-commerce-stack job, or a pipeline that calls it, to create the PodDisruptionBudget resources. Do this for each of your Commerce environments to improve their uptime and availability during future maintenance activities. For information about PodDisruptionBudget resources, see Auto Scaling and Replicas.
- In docker-compose.override.yml, set TF_VAR_jenkins_overwrite_plugins to "true" for this upgrade:
  TF_VAR_jenkins_overwrite_plugins: "true"
  This is required to update Jenkins for both the AWS Secrets Manager credentials integration and the SAML integration.
- Optional and recommended: after the upgrade completes, review Jenkins Git and other credentials and decide which credentials should move to AWS Secrets Manager. For procedures to add Secrets Manager credentials, see Manage Jenkins Credentials.
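Several of the steps above come down to checking file contents, so a small script can serve as a pre-upgrade checklist. The following is a rough sketch only: the helper names are hypothetical, it uses a simple line-based scan rather than a YAML or HCL parser, and it assumes settings appear as `KEY: value` or `KEY = value` lines. It flags the two override settings that should be "true", reports any key ending in aws_eks_ami that has a value, and lists EP_*_PDB_* variables present in a stock profile but absent from a custom one.

```python
import re

# Settings this page asks you to set to "true" for the 3.7.x upgrade.
REQUIRED_TRUE = ("TF_VAR_rebuild_nodegroups", "TF_VAR_jenkins_overwrite_plugins")

# Matches `KEY: value` or `KEY = value`, optionally written as a YAML list item.
_SETTING = re.compile(r"^\s*(?:-\s*)?([A-Za-z_][A-Za-z0-9_]*)\s*[:=]\s*(.*?)\s*$")


def read_settings(text: str) -> dict[str, str]:
    """Collect key/value pairs from simple `KEY: value` or `KEY = value` lines."""
    settings = {}
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip whole-line comments
        match = _SETTING.match(line)
        if match:
            key, value = match.groups()
            value = value.split(" #", 1)[0].strip()  # drop trailing comments
            settings[key] = value.strip("\"'")
    return settings


def check_override(text: str) -> list[str]:
    """Return findings for the contents of docker-compose.override.yml."""
    settings = read_settings(text)
    findings = []
    for name in REQUIRED_TRUE:
        if settings.get(name, "").lower() != "true":
            findings.append(f'{name} is not set to "true"')
    for key, value in settings.items():
        # Assumption: the AMI parameter key ends with "aws_eks_ami".
        if key.lower().endswith("aws_eks_ami") and value:
            findings.append(
                f"custom AMI configured ({key}={value}); "
                "confirm Kubernetes 1.35 compatibility"
            )
    return findings


def missing_pdb_variables(stock_profile: str, custom_profile: str) -> list[str]:
    """EP_*_PDB_* variables declared in the stock profile but not the custom one."""
    pdb_name = re.compile(r"^EP_\w*_PDB_\w+$")
    stock = {key for key in read_settings(stock_profile) if pdb_name.match(key)}
    custom = set(read_settings(custom_profile))
    return sorted(stock - custom)
```

A typical use would be to read docker-compose.override.yml into a string and pass it to `check_override`, then compare each custom profile against the stock dev.tfvars with `missing_pdb_variables`. An empty result from either function means the scan found nothing to fix, not that the configuration is guaranteed correct.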
Preparation
- Review all upgrade steps and requirements described in Upgrading CloudOps for Kubernetes.
- Document an upgrade plan with all steps that you will need to perform to complete the upgrade. Include validation steps in the upgrade plan. Validation should include the functionality of Jenkins jobs, Commerce services, and any custom tools.
- Suggested: practice and refine the upgrade plan in an AWS account not used by your team or by the business. For example, obtain an empty AWS account, set up CloudOps for Kubernetes release 3.6.x, deploy Commerce services, and then follow your upgrade plan.
- Identify a time to apply the CloudOps for Kubernetes update to your production cluster. As part of the upgrade process you will need to rebuild the nodegroups because of the Kubernetes version change. There will likely be a Self-Managed Commerce service outage while the EKS node groups are rebuilt. In our testing, outages of up to 15 minutes have been observed.
- Review your custom Jenkins jobs to determine whether any call the modified CloudOps for Kubernetes jobs listed in Jenkins job parameter changes. If they do, update your custom jobs as part of the upgrade so they continue to work with release 3.7.x.
Jenkins job parameter changes
Several Jenkins jobs have parameter changes in CloudOps for Kubernetes 3.7.0, as listed below. If you have custom jobs that call or invoke the affected jobs, update them where appropriate.
| Jenkins job | Added parameters | Removed parameters |
|---|---|---|
| create-and-manage-bastion-instance | subnetType | None |
| multi-purpose-commerce-tool | activeMqResourcingProfile | None |
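One way to find custom jobs that may need updating is to search their definitions for the affected job names. The sketch below is illustrative only: the function name is hypothetical, it assumes you have already loaded your custom job or pipeline definitions as text, and a plain text match can report false positives such as a job name that appears only in a comment.

```python
import re

# Affected jobs and their added parameters, copied from the table above.
AFFECTED_JOBS = {
    "create-and-manage-bastion-instance": ["subnetType"],
    "multi-purpose-commerce-tool": ["activeMqResourcingProfile"],
}


def find_affected_calls(sources: dict[str, str]) -> dict[str, list[str]]:
    """Map each source name to the affected job names mentioned in its text."""
    hits = {}
    for name, text in sources.items():
        found = [
            job
            for job in AFFECTED_JOBS
            # Require the job name to stand alone, not be part of a longer name.
            if re.search(rf"(?<![\w-]){re.escape(job)}(?![\w-])", text)
        ]
        if found:
            hits[name] = found
    return hits
```

For each hit, look up the job in `AFFECTED_JOBS` to see which added parameters your custom job may need to pass.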
Perform the Update
Follow the general instructions on updating the cluster in the Upgrading CloudOps for Kubernetes documentation.