Release Notes
3.7.0
New Features
CLOUD-3025: Added optional PodDisruptionBudgets (PDBs) for Cortex, Search Slave, and Integration to reduce outages during nodegroup rebuilds and other drain operations. PDBs are enabled by default for Cortex and Search Slave, and are opt-in for Integration. The rebuild-nodegroups.sh drain script is also hardened with per-node pacing and a 30-minute timeout. For full details, see Auto Scaling and Replicas.
CLOUD-3275: Added the generate-support-report Jenkins job, which produces a report.zip artifact containing Terraform workspace state, ECR repository information, and Kubernetes diagnostics to assist with infrastructure troubleshooting and support.
CLOUD-3333: Changed the default AWS database instance type from db.r5.xlarge to the more modern and cost-effective db.r6g.xlarge. The default value of the dbInstanceClass parameter in the create-and-manage-database-server Jenkins job is now db.r6g.xlarge. New database instances created with this job default to the new instance type unless the parameter is changed. You can update existing instances to the new type by rerunning the job with the updated dbInstanceClass parameter; changes take effect during the configured AWS maintenance window.
CLOUD-3557: Added support for deploying a bastion instance in a private subnet. Previously, bastions were deployed only in public subnets. This is controlled by the new subnetType parameter in the create-and-manage-bastion-instance Jenkins job.
CLOUD-3580: Updated the Jenkins server image version to 2.541.2-lts-jdk21 and the Jenkins Helm chart version to 5.8.142.
CLOUD-3622: Replaced the legacy New Relic agent deployment with the officially supported nri-bundle Helm chart, and updated the default New Relic version to 6.0.40. The agent continues to be controlled by the existing TF_VAR_enable_new_relic_agent and TF_VAR_enable_new_relic_k8s_data parameters. For more information about the New Relic Infrastructure (NRI) Helm chart deployment, see nri-bundle.
CLOUD-3628: Added a file named AGENTS.md to the root of the cloud-ops-kubernetes Git project to help coding agents better understand the content, purpose, and structure of the project. For more information about AGENTS.md files, see AGENTS.md.
CLOUD-3641: Updated CloudOps for Kubernetes to provision and operate Amazon EKS 1.35. The update includes compatible Kubernetes add-ons and bootstrap tools: eksctl, kubectl, Helm, Cluster Autoscaler, kube-proxy, CoreDNS, Metrics Server, Amazon Elastic File System Container Storage Interface Driver, kube-state-metrics, and the EKS pause image. For upgrade details, see Update to Version 3.7.
CLOUD-3684: Added a new page about granting additional AWS users and roles access to view the cluster in the Amazon EKS console and to run kubectl commands. See Grant Users and Roles Access to EKS.
CLOUD-3688: Added an Eclipse P2 caching proxy service to improve Self-Managed Commerce build stability by drastically reducing external Eclipse downloads across builds. It is enabled by default but can be disabled if desired. For more information, see Eclipse P2 Caching Proxy.
CLOUD-3694: Added AWS Secrets Manager support for Jenkins credentials. This helps preserve customer-managed credentials across CloudOps for Kubernetes upgrades, including credentials added through the Jenkins web interface. For upgrade details, see Update to Version 3.7. For credential management procedures, see Manage Jenkins Credentials.
CLOUD-3716: Updated the Terraform version from 0.14.11 to 1.8.2. Updated the Terraform provider version constraints used by the bundled modules to current supported versions, including AWS ~> 5.0, Kubernetes ~> 2.29, and Helm ~> 2.12, and included compatibility fixes required by the newer providers. No additional upgrade tasks are required beyond the normal CloudOps for Kubernetes upgrade process. If your team runs Terraform directly against the CloudOps for Kubernetes code, ensure that you use the Terraform version required by the checked-out release.
CLOUD-3724: Added support for SAML integration with Jenkins, so teams can use single sign-on (SSO) to control Jenkins authentication and authorization. This has several advantages over the Jenkins native login features: on-boarding and off-boarding of team members is easier, and user credentials are not lost when Jenkins is reconfigured from infrastructure code. For more information, see Jenkins SAML SSO.
CLOUD-3759: Added diagnostic commands to the bootstrap/scripts/rebuild-nodegroups.sh script to provide additional information about node and pod states during EKS node group rebuild maintenance.
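The provider constraints listed under CLOUD-3716 use Terraform's pessimistic constraint operator (~>), which lets only the rightmost version component increase. The following is a minimal Python sketch of that rule, not Terraform's own implementation: the function names are hypothetical, the candidate versions in the usage lines are illustrative only, and pre-release suffixes are not handled.

```python
def _parse(version: str) -> tuple:
    """Turn '5.31.0' into (5, 31, 0)."""
    return tuple(int(part) for part in version.split("."))


def _pad(parts: tuple, length: int) -> tuple:
    """Right-pad with zeros so tuples of different lengths compare correctly."""
    return parts + (0,) * (length - len(parts))


def satisfies_pessimistic(version: str, constraint: str) -> bool:
    """Return True if `version` satisfies a '~> X.Y' or '~> X.Y.Z' constraint.

    Hypothetical helper for illustration. '~> 5.0' is treated as
    '>= 5.0, < 6.0' and '~> 1.8.2' as '>= 1.8.2, < 1.9.0'.
    """
    lower = _parse(constraint.replace("~>", "").strip())
    if len(lower) < 2:
        raise ValueError("this sketch expects at least two version components")
    # The upper bound drops the last component and bumps the one before it.
    upper = lower[:-2] + (lower[-2] + 1,)
    candidate = _parse(version)
    width = max(len(candidate), len(lower))
    return _pad(lower, width) <= _pad(candidate, width) < _pad(upper, width)


# Illustrative checks against the constraints named in CLOUD-3716.
print(satisfies_pessimistic("5.31.0", "~> 5.0"))   # within the 5.x range
print(satisfies_pessimistic("6.0.0", "~> 5.0"))    # next major is excluded
print(satisfies_pessimistic("2.28.1", "~> 2.29"))  # below the lower bound
```

Under this rule, a constraint such as ~> 5.0 accepts any 5.x provider release but rejects 6.0.0, so minor provider updates can be picked up without crossing a major version.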
Bug Fixes
CLOUD-3206: Updated the bootstrap container's entrypoint.sh script to always clear the local Terraform state that manages the backend S3 bucket. This avoids error messages such as "kubernetes-bootstrap | Error: error deleting S3 Bucket (bucketname): BucketNotEmpty: The bucket you tried to delete is not empty" that might occur when running docker-compose up.
CLOUD-3595: Added validation to the commerce-branch-validation Jenkins job so missing branch parameters fail fast with clear error messages.
CLOUD-3639: Switched the HAProxy Ingress implementation from a Kubernetes deployment to a daemonset, to improve availability and uptime when HAProxy Ingress is restarted.
CLOUD-3623: Updated the multi-purpose-commerce-tool Jenkins job to use a dedicated activeMqResourcingProfile parameter for ActiveMQ deployments, separate from Commerce stack resourcing. Added early validation to ensure the selected ActiveMQ profile exists in terraform/activemq/env-file, with clear error messaging when the profile is missing.
CLOUD-3661: Removed legacy ActiveMQ HA deployment behavior and updated ActiveMQ deployments to use the Kubernetes Recreate strategy during updates. This prevents a second broker pod from being started during rollout, which avoids KahaDB lock contention.
CLOUD-3690: Changed the HAProxy Ingress default timeouts from fifty seconds (50s) to seventy seconds (70s), to address "504 (Gateway Timeout)" issues reported in the Commerce Manager application.
CLOUD-3695: Removed American National Standards Institute (ANSI) colour codes from bootstrap console output. The ANSI colour codes were adding extra printed characters to console output when it was redirected to files.
CLOUD-3736: Corrected an issue in the database Terraform modules where, in a specific scenario, Terraform would mark the existing in-use Key Management Service (KMS) key for deletion. This would only occur if the user reran the create-and-manage-database-server job and manually specified the existing Terraform-managed KMS key in the encryptionKey parameter.
CLOUD-3750: Fixed issues in the bootstrap/scripts/remove-nodegroups.sh script that intermittently caused node group CloudFormation stack deletion to fail due to security group associations.
CLOUD-3758: Fixed an issue where some pods in the modsecurity-spoa DaemonSet might not start up correctly after node group changes. The DaemonSet is now assigned an elevated PriorityClass.
CLOUD-3775: Removed the unnecessary ReservedCodeCacheSize MAVEN_OPTS setting from the Maven Java build jobs, for compatibility with Self-Managed Commerce release 8.8.x.
CLOUD-3776: Specified all Jenkins plugins to be installed, including the specific versions, to ensure that compatible plugins are always selected and installed.
SUP-6337: Updated the Jenkins controller Java memory configuration so the maximum heap is set at 60% of available memory instead of 87.5%. This makes more memory available to non-heap components to avoid Kubernetes OOMKilled issues.
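To make the effect of SUP-6337 concrete, the arithmetic can be sketched in Python. This is an illustration only: the 4096 MiB memory figure and the function name are assumptions chosen for the example, not values taken from the Jenkins controller configuration.

```python
def heap_split_mib(available_mib: int, heap_permille: int) -> tuple:
    """Split available memory into (max heap, remainder for non-heap use).

    Hypothetical helper. `heap_permille` is the heap share in tenths of a
    percent (875 means 87.5%), so the calculation stays in integers.
    """
    heap = available_mib * heap_permille // 1000
    return heap, available_mib - heap


# Assumed example: a controller with 4096 MiB of available memory.
AVAILABLE_MIB = 4096

old_heap, old_rest = heap_split_mib(AVAILABLE_MIB, 875)  # previous 87.5% setting
new_heap, new_rest = heap_split_mib(AVAILABLE_MIB, 600)  # new 60% setting

print(f"87.5%: heap {old_heap} MiB, non-heap {old_rest} MiB")
print(f"60%:   heap {new_heap} MiB, non-heap {new_rest} MiB")
```

With the assumed 4096 MiB, the non-heap remainder grows from 512 MiB to 1639 MiB, which is the extra room for metaspace, thread stacks, and other off-heap memory that the change is intended to provide.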
Deprecations & Removals
CLOUD-3656: Removed code that was used to create and manage some remaining Account Management API resources. Support for Account Management API ended with CloudOps for Kubernetes release 3.1.
CLOUD-3611: Removed the Alert Logic daemonset from CloudOps for Kubernetes. Alert Logic support was deprecated in CloudOps for Kubernetes release 2.14.
See Deprecations and Removals.
Upgrade Instructions
For upgrade instructions, see Upgrading CloudOps for Kubernetes.