Overview
This page contains the list of deprecations and important or breaking changes for Vault 1.13.x compared to 1.12. Please read it carefully.
Changes
Consul dataplane compatibility
If you are using Consul on Kubernetes, please be aware that upgrading to Consul 1.14.0 will impact Consul secrets, storage, and service registration. As of Consul 1.14.0, Consul on Kubernetes uses Consul Dataplane by default instead of client agents. Vault does not currently support Consul Dataplane. Please follow the Consul 1.14.0 upgrade guide to ensure that your Consul on Kubernetes deployment continues to use client agents.
User lockout
As of version 1.13, Vault will stop trying to validate user credentials if the user submits multiple invalid credentials in quick succession. During lockout, Vault ignores requests from the barred user rather than responding with a permission denied error.
User lockout is enabled by default with a lockout threshold of 5 attempts, a lockout duration of 15 minutes, and a counter reset window of 15 minutes.
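Lockout behaviour can be tuned, or disabled, per auth method in the Vault server configuration. A minimal sketch with illustrative values for the userpass method:

```hcl
# Illustrative values; lockout_threshold, lockout_duration,
# lockout_counter_reset, and disable_lockout may each be set per
# supported auth method (or for "all").
user_lockout "userpass" {
  lockout_threshold     = "10"
  lockout_duration      = "30m"
  lockout_counter_reset = "30m"
}
```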
For more information, refer to the User lockout overview.
Active directory secrets engine deprecation
The Active Directory (AD) secrets engine has been deprecated as of the Vault 1.13 release. We will continue to support the AD secrets engine in maintenance mode for six major Vault releases. Maintenance mode means that we will fix bugs and security issues but will not add new features. For additional information, see the deprecation table and migration guide.
AliCloud auth role parameter
The AliCloud auth plugin will now require the role parameter on login. This has always been documented as a required field, but the requirement will now be enforced.
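For illustration, a login request with the now-enforced parameter might look like the following sketch; the role name is a placeholder, and the identity request values are base64-encoded pieces of a signed Alibaba Cloud STS GetCallerIdentity request:

```shell
# role is now required on login; the other values are placeholders
# produced by signing an Alibaba Cloud STS GetCallerIdentity request.
vault write auth/alicloud/login \
    role="dev-role" \
    identity_request_url="$IDENTITY_REQUEST_URL_B64" \
    identity_request_headers="$IDENTITY_REQUEST_HEADERS_B64"
```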
Mounts associated with removed builtin plugins will result in core shutdown on upgrade
As of 1.13.0, Standalone (logical) DB Engines and the AppId Auth Method have been marked with the Removed status. Any attempt to unseal Vault with mounts backed by one of these builtin plugins will result in an immediate shutdown of the Vault core.
NOTE: If an external plugin with the same name and type as a deprecated builtin is deregistered, any subsequent unseal will still succeed, but with an unusable auth backend and a corresponding ERROR log.
The remediation for affected mounts is to downgrade to the previously-used version of Vault and replace any Removed feature with the preferred alternative feature.
For more information on the phases of deprecation, see the Deprecation Notices FAQ.
Impacted versions
Affects upgrading from any version of Vault to 1.13.x. All other upgrade paths are unaffected.
Known issues
Rotation configuration persistence issue could lose transform tokenization key versions
A rotation performed manually, or via automatic time-based rotation after a restart or leadership change of Vault, can result in the loss of intermediate key versions if the rotation configuration was changed after the tokenization transform was initially configured. Tokenized values from those key versions would not be decodeable. We recommend that customers who have enabled automatic rotation disable it, and that other customers avoid key rotation, until the upcoming fix is released.
Affected versions
This issue affects Vault Enterprise with ADP versions 1.10.x and higher. A fix will be released in Vault 1.11.9, 1.12.5, and 1.13.1.
PKI OCSP GET requests can return HTTP redirect responses
If a base64-encoded OCSP request contains consecutive '/' characters, the GET request will return a 301 permanent redirect response. If the redirect is followed, the request will fail to decode, since the redirected path is no longer a properly base64-encoded request.
As a workaround, use OCSP POST requests, which are unaffected.
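For example, a POST-based query can be issued with curl; the hostname and pki mount path below are illustrative:

```shell
# Assumes a DER-encoded OCSP request was written to ocsp-req.der, e.g.:
#   openssl ocsp -issuer ca.pem -cert cert.pem -reqout ocsp-req.der
curl --silent \
    --header "Content-Type: application/ocsp-request" \
    --data-binary @ocsp-req.der \
    https://vault.example.com/v1/pki/ocsp > ocsp-resp.der
```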
Impacted versions
Affects all current versions of 1.12.x and 1.13.x.
PKI revocation request forwarding
If a revocation request for a certificate that is present locally arrives at a standby or performance secondary node, the request will not be correctly forwarded to the active node of the cluster.
As a workaround, submit revocation requests to the active node only.
STS credentials do not return a lease_duration
Vault 1.13.0 introduced a change to the AWS secrets engine such that it no longer creates leases for STS credentials, because they cannot be revoked or renewed. As part of this change, a bug was introduced which causes lease_duration to always return zero. This prevents the Vault Agent from refreshing STS credentials and may introduce undesired behaviour for anything which relies on a non-zero lease_duration.
For applications that can control what value to look for, the ttl value in the response can be used to know when to request STS credentials next.
An additional workaround for users rendering STS credentials via the Vault Agent is to set static-secret-render-interval for a template using the credentials. Setting this configuration to 15 minutes accommodates the default minimum duration of an STS token and overrides the default render interval of 5 minutes.
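A minimal sketch of the Vault Agent configuration for this workaround; the secret path and destination file are illustrative:

```hcl
# Re-render static (non-leased) secrets every 15 minutes instead of
# the default 5 minutes.
template_config {
  static_secret_render_interval = "15m"
}

template {
  contents    = "{{ with secret \"aws/sts/my-role\" }}{{ .Data.access_key }}{{ end }}"
  destination = "/etc/secrets/aws-access-key"
}
```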
Impacted versions
Affects Vault 1.13.0 only.
LDAP pagination issue
There was a regression introduced in 1.13.2 relating to LDAP maximum page sizes, resulting in the error no LDAP groups found in groupDN [...] only policies from locally-defined groups available. The issue occurs when upgrading Vault on an instance that has an existing LDAP auth configuration.
As a workaround, disable paged searching using the following:
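```shell
# Assumes the LDAP auth method is mounted at the default path; a
# max_page_size of -1 disables paged searching.
vault write auth/ldap/config max_page_size=-1
```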
Impacted versions
Affects Vault 1.13.2.
PKI Cross-Cluster revocation requests and unified CRL/OCSP
When revoking certificates on a cluster that doesn't own the certificate, writing the revocation request will fail with a message like error persisting cross-cluster revocation request. Similar errors will appear in the log for failures to write unified CRL and unified delta CRL WAL entries.
As a workaround, submit revocation requests to the cluster which issued the certificate, or use BYOC revocation. Use cluster-local OCSP and CRLs until this is resolved.
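For reference, a BYOC (bring-your-own-cert) revocation is a write of the certificate itself to the revoke endpoint; the mount path and file name here are illustrative:

```shell
# Revoke by providing the PEM-encoded certificate rather than a serial
# number, submitted to the cluster that issued the certificate.
vault write pki/revoke certificate=@leaf-cert.pem
```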
Impacted versions
Affects Vault 1.13.0 to 1.13.2. Fixed in 1.13.3.
On upgrade, all local revocations will be synchronized between clusters; however, revocation requests are not persisted when the cross-cluster write fails, so affected requests will not be synchronized.
Slow startup time when storing PKI certificates
There was a regression introduced in 1.13.0 where Vault is slow to start because the PKI secrets engine performs a list operation on the stored certificates. If a large number of certificates are stored, this can cause long start times on active and standby nodes.
There is currently no workaround for this other than limiting the number of certificates stored in Vault, either via PKI tidy operations or by using the no_store flag for PKI roles.
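As a sketch, with an illustrative mount path and role name:

```shell
# Remove expired certificates from storage (safety_buffer guards
# against tidying certificates that expired only recently).
vault write pki/tidy tidy_cert_store=true safety_buffer=72h

# Stop storing new leaf certificates issued under an existing role.
vault patch pki/roles/short-lived-role no_store=true
```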
Impacted versions
Affects Vault 1.13.0+.
Token creation with a new entity alias could silently fail
A regression caused token creation requests under specific circumstances to be forwarded from perf standbys (Enterprise only) to the active node incorrectly. They would appear to succeed; however, no lease was created. The token would then be revoked on first use, causing a 403 error.
This only happened when all of the following conditions were met:
- the token is being created against a role
- the request specifies an entity alias which has never been used before with the same role (for example for a brand new role or a unique alias)
- the request happens to be made to a perf standby rather than the active node
Retrying token creation after the affected token is rejected would work since the entity alias has already been created.
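For reference, the affected request shape looks like the following sketch; the role and alias names are placeholders:

```shell
# First use of this alias with this role on a perf standby could yield
# a token whose lease was never created; retrying the same request
# succeeds because the entity alias then already exists.
vault token create -role="my-role" -entity-alias="app-42"
```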
Affected versions
Affects Vault 1.13.0 to 1.13.3. Fixed in 1.13.4.
update-primary can lead to data loss
It's possible to lose data from a Vault cluster given a particular configuration and sequence of steps. This page describes two paths to data loss, both associated with the use of update-primary.
Normally update-primary does not need to be used. However, there are a few cases where it's needed, e.g. when the known primary cluster addresses of a secondary don't contain any of the correct addresses. But update-primary does more than you might think: it does almost everything that enabling a secondary does, except that it doesn't wipe storage. One of the steps that it takes is to temporarily remove most of the mount table records: it removes all mount entries except for those that are managed automatically by vault, e.g. identity mounts.
This update-primary behaviour is unintended, and we'll be reworking it in an upcoming release. Once the fix lands, the changelog entry will be "Fix a race condition with update-primary that could result in data loss after a DR failover."
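For reference, an update-primary call on a performance secondary looks like the following sketch; the address is illustrative, and the token is a secondary activation token generated on the primary:

```shell
vault write sys/replication/performance/secondary/update-primary \
    token="<activation-token>" \
    primary_api_addr="https://primary.example.com:8200"
```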
update-primary with local data in shared mounts
If update-primary is done on a PR secondary with shared mounts containing local data (e.g. pki certs, approle secretids), the merkle tree on the PR secondary may get corrupted due to a timing race.
When this happens, the PR secondary still contains all the stored data, e.g. listing local certs from PKI mounts will return the correct results. However, because the merkle tree has been corrupted, a downstream DR secondary will not receive the local data, and will delete it if it already had it. If the PR secondary's DR secondary is promoted before the PR secondary is repaired, the newly promoted PR secondary will not contain the local data it ought to. If the former PR secondary is lost or destroyed, the missing data will not be recoverable other than via a snapshot restore.
Detection and remediation
If the TRACE-level log line "cleaning key in merkle tree" appears immediately after an update-primary on a PR secondary, that's an indicator that the timing race was lost and that the merkle tree may be corrupt.
Repairing the corrupt merkle tree is done by issuing a replication reindex request to the PR secondary.
If logs are no longer present (the update-primary was done some time in the past), it's probably best to reindex the PR secondary pre-emptively as a precaution.
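A reindex can be triggered with a single write (Vault Enterprise); it rebuilds the merkle tree from storage and may take some time on large datasets:

```shell
vault write -f sys/replication/reindex
```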
update-primary with "Allow" path filters
There is a further path to data loss associated with update-primary. This issue requires that the PR secondary receiving an update-primary request has an associated Allow path filter defined for it. Like the first issue, this one too has a timing aspect: the problem may or may not manifest, depending on how quickly the mount tables truncated by update-primary get repaired by replication.
At startup/unseal (and after an update-primary), Vault runs a background job that looks at the mount data it has stored and tries to delete any that doesn't belong there, based on path filters. This behaviour was introduced in 1.0.3.1 to recover from a regression that allowed for inappropriate filtering of data: we needed to ensure that any previously unfiltered data got cleaned up on secondaries that ought not have it.
If a performance secondary has an associated Allow path filter, this cleanup code can misfire during the interval between when the truncated mount tables are written by update-primary and the time when they get rewritten by replication. The cleanup code will delete the data associated with the missing mount entries. The cleanup code doesn't modify the merkle tree, and as a result this deleted data won't be discovered as missing and repaired by replication.
Detection and remediation
When the cleanup code fires, it logs the INFO-level message "deleted mistakenly stored mount entry from backend". This is a reliable indicator that the bug was hit.
If logs aren't available, the other indicator that this problem has manifested is to query the shared mount in question. The secondary won't have any of the data that the primary does, e.g. roles and configuration will be absent.
Reindexing the performance secondary will update the merkle tree to reflect the missing storage entries and allow missing shared data to be replaced by replication. However, any local data on shared mounts (such as PKI certs) will not be recoverable.
Impacted versions
Affects all current versions of Vault.
PKI storage migration revives deleted issuers
Vault 1.11 introduced Storage v1, a new storage layout that supported multiple issuers within a single mount. Bug fixes in Vault 1.11.6, 1.12.2, and 1.13.0 corrected a write-ordering issue that led to invalid CA chains. Specifically, incorrectly ordered writes could fail under load, resulting in the mount being re-migrated the next time it was loaded or in silently truncated CA chains. This collection of bug fixes introduced Storage v2.
Affected versions
Vault may incorrectly re-migrate legacy issuers created before Vault 1.11 that were migrated to Storage v1 and deleted before upgrading to a Vault version with Storage v2.
The migration fails when Vault finds managed keys associated with the legacy issuers that were removed from the managed key repository prior to the upgrade.
The migration error appears in Vault logs as:
Error during migration of PKI mount: failed to lookup public key from managed key: no managed key found with uuid
Note
Issuers created in Vault 1.11+ and direct upgrades to a Storage v2 layout are not affected. The Storage v1 upgrade bug was fixed in Vault 1.14.1, 1.13.5, and 1.12.9.
Using 'update_primary_addrs' on a demoted cluster causes Vault to panic
Affected versions
- 1.13.3, 1.13.4 & 1.14.0
Issue
If the update_primary_addrs parameter is used on a recently demoted cluster, Vault will panic because it no longer has information about the primary cluster.
Workaround
Instead of using update_primary_addrs on the recently demoted cluster, provide an activation token.
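A sketch of that flow, with an illustrative secondary id; the token is generated on the current primary and then supplied to update-primary on the demoted cluster:

```shell
# On the current primary: generate a secondary activation token.
vault write -f sys/replication/performance/primary/secondary-token id="demoted-cluster"

# On the recently demoted cluster: supply the token instead of
# update_primary_addrs.
vault write sys/replication/performance/secondary/update-primary \
    token="<wrapped-activation-token>"
```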