Pipeline caching

Azure DevOps Services

Pipeline caching can help reduce build time by reusing downloaded dependencies from previous runs, so the same files don't have to be recreated or downloaded again. This is particularly helpful when the same dependencies are downloaded at the start of every run, which is often a time-consuming process involving hundreds or thousands of network calls.

Caching is most effective when the time required to restore and save the cache is less than the time it takes to regenerate the files. However, in some cases, caching may not provide performance benefits and could even negatively impact build time. It's important to evaluate your specific scenario to determine whether caching is the right approach.

Note

Pipeline caching is not supported for Classic release pipelines.

When to use pipeline artifacts versus pipeline caching

Pipeline caching and pipeline artifacts perform similar functions but are intended for different scenarios and shouldn't be used interchangeably.

  • Use pipeline artifacts: when you need to take specific files produced by one job and share them with other jobs (and those other jobs would likely fail without them).

  • Use pipeline caching: when you want to improve build time by reusing files from previous runs (and not having those files won't impact the job's ability to run).
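
As a rough illustration of the difference, the sketch below uses a cache to speed up dependency restore and a pipeline artifact to hand build output to a later job. The job names, the artifact name (drop), the dist folder, and the build.sh and deploy.sh scripts are hypothetical placeholders, not part of any real project.

jobs:
- job: Build
  variables:
    YARN_CACHE_FOLDER: $(Pipeline.Workspace)/.yarn
  steps:
  # Cache: an optional speed-up. The job still works on a cache miss.
  - task: Cache@2
    inputs:
      key: 'yarn | "$(Agent.OS)" | yarn.lock'
      path: $(YARN_CACHE_FOLDER)
  - script: yarn --frozen-lockfile
  - script: ./build.sh
  # Artifact: the Deploy job can't do its work without these files.
  - publish: $(System.DefaultWorkingDirectory)/dist
    artifact: drop

- job: Deploy
  dependsOn: Build
  steps:
  # Downloads the artifact to $(Pipeline.Workspace)/drop.
  - download: current
    artifact: drop
  - script: ./deploy.sh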

Note

Pipeline caching and pipeline artifacts are available at no cost for all tiers (free and paid). See Artifacts storage consumption for more details.

Self-hosted agent requirements

The following executables must be located in a folder listed in the PATH environment variable. Note that these requirements apply only to self-hosted agents, as hosted agents come preinstalled with the necessary software.

Archive software / Platform    Windows        Linux       Mac
GNU Tar                        Required       Required    No
BSD Tar                        No             No          Required
7-Zip                          Recommended    No          No
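
On a self-hosted agent, one quick way to check the tar requirement is a throwaway step that calls the tool. This is only a sketch: the display name is arbitrary, and it relies on the command failing (and therefore failing the step) when no tar executable is found on the PATH.

steps:
# The version output shows whether the executable is GNU tar or bsdtar.
- script: tar --version
  displayName: Check tar availability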

Cache task: How it works

Caching is added to a pipeline by adding the Cache task to the steps section of a job.

During pipeline execution, when a cache step is encountered, the task attempts to restore the cache based on the provided inputs. If no cache is found, the step completes, and the next step in the job is executed.

Once all steps in the job have run successfully, a special "Post-job: Cache" step is automatically added and triggered for each "restore cache" step that wasn't skipped. This step is responsible for saving the cache.
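
The sequence described above can be sketched as a job with a single cache step. The key, the deps.lock file, the cached folder, and the script names are placeholders rather than recommended values.

steps:
# 1. Restore: looks up a cache for this key. On a miss, the step completes
#    and the job simply continues.
- task: Cache@2
  inputs:
    key: 'deps | "$(Agent.OS)" | deps.lock'
    path: $(Pipeline.Workspace)/deps
  displayName: Cache dependencies

# 2. The regular steps run next, with or without restored files.
- script: ./install-deps.sh
- script: ./build.sh

# 3. No explicit save step is written here. If all steps succeed, the
#    automatically added "Post-job: Cache" step takes care of saving.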

Note

Caches are immutable. Once a cache is created, its content cannot be modified.

Configure the Cache task

The Cache task has two required arguments, path and key:

  1. path: The path to the folder you want to cache. This can be an absolute or relative path. Relative paths are resolved against $(System.DefaultWorkingDirectory).

    Tip

    You can use predefined variables to store the path to the folder you want to cache. However, wildcards are not supported.

  2. key: This defines the identifier for the cache you want to restore or save. The key is composed of a combination of string values, file paths, or file patterns, with each segment separated by a | character.

    • Strings:
      A fixed value (such as the cache name or a tool name), or a value taken from an environment variable (such as the current OS or job name).

    • File paths:
      The path to a specific file whose contents will be hashed. The file must exist at the time the task is run. Any segment that resembles a file path is treated as such, so be cautious, especially when using segments containing ., as this may lead to "file doesn't exist" failures.

      Tip

      To prevent a path-like string segment from being treated as a file path, wrap it in double quotes, for example: "my.key" | $(Agent.OS) | key.file

    • File patterns:
      A comma-separated list of glob-style wildcard patterns that must match at least one file. Examples:

      • **/yarn.lock: all yarn.lock files under the sources directory.
      • */asset.json, !bin/**: all asset.json files located in a directory under the sources directory, except those in the bin directory.

The contents of any file identified by a file path or file pattern are hashed to generate a dynamic cache key. This is useful when your project has files that uniquely identify what’s being cached. For instance, files like package-lock.json, yarn.lock, Gemfile.lock, or Pipfile.lock are often referenced in a cache key, as they represent a unique set of dependencies. Relative file paths or patterns are resolved against $(System.DefaultWorkingDirectory).
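
To make the segment types concrete, here is a sketch of a key that combines all of them. The segment values (my.key and the tools/versions.txt path) and the cached folder are made up for illustration.

steps:
- task: Cache@2
  inputs:
    # "my.key"               quoted, so it's treated as a string despite the dot
    # "$(Agent.OS)"          string taken from a predefined variable
    # tools/versions.txt     file path: contents are hashed, and the file must exist
    # **/yarn.lock, !bin/**  file pattern: must match at least one file
    key: '"my.key" | "$(Agent.OS)" | tools/versions.txt | **/yarn.lock, !bin/**'
    # A predefined variable is fine here, but wildcards are not.
    path: $(Pipeline.Workspace)/.yarn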

  • Example:

The following example shows how to cache Yarn packages:

variables:
  YARN_CACHE_FOLDER: $(Pipeline.Workspace)/s/.yarn

steps:
- task: Cache@2
  inputs:
    key: '"yarn" | "$(Agent.OS)" | yarn.lock'
    restoreKeys: |
       "yarn" | "$(Agent.OS)"
       "yarn"
    path: $(YARN_CACHE_FOLDER)
  displayName: Cache Yarn packages

- script: yarn --frozen-lockfile

In this example, the cache key consists of three parts: a static string ("yarn"), the OS the job is running on (since the cache is unique per operating system), and the hash of the yarn.lock file (which uniquely identifies the dependencies).

On the first run after the task is added, the cache step will report a "cache miss" because the cache identified by this key doesn't exist. After the last step, a cache will be created from the files in $(Pipeline.Workspace)/s/.yarn and uploaded. On the next run, the cache step will report a "cache hit" and the contents of the cache will be downloaded and restored.

When using checkout: self, the repository is checked out to $(Pipeline.Workspace)/s, and your .yarn folder will likely reside in the repository itself.

Note

Pipeline.Workspace is the local path on the agent running your pipeline where all directories are created. This variable has the same value as Agent.BuildDirectory. If you're not using checkout: self, ensure you update the YARN_CACHE_FOLDER variable to point to the location of .yarn in your repository.

Use restore keys

restoreKeys allows you to query multiple exact keys or key prefixes. It's used as a fallback when the specified key doesn't yield a hit. A restore key searches for a key by prefix and returns the most recently created cache entry. This is helpful when the pipeline cannot find an exact match but still wants to use a partial cache hit.

To specify multiple restore keys, list them on separate lines. The order in which the restore keys are tried is from top to bottom.

  • Example:

Here's an example of how to use restore keys to cache Yarn packages:

variables:
  YARN_CACHE_FOLDER: $(Pipeline.Workspace)/.yarn

steps:
- task: Cache@2
  inputs:
    key: '"yarn" | "$(Agent.OS)" | yarn.lock'
    restoreKeys: |
       yarn | "$(Agent.OS)"
       yarn
    path: $(YARN_CACHE_FOLDER)
  displayName: Cache Yarn packages

- script: yarn --frozen-lockfile

In this example, the cache task first attempts to restore the specified key. If the key doesn't exist in the cache, it then tries the first restore key: yarn | $(Agent.OS). This searches for any cache keys that exactly match or start with this prefix.

A prefix match can occur if the hash of the yarn.lock file has changed. For example, if the cache contains the key yarn | $(Agent.OS) | old-yarn.lock (where old-yarn.lock has a different hash than the current yarn.lock), this restore key would result in a partial cache hit.

If the first restore key doesn't yield a match, the task tries the next restore key (yarn), which searches for any cache key that starts with yarn. For prefix matches, the restore process returns the most recently created cache entry.
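
The same fallback pattern carries over to other package managers. The sketch below assumes an npm project with a package-lock.json at the repository root, and it assumes npm picks up its cache location from the npm_config_cache variable. Adjust the names to your setup.

variables:
  npm_config_cache: $(Pipeline.Workspace)/.npm

steps:
- task: Cache@2
  inputs:
    # Tried first: exact match on the OS and the lock file hash.
    key: 'npm | "$(Agent.OS)" | package-lock.json'
    # Tried next, top to bottom: prefix matches, most recent entry first.
    restoreKeys: |
       npm | "$(Agent.OS)"
       npm
    path: $(npm_config_cache)
  displayName: Cache npm packages

- script: npm ci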

Note

A pipeline can include multiple caching tasks, and there's no storage limit for caching. Jobs and tasks within the same pipeline can access and share the same cache.
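
For instance, a job that builds both JavaScript and Ruby code might use one cache step per dependency folder. This is a sketch only: the folder locations and lock file names are assumptions about the repository layout.

variables:
  YARN_CACHE_FOLDER: $(Pipeline.Workspace)/.yarn
  BUNDLE_PATH: $(Pipeline.Workspace)/.bundle

steps:
# Each cache step has its own key and its own path.
- task: Cache@2
  inputs:
    key: 'yarn | "$(Agent.OS)" | yarn.lock'
    path: $(YARN_CACHE_FOLDER)
  displayName: Cache Yarn packages

- task: Cache@2
  inputs:
    key: 'gems | "$(Agent.OS)" | Gemfile.lock'
    path: $(BUNDLE_PATH)
  displayName: Cache gems

- script: yarn --frozen-lockfile
- script: bundle install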

Use restore condition

In some scenarios, you may want to conditionally execute steps based on whether the cache was successfully restored. For example, you can skip a step that installs dependencies if the cache was restored. This can be achieved using the cacheHitVar argument.

Setting this input to the name of an environment variable causes the variable to be set to true when there's a cache hit, inexact if a restore key yields a partial cache hit, and false if no cache is found. You can then reference this variable in a step condition or within a script.

Here’s an example where the install-deps.sh step is skipped when the cache is restored:

steps:
- task: Cache@2
  inputs:
    key: mykey | mylockfile
    restoreKeys: mykey
    path: $(Pipeline.Workspace)/mycache
    cacheHitVar: CACHE_RESTORED

- script: install-deps.sh
  condition: ne(variables.CACHE_RESTORED, 'true')

- script: build.sh
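
Because the variable can also hold inexact, a pipeline may want to treat a partial hit differently from a miss. The following sketch reuses the CACHE_RESTORED variable from the example above. The update-deps.sh script and the idea of a lighter "update" step are hypothetical.

steps:
- task: Cache@2
  inputs:
    key: mykey | mylockfile
    restoreKeys: mykey
    path: $(Pipeline.Workspace)/mycache
    cacheHitVar: CACHE_RESTORED

# No cache found: do a full install.
- script: install-deps.sh
  condition: eq(variables.CACHE_RESTORED, 'false')

# Partial hit from a restore key: only refresh what changed.
- script: update-deps.sh
  condition: eq(variables.CACHE_RESTORED, 'inexact')

# The value can also be read inside a script through macro syntax.
- script: echo "Cache restore result is $(CACHE_RESTORED)"

- script: build.sh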

Cache isolation and security

To ensure isolation between caches from different pipelines and different branches, every cache is stored within a logical container called a scope. Scopes act as a security boundary that guarantees:

  • Jobs from one pipeline can’t access caches from a different pipeline.

  • Jobs building pull requests can read caches from the target branch (for the same pipeline), but can't write (create) caches in the target branch's scope.

When a cache step is encountered during a run, the cache identified by the key is requested from the server. The server then looks for a cache with this key from the scopes visible to the job, and returns the cache (if available). On cache save (at the end of the job), a cache is written to the scope representing the pipeline and branch.

CI, manual, and scheduled runs

Scope            Read    Write
Source branch    Yes     Yes
main branch      Yes     No
master branch    Yes     No

Pull request runs

Scope                                              Read    Write
Source branch                                      Yes     No
Target branch                                      Yes     No
Intermediate branch (such as refs/pull/1/merge)    Yes     Yes
main branch                                        Yes     No
master branch                                      Yes     No

Pull request fork runs

Branch                                             Read    Write
Target branch                                      Yes     No
Intermediate branch (such as refs/pull/1/merge)    Yes     Yes
main branch                                        Yes     No
master branch                                      Yes     No

Tip

Because caches are already scoped to a project, pipeline, and branch, there's no need to include any project, pipeline, or branch identifiers in the cache key.

Examples

For Ruby projects using Bundler, override the BUNDLE_PATH environment variable to set the path where Bundler looks for gems.

Example:

variables:
  BUNDLE_PATH: $(Pipeline.Workspace)/.bundle

steps:
- task: Cache@2
  displayName: Bundler caching
  inputs:
    key: 'gems | "$(Agent.OS)" | Gemfile.lock'
    path: $(BUNDLE_PATH)
    restoreKeys: |
      gems | "$(Agent.OS)"
      gems
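
The cache step on its own doesn't install anything. A typical follow-up, sketched here under the assumption that the project has a Gemfile.lock at the repository root, is to run Bundler after the restore so that missing gems end up in BUNDLE_PATH before the cache is saved at the end of the job. The display names are arbitrary.

variables:
  BUNDLE_PATH: $(Pipeline.Workspace)/.bundle

steps:
- task: Cache@2
  displayName: Bundler caching
  inputs:
    key: 'gems | "$(Agent.OS)" | Gemfile.lock'
    path: $(BUNDLE_PATH)

# Installs any gems that weren't restored into $(BUNDLE_PATH).
- script: bundle install
  displayName: Install gems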

Known issues and feedback

If you're having trouble setting up caching in your pipeline, check the list of open issues in the microsoft/azure-pipelines-tasks repo. If you don't see your issue listed, create a new one and provide the necessary information about your scenario.

Q&A

Q: Can I clear a cache?

A: Clearing a cache is not supported. However, you can avoid hits on existing caches by adding a string literal (such as version2) to your cache key. For example, change the following cache key from this:

key: 'yarn | "$(Agent.OS)" | yarn.lock'

To this:

key: 'version2 | yarn | "$(Agent.OS)" | yarn.lock'
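
One way to make this easier to repeat, sketched here with a hypothetical CACHE_VERSION variable, is to keep the version string in a pipeline variable so that it only has to be changed in one place.

variables:
  CACHE_VERSION: version2
  YARN_CACHE_FOLDER: $(Pipeline.Workspace)/.yarn

steps:
- task: Cache@2
  inputs:
    key: '"$(CACHE_VERSION)" | yarn | "$(Agent.OS)" | yarn.lock'
    # Restore keys carry the version too, so a fallback can't hit older caches.
    restoreKeys: |
       "$(CACHE_VERSION)" | yarn | "$(Agent.OS)"
    path: $(YARN_CACHE_FOLDER)

Changing CACHE_VERSION to a new value then has the same effect as editing the string literal in the key by hand.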

Q: When does a cache expire?

A: Caches expire after seven days of no activity.

Q: When does the cache get uploaded?

A: A cache is created from your specified path and uploaded after the last step of the job. See the Yarn example earlier in this article for more details.

Q: Is there a limit on the size of a cache?

A: There's no enforced limit on the size of individual caches or the total cache size within an organization.