Bazel rule extensions
One of Bazel's best features is being able to easily write custom rules specific to your project. This works well for many use cases, but when what you really want is to enhance the behavior of an existing rule, your options have historically been limited. Often you would wrap the existing rule in a macro and add some number of custom rules to try to achieve the desired effect, when what you actually want is to modify the existing rule without re-implementing all of its functionality (or maintaining a fork).
With Bazel 8.0, Googlers added a few new ways to extend existing rules that can help with this use case. In this post we will look at the aptly named rule extensions feature and some practical use cases I have found for it.
Basic rule extensions
Rule extensions allow you to inherit the behavior of an existing rule, similar to class inheritance in object-oriented programming. Importantly, you can then make a few modifications to augment the rule's behavior to your liking.
Let's say you have a rule that concatenates the given srcs:
def _foo_impl(ctx):
    output = ctx.actions.declare_file("output.txt")
    ctx.actions.run_shell(
        inputs = ctx.files.srcs,
        outputs = [output],
        command = "cat {} > {}".format(" ".join([src.path for src in ctx.files.srcs]), output.path),
    )
    return [DefaultInfo(files = depset([output]))]

foo = rule(
    implementation = _foo_impl,
    attrs = {
        "srcs": attr.label_list(allow_files = True),
    },
)
Now let's assume that in your project you want the output file to be sorted.
If you own the original rule you could of course change _foo_impl to
handle that for you, but if you are relying on a more complex upstream
rule, you may not have that luxury. Here's how we can extend this to
post-process the file it produces:
def _bar_impl(ctx):
    providers = ctx.super()  # Invoke 'foo' and get the providers
    # NOTE: This assumes there's always only the provider we want.
    original_output = providers[0].files.to_list()[0]
    new_output = ctx.actions.declare_file("new_output.txt")
    ctx.actions.run_shell(
        inputs = [original_output],
        outputs = [new_output],
        command = "sort {} > {}".format(original_output.path, new_output.path),
    )
    return [DefaultInfo(files = depset([new_output]))]

bar = rule(
    implementation = _bar_impl,
    parent = foo,  # Inherit everything from 'foo'
)
This example illustrates the core new features of rule extensions. First
we inherit everything from foo with:
parent = foo,
Then we invoke the original implementation with:
providers = ctx.super()
At this point we have all of the original providers, and we can post-process them however we want. In this example we choose to extract what we need from them and return entirely different providers based on our new action.
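To see it end to end, here's how the extended rule might be used from a BUILD file (the .bzl file and target names here are arbitrary):

load(":rules.bzl", "bar")

bar(
    name = "sorted_output",
    srcs = [
        "b.txt",
        "a.txt",
    ],
)

Because bar inherits foo's attributes, it accepts the same srcs, and building :sorted_output runs both the original concatenation action and our new sort action.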
Manipulating providers
As well as creating new providers, you can also manipulate the providers
returned by the parent rule. Recently I wanted to unify how debug info
is exposed for cc_binary targets across macOS and Linux.
Specifically, I wanted to create a filegroup target pointing
at a single output_group that works on both platforms, whereas by
default you have to fetch the debug info from different output locations
on each platform.
I was able to achieve this with a rule extension of cc_binary:
load("@cc_compatibility_proxy//:proxy.bzl", _upstream_cc_binary = "cc_binary")
def _cc_binary_impl(ctx):
providers = ctx.super()
output_group_info = None
debug_package_info = None
passthrough_providers = []
for provider in providers:
if type(provider) == "OutputGroupInfo":
output_group_info = provider
elif type(provider) == "struct" and hasattr(provider, "unstripped_file"): # NOTE: Will require an update when this provider moves to starlark
debug_package_info = provider
passthrough_providers.append(provider)
else:
passthrough_providers.append(provider)
if not output_group_info:
fail("No OutputGroupInfo provider found")
if not debug_package_info:
fail("No DebugPackageInfo provider found")
dsyms = getattr(output_group_info, "dsyms", depset())
new_output_group_info = {}
if dsyms:
new_output_group_info["debug_info"] = dsyms
else:
new_output_group_info["debug_info"] = depset([debug_package_info.unstripped_file])
for group in dir(output_group_info):
new_output_group_info[group] = getattr(output_group_info, group)
return passthrough_providers + [
OutputGroupInfo(**new_output_group_info),
]
cc_binary = rule(
implementation = _cc_binary_impl,
parent = _upstream_cc_binary,
)
Most of this implementation is about collecting and re-propagating the providers that I don't care about. The methods for doing this today are pretty tedious but hopefully that will improve in the future.
The core logic once I collect the original providers is this:
dsyms = getattr(output_group_info, "dsyms", depset())
if dsyms:
    new_output_group_info["debug_info"] = dsyms
else:
    new_output_group_info["debug_info"] = depset([debug_package_info.unstripped_file])
Here, if the OutputGroupInfo provider from the upstream
cc_binary implementation includes dsyms, we propagate those;
otherwise we propagate the unstripped binary, which contains all the
debug info on Linux. I can then create a single filegroup that
fetches whichever one is present:
filegroup(
    name = "some_binary.debug_info",
    srcs = [":some_binary"],
    output_group = "debug_info",
)
Extending rules with transitions
Let's look at another use case for extending rules. This time I want to
add a custom transition to an upstream rule. In our project we produce
python wheels that include native extensions. In order to distribute
these wheels we have to build them for each version of python we
support. rules_python has a py_wheel rule that creates the wheel for
us, but it does that targeting the "current python version" (this could
potentially be improved upstream). The current version depends on how
you set up python in your MODULE.bazel file, but we want to change it
when we build different targets so we can target multiple python
versions in a single build. Thankfully the way rules_python has
implemented version selection is with a flag that we can write a
transition for. To add a transition to an upstream rule, while otherwise
maintaining the original functionality, we can do this:
py_wheel = rule(
    implementation = lambda ctx: ctx.super(),
    parent = py_wheel_rule,
    attrs = {
        "python_version": attr.string(),
    },
    cfg = python_version_transition,
)
This use case has a few interesting things to note. Since we don't want to change the functionality or providers returned by this rule, we don't even create an implementation function, instead opting to call super directly in a lambda:
implementation = lambda ctx: ctx.super(),
Then we add an attribute the rule didn't have before. This is then read by our transition to decide which python version to use:
attrs = {
    "python_version": attr.string(),
},
If the original rule had an appropriate attribute we could use that instead, but in this case we need to provide our own. Finally we add the transition (the contents of which aren't necessary to understand for this example, but can be found here):
cfg = python_version_transition,
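For reference, a transition like this can be fairly small. Here's a minimal sketch of what python_version_transition could look like, assuming the version is controlled by rules_python's python_version build setting:

_PYTHON_VERSION_FLAG = "@rules_python//python/config_settings:python_version"

def _python_version_impl(settings, attr):
    # Only override the flag when the target sets python_version explicitly.
    if not attr.python_version:
        return {_PYTHON_VERSION_FLAG: settings[_PYTHON_VERSION_FLAG]}
    return {_PYTHON_VERSION_FLAG: attr.python_version}

python_version_transition = transition(
    implementation = _python_version_impl,
    inputs = [_PYTHON_VERSION_FLAG],
    outputs = [_PYTHON_VERSION_FLAG],
)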
Now with our new py_wheel rule we can set:
python_version = "3.13",
And all transitive dependencies will be built targeting the passed version instead of the default version.
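Defining one target per supported version then lets a single build produce all the wheels. A sketch of how that might look in a BUILD file, with illustrative file, target, and attribute values:

load(":py_wheel.bzl", "py_wheel")  # The extended rule defined above

[
    py_wheel(
        name = "my_wheel_" + python_version.replace(".", "_"),
        distribution = "my_wheel",
        version = "1.0.0",
        python_version = python_version,
        deps = [":my_extension"],
    )
    for python_version in [
        "3.12",
        "3.13",
    ]
]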
Previously, to solve this same use case, you could have created a
custom rule that applied this transition and made the original
py_wheel target depend on your custom rule's output, but that is
definitely more overhead to understand and maintain for the use cases
I commonly hit.
Applying platform specific transitions
Another use case for adding a transition to an existing rule is for
platform specific builds. A longstanding frustration in the iOS
community is that bare targets like cc_library or swift_library don't
have any knowledge of the platform they are being built for, even if you
have written them in a way that only supports a single platform. This
means that if you try to directly build a swift_library whose code
only supports iOS, you'll be greeted with a compiler error as Bazel
attempts to build it for macOS.
Developers have often worked around this by wrapping their libraries in
a macro that adds an underlying platform specific target, such as an
ios_build_test, that has the necessary platform transition. This adds
complexity and causes confusion for non-Bazel developers. It also adds
general overhead in your build when you're querying things or otherwise
inspecting what targets exist, as every library now has additional
underlying targets.
With rule extensions you can eliminate this overhead by applying the
Apple platform transition directly to swift_library:
load("@rules_apple//apple/internal:transition_support.bzl", "transition_support")
load("@rules_swift//swift:swift.bzl", _upstream_swift_library = "swift_library")
swift_library = rule(
implementation = lambda ctx: ctx.super(),
parent = _upstream_swift_library,
cfg = transition_support.apple_rule_transition,
attrs = {
"platform_type": attr.string(default = "ios"),
# TODO: Extract to a constant that matches your ios_application targets
"minimum_os_version": attr.string(default = "16.0"),
},
)
This requires a bit of knowledge about how the transition works; specifically, we have to add two attributes the transition relies on, but that's something that could potentially be improved.
Once you have this extended rule, building your swift_library targets
directly correctly builds them for iOS, and they are not rebuilt when
built from a platform specific target's dependency tree.
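With that extension in place, a library whose code only supports iOS needs nothing beyond its usual definition (the load path and names here are hypothetical):

load("//bazel:swift_library.bzl", "swift_library")

swift_library(
    name = "MyFeature",
    srcs = glob(["Sources/**/*.swift"]),
)

Building this target directly now uses the iOS configuration instead of failing while targeting the host macOS platform.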
If your swift_library target supports multiple platforms, you can
still use something like this while respecting the platform inherited
from your top level targets. The easiest way I found to do this is to
select() on the current platform in a macro, and provide the correct
values for that target platform:
load("@rules_apple//apple/internal:transition_support.bzl", "transition_support")
load("@rules_swift//swift:swift.bzl", _upstream_swift_library = "swift_library")
swift_library = rule(
implementation = lambda ctx: ctx.super(),
parent = _upstream_swift_library,
cfg = transition_support.apple_rule_transition,
attrs = {
"platform_type": attr.string(mandatory = True),
"minimum_os_version": attr.string(mandatory = True),
},
)
def my_swift_library(**kwargs):
swift_library(
platform_type = select({
"@platforms//os:tvos": "tvos",
"//conditions:default": "ios",
}),
minimum_os_version = "16.0", # NOTE: Could select() here too if necessary
**kwargs
)
With this example, building the library directly will default to iOS, but if you have a top level tvOS target that depends on it, it will still correctly compile for tvOS.
Adding aspects to attributes
Another powerful use case for this feature is adding aspects to attributes on the rule. For example, if you want a rule to collect custom files from all of its dependencies and propagate those in its own runfiles, you can override an attribute from the parent to add your custom aspect:
bar = rule(
    implementation = _bar_impl,
    parent = foo,
    attrs = {
        "deps": attr.label_list(aspects = [custom_aspect]),
    },
)
Then in _bar_impl you can do whatever processing you need to collect
the outputs of the aspect.
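As a concrete sketch, and purely for illustration (the provider name, the .json extension filter, and the runfiles merging are assumptions rather than any upstream API), custom_aspect and this section's _bar_impl could look something like this:

CustomFilesInfo = provider(fields = ["files"])

def _custom_aspect_impl(target, ctx):
    # Pick up matching files from this target and merge in whatever its own
    # deps already collected.
    direct = [f for f in target[DefaultInfo].files.to_list() if f.extension == "json"]
    transitive = [
        dep[CustomFilesInfo].files
        for dep in getattr(ctx.rule.attr, "deps", [])
        if CustomFilesInfo in dep
    ]
    return [CustomFilesInfo(files = depset(direct, transitive = transitive))]

custom_aspect = aspect(
    implementation = _custom_aspect_impl,
    attr_aspects = ["deps"],
)

def _bar_impl(ctx):
    providers = ctx.super()
    collected = depset(transitive = [
        dep[CustomFilesInfo].files
        for dep in ctx.attr.deps
        if CustomFilesInfo in dep
    ])

    # Rebuild DefaultInfo so the collected files end up in bar's runfiles,
    # and forward every other provider from the parent untouched.
    new_providers = []
    for provider in providers:
        if type(provider) == "DefaultInfo":
            new_providers.append(DefaultInfo(
                files = provider.files,
                runfiles = provider.default_runfiles.merge(
                    ctx.runfiles(transitive_files = collected),
                ),
            ))
        else:
            new_providers.append(provider)
    return new_providers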
Other notes
- There are currently some limitations on what types of attributes can be overridden; for example, you cannot override an attr.label_keyed_string_dict to add aspects. In some cases a quick workaround is to add a new attribute instead, and wrap your usage in a macro that populates the new attribute with the same value as the original attribute (see the sketch after this list).
- While it may look confusing at first, I think there's some value in re-using the same rule name for your extended rules so bazel query kind(cc_binary, ...) continues to work as before. This way developers who are using queries like this don't have to know about this extension.
- You have to set parent to another Bazel rule, not a macro. This is sometimes difficult as many popular rulesets expose macros for their rules to perform pre-processing on the passed attributes. I hope that pattern is reduced as this feature becomes more widely used. In the cases where I am using this today, I found I could skip the custom macro logic as long as I was careful about what it was trying to accomplish.
- I would like to see how an approach like this could augment existing rules to add outputs for actions that already exist. For example, when passing -ftime-trace to clang, ideally we could declare that output in our rule implementation but have the existing compile action produce it. I couldn't find a way to modify the copts to do that today; it might be possible with a combination of this approach and a macro.
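As an illustration of the workaround mentioned in the first note, a sketch might look like this; upstream_rule, its runtime_map attribute, _extended_impl, and custom_aspect are all hypothetical stand-ins rather than real APIs:

# 'upstream_rule' has a label_keyed_string_dict attribute named 'runtime_map'
# that can't be overridden to attach an aspect, so we add a parallel
# label_list attribute that can carry it instead.
extended_rule = rule(
    implementation = _extended_impl,  # Reads the aspect's provider from ctx.attr.runtime_map_deps
    parent = upstream_rule,
    attrs = {
        "runtime_map_deps": attr.label_list(aspects = [custom_aspect]),
    },
)

def extended_rule_macro(name, runtime_map = {}, **kwargs):
    extended_rule(
        name = name,
        runtime_map = runtime_map,
        # Mirror the same labels into the aspect-carrying attribute.
        runtime_map_deps = runtime_map.keys(),
        **kwargs
    )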