Autoharness Subcommand #3874

carolynzech · 2025-02-04T18:21:38Z

This PR introduces an initial implementation for the autoharness subcommand and a book chapter describing the feature. I recommend reading the book chapter before reviewing the code (or proceeding further in this PR description).

Callouts

--harness is to manual harnesses what --include-function and --exclude-function are to autoharness; both allow the user to filter which harnesses/functions get verified. Their implementations differ, however: --harness is a driver-only flag, i.e., we still compile every harness, but then only call CBMC on the ones specified in --harness, whereas --include-function and --exclude-function get passed to the compiler. I made this design choice to be more aggressive about improving compiler performance: if a user only asks to verify one function and we go try to codegen a thousand, the compiler will take much longer than it needs to. I thought this more aggressive optimization was warranted given that crates are likely to have far more functions eligible for autoverification than manual Kani harnesses.
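To make the compiler-side filtering concrete, here is a minimal sketch. It is not the code from this PR: the FunctionFilter type, its field names, and the substring-matching semantics are assumptions made for illustration (the commit list only says that partial function paths can be included or excluded).

```rust
/// Hypothetical filter settings; the type and field names are illustrative only.
struct FunctionFilter {
    include: Vec<String>,
    exclude: Vec<String>,
}

impl FunctionFilter {
    /// Assumed semantics: a function is selected when its path contains at
    /// least one include pattern (or no include patterns were given) and
    /// contains no exclude pattern.
    fn selects(&self, path: &str) -> bool {
        let included = self.include.is_empty()
            || self.include.iter().any(|p| path.contains(p.as_str()));
        let excluded = self.exclude.iter().any(|p| path.contains(p.as_str()));
        included && !excluded
    }
}

/// Compiler-side filtering: only the selected functions would reach codegen,
/// so unselected functions cost no compile time.
fn functions_to_codegen<'a>(all: &[&'a str], filter: &FunctionFilter) -> Vec<&'a str> {
    all.iter().copied().filter(|f| filter.selects(f)).collect()
}
```

A driver-only flag such as --harness would instead apply a check like this after everything has been compiled, which is why it saves CBMC time but not compile time.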

(See also the limitations in the book chapter).

Testing

Besides the new tests in this PR, I also ran against s2n-quic to confirm that the subcommand works on larger crates.

Towards #3832

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.

We approximated an empty body by checking for TerminatorKind::Return in the first basic block, but it turns out some identity functions have this MIR as well, so remove the check. One could argue that identity functions aren't worth checking either, but we'd rather check more than less for now.
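The following toy model sketches why that approximation could not tell the two cases apart. The types below are hypothetical stand-ins, not the rustc or Kani MIR types, and the statement text is only an approximation of what MIR for an identity function looks like.

```rust
/// Toy stand-ins for MIR concepts. These are NOT the rustc or Kani types;
/// every name here is hypothetical and exists only for illustration.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum Terminator {
    Return,
    Goto,
    Call,
}

#[allow(dead_code)]
struct BasicBlock {
    /// Statements rendered as text, e.g. "_0 = _1".
    statements: Vec<&'static str>,
    terminator: Terminator,
}

struct Body {
    blocks: Vec<BasicBlock>,
}

/// The removed approximation, as described above: treat a body as "empty"
/// when its first basic block ends in a return.
fn first_block_returns(body: &Body) -> bool {
    match body.blocks.first() {
        Some(block) => block.terminator == Terminator::Return,
        None => false,
    }
}

/// Roughly `fn noop() {}`: a single block that just returns.
fn noop_body() -> Body {
    Body { blocks: vec![BasicBlock { statements: vec![], terminator: Terminator::Return }] }
}

/// Roughly `fn id(x: u32) -> u32 { x }`: one assignment, then a return.
fn identity_body() -> Body {
    Body {
        blocks: vec![BasicBlock { statements: vec!["_0 = _1"], terminator: Terminator::Return }],
    }
}
```

Both bodies satisfy first_block_returns, so the heuristic would have skipped the identity function along with the truly empty one.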

tautschnig · 2025-02-04T20:58:08Z

I think we'll have to find a way to report which functions were not considered, or else this creates a soundness risk: a code base may be fully covered by Kani in one version of the code base, then a change is made such that Arbitrary is no longer derived for some parameter type (or a new parameter with a type not supporting Arbitrary is added). At that point a previously fully-covered code base turns into one with partial coverage with no obvious way for the user to detect this.

carolynzech · 2025-02-04T21:01:39Z

> I think we'll have to find a way to report which functions were not considered, or else this creates a soundness risk: a code base may be fully covered by Kani in one version of the code base, then a change is made such that Arbitrary is no longer derived for some parameter type (or a new parameter with a type not supporting Arbitrary is added). At that point a previously fully-covered code base turns into one with partial coverage with no obvious way for the user to detect this.

@tautschnig It does currently print every function that it did consider, so one could figure out which functions weren't considered by doing a (mental) set difference between all the functions in their crate and what got printed. So we're not claiming to verify anything that we haven't; it's still on a function-by-function basis, and we print all the functions we considered.
But your point is well taken that changes over time may be difficult to spot. I can explicitly print all the functions we skipped. It's not a fundamental limitation--it just involves saving some more compiler-side metadata for the driver to interpret. This PR is already quite large, so I was trying to avoid introducing too many changes, but if we're uncomfortable merging this PR without it, happy to add it.
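For illustration, the set difference described above could be sketched as follows. This is not part of the PR; the function name and the idea of passing in both lists as sets of path strings are assumptions.

```rust
use std::collections::BTreeSet;

/// Returns the functions that appear in `all` but not in `considered`,
/// in sorted order. Both sets hold fully qualified function paths.
fn skipped_functions<'a>(
    all: &BTreeSet<&'a str>,
    considered: &BTreeSet<&'a str>,
) -> Vec<&'a str> {
    all.difference(considered).copied().collect()
}
```

Comparing this list between two versions of a crate is one way a user could notice that a previously covered function has silently dropped out.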

zhassan-aws

A high-level comment: the command name, "autoverify", might give a false impression that the functions that were covered have been verified, even though the harnesses only check the absence of panics and undefined behavior (at least for functions that do not have contracts), whereas "verified" is associated with proving their functional correctness. The same applies to the description in the new chapter.

I suggest we use a different name, e.g. "check-panic-ub" or just "auto"?

docs/src/reference/experimental/autoverify.md

carolynzech · 2025-02-05T23:05:25Z

> whereas "verified" is associated with proving their functional correctness. The same applies to the description in the new chapter.

@zhassan-aws Point taken about the chapter wording--I can make it clearer that since this subcommand generates a harness under the hood, it has the same guarantees as a harness, so users shouldn't take it to mean that they're magically getting anything stronger (i.e., functional correctness) by using it.

I'm not sure I agree with the broader point about the "autoverify" name, though--our printed CBMC output refers to "verification results," so it seems like this name is consistent with that convention. That being said, I don't want to bikeshed this too much, so I'll think about some names... I don't love check-panic-ub because we can also check for overflow, check-panic-ub-overflow is too long, and if we ever introduce an additional default check the name will be out of date. Perhaps kani autoharness to hammer home that we're just generating a MIR harness in the end?

zhassan-aws · 2025-02-06T00:13:10Z

> Perhaps kani autoharness to hammer home that we're just generating a MIR harness in the end?

I like this better. kani already implies that we're running verification, and autoharness suggests that the harnesses checked are automatically generated.

tautschnig · 2025-02-06T15:03:08Z

> But your point is well taken that changes over time may be difficult to spot. I can explicitly print all the functions we skipped. It's not a fundamental limitation--it just involves saving some more compiler-side metadata for the driver to interpret. This PR is already quite large, so I was trying to avoid introducing too many changes, but if we're uncomfortable merging this PR without it, happy to add it.

Having changed the name to autoharness I am less worried, but I'd still like to see a way to log all skipped functions eventually. So I'd appreciate follow-up work towards this. I'd then even suggest more fine-grained information that tells the user why a particular function was skipped.
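As a rough sketch of what such fine-grained reporting might look like, the enum below records a reason per skipped function. Everything here is hypothetical: the SkipReason type, its variants, and the message wording are illustrative and are not taken from this PR or from Kani.

```rust
use std::fmt;

/// Hypothetical reasons a function might not get an automatic harness.
/// The variants are drawn from cases mentioned in this thread.
#[allow(dead_code)]
enum SkipReason {
    /// A parameter type does not implement Arbitrary.
    MissingArbitrary { param_type: String },
    /// The user filtered the function out with an include/exclude flag.
    ExcludedByFilter,
}

impl fmt::Display for SkipReason {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SkipReason::MissingArbitrary { param_type } => {
                write!(f, "parameter type `{}` does not implement Arbitrary", param_type)
            }
            SkipReason::ExcludedByFilter => write!(f, "excluded by an include/exclude filter"),
        }
    }
}

/// Formats one line of a skipped-function report.
fn report_line(function: &str, reason: &SkipReason) -> String {
    format!("skipped {}: {}", function, reason)
}
```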

carolynzech · 2025-02-06T15:05:10Z

> > But your point is well taken that changes over time may be difficult to spot. I can explicitly print all the functions we skipped. It's not a fundamental limitation--it just involves saving some more compiler-side metadata for the driver to interpret. This PR is already quite large, so I was trying to avoid introducing too many changes, but if we're uncomfortable merging this PR without it, happy to add it.
>
> Having changed the name to autoharness I am less worried, but I'd still like to see a way to log all skipped functions eventually. So I'd appreciate follow-up work towards this. I'd then even suggest more fine-grained information that tells the user why a particular function was skipped.

Sounds good. I can & will create a follow-up PR with UX improvements... I think getting this out and trying to apply it to a library as big as the standard library will give us a lot of good ideas about things that can be improved.

docs/src/reference/experimental/autoharness.md

zhassan-aws

Looks great so far! I mostly reviewed the tests in this round.

I suggest adding a test with dependencies to ensure we don't generate harnesses for functions outside the local crate.

Also, on the question of not terminating, I think we should run verification with a small default unwind value and a small default timeout and allow the user to control those.
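One possible shape for this, sketched under assumptions: the default values, type names, and field names below are all hypothetical and are not Kani's actual options or defaults.

```rust
use std::time::Duration;

/// Hypothetical defaults; the real values would need to be chosen by the team.
const DEFAULT_UNWIND: u32 = 20;
const DEFAULT_TIMEOUT: Duration = Duration::from_secs(60);

/// What the user passed on the command line, if anything.
struct UserOptions {
    unwind: Option<u32>,
    timeout: Option<Duration>,
}

/// The limits actually applied to each automatically generated harness.
struct Limits {
    unwind: u32,
    timeout: Duration,
}

/// User-supplied values win; otherwise fall back to the small defaults.
fn effective_limits(user: &UserOptions) -> Limits {
    Limits {
        unwind: user.unwind.unwrap_or(DEFAULT_UNWIND),
        timeout: user.timeout.unwrap_or(DEFAULT_TIMEOUT),
    }
}
```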

kani-compiler/src/args.rs

tests/script-based-pre/cargo_autoharness_filter/src/lib.rs

tests/script-based-pre/cargo_autoharness_harnesses_fail/harnesses_fail.expected

tests/script-based-pre/cargo_autoharness_loops_fixme/config.yml

zhassan-aws

I imagine that in the future, we would want to introduce a #[kani::autoharness] attribute that functions can be annotated with (or just allow #[kani::proof] directly on functions with arguments, see #1919). Perhaps, autoharness would add those annotations automatically.

kani-compiler/src/args.rs

carolynzech · 2025-02-10T19:43:57Z

> I imagine that in the future, we would want to introduce a #[kani::autoharness] attribute that functions can be annotated with (or just allow #[kani::proof] directly on functions with arguments, see #1919). Perhaps, autoharness would add those annotations automatically.

You mean that we'd phase out --include-function and --exclude-function and just have people put this annotation on their functions instead (or support both)?

zhassan-aws · 2025-02-10T21:49:28Z

What I was thinking is that functions annotated with #[kani::autoharness] would be targeted when we run cargo kani. The new subcommand cargo kani autoharness would target functions that do not have harness annotations.

zhassan-aws

Sorry for the late comments, but I found that I had a bunch that I didn't submit.

docs/src/reference/experimental/autoharness.md

kani-compiler/src/args.rs

zhassan-aws · 2025-02-10T23:20:39Z

kani-compiler/src/kani_middle/codegen_units.rs

-            self.harness_info.get_mut(harness).unwrap().has_loop_contracts = true;
+            let metadata = self.harness_info.get_mut(harness).unwrap();
+            metadata.has_loop_contracts = true;
+            // If we're generating this harness automatically and we encounter a loop contract,


Is this necessary? Functions with loop contracts can be verified with #[kani::proof] (e.g. see https://github.com/model-checking/kani/blob/main/tests/expected/loop-contract/count_zero.rs).

You're right. I made the mistake of assuming that since we call them loop "contracts," they were composable with function contracts, and that does not appear to be the case.

I tried this example:

    #![feature(stmt_expr_attributes)]
    #![feature(proc_macro_hygiene)]

    #[kani::requires(true)]
    fn simple_loop_with_loop_contracts() {
        let mut x: u64 = kani::any_where(|i| *i >= 1);

        #[kani::loop_invariant(x >= 1)]
        while x > 1 {
            x = x - 1;
        }

        assert!(x == 1);
    }

    #[kani::proof_for_contract(simple_loop_with_loop_contracts)]
    fn foo() {
        simple_loop_with_loop_contracts()
    }

and the loop invariant is ignored, so the loop unwinds forever.

The autoharness generation feature shows the same behavior, i.e. given:

    #![feature(proc_macro_hygiene)]
    #![feature(stmt_expr_attributes)]

    #[kani::requires(true)]
    fn has_loop_contract() {
        let mut x: u8 = kani::any_where(|i| *i >= 2);

        #[kani::loop_invariant(x >= 2)]
        while x > 2 {
            x = x - 1;
        }

        assert!(x == 2);
    }

the loop invariant is ignored.

This was an oversight on my part; I should have had test coverage for this loop contract / function contract combination case. That being said, I think we should clarify this in our documentation. @qinheping, I recommend updating the loop contracts reference chapter to clarify that the provided simple_loop_with_loop_contracts should have a #[kani::proof] attribute. There's currently no attribute, so the example as given will just say "No proofs found to verify." I would also add an item to the limitations section stating that function contracts and loop contracts are not composable, i.e., if you try to prove your function contract, your loop invariant will be ignored.

I need to think more about what we should do if we encounter a function with both loop contracts and function contracts--I suppose we should generate both a #[kani::proof] and a #[kani::proof_for_contract] (to prove the loop invariant and the other contracts, respectively).
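A minimal sketch of that idea, under assumptions: the HarnessKind enum and the decision function are hypothetical, and the rule for the other cases (a contract harness when only a function contract is present, a standard harness otherwise) is my reading of the thread rather than confirmed behavior.

```rust
/// Hypothetical harness kinds, named after the attributes they correspond to.
#[derive(Debug, PartialEq)]
enum HarnessKind {
    /// Corresponds to #[kani::proof].
    Proof,
    /// Corresponds to #[kani::proof_for_contract].
    ProofForContract,
}

/// Decides which automatic harnesses to generate for one function.
fn harness_kinds(has_function_contract: bool, has_loop_contract: bool) -> Vec<HarnessKind> {
    match (has_function_contract, has_loop_contract) {
        // No function contract: a standard harness covers the function,
        // including any loop invariants.
        (false, _) => vec![HarnessKind::Proof],
        // Function contract only: prove the contract.
        (true, false) => vec![HarnessKind::ProofForContract],
        // Both: one harness for the loop invariant, one for the contract.
        (true, true) => vec![HarnessKind::Proof, HarnessKind::ProofForContract],
    }
}
```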

While I'm at it, I'll implement the timeout/default unwind for loops without contracts that we discussed.

> the loop invariant is ignored.

Even with -Zloop-contracts?

kani-compiler/src/kani_middle/codegen_units.rs

carolynzech added 7 commits February 1, 2025 19:55

Automatic harness generation (standard harnesses only)

16a9b37

contract harnesses support

e46bb3a

Improve summary

53c534e

Add option to include or exclude partial function paths

0aed576

don't let --harness affect automatic harnesses

bfeac3b

add chapter to book

eaab5cf

github-actions bot added the Z-BenchCI (Tag a PR to run benchmark CI) label Feb 4, 2025

carolynzech force-pushed the optional-harness branch from 8b85bee to f814676 Compare February 4, 2025 19:14

fix formatting & regression failures

bfebfdb

carolynzech force-pushed the optional-harness branch from f814676 to bfebfdb Compare February 4, 2025 20:28

carolynzech marked this pull request as ready for review February 4, 2025 21:02

carolynzech requested a review from a team as a code owner February 4, 2025 21:02

zhassan-aws reviewed Feb 5, 2025

docs/src/reference/experimental/autoverify.md (outdated, resolved)

change subcommand name to autoharness

ec674e8

carolynzech changed the title ~~Autoverify Subcommand~~ Autoharness Subcommand Feb 6, 2025

Improve documentation to make clear that it generates a harness

9787664

carolynzech mentioned this pull request Feb 6, 2025

Tracking Issue: Make Harnesses Optional #3832

Open

8 tasks

tautschnig approved these changes Feb 6, 2025

docs/src/reference/experimental/autoharness.md (resolved)

Merge branch 'main' into optional-harness

db2a71c

carolynzech enabled auto-merge February 6, 2025 15:41

carolynzech disabled auto-merge February 6, 2025 16:12

zhassan-aws reviewed Feb 6, 2025

kani-compiler/src/args.rs (resolved)

feliperodri assigned carolynzech Feb 10, 2025

carolynzech added 2 commits February 10, 2025 14:45

Rename reachability mode to AllFns

4643132

add expected file for loops fixme test

8750860

carolynzech assigned zhassan-aws and unassigned carolynzech Feb 10, 2025

carolynzech requested a review from zhassan-aws February 10, 2025 19:46

Merge branch 'main' into optional-harness

d4f9522

carolynzech enabled auto-merge February 11, 2025 15:00

carolynzech added this pull request to the merge queue Feb 11, 2025

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 11, 2025

carolynzech added this pull request to the merge queue Feb 11, 2025

Merged via the queue into model-checking:main with commit 8b2ec77 Feb 11, 2025
27 of 28 checks passed

carolynzech deleted the optional-harness branch February 11, 2025 16:51

zhassan-aws reviewed Feb 11, 2025


zhassan-aws Feb 10, 2025

carolynzech Feb 14, 2025

carolynzech Feb 14, 2025

zhassan-aws Feb 14, 2025

carolynzech Feb 17, 2025
