Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Autoharness Subcommand #3874

Merged
merged 14 commits into from
Feb 11, 2025
Merged

Conversation

carolynzech
Copy link
Contributor

@carolynzech carolynzech commented Feb 4, 2025

This PR introduces an initial implementation for the autoharness subcommand and a book chapter describing the feature. I recommend reading the book chapter before reviewing the code (or proceeding further in this PR description).

Callouts

--harness is to manual harnesses what --include-function and --exclude-function are to autoharness; both allow the user to filter which harnesses/functions get verified. Their implementation is different, however----harness is a driver-only flag, i.e., we still compile every harness, but then only call CBMC on the ones specified in --harness. --include-function and --exclude-function get passed to the compiler. I made this design choice to try to be more aggressive about improving compiler performance--if a user only asks to verify one function and we go try to codegen a thousand, the compiler will take much longer than it needs to. I thought this more aggressive optimization was warranted given that crates are likely to have far many more functions eligible for autoverification than manual Kani harnesses.

(See also the limitations in the book chapter).

Testing

Besides the new tests in this PR, I also ran against s2n-quic to confirm that the subcommand works on larger crates.

Towards #3832

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.

We approximated an empty body by checking for TerminatorKind::Return in the first basic block, but it turns out some identity functions have this MIR as well, so remove it. One could argue that identity functions aren't worth checking either, but we'd rather check more than less for now.
@github-actions github-actions bot added the Z-BenchCI Tag a PR to run benchmark CI label Feb 4, 2025
@tautschnig
Copy link
Member

I think we'll have to find a way to report which functions were not considered, or else this creates a soundness risk: a code base may be fully covered by Kani in one version of the code base, then a change is made such that Arbitrary is no longer derived for some parameter type (or a new parameter with a type not supporting Arbitrary is added). At that point a previously fully-covered code base turns into one with partial coverage with no obvious way for the user to detect this.

@carolynzech
Copy link
Contributor Author

carolynzech commented Feb 4, 2025

I think we'll have to find a way to report which functions were not considered, or else this creates a soundness risk: a code base may be fully covered by Kani in one version of the code base, then a change is made such that Arbitrary is no longer derived for some parameter type (or a new parameter with a type not supporting Arbitrary is added). At that point a previously fully-covered code base turns into one with partial coverage with no obvious way for the user to detect this.

@tautschnig It does currently print every function that it did consider, so one could figure out which functions weren't considering by doing a (mental) set difference between all the functions in their crate versus what got printed. So we're not claiming to verify anything that we haven't; it's still on a function-by-function basis, and we print all the functions we considered.
But your point is well taken that changes over time may be difficult to spot. I can explicitly print all the functions we skipped. It's not a fundamental limitation--it just involves saving some more compiler-side metadata for the driver to interpret. This PR is already quite large, so I was trying to avoid introducing too many changes, but if we're uncomfortable merging this PR without it, happy to add it.

@carolynzech carolynzech marked this pull request as ready for review February 4, 2025 21:02
@carolynzech carolynzech requested a review from a team as a code owner February 4, 2025 21:02
Copy link
Contributor

@zhassan-aws zhassan-aws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A high-level comment: the command name, "autoverify" might give a false impression that the functions that were covered have been verified, even though the harnesses only check the absence of panics and undefined behavior (at least for functions that do not have contracts), whereas "verified" is associated with proving their functional correctness. The same applies to the description in the new chapter.

I suggest we use a different name, e.g. "check-panic-ub" or just "auto"?

docs/src/reference/experimental/autoverify.md Outdated Show resolved Hide resolved
@carolynzech
Copy link
Contributor Author

carolynzech commented Feb 5, 2025

whereas "verified" is associated with proving their functional correctness. The same applies to the description in the new chapter.

@zhassan-aws Point taken about the chapter wording--I can make it clearer that this since this subcommand generates a harness under the hood, it has the same guarantees as a harness, so users shouldn't take it to mean that they're magically getting anything stronger (i.e., functional correctness) by using it.

I'm not sure I agree with the broader point about the "autoverify" name, though--our printed CBMC output refers to "verification results," so it seems like this name is consistent with that convention. That being said, I don't want to bikeshed this too much, so I'll think about some names... I don't love check-panic-ub because we can also check for overflow, check-panic-ub-overflow is too long, and if we ever introduce an additional default check the name will be out of date. Perhaps kani autoharness to hammer home that we're just generating a MIR harness in the end?

@zhassan-aws
Copy link
Contributor

Perhaps kani autoharness to hammer home that we're just generating a MIR harness in the end?

I like this better. kani already implies that we're running verification, and autoharness suggests that the harnesses checked are automatically generated.

@carolynzech carolynzech changed the title Autoverify Subcommand Autoharness Subcommand Feb 6, 2025
@tautschnig
Copy link
Member

But your point is well taken that changes over time may be difficult to spot. I can explicitly print all the functions we skipped. It's not a fundamental limitation--it just involves saving some more compiler-side metadata for the driver to interpret. This PR is already quite large, so I was trying to avoid introducing too many changes, but if we're uncomfortable merging this PR without it, happy to add it.

Having changed the name to autoharness I am less worried, but I'd still like to see a way to log all skipped functions eventually. So I'd appreciate follow-up work towards this. I'd then even suggest more fine-grained information that tells the user why a particular function was skipped.

@carolynzech
Copy link
Contributor Author

But your point is well taken that changes over time may be difficult to spot. I can explicitly print all the functions we skipped. It's not a fundamental limitation--it just involves saving some more compiler-side metadata for the driver to interpret. This PR is already quite large, so I was trying to avoid introducing too many changes, but if we're uncomfortable merging this PR without it, happy to add it.

Having changed the name to autoharness I am less worried, but I'd still like to see a way to log all skipped functions eventually. So I'd appreciate follow-up work towards this. I'd then even suggest more fine-grained information that tells the user why a particular function was skipped.

Sounds good. I can & will create a follow-up PR with UX improvements... I think getting this out and trying to apply to a library as big as the standard library will give us a lot of good ideas about things that can be improved.

@carolynzech carolynzech enabled auto-merge February 6, 2025 15:41
@carolynzech carolynzech disabled auto-merge February 6, 2025 16:12
Copy link
Contributor

@zhassan-aws zhassan-aws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great so far! I mostly reviewed the tests in this round.

I suggest adding a test with dependencies to ensure we don't generate harnesses for functions outside the local crate.

Also, on the question of not terminating, I think we should run verification with a small default unwind value and a small default timeout and allow the user to control those.

Copy link
Contributor

@zhassan-aws zhassan-aws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagine that in the future, we would want to introduce a #[kani::autoharness] attribute that functions can be annotated with (or just allow #[kani::proof] directly on functions with arguments, see #1919). Perhaps, autoharness would add those annotations automatically.

kani-compiler/src/args.rs Show resolved Hide resolved
@carolynzech
Copy link
Contributor Author

I imagine that in the future, we would want to introduce a #[kani::autoharness] attribute that functions can be annotated with (or just allow #[kani::proof] directly on functions with arguments, see #1919). Perhaps, autoharness would add those annotations automatically.

You mean that we'd phase out --include-functions and --exclude-functions and just have people put this annotation on their functions instead (or support both)?

@zhassan-aws
Copy link
Contributor

What I was thinking is that functions annotated with #[kani::autoharness] would be targeted when we run cargo kani. The new subcommand cargo kani autoharness would target functions that do not have harness annotations.

@carolynzech carolynzech added this pull request to the merge queue Feb 11, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 11, 2025
@carolynzech carolynzech added this pull request to the merge queue Feb 11, 2025
Merged via the queue into model-checking:main with commit 8b2ec77 Feb 11, 2025
27 of 28 checks passed
@carolynzech carolynzech deleted the optional-harness branch February 11, 2025 16:51
Copy link
Contributor

@zhassan-aws zhassan-aws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late comments, but I found that I had a bunch that I didn't submit.

kani-compiler/src/args.rs Show resolved Hide resolved
self.harness_info.get_mut(harness).unwrap().has_loop_contracts = true;
let metadata = self.harness_info.get_mut(harness).unwrap();
metadata.has_loop_contracts = true;
// If we're generating this harness automatically and we encounter a loop contract,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this necessary? Functions with loop contracts can be verified with #[kani::proof] (e.g. see https://github.com/model-checking/kani/blob/main/tests/expected/loop-contract/count_zero.rs).

Copy link
Contributor Author

@carolynzech carolynzech Feb 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I made the mistake of assuming that since we call them loop "contracts," they were composable with function contracts, and that does not appear to be the case.

I tried this example:

#![feature(stmt_expr_attributes)]
#![feature(proc_macro_hygiene)]

#[kani::requires(true)]
fn simple_loop_with_loop_contracts() {
    let mut x: u64 = kani::any_where(|i| *i >= 1);

    #[kani::loop_invariant(x >= 1)]
    while x > 1 {
        x = x - 1;
    }

    assert!(x == 1);
}


#[kani::proof_for_contract(simple_loop_with_loop_contracts)]
fn foo() {
    simple_loop_with_loop_contracts()
}

and the loop invariant is ignored, so the loop unwinds forever.

The autoharness generation feature shows the same behavior, i.e. given:

#![feature(proc_macro_hygiene)]
#![feature(stmt_expr_attributes)]

#[kani::requires(true)]
fn has_loop_contract() {
    let mut x: u8 = kani::any_where(|i| *i >= 2);

    #[kani::loop_invariant(x >= 2)]
    while x > 2 {
        x = x - 1;
    }

    assert!(x == 2);
}

the loop invariant is ignored.

This was an oversight on my part; I should have had test coverage for this loop contract / function contract combination case. That being said, I think we should clarify this in our documentation. @qinheping, I recommend updating the loop contracts reference chapter to clarify that the provided simple_loop_with_loop_contracts should be have a #[kani::proof] attribute--there's currently no attribute right now, so the example as given will just say "No proofs found to verify." I would also add an item to the limitations section that function contracts and loop contracts are not composable--i.e., if you try to prove your function contract, your loop invariant will be ignored.

I need to think more about what we should do if we encounter a function with both loop contracts and function contracts--I suppose we should generate both a #[kani::proof] and a #[kani::proof_for_contract] (to prove the loop invariant and the other contracts, respectively).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I'm at it, I'll implement the timeout/default unwind for loops without contracts that we discussed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the loop invariant is ignored.

Even with -Zloop-contracts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Z-BenchCI Tag a PR to run benchmark CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants