
Document how to write and run tests #373

Draft · wants to merge 8 commits into main

Conversation

@caspervonb (Contributor) commented Jan 14, 2021:

Based on my recent work and our discussions in yesterday's meeting, I started drafting the basis of a simple specification of a test runner and guidelines for authoring tests.

@@ -0,0 +1,127 @@
# Testing

Member commented:

I wonder if the sources for tests should be in a separate directory; that way a test source can be in any format, including multi-file or even a nested hierarchy of files. Specifically I see two advantages to this:

  1. The test binary directory (the one the test runner processes) only contains files with a fixed set of known suffixes.
  2. The test source directory structure can be language-specific and take whatever form it likes. Perhaps some source languages like to have each program in its own directory, or each program requires several files?

@caspervonb (Contributor, author) commented Jan 15, 2021:

Presumably we'll gather all the tests in a webassembly/wasi-testsuite repository, à la webassembly/testsuite, that can be pulled in as a submodule or subtree containing all the test fixtures you'll need to bring your own runner to an existing test setup.

Having a single self-contained source file here makes it really easy to reference exactly what is failing; an implementation's test runner can even parse that programmatically from a trace and output it.

That said, some of the webassembly proposal repositories do have a test/meta directory which contains the files used to generate tests.

Generating parametric tests this way is definitely an option, but I'd still like the generated sources to be available in the main test directory next to the binary for reference, without having to "ship" the entire meta directory.

@caspervonb (Contributor, author) commented Jan 15, 2021:

The test binary directory (the one the test runner processes) only contains files with a fixed set of known suffixes.

I don't think we can promise this in general.

As proposals emerge, the tests may have to be expanded upon with new meta files; additional permissions for sockets/networking come to mind, for example.

This initial set is just enough to support the filesystem and command proposals.

Contributor commented:

One additional comment to this: though wasi-nn is rather exotic compared to, say, wasi-filesystem, I would like to consider what it would take to write WASI tests. With the current spec, the ML model and the input tensors would likely be shipped as separate files since they are rather large. Does that fit into this model?

Additionally, the output tensor(s) will likely not be exact matches but instead need some type of fuzzy assertion. Not sure how we will handle that...

Contributor (author) commented:

One additional comment to this: though wasi-nn is rather exotic compared to, say, wasi-filesystem, I would like to consider what it would take to write WASI tests. With the current spec, the ML model and the input tensors would likely be shipped as separate files since they are rather large. Does that fit into this model?

It's open-ended and additive, so it can be extended to encompass whatever we need, but I'm not sure what wasi-nn needs; I'm not that familiar with the proposal, but it seems it just loads models from memory?

If so, then a data section seems more appropriate.

Additionally, the output tensor(s) will likely not be exact matches but instead need some type of fuzzy assertion. Not sure how we will handle that...

Depends on the previous question, but probably just define some fuzzy asserters in the source language.
Internal logic should be handled internally, IMO.
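For illustration only (the helper name and tolerance policy are made up, not part of any proposal), such a fuzzy asserter defined in the source language could be as simple as an element-wise comparison with an absolute tolerance:

```rust
/// Hypothetical fuzzy asserter for output tensors: element-wise comparison
/// against a reference, within an absolute tolerance. The name and tolerance
/// policy are illustrative, not part of the wasi-nn proposal.
fn approx_eq(actual: &[f32], expected: &[f32], tolerance: f32) -> bool {
    actual.len() == expected.len()
        && actual
            .iter()
            .zip(expected)
            .all(|(a, e)| (a - e).abs() <= tolerance)
}

fn main() {
    // A test would compare the inference output against a stored reference.
    let output = [0.101_f32, 0.899];
    let reference = [0.1_f32, 0.9];
    assert!(approx_eq(&output, &reference, 1e-2));
}
```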

Really a case of "you tell me", though; I'm not familiar enough with that proposal at this time to make any recommendations.

@abrown (Contributor) commented Jan 21, 2021:

Depends on the previous question but probably just define some fuzzy asserters in the source language.
Internal logic should be handled internally IMO.

Sounds good.

I'm not sure what wasi-nn needs, not that familiar with the proposal but seems it just loads models from memory?

That's right but, practically speaking, those models/weights/tensors are large and are easier to manage as separate files than as static data sections in a compiled Wasm file. It's not an exact parallel, but in an example over in Wasmtime I load this data from files before using them in wasi-nn. If we could bundle additional files with the tests and were allowed to use these files in <basename>.arg (or even statically in the test, I guess), that would seem easiest. What do you think?

Member commented:

Another reason to embed any data the test needs is that it avoids the dependency on the filesystem API, making the test more precise and less fragile. It should be fairly straightforward to embed data in wasm; I think there are a few different ways to do this if you are using an LLVM-based toolchain.

@caspervonb (Contributor, author) commented Jan 21, 2021:

If we could bundle additional files with the tests and were allowed to use these files in <basename>.arg (or even statically in the test, I guess), that would seem easiest. What do you think?

You could, but this also introduces a lot of extra unrelated dependencies in your tests that can be avoided.

Different toolchains have different mechanisms for this, but it can be quite trivial to embed the data. In Rust, for example, one could use the include_bytes! macro:

```rust
let model_1 = include_bytes!("model_1.tensor");
```

Other toolchains have their own ways too.

docs/Testing.md:
## Writing tests

Any source language may be used to write tests as long as the compiler supports
the wasm32-unknown-unknown target and is well-known and freely available.
Member commented:

wasm32-unknown-unknown is an llvm-ism so we probably don't want to mention that here. Perhaps:

"Any source language may be used to write tests as long as the compiler supports
WebAssembly output targeting the WASI API"

The part about "well-known and freely available" will probably want to be tightened up too.

@caspervonb (Contributor, author) commented Jan 15, 2021:

wasm32-unknown-unknown is an llvm-ism so we probably don't want to mention that here.

Targeting wasm32-wasi is probably fine actually, just don't pull in libc.

I just want to make sure we point out that using wasm32-wasi's main is generally going to make for a bad test.

For example, trying to test proc_exit in main isn't great, as it'll do all the bootstrapping that libc needs but which isn't relevant in that context.

The part about "well-known and freely available" will probably want to be tightened up too.

Yeah I'll need to think about what that means.

I'm thinking:

  • We can't allow proprietary compilers.
  • We can't allow non-portable compilers.
  • We can't allow toy languages that won't be or aren't maintained.

I'll double back to this later.

Member commented:

I do tend to think we do want to allow tests to pull in libc. It does call proc_exit, and some of the stdio routines needlessly pull in fdstat_get and perhaps other things; however, bare WASI is really inconvenient to code to, so if we have to do everything in bare WASI, it seems like we'd end up with people writing fewer tests. It seems better to have more tests, even if some of them do call more functions than they strictly need to.

@caspervonb (Contributor, author) commented Jan 28, 2021:

I do tend to think we do want to allow tests to pull in libc. It does call proc_exit, and some of the stdio routines needlessly pull in fdstat_get and perhaps other things; however, bare WASI is really inconvenient to code to, so if we have to do everything in bare WASI, it seems like we'd end up with people writing fewer tests. It seems better to have more tests, even if some of them do call more functions than they strictly need to.

From experience it was a bit tedious to deal with, as _main also calls into the prestat, args, and env functions as well (at least at the time).

Simple to write tests against, but early versions of WASI in Deno basically couldn't be tested with my libc tests until everything was bootstrapped.

I think we'll just have to revisit this once the split is done and it's a bit clearer how that's going to work; but presumably we'd want to generate headers and such based on the local witx files and not be dependent on the current state of, say, the wasi-libc implementation and headers or the wasi crate (which would also be a circular dependency).

As for wasi-libc itself, we can test that in wasi-sdk with the same harness; it's already fairly similar. We'd just have to collect those tests along with the tests from this repository into something like webassembly/wasi-testsuite to let implementers make use of them as well.

docs/Testing.md:
- \<basename\>.stdin
- \<basename\>.stdout
- \<basename\>.stderr
- \<basename\>.status
Member commented:

We should probably say something about where the tests live for a given proposal, and how a test runner can/should find the tests for a given proposal or all proposals.

Contributor (author) commented:

We should probably say something about where the tests live for a given proposal

Read through make-snapshot.sh and I'm thinking a proposal repository would put its tests in test/<module_name>/<test_name>, relative to the root directory of the repository.

The make-snapshot script merges proposals into $snapshot_dir/proposals/${repo}, so this allows us to use simple path filtering to match just about any selection of tests.

Run all the tests in a standalone repository:

```bash
git clone https://github.com/WebAssembly/wasi-clocks
wasi-test-runner.sh wasmtime wasi-clocks/test
```

Run a specific proposal's tests in a snapshot:

```bash
git clone https://github.com/WebAssembly/wasi
wasi-test-runner.sh wasmtime wasi/snapshot/ephemeral/proposals/<proposal>
```

Run a specific module's tests in standard:

```bash
git clone https://github.com/WebAssembly/wasi
wasi-test-runner.sh wasmtime wasi/standard/<module>
```

Simple but effective.

Contributor (author) commented:

Just having the full test path be test/<module>.<function>[-<variant>].<ext> would also work if we desire a flatter structure.
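For what it's worth, a flat naming scheme like that stays easy to process mechanically; here is a sketch of splitting such a path back into its parts (the helper name and its behavior on malformed input are my own invustration, not part of the proposal):

```rust
/// Split a flat test file name of the form
/// `<module>.<function>[-<variant>].<ext>` into (module, function, variant).
/// Illustrative only; this helper is not part of the proposed spec.
fn parse_test_name(file: &str) -> Option<(String, String, Option<String>)> {
    // Drop the extension, e.g. `.wasm`.
    let (stem, _ext) = file.rsplit_once('.')?;
    // The first `.` separates the module from the function.
    let (module, rest) = stem.split_once('.')?;
    // An optional `-<variant>` suffix follows the function name.
    let (function, variant) = match rest.split_once('-') {
        Some((f, v)) => (f, Some(v.to_string())),
        None => (rest, None),
    };
    Some((module.to_string(), function.to_string(), variant))
}

fn main() {
    let parsed = parse_test_name("clocks.clock_time_get-monotonic.wasm");
    println!("{parsed:?}");
}
```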

docs/Testing.md:

In addition to the source and binary files, a test can have any of the following
auxiliary files and directories:
- \<basename\>.arg
Contributor (author) commented:

TODO: add an example for each file type.

Base automatically changed from master to main January 19, 2021 23:08
@caspervonb (Contributor, author) commented Jan 21, 2021:

One open ended question is how do we handle printing diagnostics to the testing environment.

Assertions which just call unreachable aren't that great, but it would be preferable to not depend on output_stream in every single test just to print the failure before aborting.

Spectests rely on getting a spectest module which just has a bunch of prints in it; maybe we could do something similar.

Doesn't matter that much in, say, Deno's current test suite, where I'm expecting us to support 100%, but when we're testing specific modules, as we'll end up doing here, it'd be good if we cut this inter-module dependency.

@pchickey (Contributor) commented:

The wasmtime repo's WASI test suite uses Rust as the source language, which fails by panicking. It will print messages to stdout and eventually hit the unreachable opcode. Our test harness captures guest stdout and stderr, and prints any of them that are non-empty when the instance traps.

This doesn't do you much good if your implementation doesn't have basic stuff like writing to stdio working, but in my experience that is usually the first thing you get working.
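As a rough sketch of that harness behavior (the runtime invocation is stood in for by a shell command, and all names are illustrative; this is not Wasmtime's actual harness code):

```rust
use std::process::Command;

/// Run a guest under some runtime CLI, capture its stdout/stderr, and print
/// any non-empty stream when the run fails (e.g. because the instance
/// trapped). Illustrative only; not the actual wasmtime test harness.
fn run_and_report(cmd: &str, args: &[&str]) -> bool {
    let out = Command::new(cmd)
        .args(args)
        .output()
        .expect("failed to spawn runtime command");
    if !out.status.success() {
        if !out.stdout.is_empty() {
            println!("guest stdout:\n{}", String::from_utf8_lossy(&out.stdout));
        }
        if !out.stderr.is_empty() {
            println!("guest stderr:\n{}", String::from_utf8_lossy(&out.stderr));
        }
    }
    out.status.success()
}

fn main() {
    // Stand-in for `<runtime> <test>.wasm`: a command that fails noisily.
    let passed = run_and_report("sh", &["-c", "echo panic message 1>&2; exit 1"]);
    println!("passed: {passed}");
}
```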

@caspervonb (Contributor, author) commented:

The wasmtime repo's WASI test suite uses Rust as the source language, which fails by panicking. It will print messages to stdout and eventually hit the unreachable opcode. Our test harness captures guest stdout and stderr, and prints any of them that are non-empty when the instance traps.

I'd assume that to be the case for any current tests, but with the new module split I'm not sure that makes sense to do here.

@sbc100 (Member) commented Jan 22, 2021:

The wasmtime repo's WASI test suite uses Rust as the source language, which fails by panicking. It will print messages to stdout and eventually hit the unreachable opcode. Our test harness captures guest stdout and stderr, and prints any of them that are non-empty when the instance traps.

I'd assume that to be the case for any current tests, but with the new module split I'm not sure that makes sense to do here.

I think it makes sense to assume stdout and the ability to set an exit code as a minimum bar for running any test. If we don't do that, we would be forced to create some other bespoke I/O system just for the test system (wouldn't we?). I'm not sure it's worth the extra complexity to do that.

@caspervonb (Contributor, author) commented Jan 26, 2021:

I think it makes sense to assume stdout and the ability to set an exit code as a minimum bar for running any test. If we don't do that, we would be forced to create some other bespoke I/O system just for the test system (wouldn't we?). I'm not sure it's worth the extra complexity to do that.

It does make them more integration-test-like, which is fine; I just wanted to point it out, as I'm unclear on where this modularization that started last year or so is going (especially weak linkage).

Deno and my test repos do this and it has worked out fine.

Technically you don't actually even have to implement stdio (but it does have to be stubbed); failed assertions will always terminate with an unreachable instruction at the end regardless.
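Sketching that failure pattern (the helper name is my own invention; the claim that abort() ends in an `unreachable` trap holds for typical wasm32-wasi toolchains but should be verified per toolchain):

```rust
/// Hypothetical assertion helper for a test: report the failure over stderr
/// if stdio works, then abort. On typical wasm32-wasi toolchains, abort()
/// ends in an `unreachable` trap, so a runtime with stubbed-out stdio still
/// observes the failure as a trap.
fn expect(cond: bool, what: &str) {
    if !cond {
        eprintln!("FAILED: {what}");
        std::process::abort();
    }
}

fn main() {
    expect(2 + 2 == 4, "basic arithmetic");
    println!("all assertions passed");
}
```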

@linclark linclark added this to the WASI Modularization milestone Jan 26, 2021
@sunfishcode (Member) left a comment:

Thanks for writing this up! This looks like what we talked about in the meeting, and a good start.

A test case takes the form of a binary \<basename\>.wasm file; next to it
there will always be a \<basename\>.\<ext\> source file from which the
test was originally compiled, which can be used as a reference in the event
of an error.
Member commented:

I like the simplicity of this, but I expect it'll be too limiting as we add more tests in more languages. It'll be convenient to build Rust source files with cargo, and other language source files with their respective build systems, and that'll often require additional support files, so it won't always be as simple as <basename>.<ext>.

What would you think of a convention where we have a sources directory, and in it we have source code and build scripts that one can run to re-generate the wasm files?

@caspervonb (Contributor, author) commented Jan 28, 2021:

What would you think of a convention where we have a sources directory, and in it we have source code and build scripts that one can run to re-generate the wasm files?

Some of the WebAssembly proposals have a meta directory that contains wast generators; might want to keep with that convention?

Would still like to see the "final" source files next to the test binaries as part of the build, for reference, as more often than not assertion errors don't make much sense without looking at the source.

It's slight duplication, but copying them back out to the parent directory next to the binary as part of the build means the implementer running the tests doesn't have to care how a particular proposal structured its build system.

In fact, this way the meta directory could just be omitted from an implementation's test-data folder entirely if they have no need for it.

docs/Testing.md:

- Prepare inputs
- Given an `<input>.wasm` file; take the `<basename>` of said file.
- If `<basename>.<arg>` exists; take the program arguments from said file.
Member commented:

It's a minor detail, but I'd find this more clear with ".args" instead of ".arg".

Also, we should document the whitespacing convention in the arguments file. Can we say that arguments are separated by whitespace, and '"' quotes may be used to enclose an argument containing whitespace?
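A sketch of that convention (whitespace-separated arguments, with '"' quotes enclosing an argument containing whitespace); the helper name and the edge-case behavior are illustrative, since the convention isn't pinned down yet:

```rust
/// Parse a hypothetical `<basename>.args` file: arguments are separated by
/// whitespace, and '"' quotes may enclose an argument containing whitespace.
/// Illustrative only; escaping rules etc. would still need to be specified.
fn parse_args(contents: &str) -> Vec<String> {
    let mut args = Vec::new();
    let mut current = String::new();
    let mut in_quotes = false;
    let mut started = false; // whether the current argument has begun
    for ch in contents.chars() {
        match ch {
            '"' => {
                // Toggle quoting; an empty quoted pair still yields an argument.
                in_quotes = !in_quotes;
                started = true;
            }
            c if c.is_whitespace() && !in_quotes => {
                if started {
                    args.push(std::mem::take(&mut current));
                    started = false;
                }
            }
            c => {
                current.push(c);
                started = true;
            }
        }
    }
    if started {
        args.push(current);
    }
    args
}

fn main() {
    let args = parse_args("--dir /tmp \"hello world\"\n");
    println!("{args:?}");
}
```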

docs/Testing.md:
```bash
# Usage: $1 <runtime> <path_to_binary.wasm>
# $1 wasmtime proc_exit-success.wasm
# $1 wasmer proc_exit-failure.wasm
```
Member commented:

I know these are only examples, but in the interests of not wanting to exclude engines, we shouldn't mention either of these examples here. Would "Usage: $1 <runtime_cmd> <path_to_binary>.wasm" be clear enough, without the examples?
