Systematically survey message content for unimplemented features #190
Comments
We'll also want to keep these scripts in reusable form. It'll be useful to rerun them periodically, both as we make changes to the parser and to validate that there aren't new patterns we haven't implemented and are unaware of. In particular, when there's an invariant we believe applies and want to validate empirically, we can add asserts for that invariant to the parser, rerun the survey script, and see if it finds failures. (We'll basically always want to run the script with assertions enabled, since the point of it is to find situations we didn't expect.)
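The workflow described here (add an assert for a believed invariant, rerun the survey, count failures) could look roughly like the following. This is only an illustrative sketch in Python, not the project's Dart code: `parse_content`, `survey_invariants`, and the "content is never empty" invariant are all hypothetical stand-ins.

```python
from collections import Counter


def parse_content(html: str) -> list[str]:
    """Hypothetical stand-in for the real parser."""
    # An invariant we believe holds and want to validate empirically.
    assert html.strip(), "empty content"
    return html.split()


def survey_invariants(corpus):
    """Run the parser over (message_id, html) pairs; tally assertion failures."""
    # The survey is pointless if assertions are compiled out (python -O).
    if not __debug__:
        raise RuntimeError("run the survey with assertions enabled")
    failures = Counter()
    for _message_id, html in corpus:
        try:
            parse_content(html)
        except AssertionError as e:
            # Group failures by invariant message so patterns stand out.
            failures[str(e)] += 1
    return failures
```

Grouping by assertion message keeps the report short even on a large corpus; a real script would probably also record a few example message IDs per failure.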
(Separately, when we run into an unimplemented feature in a message we're trying to show in the message list, our current UI is pretty loudly explicit about that. That's helpful for development but we'll naturally want to handle it differently for normal use beyond the beta: #194.)
… features.

We added 2 scripts.

- fetch_messages.dart, the script that fetches messages from a given Zulip server. It does not depend on Flutter or other involved Zulip Flutter packages, so that it can run without Flutter. It is meant to be run first to produce the corpuses needed for surveying the unimplemented features. The fetched messages are stored in JSON Lines format, where each individual entry is JSON containing the message ID and the rendered HTML content. The user is encouraged to have a separate file for messages from each server, because message IDs are not unique across them.
- unimplemented_features_test.dart, a test that goes over all messages collected, parses them with the content parser, and reports the unimplemented features it discovered. This is implemented as a test mainly because of its dependency on the content parser, which depends on Flutter. It has to be run manually via: `flutter test --dart-define=corpusDir=path/to/corpusDir tools/content`

See comments from the file for more instructions.

Fixes: zulip#190

Signed-off-by: Zixuan James Li <[email protected]>
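To make the corpus format concrete, here is a small sketch of writing and reading JSON Lines, plus indexing several servers' corpora without ID collisions. It is in Python for brevity rather than the Dart of the actual scripts, and the field names `id` and `content` are assumptions: the commit message only says each entry holds the message ID and the rendered HTML content.

```python
import json


def dump_corpus(messages):
    """Serialize messages as JSON Lines: one JSON object per line."""
    # json.dumps escapes embedded newlines, so each entry stays on one line.
    return "".join(json.dumps(m) + "\n" for m in messages)


def load_corpus(text):
    """Parse JSON Lines text back into a list of message dicts."""
    return [json.loads(line) for line in text.split("\n") if line.strip()]


def index_corpora(corpora):
    """Index messages from several servers by (server, message ID).

    `corpora` maps a server name to that server's JSON Lines text.
    Message IDs are not unique across servers, so the key includes the server.
    """
    index = {}
    for server, text in corpora.items():
        for message in load_corpus(text):
            index[(server, message["id"])] = message["content"]
    return index
```

Keeping one file per server, as the commit message suggests, amounts to the same thing as the `(server, id)` key here: the file name carries the server part.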
Some main benefits of adding both tools/content scripts to tools/check are that you don't need to manage your directories for storing the zuliprc files and corpuses, or specify any command line options for fetching message contents and running the unimplemented features test. The script is intended to be run manually, not as a part of the CI, because it is very slow, and it relies on some out of tree files like API configs (zuliprc files) and big dumps of chat history. Fixes: zulip#190 Signed-off-by: Zixuan James Li <[email protected]>
Some main benefits of having a wrapper script to access dart code are that we can provide a more intuitive interface consistent with other tools, for fetching message corpuses and/or running the check for unimplemented features. Very rarely, you might want to use fetch_messages.dart directly, to use the `fetch-newer` flag for example to update an existing corpus file. If we find it helpful, the flag can be added to check-features as well, but we are skipping that for now. The script is intended to be run manually, not as a part of the CI, because it is very slow, and it relies on some out of tree files like API configs (zuliprc files) and big dumps of chat history. Fixes: zulip#190
Some main benefits of having a wrapper script to access dart code are that we can provide a more intuitive interface consistent with other tools, for fetching message corpuses and/or running the check for unimplemented features. Very rarely, you might want to use fetch_messages.dart directly, to use the `fetch-newer` flag for example to update an existing corpus file. If we find it helpful, the flag can be added to check-features as well, but we are skipping that for now. The script is intended to be run manually, not as a part of the CI, because it is very slow, and it relies on some out of tree files like API configs (zuliprc files) and big dumps of chat history. For the most part, we intend to only keep the detailed explanations in the underlying scripts close to the implementation, and selectively repeat some of the helpful information in the wrapper. Fixes: zulip#190
Some main benefits of having a wrapper script to access dart code are that we can provide a more intuitive interface consistent with other tools, for fetching message corpuses and/or running the check for unimplemented features. Very rarely, you might want to use fetch_messages.dart directly, to use the `fetch-newer` flag for example to update an existing corpus file. If we find it helpful, the flag can be added to check-features as well, but we are skipping that for now. The script is intended to be run manually, not as a part of the CI, because it is very slow, and it relies on some out of tree files like API configs (zuliprc files) and big dumps of chat history. For the most part, we intend to only keep the detailed explanations in the underlying scripts close to the implementation, and selectively repeat some of the helpful information in the wrapper. This also repeats some easy checks for options, so that we can produce nicer error messages for some common errors (like missing zuliprc for `fetch`). Fixes: zulip#190
… features.

We added 2 scripts.

- fetch_messages.dart, the script that fetches messages from a given Zulip server. It does not depend on Flutter or other involved Zulip Flutter packages, so that it can run without Flutter. It is meant to be run first to produce the corpora needed for surveying the unimplemented features. The fetched messages are stored in JSON Lines format, where each individual entry is JSON containing the message ID and the rendered HTML content. The script stores output in separate files for messages from each server, because message IDs are not unique across them.
- unimplemented_features_test.dart, a test that goes over all messages collected, parses them with the content parser, and reports the unimplemented features it discovered. This is implemented as a test mainly because of its dependency on the content parser, which depends on the Flutter engine (and `flutter test` conveniently sets up a test device).

We mostly avoid prints (https://dart.dev/tools/linter-rules/avoid_print) in both scripts. While we don't lose much by disabling this lint rule for them, because they are supposed to be CLI programs after all, the rule (potentially) helps with reducing developer inclination to be verbose.

See comments from the scripts for more details on the implementations.

=====

Some main benefits of having a wrapper script to access dart code are that we can provide a more intuitive interface consistent with other tools, for fetching message corpora and/or running the check for unimplemented features.

Very rarely, you might want to use fetch_messages.dart directly, to use the `fetch-newer` flag for example to update an existing corpus file. If we find it helpful, the flag can be added to check-features as well, but we are skipping that for now.

The script is intended to be run manually, not as a part of the CI, because it is very slow, and it relies on some out of tree files like API configs (zuliprc files) and big dumps of chat history.

For the most part, we intend to only keep the detailed explanations in the underlying scripts close to the implementation, and selectively repeat some of the helpful information in the wrapper. This also repeats some easy checks for options, so that we can produce nicer error messages for some common errors (like missing zuliprc for `fetch`).

Fixes: zulip#190
… features.

We added 2 scripts and a wrapper for them both.

- fetch_messages.dart, the script that fetches messages from a given Zulip server. It does not depend on Flutter or other involved Zulip Flutter packages, so that it can run without Flutter. It is meant to be run first to produce the corpora needed for surveying the unimplemented features. The fetched messages are stored in JSON Lines format, where each individual entry is JSON containing the message ID and the rendered HTML content. The script stores output in separate files for messages from each server, because message IDs are not unique across them.
- unimplemented_features_test.dart, a test that goes over all messages collected, parses them with the content parser, and reports the unimplemented features it discovered. This is implemented as a test mainly because of its dependency on the content parser, which depends on the Flutter engine (and `flutter test` conveniently sets up a test device).

We mostly avoid prints (https://dart.dev/tools/linter-rules/avoid_print) in both scripts. While we don't lose much by disabling this lint rule for them, because they are supposed to be CLI programs after all, the rule (potentially) helps with reducing developer inclination to be verbose.

See comments from the scripts for more details on the implementations.

=====

Some main benefits of having the wrapper script to access dart code are that we can provide a more intuitive interface consistent with other tools, for fetching message corpora and/or running the check for unimplemented features.

Very rarely, you might want to use fetch_messages.dart directly, to use the `fetch-newer` flag for example to update an existing corpus file. If we find it helpful, the flag can be added to check-features as well, but we are skipping that for now.

The script is intended to be run manually, not as a part of the CI, because it is very slow, and it relies on some out of tree files like API configs (zuliprc files) and big dumps of chat history.

For the most part, we intend to only keep the detailed explanations in the underlying scripts close to the implementation, and selectively repeat some of the helpful information in the wrapper. This also repeats some easy checks for options, so that we can produce nicer error messages for some common errors (like missing zuliprc for `fetch`).

Fixes: zulip#190
Our parsing of Zulip message content HTML is designed to be precise about what it expects, and explicit about anything it doesn't understand. This means that when we encounter some content that does something we don't have support for, our code generally knows that, rather than silently plow ahead with a wrong interpretation.
This is helpful because, among other things, it means that if we take a corpus of Zulip message content, we can run our parser on it to learn about constructs that exist in the wild that we haven't yet implemented. (This includes constructs that a current Zulip server will generate, and constructs that older servers would generate and consequently still exist in old messages.)
So we should do that, as an iterative process:
Some specific likely steps in that process:
… `parseContent`, and report any `UnimplementedNode` results (as well as any crashes, which should be fixed immediately).
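As a rough illustration of that step, the sketch below runs a parser over a corpus, tallies unimplemented constructs, and records crashes separately. It is written in Python with a toy parser; the names `parse_content`, `UnimplementedNode`, `KNOWN_TAGS`, and `survey` here are hypothetical stand-ins that only mirror the idea, not the project's actual Dart `parseContent` API.

```python
import re
from collections import Counter
from dataclasses import dataclass

# Toy allowlist: tags this pretend parser "implements".
KNOWN_TAGS = {"p", "em", "strong", "a"}


@dataclass
class Node:
    kind: str


@dataclass
class UnimplementedNode(Node):
    debug_html: str = ""


def parse_content(html):
    """Toy parser: one node per opening tag, explicit about unknown tags."""
    nodes = []
    for tag in re.findall(r"<([a-zA-Z][a-zA-Z0-9-]*)", html):
        if tag in KNOWN_TAGS:
            nodes.append(Node(kind=tag))
        else:
            nodes.append(UnimplementedNode(kind="unimplemented", debug_html=tag))
    return nodes


def survey(corpus):
    """Return (counts of unimplemented constructs, list of crashes)."""
    unimplemented = Counter()
    crashes = []
    for message_id, html in corpus:
        try:
            nodes = parse_content(html)
        except Exception as e:  # a crash is a bug to fix, so report it
            crashes.append((message_id, repr(e)))
            continue
        for node in nodes:
            if isinstance(node, UnimplementedNode):
                unimplemented[node.debug_html] += 1
    return unimplemented, crashes
```

Sorting the resulting counts by frequency would give a natural priority order for which constructs to implement first.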