Build the string directly instead of copying and then pruning tabs or newlines #536
Thanks. Please consider running the benchmarks. We need some evidence that the change is beneficial (meaning that the code runs faster).
Sure. Here is the benchmark result: Before:
After:
The benchmark results are unstable (and were only collected on x86), so you may want to run them on your side if you are interested :)
@anonrig I will assess this soon. I recommend waiting before we include it in a release. (Not because I think it is bad, but because we want to make sure we document the benefits.)
It provides a different output for me. If you run the script with Before:
After:
I am also getting so-so results with GCC 12 and a recent Intel compiler. Before:
After:
As you can see, you save about 0.01 instructions per byte (which is effectively zero), and the performance is down. So this PR does not seem to improve performance.
When I was trying to improve this, I concluded that the biggest performance improvement would be achieved by changing line 499, tmp_buffer=input
Following the comment below, I added a new function called get_ascii_tab_or_newline_removed to build the string directly instead of copying and then pruning tabs or newlines.
ada/src/url.cpp
Lines 500 to 502 in d8f77a1
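To make the two strategies concrete, here is a minimal sketch of "copy, then prune" versus "build directly". It is not the code from this PR: the helper names (is_ascii_tab_or_newline, copy_then_prune, build_directly) are hypothetical, and the set of stripped characters (tab, line feed, carriage return) is an assumption for illustration.

```cpp
#include <algorithm>
#include <string>
#include <string_view>

// Hypothetical predicate: the characters assumed to be stripped from the input.
inline bool is_ascii_tab_or_newline(char c) noexcept {
  return c == '\t' || c == '\n' || c == '\r';
}

// Strategy 1: copy the whole input, then erase unwanted characters in place.
std::string copy_then_prune(std::string_view input) {
  std::string out(input);
  out.erase(std::remove_if(out.begin(), out.end(), is_ascii_tab_or_newline),
            out.end());
  return out;
}

// Strategy 2: reserve once and append only the characters that are kept,
// so the unwanted characters are never copied at all.
std::string build_directly(std::string_view input) {
  std::string out;
  out.reserve(input.size());
  for (char c : input) {
    if (!is_ascii_tab_or_newline(c)) {
      out.push_back(c);
    }
  }
  return out;
}
```

Both functions return the same string. Whether the second one is actually faster depends on the compiler and the input, which is why the discussion asks for benchmark numbers before merging.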
Note: when adding an inline or always_inline attribute to a function, it is better to put the function definition in the header file instead of the source file. Inlining cannot take effect if the definition is not reachable inside the current translation unit (unless link-time optimization is enabled), and a definition placed in a source file may end up in a different translation unit from its callers.
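As a rough illustration of the note above, the sketch below shows the layout in a single listing. The file names and both function names (has_tab_or_newline, needs_pruning) are hypothetical and are not taken from the ada sources.

```cpp
// ---- hypothetical header, e.g. helpers.h ----
#include <string_view>

// The definition lives in the header, so every translation unit that
// includes it sees the body and the compiler may inline the call
// without relying on link-time optimization.
inline bool has_tab_or_newline(std::string_view s) noexcept {
  return s.find_first_of("\t\n\r") != std::string_view::npos;
}

// ---- hypothetical source file, e.g. url.cpp, which includes helpers.h ----

// The call below can be inlined because the callee's body is visible here.
bool needs_pruning(std::string_view url) noexcept {
  return has_tab_or_newline(url);
}
```

If has_tab_or_newline were only declared in the header and defined in another source file, the compiler would see just the declaration at the call site and could inline it only with link-time optimization enabled.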