Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ASTNode::simplify has supralinear performance with deep nesting of expressions #4562

Open
whitequark opened this issue Aug 22, 2024 · 4 comments
Labels
pending-verification This issue is pending verification and/or reproduction

Comments

@whitequark
Copy link
Member

whitequark commented Aug 22, 2024

Version

Yosys 0.39+1 (git sha1 f7153573c, ccache clang++ 14.0.6 -O0 -fPIC)

On which OS did this happen?

Linux

Reproduction Steps

Run yosys stackoverflow.v -p hierarchy while commenting out some of the 512 levels of nesting (with stackoverflow.v being the example from YoWASP/yosys#37) and observe time plus stack usasge, for example with valgrind --tool=drd --main-stacksize=16777216 --show-stack-usage=yes.

Expected Behavior

Time is linear in input size.

Actual Behavior

Time is quadratic in input size:

Screenshot_20240822_190839

(link to the spreadsheet)

LibreOffice Calc provides the following polynomial fit:

Screenshot_20240822_190905

Also, using 24.5K per nesting level seems excessive and results in issues such as YoWASP/yosys#37.

@whitequark whitequark added the pending-verification This issue is pending verification and/or reproduction label Aug 22, 2024
@widlarizer
Copy link
Collaborator

using 24.5K per nesting level

I ran massif on stackoverflow.v. In the peak snapshot, I can see

92.53% (3,345,110B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->32.73% (1,183,104B) 0x720AB4: Yosys::AST::AstNode::clone() const (frontends/ast/ast.cc:253)
| ->32.69% (1,181,664B) 0x720BB7: Yosys::AST::AstNode::clone() const (frontends/ast/ast.cc:256)
| | ->32.63% (1,179,648B) 0x720BB7: Yosys::AST::AstNode::clone() const (frontends/ast/ast.cc:256)

and so on, there's a lot of deep copies of nodes. There's also a lot of memory spent on parser-constructed AstNodes. Edited massif.log. The sizeof(AstNode) is 288B according to a dump I did some weeks back. This means we need 85 AstNodes per ternary nesting level.

@whitequark
Copy link
Member Author

That's 24.5K of stack memory, not heap memory. (I'm sure heap use isn't great either, but that's not what I'm talking about.)

You only get 1M of stack on Windows and macOS, so beyond 80 levels or so Yosys crashes.

@widlarizer
Copy link
Collaborator

...right, of course. valgrind --tool=massif --heap=no --stacks=yes ./yosys garbage/stackoverflow.v -p "hierarchy" says we get up to 1,335,808 bytes of stack, but can't attribute it to its sources. perf shows that AstNode::detectSignWidthWorker takes up 90% of the runtime.

@widlarizer
Copy link
Collaborator

Just noticed that we're raising the stack size in a linux ifdef in kernel/rtlil.cc

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pending-verification This issue is pending verification and/or reproduction
Projects
None yet
Development

No branches or pull requests

2 participants