Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chapter2: switch O(n) #31

Open
rvansa opened this issue Mar 7, 2019 · 6 comments
Open

chapter2: switch O(n) #31

rvansa opened this issue Mar 7, 2019 · 6 comments

Comments

@rvansa
Copy link

rvansa commented Mar 7, 2019

I actually got surprised that type switches are O(N).

"how else could it work?"

I would assume that since the list of hashes is known at compile time, the compiler could find a perfect hashing function and just do a few arithmetic ops to get the address on which to jump1. At least if the size of switch exceeds some threshold.

If finding perfect hash would be too time-consuming, it could at least sort the hashes and do a binary search.

I guess the answer is that this is one of the optimizations Go is still awaiting...

EDIT: 1 Actually it would need to get just an index and lookup the target address; compiler knows the list of target hashes but not all input hashes.

@neetle
Copy link

neetle commented Mar 7, 2019

Half the issue would be go's duck typing, wouldn't it? Since any type can fulfil any interface implicitly, you'd need to manually check each one.

@rvansa
Copy link
Author

rvansa commented Mar 7, 2019

@neetle I don't follow... Yes, any type can be passed to the switch, and if that is not one of the expected types the hash function would simply transform its hash to one of the indices. Then you'd jump to the given case, compare actual type (which you need to do anyway because of collisions on the 'original' hash) and jump to the default branch if it's not matching.

@xry111
Copy link

xry111 commented Mar 7, 2019

This thing is just like "cascaded" dynamic_casts in C++. We should always refactor it to a function call via a interface if n is really large.

@neetle
Copy link

neetle commented Mar 7, 2019

I agree with you when it comes to things like int and string. I was talking about type switches operating on interfaces like the following:

func foo(x interface{}) {
	switch x.(type) {
	case io.Reader:
		// normal interfaces
	case interface {
		io.Writer
		CanBar() bool
	}:
		// embedded interfaces
	case struct { Baz string `tag:"value"` }:
		// anon struct matching
	}
}

For values of x above, you'd need to verify that x is equal to the target type. For interface{} values, this will likely just be a comparison of the method set ensuring the method set of x is a superset of that in the interface. For the structs, the matching is completely different given the funky rules around type equality checks between different structs. If it was just a list of "primitives", I could see the use of the hashing function. As soon as you introduce something that has a little more nuance than just "does type declaration x == type declaration y" though, it falls apart.

As an aside, I'm not sure the hash-lookup strategy would even be worth the cost of entry. O(n) isn't evil by its nature. Sometimes it's worth dropping back to O(n) when the constant introduced by O(log(n)) or "better" far exceeds the real-world numbers O(n) gives you.

@rvansa
Copy link
Author

rvansa commented Mar 8, 2019

@neetle OK, I was not aware that you can do embedded interfaces/anon struct matching like the above. The assembly for that would probably look quite different to the one shown in the chapter. The simple matching is probably too niche to be worth the optimization.

I agree that there would probably need to be a threshold to go for any 'smarter' strategy, and that should be set based on experimental evidence; I just assumed that given that arithmetic operations as those performed by common hashing functions are cheap enough.

@neetle
Copy link

neetle commented Mar 8, 2019

No problem! Most people don't consider the pathological case that our more "inspired" coworkers take. As an aside - the struct doesn't need to be anon & inline to match, it just needs to have the same fields in the same order with the same tags iirc.

If we were to drift into theorising land, I'd argue that a better strategy would be to use something more akin to bloom trees. The "perfect hash" for them could be calculated at compile time, along with the bloom tree itself. Following that, it's just a "make sure it's where we think it is". In saying that though, that could be expensive, unwieldy and the possible interactions with reflection make me want to gag. Also, my compsci skills are non-existent so take this last portion with a mountain of salt.

Repository owner deleted a comment from katsavav Feb 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants