Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-46248: Allow CompoundRegion to be composed of many regions #74

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

andy-slac
Copy link

@andy-slac andy-slac commented Nov 7, 2024

To avoid very deep recursion when CompoundRegion includes many regions it is switched from 2-Region array to a non-empty vector of Regions. Python wrapper should be backward compatible for constructing the instance but it now takes arbitrary number of positional arguments. New method nOperands() was added to access the actual number of regions. (Python interface could be made more natural by adding iteration over regions).

There is still a performance issue with relate() implementation which has to iterate over all operands almost always just because we calculate all types of relations in the same method. To help performance when we only need to know that twio regions are disjoint, new Region.isDisjoint method is added.

Checklist

  • ran Jenkins
  • added a release note for user-visible changes to doc/changes

To avoid very deep recursion when CompoundRegion includes many
regions it is switched from 2-Region array to a non-empty vector of
Regions. Python wrapper should be backward compatible for
constructing the instance but it now takes arbitrary number of
positional arguments. New method `nOperands()` was added to
access the actual number of regions. (Python interface could be made
more natural by adding iteration over regions).

There is still a performance issue with `relate()` implementation
which has to iterate over all operands almost always just because we
calculate all types of relations in the same method.
This new method can be implemented more efficiently for CompoundRegion
that `relate` which returns superset of region relations.
@andy-slac
Copy link
Author

@TallJimbo, could you look at my changes to sphgeom, there are two changes which should help with the performance of UnionRegion:

  • CompoundRegion is now a non-empty vector of Regions
  • New method Region.isDisjoint to calculate only that type of region relation to avoid using more expensive relate. Maybe we need a better name for the new method, is there is already isDisjointFrom in some specific subclasses.

@@ -125,6 +125,7 @@ class UnionRegion : public CompoundRegion {
using Region::contains;
bool contains(UnitVector3d const &v) const override;
Relationship relate(Region const &r) const override;
bool isDisjoint(Region const& other) const override;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the precedent, I do think isDisjointFrom would be a slightly better name. But if that leads to lots of complications with overloading and shadowing it may not be worth it.

Or, if the exiting isDisjointFrom implementations are all exact rather than conservative, maybe we call this one isDefinitelyDisjointFrom? That's verbose, but it might make it a little more clear that False only means "maybe disjoint".

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, but isDisjointFrom is also "definitely disjoint", would isDefinitelyDisjointFrom make people think that isDisjointFrom is less definite?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was actually thinking the opposite, i.e. "definitely" means "only returns true if it is definitely disjoint", and now I see that it's confusing either way, so not a good choice for the name.

/// the reason for its existence is that in same cases it may be
/// implemented more efficiently.
///
/// The meaning of the returned value is the same as for `relate` metod -
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// The meaning of the returned value is the same as for `relate` metod -
/// The meaning of the returned value is the same as for `relate` method -

@@ -68,6 +67,9 @@ class CompoundRegion : public Region {
CompoundRegion &operator=(CompoundRegion const &) = delete;
CompoundRegion &operator=(CompoundRegion &&) = delete;

// Return number of operands
unsigned int nOperands() const { return _operands.size(); }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless there's precedent in sphgeom for using unsigned int for sizes of things, I think std::size_t would be better.

{
if (_operands.empty()) {
throw std::invalid_argument("CompoundRegion requires non-empty region list.");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative to banning empty lists that might be more convenient would be to declare:

  • an empty UnionRegion is an empty region, i.e. contains no points;
  • an empty IntersectionRegion is a full region, i.e. contains the full sphere.

I think that's the natural limiting behavior for all of the isDisjoint and relate implementations.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually started with that approach but it requires extra logic in few places which was not quite obvious to me, and I was not sure that it was useful at all. I'll try to restore that and add few unit tests.

// If any operand contains the given region, the union contains it.
// NOTE: The logic here is incomplete there may be cases when union
// of regions can fully contain a region even though individual regions
// do not contain it.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thankfully this is still consistent with the carefully conservative relate documentation:

/// Said another way: if the CONTAINS, WITHIN or DISJOINT bit is set, then
/// the corresponding spatial relationship between the two regions holds
/// conclusively. If it is not set, the relationship may or may not
/// hold.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I read relate documentation later, but forgot to remove my dummy comments.

// within it.
| ((r1 & WITHIN) | (r2 & WITHIN));
// NOTE: There may be cases when intersectiuon is within the region
// even though not all operand are within.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(NOTEs here are also consistent with the relate documentation.)

cls.def(
"cloneOperand",
[](CompoundRegion const &self, std::ptrdiff_t n) {
return self.getOperand(python::convertIndex(2, n)).clone();
int nOperands = self.nOperands();
return self.getOperand(python::convertIndex(nOperands, n)).clone();
Copy link
Member

@TallJimbo TallJimbo Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd love to have a way to let Python code iterate over regions without having to clone them, if we can do it safely. As long as we don't provide a way to modify a compound region after it has been constructed, I believe pybind11's return_value_policy::reference_internal should make it safe to return a reference to an operand. We could make the data member const to make it more clear that future changes should not relax this.

: _operands(std::move(operands)) {}
CompoundRegion::CompoundRegion(std::vector<std::unique_ptr<Region>> operands) noexcept
: _operands(std::move(operands))
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth it to have some logic here (or maybe in subclass constructors) to look for operands that are themselves compound regions of the same type, in order to flatten them out.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That logic would also need to handle nested unions and intersections differently, I guess?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but only in the sense that we don't try to simplify a compound region of the opposite type. I don't think we need to go looking for intersections buried under unions inside an intersection, for instance; if we did I think it'd be very easy to do more harm than good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants