Garbage collection #7
Comments
Anyway, we will probably end up with a garbage collector without any of the
As for the exact choices:
I think that the simplest choice is a serial, stop-the-world, non-compacting, generational garbage collector.
As for the generations:
Btw. this whole thing depends on me having the time and the will to write it.
After some thought, there is a way to implement nice garbage collection in Swift. The catch is that it requires move-only types.
Semantics
Then we would define a type with the following semantics:
Collections/aggregates would be a pain to deal with, but oh… well… (also, we can define our own collections with
Obviously, we would still need a garbage collector to resolve reference cycles…
Implementation
// Helper protocol to which all of the types conform (already exists).
protocol PyObjectMixin: MoveOnly {
  var ptr: RawPtr { get }
  init(ptr: RawPtr)
}

extension PyObjectMixin {
  /// Create a new reference to this object.
  func copyRef() -> Self {
    let result = Self(ptr: self.ptr)
    result.referenceCount += 1 // Stored on the heap at 'self.ptr + someOffset'
    return result
  }
}

struct PyInt: PyObjectMixin, MoveOnly {
  // Stored property: the '{ get }' form is only valid in the protocol requirement.
  let ptr: RawPtr

  init(ptr: RawPtr) {
    self.ptr = ptr
  }

  // 'MoveOnly' enables 'deinit' on 'struct'
  deinit {
    self.referenceCount -= 1
    if self.referenceCount == 0 {
      // Let it go, let it go
      // Can't hold it back anymore
      // Let it go, let it go
      // Turn away and slam the door
    }
  }
}

// === Usage ===
func foo() {
  let int: PyInt = …
  let intCopy = int.copyRef() // ref count += 1
  bar(int: intCopy) // consumes 'intCopy', calls 'ref count -= 1'
  print("foo: \(int)") // ok
  print("foo: \(intCopy)") // compiler error, 'intCopy' was consumed by 'bar'
}

func bar(int: consume PyInt) {
  print("bar: \(int)")
  // end of lifetime: ref count -= 1
}
Downcasting
Downcasting would be weird, because it would always retain. We can't express: "valid cast ->
let object: PyObject = …
if let int = py.cast.asInt(object) {
  // 'object' was moved into 'int'
} else {
  // 'object' is still valid
}
The only way to do this would be to return
Making some types copyable
It would be tempting to opt out of the reference counting for some types (for example
Can we convince the compiler that we know what we are doing?
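For comparison, Swift 5.9 and later spell move-only types as `~Copyable`, with `consuming` as the parameter modifier. Below is a minimal sketch of the idea above in that syntax. `ObjectHeader` and `PyIntRef` are hypothetical names used only for illustration, and the header layout is an assumption, not Violet's actual object layout.

```swift
// Hypothetical heap header; the real object layout is not shown in this thread.
struct ObjectHeader {
  var referenceCount: Int
  var value: Int
}

// A move-only reference: copies are impossible, so every new reference
// has to go through 'copyRef()', which bumps the count.
struct PyIntRef: ~Copyable {
  let ptr: UnsafeMutablePointer<ObjectHeader>

  /// Allocate a new object with reference count 1.
  init(value: Int) {
    self.ptr = UnsafeMutablePointer<ObjectHeader>.allocate(capacity: 1)
    self.ptr.initialize(to: ObjectHeader(referenceCount: 1, value: value))
  }

  private init(ptr: UnsafeMutablePointer<ObjectHeader>) {
    self.ptr = ptr
  }

  var referenceCount: Int { ptr.pointee.referenceCount }

  /// Create a new reference to the same object (ref count += 1).
  func copyRef() -> PyIntRef {
    ptr.pointee.referenceCount += 1
    return PyIntRef(ptr: ptr)
  }

  // Runs when this reference reaches the end of its lifetime (ref count -= 1).
  deinit {
    ptr.pointee.referenceCount -= 1
    if ptr.pointee.referenceCount == 0 {
      ptr.deinitialize(count: 1)
      ptr.deallocate()
    }
  }
}

func bar(_ int: consuming PyIntRef) {
  // 'int' reaches the end of its lifetime here: ref count -= 1
}

func foo() -> Int {
  let int = PyIntRef(value: 42)
  let intCopy = int.copyRef() // ref count is now 2
  bar(intCopy)                // consumes 'intCopy', ref count back to 1
  // Using 'intCopy' past this point would be a compile-time error.
  return int.referenceCount
}
```

This only covers plain reference counting; reference cycles would still need a separate collector.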
CPython
In CPython they use manual reference counting:
Py_INCREF is used to retain an object (reference count += 1)
Py_DECREF is used to release an object (reference count -= 1)
If the reference count reaches 0, the object is deallocated and the linked list entries are updated (having a reference to both the previous and the next object makes this trivial). The linked list itself is used by the garbage collection algorithm to break cycles (at some point it needs to iterate over all of the currently alive objects).
This is what the object header looks like:
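As a rough model of that idea (a Swift sketch, not CPython's actual C definition), each object can carry its reference count plus links to the previous and next live object, which makes unlinking on deallocation a constant-time operation. `ObjectHeader`, `Heap`, and all of their members are hypothetical names.

```swift
// Hypothetical header: reference count plus intrusive doubly linked list links.
struct ObjectHeader {
  var refCount: Int
  var prev: UnsafeMutablePointer<ObjectHeader>?
  var next: UnsafeMutablePointer<ObjectHeader>?
}

final class Heap {
  /// Head of the list of all currently alive objects.
  private(set) var first: UnsafeMutablePointer<ObjectHeader>?
  private(set) var liveCount = 0

  /// Allocate an object with reference count 1 and push it onto the list.
  func allocate() -> UnsafeMutablePointer<ObjectHeader> {
    let object = UnsafeMutablePointer<ObjectHeader>.allocate(capacity: 1)
    object.initialize(to: ObjectHeader(refCount: 1, prev: nil, next: first))
    first?.pointee.prev = object
    first = object
    liveCount += 1
    return object
  }

  func retain(_ object: UnsafeMutablePointer<ObjectHeader>) {
    object.pointee.refCount += 1
  }

  func release(_ object: UnsafeMutablePointer<ObjectHeader>) {
    object.pointee.refCount -= 1
    guard object.pointee.refCount == 0 else { return }

    // Unlink: both neighbours are known, so no list traversal is needed.
    let prev = object.pointee.prev
    let next = object.pointee.next
    if let prev = prev {
      prev.pointee.next = next
    } else {
      first = next
    }
    next?.pointee.prev = prev

    object.deinitialize(count: 1)
    object.deallocate()
    liveCount -= 1
  }
}
```

A cycle collector could then walk the list starting at `first` to visit every live object.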
Violet
In Swift we have automatic reference counting for class instances (and a few others). The big catch here is that the user (programmer) is expected to take care of the strong reference cycles (to simplify things a bit: imagine 2 objects that hold a reference to each other, both have reference count 1, so neither of them will be deallocated).
Our options are:
manual retain/release - in this approach the programmer is responsible for inserting the retain/release calls. For example: when adding an object to a list we retain it, when removing it (or deleting the whole list) we release it.
The main drawback is of course the manual labor of adding the retain/release calls. It is also extremely difficult to get right, and even a single mistake may have consequences:
an extra retain - the object lives forever (along with all of the objects it references)
an extra release - “use after free” error -> crash
This approach is really good if you can get it right. CPython uses it because they can afford it: they have the time and manpower to find and fix any possible errors. I don't think we could do the same.
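To make the list example concrete, here is a small sketch that uses the standard library's `Unmanaged` to opt out of ARC for the stored elements and balance the retain/release calls by hand. `PyObjectBox`, `ManualList`, and `DeinitCounter` are hypothetical names; the counter exists only so the effect can be observed.

```swift
// Illustration-only helper that counts how many objects were deallocated.
final class DeinitCounter {
  var deinitialized = 0
}

final class PyObjectBox {
  let name: String
  private let counter: DeinitCounter

  init(name: String, counter: DeinitCounter) {
    self.name = name
    self.counter = counter
  }

  deinit { counter.deinitialized += 1 }
}

/// A list that stores unmanaged references, so ARC does not track them.
/// Every 'append' must be balanced by exactly one release.
final class ManualList {
  private var elements: [Unmanaged<PyObjectBox>] = []

  var count: Int { elements.count }

  func append(_ object: PyObjectBox) {
    // Adding to the list: retain (+1).
    elements.append(Unmanaged.passRetained(object))
  }

  func removeLast() {
    // Removing from the list: release (-1).
    elements.removeLast().release()
  }

  func removeAll() {
    // Deleting the whole list: release every element.
    for element in elements {
      element.release()
    }
    elements.removeAll()
  }
}
```

Forgetting the `release()` in `removeLast()` would leak the object; calling it twice would be a use-after-free. That is exactly the class of mistake described above.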
manually implemented smart pointer - I don't think it is possible in Swift to write our own version of a smart pointer. Even in languages which give you more control (like C++) it is an extremely hard thing to do.
full garbage collection without ARC - there is a ton of material about this on the internet.
(THIS ONE IS NOT POSSIBLE ACCORDING TO NEW OBJECT MODEL) using Swift native ARC - I'm not really sure if you can just allocate a bunch of memory with native Swift ARC, but as a last resort we can use ManagedBufferPointer without elements.
This is nice because (in theory) Swift takes care of everything. In practice, however, there are some problems. First of all, you still need some form of garbage collection to break the cycles. This will be simpler than full-on garbage collection, and the requirements will not be as demanding since it will run less often, but it is still something to keep in mind.
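A small demonstration of why ARC alone is not enough: two objects that reference each other are never deallocated until something breaks the cycle. `Node`, `DeinitCounter`, and `demonstrateCycle` are hypothetical names used only for this sketch.

```swift
// Illustration-only helper that counts how many objects were deallocated.
final class DeinitCounter {
  var deinitialized = 0
}

final class Node {
  var other: Node?
  private let counter: DeinitCounter

  init(counter: DeinitCounter) {
    self.counter = counter
  }

  deinit { counter.deinitialized += 1 }
}

/// Builds a 2-object cycle, drops all outside strong references,
/// and reports how many objects were freed before and after
/// the cycle is broken by hand.
func demonstrateCycle() -> (before: Int, after: Int) {
  let counter = DeinitCounter()
  weak var survivor: Node?

  do {
    let a = Node(counter: counter)
    let b = Node(counter: counter)
    a.other = b
    b.other = a
    survivor = a
  }

  // Both objects are unreachable from the program, but each one still
  // holds a strong reference to the other, so ARC frees neither.
  let before = counter.deinitialized

  // Breaking one edge is what a cycle collector would do for us.
  survivor?.other = nil
  let after = counter.deinitialized

  return (before, after)
}
```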
The other thing is that most garbage collection algorithms work by somehow marking all of the reachable objects and then, in a 2nd phase, removing all of the unmarked ones. Unfortunately, to do the 2nd phase you need a reference to every allocated object (most commonly via some sort of linked list). The question is: what kind of reference would it be?
unowned - this will allow objects to be deallocated, but we would somehow need to know which references are still alive and which are not
weak - this is the obvious choice
Unfortunately there is a possible performance problem, since having just a single weak reference in Swift moves the whole reference count to a side table. This side table is out-of-line (it is a separate allocation outside of the object). This is a potential memory/cache waste, not to mention that a retain now has to fetch the object and then fetch the side table (it is called slowRC for a reason).
Code to test the presence of the side table
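Setting the side-table cost aside, the weak-reference variant of the mark/remove scheme described above could look roughly like the sketch below. It assumes that the roots are known explicitly and that every object is registered with the collector on creation. `PyObj`, `WeakRef`, `CycleCollector`, and `DeinitCounter` are hypothetical names. The removal phase does not free anything itself; it only clears the outgoing references of unmarked objects and lets ARC do the rest.

```swift
// Illustration-only helper that counts how many objects were deallocated.
final class DeinitCounter {
  var deinitialized = 0
}

final class PyObj {
  var references: [PyObj] = [] // strong, ARC-managed edges
  var isMarked = false
  private let counter: DeinitCounter

  init(counter: DeinitCounter) {
    self.counter = counter
  }

  deinit { counter.deinitialized += 1 }
}

struct WeakRef {
  weak var object: PyObj?
}

final class CycleCollector {
  /// Weak registry of every allocated object; does not keep anything alive.
  private var allObjects: [WeakRef] = []

  var trackedCount: Int { allObjects.count }

  func track(_ object: PyObj) {
    allObjects.append(WeakRef(object: object))
  }

  func collect(roots: [PyObj]) {
    // Phase 1: mark everything reachable from the roots.
    var stack = roots
    while let object = stack.popLast() {
      if object.isMarked { continue }
      object.isMarked = true
      stack.append(contentsOf: object.references)
    }

    // Phase 2: clear the edges of every unmarked object, which breaks cycles.
    do {
      let alive = allObjects.compactMap { $0.object }
      for object in alive where !object.isMarked {
        object.references.removeAll()
      }
      for object in alive {
        object.isMarked = false
      }
    } // 'alive' is released here, so ARC can now free the unmarked objects.

    // Drop registry entries whose objects are gone.
    allObjects.removeAll { $0.object == nil }
  }
}
```

Whether the per-object side table makes this too slow in practice would have to be measured.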
Btw. we had a similar problem when writing our implementation of BigInt: once the heap-allocated storage is no longer needed, how do we release it? And how do we even know that it is no longer needed? We went with ManagedBufferPointer to get Swift ARC, but technically you can implement a BigInt with a garbage collector (although I am not sure that this would be a good idea).
I think that the main difference between BigInt and the Python object representation is that Python objects can contain references to other objects, possibly creating a cycle. Anyway, under the hood we were dealing with unowned pointers, so we can either find the owners (this would be on a case-by-case basis, so a lot of room for mistakes) or automate things (either via ARC or a garbage collector).
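For reference, this is roughly what "heap storage released by ARC" looks like. The sketch uses the closely related `ManagedBuffer` class rather than `ManagedBufferPointer`, and `WordStorage` is a hypothetical name, not the type used in the actual BigInt implementation. The header stores the number of initialized words.

```swift
/// Heap-allocated storage for a sequence of machine words.
/// ARC decides when it is no longer needed and runs 'deinit'.
final class WordStorage: ManagedBuffer<Int, UInt> {

  static func make(words: [UInt]) -> WordStorage {
    // 'create' allocates an instance of the subclass it is called on.
    let buffer = WordStorage.create(minimumCapacity: words.count) { _ in
      words.count // header: number of initialized elements
    }
    let storage = buffer as! WordStorage
    storage.withUnsafeMutablePointerToElements { elements in
      elements.initialize(from: words, count: words.count)
    }
    return storage
  }

  /// Copy of the stored words.
  var words: [UInt] {
    return withUnsafeMutablePointers { header, elements in
      Array(UnsafeBufferPointer(start: elements, count: header.pointee))
    }
  }

  deinit {
    // The elements have to be deinitialized by hand; the memory itself
    // is released by the runtime after 'deinit' returns.
    withUnsafeMutablePointers { header, elements in
      _ = elements.deinitialize(count: header.pointee)
    }
  }
}
```

Because the words cannot reference other `WordStorage` instances, no cycle can form here, which is the difference from Python objects mentioned above.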