Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
New ingress <-> PP communication (#2111)
* Another squash WIP wire up new PartitionRouting A bit of docs Retries and timeouts in the RpcRequestDispatcher Fix names in RequestDispatcher abstraction and remove method we don't need Fix bad change that I've done before to how we handle actions Fix comment Use PartitionProcessorRequestId as "physical deduplication id" for RPC requests. We do this because we don't want to rely on the different semantics between idempotency id/workflow id when dealing with duplicates. Rebase changes Many things: * Remove node awareness from ServiceInvocationResponseSink::Ingress and SubmitNotificationSink::Ingress * Inline ActionEffect inside leadership.rs * Now for append_invocation and append_invocation_response we await bifrost to ack the append * Various renamings Propagate completion expiry time. There are some cases in the PP state machine where we could have the completion expiry time, but for now I set None to avoid too many conflicts with #2035 Types cleanup Cleanup ingress-dispatcher. This is now used only by kafka ingress The ingress http stops won't use anymore ingress-dispatcher. The interface InvocationStorageReader became RequestDispatcher and just encapsulates all the RPC features. Moved inside the PartitionProcessor the setting of the "_sink" fields of the ServiceInvocation. In theory the ServiceInvocation data structure shouldn't be used here anymore for this RPC contract, but for now we keep it for simplicity. Implement all the Partition Processor RPC commands on the PP side. Move the business logic of self proposing from action effect handler into `SelfProposer`, part of the Leader state. Implement SubmitInvocationResponse rpc Define the contract of interaction between ingress <-> PP (see partition_processor_rpc_client and types/net/partition_processor). Reorganize a bit types here and there. Decouple ingress from partition store This commit replaces the direct dependency of the ingress on the partition store with rpcs. The ingress now looks up the corresponding partition processor and asks it for the output result of the queried invocation. This fixes #1811. * rebase * Fix PartitionProcessor business logic: * On the same invocation, we now just run the same business logic we would with idempotent requests. * I still kept some form of "unexpected duplicate invocation id" check, this might turn out to be useful to debug issues related to rpc and/or routing * Added the business logic to correctly handle the /send notification case with duplicate RPCs * Miscellaneous changes related to the previous one. I'll deal with back-compat problems in one of the next commits. * PartitionProcessorRpcClient now uses the ConnectionAwareRpcRouter, and has a little method on the error to discriminate what is safe to retry or not from the rpc client perspective * Various renaming of methods of RpcRequestDispatcher to make more obvious what they're needed for + generics massaging. * Add a new data structure for defining invocation requests. Now the InvocationRequest is just the invocation parameters, while the ServiceInvocation is the "wal" invocation request, which contains additional metadata that are added while the invocation flows through the system. * Adapt the business logic of the RpcRequestDispatcher to the various changes * Sort out situations when it's safe to retry or not. * PR Feedback * Still need to wait for PR #2172 to enable partition routing * Feedback * Remove assertion for bad duplicate situations * Fix test * Feedback --------- Co-authored-by: Till Rohrmann <[email protected]>
- Loading branch information