Skip to content

Commit

Permalink
New ingress <-> PP communication (#2111)
Browse files Browse the repository at this point in the history
* Another squash

WIP wire up new PartitionRouting

A bit of docs

Retries and timeouts in the RpcRequestDispatcher

Fix names in RequestDispatcher abstraction and remove method we don't need

Fix bad change that I've done before to how we handle actions

Fix comment

Use PartitionProcessorRequestId as "physical deduplication id" for RPC requests. We do this because we don't want to rely on the different semantics between idempotency id/workflow id when dealing with duplicates.

Rebase changes

Many things:

* Remove node awareness from ServiceInvocationResponseSink::Ingress and SubmitNotificationSink::Ingress
* Inline ActionEffect inside leadership.rs
* Now for append_invocation and append_invocation_response we await bifrost to ack the append
* Various renamings

Propagate completion expiry time. There are some cases in the PP state machine where we could have the completion expiry time, but for now I set None to avoid too many conflicts with #2035

Types cleanup

Cleanup ingress-dispatcher. This is now used only by kafka ingress

The ingress http stops won't use anymore ingress-dispatcher. The interface InvocationStorageReader became RequestDispatcher and just encapsulates all the RPC features.

Moved inside the PartitionProcessor the setting of the "_sink" fields of the ServiceInvocation.

In theory the ServiceInvocation data structure shouldn't be used here anymore for this RPC contract, but for now we keep it for simplicity.

Implement all the Partition Processor RPC commands on the PP side.

Move the business logic of self proposing from action effect handler into `SelfProposer`, part of the Leader state.
Implement SubmitInvocationResponse rpc

Define the contract of interaction between ingress <-> PP (see partition_processor_rpc_client and types/net/partition_processor).
Reorganize a bit types here and there.

Decouple ingress from partition store

This commit replaces the direct dependency of the ingress on the partition
store with rpcs. The ingress now looks up the corresponding partition processor
and asks it for the output result of the queried invocation.

This fixes #1811.

* rebase

* Fix PartitionProcessor business logic:

* On the same invocation, we now just run the same business logic we would with idempotent requests.
* I still kept some form of "unexpected duplicate invocation id" check, this might turn out to be useful to debug issues related to rpc and/or routing
* Added the business logic to correctly handle the /send notification case with duplicate RPCs

* Miscellaneous changes related to the previous one. I'll deal with back-compat problems in one of the next commits.

* PartitionProcessorRpcClient now uses the ConnectionAwareRpcRouter, and has a little method on the error to discriminate what is safe to retry or not from the rpc client perspective

* Various renaming of methods of RpcRequestDispatcher to make more obvious what they're needed for + generics massaging.

* Add a new data structure for defining invocation requests.

Now the InvocationRequest is just the invocation parameters, while the ServiceInvocation is the "wal" invocation request, which contains additional metadata that are added while the invocation flows through the system.

* Adapt the business logic of the RpcRequestDispatcher to the various changes

* Sort out situations when it's safe to retry or not.

* PR Feedback

* Still need to wait for PR #2172 to enable partition routing

* Feedback

* Remove assertion for bad duplicate situations

* Fix test

* Feedback

---------

Co-authored-by: Till Rohrmann <[email protected]>
  • Loading branch information
slinkydeveloper and tillrohrmann authored Nov 4, 2024
1 parent d71b35b commit 3a293a4
Show file tree
Hide file tree
Showing 52 changed files with 2,799 additions and 2,407 deletions.
77 changes: 74 additions & 3 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions crates/core/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ restate-types = { workspace = true }
anyhow = { workspace = true }
axum = { workspace = true, default-features = false }
arc-swap = { workspace = true }
assert2 = { workspace = true }
async-trait = { workspace = true }
bytes = { workspace = true }
bytestring = { workspace = true }
Expand Down
1 change: 1 addition & 0 deletions crates/core/src/network/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ mod multiplex;
pub mod net_util;
mod network_sender;
mod networking;
pub mod partition_processor_rpc_client;
pub mod protobuf;
pub mod rpc_router;
mod server_builder;
Expand Down
Loading

0 comments on commit 3a293a4

Please sign in to comment.