-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The ACS and BA algorithms #11
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks, good, apart from a few details. (Most of my comments are just nit-picks.)
bool bval = 1; | ||
bool aux = 2; | ||
bool bval = 2; | ||
bool aux = 3; |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rustfmt doesn't work with anything but Rust sources:
$ rustfmt proto/message.proto
error: expected one of `!` or `::`, found `=`
--> /home/vk/src/poanetwork/hbbft/proto/message.proto:1:8
|
1 | syntax = "proto3";
| ^ expected one of `!` or `::` here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, sorry! You're right, of course, and indentation 2 is fine in a proto file!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed indentation to 4 spaces and committed.
src/agreement.rs
Outdated
|
||
#[derive(Default)] | ||
pub struct Agreement { | ||
pub struct Agreement<NodeUid> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a TODO about amiller/HoneyBadgerBFT#59, so we don't forget to add the CONF
message round later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
src/agreement.rs
Outdated
/// Values received in AUX messages. Reset on every epoch update. | ||
received_aux: HashMap<NodeUid, BTreeSet<bool>>, | ||
/// All the output values in all epochs. | ||
outputs: BTreeMap<u32, bool>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be called estimates
instead? An output should only be produced once, I think.
Since the epochs start from 0
and are then incremented by 1
, we might as well use a Vec
here.
Or could we even do away with keeping the previous rounds' estimates around, and just make this a bool
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreement can output multiple times. That's why the termination criterion is more complex than whether the instance has output a value. I have to correct the code to reflect that. At the moment only one value is stored per agreement instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, that's a bit ambiguous in the paper. I think the reason for the complex termination criterion is just that a node needs to keep sending messages to help the other nodes output, too. The formulation "then output b" should really be read as "then output b, unless you already have done so". For the user of the module it will certainly be more convenient if it only outputs once?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree on the uniqueness of output. The paper Signature-Free Asynchronous Byzantine Consensus with t < n/3 and O(n^2) Messages from which this version of BA originates has better presentation and does say "decide(v) if not yet done" in Figure 2.
src/agreement.rs
Outdated
} else { | ||
count | ||
} | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe slightly simpler:
self.received_bval.values().filter(|values| values.contains(&b)).count()
src/agreement.rs
Outdated
if count_bval == 2 * self.num_faulty_nodes + 1 { | ||
self.bin_values.insert(b); | ||
|
||
// wait until bin_values_r /= 0, then multicast AUX_r(w) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like Haskell. 😁 Let's use !=
or ≠
.
src/common_subset.rs
Outdated
} else { | ||
// Send the message to the agreement instance. | ||
agreement_instance | ||
.on_input(uid.clone(), &amessage) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That shouldn't be the same uid
as above, should it? If I understand correctly:
self.agreement_instances.get_mut(&uid)
is the instance that decides whether the element proposed by nodeuid
should be part of the common subset in the end.- The parameter of
Agreement::on_input
should be theuid
of the node that sent this particular message.
src/common_subset.rs
Outdated
|
||
if let Ok((output, mut outgoing)) = result { | ||
if let Some(b) = output { | ||
outgoing.append(&mut self.on_agreement_result(uid, b)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that &mut
a typo?
src/common_subset.rs
Outdated
let results1: usize = | ||
self.agreement_results | ||
.iter() | ||
.fold(0, |count, (_, v)| if *v { count + 1 } else { count }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That could also be simplified with filter(|v| **v).count()
, I think.
src/common_subset.rs
Outdated
if instance_uids == completed_uids { | ||
// All instances of Agreement that delivered `true`. | ||
let delivered_1: HashSet<NodeUid> = self.agreement_results | ||
.all(|(_, instance)| instance.terminated()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If only the values are needed, values()
should be preferred over iter()
:
self.agreement_instances.values().all(Agreement::terminated)
src/proto/mod.rs
Outdated
/// BVAL message with an epoch. | ||
BVal((u32, bool)), | ||
/// AUX message with an epoch. | ||
Aux((u32, bool)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should move the AgreementMessage
type into the agreement
module. I think the individual algorithms' modules could be pretty self-contained.
@afck, I've changed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's still a few details I'm not sure about. We should add extensive tests for this soon. (In a later PR, if you prefer.)
proto/message.proto
Outdated
oneof payload { | ||
BroadcastProto broadcast = 1; | ||
AgreementProto agreement = 2; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry about my remark! You were right, of course, and we should leave the indentation at 2 spaces instead of 4 here, as is the standard for proto files.
src/agreement.rs
Outdated
self.received_aux | ||
.values() | ||
.filter(|values| values.is_subset(&self.bin_values)) | ||
.map(|values| vals.union(values)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTreeSet::union
doesn't modify the receiver, so vals
will remain empty. This should probably say:
vals.extend(values)
src/agreement.rs
Outdated
/// Sent BVAL values. Reset on every epoch update. | ||
sent_bval: BTreeSet<bool>, | ||
/// Values received in AUX messages. Reset on every epoch update. | ||
received_aux: HashMap<NodeUid, BTreeSet<bool>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could make this a HashMap<NodeUid, bool>
, since multiple Aux
messages from the same node can be safely ignored. (I think… can they?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think an agreement instance can output different AUX values in different epochs. Note that the values in AUX messages depend on the current state of bin_values which is subject to change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I meant multiple Aux
messages in a single epoch can be safely ignored. This container should be cleared in each epoch change, shouldn't it?
src/agreement.rs
Outdated
let (count_aux, vals) = self.count_aux(); | ||
if count_aux < self.num_nodes - self.num_faulty_nodes { | ||
// Continue waiting for the (N - f) AUX messages. | ||
(None, None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could make this an early return (None, None)
, then the rest of the method wouldn't have to be indented.
src/agreement.rs
Outdated
} else { | ||
// FIXME: Implement the Common Coin algorithm. At the moment the | ||
// coin value is random and local to each instance of Agreement. | ||
let coin2 = random::<bool>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using random
means the coin won't be "common" at all: each node will have a different value. I think that breaks the guarantee that it terminates, even without any malicious nodes. The instances take the coin value as a hint when to terminate, so an unlucky node may outlive all its remote counterparts, and never be able to output anything.
src/agreement.rs
Outdated
let output: Vec<Option<bool>> = vals.into_iter() | ||
.take(1) | ||
.map(|b| { | ||
message = Some(self.set_input(b)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does self.epoch
need to be incremented before this call? I imagine the BVal
messages this sends should have the next epoch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. It makes sense to increment the epoch earlier.
src/common_subset.rs
Outdated
/// Message from a remote node `message_sender_id` concerning the common | ||
/// subset element proposed by the node `element_proposer_id`. | ||
Agreement { | ||
message_sender_id: NodeUid, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The message sender's ID doesn't need to be part of the message itself, I think, since it's passed into on_input
anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are confusing CommonSubset::on_input which does not receive the ID as argument with Agreement::on_input which does. The latter gets the sender ID from the message received by CommonSubset::on_input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't CommonSubset::on_input
just pass on the sender ID to Agreement::on_input
? It looks dangerous to me to have it in the message, since it would basically allow the sender to lie about their identity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, no, the sender ID is in common_subset::Input
which is not a message but rather a type of input to the Common Subset algorithm. As we discussed before, AgreementMessage
does not have any IDs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I see! But then what will actually be transmitted over the network, i.e. what's the "common subset" message type? Because that will need to contain element_proposer_id
, but not message_sender_id
, I think. So wouldn't it be more natural to replace Input
with Message
, and change the signature of CommonSubset::on_input
so that it takes the sender ID separately from the message?
src/common_subset.rs
Outdated
) -> Result<Option<AgreementMessage>, Error> { | ||
if let Some(agreement_instance) = self.agreement_instances.get_mut(uid) { | ||
fn on_broadcast_result(&mut self, uid: &NodeUid) -> Result<Option<AgreementMessage>, Error> { | ||
if let Some(agreement_instance) = self.agreement_instances.get_mut(&uid) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder whether it's dangerous to change an agreement instance's estimate value later than round 0. Could this prevent termination? (Not sure about that.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I think the Agreement
interface needs to change instead, and just ignore inputs after the first round.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented this in a stricter way. There is a check has_input
which returns true iff the first estimated value has been written. set_input
will return an error if the input has already been provided.
src/common_subset.rs
Outdated
let mut result = Err(Error::NoSuchAgreementInstance); | ||
|
||
// Send the message to the local instance of Agreement | ||
if let Some(mut agreement_instance) = self.agreement_instances.get_mut(&element_proposer_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of the mutable result
, you could just return here on error:
let agreement_instance = self.agreement_instances
.get_mut(&element_proposer_id)
.ok_or(Error::NoSuchAgreementInstance)?;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually the mutable result helps to avoid a multiple mutable borrow conflict.
src/common_subset.rs
Outdated
let instances = &mut self.agreement_instances; | ||
for (_uid0, instance) in instances.iter_mut() { | ||
if results1 >= self.num_nodes - self.num_faulty_nodes { | ||
for instance in self.agreement_instances.values_mut() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above: I'm worried that this could break the agreement if the instance has already progressed beyond the first round. (But it's not quite clear in the paper and I haven't thought much about it yet.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the paper, BA instances can set the estimated value only for the next epoch. So, if an instance hasn't been provided with input true
from the ACS instance, the estimated value of the epoch 0 will remain empty until the ACS instance fills it in using Agreement::set_input(false)
. Then that BA can terminate as soon as the coin value is false
due to the termination criterion.
src/agreement.rs
Outdated
if self.bin_values.contains(b) { | ||
vals.insert(b.clone()); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But now vals.len()
will always be <= 2
(number of values instead of number of entries). Maybe it should be:
let aux_count = self.received_aux.values()
.filter(|b| self.bin_values.contains(b))
.count();
(aux_count, vals)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you define vals
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same way you already do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If so then aux_count == vals.len()
. So, aux_count
is redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't, in general: vals
is a set of bool
s, so it has 0
, 1
or 2
elements. The above aux_count
would be the number of entries in the received_aux
map that have one of those values, which can be anything between 0
and num_nodes
. (The values()
iterator doesn't deduplicate, so it will yield &true
and &false
as often as they appear in the map.)
src/agreement.rs
Outdated
let vals: BTreeSet<bool> = self.received_aux | ||
.values() | ||
.filter(|b| { | ||
count += 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now it will count the total number of entries in received_aux
, instead of just the ones whose values are in bin_values
.
src/agreement.rs
Outdated
// Check the termination condition: "continue looping until both a | ||
// value b is output in some round r, and the value Coin_r' = b for | ||
// some round r' > r." | ||
self.terminated = self.terminated || self.estimated.values().any(|b| *b == coin2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not sufficient that any previous estimate agrees with the coin. The output has to agree with it.
src/agreement.rs
Outdated
/// Estimates of the decision value in all epochs. The first estimated value | ||
/// is provided as input by Common Subset using the `set_input` function | ||
/// which triggers the algorithm to start. | ||
estimated: BTreeMap<u32, bool>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we actually need to remember the previous estimates: Once we move on to epoch i + 1
, the value from epoch i
can safely be forgotten.
I think we could even just make estimate
a simple bool
and initialize it with false
(without sending messages). Agreement::on_input
would then do nothing if epoch > 0
or input has already been provided before (i.e. we'd need another has_input: bool
).
We'd also need a has_output: bool
. After that's true
, the estimate can't change again (we don't even need to cater for that, as it's guaranteed by the algorithm itself, I think), and we can terminate as soon as in any future round has_output && coin == estimate
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the termination criterion, I mostly agree but I think the paper asks for self.output == Some(coin)
. In that case do we need estimated
at all? It is never read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My mistake, estimated
is used for broadcast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I think you're right: It is used for broadcasting BVal
, but only at the point where we transition to a new epoch (or receive user input in epoch 0). After that we can forget it, so we don't actually have to store it in the struct at all. (But then we'd have to make output
an Option<bool>
, as you say. Either way is fine with me.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please have a look at my last commit. I added a BVal message at the start of every epoch r, r > 0. I think the paper clearly requires those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great now! 👍
Thanks for your patience and sorry for the long list of nit-picks! I've just got one more, but we can also resolve this in another PR.
src/agreement.rs
Outdated
self.try_coin(); | ||
Ok((self.output, VecDeque::new())) | ||
outgoing.extend(self.try_coin()); | ||
Ok((self.output, outgoing)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to make sure we output only once.
Yes, I fixed the once-per-instance property of output too. |
I've defined the ACS algorithm and an interface to it consisting of 3 functions: new, send_proposed_value and on_input. The latter is a message handler for RB and BA messages. These two types of message are therefore joined in the types of message inputs and outputs.
Note about BA: A common coin implementation is not provided in this pull request. The coin value is constant 1 at
hbbft/src/agreement.rs
Line 174 in 15353e8