The ACS and BA algorithms #11

vkomenda · 2018-05-07T10:07:08Z

I've defined the ACS algorithm and an interface to it consisting of three functions: new, send_proposed_value and on_input. The latter is a message handler for RB and BA messages, so these two kinds of message are combined in the types of the message inputs and outputs.

Note about BA: A common coin implementation is not provided in this pull request. The coin value is the constant 1 at hbbft/src/agreement.rs, line 174 in 15353e8:

let coin: u64 = 1;

afck

Looks good, apart from a few details. (Most of my comments are just nit-picks.)

vkomenda · 2018-05-07T16:49:01Z

proto/message.proto

-    bool bval = 1;
-    bool aux = 2;
+    bool bval = 2;
+    bool aux = 3;


rustfmt doesn't work with anything but Rust sources:

$ rustfmt proto/message.proto
error: expected one of `!` or `::`, found `=`
 --> /home/vk/src/poanetwork/hbbft/proto/message.proto:1:8
  |
1 | syntax = "proto3";
  |        ^ expected one of `!` or `::` here

Ah, sorry! You're right, of course, and indentation 2 is fine in a proto file!

Changed indentation to 4 spaces and committed.

afck · 2018-05-07T10:35:19Z

src/agreement.rs


-#[derive(Default)]
-pub struct Agreement {
+pub struct Agreement<NodeUid> {


Let's add a TODO about amiller/HoneyBadgerBFT#59, so we don't forget to add the CONF message round later.

afck · 2018-05-07T10:38:46Z

src/agreement.rs

+    /// Values received in AUX messages. Reset on every epoch update.
+    received_aux: HashMap<NodeUid, BTreeSet<bool>>,
+    /// All the output values in all epochs.
+    outputs: BTreeMap<u32, bool>,


Should this be called estimates instead? An output should only be produced once, I think.

Since the epochs start from 0 and are then incremented by 1, we might as well use a Vec here.

Or could we even do away with keeping the previous rounds' estimates around, and just make this a bool?

Agreement can output multiple times. That's why the termination criterion is more complex than whether the instance has output a value. I have to correct the code to reflect that. At the moment only one value is stored per agreement instance.

Right, that's a bit ambiguous in the paper. I think the reason for the complex termination criterion is just that a node needs to keep sending messages to help the other nodes output, too. The formulation "then output b" should really be read as "then output b, unless you already have done so". For the user of the module it will certainly be more convenient if it only outputs once?

I agree on the uniqueness of output. The paper "Signature-Free Asynchronous Byzantine Consensus with t < n/3 and O(n^2) Messages", from which this version of BA originates, has a better presentation and does say "decide(v) if not yet done" in Figure 2.
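A minimal sketch of the "decide(v) if not yet done" rule from this thread. The type `OnceOutput` and its methods are hypothetical names for illustration, not the code in this PR; the only assumption is that the decision is kept in an `Option<bool>`.

```rust
/// Hypothetical holder for a binary agreement decision (illustration only).
#[derive(Debug, Default)]
pub struct OnceOutput {
    /// The decided value, once there is one.
    decided: Option<bool>,
}

impl OnceOutput {
    /// Records `b` as the decision and returns it the first time only.
    /// Later calls return `None`, so the caller never sees a second output.
    pub fn decide(&mut self, b: bool) -> Option<bool> {
        if self.decided.is_some() {
            return None;
        }
        self.decided = Some(b);
        self.decided
    }

    /// The stored decision, if any.
    pub fn decided(&self) -> Option<bool> {
        self.decided
    }
}
```

With this shape, the instance can keep processing messages to help other nodes while the user of the module observes a single output.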

afck · 2018-05-07T10:48:30Z

src/agreement.rs

+                    } else {
+                        count
+                    }
+                });


Maybe slightly simpler:

self.received_bval.values().filter(|values| values.contains(&b)).count()
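For illustration, the same expression wrapped in a free function. This is only a sketch: `u64` stands in for the `NodeUid` type parameter, and the function name `count_bval` is an assumption.

```rust
use std::collections::{BTreeSet, HashMap};

/// Counts the nodes from which the value `b` was received in a BVAL message.
/// `u64` is a stand-in for `NodeUid` (an assumption for this sketch).
pub fn count_bval(received_bval: &HashMap<u64, BTreeSet<bool>>, b: bool) -> usize {
    received_bval
        .values()
        // Keep only the senders whose set of BVAL values contains `b`.
        .filter(|values| values.contains(&b))
        .count()
}
```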

afck · 2018-05-07T10:50:31Z

src/agreement.rs

+                if count_bval == 2 * self.num_faulty_nodes + 1 {
+                    self.bin_values.insert(b);
+
+                    // wait until bin_values_r /= 0, then multicast AUX_r(w)


Looks like Haskell. 😁 Let's use != or ≠.

afck · 2018-05-07T13:17:56Z

src/common_subset.rs

+                    } else {
+                        // Send the message to the agreement instance.
+                        agreement_instance
+                            .on_input(uid.clone(), &amessage)


That shouldn't be the same uid as above, should it? If I understand correctly:

self.agreement_instances.get_mut(&uid) is the instance that decides whether the element proposed by node uid should be part of the common subset in the end.

The parameter of Agreement::on_input should be the uid of the node that sent this particular message.

afck · 2018-05-07T13:20:49Z

src/common_subset.rs

+
+                if let Ok((output, mut outgoing)) = result {
+                    if let Some(b) = output {
+                        outgoing.append(&mut self.on_agreement_result(uid, b));


Is that &mut a typo?

afck · 2018-05-07T13:26:35Z

src/common_subset.rs

+            let results1: usize =
+                self.agreement_results
+                    .iter()
+                    .fold(0, |count, (_, v)| if *v { count + 1 } else { count });


That could also be simplified with filter(|v| **v).count(), I think.
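A sketch of what I mean, assuming `agreement_results` maps node IDs to `bool` and using `u64` in place of `NodeUid`. Note that the filter has to run on `values()`, not on `iter()`, for the `**v` form to type-check.

```rust
use std::collections::HashMap;

/// Counts the agreement instances that delivered `true`.
/// `u64` is a stand-in for `NodeUid` (an assumption for this sketch).
pub fn count_true(agreement_results: &HashMap<u64, bool>) -> usize {
    // `values()` yields `&bool`, and `filter` passes a reference to that,
    // hence the double dereference.
    agreement_results.values().filter(|v| **v).count()
}
```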

afck · 2018-05-07T13:28:57Z

src/common_subset.rs

-        if instance_uids == completed_uids {
-            // All instances of Agreement that delivered `true`.
-            let delivered_1: HashSet<NodeUid> = self.agreement_results
+            .all(|(_, instance)| instance.terminated())


If only the values are needed, values() should be preferred over iter():

self.agreement_instances.values().all(Agreement::terminated)
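A self-contained sketch of that form. The `Agreement` struct here is a minimal stand-in with a single flag, not the struct from this PR, and `all_terminated` is a hypothetical helper name.

```rust
use std::collections::HashMap;

/// Minimal stand-in for the `Agreement` struct (illustration only).
pub struct Agreement {
    terminated: bool,
}

impl Agreement {
    pub fn new(terminated: bool) -> Self {
        Agreement { terminated }
    }

    pub fn terminated(&self) -> bool {
        self.terminated
    }
}

/// Returns `true` if every agreement instance has terminated.
pub fn all_terminated(instances: &HashMap<u64, Agreement>) -> bool {
    // The method path can be passed directly because `values()` yields
    // `&Agreement`, which matches the `&self` receiver.
    instances.values().all(Agreement::terminated)
}
```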

afck · 2018-05-07T13:30:30Z

src/proto/mod.rs

+    /// BVAL message with an epoch.
+    BVal((u32, bool)),
+    /// AUX message with an epoch.
+    Aux((u32, bool)),


Maybe we should move the AgreementMessage type into the agreement module. I think the individual algorithms' modules could be pretty self-contained.

vkomenda · 2018-05-08T16:31:04Z

@afck, I've changed on_input_agreement considerably while working on your passing remark on uid. Please check whether you find any issues with the new version. The rationale behind the new version is that any incoming BA message should be delivered to all local BA instances since all BA messages are broadcasts, hence there is no explicit destination of a BA message.

afck

There are still a few details I'm not sure about. We should add extensive tests for this soon. (In a later PR, if you prefer.)

afck · 2018-05-09T09:04:39Z

proto/message.proto

+    oneof payload {
+        BroadcastProto broadcast = 1;
+        AgreementProto agreement = 2;
+    }


Sorry about my remark! You were right, of course, and we should leave the indentation at 2 spaces instead of 4 here, as is the standard for proto files.

afck · 2018-05-09T09:30:36Z

src/agreement.rs

+            self.received_aux
+                .values()
+                .filter(|values| values.is_subset(&self.bin_values))
+                .map(|values| vals.union(values))


BTreeSet::union doesn't modify the receiver, so vals will remain empty. This should probably say:

vals.extend(values)
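A small sketch of the difference, with `u64` in place of `NodeUid` and a hypothetical function name `collect_aux_vals`:

```rust
use std::collections::{BTreeSet, HashMap};

/// Collects the values of all AUX entries that are subsets of `bin_values`.
pub fn collect_aux_vals(
    received_aux: &HashMap<u64, BTreeSet<bool>>,
    bin_values: &BTreeSet<bool>,
) -> BTreeSet<bool> {
    let mut vals: BTreeSet<bool> = BTreeSet::new();
    for values in received_aux
        .values()
        .filter(|values| values.is_subset(bin_values))
    {
        // `extend` mutates `vals`. `vals.union(values)` would only build a
        // lazy iterator and leave `vals` untouched.
        vals.extend(values);
    }
    vals
}
```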

afck · 2018-05-09T09:32:13Z

src/agreement.rs

+    /// Sent BVAL values. Reset on every epoch update.
+    sent_bval: BTreeSet<bool>,
+    /// Values received in AUX messages. Reset on every epoch update.
+    received_aux: HashMap<NodeUid, BTreeSet<bool>>,


We could make this a HashMap<NodeUid, bool>, since multiple Aux messages from the same node can be safely ignored. (I think… can they?)

I think an agreement instance can output different AUX values in different epochs. Note that the values in AUX messages depend on the current state of bin_values which is subject to change.

Sorry, I meant multiple Aux messages in a single epoch can be safely ignored. This container should be cleared in each epoch change, shouldn't it?
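A rough sketch of that bookkeeping, under the assumption that only the first Aux message per sender and epoch counts. `AuxState`, `handle_aux` and `start_next_epoch` are hypothetical names, and `u64` stands in for `NodeUid`.

```rust
use std::collections::HashMap;

/// Hypothetical per-epoch AUX bookkeeping (illustration only).
#[derive(Default)]
pub struct AuxState {
    received_aux: HashMap<u64, bool>,
}

impl AuxState {
    /// Stores the first AUX value from `sender`. Returns `false` if the
    /// sender already sent one in this epoch and the message was ignored.
    pub fn handle_aux(&mut self, sender: u64, b: bool) -> bool {
        if self.received_aux.contains_key(&sender) {
            return false;
        }
        self.received_aux.insert(sender, b);
        true
    }

    /// Clears the received AUX values when moving on to the next epoch.
    pub fn start_next_epoch(&mut self) {
        self.received_aux.clear();
    }

    /// Number of senders recorded in the current epoch.
    pub fn len(&self) -> usize {
        self.received_aux.len()
    }
}
```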

afck · 2018-05-09T09:33:34Z

src/agreement.rs

+        let (count_aux, vals) = self.count_aux();
+        if count_aux < self.num_nodes - self.num_faulty_nodes {
+            // Continue waiting for the (N - f) AUX messages.
+            (None, None)


You could make this an early return (None, None), then the rest of the method wouldn't have to be indented.

afck · 2018-05-09T09:36:30Z

src/agreement.rs

+        } else {
+            // FIXME: Implement the Common Coin algorithm. At the moment the
+            // coin value is random and local to each instance of Agreement.
+            let coin2 = random::<bool>();


Using random means the coin won't be "common" at all: each node will have a different value. I think that breaks the guarantee that it terminates, even without any malicious nodes. The instances take the coin value as a hint when to terminate, so an unlucky node may outlive all its remote counterparts, and never be able to output anything.

afck · 2018-05-09T09:43:23Z

src/agreement.rs

+                let output: Vec<Option<bool>> = vals.into_iter()
+                    .take(1)
+                    .map(|b| {
+                        message = Some(self.set_input(b));


Does self.epoch need to be incremented before this call? I imagine the BVal messages this sends should have the next epoch.

You are right. It makes sense to increment the epoch earlier.

afck · 2018-05-09T09:44:29Z

src/common_subset.rs

+    /// Message from a remote node `message_sender_id` concerning the common
+    /// subset element proposed by the node `element_proposer_id`.
+    Agreement {
+        message_sender_id: NodeUid,


The message sender's ID doesn't need to be part of the message itself, I think, since it's passed into on_input anyway.

I think you are confusing CommonSubset::on_input which does not receive the ID as argument with Agreement::on_input which does. The latter gets the sender ID from the message received by CommonSubset::on_input.

Why can't CommonSubset::on_input just pass on the sender ID to Agreement::on_input? It looks dangerous to me to have it in the message, since it would basically allow the sender to lie about their identity.

Ah, no, the sender ID is in common_subset::Input which is not a message but rather a type of input to the Common Subset algorithm. As we discussed before, AgreementMessage does not have any IDs.

Right, I see! But then what will actually be transmitted over the network, i.e. what's the "common subset" message type? Because that will need to contain element_proposer_id, but not message_sender_id, I think. So wouldn't it be more natural to replace Input with Message, and change the signature of CommonSubset::on_input so that it takes the sender ID separately from the message?
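Roughly what I have in mind, as a sketch only. The `Message` enum, its fields and the `route` function are hypothetical; the point is that the sender ID is supplied by the caller (who knows which peer the bytes came from) and never read from the message.

```rust
/// Hypothetical wire message for Common Subset. It names the proposer whose
/// element the payload concerns, but carries no sender ID.
#[derive(Debug, Clone, PartialEq)]
pub enum Message {
    Agreement {
        element_proposer_id: u64,
        epoch: u32,
        bval: bool,
    },
}

/// Hypothetical routing step of an `on_input`-style handler.
/// Returns `(element_proposer_id, sender_id)`: the agreement instance the
/// message belongs to and the peer it should be attributed to.
pub fn route(sender_id: u64, message: &Message) -> (u64, u64) {
    match message {
        Message::Agreement {
            element_proposer_id,
            ..
        } => (*element_proposer_id, sender_id),
    }
}
```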

afck · 2018-05-09T09:46:35Z

src/common_subset.rs

-    ) -> Result<Option<AgreementMessage>, Error> {
-        if let Some(agreement_instance) = self.agreement_instances.get_mut(uid) {
+    fn on_broadcast_result(&mut self, uid: &NodeUid) -> Result<Option<AgreementMessage>, Error> {
+        if let Some(agreement_instance) = self.agreement_instances.get_mut(&uid) {


I wonder whether it's dangerous to change an agreement instance's estimate value later than round 0. Could this prevent termination? (Not sure about that.)

Actually, I think the Agreement interface needs to change instead, and just ignore inputs after the first round.

I implemented this in a stricter way. There is a check has_input which returns true iff the first estimated value has been written. set_input will return an error if the input has already been provided.
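A simplified sketch of that check. The names has_input and set_input are the ones described above; the `estimated` field, the `Error::InputNotAccepted` variant and the `Result<(), Error>` return type are assumptions for this illustration (the real set_input also produces an outgoing message).

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum Error {
    InputNotAccepted,
}

/// Minimal stand-in for `Agreement`, holding only the first estimate.
#[derive(Default)]
pub struct Agreement {
    estimated: Option<bool>,
}

impl Agreement {
    /// True iff the first estimated value has been written.
    pub fn has_input(&self) -> bool {
        self.estimated.is_some()
    }

    /// Accepts the input once and rejects any later attempt.
    pub fn set_input(&mut self, input: bool) -> Result<(), Error> {
        if self.has_input() {
            return Err(Error::InputNotAccepted);
        }
        self.estimated = Some(input);
        Ok(())
    }
}
```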

afck · 2018-05-09T09:50:24Z

src/common_subset.rs

+        let mut result = Err(Error::NoSuchAgreementInstance);
+
+        // Send the message to the local instance of Agreement
+        if let Some(mut agreement_instance) = self.agreement_instances.get_mut(&element_proposer_id)


Instead of the mutable result, you could just return here on error:

let agreement_instance = self.agreement_instances
    .get_mut(&element_proposer_id)
    .ok_or(Error::NoSuchAgreementInstance)?;

Actually the mutable result helps to avoid a multiple mutable borrow conflict.
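For reference, the early-return form in isolation. This sketch does not reproduce the borrow conflict mentioned in the reply, since it borrows a standalone map rather than a field of `self`; the `Agreement` stand-in, its `inputs_seen` counter and the `deliver` function are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum Error {
    NoSuchAgreementInstance,
}

/// Minimal stand-in for `Agreement` that just counts delivered inputs.
pub struct Agreement {
    pub inputs_seen: usize,
}

/// Looks up the instance for `element_proposer_id` and returns early with
/// an error if it does not exist.
pub fn deliver(
    instances: &mut HashMap<u64, Agreement>,
    element_proposer_id: u64,
) -> Result<usize, Error> {
    let agreement_instance = instances
        .get_mut(&element_proposer_id)
        .ok_or(Error::NoSuchAgreementInstance)?;
    agreement_instance.inputs_seen += 1;
    Ok(agreement_instance.inputs_seen)
}
```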

afck · 2018-05-09T09:54:32Z

src/common_subset.rs

-                let instances = &mut self.agreement_instances;
-                for (_uid0, instance) in instances.iter_mut() {
+            if results1 >= self.num_nodes - self.num_faulty_nodes {
+                for instance in self.agreement_instances.values_mut() {


See above: I'm worried that this could break the agreement if the instance has already progressed beyond the first round. (But it's not quite clear in the paper and I haven't thought much about it yet.)

In the paper, BA instances can set the estimated value only for the next epoch. So, if an instance hasn't been provided with input true from the ACS instance, the estimated value of the epoch 0 will remain empty until the ACS instance fills it in using Agreement::set_input(false). Then that BA can terminate as soon as the coin value is false due to the termination criterion.

afck · 2018-05-09T14:46:43Z

src/agreement.rs

+            if self.bin_values.contains(b) {
+                vals.insert(b.clone());
+            }
+        }


But now vals.len() will always be <= 2 (number of values instead of number of entries). Maybe it should be:

let aux_count = self.received_aux.values()
    .filter(|b| self.bin_values.contains(*b))
    .count();
(aux_count, vals)

How do you define vals?

The same way you already do.

If so then aux_count == vals.len(). So, aux_count is redundant.

It isn't, in general: vals is a set of bools, so it has 0, 1 or 2 elements. The above aux_count would be the number of entries in the received_aux map that have one of those values, which can be anything between 0 and num_nodes. (The values() iterator doesn't deduplicate, so it will yield &true and &false as often as they appear in the map.)
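A sketch that computes both quantities side by side, assuming `received_aux` is a `HashMap<NodeUid, bool>` (with `u64` standing in for `NodeUid`). The function name `count_aux` mirrors the one in the diff, but the signature here is hypothetical.

```rust
use std::collections::{BTreeSet, HashMap};

/// Returns the number of AUX entries whose value is in `bin_values`,
/// together with the set of distinct values among those entries.
pub fn count_aux(
    received_aux: &HashMap<u64, bool>,
    bin_values: &BTreeSet<bool>,
) -> (usize, BTreeSet<bool>) {
    let mut aux_count = 0;
    let mut vals = BTreeSet::new();
    for b in received_aux.values().filter(|b| bin_values.contains(*b)) {
        // One increment per matching entry (up to the number of nodes).
        aux_count += 1;
        // The set deduplicates, so it never holds more than two elements.
        vals.insert(*b);
    }
    (aux_count, vals)
}
```

With three entries `true`, `true`, `false` and `bin_values = {true}`, the count is 2 while `vals` has a single element.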

afck · 2018-05-10T07:02:32Z

src/agreement.rs

+        let vals: BTreeSet<bool> = self.received_aux
+            .values()
+            .filter(|b| {
+                count += 1;


Now it will count the total number of entries in received_aux, instead of just the ones whose values are in bin_values.

afck · 2018-05-10T10:02:14Z

src/agreement.rs

+        // Check the termination condition: "continue looping until both a
+        // value b is output in some round r, and the value Coin_r' = b for
+        // some round r' > r."
+        self.terminated = self.terminated || self.estimated.values().any(|b| *b == coin2);


It's not sufficient that any previous estimate agrees with the coin. The output has to agree with it.
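One way to express the quoted criterion, as a sketch. It assumes the instance remembers its output together with the epoch in which it was produced; the function name `should_terminate` and the `(u32, bool)` representation are hypothetical.

```rust
/// Checks the termination condition: a value `b` was output in some round
/// `r`, and the coin equals `b` in the current round `epoch > r`.
///
/// `output` is `Some((r, b))` once the instance has output `b` in round `r`.
pub fn should_terminate(output: Option<(u32, bool)>, epoch: u32, coin: bool) -> bool {
    match output {
        Some((output_epoch, b)) => epoch > output_epoch && coin == b,
        // Without an output, no coin value can trigger termination.
        None => false,
    }
}
```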

afck · 2018-05-10T10:06:22Z

src/agreement.rs

+    /// Estimates of the decision value in all epochs. The first estimated value
+    /// is provided as input by Common Subset using the `set_input` function
+    /// which triggers the algorithm to start.
+    estimated: BTreeMap<u32, bool>,


I don't think we actually need to remember the previous estimates: Once we move on to epoch i + 1, the value from epoch i can safely be forgotten.

I think we could even just make estimate a simple bool and initialize it with false (without sending messages). Agreement::on_input would then do nothing if epoch > 0 or input has already been provided before (i.e. we'd need another has_input: bool).

We'd also need a has_output: bool. After that's true, the estimate can't change again (we don't even need to cater for that, as it's guaranteed by the algorithm itself, I think), and we can terminate as soon as in any future round has_output && coin == estimate.

For the termination criterion, I mostly agree but I think the paper asks for self.output == Some(coin). In that case do we need estimated at all? It is never read.

My mistake, estimated is used for broadcast.

No, I think you're right: It is used for broadcasting BVal, but only at the point where we transition to a new epoch (or receive user input in epoch 0). After that we can forget it, so we don't actually have to store it in the struct at all. (But then we'd have to make output an Option<bool>, as you say. Either way is fine with me.)

Please have a look at my last commit. I added a BVal message at the start of every epoch r, r > 0. I think the paper clearly requires those.

afck

Looks great now! 👍

Thanks for your patience and sorry for the long list of nit-picks! I've just got one more, but we can also resolve this in another PR.

afck · 2018-05-10T11:40:16Z

src/agreement.rs

-            self.try_coin();
-            Ok((self.output, VecDeque::new()))
+            outgoing.extend(self.try_coin());
+            Ok((self.output, outgoing))


We need to make sure we output only once.

vkomenda · 2018-05-10T11:46:53Z

Yes, I fixed the once-per-instance property of output too.

afck requested changes May 7, 2018

afck requested changes May 9, 2018

afck reviewed May 9, 2018

afck reviewed May 10, 2018

vkomenda added 9 commits May 10, 2018 10:02

Binary Agreement implementation and its wiring into Common Subset

d3b974f

defined the output from the Common Subset algorithm

5215156

added a TODO file and changed indentation in the .proto file

2205f90

changed code according to review comments

394462c

added element_proposer_id to the Agreement input to the Common Subset…

3e35cc6

… algorithm

removed the separate field in Agreement and corrected computation of …

e0005a6

…estimated values

clear the received AUX messages on every epoch update

b6a6bb3

corrected the count of AUX messages

259d536

fixed the count of matching AUX messages

51ef11b

vkomenda force-pushed the vk-agreement branch from 8c75f44 to 51ef11b Compare May 10, 2018 09:07

added the proposer ID to common_subset::handle_broadcast and made it a…

cf33bac

…n interface fn

afck requested changes May 10, 2018

vkomenda added 3 commits May 10, 2018 12:09

replaced the map of estimated values with only one optional value for…

fb50e38

… the current epoch

added a dev dependency on rand for CI

49c33d2

added the missing agreement broadcast message on epoch change

68e6a7a

afck approved these changes May 10, 2018

correction: Agreement outputs a value only once

57ff64c

afck merged commit d9febca into master May 10, 2018

afck deleted the vk-agreement branch May 10, 2018 12:01

vkomenda mentioned this pull request May 10, 2018

Binary agreement protocol instance #2

Closed
