FAQ section of README.md updated #172
Conversation
Thanks for the PR! Some notes and a better description of the "synchronous" handling status.
> ### Does stolon use Consul as a DNS server as well?
>
> Consul (or etcd) is used only as a key-value storage.
I don't completely get the meaning of this question. Do you mean registering a service in consul? If so which service (the proxies?)? I don't see why stolon should do this.
Consul has a concept of services (registered through the API, e.g. `curl -XPUT -d @req.json http://10.0.3.223:8500/v1/agent/service/register`). These services are available not only via the REST API (e.g. `curl -s http://10.0.3.224:8500/v1/catalog/service/postgresql-replica`) but also via DNS, which is built into Consul:
```
$ dig @127.0.0.1 -p 8600 postgresql-replica.service.consul

;; QUESTION SECTION:
;postgresql-replica.service.consul. IN A

;; ANSWER SECTION:
postgresql-replica.service.consul. 0 IN A 10.0.3.223
postgresql-replica.service.consul. 0 IN A 10.0.3.224
```
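(For reference, the `req.json` passed to the register endpoint above is just a service definition. A minimal sketch, with hypothetical values matching the records above, could look like this; see the Consul agent service API docs for the full set of fields.)

```sh
# Minimal Consul service definition (hypothetical name/port for this example).
cat > req.json <<'EOF'
{
  "Name": "postgresql-replica",
  "Port": 5432
}
EOF
# Register the service with the local agent.
curl -XPUT -d @req.json http://10.0.3.223:8500/v1/agent/service/register
```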
You can also request an SRV record; in this case you will also receive port numbers:
```
$ dig srv @127.0.0.1 -p 8600 postgresql-replica.service.consul

;; QUESTION SECTION:
;postgresql-replica.service.consul. IN SRV

;; ANSWER SECTION:
postgresql-replica.service.consul. 0 IN SRV 1 1 5432 postgresql-slave-2.node.dc1.consul.
postgresql-replica.service.consul. 0 IN SRV 1 1 5432 postgresql-slave.node.dc1.consul.

;; ADDITIONAL SECTION:
postgresql-slave-2.node.dc1.consul. 0 IN A 10.0.3.224
postgresql-slave.node.dc1.consul. 0 IN A 10.0.3.223
```
This is very convenient for applications that are not aware of Consul. All you need is a DNS resolver without any caching (Consul's TTL is 0) and domain names like `current-postgresql-master.service.consul` and `current-standby-3.service.consul`. No proxy is required. Naturally, when you promote a standby you'd better close all clients' connections so clients become aware that something changed (e.g. `SELECT pg_is_in_recovery();`) - see http://stackoverflow.com/a/5408501/1565238
So basically the question is whether a client can determine where the current master and standbys are using the DNS protocol.
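For illustration, such a client-side check could look roughly like this (a sketch; the DNS name is the hypothetical one mentioned above):

```sh
# pg_is_in_recovery() returns "t" on a standby and "f" on the master,
# so a client can verify it is still talking to the node it expects.
psql -h current-postgresql-master.service.consul -U postgres -tAc \
  "SELECT pg_is_in_recovery();"
```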
> > Specifies a comma-separated list of standby names that can support synchronous replication, as described in Section 25.2.8. At any one time there will be at most one active synchronous standby; transactions waiting for commit will be allowed to proceed after this standby server confirms receipt of their data. The synchronous standby will be the first standby named in this list that is both currently connected and streaming data in real-time (as shown by a state of streaming in the `pg_stat_replication` view). Other standby servers appearing later in this list represent potential synchronous standbys.
>
> It means that in case of a netsplit the synchronous standby may not be among the majority nodes. In this case some recent changes will be lost. Although this is not a major problem for most web projects, currently you shouldn't use stolon for storing data that must not be lost under any circumstances.
I won't add the concept of quorum here since it creates more confusion. Also the postgres doc doesn't talk about "quorum".
In addition, the real problem here is not a netsplit (that's just one of the possible causes) but the fact that we let postgres choose the active synchronous standby, so the sentinel cannot know which standby was the active synchronous one when the master was declared dead. So the only way the sentinel has is to find the "best" standby based on the last known xlog position. But if both the master and the active synchronous standby go down at the same time, another standby will be chosen and it may not be in full sync.
I opened #173 with a description of and a solution to this. It will also work with postgresql <= 9.5, but with the limitation of setting only one sync standby. Thoughts?
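For context, comparing standbys by xlog position can be sketched like this (illustration only, using the 9.x function names; this is not necessarily how the sentinel does it internally):

```sh
# On each candidate standby, read the last xlog location replayed so far.
psql -h standby1 -U postgres -tAc "SELECT pg_last_xlog_replay_location();"
psql -h standby2 -U postgres -tAc "SELECT pg_last_xlog_replay_location();"
# The returned locations (e.g. 0/3000060) can be compared with
# pg_xlog_location_diff() to pick the most advanced standby.
```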
And I've realized that the answer here is not entirely true anyway. If the cluster size is 3, then the master + one synchronous replica make a quorum, so in this case data can't be lost. I'll rewrite this.
#173 looks good to me. Determining which version of PostgreSQL is running is simple, and knowing that, we know what to write to postgresql.conf if the user would like real consistency.
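For example, something along these lines (a sketch; the standby names are hypothetical):

```sh
# Detect the running PostgreSQL version.
psql -U postgres -tAc "SHOW server_version_num;"   # e.g. 90504 or 90600

# Up to 9.5, synchronous_standby_names is a plain list (one active sync
# standby at a time); 9.6 adds the "N (name, ...)" multi-sync syntax.
# postgresql.conf:
#   synchronous_standby_names = 'keeper1'               # 9.5 and earlier
#   synchronous_standby_names = '2 (keeper1, keeper2)'  # 9.6+
```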
> Currently the proxy redirects all requests to the master. There is a [feature request](https://github.com/sorintlab/stolon/issues/132) for using the proxy also for standbys but it's low in the priority list. There is a workaround though.
>
> The application can learn the cluster configuration from the `stolon/cluster/mycluster/clusterdata` key. Consul allows subscribing to updates of this key like this:
I'll add an example (even if this is going to change with #160) of what to do with that data (i.e. get the clusterview.keeperole infos).
The real problem with this is that, without more logic, there's no assurance whether the standbys are in sync, dead, or something else.
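For example, a Consul blocking query on that key could look roughly like this (a sketch; local agent address assumed, key path taken from the quote above):

```sh
# The first read returns the current value plus an X-Consul-Index header.
curl -si http://127.0.0.1:8500/v1/kv/stolon/cluster/mycluster/clusterdata | grep -i x-consul-index

# Long-poll: this request blocks (up to 5 minutes here) until the key
# changes past the index obtained above (placeholder shown).
curl -s "http://127.0.0.1:8500/v1/kv/stolon/cluster/mycluster/clusterdata?index=<X-Consul-Index>&wait=5m"
```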
> The application can learn the cluster configuration from the `stolon/cluster/mycluster/clusterdata` key. Consul allows subscribing to updates of this key like this:
This is also available with etcd. Not sure if this should be detailed; we could just say that one can use the watch features of the store.
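For comparison, the etcd (v2 API) equivalent would be something along these lines (a sketch; the exact key prefix used by stolon on etcd may differ):

```sh
# etcd v2 API: a GET with ?wait=true blocks until the key changes.
curl -s "http://127.0.0.1:2379/v2/keys/stolon/cluster/mycluster/clusterdata?wait=true"

# Or with etcdctl (v2):
etcdctl watch /stolon/cluster/mycluster/clusterdata
```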
OK, I'll fix this.
> ### Lets say I have multiple stolon clusters. Do I need a separate Consul / etcd cluster for each stolon cluster?
>
> It depends on your architecture and where the different stolon clusters are located. In general, if two clusters live on complitely different hardware, to to handle all possible courner cases (like netslits) you need a separate Consul / etcd cluster for each stolon cluster.
- complitely -> completely
- courner -> corner
- netslits -> netsplits
After #219 I'll open a PR to update these FAQs and change them to reflect the newly implemented features.
Merge and update sorintlab#164 and sorintlab#172.
Reworked in #224.
Based on discussion: #168 (comment)