Start timeout and wal_keep_segments must be configurable #372

Open
tarabanton opened this issue Oct 20, 2017 · 7 comments

@tarabanton commented Oct 20, 2017

Good day.
We have a stolon cluster with a high WAL generation rate, so postgres can't start within the hardcoded 60 seconds:

args = append([]string{"start", "-w", "--timeout", "60", "-D", p.dataDir, "-o", "-c unix_socket_directories=" + common.PgUnixSocketDirectories}, args...)

Also, 8 WAL segments (wal_keep_segments) is too low for us, so we are using a little hack to override it with a higher value.
Maybe wal_keep_segments should be allowed to be increased above the current limit?

@tarabanton (Author)

I'm not a Go guy, but after some reading of the source code, I found this cluster parameter:

SyncTimeout *Duration `json:"syncTimeout,omitempty"`

Maybe we should use it when starting a replica with the --timeout parameter?
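
A minimal sketch of that plumbing, falling back to the current 60-second default when syncTimeout is unset; everything here other than the SyncTimeout idea is illustrative, not stolon's actual code:

package main

import (
	"fmt"
	"strconv"
	"time"
)

// Duration mirrors the cluster-spec wrapper around time.Duration
// (assumed shape; stolon's real type may differ).
type Duration struct{ time.Duration }

// startTimeoutSeconds returns the value to pass to pg_ctl's --timeout,
// reusing syncTimeout when set and keeping 60s as the default.
func startTimeoutSeconds(syncTimeout *Duration) string {
	t := 60 * time.Second // current hardcoded default
	if syncTimeout != nil && syncTimeout.Duration > 0 {
		t = syncTimeout.Duration
	}
	return strconv.Itoa(int(t / time.Second))
}

func main() {
	args := []string{"start", "-w", "--timeout", startTimeoutSeconds(&Duration{5 * time.Minute})}
	fmt.Println(args) // [start -w --timeout 300]
}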

@sgotti (Member) commented Oct 26, 2017

@tarabanton We can make the start timeout configurable. I have to check if it can be based on the sync timeout, since it's also used to start the primary instance.
wal_keep_segments is hardcoded but not needed by stolon itself, since it uses replication slots, so it can be made configurable without impacting replication. What's the reason you need a higher value?
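
One way the keeper could let a user-supplied value win over the hardcoded default, sketched with assumed names rather than stolon's actual merge logic:

package main

import "fmt"

// mergePgParameters overlays user-provided pgParameters on the keeper's
// built-in defaults, so a cluster-spec wal_keep_segments would override
// the hardcoded "8". Hypothetical sketch, not stolon's real code.
func mergePgParameters(defaults, user map[string]string) map[string]string {
	merged := make(map[string]string, len(defaults)+len(user))
	for k, v := range defaults {
		merged[k] = v
	}
	for k, v := range user { // user values take precedence
		merged[k] = v
	}
	return merged
}

func main() {
	defaults := map[string]string{"wal_keep_segments": "8"}
	user := map[string]string{"wal_keep_segments": "64"}
	fmt.Println(mergePgParameters(defaults, user)["wal_keep_segments"]) // 64
}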

@tarabanton (Author) commented Oct 26, 2017

@sgotti The cause is the high WAL generation rate. We have quite a large amount of data to transfer to the other host (several hundred gigabytes); by the time the slave starts, the old WAL logs have already been deleted by the master, which leaves the slave unable to start.
Data is transferred at 2 Gbit/s between the hosts.

Our cluster spec:

{
	"synchronousReplication": true,
	"additionalWalSenders": null,
	"usePgrewind": true,
	"initMode": "new",
	"pgParameters": {
		"archive_mode": "off",
		"checkpoint_completion_target": "0.9",
		"datestyle": "iso, mdy",
		"default_text_search_config": "pg_catalog.english",
		"dynamic_shared_memory_type": "posix",
		"effective_cache_size": "4GB",
		"effective_io_concurrency": "128",
		"lc_messages": "en_US.UTF-8",
		"lc_monetary": "en_US.UTF-8",
		"lc_numeric": "en_US.UTF-8",
		"lc_time": "en_US.UTF-8",
		"log_destination": "stderr",
		"log_directory": "pg_log",
		"log_filename": "postgresql-%a.log",
		"log_line_prefix": "[%t] %a - %u@%d ",
		"log_rotation_age": "1d",
		"log_timezone": "UTC",
		"log_truncate_on_rotation": "on",
		"logging_collector": "on",
		"maintenance_work_mem": "512MB",
		"max_connections": "100",
		"shared_buffers": "2GB",
		"shared_preload_libraries": "pg_stat_statements",
		"timezone": "UTC",
		"track_activity_query_size": "20480",
		"work_mem": "16MB"
	},
	"pgHBA": null
}

@sgotti (Member) commented Jan 9, 2018

@tarabanton I'm not sure how to proceed here. wal_keep_segments isn't required, since we use replication slots, which prevent the removal of WALs not yet sent to standbys; but you said that WALs are being removed, so I'm not sure how this could happen.
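
One way to investigate is to inspect the slots on the primary while a standby syncs; if the standby's slot is missing or has no restart_lsn at that point, the primary is free to recycle WAL. A small diagnostic sketch, with the connection string as a placeholder:

package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq"
)

func main() {
	// Placeholder conninfo; point it at the current primary.
	db, err := sql.Open("postgres", "host=localhost user=postgres dbname=postgres sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// restart_lsn is the oldest WAL position the slot still retains;
	// a NULL value means the slot reserves no WAL yet.
	rows, err := db.Query(`SELECT slot_name, active, restart_lsn FROM pg_replication_slots`)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	for rows.Next() {
		var name string
		var active bool
		var lsn sql.NullString
		if err := rows.Scan(&name, &active, &lsn); err != nil {
			log.Fatal(err)
		}
		fmt.Printf("slot=%s active=%t restart_lsn=%s\n", name, active, lsn.String)
	}
}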

@sgotti (Member) commented Feb 27, 2018

@tarabanton ping

@tarabanton (Author) commented Feb 27, 2018

It seems that I also forgot to clarify something.
The reason for my request came down to two facts:

  • stolon initialized the replication slot AFTER the initial standby sync, and therefore didn't get the full benefit of replication slots, e.g. WAL persistence during the sync.
  • stolon-keeper didn't include WAL streaming parameters in pg_basebackup.

In recent releases you've added an option to stream WAL logs during the initial standby sync; it seems that will be sufficient.
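
For reference, WAL streaming during the base backup corresponds to pg_basebackup's --xlog-method=stream (renamed --wal-method in PostgreSQL 10): segments generated while the copy runs are fetched over a separate connection instead of being read from the primary's pg_xlog afterwards. A sketch of that kind of invocation, with the data directory and conninfo as placeholders:

package main

import (
	"log"
	"os"
	"os/exec"
)

func main() {
	// Stream WAL while the backup runs so the primary can't recycle
	// the segments the new standby will need. Paths are placeholders.
	cmd := exec.Command("pg_basebackup",
		"-D", "/var/lib/stolon/postgres",
		"-d", "host=primary.example.com user=replicator",
		"--xlog-method=stream",
	)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		log.Fatal(err)
	}
}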

@tarabanton (Author)

@sgotti thanks for your help
