What would cause block_blob.download_range_to_stream to generate storage exception with http code = 206 and what to do about it? #203

Open
yxiang92128 opened this issue Aug 15, 2018 · 5 comments

Comments

@yxiang92128

Hi,

I have a stress test that continuously downloads blob objects of the same size (20 MB), but it very often throws an exception of type const azure::storage::storage_exception& for which e.result().http_status_code() equals 206 (PARTIAL_CONTENT).

See the code block below:

try
{
    block_blob.download_range_to_stream(output_stream, offset, download_size, azure::storage::access_condition(), reqOptions, azure::storage::operation_context());
}
catch (const azure::storage::storage_exception& e)
{
    //std::cout << U("Error:") << e.what() << std::endl << U("The object can not be downloaded.") << std::endl;

    // azure does not return REST API http response code so we are setting this to be a generic BAD_REQUEST
    info.SetRetry(e.retryable());
    int http_code = e.result().http_status_code();
    int libcode = 0;

    LOGIT(OBJSEV_ERROR,"Storage exception The object %s can not be downloaded <%s> failed with http code=<%d> ", obj_name.c_str(), e.what(), http_code);
    LOGIT(OBJSEV_ERROR,"partial buffer size=%d request length=%d ",buffer.collection().size(), download_size);
    return -1;
}

The question is whether download_range_to_stream is a synchronous call that guarantees the buffer is filled to the requested length, and what I should do when the buffer was only partially filled and the reported status code is 206.

What's the best practice here to guarantee that my requested length is filled?
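
One application-level way to approach this is to check how many bytes actually arrived and request only the missing tail of the range again. The following is a minimal sketch of that idea, not the library's own behavior. The names range_fetcher and download_exact are hypothetical, and the callback stands in for whatever performs the real ranged download (for example a wrapper around download_range_to_stream that returns the bytes it managed to read).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical callback: fetches up to `length` bytes starting at `offset`
// and returns whatever arrived (possibly fewer bytes than requested).
using range_fetcher =
    std::function<std::vector<unsigned char>(std::size_t offset, std::size_t length)>;

// Keeps requesting the missing tail of the range until `length` bytes have
// been collected or `max_attempts` calls have been made.
std::vector<unsigned char> download_exact(const range_fetcher& fetch,
                                          std::size_t offset,
                                          std::size_t length,
                                          int max_attempts)
{
    assert(max_attempts > 0);
    std::vector<unsigned char> result;
    result.reserve(length);
    for (int attempt = 0; attempt < max_attempts && result.size() < length; ++attempt)
    {
        const std::size_t remaining = length - result.size();
        std::vector<unsigned char> chunk = fetch(offset + result.size(), remaining);
        if (chunk.size() > remaining)
        {
            chunk.resize(remaining); // never keep more than was asked for
        }
        result.insert(result.end(), chunk.begin(), chunk.end());
    }
    if (result.size() != length)
    {
        throw std::runtime_error("range download incomplete after all attempts");
    }
    return result;
}
```

The caller gets either exactly the requested number of bytes or an exception, so a partially filled buffer is never returned silently.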

Thanks,

Yang

@katmsft
Member

katmsft commented Aug 17, 2018

According to the REST API documentation, the response status is 200 when the full blob is downloaded and 206 when only part of the blob is downloaded. That said, if the download range operation returned 206 and this was treated as an exception, it certainly looks like a bug. Could you please provide the version of azure-storage-cpp you are using, along with the OS and compiler?
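
To make the distinction concrete, here is a small sketch that separates the status code from the question of whether the body was complete. It is only an illustration: range_outcome, classify_download_status, and is_short_read are hypothetical names, and the assumption that a 206 can be paired with a truncated body is a reading of this thread, not something confirmed from the library source.

```cpp
#include <cassert>
#include <cstddef>

enum class range_outcome { full_content, partial_content, other };

// Maps the status of a download response to what it says about the body.
range_outcome classify_download_status(int http_status_code)
{
    assert(http_status_code >= 100 && http_status_code <= 599);
    switch (http_status_code)
    {
    case 200: return range_outcome::full_content;    // whole blob returned
    case 206: return range_outcome::partial_content; // requested range returned
    default:  return range_outcome::other;
    }
}

// A successful status with fewer bytes than requested points at a truncated
// transfer rather than at a problem with the status code itself.
bool is_short_read(int http_status_code,
                   std::size_t bytes_received,
                   std::size_t bytes_requested)
{
    const bool success_status = http_status_code == 200 || http_status_code == 206;
    return success_status && bytes_received < bytes_requested;
}
```

With this split, a 206 on a ranged request is the expected success status, and only the byte count decides whether the download needs to be retried.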

@yxiang92128
Author

Hi Kan,
The azure-storage-cpp version I am using is 5.0, with cpprestsdk 2.9.1 and boost 1.54. The OS is SLES12SP3 and the compiler is g++ (SUSE Linux) 4.8.5 (C++11).
It occurs about once in 100 downloads of 20 MB objects from the Azure backend.

@hallca

hallca commented Aug 29, 2018

We are seeing similar exceptions when reading block blobs while our network is overloaded. azure-storage-cpp logs the messages below when these exceptions are thrown.

We're using:

  • azure-storage-cpp version 3.0.0
  • cpprestsdk version 2.9.1
  • Windows Server 2019 Datacenter
  • VS2017 version 15.7.5
  • CL version 19.14.26433
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
	<System>
		<Provider Guid="{ee5d17c5-1b3e-4792-b0f9-f8c5fc6ac22a}" />
		<EventID>0</EventID>
		<Version>0</Version>
		<Level>2</Level>
		<Task>0</Task>
		<Opcode>0</Opcode>
		<Keywords>0x0</Keywords>
		<TimeCreated SystemTime="2018-08-29T11:28:20.328469100-07:00" />
		<Correlation ActivityID="{00000000-0000-0000-0000-000000000000}" />
		<Execution ProcessID="6812" ThreadID="10100" ProcessorID="4" KernelTime="2325" UserTime="10395" />
		<Channel />
		<Computer />
	</System>
		<Data>29be0bf0-d8e9-4966-9a48-f308f2071c19 : Retry policy did not allow for a retry, so throwing exception: Incorrect number of bytes received.</Data>
</Event>
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
	<System>
		<Provider Guid="{ee5d17c5-1b3e-4792-b0f9-f8c5fc6ac22a}" />
		<EventID>0</EventID>
		<Version>0</Version>
		<Level>2</Level>
		<Task>0</Task>
		<Opcode>0</Opcode>
		<Keywords>0x0</Keywords>
		<TimeCreated SystemTime="2018-08-29T11:29:00.320359500-07:00" />
		<Correlation ActivityID="{00000000-0000-0000-0000-000000000000}" />
		<Execution ProcessID="6812" ThreadID="8824" ProcessorID="5" KernelTime="1860" UserTime="7305" />
		<Channel />
		<Computer />
	</System>
		<Data>29be0bf0-d8e9-4966-9a48-f308f2071c19 : Retry policy did not allow for a retry, so throwing exception: Incorrect number of bytes received.</Data>
</Event>

@EmmaZhu
Member

EmmaZhu commented Feb 22, 2019

Hi @hallca,

From our code, the "Incorrect number of bytes received." exception is a retryable exception, which means that if the retry policy allows it, the SDK should retry the request instead of throwing the exception directly.

If it's OK, could you share your code segment for our further investigation?
Meanwhile, I'll try to repro this issue. I may need some time to construct a fake partial response.

Thanks
Emma

@EmmaZhu
Member

EmmaZhu commented Feb 28, 2019

@hallca, I have reproduced this issue. With the default retry policy, it will retry on the error 3 times at most. If it still throws this exception, it might be because the request could not be completed within 3 retries, or because the retry policy was changed.

If this happens frequently in your environment, I think changing the retry policy to allow more retries could mitigate this kind of issue.
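
As a rough illustration of what "more retries" buys, here is a library-independent sketch of a retry loop with exponential backoff. It does not use the SDK's own retry policy classes. retry_with_backoff is a hypothetical helper, and the operation callback is assumed to return true once the download completed with the expected number of bytes.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Runs `operation` until it reports success or `max_attempts` is exhausted,
// doubling the wait between consecutive attempts.
bool retry_with_backoff(const std::function<bool()>& operation,
                        int max_attempts,
                        std::chrono::milliseconds initial_delay)
{
    assert(max_attempts > 0);
    std::chrono::milliseconds delay = initial_delay;
    for (int attempt = 1; attempt <= max_attempts; ++attempt)
    {
        if (operation())
        {
            return true;
        }
        if (attempt < max_attempts)
        {
            std::this_thread::sleep_for(delay); // back off before the next try
            delay *= 2;
        }
    }
    return false;
}
```

An operation that fails a few times in a row on an overloaded network gives up under a small attempt limit but succeeds under a larger one, which is the effect a more generous retry policy is meant to have.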
