TPS, result.poller and 429 errors when using python begin_classify_document from Document Intelligence SDK #39643
Hi,
We have built a solution that uses client_async.begin_classify_document from the Document Intelligence Python SDK.
When we started seeing 429 errors, we implemented load-balancing logic using a semaphore and async calls. This caps the initial parallel calls at 14 (one below the maximum TPS), so that we should never have more than 14 simultaneous transactions per second.
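As an illustration, the pattern looks roughly like the sketch below. It is not the actual production code: `fake_classify` is a hypothetical stand-in for the SDK call plus `poller.result()`, and `MAX_IN_FLIGHT` and `run_throttled` are names invented for this example.

```python
import asyncio

MAX_IN_FLIGHT = 14  # assumption: one below a 15 TPS limit


async def fake_classify(duration: float) -> None:
    """Hypothetical stand-in for begin_classify_document plus poller.result()."""
    await asyncio.sleep(duration)


async def run_throttled(durations: list[float], limit: int = MAX_IN_FLIGHT) -> int:
    """Run all calls behind a semaphore and return the peak number in flight."""
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def worker(duration: float) -> None:
        nonlocal in_flight, peak
        async with semaphore:
            # Track how many calls hold the semaphore at the same time.
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                await fake_classify(duration)
            finally:
                in_flight -= 1

    await asyncio.gather(*(worker(d) for d in durations))
    return peak


if __name__ == "__main__":
    # 50 short calls: the peak concurrency never exceeds the semaphore limit.
    print(asyncio.run(run_throttled([0.01] * 50)))
```

Note that a semaphore like this bounds how many calls are in flight at once, not how many are started per second. If individual calls complete in well under a second, more than 14 calls can start within the same second.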
However, we still seem to receive the 429 http errors.
The documentation does not make clear how TPS is defined. Do we need to assume that the poller's status requests (poller.result) also count toward the TPS? We have no way to control how long the poller takes. In our use case we have PDFs of different sizes, so we stream them to the endpoint. Therefore we do not know how long a classification will take, nor how many times the poller will try to fetch the final result.
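To make the concern concrete, here is a rough estimate of how many HTTP requests a single long-running operation could generate. It assumes one initial request followed by one status request per polling interval until the operation finishes; the function name and both parameters are hypothetical, and whether the status requests count against the same TPS limit is exactly the open question.

```python
import math


def requests_per_operation(duration_s: float, polling_interval_s: float) -> int:
    """Estimate HTTP requests for one long-running operation.

    Assumption: one initial request, then one status request per polling
    interval until the operation has finished. Real pollers may differ.
    """
    if polling_interval_s <= 0:
        raise ValueError("polling_interval_s must be positive")
    # At least one status request is needed to observe completion.
    polls = max(1, math.ceil(duration_s / polling_interval_s))
    return 1 + polls
```

Under these assumptions, a classification that takes 12 seconds with a 5-second polling interval would produce 4 requests in total, so 14 parallel operations could generate several times more requests than the 14 initial calls alone.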
The best practices obviously say to implement retry logic, but that feels like a band-aid. We would like a better understanding of the expected behavior so that we can avoid 429s as much as possible. This would also benefit the backend, since it would not have to keep sending 429 responses.
What is the life cycle of a single transaction for TPS purposes, from begin_classify_document to poller.result?
I saw a similar mention in #35952. Is there any progress on this?