Currently the Python client can't handle read-only mode because our retries are too short. We should handle read-only mode as a special case of 503 where we emit a warning but continue to wait for a much longer period (possibly indefinitely). Currently I believe we give up after about 1 minute, while the actual read-only period can last 20 minutes.
This might require us to rethink our exponential back-off to handle retries differently depending on the error code, or to wrap the retry in another retry that handles the read-only 503 exclusively.
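A rough sketch of the first option (one back-off loop with a per-error wait budget), not the client's actual retry code. Everything named here is hypothetical: `call_with_retry`, `is_read_only`, the set of retryable status codes, and the guess that read-only mode can be recognized from the 503 response body. The 1 minute and 30 minute budgets are just the numbers discussed in this issue.

```python
import time
import warnings

# Assumption: these are the status codes worth retrying at all.
RETRYABLE = {502, 503, 504}


def is_read_only(status, body):
    # Assumption: read-only mode shows up as a 503 whose body mentions it.
    # The real signal from the backend may differ.
    text = body.lower()
    return status == 503 and ("read only" in text or "read_only" in text)


def call_with_retry(request_fn, max_wait=60.0, read_only_max_wait=30 * 60.0,
                    base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call request_fn() until it stops returning a retryable status.

    request_fn returns a (status, body) tuple. Ordinary retryable errors
    get max_wait seconds in total; a read-only 503 gets read_only_max_wait.
    """
    waited = 0.0
    delay = base_delay
    warned = False
    while True:
        status, body = request_fn()
        if status not in RETRYABLE:
            return status, body
        read_only = is_read_only(status, body)
        budget = read_only_max_wait if read_only else max_wait
        if read_only and not warned:
            warnings.warn("Server is in read-only mode; waiting for it to return.")
            warned = True
        if waited >= budget:
            # Out of time for this kind of error: hand back the last response.
            return status, body
        pause = min(delay, budget - waited)
        sleep(pause)
        waited += pause
        delay = min(delay * 2, max_delay)  # exponential back-off, capped
```

The `sleep` parameter is only there so the loop can be exercised without actually waiting. The alternative in the comment above (wrapping the existing retry in an outer read-only retry) would leave the inner loop untouched at the cost of two nested back-off schedules.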
I saw the retry but I ended up getting a 401:
Looking for the msg in the backend, found it in ./services/repository/src/main/java/org/sagebionetworks/auth/AuthenticationFilter.java:
I think that if we just reuse the same header, we will error with a 401 after 15 minutes.
To keep things simple, the proposal is to change the timeout in the backend to match (>=) the timeout we want in the clients (30 mins).
Opened to change backend.
Moving this to py1.7 so that we can verify once PLFM-4086 is resolved.
Looks good with the PLFM-4086 fix.