Hello,
On Thursday last week (18th of April), there was an issue with the Access API. Clients started to experience very high response times for most API requests on both the gRPC and REST interfaces. For many requests, the response time increased by more than 4x.
The issue was subsequently identified and resolved. It did not affect the core protocol and was isolated to the public access nodes.
The issue was caused by a feature recently rolled out to the public access nodes. The feature optimized the GetTransactionResult API call by letting the access node serve requests from transaction results stored on its local disk instead of forwarding them to the execution node, as it did previously. However, under load, this seems to have caused resource contention, and API calls started to back up. This in turn caused the access nodes to fall behind on syncing collections, which made other API calls fail or be delayed. The issue is documented in detail here: https://github.com/onflow/flow-go/issues/5747.
The feature was disabled, and the response time for all API calls improved almost immediately.
We have identified several action items to help prevent such an incident in the future.