API latency refers to the delay between when a request enters your infrastructure and when a response is delivered to the user. In general, the shorter this delay, the better the user experience.
Response time includes the time a server takes to fulfill the request in addition to the API latency, which is the time it takes for information to move from the server to the requesting party. Response time will always be longer than the latency, since latency is one component of the response time measurement.
Put another way, API latency measures only how long requested information takes to travel from the API server to the party making the request, while response time adds the processing time the server needs to fulfill the request.
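The relationship between the two measurements is simple addition, and a short Python sketch can make it concrete. The function name and the millisecond figures are illustrative assumptions, not values from any real system.

```python
def response_time_ms(latency_ms: float, processing_ms: float) -> float:
    """Response time = time in transit (latency) + server processing time."""
    if latency_ms < 0 or processing_ms < 0:
        raise ValueError("durations must be non-negative")
    return latency_ms + processing_ms


# Hypothetical request: 40 ms in transit, 110 ms of server-side work.
total_ms = response_time_ms(latency_ms=40.0, processing_ms=110.0)
print(total_ms)  # 150.0
```

Because processing time cannot be negative, the response time can never come out shorter than the latency.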
API latency can rise to levels that hurt user satisfaction when the server lacks the power or capacity to handle the number of requests arriving at a given time. Latency also rises when requests hit a bottleneck, when the server is otherwise overloaded, or when requests are managed inefficiently.
API latency can be monitored in several ways. A ping test gives the most straightforward measurement, but it does not accurately reflect the user experience, because it measures only the network round trip and never exercises the API itself. Web service HTTP/HTTPS monitors can measure API latency, response times, load times, and more.
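As a rough illustration of what an HTTP monitor does, the following Python sketch times repeated calls and summarizes them. It uses only the standard library; the helper names (`time_call_ms`, `sample_response_times`, `summarize`, `http_get`) are hypothetical, and the 95th percentile uses a simple nearest-rank rule chosen for brevity. Note that timing a full HTTP request from the client captures response time (latency plus server processing), not latency alone.

```python
import math
import statistics
import time
import urllib.request
from typing import Callable, Dict, List


def time_call_ms(call: Callable[[], object]) -> float:
    """Return the wall-clock duration of one call, in milliseconds."""
    start = time.perf_counter()
    call()
    return (time.perf_counter() - start) * 1000.0


def sample_response_times(call: Callable[[], object], samples: int = 10) -> List[float]:
    """Time the same call several times, one after another."""
    return [time_call_ms(call) for _ in range(samples)]


def summarize(times_ms: List[float]) -> Dict[str, float]:
    """Summarize samples; p95 uses the nearest-rank method."""
    if not times_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(times_ms)
    rank = math.ceil(0.95 * len(ordered))
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }


def http_get(url: str, timeout: float = 5.0) -> Callable[[], None]:
    """Build a zero-argument call that fetches a URL and reads the full body."""
    def call() -> None:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
    return call
```

To try it, pass an endpoint you control, for example `summarize(sample_response_times(http_get("https://api.example.com/health")))`, where the URL is a placeholder. Looking at the median and the 95th percentile together tends to say more about user experience than a single measurement.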
Latency can be reduced in many ways, and a multi-pronged strategy will be more effective than any single initiative. Options include investing in server speed and capacity appropriate to your request volume, caching responses for common requests, and ensuring that requests are routed to the nearest available server.
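Of these options, response caching is the easiest to sketch in a few lines. The Python example below is a minimal in-memory cache with a time-to-live, written under assumptions: the `TTLCache` class, the `/users/7` key, and the `fetch_user_profile` function are all hypothetical, and a production system would more likely rely on an established cache or CDN than on hand-rolled code.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Keep computed responses for a fixed number of seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps key -> (expiry time, cached value).
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        """Return the cached value if still fresh; otherwise compute and store it."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = compute()
        self._entries[key] = (now + self._ttl, value)
        return value


def fetch_user_profile() -> dict:
    """Hypothetical slow backend call, simulated with a short sleep."""
    time.sleep(0.05)
    return {"id": 7, "name": "example"}


cache = TTLCache(ttl_seconds=30.0)
first = cache.get_or_compute("/users/7", fetch_user_profile)   # calls the backend
second = cache.get_or_compute("/users/7", fetch_user_profile)  # served from the cache
```

The trade-off is freshness: a longer time-to-live saves more backend work but lets cached responses drift further from the current data.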