Throttling Your Queries
You might have just received an email error alert from Pipl with the subject “Whoa Nellie! Too many API calls per second…” What now?
We can help. Pipl does enforce rate limits; here are some tips for implementing a throttling mechanism in your Pipl API integration.
Why the rate limit?
The Pipl API enforces rate limits per account, across all keys. Pipl’s technology operates like a search engine rather than a traditional database, which means records are indexed and clustered on every request. To ensure that all customers have fair access, we limit the number of queries per second (QPS) sent to the API.
Implementing a throttling mechanism is important. First, it helps ensure that you receive the data you need. When you hit the rate limit, you will receive an HTTP 403 error status response. Second, rate limits help prevent abuse of our services. Consistently exceeding rate limits can result in suspension of your account.
What are the limitations?
If your query volume comes in peaks, make sure your application adheres to the allotted rate limits; consistently exceeding them is considered abuse.
The table below shows the number of queries you can perform in a given period.
| API key type | Live feeds | Second | Minute | Hour | Day | Week | Month |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Contact, Social, or Business | true (default) | 10 | 600 | 36,000 | 864,000 | 6,048,000 | 25,920,000 |
| Contact, Social, or Business | false | 20 | 1,200 | 72,000 | 1,728,000 | 12,096,000 | 51,840,000 |
As the table above shows, the default is 10 QPS. You can increase your rate to 20 QPS, but you will need to use live_feeds=false in your queries.
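As an illustration, opting into the 20 QPS tier is just a matter of sending live_feeds=false as a request parameter. The sketch below builds such a URL with Python's standard library; the endpoint path and parameter names follow Pipl's public documentation, but verify them against your own integration:

```python
from urllib.parse import urlencode

# Base endpoint as commonly documented; confirm against your integration.
BASE_URL = "https://api.pipl.com/search/"

def build_search_url(api_key, live_feeds=False, **criteria):
    """Build a Pipl search URL, disabling live feeds to get the 20 QPS tier."""
    params = {"key": api_key, "live_feeds": "true" if live_feeds else "false"}
    params.update(criteria)  # e.g., email=..., first_name=..., last_name=...
    return BASE_URL + "?" + urlencode(params)

url = build_search_url("YOUR_API_KEY", email="clark.kent@example.com")
```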
Error management and HTTP headers
Each API response returns HTTP headers to help you adhere to the rate limits. Because QPS limits depend on whether your queries use live_feeds, different QPS rates apply and your application code should check different HTTP headers accordingly.
Once you hit the rate limit and receive an HTTP 403 status response, every subsequent query still counts toward your rate-limiting quota. See the table below for the status code and HTTP headers to watch for, depending on whether you are using live feeds.
| API Key Type | Configuration Parameter: `live_feeds` | QPS | Response HTTP Headers and Status Code |
| --- | --- | --- | --- |
| Contact, Social, or Business | true | 10 | HTTP 403 when the limit is hit |
| Contact, Social, or Business | false | 20 | HTTP 403 when the limit is hit |
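Once a 403 arrives, continuing to fire requests only digs further into your quota, so back off before retrying. A minimal sketch follows; since the header names your key receives vary, it keys off the status code only and leaves the headers for logging:

```python
def backoff_delay(status_code, headers, attempt=0, base_delay=1.0):
    """Return seconds to wait before retrying; 0.0 means no backoff needed.

    `headers` is the response-header dict; a real integration would log the
    rate-limit headers it finds there (names vary by account configuration).
    """
    if status_code != 403:
        return 0.0
    # Exponential backoff: 1 s, 2 s, 4 s, ... capped at 30 s.
    return min(base_delay * (2 ** attempt), 30.0)

# Usage sketch:
# delay = backoff_delay(response.status_code, response.headers, attempt)
# time.sleep(delay)  # then retry the query
```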
How to best implement a throttling mechanism?
To implement throttling, you can insert a pause (for example, Thread.sleep in Java or time.sleep in Python) between requests. If you need a multithreaded solution with low latency, many open source libraries can help; which one fits depends on your use case. Some Pipl customers use open source libraries such as:
Programming language and platform independent
- Nginx proxy. Take advantage of nginx’s built-in `ngx_http_limit_req_module` (formerly known as HttpLimitReqModule).
- HAProxy. Better Rate Limiting For All with HAProxy
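For the nginx option above, a minimal limit_req configuration might look like the following sketch; the zone name, location path, and upstream URL are illustrative placeholders:

```nginx
# One shared 10 r/s bucket for all clients, keyed on the server name.
limit_req_zone $server_name zone=pipl_api:10m rate=10r/s;

server {
    listen 80;

    location /pipl/ {
        # Queue up to 20 bursty requests instead of rejecting them outright.
        limit_req zone=pipl_api burst=20;
        proxy_pass https://api.pipl.com/;
    }
}
```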
Java
- Google Guava RateLimiter class. See also how this can be implemented using Guava on StackOverflow.
- Java Rate Limit API. Contains the primitives and utilities used to rate-limit Java applications and a CircuitBreaker implementation.
PHP
- Stiphle. A simple PHP library for throttling or rate limiting.
Python
- Rate Limit. API rate limit decorator.
- Limit. Decorator that limits the calling rate of a function.
- RequestsThrottler. Python HTTP requests throttler.
- Throttle. A robust and versatile throttling implementation relying on the token bucket algorithm.
Ruby
- The Sidekiq framework for background processing has a concurrency and threshold throttling library.
- ThrottleQueue. A thread-safe rate-limited work queue with foreground and background operations.
- Slow Web. An HTTP request governor.
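The simplest approach mentioned at the top of this section, sleeping between requests, can be sketched in a few lines of Python. Here send_to_pipl is a hypothetical stand-in for your own request function:

```python
import time

class SimpleThrottle:
    """Enforce a minimum interval between requests (e.g., 0.1 s for 10 QPS)."""

    def __init__(self, qps):
        self.min_interval = 1.0 / qps
        self._last = 0.0

    def wait(self):
        """Sleep just long enough to keep the configured pace, then proceed."""
        now = time.monotonic()
        elapsed = now - self._last
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self._last = time.monotonic()

throttle = SimpleThrottle(qps=10)
# for query in queries:
#     throttle.wait()
#     send_to_pipl(query)  # hypothetical request function
```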
What's under the hood of Pipl's rate limiting mechanism?
If you’re curious about what’s under the hood, Pipl’s rate limiting engine uses a sliding window mechanism and starts counting from the time of the first request sent. If you’d like more specifics, visit here and here (note the second comment regarding the "Rolling Window")
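For intuition, a sliding (rolling) window limiter of the kind described above can be sketched as follows. This illustrates the general technique, not Pipl's actual implementation:

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `limit` requests in any rolling `window` seconds."""

    def __init__(self, limit, window=1.0):
        self.limit = limit
        self.window = window
        self.timestamps = deque()  # times of requests still inside the window

    def allow(self, now=None):
        """Return True if a request may be sent now, False if it must wait."""
        now = time.monotonic() if now is None else now
        # Drop timestamps that have slid out of the window.
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```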
Consider queuing your requests to Pipl
If your platform supports multiple users or requires a backlog of requests, consider using a modern queueing technology in your Pipl API integration. This will give you control over rate-querying the Pipl API and allow you to specify priority queues.
Consider the following architecture and flow:
1. User Action
- A user action can trigger single- and multiple-batch enrichments
- You can segment users by subscription. For example, you can designate priority vs. normal users, with trial users receiving the lowest priority.
2. Queuing
- A priority-based queue is created per user, FIFO
- Define priority based on use case/flow
3. Threading
- Multiple threads collect queries from each of the queues based on a priority algorithm
- Serve multiple users fairly based on their subscription levels
- Queues can be depleted based on priority
- Run optimally and adhere to Pipl API rate limits. Each thread turns queued items into API requests and sends them according to the rate limit configured for the key.
- Push progress events to the user feedback layer
4. User Feedback
- UI Dialog to show progress
- Show a progress bar. Estimate time remaining based on queue size and remainder.
- Update the number of enriched data points as the queue gets smaller.
- Upsell other features to customers.
- User can optionally close dialog and be notified by email.
- Use creative UI techniques instead of a dialog. Data gets updated on UI in near-real time as enriched.
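The queuing flow in steps 1 through 3 can be sketched with Python's standard library. The send_to_pipl call is a hypothetical placeholder, and this drains in a single thread for clarity; a production worker would run several threads and pace calls with a rate limiter:

```python
import queue

# Step 1: lower number = higher priority (0 = paying users, 1 = trial users).
work = queue.PriorityQueue()

def enqueue(priority, user_id, q):
    """Step 2: queue a user's query with its subscription-based priority."""
    work.put((priority, user_id, q))

def drain(results):
    """Step 3 (single-threaded sketch): pop queries in priority order.

    A production worker would run in several threads, call the Pipl API
    through a throttle, and push progress events for step 4.
    """
    while True:
        try:
            priority, user_id, q = work.get_nowait()
        except queue.Empty:
            return
        # send_to_pipl(q)  # hypothetical API call, paced by your throttle
        results.append((priority, user_id))
```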
Rate limit not enough for your needs?
Speak to your account manager. Pipl will assess your volume and requirements, and we can come to an agreement. You will need to use live_feeds=false, however.