Throttling Your Queries

You might have just received an email error alert from Pipl with the subject “Whoa Nellie! Too many API calls per second…” What now?

We can help. Pipl enforces rate limits, and this article offers tips for implementing a throttling mechanism in your Pipl API integration.

Why the rate limit?

The Pipl API has rate limits per account, across all keys. Pipl’s technology operates like a search engine rather than a traditional database, which means that records are indexed and clustered on every request. To ensure that all customers have fair access, we limit the number of queries per second (QPS) that each account can send to the API.

Implementing a throttling mechanism is important for two reasons. First, it helps ensure that you receive the data you need: a query that exceeds the rate limit returns an HTTP 403 error status instead of results. Second, it keeps your account in good standing. Rate limits help prevent abuse of our services, and consistently exceeding them can result in suspension of your account.

What are the limitations?

If your query volume comes in peaks, make sure that your application still adheres to the allotted rate limits. Otherwise, we consider this abuse.

The table below shows the maximum number of queries allowed in each time period.

API key type	Live feeds	Queries per second	Queries per minute	Queries per hour	Queries per day	Queries per week	Queries per month
Contact, Social, or Business	true (default)	10	600	36,000	864,000	6,048,000	25,920,000
Contact, Social, or Business	false	20	1,200	72,000	1,728,000	12,096,000	51,840,000

As the table above shows, the default is 10 QPS. You can increase your rate to 20 QPS, but you will need to use live_feeds=false in your queries.
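The longer-period figures in the table follow directly from the per-second limit. The short Python sketch below reproduces them; the function name period_limits is illustrative, and the 30-day month is an assumption inferred from the monthly figures rather than something the table states.

```python
# Derive the per-period ceilings from a per-second limit.
# Assumption: a month is counted as 30 days, which matches the table's figures.
SECONDS_PER_PERIOD = {
    "second": 1,
    "minute": 60,
    "hour": 60 * 60,
    "day": 24 * 60 * 60,
    "week": 7 * 24 * 60 * 60,
    "month": 30 * 24 * 60 * 60,
}


def period_limits(qps):
    """Return the maximum number of queries per period for a given QPS."""
    return {period: qps * seconds for period, seconds in SECONDS_PER_PERIOD.items()}


if __name__ == "__main__":
    for qps in (10, 20):
        print(qps, period_limits(qps))
```

These are ceilings that assume a perfectly even spread of queries. A burst that exceeds the per-second limit is still rejected, even if the daily total is far below the ceiling.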

Error management and HTTP headers

Each API response includes HTTP headers that help you stay within the rate limits. The QPS limit depends on whether you use live_feeds in your queries, so both the limit that applies and the HTTP headers your application code needs to check differ from query to query.

Once you hit the rate limit and receive an HTTP 403 status response, every subsequent query we receive still counts toward your rate-limit quota. See the table below for the error message and the HTTP headers to watch for, depending on whether you are using live feeds.

API Key Type	Configuration Parameter: live_feeds	QPS	Response HTTP Headers and Status Code
Contact, Social, or Business	true	10	• X-QPS-Live-Current > X-QPS-Live-Allotted • HTTP Status Code 403 • Error Message "Per second limit for live calls reached"
Contact, Social, or Business	false	20	• X-QPS-Current > X-QPS-Allotted • HTTP Status Code 403 • Error Message "Per second limit for total calls reached"
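As a rough illustration, a response could be inspected along the following lines. The header names and the 403 status code come from the table above. The function name rate_limit_status, the shape of the returned dictionary, and the assumption that the header values are integer strings are illustrative only and not part of the Pipl API.

```python
def rate_limit_status(status_code, headers, live_feeds=True):
    """Summarise the rate-limit state of one API response.

    `headers` is any mapping of header name to value. Names are compared
    case-insensitively. A missing header is reported as None (an assumption
    of this sketch, not documented behaviour).
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    # Live-feed queries use the X-QPS-Live-* pair, others the X-QPS-* pair.
    prefix = "x-qps-live-" if live_feeds else "x-qps-"

    def read(name):
        value = lowered.get(prefix + name)
        # Assumption: header values are integer strings such as "10".
        return int(value) if value is not None else None

    current = read("current")
    allotted = read("allotted")
    over_limit = current is not None and allotted is not None and current > allotted
    return {
        "current": current,
        "allotted": allotted,
        "rate_limited": status_code == 403 and over_limit,
    }


if __name__ == "__main__":
    sample = {"X-QPS-Live-Current": "11", "X-QPS-Live-Allotted": "10"}
    print(rate_limit_status(403, sample, live_feeds=True))
```

When rate_limited is true, pausing before the next request is usually safer than retrying immediately, because rejected queries still count toward the quota.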

How to best implement a throttling mechanism?

The simplest way to throttle is to add a pause between requests (for example, Thread.sleep in Java or time.sleep in Python). If you need a multithreaded solution with low latency, many open source libraries can handle rate limiting alongside your API integration. Which one fits best depends on your use case. Some Pipl customers use open source libraries such as:

Programming language and platform independent

Nginx proxy. Take advantage of the built-in nginx module, the HttpLimitReqModule.
HAProxy. Better Rate Limiting For All with HAProxy

Java

Google Guava RateLimiter class. See also how this can be implemented using Guava on StackOverflow.
Java Rate Limit API. Contains the primitives and utilities used to rate-limit Java applications and a CircuitBreaker implementation.

PHP

Stiphle. A simple PHP library for throttling or rate limiting.

Python

Rate Limit. API rate limit decorator.
Limit. Decorator that limits the calling rate of a function.
RequestsThrottler. Python HTTP requests throttler.
Throttle. A robust and versatile throttling implementation relying on the token bucket algorithm.

Ruby

Sidekiq framework for background processing has a Concurrency and threshold throttling library.
ThrottleQueue. A thread-safe rate-limited work queue with foreground and background operations.
Slow Web. An HTTP request governor.

C#/ASP.NET

How to implement rate limiting in ASP.NET Core
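
If none of the libraries above fits, the simple pause-between-requests approach mentioned at the start of this section might be sketched in Python as follows. The class name SimpleThrottle and its parameters are hypothetical, and the sketch is intended for single-threaded use only.

```python
import time


class SimpleThrottle:
    """Space calls at least 1/qps seconds apart (single-threaded sketch)."""

    def __init__(self, qps, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / qps
        self._clock = clock      # injectable so the class can be tested
        self._sleep = sleep
        self._next_allowed = clock()

    def wait(self):
        """Block until the next request may be sent."""
        delay = self._next_allowed - self._clock()
        if delay > 0:
            self._sleep(delay)
        # Schedule the following slot one interval after this one.
        self._next_allowed = max(self._clock(), self._next_allowed) + self.min_interval


if __name__ == "__main__":
    throttle = SimpleThrottle(qps=10)
    for i in range(3):
        throttle.wait()
        # A real integration would send one query to the API here.
        print("request", i, "at", round(time.monotonic(), 2))
```

Calling wait() before every request keeps the average rate at or below the configured QPS. Setting qps slightly below your allotted limit leaves headroom for clock jitter and network timing.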

What's under the hood of Pipl's rate limiting mechanism?
If you’re curious about what’s under the hood, Pipl’s rate limiting engine uses a sliding window mechanism and starts counting from the time of the first request sent. If you’d like more specifics, visit here and here (note the second comment regarding the "Rolling Window")
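
To mirror a sliding window on the client side, one option is to remember the send time of each recent request and allow a new one only when fewer than the allotted number fall inside the last second. The sketch below is not Pipl's server-side implementation. The class name SlidingWindowLimiter, the one-second window, and the boundary handling are assumptions made for illustration.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` within any rolling window (client-side sketch)."""

    def __init__(self, max_calls, window_seconds=1.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.window = window_seconds
        self._clock = clock
        self._sent = deque()  # timestamps of requests still inside the window

    def try_acquire(self):
        """Return True and record the call if a request may be sent now."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) < self.max_calls:
            self._sent.append(now)
            return True
        return False


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(max_calls=10)
    allowed = sum(limiter.try_acquire() for _ in range(15))
    print("allowed immediately:", allowed)
```

When try_acquire() returns False, the caller can sleep briefly and try again, or leave the request in a queue.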

Consider queuing your requests to Pipl

If your platform supports multiple users or needs to hold a backlog of requests, consider using a modern queueing technology in your Pipl API integration. This gives you control over the rate at which you query the Pipl API and lets you define priority queues.

Consider the following architecture and flow:

1. User Action

A user action can trigger single- and multiple-batch enrichments
You can segment users by subscription. For example, you can designate priority vs. normal users, with trial users receiving the lowest priority.

2. Queuing

A priority-based queue is created per user, FIFO
Define priority based on use case/flow

3. Threading

Multiple threads collect queries from each of the queues based on a priority algorithm
Serve multiple users fairly based on their subscription levels
Queues can be depleted based on priority
Run optimally and adhere to Pipl API rate limits. Each thread turns queued requests into API requests and sends them at the rate limit that applies to the API key.
Event-based: push updates to user feedback

4. User Feedback

UI Dialog to show progress
Show a progress bar. Estimate time remaining based on queue size and remainder.
Update the number of enriched data points as the queue gets smaller.
Upsell other features to customers.
User can optionally close dialog and be notified by email.
Use creative UI techniques instead of a dialog. Data gets updated on UI in near-real time as enriched.
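
The flow above can be sketched in a few lines of Python. This is a deliberately simplified illustration: it uses one shared priority queue and a single worker instead of per-user queues and multiple threads, and every name in it (PriorityRequestQueue, dispatch, estimated_seconds_remaining, the send callable) is hypothetical.

```python
import heapq
import itertools
import time


class PriorityRequestQueue:
    """Lower priority number is served first; FIFO within the same priority."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker that preserves FIFO order

    def put(self, priority, user, query):
        heapq.heappush(self._heap, (priority, next(self._order), user, query))

    def get(self):
        _, _, user, query = heapq.heappop(self._heap)
        return user, query

    def __len__(self):
        return len(self._heap)


def estimated_seconds_remaining(queue, qps):
    """Rough progress estimate for the UI: queue size divided by send rate."""
    return len(queue) / qps


def dispatch(queue, send, qps, sleep=time.sleep):
    """Send queued queries in priority order, pausing 1/qps seconds between them."""
    interval = 1.0 / qps
    results = []
    while len(queue):
        user, query = queue.get()
        results.append((user, send(query)))
        if len(queue):
            sleep(interval)
    return results


if __name__ == "__main__":
    q = PriorityRequestQueue()
    q.put(2, "trial-user", "query-a")     # lowest priority
    q.put(0, "priority-user", "query-b")  # highest priority
    q.put(1, "normal-user", "query-c")
    print("estimated seconds:", estimated_seconds_remaining(q, qps=10))
    # `send` stands in for the real API call; here it only echoes the query.
    print(dispatch(q, send=lambda query: query, qps=10))
```

In a production system the single worker would typically become several threads sharing one rate limiter, and the results would be pushed to the user feedback stage as events.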

Rate limit not enough for your needs?

Speak to your account manager. Pipl will assess your volume and requirements, and we can come to an agreement. You will need to use live_feeds=false, however.