How REST API Crawlers Were Silently Eating 60% of a Client's Server Resources
The client's WooCommerce site had been drifting slower over three weeks. Not dramatically — response times had crept from 1.2 seconds to 3–4 seconds, and occasionally spiked to 8+. The hosting was a 4GB VPS running nginx, PHP-FPM 8.2, and MariaDB 10.11. Nothing had changed: no new plugins, no theme updates, no traffic surge in Simple Analytics. The kind of degradation that makes you suspect the database first.
It wasn't the database.
The Symptom
I SSH'd into the server and checked PHP-FPM status:
curl -s "http://127.0.0.1/status?full" | grep -E "^(active|idle|total)"
active processes: 11
idle processes: 1
total processes: 12
Twelve workers configured, eleven busy. That's 92% utilisation during what should have been a quiet Tuesday afternoon. The site had maybe 20 concurrent visitors. This server had been comfortably handling that load with 4–5 active workers a month ago.
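The utilisation figure can be computed straight from the status page rather than by eye. This is a minimal sketch, assuming the plain-text status format shown above; `fpm_utilisation` is a hypothetical helper name, not part of PHP-FPM.

```shell
# Summarise PHP-FPM worker utilisation from status-page text on stdin.
# Assumes the plain-text status format ("active processes: 11").
fpm_utilisation() {
    awk -F': *' '
        /^active processes:/ { active = $2 }
        /^total processes:/  { total = $2 }
        END {
            if (total > 0) {
                printf "%d/%d workers busy (%.0f%%)\n", active, total, active * 100 / total
            } else {
                print "no worker counts found"
                exit 1
            }
        }
    '
}

# Usage (status URL as above; adjust it to your pool's pm.status_path):
#   curl -s "http://127.0.0.1/status" | fpm_utilisation
```

Run from cron or a monitoring agent, a number like this makes a slow climb in busy workers visible well before the site feels slow.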
Finding the Culprit
My first check was the nginx access log, looking for the heaviest hitters by request count:
awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head -10
Two IP addresses stood out — each with over 3,000 requests in the past 24 hours. But the interesting part was what they were requesting:
grep -F "203.0.113.47" /var/log/nginx/access.log | awk '{print $7}' | sort | uniq -c | sort -rn | head -10
1847 /wp-json/wp/v2/posts?per_page=100&page=1
892 /wp-json/wp/v2/posts?per_page=100&page=2
341 /wp-json/wp/v2/pages
287 /wp-json/wp/v2/categories
198 /wp-json/wp/v2/tags
94 /wp-json/wp/v2/users
73 /wp-json/wp/v2/comments?per_page=100
Content scraper bots. They were methodically crawling every public REST API endpoint — posts, pages, categories, tags, users, comments — in a loop. Every single request hit PHP-FPM because the REST API bypasses page caching entirely. There was no FastCGI cache hit for /wp-json/ requests because I'd deliberately excluded the API from caching (you don't want to cache dynamic API responses meant for authenticated users).
The second IP showed the same pattern. Between them, they were making roughly 300 requests per hour to REST API endpoints. Each request occupied a PHP-FPM worker, loaded the full WordPress stack, ran the database queries, serialised the JSON response, and returned it. This site has 2,400 products and 180 blog posts, all stored in wp_posts with their metadata in wp_postmeta, so each paginated /wp-json/wp/v2/posts?per_page=100 request ran heavy queries against both tables.
grep "wp-json" /var/log/nginx/access.log | wc -l
Over 7,200 REST API requests in the past 24 hours. For a site with 800 legitimate visits per day, the REST API was handling more traffic than the actual website.
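To put the REST API count in proportion to everything else in the log, the two numbers can be computed in one pass. This is a sketch that assumes nginx's default "combined" log format, where the request path is the seventh whitespace-separated field; `rest_share` is a hypothetical helper name.

```shell
# Report what share of logged requests hit the REST API.
# Reads access-log lines on stdin; assumes the "combined" log format,
# where the request path is the 7th whitespace-separated field.
rest_share() {
    awk '
        { total++ }
        $7 ~ /^\/wp-json\// || $7 ~ /[?&]rest_route=/ { rest++ }
        END {
            if (total == 0) { print "no requests"; exit 1 }
            printf "%d of %d requests (%.1f%%) hit the REST API\n", rest, total, rest * 100 / total
        }
    '
}

# Usage:
#   rest_share < /var/log/nginx/access.log
```

Matching on the path field rather than the whole line avoids counting requests that merely mention wp-json in a referrer or user agent.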
Why This Wasn't Caught Sooner
Three reasons.
First, the requests returned 200 status codes. There were no errors in the WordPress debug log, no failed requests in monitoring. From WordPress's perspective, it was serving valid responses to valid requests.
Second, the bots used realistic user agents. One was masquerading as Googlebot (a quick reverse DNS check confirmed it wasn't). The other used a Chrome user agent string. Neither triggered any WAF rules.
Third, the degradation was gradual. The bots started with light crawling — maybe 50 requests per day — and ramped up over three weeks. By the time the site felt slow enough to investigate, they were fully entrenched.
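The reverse DNS check on the fake Googlebot can be scripted as a forward-confirmed lookup. This is a sketch under a few assumptions: `dig` is installed, the address is IPv4, and the accepted domains are the ones Google documents for verifying its crawlers. Both function names are hypothetical.

```shell
# Succeeds if a PTR hostname falls under one of the domains Google
# documents for its crawlers. A trailing dot is tolerated.
is_google_hostname() {
    case "${1%.}" in
        *.googlebot.com|*.google.com|*.googleusercontent.com) return 0 ;;
        *) return 1 ;;
    esac
}

# Forward-confirmed reverse DNS: the PTR name must be a Google hostname
# AND that hostname must resolve back to the original IPv4 address.
# Assumes `dig` is available (dnsutils / bind-utils).
verify_googlebot() {
    ip=$1
    ptr=$(dig +short -x "$ip" | head -n 1)
    is_google_hostname "$ptr" || return 1
    dig +short "${ptr%.}" | grep -qxF "$ip"
}

# Usage:
#   verify_googlebot 203.0.113.47 && echo "genuine" || echo "impostor"
```

The forward lookup matters because anyone who controls the reverse zone for their own IP range can publish a PTR record ending in googlebot.com; only Google can make that hostname resolve back to the same address.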
The Immediate Fix
Block the known IPs and rate limit the REST API at the nginx level.
First, I blocked the two scrapers:
# /etc/nginx/conf.d/blocklist.conf
deny 203.0.113.47;
deny 198.51.100.23;
Then I added rate limiting specifically for /wp-json/ endpoints. This is distinct from the rate limiting I already had on wp-login.php and xmlrpc.php:
# /etc/nginx/nginx.conf (inside the http block)
limit_req_zone $binary_remote_addr zone=restapi:10m rate=2r/s;

# Inside the server block
location /wp-json/ {
    limit_req zone=restapi burst=10 nodelay;
    limit_req_status 429;
    try_files $uri $uri/ /index.php$is_args$args;
    include fastcgi_params;
    fastcgi_pass unix:/run/php/php8.2-fpm.sock;
    fastcgi_param SCRIPT_FILENAME $document_root/index.php;
}

# Block REST API access via legacy query parameter route.
# WordPress also serves REST responses at /?rest_route=/wp/v2/posts,
# which bypasses the /wp-json/ location block entirely.
# Pretty permalinks have been default for years, so no legitimate
# client should need this fallback on a properly configured site.
if ($arg_rest_route != "") {
    return 403;
}
This allows 2 sustained requests per second per IP, with a burst buffer of 10. Legitimate users hitting the block editor or WooCommerce checkout — which make short bursts of API calls — won't notice. Aggressive bots that send rapid-fire requests get 429'd immediately. For slower crawlers like the ones in this incident (300 requests per hour spread evenly), the rate limit alone won't catch them — that's where the IP blocking and the monitoring script below close the gap.
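One way to confirm the limit behaves as described is to send a deliberate burst and tally the status codes that come back. This is a sketch, not the test used in this incident: `tally_statuses` is a hypothetical helper, and `https://example.com` is a placeholder for a site you own.

```shell
# Count HTTP status codes read one per line from stdin,
# printing "code count" pairs sorted by code.
tally_statuses() {
    sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage: send 30 requests, 10 at a time, and tally the results.
# Parallelism matters here: sequential requests may be too slow to
# exceed 2r/s plus the burst allowance. xargs -P is supported by the
# GNU and BSD implementations.
#   seq 1 30 | xargs -P 10 -I{} \
#       curl -s -o /dev/null -w '%{http_code}\n' \
#       "https://example.com/wp-json/wp/v2/categories" | tally_statuses
```

With the configuration above, the expected shape is a batch of 200s followed by 429s once the burst allowance is used up. The exact split depends on how quickly the requests arrive.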
I tested and reloaded:
sudo nginx -t && sudo systemctl reload nginx
Within minutes, PHP-FPM active workers dropped from 11 to 3. The site's response time returned to its baseline 1.2 seconds.
The WordPress-Level Hardening
Blocking and rate limiting at nginx stopped the resource drain, but the API endpoints were still publicly accessible. I added a small must-use plugin to restrict unauthenticated access to endpoints that have no business being public:
<?php
// wp-content/mu-plugins/restrict-rest-api.php

add_filter('rest_endpoints', function ($endpoints) {
    // Logged-in users keep the full route list.
    if (is_user_logged_in()) {
        return $endpoints;
    }

    // Routes removed for anonymous requests.
    $restricted = [
        '/wp/v2/users',
        '/wp/v2/users/(?P<id>[\d]+)',
    ];

    foreach ($restricted as $route) {
        if (isset($endpoints[$route])) {
            unset($endpoints[$route]);
        }
    }

    return $endpoints;
});
I didn't block all unauthenticated REST API access — that's a common recommendation I disagree with. The REST API is a core part of WordPress. The block editor uses it. Contact Form 7 uses it for AJAX submissions. WooCommerce uses the Store API for cart and checkout. Blanket-blocking unauthenticated access breaks things in ways that surface days later when a customer can't complete checkout or a content editor can't save a post.
Instead, I block the specific endpoints that serve as reconnaissance vectors (like /wp/v2/users) and rate limit everything else at the server level.
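After deploying a restriction like this, it is worth checking from outside what an anonymous client actually sees. This is a small sketch: `users_endpoint_verdict` is a hypothetical helper and `https://example.com` is a placeholder. It relies on WordPress answering an unregistered route with a 404; the 401 and 403 cases cover sites that restrict the endpoint by other means.

```shell
# Interpret the HTTP status returned for an unauthenticated request
# to /wp-json/wp/v2/users.
users_endpoint_verdict() {
    case "$1" in
        404|401|403) echo "restricted" ;;
        200)         echo "exposed" ;;
        *)           echo "unknown ($1)" ;;
    esac
}

# Usage:
#   code=$(curl -s -o /dev/null -w '%{http_code}' "https://example.com/wp-json/wp/v2/users")
#   users_endpoint_verdict "$code"
```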
Why the REST API Is a Blind Spot
Most WordPress security guidance focuses on wp-login.php and xmlrpc.php. Rate limiting for those endpoints is well-documented and widely deployed. The REST API gets overlooked for three reasons:
- It's required for core functionality. You can't just block it, so people leave it wide open.
- Security plugins don't flag it. Wordfence and Solid Security track login attempts and file changes. They don't raise an alarm when a scraper bot makes 5,000 REST API requests in a day.
- It bypasses page caching. A site can have nginx FastCGI caching, Redis object cache, and a CDN in front — and still have its REST API hitting PHP on every request. This makes REST API traffic disproportionately expensive compared to regular page views.
On the sites I manage, I now treat REST API rate limiting as standard, alongside the wp-login.php and xmlrpc.php protections.
Monitoring for REST API Abuse
I added a simple cron job to flag spikes early:
#!/bin/bash
# /usr/local/bin/check-restapi-traffic.sh
THRESHOLD=2000
LOG="/var/log/nginx/access.log"
# Placeholder address; replace with your own.
ALERT_EMAIL="admin@example.com"

COUNT=$(grep -cE "(wp-json|rest_route)" "$LOG")
if [ "$COUNT" -gt "$THRESHOLD" ]; then
    echo "REST API request count: $COUNT (threshold: $THRESHOLD)" | \
        mail -s "High REST API traffic on $(hostname)" "$ALERT_EMAIL"
fi

# /etc/cron.d/restapi-monitor
0 */6 * * * root /usr/local/bin/check-restapi-traffic.sh
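On a site with 800 daily visitors, anything above 2,000 REST API requests per day is suspicious. The script counts every matching line in the current log file, so the threshold is only a per-day figure if the log rotates daily. For more granular analysis, I parse the log by IP: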
grep "wp-json" /var/log/nginx/access.log | \
awk '{print $1}' | sort | uniq -c | sort -rn | \
awk '$1 > 200 {print $1, $2}'
Any single IP making more than 200 REST API requests per day on a small-to-medium site warrants investigation.
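Per-IP totals catch a heavy hitter, but a gradual ramp-up like the one in this incident shows up more clearly as a per-day trend. This is a sketch that assumes the "combined" log format, with the timestamp in the fourth field. `rest_requests_per_day` is a hypothetical helper, and the rotated file names in the usage comment depend on your logrotate setup.

```shell
# Count REST API requests per calendar day from access-log lines on stdin.
# Assumes the "combined" log format, where field 4 looks like
# "[10/Mar/2025:14:00:01". Days print in the order they first appear.
rest_requests_per_day() {
    awk '
        $7 ~ /^\/wp-json\// || $7 ~ /[?&]rest_route=/ {
            day = substr($4, 2, 11)              # e.g. 10/Mar/2025
            if (!(day in count)) order[++n] = day
            count[day]++
        }
        END { for (i = 1; i <= n; i++) print order[i], count[order[i]] }
    '
}

# Usage across rotated logs, oldest first (zcat -f passes plain files through):
#   zcat -f /var/log/nginx/access.log.2.gz /var/log/nginx/access.log.1 \
#       /var/log/nginx/access.log | rest_requests_per_day
```

A count that rises steadily from day to day with no matching rise in visitors is a reason to look at the per-IP breakdown.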
The Numbers After the Fix
| Metric | Before | After |
|---|---|---|
| REST API requests/day | 7,200+ | ~400 |
| Active PHP-FPM workers (quiet period) | 11 of 12 | 3 of 12 |
| Average response time | 3.4s | 1.2s |
| 429 responses/day (rate limited) | 0 | ~180 |
The 400 remaining daily REST API requests are legitimate — Gutenberg saves, WooCommerce cart operations, and the occasional search engine bot that respects rate limits.
What I Now Check on Every Site
This incident changed my audit checklist. When I take on a new server management client, I run this as part of the initial review:
grep -c "wp-json" /var/log/nginx/access.log
grep "wp-json" /var/log/nginx/access.log | \
awk '{print $1}' | sort | uniq -c | sort -rn | head -5
If the REST API request count exceeds 5× the site's legitimate daily page views, there's a problem. I've found scraper traffic on roughly a third of the sites I've audited — most site owners had no idea it was happening.
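The 5× rule of thumb is simple enough to script into an audit. This is a sketch: `rest_ratio_check` and its argument order are hypothetical, and the page-view figure has to come from your analytics, since the access log alone cannot tell legitimate views from bot traffic.

```shell
# Compare a REST API request count with daily page views and flag
# ratios above a limit (default 5, the rule of thumb from the text).
# Arguments: REST_COUNT PAGE_VIEWS [LIMIT]
rest_ratio_check() {
    awk -v rest="$1" -v views="$2" -v limit="${3:-5}" 'BEGIN {
        if (views <= 0) { print "page views must be positive"; exit 1 }
        ratio = rest / views
        printf "%.1fx %s\n", ratio, (ratio > limit ? "investigate" : "ok")
    }'
}

# Usage:
#   rest_ratio_check "$(grep -c "wp-json" /var/log/nginx/access.log)" 800
```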
The REST API is powerful, and it's not going anywhere. But leaving it completely unprotected on a production site is like having rate limiting on your front door while leaving the back door propped open. Rate limit it at the server level, restrict sensitive endpoints at the application level, and monitor the access logs. It takes 15 minutes to set up and prevents a category of performance degradation that's genuinely difficult to diagnose without looking at the right logs.
