PHP-FPM tuning advice on the internet follows a predictable pattern: divide total server memory by average worker memory, subtract some margin, and set pm.max_children to the result. That formula is not wrong, but it skips the most important step: measuring what your application actually does under load before deciding anything.
This guide covers the measurement-first approach to PHP-FPM tuning. You will learn how to determine your actual worker memory consumption, identify when PHP-FPM is saturated, interpret the process manager’s status output, and set values that match your workload rather than a generic formula.
Understanding the Process Manager Modes
PHP-FPM offers three process manager modes. The choice affects how workers are allocated and recycled:
Static Mode
    pm = static
All workers are spawned at startup and kept alive. No worker creation or destruction overhead during traffic spikes. Memory usage is constant.
Best for: Dedicated servers or containers where PHP is the primary workload and memory can be pre-allocated.
Dynamic Mode
    pm = dynamic
Workers are created and destroyed based on demand. FPM keeps the number of idle workers between pm.min_spare_servers and pm.max_spare_servers, and scales the total up to pm.max_children under load.
Best for: Shared servers where PHP shares memory with other services and workload varies significantly.
On-Demand Mode
    pm = ondemand
No workers exist when there is no traffic. Workers are spawned only when requests arrive and are killed after sitting idle for pm.process_idle_timeout (10 seconds by default).
Best for: Low-traffic applications where memory conservation matters more than response latency.
For most legacy applications running at any meaningful scale, static or dynamic mode is the right choice. On-demand mode adds latency to the first requests after an idle period because workers must be spawned before those requests can be served.
Measuring Worker Memory Consumption
Real Memory Per Worker
The formula total_memory / worker_memory = max_children requires knowing actual worker memory. Do not guess.
    # Show RSS (Resident Set Size, in KB) for all PHP-FPM workers; the process name may differ (e.g. php-fpm8.3)
    ps -C php-fpm -o pid,rss,cmd
Better yet, sample over time during normal traffic:
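    # Log worker memory every 60 seconds for an hour (process name and log path are placeholders)
    for i in $(seq 60); do ps -C php-fpm -o rss= >> /tmp/fpm-rss.log; sleep 60; done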
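Once you have samples, the average and the peak are easy to derive. The sketch below assumes the samples are RSS values in kilobytes, one per line, as printed by `ps -o rss=`. The function name `summarize_rss` and the sample numbers are illustrative, not measurements from a real server.

```python
def summarize_rss(lines):
    """Return (average_mb, peak_mb) from RSS samples given in KB, one per line."""
    # Skip blank lines; int() tolerates surrounding whitespace.
    samples_kb = [int(line) for line in lines if line.strip()]
    if not samples_kb:
        raise ValueError("no RSS samples found")
    average_mb = sum(samples_kb) / len(samples_kb) / 1024
    peak_mb = max(samples_kb) / 1024
    return round(average_mb, 1), round(peak_mb, 1)


# Hypothetical samples: three workers around 40 MB and one at 60 MB.
sample_lines = ["40960", "43008", "38912", "61440", ""]
average_mb, peak_mb = summarize_rss(sample_lines)
print(f"average {average_mb} MB, peak {peak_mb} MB")
```

Note that RSS counts shared pages (such as OPcache shared memory) in every worker, so summing RSS across workers overstates real usage. Treat the result as a conservative figure.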
Typical Memory Ranges
- Minimal PHP application: 15 to 30 MB per worker
- Framework application (Laminas, Symfony, Laravel): 30 to 80 MB per worker
- Application with large ORM hydration or report generation: 80 to 200 MB per worker
- Pathological cases (image processing, CSV import): 200 MB+ per worker
If your worker memory varies wildly between requests, you have endpoints with very different memory profiles. This matters for tuning because max_children must be set based on the peak memory usage, not the average.
Use pm.max_requests to Control Memory Growth
PHP applications can leak memory slowly over thousands of requests. Set pm.max_requests to recycle workers periodically:
    pm.max_requests = 500
After serving 500 requests, a worker exits and FPM spawns a fresh one. This prevents gradual memory bloat from consuming your headroom.
To determine the right value, graph worker memory over time. If memory grows linearly, set max_requests to a value that recycles workers before they grow beyond your acceptable peak.
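If the growth really is linear, the recycling point is simple arithmetic. This is a minimal sketch under that assumption; the function name and the numbers (a 40 MB fresh worker growing by 20 KB per request, with a 60 MB acceptable peak) are hypothetical.

```python
def max_requests_for_growth(baseline_kb, growth_kb_per_request, acceptable_peak_kb):
    """Largest request count a worker can serve before exceeding the
    acceptable peak, assuming memory grows linearly from the baseline."""
    if growth_kb_per_request <= 0:
        raise ValueError("growth must be positive")
    headroom_kb = acceptable_peak_kb - baseline_kb
    if headroom_kb <= 0:
        raise ValueError("acceptable peak must exceed the baseline")
    return headroom_kb // growth_kb_per_request


# Hypothetical worker: 40 MB fresh, +20 KB per request, 60 MB acceptable peak.
print(max_requests_for_growth(40 * 1024, 20, 60 * 1024))
```

In practice you would set pm.max_requests somewhat below the computed value, since real growth is rarely perfectly linear.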
Calculating pm.max_children
The Basic Formula
    max_children = (total_available_memory - system_overhead) / peak_worker_memory
- Total available memory: Physical RAM on the server or container memory limit
- System overhead: Memory used by the OS, MySQL if co-located, Redis, Nginx, etc. Typically 500 MB to 2 GB on a dedicated server.
- Peak worker memory: The highest RSS you observed during normal traffic, not average
Example: 8 GB server, 1 GB system overhead, 60 MB peak worker memory:
    (8192 MB - 1024 MB) / 60 MB ≈ 119 workers
Round down and leave some margin:
    pm.max_children = 100
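The same calculation as a small helper, so you can plug in your own measurements. The `margin` parameter is an addition for illustration: it is the fraction of usable memory you want to keep in reserve.

```python
import math


def max_children(total_mb, overhead_mb, peak_worker_mb, margin=0.0):
    """Ceiling on workers: usable memory divided by peak worker memory,
    after holding back `margin` (a fraction such as 0.2) as reserve."""
    usable_mb = (total_mb - overhead_mb) * (1 - margin)
    return math.floor(usable_mb / peak_worker_mb)


# The example from the text: 8 GB server, 1 GB overhead, 60 MB peak worker.
print(max_children(8192, 1024, 60))              # no margin
print(max_children(8192, 1024, 60, margin=0.2))  # 20% held in reserve
```

With no margin this yields 119, matching the worked example. A 20% reserve yields 95, close to the rounded value of 100 chosen above.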
Why the Formula is Not Enough
The formula gives you a ceiling. It does not tell you whether your application needs that many workers. If your average request takes 50 ms and you handle 100 requests per second, you need only 100 × 0.05 = 5 concurrent workers on average. Setting max_children to 100 reserves memory for burst capacity you may never use.
This is where measurement matters more than formulas.
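The concurrency estimate above is an application of Little's law (average items in a system = arrival rate × average time in the system). A minimal sketch, with a hypothetical function name, shows how sensitive the worker count is to request duration:

```python
def workers_needed(requests_per_second, avg_request_ms):
    """Average number of busy workers for a given arrival rate and duration."""
    return requests_per_second * avg_request_ms / 1000


# The example from the text: 100 req/s at 50 ms each.
print(workers_needed(100, 50))
# Same traffic, but requests take 400 ms: eight times the workers.
print(workers_needed(100, 400))
```

This is an average, not a peak. Bursty traffic needs headroom above it, which is exactly what the status page measurements below let you quantify.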
Detecting Saturation
PHP-FPM saturation means all workers are busy and new requests are waiting in a queue. This is the condition you are tuning to avoid (or at least to detect early).
Enable the Status Page
In your pool configuration:
    pm.status_path = /fpm-status
In your Nginx configuration:
    location /fpm-status {
Reading the Status Output
    curl -s http://127.0.0.1/fpm-status
Key fields:
- active processes: Currently handling requests
- idle processes: Waiting for requests
- listen queue: Requests waiting for an available worker
- max listen queue: Highest queue depth since last FPM restart
- max active processes: Highest concurrent worker count since restart
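If you want to process these fields in a script, the plain-text output is easy to parse as `key: value` lines. The sketch below assumes that layout; the sample text is hypothetical and shows only a subset of the fields a real status page returns.

```python
def parse_fpm_status(text):
    """Parse the plain-text FPM status page into a dict.
    Purely numeric values become int; everything else stays a string."""
    status = {}
    for line in text.splitlines():
        # Split on the first colon only, so values containing colons survive.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        value = value.strip()
        status[key.strip()] = int(value) if value.isdigit() else value
    return status


# Hypothetical output from a saturated pool.
SAMPLE = """\
pool:                 www
process manager:      dynamic
listen queue:         3
max listen queue:     12
idle processes:       0
active processes:     100
max active processes: 100
"""

status = parse_fpm_status(SAMPLE)
saturated = status["listen queue"] > 0
print(status["listen queue"], status["max listen queue"], saturated)
```

The status page can also return JSON when requested with `?json`, which avoids hand-parsing altogether.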
The Critical Signal: Listen Queue
If listen queue is consistently above 0, PHP-FPM is saturated. Requests are waiting. Response times are degrading.
If max listen queue has hit double digits or higher, you have had saturation events. Check your logs for the timestamps and correlate with traffic patterns.
Monitoring Over Time
Sample the status page every 10 seconds and log the key metrics:
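    while true; do date; curl -s http://127.0.0.1/fpm-status | grep -E '^(listen queue|active processes|idle processes):'; sleep 10; done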
Better yet, feed these metrics into Prometheus, Datadog, or whatever monitoring system you use. Alert when listen_queue > 0 persists for more than 30 seconds.
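If you are not yet running a monitoring system, the alert rule can be approximated in a few lines. This sketch assumes you have a list of `(timestamp, listen_queue)` samples ordered by time; the function name and the sample data are hypothetical.

```python
def queue_saturated_for(samples, threshold_seconds=30):
    """Return True if listen_queue stayed above 0 for more than
    threshold_seconds across consecutive samples.
    samples: iterable of (unix_timestamp, listen_queue), ordered by time."""
    run_start = None
    for timestamp, listen_queue in samples:
        if listen_queue > 0:
            if run_start is None:
                run_start = timestamp  # a new saturated run begins here
            if timestamp - run_start > threshold_seconds:
                return True
        else:
            run_start = None  # queue drained; reset the run
    return False


# Samples taken every 10 seconds.
brief_spike = [(0, 0), (10, 2), (20, 1), (30, 0), (40, 3)]
sustained = [(0, 1), (10, 2), (20, 1), (30, 4), (40, 3)]
print(queue_saturated_for(brief_spike), queue_saturated_for(sustained))
```

Requiring the condition to persist filters out momentary spikes, which are normal, from sustained saturation, which is not.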
Tuning the Dynamic Mode Settings
If you use pm = dynamic, these settings control the scaling behaviour:
pm.start_servers
How many workers exist at startup. Set this to your expected normal concurrency:
    pm.start_servers = 15
If your normal traffic requires 10 to 20 concurrent workers, starting with 15 avoids cold-start spawning on the first burst.
pm.min_spare_servers
The minimum idle workers FPM maintains. If idle workers drop below this, FPM spawns more:
    pm.min_spare_servers = 5
Set this high enough that normal traffic bursts do not trigger spawning. Spawning is not instant, and the latency during spawning manifests as slightly slower requests.
pm.max_spare_servers
The maximum idle workers before FPM starts killing them:
    pm.max_spare_servers = 25
Set this high enough that workers are not constantly killed and respawned during traffic oscillations. Worker churn wastes CPU and can cause brief latency spikes.
The Relationship
    min_spare_servers <= start_servers <= max_spare_servers <= max_children
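A quick way to sanity-check a set of values against this ordering before you reload FPM. The helper below is a hypothetical sketch, not part of PHP-FPM, and covers only the ordering shown above.

```python
def check_dynamic_settings(min_spare, start, max_spare, max_children):
    """Return a list of violated constraints; an empty list means the
    ordering min_spare <= start <= max_spare <= max_children holds."""
    problems = []
    if min_spare > start:
        problems.append("pm.min_spare_servers exceeds pm.start_servers")
    if start > max_spare:
        problems.append("pm.start_servers exceeds pm.max_spare_servers")
    if max_spare > max_children:
        problems.append("pm.max_spare_servers exceeds pm.max_children")
    return problems


# The values used in this guide: 5 <= 15 <= 25 <= 100.
print(check_dynamic_settings(5, 15, 25, 100))
# A misconfiguration: start_servers above max_spare_servers.
print(check_dynamic_settings(5, 30, 25, 100))
```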
Slow-Log Configuration
PHP-FPM can log requests that exceed a time threshold:
    slowlog = /var/log/php-fpm-slow.log
    request_slowlog_timeout = 5s
With request_slowlog_timeout set to 5 seconds, this captures a stack trace of any request that runs longer than that. The request is not terminated; the trace shows exactly which function was executing when the threshold was crossed.
Slow logs are essential for understanding why your workers are occupied for long periods. Common causes:
- Blocking database queries
- External API calls without timeouts
- File operations on slow storage
- Memory-intensive processing
Fix the slow requests, and you free workers faster, which means you need fewer workers for the same traffic level.
Production Tuning Workflow
- Measure worker memory under normal traffic for at least 24 hours.
- Calculate max_children using the formula above, then leave a safety margin of roughly 15 to 20%.
- Enable the status page and monitor for listen queue growth.
- If queue depth stays zero: Your current settings are sufficient. The formula gave you headroom you may not need.
- If queue depth grows under peak traffic: Increase max_children if memory allows, or investigate slow requests using the slow log.
- If memory usage approaches the server limit: Reduce max_children, investigate memory-heavy endpoints, and consider optimisations covered in the PHP Performance Playbook.
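The three outcomes at the end of the workflow can be written as a small decision helper. This is an illustrative sketch only: the function name is hypothetical, and the 90% threshold for "approaches the server limit" is an assumption you should adjust for your environment.

```python
def next_action(max_listen_queue, memory_used_fraction, memory_limit_fraction=0.9):
    """Map observed queue depth and memory use to the next tuning step.
    memory_limit_fraction (0.9) is an assumed threshold, not an FPM setting."""
    # Memory pressure is checked first: adding workers would make it worse.
    if memory_used_fraction >= memory_limit_fraction:
        return "reduce max_children and investigate memory-heavy endpoints"
    if max_listen_queue > 0:
        return "increase max_children if memory allows, or check the slow log"
    return "keep current settings"


print(next_action(max_listen_queue=0, memory_used_fraction=0.55))
print(next_action(max_listen_queue=14, memory_used_fraction=0.60))
print(next_action(max_listen_queue=14, memory_used_fraction=0.95))
```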
FAQ
Should I use static or dynamic mode?
If your server is dedicated to PHP (or the container runs only PHP-FPM), use static. Pre-allocating workers eliminates spawn latency. If PHP shares the server with other services, use dynamic to reclaim memory during low-traffic periods.
What about pm.max_requests = 0 (unlimited)?
Only safe if you are certain your application does not leak memory. For legacy applications, set a finite value and monitor. The Performance Optimisation chapter covers application-level patterns that affect worker memory lifecycle.
How do I know if I have too many workers?
If most workers are idle most of the time and your server is memory-constrained, you are over-provisioned. Reduce max_children and reclaim that memory for OPcache or database buffers.
Does PHP 8.5 change anything about FPM tuning?
PHP 8.5 may reduce per-worker memory slightly due to internal allocator improvements. Re-measure after upgrading. The tuning methodology stays the same.
Next Steps
Enable the FPM status page if you have not already. Measure worker memory for a full day. Calculate your max_children based on real numbers. Monitor listen queue depth as your ongoing saturation signal.
For the OPcache settings that complement FPM tuning, the OPcache Preloading for Large Legacy Apps guide covers how preloading reduces per-worker memory and improves startup time. The PHP Performance Playbook covers the full performance tuning stack.