I'm running several Apache and MySQL performance tests, then upgrading one component and rerunning the tests to find the bottlenecks. A local used-server warehouse (Tams in Lindon, Utah) is lending me about 20 processors tomorrow for these tests, so I can compare more cores against higher clock speeds. After that I'll test NVMe RAID vs. SSD RAID vs. HDD RAID vs. a single HDD.
I wrote a script that collects and merges statistics from various other programs such as iostat, vmstat, and mpstat. Looking at the results, I'm having trouble understanding why there is such a huge slowdown at 512 threads. Judging by the CPU, disk, and memory load reported by these tools, there doesn't seem to be much difference between that and 256 threads.
For this I'm using Apache Benchmark with the command: ab -kc 500 -n 1000 -l -s 60 http://localhost/stress.php
All measurements were taken 5 seconds after the test started, so it was in full swing.
Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, Cores: 6, Sockets 2, DDR4, (free-mh): 62G 12G 18G 4.5M 32G 49G, NVMe RAID, TestType: Apache PHP, Threads: 16, Server Load: 0.94, 0.28, 0.15, Threads: 16, Time per request: 125.523 [ms] (mean), Concurrency Level: 16, Requests per second 127.47 [#/sec] (mean)Time per request: 125.523 [ms] (mean), Connect: 0 0 0.1 0 1, Processing: 59 124 177.8 121 2924, CPU Core Load: %usr,1.89,1.94,1.98,1.95,1.93,1.92,1.93,1.94,1.93,1.94,1.93,1.95,1.92,1.87,1.88,1.85,1.83,1.85,1.82,1.84,1.82,1.85,1.84,1.85,1.82,, CPU Idle%: %idle,97.66,97.59,97.60,97.60,97.64,97.63,97.64,97.60,97.64,97.59,97.64,97.60,97.65,97.66,97.70,97.69,97.74,97.69,97.75,97.70,97.75,97.70,97.73,97.62,97.75,,Waiting on Disk: %iowait,0.04,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.03,0.04,0.03,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.04,0.11,0.04,, vmstatProcs: 17, vmstatSwapIn: 0, vmstatSwapOut: 0, vmstatIoIn: 7, vmstatIoOut: 42, iostatR-S: 1.72, iostatW-S: 13.37, iostatR-MBS: 0.25, iostatW-MBS: 0.52, iostatHDD-Util: 0.00
Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, Cores: 6, Sockets 2, DDR4, (free-mh): 62G 12G 17G 4.5M 32G 49G, NVMe RAID, TestType: Apache PHP, Threads: 32, Server Load: 3.98, 0.97, 0.37, Threads: 32, Time per request: 194.680 [ms] (mean), Concurrency Level: 32, Requests per second 164.37 [#/sec] (mean)Time per request: 194.680 [ms] (mean), Connect: 0 0 0.5 0 4, Processing: 65 193 380.9 143 4300, CPU Core Load: %usr,1.89,1.94,1.99,1.95,1.93,1.92,1.94,1.95,1.94,1.94,1.93,1.95,1.93,1.88,1.88,1.85,1.84,1.85,1.82,1.85,1.82,1.85,1.84,1.86,1.82,, CPU Idle%: %idle,97.66,97.58,97.60,97.60,97.64,97.63,97.63,97.60,97.64,97.59,97.64,97.60,97.65,97.66,97.70,97.69,97.74,97.68,97.75,97.70,97.75,97.70,97.72,97.62,97.74,,Waiting on Disk: %iowait,0.04,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.03,0.04,0.03,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.04,0.11,0.04,, vmstatProcs: 32, vmstatSwapIn: 0, vmstatSwapOut: 0, vmstatIoIn: 7, vmstatIoOut: 42, iostatR-S: 1.72, iostatW-S: 13.37, iostatR-MBS: 0.25, iostatW-MBS: 0.52, iostatHDD-Util: 0.00
Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, Cores: 6, Sockets 2, DDR4, (free-mh): 62G 12G 17G 4.5M 32G 49G, NVMe RAID, TestType: Apache PHP, Threads: 64, Server Load: 7.13, 1.72, 0.62, Threads: 64, Time per request: 384.736 [ms] (mean), Concurrency Level: 64, Requests per second 166.35 [#/sec] (mean)Time per request: 384.736 [ms] (mean), Connect: 0 1 3.0 0 14, Processing: 130 378 547.1 311 5501, CPU Core Load: %usr,1.90,1.95,1.99,1.95,1.93,1.93,1.94,1.95,1.94,1.95,1.94,1.95,1.93,1.88,1.89,1.86,1.84,1.86,1.83,1.85,1.83,1.86,1.84,1.86,1.83,, CPU Idle%: %idle,97.66,97.58,97.59,97.59,97.63,97.62,97.63,97.60,97.63,97.58,97.64,97.59,97.64,97.65,97.69,97.69,97.73,97.68,97.75,97.69,97.75,97.69,97.72,97.61,97.74,,Waiting on Disk: %iowait,0.04,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.03,0.04,0.03,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.04,0.11,0.04,, vmstatProcs: 59, vmstatSwapIn: 0, vmstatSwapOut: 0, vmstatIoIn: 7, vmstatIoOut: 42, iostatR-S: 1.72, iostatW-S: 13.37, iostatR-MBS: 0.25, iostatW-MBS: 0.52, iostatHDD-Util: 0.00
Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, Cores: 6, Sockets 2, DDR4, (free-mh): 62G 12G 17G 4.5M 32G 49G, NVMe RAID, TestType: Apache PHP, Threads: 128, Server Load: 11.64, 2.82, 0.99, Threads: 128, Time per request: 786.459 [ms] (mean), Concurrency Level: 128, Requests per second 162.75 [#/sec] (mean)Time per request: 786.459 [ms] (mean), Connect: 0 1 4.0 0 14, Processing: 149 742 1175.8 468 6124, CPU Core Load: %usr,1.90,1.95,1.99,1.96,1.94,1.93,1.94,1.95,1.94,1.95,1.94,1.96,1.93,1.88,1.89,1.86,1.84,1.86,1.83,1.85,1.83,1.86,1.85,1.86,1.83,, CPU Idle%: %idle,97.65,97.58,97.59,97.59,97.63,97.62,97.63,97.59,97.63,97.58,97.63,97.59,97.64,97.65,97.69,97.68,97.73,97.68,97.74,97.69,97.74,97.69,97.72,97.61,97.74,,Waiting on Disk: %iowait,0.04,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.03,0.04,0.03,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.04,0.11,0.04,, vmstatProcs: 101, vmstatSwapIn: 0, vmstatSwapOut: 0, vmstatIoIn: 7, vmstatIoOut: 42, iostatR-S: 1.72, iostatW-S: 13.37, iostatR-MBS: 0.25, iostatW-MBS: 0.52, iostatHDD-Util: 0.00
Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, Cores: 6, Sockets 2, DDR4, (free-mh): 62G 13G 17G 4.5M 32G 48G, NVMe RAID, TestType: Apache PHP, Threads: 256, Server Load: 30.81, 7.33, 2.49, Threads: 256, Time per request: 2309.182 [ms] (mean), Concurrency Level: 256, Requests per second 110.86 [#/sec] (mean)Time per request: 2309.182 [ms] (mean), Connect: 0 3 5.2 0 14, Processing: 148 1546 1985.2 830 9003, CPU Core Load: %usr,1.91,1.95,2.00,1.96,1.94,1.93,1.95,1.96,1.95,1.96,1.94,1.96,1.94,1.89,1.90,1.86,1.85,1.86,1.83,1.86,1.83,1.86,1.85,1.87,1.84,, CPU Idle%: %idle,97.65,97.57,97.59,97.59,97.63,97.62,97.62,97.59,97.62,97.58,97.63,97.58,97.64,97.65,97.69,97.68,97.73,97.67,97.74,97.69,97.74,97.69,97.71,97.61,97.73,,Waiting on Disk: %iowait,0.04,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.03,0.04,0.03,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.04,0.11,0.04,, vmstatProcs: 77, vmstatSwapIn: 0, vmstatSwapOut: 0, vmstatIoIn: 7, vmstatIoOut: 42, iostatR-S: 1.72, iostatW-S: 13.37, iostatR-MBS: 0.25, iostatW-MBS: 0.52, iostatHDD-Util: 0.00
Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, Cores: 6, Sockets 2, DDR4, (free-mh): 62G 13G 16G 4.5M 32G 48G, NVMe RAID, TestType: Apache PHP, Threads: 512, Server Load: 46.33, 11.29, 3.82, Threads: 512, Time per request: 28000.550 [ms] (mean), Concurrency Level: 512, Requests per second 18.29 [#/sec] (mean)Time per request: 28000.550 [ms] (mean), Connect: 0 11 15.6 0 38, Processing: 179 14189 21674.4 1677 54626, CPU Core Load: %usr,1.91,1.96,2.00,1.96,1.94,1.94,1.95,1.96,1.95,1.96,1.95,1.96,1.94,1.89,1.90,1.87,1.85,1.87,1.84,1.86,1.84,1.87,1.85,1.87,1.84,, CPU Idle%: %idle,97.65,97.57,97.58,97.58,97.62,97.61,97.62,97.59,97.62,97.57,97.63,97.58,97.63,97.64,97.68,97.68,97.72,97.67,97.74,97.68,97.74,97.68,97.71,97.60,97.73,,Waiting on Disk: %iowait,0.04,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.03,0.04,0.03,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.04,0.11,0.04,, vmstatProcs: 0, vmstatSwapIn: 0, vmstatSwapOut: 0, vmstatIoIn: 7, vmstatIoOut: 42, iostatR-S: 1.72, iostatW-S: 13.37, iostatR-MBS: 0.25, iostatW-MBS: 0.52, iostatHDD-Util: 0.00
Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, Cores: 6, Sockets 2, DDR4, (free-mh): 62G 14G 16G 4.5M 32G 48G, NVMe RAID, TestType: Apache PHP, Threads: 750, Server Load: 37.39, 13.46, 4.95, Threads: 750, Time per request: 40777.912 [ms] (mean), Concurrency Level: 750, Requests per second 18.39 [#/sec] (mean)Time per request: 40777.912 [ms] (mean), Connect: 0 13 17.8 0 43, Processing: 150 17962 23352.7 1807 54307, CPU Core Load: %usr,1.91,1.96,2.00,1.97,1.95,1.94,1.95,1.96,1.95,1.96,1.95,1.97,1.94,1.89,1.90,1.87,1.85,1.87,1.84,1.86,1.84,1.87,1.86,1.87,1.84,, CPU Idle%: %idle,97.64,97.57,97.58,97.58,97.62,97.61,97.62,97.58,97.62,97.57,97.62,97.58,97.63,97.64,97.68,97.67,97.72,97.67,97.73,97.68,97.73,97.68,97.71,97.60,97.73,,Waiting on Disk: %iowait,0.04,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.03,0.04,0.03,0.05,0.04,0.04,0.04,0.04,0.04,0.04,0.03,0.04,0.04,0.11,0.04,, vmstatProcs: 0, vmstatSwapIn: 0, vmstatSwapOut: 0, vmstatIoIn: 7, vmstatIoOut: 42, iostatR-S: 1.72, iostatW-S: 13.36, iostatR-MBS: 0.25, iostatW-MBS: 0.52, iostatHDD-Util: 0.00
The PHP script runs json_decode(), sorts thousands of records several times, applies preg_replace() to some array elements, and then generates random numbers using:
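// Ten passes; each draws 100,000 random integers and tallies those in 0..10.
for ($z = 0; $z < 10; $z++) {
    $r = array(0,0,0,0,0,0,0,0,0,0,0);  // counters for the values 0..10
    for ($i = 0; $i < 100000; $i++) {
        $n = mt_rand(0, 10000);
        if ($n <= 10) {
            $r[$n]++;
        }
    }
    print_r($r);
    print_r("<BR>");
}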
I'm wondering whether there are other commands whose output I should be looking at to pin this down. The current set of commands:
vmstat -w -S m
iostat -dxmh /dev/md0
mpstat -P ALL
free -h
lscpu
top -b -n
#Between each test clear cache
sync; echo 3 > /proc/sys/vm/drop_caches
swapoff -a && swapon -a
It's good that you are measuring response time, even for this micro-benchmark. Host-level utilization metrics do not tell the whole story, definitely not here.
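Utilization figures also only help if they cover the test window. The per-core %usr, %idle, and %iowait values in the question are nearly identical in every run, which is what one would expect from averages since boot, and that is what mpstat reports when it is run without an interval. A minimal sketch of interval-based sampling, assuming the Linux /proc/stat field layout; the function names are illustrative and not part of any existing tool:

```python
import time


def parse_cpu_line(line):
    """Return (busy, total) jiffies from an aggregate or per-core 'cpu' line."""
    fields = [int(x) for x in line.split()[1:]]
    # Field order per proc(5): user nice system idle iowait irq softirq steal ...
    idle = fields[3] + (fields[4] if len(fields) > 4 else 0)
    total = sum(fields[:8])  # guest time is already counted inside user/nice
    return total - idle, total


def busy_percent(before, after):
    """Percent of non-idle CPU time between two 'cpu' lines."""
    busy0, total0 = parse_cpu_line(before)
    busy1, total1 = parse_cpu_line(after)
    elapsed = total1 - total0
    return 100.0 * (busy1 - busy0) / elapsed if elapsed else 0.0


def read_cpu_line(path="/proc/stat"):
    """The first line of /proc/stat is the all-core aggregate."""
    with open(path) as f:
        return f.readline()


def sample(window_seconds=5.0):
    """Busy percent over a window; call this while the benchmark is running."""
    before = read_cpu_line()
    time.sleep(window_seconds)
    return busy_percent(before, read_cpu_line())
```

With the existing tools, the same effect comes from passing an interval and a count, for example mpstat -P ALL 5 1.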
Performance fell over well before 256 threads. Requests per second stopped improving after 32 threads (164.37 at 32, 166.35 at 64, 162.75 at 128) while time per request roughly doubled at each step, so things were already getting worse by the third run. And there was no corresponding increase in CPU utilization or iowait. So there is probably contention for some other resource.
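The relationship between concurrency, throughput, and latency can be checked with Little's law (requests in flight = throughput times latency). The sketch below uses only the concurrency and requests-per-second figures from the ab output in the question; the helper names are made up for illustration:

```python
# (concurrency, requests per second) copied from the ab results in the question
RUNS = [
    (16, 127.47), (32, 164.37), (64, 166.35), (128, 162.75),
    (256, 110.86), (512, 18.29), (750, 18.39),
]


def mean_latency_ms(concurrency, requests_per_second):
    """Little's law: in-flight requests = throughput x latency."""
    return 1000.0 * concurrency / requests_per_second


def peak(runs):
    """Run with the highest throughput."""
    return max(runs, key=lambda run: run[1])


def report(runs):
    """One formatted line per run, with throughput relative to the peak."""
    _, best_rps = peak(runs)
    return [
        f"{c:>4} clients  {rps:7.2f} req/s  "
        f"{mean_latency_ms(c, rps):9.1f} ms  {100 * rps / best_rps:5.1f}% of peak"
        for c, rps in runs
    ]


if __name__ == "__main__":
    print("\n".join(report(RUNS)))
```

The computed latencies should closely match the "Time per request" values ab printed. Once throughput is flat, every extra client only adds queueing time, which is why latency climbs while the server does no additional work.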
Be open-minded and check everything for utilization and saturation. A very incomplete list:
/dev/random ?
Since this is Linux, where the performance tools are good, consider sampling everything on CPU and looking at the call graphs. See Gregg's Linux performance notes for how to make CPU flame graphs. Use eBPF when you can, and perf record on earlier kernels. Read the graphs to see where time is spent among your code, the runtime, the database, and the operating system. For example, if many of the sampled stacks are doing file I/O, you will see file-system functions in them.
Also on the subject of looking at everything, consider getting netdata going. This particular tool collects a large number of data points with minimal configuration. In my opinion, the built-in set of alerts alone is worth the effort of installing it. netdata can also be pointed at Apache httpd mod_status so that you can graph connections and workers alongside host metrics.
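To look at the worker counts directly, mod_status has a machine-readable form (the ?auto query string) that prints "Key: value" lines, including BusyWorkers and IdleWorkers. A rough sketch of reading it, assuming mod_status is enabled and reachable at /server-status on localhost; that URL and the function names are assumptions, not something from the question:

```python
from urllib.request import urlopen


def parse_status(text):
    """Parse 'Key: value' lines from mod_status machine-readable output."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def worker_saturation(fields):
    """Fraction of currently started workers that are busy."""
    busy = int(fields["BusyWorkers"])
    idle = int(fields["IdleWorkers"])
    return busy / (busy + idle) if busy + idle else 0.0


def fetch_status(url="http://localhost/server-status?auto"):
    """Fetch the status page; the default URL is an assumption."""
    with urlopen(url, timeout=5) as resp:
        return resp.read().decode("ascii", "replace")
```

Polling worker_saturation(parse_status(fetch_status())) during a run shows whether all workers are busy while the CPUs stay idle, which would point at the worker limit rather than the hardware. Note that the ratio is relative to workers currently started, not to the configured maximum.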