E3-1240 v5 3.50GHz single core perf worse than E5-2650 v2 2.60GHz PHP 5.X

24 points by erichileman 9 years ago · 21 comments · 2 min read

E3-1240 v5 @ 3.50GHz performance is worse than E5-2650 v2 @ 2.60GHz for PHP 5.X. For PHP 7 (and everything else) the E3 is better.

The test setup is XenServer 6.5 with CentOS 6.8 (kernel 2.6.32-642.6.2.el6.x86_64) HVM guests. Each VM has 2 cores assigned. The load test uses siege. PHP 5.4, 5.5, and 5.6 are all nearly 50% slower on the E3. PHP 7 is 200% faster on the E3. Varnish is nearly 300% faster on the E3. Sysbench tests are 150% - 300% faster on the E3. Only PHP 5.X is faster on the E5.

I've torn down and rebuilt the VMs several times and confirmed they are the same. I've even live migrated them to the other host/processor and confirmed the same results.

I've tried strace, but it isn't going to work because it adds overhead to every call and the E3 executes that overhead faster. In a browser the E5 TTFB is 167ms; the E3 is 318ms. Stracing the call on the E5 is 548ms; E3 557ms. The E3 executes the overhead of strace faster and the execution times equalize.
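To make the equalizing effect concrete, here is a small sketch that works through the four timings quoted above. It assumes the browser TTFB and the straced call measure the same request, which the post implies but does not state outright; the function name is only illustrative.

```python
# Timings quoted in the post, in milliseconds.
ttfb_ms = {"E5": 167, "E3": 318}      # untraced, time to first byte
strace_ms = {"E5": 548, "E3": 557}    # same call under strace

def strace_overhead(plain, traced):
    """Return the extra milliseconds attributable to tracing, per host."""
    return {host: traced[host] - plain[host] for host in plain}

overhead = strace_overhead(ttfb_ms, strace_ms)
for host in ("E5", "E3"):
    share = overhead[host] / strace_ms[host]
    print(f"{host}: +{overhead[host]} ms under strace ({share:.0%} of traced time)")

# How far apart the two hosts are with and without tracing.
print(f"untraced E3/E5 ratio: {ttfb_ms['E3'] / ttfb_ms['E5']:.2f}")
print(f"traced   E3/E5 ratio: {strace_ms['E3'] / strace_ms['E5']:.2f}")
```

The tracing overhead comes to 381 ms on the E5 and 239 ms on the E3, so most of the traced time on the E5 is strace itself, which is why the gap between the hosts nearly disappears.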

What is different about PHP 5.X that it would run so much better on the older generation, slower clocked E5? Is it the larger L1/L2 cache making the difference? Or something else, instruction set related maybe? What other tool could I use, one that adds little overhead, to see the PHP execution performance?
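One low-overhead option, sketched here as an assumption rather than something the thread settled on, is to time whole runs from the outside instead of tracing every call. The helper names are hypothetical, and the placeholder command would be swapped for the real one (for example `["php56", "index.php"]`). Note that this measures total wall-clock time including interpreter startup, not individual PHP calls.

```python
import statistics
import subprocess
import sys
import time

def time_command(argv, runs=20, warmup=3):
    """Run argv repeatedly and return wall-clock durations in milliseconds."""
    quiet = {"stdout": subprocess.DEVNULL, "stderr": subprocess.DEVNULL}
    # Warm-up runs let caches settle; their timings are discarded.
    for _ in range(warmup):
        subprocess.run(argv, check=True, **quiet)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True, **quiet)
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples

def summarize(samples):
    """Reduce a list of timings to min, median, and max."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }

# Placeholder command so the sketch runs anywhere; substitute the PHP command.
print(summarize(time_command([sys.executable, "-c", "pass"], runs=5)))
```

Running the same script on both hosts and comparing the medians gives a comparison that is not distorted by per-syscall tracing cost.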

techjuice 9 years ago

You are comparing low end Xeon processors with high end Xeon processors ($250-$280 vs $1166-$1180 per processor). You would need to use the same series E3-1240 v2 vs E3-1240 v5 to have a more accurate test. http://ark.intel.com/products/88176/Intel-Xeon-Processor-E3-... http://ark.intel.com/products/65730/Intel-Xeon-Processor-E3-...

qb45 9 years ago

Maybe

  perf stat -d php ./benchmark.php
would show some difference? It measures some kernel and CPU events like context switches, page faults, L1 and L3 cache misses.

  • erichilemanOP 9 years ago

    Unfortunately, the hardware counters all show <not supported>:

    Performance counter stats for 'php56 index.php':

            394.588620      task-clock (msec)         #    0.983 CPUs utilized
                   226      context-switches          #    0.573 K/sec
                     2      cpu-migrations            #    0.005 K/sec
                17,447      page-faults               #    0.044 M/sec
       <not supported>      cycles
       <not supported>      stalled-cycles-frontend
       <not supported>      stalled-cycles-backend
       <not supported>      instructions
       <not supported>      branches
       <not supported>      branch-misses
       <not supported>      L1-dcache-loads
       <not supported>      L1-dcache-load-misses
       <not supported>      LLC-loads
       <not supported>      LLC-load-misses
    
           0.401580145 seconds time elapsed
    
    Working on finding the event descriptors...
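As a rough aid for comparing runs, the output above can be split into counters that produced a value and events that perf reported as `<not supported>`. This is a minimal sketch that assumes the plain-text layout shown in this comment; the sample is abbreviated from it, and the function name is hypothetical.

```python
import re

# Abbreviated copy of the perf stat output posted above.
SAMPLE = """\
        394.588620      task-clock (msec)         #    0.983 CPUs utilized
               226      context-switches          #    0.573 K/sec
            17,447      page-faults               #    0.044 M/sec
   <not supported>      cycles
   <not supported>      L1-dcache-load-misses
   <not supported>      LLC-load-misses

       0.401580145 seconds time elapsed
"""

# A value (number or the literal marker) followed by an event name.
LINE = re.compile(r"^\s*(<not supported>|[\d.,]+)\s+([A-Za-z0-9_-]+)")

def split_events(text):
    """Return (supported: dict of event -> float, unsupported: list of events)."""
    supported, unsupported = {}, []
    for line in text.splitlines():
        if "seconds time elapsed" in line:
            continue  # summary line, not an event
        match = LINE.match(line)
        if not match:
            continue
        value, event = match.groups()
        if value == "<not supported>":
            unsupported.append(event)
        else:
            supported[event] = float(value.replace(",", ""))
    return supported, unsupported

supported, unsupported = split_events(SAMPLE)
print("measured:", supported)
print("unavailable:", unsupported)
```

In this output only the software counters (task clock, context switches, page faults) carry values; every cache-related event is in the unavailable list.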

lossolo 9 years ago

PHP 5 allocates memory differently from PHP 7. That's why you see a difference there, and memory is the biggest difference between the E5 and the E3 here (memory bandwidth, cache size). PHP 7 makes fewer memory allocations because it allocates in chunks, whereas PHP 5 allocates and reallocates all the time.

peller 9 years ago

I'm just speculating here, but if I remember correctly, PHP5 uses significantly more memory than PHP7, and the E5 has a 2.5x larger L3 cache and almost twice the memory bandwidth of the E3. Perhaps that has something to do with it?

  • erichilemanOP 9 years ago

    We thought that as well. The E5 has 4 memory channels with a max bandwidth of 51.2 GB/s. The E3 has 2 memory channels with a max bandwidth of 34.1 GB/s.

    But we see a dramatic difference in single core tests. Our virtual machines have 2 cores assigned and there's also a dramatic difference. I wouldn't think that 1-2 cores would saturate 2 memory channels nor 34.1 GB/s bandwidth. If we were testing all 8 cores on the E3 vs E5 8 core virtual machine, yeah maybe, but 1-2 cores?

    The L3 cache is much larger on the E5 at 20MB SmartCache vs the E3 at 8MB SmartCache. That seems to be the more likely suspect, but I don't know enough about how the CPU cache is used in relation to PHP to say for sure. Hopefully, someone else does :)

    Ref: http://ark.intel.com/products/88176/Intel-Xeon-Processor-E3-... http://ark.intel.com/products/64590/Intel-Xeon-Processor-E5-...
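The quoted bandwidth figures follow from the usual peak formula: channels times transfer rate times bytes per transfer. The sketch below reproduces them under the assumption of DDR3-1600 on the E5 and DDR4-2133 on the E3 with a 64-bit (8-byte) channel; those memory speeds are not stated in the thread.

```python
def peak_bandwidth_gb_s(channels, transfers_mt_s, bus_bytes=8):
    """Theoretical peak: channels x transfer rate (MT/s) x bytes per transfer."""
    return channels * transfers_mt_s * bus_bytes / 1000.0

# Assumed memory speeds (not given in the thread): DDR3-1600 and DDR4-2133.
e5 = peak_bandwidth_gb_s(channels=4, transfers_mt_s=1600)
e3 = peak_bandwidth_gb_s(channels=2, transfers_mt_s=2133)

print(f"E5: {e5:.1f} GB/s total, {e5 / 4:.1f} GB/s per channel")
print(f"E3: {e3:.1f} GB/s total, {e3 / 2:.1f} GB/s per channel")
```

Under these assumptions each E3 channel is faster than each E5 channel, which is consistent with the doubt raised above that 1-2 cores would be limited by total memory bandwidth.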

    • rektide 9 years ago

      You have talked about everything but what the parent was mentioning: actual cache. As in, L1 and L2. Those vary sizably among the different price tiers, somewhat understandably, for reasons related to this.

      On recent IBM Power chips, there's a so-called PowerCore option that turns off half the cores and lets the remaining cores double their L2. On some workloads that's a net win. I also tend to think it's there for those people paying a pricey per-core or per-socket fee, where a modest 15% performance gain per core could be very rewarding in a way that scale-out/more-cores can't replicate, but that's in a different realm than anyone I know.

      • erichilemanOP 9 years ago

        See the other comment above re: perf stat. Working on the event descriptors to see and confirm the L1/L2 cache hits/misses.

sliken 9 years ago

Look at the cache miss counters; I suspect that's the explanation. Your other workloads are more cache friendly.

mschuster91 9 years ago

> Varnish is nearly 300% faster for E3

That is the most worrying thing IMO. If the cache is hot (i.e. all loads are from RAM), then the E5 should be vastly more powerful, not vice versa...

Could you try the benchmarks with Gentoo, with optimised builds for each CPU?

nanis 9 years ago

Just to make sure: You built all binaries and linked libraries yourself, from scratch, with the same optimization settings, right?

  • erichilemanOP 9 years ago

    The binaries are from remi repo. We have a template we provision from. I used the same template on each virtual machine.

    PHP 5.4.45 (cli) (built: Sep 19 2016 15:31:07)
    PHP 5.5.38 (cli) (built: Nov 9 2016 17:32:11)
    PHP 5.6.28 (cli) (built: Nov 9 2016 07:04:38)

    The binaries are the same on each virtual machine. Are there build optimizations for E3/V5 vs E5/V2 that could make such a difference?

the8472 9 years ago

Have you tried comparing on bare metal?
