Here are some Birmingham Particle Physics results for the HEP-SPEC06 (HS06) 32-bit benchmark (graphs bordered green), and its 64-bit equivalent (graphs bordered orange), which are customised versions of the SPEC CPU2006 benchmark.
The result of this benchmark for a particular setup is often quoted as a single figure, HEP-SPEC06 per core: the HEP-SPEC06 result divided by the number of benchmark streams, when the number of benchmark streams equals the number of cores. But the range of results for different numbers of benchmark streams is interesting too, and helps (along with other factors such as network bandwidth) to decide how many CPU streams should be configured for a particular setup.
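As a minimal sketch of the per-core figure described above, the calculation is a single division, valid only when the number of streams equals the number of cores. The function name `hs06_per_core` is hypothetical and not part of the benchmark suite; the example figure is the 12-core 64-bit E5645 result quoted further down this page.

```python
def hs06_per_core(total_hs06: float, streams: int, cores: int) -> float:
    """Return HEP-SPEC06 per core.

    Hypothetical helper: the figure is only meaningful when the number
    of benchmark streams equals the number of cores.
    """
    if streams != cores:
        raise ValueError("HS06 per core is quoted only when streams == cores")
    return total_hs06 / streams


# 12-stream, 12-core 64-bit result for the E5645 machine on this page.
per_core = hs06_per_core(159.96, streams=12, cores=12)
print(f"{per_core:.2f} HS06 per core")  # about 13.33
```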
Above: Dell Poweredge 1950 machine, with 16GB memory, and one quad-core E5450 processor, system Fedora 12 64-bit, benchmark 32-bit.
Above: One node of a Supermicro twin machine, with 16GB memory per node, and two quad-core E5450 processors per node, system SL4.7 32-bit.
Above: Dell Poweredge R410 machine, with 12GB memory, and two quad-core E5520 processors (with hyperthreading), system SL5.4 64-bit, benchmark 64-bit. This processor released 2009Q1 belongs to codename Gainestown.
Above: One node of a Supermicro twin machine, with 24GB memory per node, and two quad-core X5550 processors per node, system SL5.4 64-bit, benchmark 64-bit. If we assume that hyperthreading doesn't contribute to performance for 8 benchmark streams or fewer, and also that the memory per stream here is not a contributing factor, then hyperthreading contributes about 25% extra performance in the 16-stream case (159.8/127.6 ≈ 1.25), compared with a theoretical ideal of 100% extra if hyperthreads were effectively full cores. This processor, released 2009Q1, belongs to codename Gainestown.
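The hyperthreading estimate above can be reproduced with a short calculation. This is only a sketch under the same assumptions as the text: 127.6 is taken to be the 8-stream result (one stream per physical core) and 159.8 the 16-stream result (one stream per hyperthread). The function name is hypothetical.

```python
def hyperthreading_gain_percent(score_all_threads: float,
                                score_cores_only: float) -> float:
    """Extra performance (in percent) from running one stream per
    hyperthread, relative to running one stream per physical core."""
    return (score_all_threads / score_cores_only - 1.0) * 100.0


# Figures quoted for the dual X5550 node: 16 streams vs 8 streams.
gain = hyperthreading_gain_percent(159.8, 127.6)
print(f"Hyperthreading adds about {gain:.0f}% (ideal would be 100%)")
```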
Above: Dell R410 server with 16GB memory and two E5620 quad-core (with hyperthreading) processors, system SL5.7 64-bit, benchmark 64-bit. This processor released 2010Q1 belongs to codename Westmere.
Above: Supermicro server with 12GB memory and two Intel Xeon E5645 6-core processors, hyperthreading disabled, system SL5.6 64-bit, benchmark 64-bit. Because hyperthreading is disabled, runs with more than 12 streams are still using only 12 cores. The 12-core 64-bit result is 159.96 as shown; my 12-core 32-bit result (not shown) is 135.41. This processor, released 2010Q1, belongs to codename Westmere. Thanks to Viglen Ltd for the test facility.
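For readers who want to compare the two 12-core figures quoted above, the following sketch computes how much higher the 64-bit result is than the 32-bit one. The comparison itself and the function name are illustrative additions; only the two scores come from the measurements on this page.

```python
def relative_gain_percent(score_64bit: float, score_32bit: float) -> float:
    """Percentage by which the 64-bit score exceeds the 32-bit score."""
    return (score_64bit / score_32bit - 1.0) * 100.0


# 12-core E5645 results: 64-bit (shown) and 32-bit (not shown).
gain = relative_gain_percent(159.96, 135.41)
print(f"64-bit result is about {gain:.0f}% higher than 32-bit")
```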
Above: Supermicro server with 16GB memory and two AMD Opteron 6174 12-core processors, system SL5.6 64-bit, benchmark 64-bit. The Opteron 6100 processors released 2010Q1 are codenamed Magny-Cours. Thanks to Viglen Ltd for the test facility.
Above: Supermicro server with 64GB memory and two AMD Opteron 6274 16-core processors, system SL5.7 64-bit, gcc 4.1.2, benchmark 64-bit. The Opteron 6200 processors released 2011Q4 are codenamed Interlagos. Thanks to Boston Labs for the test facility.
Above: one motherboard of a Dell C6145 server with 96GB memory and four AMD Opteron 6234 12-core processors, turbo disabled, system SL5.8 64-bit, gcc 4.1.2, benchmark 32-bit. The Opteron 6200 processors released 2011Q4 are codenamed Interlagos.
Above: one motherboard of a Dell C6145 server with 96GB memory and four AMD Opteron 6234 12-core processors, turbo disabled, system SL5.8 64-bit, gcc 4.1.2, benchmark 64-bit. The Opteron 6200 processors released 2011Q4 are codenamed Interlagos.
Above: IBM dx360M4 server with 32GB memory and two Intel Xeon E5-2660 2.2GHz 8-core processors, hyper-threading on, system SL5.8 64-bit, gcc 4.1.2, benchmark 32-bit. Ignore the right-hand half of the diagram if the case of number of cores < number of streams <= number of hyperthreads is of no interest. Thanks to Uni of Birmingham BlueBEAR2 cluster.
Above: IBM dx360M4 server with 32GB memory and two Intel Xeon E5-2660 2.2GHz 8-core processors, hyper-threading on, system SL5.8 64-bit, gcc 4.1.2, benchmark 64-bit. Ignore the right-hand half of the diagram if the case of number of cores < number of streams <= number of hyperthreads is of no interest. Thanks to Uni of Birmingham BlueBEAR2 cluster.
Above: Supermicro server with 64GB memory and two Intel Xeon E5-2670 2.6GHz 8-core processors, system SL5.7 64-bit, gcc 4.1.2, benchmark 64-bit. Ignore the right-hand half of the diagram if the case of number of cores < number of streams <= number of hyperthreads is of no interest. The Xeon E5-26xx processors released 2012Q1 are a subset of codename Sandy Bridge. Thanks to Boston Labs for the test facility.
Above: Dell R620 server with 64GB memory and two Intel Xeon E5-2690 2.9GHz 8-core processors, system SL5.4 64-bit, gcc 4.1.2, benchmark 32-bit. Ignore the right-hand half of the diagram if the case of number of cores < number of streams <= number of hyperthreads is of no interest. The Xeon E5-26xx processors released 2012Q1 are a subset of codename Sandy Bridge.
Above: Dell R620 server with 64GB memory and two Intel Xeon E5-2690 2.9GHz 8-core processors, system SL5.4 64-bit, gcc 4.1.2, benchmark 64-bit. Ignore the right-hand half of the diagram if the case of number of cores < number of streams <= number of hyperthreads is of no interest. The Xeon E5-26xx processors released 2012Q1 are a subset of codename Sandy Bridge.
Above: Dell T3400 workstation with 8GB memory and Q9550 quad-core (no hyperthreading) processor, system SL4.7 32-bit. This processor released 2008Q1 is from codename Yorkfield (45nm).
Above: Dell T1600 workstation with 8GB memory and Intel Xeon E3-1225 quad-core (no hyperthreading) processor, system SL5.4 64-bit, benchmark 64-bit. This processor, released 2011Q1, is from codename Sandy Bridge.