|
 |
|
 |
| ARIES performance guidelines |
 |
 |
ARIES delivers good performance compared to HP 9000 servers for all applications except the following:
- Floating point intensive applications
- Java based short running applications
- Multi-threaded applications that create a large number of threads (>50) and spend a significant amount of time in thread synchronization operations
- Applications compiled with the +Ovolatile compiler option
- OpenGL based applications (may work with good performance if they can use display lists and can communicate with OpenGL daemon process using GLX protocol).
- Transaction processing applications comprising several hundred processes
- Typical linear code, e.g. parsers, shells, interpreters (Perl)
ADVICE: Use native HP-UX Integrity ports of these applications.
ARIES mode performance of your application will vary depending on its execution profile and could be comparable to or better than HP 9000 server (PA8700 750 MHz or older) performance.
On HP Integrity servers, HP 9000 applications benefit from the faster CPU, the server architecture and an optimized ARIES. |
|
 |
 |
| ARIES performance : server and benchmark configurations |
 |
 |
Differences in server memory are of lesser significance for the performance comparison, as none of the selected benchmarks consumes more than 1 GB of memory.
| Server | HP 9000 server | HP 9000 server | HP 9000 server | HP Integrity server |
| Model | rp5450 | rp5470 | rp4440 | rx2660 |
| HP-UX OS | 11i v2 (11.23) | 11i v1 (11.11) | 11i v3 (11.31) | 11i v3 (11.31) |
| CPU | PA8500 rev 2.4 | PA8700 rev 3.1 | PA8900 rev 3.2 | Montvale 9100 |
| CPU Clock | 440 MHz | 750 MHz | 1000 MHz | 1.67 GHz |
| CPU / Core | 4 | 4 | 2 P : 4 C | 2 P : 4 C |
| L0 ICache | 512 KB | 768 KB | 768 KB | 16 KB |
| L0 DCache | 1024 KB | 1536 KB | 768 KB | 16 KB |
| L1 Cache | - NA - | - NA - | 64 MB | 1024 KB I : 256 KB D |
| L2 Cache | - NA - | - NA - | - NA - | 9 MB / Core |
| Memory | 2 GB | 4 GB | 16 GB | 16 GB |
Benchmarks selected for ARIES performance comparison:
- SPEC CPU INT2000 (Integer operations intensive CPU workload)
- SPEC CPU FP2000 (Floating point operations intensive CPU workload)
- SPEC JVM98 (set of small Java application suites)
- SPEC JBB2000 (Multi-threaded scalable transaction processing Java benchmark)
- Sysbench v0.4.8 (System operations intensive test scenarios)
Benchmark parameters and configurations:
SPEC CPU INT2000 / FP2000:
- The same config file was used to build the SPEC CPU benchmarks on the HP 9000 servers as in previous SPEC submissions.
- ARIES options: -sched_trace
SPEC JVM98:
- JVM version: 1.5.0.08 PA-RISC 2.0 32-bit (HotSpot, mixed-mode)
- Command line options: -pa20 -Xms512m -Xmx512m -Xmn512m -a -d3000 -m4 -M4 -n -s100 -t
- ARIES options: none
SPEC JBB2000:
- JVM version: 1.5.0.08 PA-RISC 2.0 32-bit (HotSpot, mixed-mode)
- Command line options: -pa20 -ms512m -mx512m
- ARIES options: -sched_trace
Sysbench:
- Benchmark version: 0.4.8
- Benchmark parameters
num_threads : 1 (ST - single-threaded)
num_threads : 32 (MT - multi-threaded)
memory_block_size : 4 KB
memory_total_size : 500 MB
memory_scope : global
memory_operation : read/write
file_test_mode : rndrw
file_num : 10
file_total_size : 500 MB
cpu_max_prime : 10000
thread_stack_size : 128 KB
thread_yields : 1000
thread_locks : 16
mutex_num : 4096
mutex_locks : 50000
mutex_loop : 10000
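The parameter list above can be turned into sysbench command lines roughly as follows. This is a minimal sketch, not the invocation used for the published results: the page does not show the actual commands, so the mapping from parameter names to sysbench 0.4.x flags (for example `mutex_loop` to `--mutex-loops`) is an assumption, and `sysbench_cmd` is a hypothetical helper name. The `memory_operation` and `thread_stack_size` settings are left out because the page does not say how they were passed. The function only prints the command line so that it can be reviewed before running.

```shell
#!/bin/sh
# Print (do not run) a sysbench 0.4.x command line for one test type.
# sysbench_cmd is a hypothetical helper; the flag mapping is an assumption.
sysbench_cmd() {
    test_name=$1   # cpu, memory, fileio, threads or mutex
    threads=$2     # 1 for the ST runs, 32 for the MT runs
    case "$test_name" in
        cpu)     opts="--cpu-max-prime=10000" ;;
        memory)  opts="--memory-block-size=4K --memory-total-size=500M --memory-scope=global" ;;
        fileio)  opts="--file-test-mode=rndrw --file-num=10 --file-total-size=500M" ;;
        threads) opts="--thread-yields=1000 --thread-locks=16" ;;
        mutex)   opts="--mutex-num=4096 --mutex-locks=50000 --mutex-loops=10000" ;;
        *)       echo "unknown test: $test_name" >&2; return 1 ;;
    esac
    echo "sysbench --test=$test_name --num-threads=$threads $opts run"
}

sysbench_cmd cpu 1      # single-threaded (ST) CPU test
sysbench_cmd mutex 32   # multi-threaded (MT) mutex test
```

Note that the fileio test normally needs a matching `prepare` step before `run` to create the test files.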
ARIES version:
HP-UX 11i v3 (11.31) - ARIES patch PHSS_36520
ARIES performance comparison graph conventions:
- ARIES performance on the HP Integrity server is taken as the reference (= 1).
- The performance of each HP 9000 server is expressed relative to this ARIES reference.
- The Y axis represents HP 9000 server performance relative to ARIES performance.
- For example, a value of 0.39 for the PA8500 on SPEC CPU INT2000 means the HP 9000 server reaches 39% of ARIES mode performance, i.e. a 61% ARIES mode performance benefit compared to native execution on the HP 9000 server.
- In the graphs, all points below the green reference line indicate an ARIES mode performance gain.
- In the graphs, all points above the green reference line indicate that the benchmark runs faster on the HP 9000 server than in ARIES mode on the Integrity server.
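The conversion behind the 0.39 example can be written out as a small calculation. The sketch below assumes the benefit figure is simply (1 - relative value) x 100, which is what the 0.39 / 61% example implies; `aries_gain_pct` is a hypothetical helper name and not part of ARIES.

```shell
#!/bin/sh
# Convert a value read from the graphs (HP 9000 performance with ARIES = 1)
# into the percentage benefit quoted in the text: (1 - value) * 100.
# aries_gain_pct is a hypothetical helper name, not an ARIES tool.
aries_gain_pct() {
    awk -v s="$1" 'BEGIN { printf "%.0f\n", (1 - s) * 100 }'
}

aries_gain_pct 0.39   # PA8500, SPEC CPU INT2000 example: prints 61
```

A value above 1 yields a negative number, which corresponds to a point above the reference line (the HP 9000 server is faster).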
|
|
 |
 |
| ARIES performance : at a glance (summary) |
 |
|
On the rx2660 HP Integrity server (Montvale 9100, 1.67 GHz, 18 MB L3 cache), ARIES performance is, on average:
- For CPU intensive workloads
  - ~2x FASTER vs PA8500 440 MHz based rp5450 HP 9000 server
  - ~25% FASTER vs PA8700 750 MHz based rp5470 HP 9000 server
  - ~25% SLOWER vs PA8900 1000 MHz based rp4440 HP 9000 server
- For non-CPU intensive workloads, ARIES performance varies widely depending on test type and server configuration and can range from 0.5x to 4x of HP 9000 server performance
ARIES mode performance of your application will vary depending on its execution profile and could be comparable to or better than HP 9000 server (PA8700 750 MHz or older) performance.
ARIES mode performance gain depends on the configuration of the HP 9000 server and of the target HP Integrity server.
HP ARIES technology on current Integrity servers provides performance better than PA8500, equivalent to PA8700, and nearly 75% of the latest PA8900 based HP 9000 servers. As a result, HP Integrity servers can be an ideal platform for hosting legacy HP 9000 application workloads as part of your transformation to HP Integrity servers.
Refer to ARIES performance : server and benchmark configurations for details on server configurations and benchmark parameters. |
 |
|
 |
 |
| ARIES performance : SPEC CPU INT2000 comparison |
 |
 |
SPEC CPU INT2000 and ARIES configuration:
- The same config file was used to build the SPEC CPU INT2000 benchmarks on the HP 9000 servers as in previous SPEC submissions.
- ARIES version: HP-UX 11i v3 (11.31) patch PHSS_36520
- ARIES options: -sched_trace
About the graphs:
- ARIES performance on the HP Integrity server is taken as the reference (= 1).
- The performance of each HP 9000 server is expressed relative to this ARIES reference.
- In the graphs, all points below the green reference line indicate an ARIES mode performance gain.
- In the graphs, all points above the green reference line indicate that the benchmark runs faster on the HP 9000 server than in ARIES mode on the Integrity server.
- The Y axis represents HP 9000 server performance relative to ARIES performance.
- Dotted lines represent the weighted average of the corresponding data points.
- Conclusion: ARIES mode performance is better than or comparable to that of all but the fastest (PA8900 based) HP 9000 servers.
|
 |
|
 |
 |
| ARIES performance : SPEC CPU FP2000 comparison |
 |
|
SPEC CPU FP2000 and ARIES configuration:
- The same config file was used to build the SPEC CPU FP2000 benchmarks on the HP 9000 servers as in previous SPEC submissions.
- ARIES version: HP-UX 11i v3 (11.31) patch PHSS_36520
- ARIES options: -sched_trace
About the graphs:
- ARIES performance on the HP Integrity server is taken as the reference (= 1).
- The performance of each HP 9000 server is expressed relative to this ARIES reference.
- In the graphs, all points below the green reference line indicate an ARIES mode performance gain.
- In the graphs, all points above the green reference line indicate that the benchmark runs faster on the HP 9000 server than in ARIES mode on the Integrity server.
- The Y axis represents HP 9000 server performance relative to ARIES performance.
- Dotted lines represent the weighted average of the corresponding data points.
- Conclusion: ARIES mode performance is better than or comparable to that of all but the fastest (PA8900 based) HP 9000 servers.
|
 |
|
 |
 |
| ARIES performance : Sysbench-0.4.8 comparison |
 |
 |
Sysbench and ARIES configuration:
- Benchmark version: 0.4.8
- Benchmark parameters
num_threads : 1 (ST - single-threaded)
num_threads : 32 (MT - multi-threaded)
memory_block_size : 4 KB
memory_total_size : 500 MB
memory_scope : global
memory_operation : read/write
file_test_mode : rndrw
file_num : 10
file_total_size : 500 MB
cpu_max_prime : 10000
thread_stack_size : 128 KB
thread_yields : 1000
thread_locks : 16
mutex_num : 4096
mutex_locks : 50000
mutex_loop : 10000
- Data model: 32-bit
- ARIES version: HP-UX 11i v3 (11.31) patch PHSS_36520
- ARIES options: none
About the graphs:
- ARIES performance on the HP Integrity server is taken as the reference (= 1).
- The performance of each HP 9000 server is expressed relative to this ARIES reference.
- In the graphs, all points below the green reference line indicate an ARIES mode performance gain.
- In the graphs, all points above the green reference line indicate that the benchmark runs faster on the HP 9000 server than in ARIES mode on the Integrity server.
- The Y axis represents HP 9000 server performance relative to ARIES performance.
- Dotted lines represent the average of the corresponding data points.
|
|
|
 |
 |
| ARIES performance : customer applications |
 |
 |
The table below shows ARIES mode performance for various customer applications. The comparison reinforces the fact that ARIES performance depends largely on the application's execution profile.
|
| When | Application | HP 9000 server | HP Integrity server | ARIES patch | ARIES mode performance |
| 2004 | Applications from a telecom major in APJ | PA8700 750 MHz | Madison 6M 1500 MHz | 11.23 PHSS_30779 | 150% of PA8700 |
| 2005 | An e-Business / BI application | PA8800 900 MHz | Madison 9M 1600 MHz | 11.23 PHSS_34201 | 65% of PA8800 |
| 2005 | Applications from a telecom major in EMEA | PA8800 900 MHz | Madison 9M 1600 MHz | 11.23 PHSS_34201 | 110% of PA8800 |
| 2006 | HP Radia application components | PA8800 1000 MHz | Madison 3M 1600 MHz | 11.23 PHSS_34201 | 57% of PA8800 |
| 2007 | Data cleansing application | PA8900 1000 MHz | Madison 9M 1600 MHz | 11.23 PHSS_35045 | 120% of PA8900 (-sched_trace) |
| 2007 | A BI application | PA8800 800 MHz | Madison 3M 1300 MHz | 11.23 PHSS_36519 | 68% of PA8800 |
| 2007 | An ERP application | PA8900 1000 MHz | Madison 9M 1600 MHz | 11.31 PHSS_36520 | 85% of PA8900; 55% of Integrity native |
| 2007 | Report generation and forms conversion app. | PA8800 800 MHz | Madison 3M 1300 MHz | 11.23 PHSS_36519 | 50% of PA8800 |
|
|
 |
 |
| ARIES performance tuning |
 |
 |
Most HP 9000 applications experience good performance with the default ARIES configuration. However, you may find the following tips useful for tuning the performance of your HP 9000 applications under ARIES:
- For applications that are very loop intensive, reduce the ARIES translation threshold value (ARIES option -ts).
- For applications that run for a long duration and have good locality of execution (a few functions account for a significant portion of total execution time), consider enabling trace scheduling (ARIES option flag -sched_trace).
- If trace scheduling is enabled, consider fine-tuning the trace scheduling threshold (ARIES option -ts_trace).
- For long-running Java applications that have large methods, consider increasing the size of the ARIES AMAP for dynamically generated code (ARIES option -amapsz_smc).
- For the rare applications that access and modify the Floating-Point Status Register (FPSR: fr0L) in a loop, consider enabling translations for such blocks (ARIES option flag -fpgr_trans).
- Some applications perform well when ARIES maps the emulated FP register context onto general registers, while others perform better with this optimization disabled. Consider changing the ARIES option flag -[no]opt_fpgr.
- Some applications may show a slight performance gain if re-ordering of state-changing instructions and memory loads is enabled (ARIES option flag -[no]opt_reorder).
WARNING: Some multi-threaded applications, such as a JVM, might fail with a core dump if ARIES re-orders state-changing PA-RISC instructions.
- Some applications that have a large text segment or call many shared libraries may require a larger code cache region (ARIES options -descsz, -amapsz, -ccsz). For best results, make sure that the sizes of these parameters are in the ratio 1:2:2.
- Some applications that require ARIES to spend significant time in dynamic translation may run faster if translations are cached to disk (ARIES options -load, -save). Saving and loading translated code works best for statically linked applications.
WARNING: Some HP 9000 applications may not work correctly when loading ARIES dynamic translations cached on disk during previous executions of the same application.
- Review the usage of volatile variables in your application and make sure that the application is not compiled using the +Ovolatile compiler option. Use volatile variables only if necessary. ARIES performance suffers while translating ordered memory operations of HP 9000 applications.
HP recommends that you install the latest ARIES patches on your system to ensure that your HP 9000 application benefits from the most recent ARIES performance improvements and optimizations. The performance tuning steps above are taken from the aries(5) man page; refer to the section Advanced ARIES Options in the latest aries(5) man page for details about the ARIES options mentioned.
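The 1:2:2 sizing recommendation for -descsz, -amapsz and -ccsz can be sanity-checked before the values are put into an ARIES configuration. The sketch below assumes the ratio applies in the order the options are listed (descsz : amapsz : ccsz) and that all three sizes are given as integers in the same unit; `check_ratio` is a hypothetical helper name, and the example sizes are illustrative only, not recommended values. Consult aries(5) for the actual units and limits.

```shell
#!/bin/sh
# Check that three sizes follow the 1:2:2 ratio recommended for
# -descsz : -amapsz : -ccsz (order assumed from the text above).
# check_ratio is a hypothetical helper; sizes are integers in one common unit.
check_ratio() {
    descsz=$1
    amapsz=$2
    ccsz=$3
    if [ "$amapsz" -eq $((descsz * 2)) ] && [ "$ccsz" -eq $((descsz * 2)) ]; then
        echo "ok"
    else
        echo "not 1:2:2"
        return 1
    fi
}

check_ratio 1024 2048 2048   # illustrative sizes only: prints ok
```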
If the above steps do not improve ARIES performance for your application, follow the steps below to collect ARIES profile information and send it to the HP support organization through its web site, the IT Resource Center (ITRC).
- Download and install the latest HP Caliper if it is not already installed on your server.
- Replace your HP 9000 application invocation command with the following:
# Environment for the profiling run
export PA_BOOT32_DEBUG=3
export CALIPER_HOME=/opt/caliper
mkdir -p /tmp/ARIES_profdb
# Run the application under Caliper. "exec" is omitted so that the
# unset below still runs after the application exits.
$CALIPER_HOME/bin/caliper fprof \
    --database=/tmp/ARIES_profdb/db,unique --scope=process \
    --system-usage=all -r a --process=all \
    --output-file=/tmp/ARIES_profdb/fprof_aries.txt,per-process,unique \
    --module-default=none \
    --module-include=/usr/lib/hpux32/aries32.so:aries32_dyncode \
    <PA_executable_name> <PA_executable_arguments>
unset PA_BOOT32_DEBUG
- Send all files under the directory /tmp/ARIES_profdb along with the output of the following commands:
/opt/caliper/bin/caliper -v
/usr/bin/what /usr/lib/hpux32/aries32.so
|
|
 |