 
 
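The performance of SSS-CORE has been evaluated from various angles. This page summarizes the results, covering the cost of basic system operations, MBCF latency and bandwidth over 1000BASE-SX and 100BASE-TX, MPI and RPC performance, the NAS Parallel Benchmarks, and the ADSM and UDSM runtime systems.
The details of the experiments and the discussions are given in our papers.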
In the following, 'SPARCstation 20' refers to the Sun Microsystems SPARCstation 20 and compatible machines. We mainly used the Axil 320 model 8.1.1, which is compatible with the Sun Microsystems SPARCstation 20.
                       Conditions
---------------------------------------------------------
| workstation | SPARCstation 20 (85 MHz SuperSPARC × 1) |
|-------------|-----------------------------------------|
|             | SSS-CORE Ver. 1.1                       |
|     OS      |-----------------------------------------|
|             | SunOS 4.1.4                             |
---------------------------------------------------------
      Cost of getting a task ID
-------------------------------------
| SSS-CORE get_taskid() | 1.12 µsec |
|-----------------------|-----------|
|    SunOS getpid()     | 4.39 µsec |
-------------------------------------
            Costs of allocating/freeing memory (in µsec)
----------------------------------------------------------------
|    size (byte)    |  4 K     16 K    64 K    256 K     1 M   |
|-------------------+------------------------------------------|
| SSS-CORE allocate |  23.91   28.91   48.77   123.2    431.2  |
|   SSS-CORE free   |  19.49   20.36   23.91    36.23    99.06 |
|   SunOS sbrk()    | 133.2   375.8   894.3   1828     2020    |
----------------------------------------------------------------
                                  Conditions
------------------------------------------------------------------------------
|     workstation      |Sun Microsystems Ultra 60 (450 MHz UltraSPARC-II × 1)|
|----------------------|-----------------------------------------------------|
|         NIC          |Sun Microsystems GigabitEthernet/P 2.0 Adapter       |
|----------------------|-----------------------------------------------------|
|       network        |(directly connected)                                 |
|----------------------|-----------------------------------------------------|
|OS &                  |SSS-CORE Ver. 2.3               & MBCF               |
|Communication Protocol|-----------------------------------------------------|
|                      |Solaris 2.6                     & TCP/IP             |
------------------------------------------------------------------------------
     One-way latencies of MBCF/1000BASE-SX (in µsec)
---------------------------------------------------------
| data size (byte) |   4     16     64     256    1024  |
|------------------+------------------------------------|
|       MBCF       |  9.6   11.0   11.5   16.2    35.9  |
|      TCP/IP      | 95.08  95.22  95.39  99.45  114.15 |
---------------------------------------------------------
     Peak bandwidths of MBCF/1000BASE-SX (in Mbyte/sec)
-------------------------------------------------------------
| data size (byte) |  4     16    64     256   1024   1408  |
|------------------+----------------------------------------|
|       MBCF       | 2.29  5.67  22.30  55.41  78.22  80.92 |
|      TCP/IP      | 0.09  0.43   1.67   5.56  12.79  20.21 |
-------------------------------------------------------------
Although the software overhead of MBCF is small, the peak bandwidth does not reach the hardware limit of 125 Mbyte/sec. The bottleneck is probably somewhere in the Ultra 60's hardware.
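The 125 Mbyte/sec limit is the 1000 Mbit/sec line rate of 1000BASE-SX divided by 8 bits per byte. The following Python sketch shows how far each measured bandwidth falls short of that figure. The names `HW_LIMIT_MBYTE` and `utilization` are illustrative only and are not part of SSS-CORE, and the calculation ignores Ethernet framing overhead, so the achievable payload rate is somewhat below 125 Mbyte/sec.

```python
# Nominal line rate of 1000BASE-SX, converted from Mbit/sec to Mbyte/sec.
LINE_RATE_MBIT = 1000
HW_LIMIT_MBYTE = LINE_RATE_MBIT / 8  # 125.0

# Peak bandwidths copied from the table above (data size in bytes -> Mbyte/sec).
mbcf = {4: 2.29, 16: 5.67, 64: 22.30, 256: 55.41, 1024: 78.22, 1408: 80.92}
tcp = {4: 0.09, 16: 0.43, 64: 1.67, 256: 5.56, 1024: 12.79, 1408: 20.21}


def utilization(bandwidth_mbyte):
    """Fraction of the nominal hardware limit reached by a measured bandwidth."""
    return bandwidth_mbyte / HW_LIMIT_MBYTE


for size in sorted(mbcf):
    print(f"{size:5d} bytes: MBCF {utilization(mbcf[size]):6.1%}, "
          f"TCP/IP {utilization(tcp[size]):6.1%}")
```

For 1408-byte data, this puts MBCF at roughly 65% of the nominal limit and TCP/IP at roughly 16%.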
                                Conditions
--------------------------------------------------------------------------
| workstation | SPARCstation 20 (85 MHz SuperSPARC × 1)                  |
|-------------|----------------------------------------------------------|
|     NIC     | Sun Microsystems Fast Ethernet SBus Adapter 2.0          |
|-------------|----------------------------------------------------------|
|             | SMC TigerStack 100 5324TX (non-switching 100BASE-TX HUB) |
|   network   |----------------------------------------------------------|
|             | Bay Networks BayStack 350T (switching 100BASE-TX HUB)    |
|-------------|----------------------------------------------------------|
|     OS      | SSS-CORE Ver. 1.1                                        |
--------------------------------------------------------------------------
   One-way latencies of MBCF/100BASE-TX (in µsec)
----------------------------------------------------
| data size (byte) | 4     16    64    256   1024  |
|------------------+-------------------------------|
|    MBCF_WRITE    | 24.5  27.5  34    60.5  172   |
|    MBCF_FIFO     | 32    32    40.5  73    210.5 |
|   MBCF_SIGNAL    | 49    52.5  60.5  93    227.5 |
----------------------------------------------------
        Peak bandwidths of MBCF/100BASE-TX (in Mbyte/sec)
------------------------------------------------------------------
|    data size (byte)     |  4     16    64   256   1024   1408  |
|-------------------------+--------------------------------------|
| MBCF_WRITE, half duplex | 0.31  1.15  4.31  8.56  11.13  11.48 |
| MBCF_WRITE, full duplex | 0.34  1.27  4.82  9.63  11.64  11.93 |
------------------------------------------------------------------
                                  Conditions
-------------------------------------------------------------------------------
|   workstation    | SPARCstation 20 (85 MHz SuperSPARC × 1)                  |
|------------------|----------------------------------------------------------|
|       NIC        | Sun Microsystems Fast Ethernet SBus Adapter 2.0          |
|------------------|----------------------------------------------------------|
|                  | SMC TigerStack 100 5324TX (non-switching 100BASE-TX HUB) |
|     network      |----------------------------------------------------------|
|                  | Bay Networks BayStack 350T (switching 100BASE-TX HUB)    |
|------------------|----------------------------------------------------------|
|OS &              | SSS-CORE Ver. 1.1          & MPI/MBCF                    |
|MPI implementation|----------------------------------------------------------|
|                  | SunOS 4.1.4                & MPICH Ver. 1.1 (using TCP)  |
-------------------------------------------------------------------------------
       Round-trip times of MPI with 100BASE-TX (in µsec)
----------------------------------------------------------------
| message size (byte)  |  0    4   16    64   256   1024  4096 |
|----------------------+---------------------------------------|
| MPI/MBCF on SSS-CORE |  71   85   85   106   168   438  1026 |
|  MPICH/TCP on SunOS  | 968  962  980  1020  1080  1255  2195 |
----------------------------------------------------------------
            Peak bandwidths of MPI with 100BASE-TX (in Mbyte/sec)
------------------------------------------------------------------------------
|  message size (byte)  |  4     16    64   256   1024   4096   16384  65536 |
|-----------------------+----------------------------------------------------|
| MPI/MBCF on SSS-CORE, | 0.14  0.53  1.82  4.72   8.08   9.72  10.15   9.78 |
|           half duplex |                                                    |
| MPI/MBCF on SSS-CORE, | 0.14  0.57  1.90  5.33  10.22  11.68  11.77  11.85 |
|           full duplex |                                                    |
| MPICH/TCP on SunOS,   | 0.02  0.09  0.35  1.27   3.54   6.04   5.59   7.00 |
|           half duplex |                                                    |
------------------------------------------------------------------------------
                                  Conditions
-------------------------------------------------------------------------------
|   workstation    | SPARCstation 20 (85 MHz SuperSPARC × 1)                  |
|------------------|----------------------------------------------------------|
|       NIC        | Sun Microsystems Fast Ethernet SBus Adapter 2.0          |
|------------------|----------------------------------------------------------|
|     network      | SMC TigerStack 100 5324TX (non-switching 100BASE-TX HUB) |
|------------------|----------------------------------------------------------|
|OS &              | SSS-CORE Ver. 1.1          & MPI/MBCF                    |
|MPI implementation|----------------------------------------------------------|
|                  | SunOS 4.1.4                & MPICH Ver. 1.1 (using TCP)  |
-------------------------------------------------------------------------------
               Execution results of the NAS Parallel Benchmarks
-------------------------------------------------------------------------------
|  program [# of nodes]   | EP[8]  MG[8]  CG[8]  IS[8]   LU[8]   SP[9]  BT[9] |
|-----------------------------------------------------------------------------|
|                            MPI/MBCF on SSS-CORE                             |
|-----------------------------------------------------------------------------|
|  execution time (sec)   | 15.14  7.48   11.02  3.02   160.36  154.91  67.30 |
| speedup ratio to 1 node | 7.99   5.24   6.27   3.33   6.26    8.11    9.16  |
|communication frequency  | 0.00   9.68   12.69  13.58  1.89    7.83    5.32  |
|              (Mbyte/sec)|                                                   |
|communication frequency  | 4      4670   2138   466    1199    421     488   |
|      (# of messages/sec)|                                                   |
|average message size     | 0.00   2.07   5.94   29.14  1.58    18.60   10.90 |
|                  (Kbyte)|                                                   |
|MBCF_WRITE               | 51.10  0.01   53.33  99.22  13.37   49.01   47.24 |
|    availability rate (%)|                                                   |
|use of                   | yes    no     no     yes    no      no      no    |
| collective communication|                                                   |
|-----------------------------------------------------------------------------|
|                             MPICH/TCP on SunOS                              |
|-----------------------------------------------------------------------------|
|  execution time (sec)   | 16.25  13.72  14.59  4.81   185.04  231.66  96.02 |
| speedup ratio to 1 node | 7.73   2.83   4.71   2.13   5.84    6.01    6.53  |
|-----------------------------------------------------------------------------|
|               MPI/MBCF on SSS-CORE versus MPICH/TCP on SunOS                |
|-----------------------------------------------------------------------------|
|performance improvement  | 1.07   1.83   1.32   1.59   1.15    1.50    1.43  |
|                    ratio|                                                   |
-------------------------------------------------------------------------------
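The derived rows of the table above can be cross-checked against the measured ones. The following Python sketch assumes that the performance improvement ratio is the MPICH/TCP execution time divided by the MPI/MBCF execution time, and that the average message size is the communication volume divided by the message frequency, with 1 Mbyte taken as 1000 Kbyte. These definitions are inferred from the numbers and are not stated explicitly here. The function names are illustrative.

```python
programs = ["EP", "MG", "CG", "IS", "LU", "SP", "BT"]

# Measured rows copied from the table above.
mbcf_time = [15.14, 7.48, 11.02, 3.02, 160.36, 154.91, 67.30]    # sec
mpich_time = [16.25, 13.72, 14.59, 4.81, 185.04, 231.66, 96.02]  # sec
mbyte_per_sec = [0.00, 9.68, 12.69, 13.58, 1.89, 7.83, 5.32]
msgs_per_sec = [4, 4670, 2138, 466, 1199, 421, 488]


def improvement_ratio(t_mpich, t_mbcf):
    """Assumed definition: how many times faster MPI/MBCF runs than MPICH/TCP."""
    return t_mpich / t_mbcf


def avg_message_kbyte(mbyte_s, msgs_s):
    """Assumed definition: volume per message, with 1 Mbyte = 1000 Kbyte."""
    return mbyte_s * 1000 / msgs_s


for name, t_mbcf, t_mpich, vol, freq in zip(
        programs, mbcf_time, mpich_time, mbyte_per_sec, msgs_per_sec):
    print(f"{name}: ratio {improvement_ratio(t_mpich, t_mbcf):.2f}, "
          f"average message {avg_message_kbyte(vol, freq):.2f} Kbyte")
```

Rounded to two decimals, both computed rows agree with the values listed in the table.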
                                  Conditions
-------------------------------------------------------------------------------
|   workstation    | SPARCstation 20 (85 MHz SuperSPARC × 1)                  |
|------------------|----------------------------------------------------------|
|       NIC        | Sun Microsystems Fast Ethernet SBus Adapter 2.0          |
|------------------|----------------------------------------------------------|
|     network      | SMC TigerStack 100 5324TX (non-switching 100BASE-TX HUB) |
|------------------|----------------------------------------------------------|
|OS &              | SSS-CORE Ver. 1.1          & modified SUNRPC 4.0         |
|RPC implementation|----------------------------------------------------------|
|                  | SunOS 4.1.4                & SUNRPC 4.0                  |
-------------------------------------------------------------------------------
Round-trip latencies of RPC with 100BASE-TX (in µsec)
-----------------------------------------------
|   data size (byte)    |    4  256  512 1024 |
|-----------------------+---------------------|
| SSS-CORE, MBCF_SIGNAL |  127  173  221  315 |
|  SSS-CORE, MBCF_FIFO  |  148  194  251  372 |
|       SunOS TCP       |  863  903  918 1033 |
-----------------------------------------------
                                Conditions
--------------------------------------------------------------------------
|  workstation   | SPARCstation 20 (85 MHz SuperSPARC × 1)               |
|----------------|-------------------------------------------------------|
|      NIC       | Sun Microsystems Fast Ethernet SBus Adapter 2.0       |
|----------------|-------------------------------------------------------|
|    network     | Bay Networks BayStack 350T (switching 100BASE-TX HUB) |
|----------------|-------------------------------------------------------|
|       OS       | SSS-CORE Ver. 1.1                                     |
|----------------|-------------------------------------------------------|
| runtime system | ADSM                                                  |
--------------------------------------------------------------------------
        Effects of optimization methods on LU-Contig (n = 512, b = 16)
------------------------------------------------------------------------------
|  optimization methods   |execution |# of consistency|# of    |amount of    |
|                         |time (sec)|      management| packets|communication|
|                         |          |           codes|        |      (Mbyte)|
|-------------------------+--------------------------------------------------|
|          None           |    28.20 |         5592 K | 5207 K |       47.73 |
|runtime packet combining |    14.35 |         5592 K | 83.5 K |      113.00 |
|static interprocedural   |     2.17 |         1.43 K | 7.73 K |        9.42 |
|   redundancy elimination|          |                |        |             |
|runtime packet combining |     2.16 |         1.43 K | 7.60 K |        9.27 |
| & static interprocedural|          |                |        |             |
|   redundancy elimination|          |                |        |             |
------------------------------------------------------------------------------
            Effects of optimization methods on Radix (#key = 1 M)
------------------------------------------------------------------------------
|  optimization methods   |execution |# of consistency|# of    |amount of    |
|                         |time (sec)|      management| packets|communication|
|                         |          |           codes|        |      (Mbyte)|
|-------------------------+--------------------------------------------------|
|          None           |    21.90 |          793 K | 3220 K |       76.72 |
|runtime packet combining |    12.13 |          793 K | 75.8 K |      101.08 |
|static interprocedural   |     1.57 |         2.08 K | 19.5 K |       13.47 |
|   redundancy elimination|          |                |        |             |
|runtime packet combining |     1.24 |         2.08 K | 10.1 K |       13.63 |
| & static interprocedural|          |                |        |             |
|   redundancy elimination|          |                |        |             |
------------------------------------------------------------------------------
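To make the effect of each optimization easier to compare, the following Python sketch derives the speedup over the unoptimized run and the average amount of data per packet from the two tables above. It assumes that "K" means 1000 and that 1 Mbyte is 10**6 bytes; the source does not state either, and the per-packet figures also depend on what is counted as communication volume. The names `results`, `speedup` and `bytes_per_packet` are illustrative.

```python
# Rows copied from the tables above: (execution time in sec, packets, Mbyte).
# Assumptions: "K" = 1000 and 1 Mbyte = 10**6 bytes.
results = {
    "LU-Contig": {
        "none": (28.20, 5207e3, 47.73),
        "packet combining": (14.35, 83.5e3, 113.00),
        "redundancy elimination": (2.17, 7.73e3, 9.42),
        "both": (2.16, 7.60e3, 9.27),
    },
    "Radix": {
        "none": (21.90, 3220e3, 76.72),
        "packet combining": (12.13, 75.8e3, 101.08),
        "redundancy elimination": (1.57, 19.5e3, 13.47),
        "both": (1.24, 10.1e3, 13.63),
    },
}


def speedup(app, method):
    """Execution time of the unoptimized run divided by that of `method`."""
    return results[app]["none"][0] / results[app][method][0]


def bytes_per_packet(app, method):
    """Average communication volume carried by one packet, in bytes."""
    _, packets, mbyte = results[app][method]
    return mbyte * 1e6 / packets


for app, rows in results.items():
    for method in rows:
        print(f"{app:9s} {method:22s} speedup {speedup(app, method):5.2f}, "
              f"{bytes_per_packet(app, method):7.1f} bytes/packet")
```

Under these assumptions, runtime packet combining alone roughly halves the execution time of LU-Contig while raising the average packet from about 9 bytes to about 1350 bytes, which is consistent with far fewer but larger packets. Combining both methods gives a speedup of about 13 on LU-Contig and about 18 on Radix.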
![[graph (17KB)]](ADSM.gif)
                                  Conditions
-------------------------------------------------------------------------------
|          |  workstation   | SPARCstation 20 (85 MHz SuperSPARC × 1)         |
|          |----------------|-------------------------------------------------|
|          |      NIC       | Sun Microsystems Fast Ethernet SBus Adapter 2.0 |
|          |----------------|-------------------------------------------------|
| SSS-CORE |    network     | Bay Networks BayStack 350T                      |
|   system |                |                      (switching 100BASE-TX HUB) |
|          |----------------|-------------------------------------------------|
|          |       OS       | SSS-CORE Ver. 1.1                               |
|          |----------------|-------------------------------------------------|
|          | runtime system | UDSM                                            |
|-----------------------------------------------------------------------------|
|          |      MPP       | Fujitsu AP1000+ (50 MHz SuperSPARC × 256)       |
| AP1000+  |----------------|-------------------------------------------------|
|   system |       OS       | Cell-OS                                         |
|          |----------------|-------------------------------------------------|
|          | runtime system | UDSM                                            |
-------------------------------------------------------------------------------
           Breakdown of execution time
--------------------------------------------------
| Sync | synchronization                         |
|------|-----------------------------------------|
|  WC  | write commitment                        |
|------|-----------------------------------------|
|  PF  | page fault handler                      |
|------|-----------------------------------------|
| Msg  | remote message handlers                 |
|------|-----------------------------------------|
| Task | execution of original application codes |
--------------------------------------------------
![[graph (6KB)]](lu.gif)
![[graph (6KB)]](radix.gif)
![[graph (17KB)]](UDSM.gif)