Sendmail X: Performance Tests and Results

SMTP Server Daemon

Remark (placed here so it doesn't get lost): there is a limited number (roughly 60000) of possible open connections to one port. Could that limit the throughput we are trying to achieve, or is such a high number of connections infeasible?

SMTP Sink

For simple performance comparisons several SMTP sinks have been implemented or tested.

Test programs are:

smtp-sink from postfix. This is an entirely event-driven program.
thrperconn: one thread per connection.
thrpool: uses a worker model with concurrency limiting, see Section 3.20.4.1.
smtps: state-threads, see Section 3.20.3.1.

Test machines are:

v-sun: Sun SPARCserver E450, 4 processors
v-bsd: FreeBSD 3.4, 2 PIII processors, 2 GB RAM
v-aix: AIX, 4 processors
schmidt: Linux 2.4, uses 15 threads per client only, otherwise the machine just "dies".

Entries in the tables below denote execution time in seconds unless otherwise noted; smaller values are better.

Tests have been performed with myslam (a multi-threaded SMTP client), using 7 to 8 client machines, 50 threads per client, and 5000 messages per client.

v-sun (8 clients):

parameters         smtp-sink  smtps  thrperconn  thrpool
1KB/msg (40MB)     45s        70s    92s         43s
4KB/msg (160MB)    49s        56s    259s        78s
32KB/msg (1280MB)  203s       208s   999s        110s
-w 1               141s       109s   156s        230s

Note: v-sun is a four-processor machine, hence the multi-threaded programs (thrpool, thrperconn) can use multiple processors. I didn't select (via an option) multiple processors for smtps though.

Just as one example, the achieved throughput in MB/s is listed in the next table. As can be seen, it is an order of magnitude lower than the sustainable throughput that can be achieved over a single connection (about 85-90MB/s measured with ttcp; this is a 100Mbit/s ethernet).

parameters         smtp-sink  smtps  thrperconn  thrpool
1KB/msg (40MB)     0.9        0.6    0.4         0.9
4KB/msg (160MB)    3.3        2.9    0.6         2.1
32KB/msg (1280MB)  6.5        6.3    -           11.9
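
The throughput figures for v-sun follow from dividing the total data volume by the elapsed time. The following is a minimal sketch of that arithmetic for the 1KB and 4KB rows only; the dictionary layout and the function name throughput_mb_per_s are illustrative assumptions, not part of the original test tools.

```python
# Elapsed times in seconds from the v-sun table (8 clients).
# The total data volume in MB is taken from the row labels.
results = {
    "1KB/msg": (40, {"smtp-sink": 45, "smtps": 70, "thrperconn": 92, "thrpool": 43}),
    "4KB/msg": (160, {"smtp-sink": 49, "smtps": 56, "thrperconn": 259, "thrpool": 78}),
}


def throughput_mb_per_s(total_mb, seconds):
    """Return the average throughput in MB/s, rounded to one decimal."""
    return round(total_mb / seconds, 1)


for label, (total_mb, times) in results.items():
    rates = {name: throughput_mb_per_s(total_mb, t) for name, t in times.items()}
    print(label, rates)
```

The rounded values agree with the 1KB and 4KB rows of the throughput table.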

v-bsd:

parameters     smtp-sink  smtps  thrperconn  thrpool
1KB msg size   97         87     380         140
4KB msg size   108        130    1150        156
32KB msg size  208        197    fails       330
-w 1           165        138    484         223

v-aix:

parameters     smtp-sink  smtps  thrperconn  thrpool
1KB msg size   38         28     -           31
4KB msg size   34         33     -           31
32KB msg size  125        125    -           125
-w 1           125        125    -           155

125 for 250/3

schmidt:

parameters     smtp-sink  smtps  thrperconn  thrpool
1KB msg size   45         44     165         74
4KB msg size   54         45     418         75
32KB msg size  217        167    fails       256
-w 1           370        360    -           337

SMTP Sink with CDB

2004-03-02

statethreads/examples/smtps3

wiz

See Section 5.2.1.1, machine 1

wiz$ time ./smtpc2 -fa@b.c -Rx@y.z -t 100 -s 1000 -r localhost

sink program FS times (s)

smtps3 - 5

smtpss UFS 17, 18

smtps3 -C UFS 16, 17, 19

perf-lab

source: s-6.perf-lab

sink: v-bsd.perf-lab

with -C

s-6.perf-lab$ time ./smtpc2 -t 100 -s 1000 -r v-bsd.perf-lab
   19.17s real     1.08s user     0.64s system

without -C

s-6.perf-lab$ time ./smtpc2 -t 100 -s 1000 -r v-bsd.perf-lab
    3.04s real     0.81s user     0.59s system

source: s-6.perf-lab

sink: mon.perf-lab (FreeBSD 4.9)

with -C

   12.05s real     1.04s user     0.67s system

without -C

    3.03s real     0.92s user     0.54s system

2004-03-04 source: s-6.perf-lab; sink: v-sun.perf-lab

with -C: 20s - 24s (UFS) Note: It takes 20s(!) to remove all CDB files:

time rm ?/S*     0m20.11s

with -C: 1s (TMPFS); 16s (UFS, /), rm: 14s; logging turned on: 16s, rm: 0.8s.

without -C: 1s

2004-03-08 source: s-6.perf-lab; sink: v-bsd;

./smtpc -t 100 -s 1000

sink program time (s)

smtpss 30

smtps3 -C 30

smtps3 3

2004-03-08 source: s-6.perf-lab; sink: v-sun;

./smtpc -t 100 -s 1000

sink program FS times (s)

smtps3 - 1

smtpss UFS 25, 30

smtps3 -C UFS 23

smtpss swap 2, 3

smtps3 -C swap 1, 2

Note: the variance for smtpss on UFS is fairly large. The lower numbers are achieved by running smtps3 -C first and then smtpss; the larger numbers are measured when the CDB files have just been removed. However, this effect was not reproducible. Note: removing those files takes about as long as a test run.

SMTP Relaying Using a Sendmail X Prototype

Test setup with a sendmail X prototype of 2002-09-04: v-aix.perf-lab running QMGR, SMTPS, and SMTPC. Relaying from localhost to v-bsd.perf-lab. Source program running on v-aix:

time ./smtp-source -s 50 -m 100 -c localhost:8000

Using the full version: 2.45s; turning fsync() off: 1.44s.

This clearly shows the need for a better CDB implementation, at least on AIX.

Same test with reversed roles (smX on v-bsd, sink on v-aix): using the full version: 7.44s; turning fsync() off: 6.20s. For comparison: using sendmail 8.12: 14.71s.

The SCSI disks on v-bsd seem to be fairly slow. Moreover, there seems to be something wrong with the OS version (it's very old: FreeBSD 3.4).

On FreeBSD 4.6 (machine 14, see Section 5.2.1.1) (source, sink, sm-9 of 2002-10-01 on the same machine):

time ./smtp-source -s 100 -m 200 -c localhost:8000

softupdates: 4.35s; without softupdates: 5.66s

time ./smtp-source -s 50 -m 100 -c localhost:8000

softupdates: 2.01s/1.93s, -U: 1.79s; without softupdates: 2.60s/2.46s, -U: 2.17s

(-U turns off fsync()).

Using sendmail 8.12.6:

time ./smtp-source -s 50 -m 100 localhost:1234

softupdates: 5.01s. This looks quite good for sendmail 8, but the result for:

time ./smtp-source -c -s 100 -m 200 localhost:1234

is: 143.12s, which is nowhere near good. The cause is the high load this generates: up to 200 concurrent sendmail processes overwhelm the machine, whereas sendmail X has at most 4 processes running.

Various Linux FS

Test date: 2003-05-25, version: smX.0.0.6, machine: PC, AMD Duron 700MHz, 512MB RAM, SuSE 8.1

Test program:

time ./smtp-source -s 50 -m 500 -fa@b.c -tx@y.z localhost:1234

FS Times msg/s (best)

JFS 4.02s, 4.23s 124

ReiserFS 4.8s 104

XFS 6.7s, 7.2s, 7.48s, 7.64s 74

EXT3 14.39s, 13.44s 34

2004-03-17 checks/t-readwrite on destiny (Linux, IDE, ext2):

parameters writes time

-s -f 1000 -p 1 - 9

-s -f 100 -p 10 - 6

The FS is mounted async (default!).

2004-03-17 checks/t-readwrite on ia64-2 (Linux, SCSI, reiserfs):

parameters writes time

-s -f 1000 -p 1 - 5.2

-s -f 100 -p 10 - 2.6

2004-03-23 source: basil.ps-lab MTA: cilantro.ps-lab (Linux 2.4.18-64GB-SMP) sink: v-sun.perf-lab

FS: ReiserFS version 3.6.25

smtpc -t 100 -s 1000

program source time sink time

smtps3 -C -

smX.0.0.12 6 5

sm8.12.11 74 74

sm8.12.11 See 1 50

postfix 2.0.18

gatling -m 100 -c 5000 -z 1 -Z 1

program writes source time source msgs/s sink time

smtps3 2 2295 -

smtps3 -C 5 962 -

smX.0.0.12 22 225 22

sm8.12.11 358 14 358

sm8.12.11 See 1 246 20 -

postfix 2.0.18

Notes:

1. Default for Linux is to have REQUIRES_DIR_FSYNC set; in this test it has been turned off. Some people claim it is safe to do that with recent Linux FSs. For some reason (timeouts?) the tests with smtpc fail in this configuration, i.e., fewer than 1000 messages are sent.
2. According to tests by Thom, sendmail 8.12 was able to relay 40 msgs/s on the same machine.

2004-03-25:

Filesystems:

1. ext3 (rw,sync,data=journal)
2. ext3 (rw,data=journal) [this means async?]
3. reiserfs (rw,noatime,data=journal,notail)
4. jfs (rw)
5. ext2 (rw,sync)

smtpc -t 100 -s 1000

program     FS  source time  sink time
smX.0.0.12  1   63           61
            1   63           63
            2   19           18
            3   5            4
            3   5            5
            5   81           80
sm8.12.11   3   45           several read errors
            5   91           92
smtps3 -C

2004-03-25: gatling -m 100 -c 5000 -z 1 -Z 1 (1KB message size)

program FS source time sink time msgs/s

smX.0.0.12 1

2 90 90 55

3 24 24 208

4 100 99 100

sm8.12.11 3 216 errors 23

gatling -m 100 -c 5000 -z 4 -Z 4 (4KB message size)

program FS source time sink time msgs/s

smX.0.0.12 1

2 92 92 54

3 141 140 35

4 168 168 29

sm8.12.11 3 226 errors 22

gatling -m 100 -c 5000 -z 16 -Z 16 (16KB message size)

program FS source time sink time msgs/s

smX.0.0.12 1

2

3 169 29

4

sm8.12.11 3 226 errors 22

Notes:

ReiserFS seems to have some optimizations for small files, hence the results for 1KB are really good, but for 4KB they are in the normal range.
Testing with sm8 usually caused several read errors on the sink side and several errors displayed by gatling.

Various FreeBSD Results

2003-11-19 sm-9.0.0.9 running on v-bsd.perf-lab (2 processors, FreeBSD 3.4)

Source on bsd.dev-lab

time ./smtp-source -d -s 100 -m 500

directly to sink: 2.16 - 2.74s (231msgs/s)

using MFS: 14.37 - 14.43s (34msgs/s) (sm8.12.10: 32s)

using FS with softupdates: 22.78 - 23.83s (21msgs/s) (sm8.12.10: 49s)

using FS without softupdates: 35.27 - 35.56s (14msgs/s)

2004-03-02 source: s-6.perf-lab; relay: mon; sink: v-bsd

time ./smtpc2 -O 10 -fa@s-6.perf-lab -Rnobody@v-bsd.perf-lab -t 100 -s 1000 -r mon.perf-lab:1234

38.26s real 1.01s user 0.88s system

2004-03-04 source: s-6.perf-lab; relay: v-bsd; sink: v-sun

options: -t 100 -s 1000

MTA source time(s) sink time

postfix 2.0.18 53 94

smX.0.0.12 69 68

without smtpc 56 -

sm8.12.11 67 67

-odq 79, 82

-odq / 100 qd 101

-odq / 10 qd 100

Note: this is FreeBSD 3.4 without softupdates and directory hashes.

getrusage(2) data:

sm8.12.11 -odq

ru_utime=        15.0158488
ru_stime=        71.0104605
ru_maxrss=     1524
ru_ixrss=   5030592
ru_idrss=   4098456
ru_isrss=   1412096
ru_minflt=   127503
ru_majflt=        0
ru_nswap=         0
ru_inblock=       0
ru_oublock=   11851
ru_msgsnd=    13000
ru_msgrcv=    10000
ru_nsignals=      0
ru_nvcsw=    617469
ru_nivcsw=    18793

sm8.12.11

ru_utime=        15.0236311
ru_stime=        62.0117941
ru_maxrss=     1520
ru_ixrss=   4573224
ru_idrss=   3676784
ru_isrss=   1283712
ru_minflt=   174619
ru_majflt=        0
ru_nswap=         0
ru_inblock=       0
ru_oublock=    4001
ru_msgsnd=    12000
ru_msgrcv=    10000
ru_nsignals=   1000
ru_nvcsw=    128074
ru_nivcsw=    14771

This looks like a problem in queue-only mode: far too much data is written, almost 3 times as much as in background delivery mode (ru_oublock 11851 versus 4001). Why does sm8 send 1000 more messages in queue-only mode?
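
Counters such as ru_oublock can be sampled before and after a piece of work to see how many block output operations it caused. The following is a minimal sketch using Python's resource module (Unix only); the helper names block_output_ops and write_files are hypothetical and unrelated to how sm8 or smX collect their statistics.

```python
import os
import resource
import tempfile


def block_output_ops(func):
    """Run func() and return the increase of ru_oublock for this process."""
    before = resource.getrusage(resource.RUSAGE_SELF).ru_oublock
    func()
    after = resource.getrusage(resource.RUSAGE_SELF).ru_oublock
    return after - before


def write_files(count=100, size=1024):
    """Write `count` small files, fsync each one, then remove them all."""
    with tempfile.TemporaryDirectory() as tmpdir:
        for i in range(count):
            path = os.path.join(tmpdir, "f%04d" % i)
            with open(path, "wb") as fp:
                fp.write(b"x" * size)
                fp.flush()
                os.fsync(fp.fileno())


delta = block_output_ops(write_files)
print("ru_oublock increased by", delta)
```

The value depends on the operating system and file system; on a memory-backed file system it may well be 0.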

2004-03-05 source, relay, sink: wiz (FreeBSD 4.8)

options: -t 100 -s 1000

source: 34s, sink: 32s

turn off smtpc: source: 31s, 34s

2004-03-26 source: v-6.perf-lab running smtpc -t 100 -s 5000; relay: v-bsd.perf-lab; sink: v-sun.perf-lab

sink runs smtps2 -R n with varying values for n

n source time requests served

0 108 5000

8000000 115 5060

58000000 140 5450

88000000 151 5620

put defedb on a RAM disk:

n source time requests served

0 108 5000

8000000

58000000 111 5453

88000000 114 5693

Obviously the additional disk I/O traffic created by having to use DEFEDB is slowing down the system.

FreeBSD 4.9, Softupdates, and fsync()

2004-06-23 Upgraded v-bsd.perf-lab to FreeBSD 4.9 (2 processors), using softupdates.

source on v-sun, sink on s-6:

time ./smtpc2 -O 10 -t 100 -s 1000 -r v-bsd.perf-lab:1234

43s

turn off fsync(): (smtps -U, must be compiled with -DTESTING)

32s

Disk I/O On FreeBSD

A modified iostat(8) program is used to show the number of bytes written and read, and the number of read, write, and other disk I/O operations.

The following tests were performed: sink (smtps3) on v-bsd.perf-lab, source (smtpc) on s-6.perf-lab sending 1000 mails. All numbers for write operations are rounded; if there are numbers in parentheses then those denote the value of ru_oublock (getrusage(2)) for smtps/qmgr or sm8. If two times are given (separated by /) then the second time denotes the output (elapsed time) for the sink.

program softupdates? writes reads time

smtps3 -C yes 2200 - 14

smtps3 -C no 2900 - 30

smX.0.0.12, no sched (see 1) yes 5200 - 34

smX.0.0.12, no sched yes -

smX.0.0.12, no sched no -

smX.0.0.12 (see 2) yes 3500 (2000/1300) 4 33

yes 3370 (2020/1270) 4 30/29

-O i=1000000 yes 2660 (1850/660) 0 25/24

smX.0.0.12 no 6300 (3000/3200) 0 52

smX.0.0.12 (see 4) yes 3500 (2200/1200) 4 25

sm8.12.11 -odq SS=m yes 1800 - 41

sm8.12.11 -odq SS=m no 12200 - 72

sm8.12.11 SS=m (see 3) yes 236 (164) 0 61

yes 370 (218) 0 60

sm8.12.11 no 8100 (4100) 1 63

sm8.12.11 SS=t yes 7400 0 70

postfix 2.0.18 yes 2900 16 21/26

Notes:

1. Question: why does smX.0.0.12 use so many write operations? 5200 is way too much. Answer: qmgr committed IBDB more than 1000 times (footnote 5.1); increasing the maximum time to acknowledge an SMTPS transaction from 100 µs to 10000 µs reduces the number of commits to 165.
2. Question: why does qmgr still (after increasing the time between commits) perform so many write operations?
3. Question: why does sm8 use so few writes? Can softupdates eliminate or cluster most writes? Why doesn't this work for smX? Solution: SuperSafe was set to m, not to true.
4. If IBDB and CDB are on different partitions, the performance increases significantly (about 25 per cent faster).

2004-03-23 source: basil.ps-lab MTA: wasabi.ps-lab (FreeBSD 4.9, machine 16 in Section 5.2.1.1) sink: v-sun.perf-lab

smtpc -t 100 -s 1000

program writes reads source time sink time

smtps3 -C 2400 - 11 -

smX.0.0.12 2600 5 15 13

sm8.12.11 6000 1 35

postfix 2.0.18 2800 15 14 20

Note: the source time for postfix is shorter than the time for smX because smX emptied the queue during the run while postfix had more than 700 entries in the mail queue after the source finished sending all mails. This can be seen by looking at the sink time, which is noticeably larger for postfix than for sendmail X.

Using gatling:

Max random envelope rcpts:  1
Connections:                100
Max msgs/conn:              Unlimited
Messages:                   Fixed size 1 Kbytes
Desired Message Rate:       Unlimited
Total messages:             5000

Total test elapsed time: 73.571 seconds (1:13.570)
Overall message rate: 67.962 msg/sec
Peak rate: 100.000 msg/sec

gatling -m 100 -c 5000 -z 1 -Z 1

program writes source time source msgs/s sink time

smtps3 0 5 980 -

smtps3 -C 11750 53 93

smX.0.0.12 73 67 71

smX.0.0.12 11157 (8000/2700) 70 71 69

sm8.12.11 136 36

postfix 2.0.18 60 83 78

postfix 2.0.18 12635 58 85 75

2004-03-16 results for wiz: source: time ./smtpc -s 1000 -t 100 -r localhost:1234; sink: smtps3, file system: UFS, softupdates

parameters oublock writes source time sink time

-C -i 1920 ? 17 16

-C -p 1 1860 ? 17 17

-C -p 1 1940 2700 16 15

-C -p 1 1970 2770 16 15

-C -p 2 ? 15 ?

-C -p 2 877+966 2600 15 ?

-C -p 4 455+476+432+472 2640 15 ?

New option: -f for flat, i.e., instead of using 16 subdirectories for CDB files, a single directory is used. Even though this does not cause a noticeable difference in run time, the number of I/O operations is reduced.

parameters oublock writes source time

-C -p 2 915+920 2600 14

-C -p 2 -f 600+610 2200 14
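
The difference between the flat layout and the 16-subdirectory layout can be sketched as a path-mapping function. This is only an illustration: the function name cdb_path, the single hex digit directory names, and the rule for picking a subdirectory are assumptions, not the scheme smtps3 actually uses.

```python
import os


def cdb_path(base, name, flat=False, subdirs=16):
    """Return the path of a CDB file, either flat or in one of `subdirs` directories."""
    if flat:
        # Flat layout: every file lives directly in the base directory.
        return os.path.join(base, name)
    # Hypothetical placement rule: sum the bytes of the name and
    # reduce the result modulo the number of subdirectories.
    bucket = sum(name.encode()) % subdirs
    return os.path.join(base, "%x" % bucket, name)


print(cdb_path("/var/spool/cdb", "S0001", flat=True))
print(cdb_path("/var/spool/cdb", "S0001"))
```

With the flat layout only one directory has to be updated per file, which is one plausible reason for the lower number of I/O operations.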

2004-03-16 source: s-6.perf-lab, time ./smtpc -s 1000 -t 100 -r localhost:1234; sink: v-bsd.perf-lab, smtps3, file system: UFS, softupdates

parameters oublock writes source time sink time

-C -i 1430 2165 12 11

1550 2300 14 13

-C -p 1 1500 2500 14 12

-C -p 2 1100+620 2500 13 -

800+770 2320 13 -

-C -p 4 530+350+540+470 2600 13 -

Note: some of the write operations might be from softupdates due to the previous rm command (removing the CDB files).

2004-03-17 checks/t-readwrite on v-bsd (FreeBSD 4.9, SCSI):

parameters softupdates oublock writes time

-s -f 1000 -p 1 yes 4000 4000 22

-s -f 100 -p 10 yes 2575 2579 14

-s -f 1000 -p 1 no 4050 4050 28

-s -f 100 -p 10 no 4050 4050 27

-p specifies the number of processes to start, -f specifies the number of files to write per process. The test cases above write 1000 files with either 1 or 10 processes. As can be seen, it is significantly more efficient to use 10 processes if softupdates are turned on.

2004-03-17 checks/t-readwrite on wiz (FreeBSD 4.8, IDE):

parameters softupdates oublock writes time

-s -f 1000 -p 1 yes 3000 3800 13

-s -f 100 -p 10 yes 2860 3600 13

In this case no difference can be seen, which is most likely a result of using an IDE drive with write-caching turned on (default).

Various SunOS 5 Results

2003-11-21 sm-9.0.0.9 running on v-sun.perf-lab

Source on bsd.dev-lab

time ./smtp-source -d -s 100 -m 5000 -c

using FS: 301.90 - 305.02s (16msgs/s)

using swap: 77.98 - 78.55s (64msgs/s)

Those tests ran only 32 SMTPS threads (the machine has 4 CPUs, hence the specified limit of 128 was divided by 4). Using 128 SMTPS threads (by forcing a single process, which was used anyway because SMTPS is run with the interactive option, which does not start background processes):

time ./smtp-source -d -s 100 -m 50000 -c

using swap: 727.73s (68msgs/s)

2004-03-09 sm-9.0.0.12 running on v-sun.perf-lab

time ./smtpc -O 20 -fa@s-6.perf-lab.sendmail.com -Rnobody@v-bsd.perf-lab.sendmail.com -t 100 -s 1000 -r v-sun.perf-lab.sendmail.com:1234

MTA options FS source time(s) sink time(s)

full MTS SWAPFS 16 14

without sched SWAPFS 10 -

smtpss SWAPFS 3 -

full MTS UFS 64, 65, 64 75, 70, 69

8.12.11 SWAPFS 16 19

8.12.11 UFS 141 138

Note: smX using UFS runs into connection limitations: QMGR believes there are 100 open connections even though the sink shows at most 18. This seems to be a communication latency between SMTPC and QMGR (and needs to be investigated further).

2004-03-17 checks/t-readwrite on v-sun (SunOS 5.8, SCSI):

parameters writes time

-s -f 1000 -p 1 - 39

-s -f 100 -p 10 - 37

On the SunOS 5.8 filesystem it makes no difference whether 1 or 10 processes are used.

Various OpenBSD Results

2004-03-05 source, relay, and sink on zardoc (OpenBSD 3.2)

test with logging via smioout

zardoc$ time ./smtpc2 -O 10 -s 1000 -t 100 -r localhost:1234
   24.17s real     0.94s user     2.57s system

smtps3 stats:

elapsed                    26
Thread limits (min/max)    8/256
Waiting threads            8
Max busy threads           3
Requests served            1000

Note that there were only 3 active threads, which means the client is not busy at all. Another test shows elapsed=23s, max busy threads=21, so the result isn't deterministic (the machine was running as a normal SMTP server etc. during the tests).

test with logging via smioerr: smtpc2: 24.53s; no difference.

Various AIX Results

2004-03-17 checks/t-readwrite on aix-3 (AIX 4.3, SCSI, jfs):

parameters writes time

-s -f 1000 -p 1 - 30

-s -f 100 -p 10 - 29

No (noticeable) difference.

Implementation of Queues and Caches

Filesystem Performance

Here are some results of a simple test program which creates and deletes a number of files and optionally renames them twice while doing so.

Notice: unless mentioned otherwise, all measurements are accurate to at most one second. Repeated tests will most likely show (slightly) different results. These tests are only listed to give an idea of the magnitude of available performance.
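
The kind of test described above (create files, optionally rename them twice, delete them) can be sketched as follows. This is not the original test program; the function name run_test, the 100-byte payload, and the file naming are assumptions made for illustration only.

```python
import os
import tempfile
import time


def run_test(operations, use_fsync=True, rename=False):
    """Create, optionally rename twice, and delete `operations` files.

    Returns the elapsed wall-clock time in seconds.
    """
    start = time.time()
    with tempfile.TemporaryDirectory() as tmpdir:
        for i in range(operations):
            path = os.path.join(tmpdir, "t%06d" % i)
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
            try:
                os.write(fd, b"x" * 100)
                if use_fsync:
                    # Ask the OS to push the file data to stable storage.
                    os.fsync(fd)
            finally:
                os.close(fd)
            if rename:
                first = path + ".a"
                second = path + ".b"
                os.rename(path, first)
                os.rename(first, second)
                path = second
            os.unlink(path)
    return time.time() - start


for rename in (False, True):
    elapsed = run_test(200, use_fsync=True, rename=rename)
    print("rename=%s: %.2fs" % (rename, elapsed))
```

Running it with and without use_fsync on the same file system gives a rough impression of how much of the elapsed time is spent waiting for the disk.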

Test Systems

The involved systems are:

1. PC, Pentium III, 500MHz, 256MB RAM, FreeBSD 3.2,

wdc0: unit 0 (wd0): <FUJITSU MPD3064AT>
wd0: 6187MB (12672450 sectors), 13410 cyls, 15 heads, 63 S/T, 512 B/S

2. PC, AMD K6-2, 450MHz, 220MB RAM, OpenBSD 2.8,

(a)
wd0 at pciide0 channel 0 drive 0: <IBM-DJNA-351010>
wd0: can use 32-bit, PIO mode 4, DMA mode 2, Ultra-DMA mode 4
wd0: 16-sector PIO, LBA, 9671MB, 16383 cyl, 16 head, 63 sec, 19807200 sectors

(b)
wd1 at pciide0 channel 0 drive 1: <Maxtor 98196H8>,
wd1: can use 32-bit, PIO mode 4, DMA mode 2, Ultra-DMA mode 4,
wd1: 16-sector PIO, LBA, 78167MB, 16383 cyl, 16 head, 63 sec, 160086528 sectors

3. PC, Pentium III, 500MHz, 256MB RAM, FreeBSD 4.4-STABLE,

ad0: 6187MB <FUJITSU MPC3064AT> [13410/15/63] at ata0-master UDMA33

4. PC, AMD-K7, 500MHz, FreeBSD 4.4-STABLE, 332MB RAM,

(a)
ahc0: <Adaptec 2940 Ultra2 SCSI adapter (OEM)>
da0: <IBM DNES-309170W SA30> Fixed Direct Access SCSI-3 device
da0: 40.000MB/s transfers (20.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da0: 8748MB (17916240 512 byte sectors: 255H 63S/T 1115C)

i. SCSI with softupdates
ii. SCSI without softupdates

(b)
ad0: 8063MB <FUJITSU MPD3084AT> [16383/16/63] at ata0-master UDMA66

softupdates

5. PC, Linux 2.2.12,

hda: IBM-DJNA-370910, 8693MB w/1966kB Cache, CHS=1108/255/63

ext2 FS

6. PC, Linux 2.4.7,

hda: 39102336 sectors (20020 MB) w/2048KiB Cache, CHS=2434/255/63, UDMA(66)
reiserfs: using 3.5.x disk format
ReiserFS version 3.6.25

7. DEC Digital Alpha, OSF/1, SCSI disk?
8. Sun SPARC, SunOS 5.6, SCSI disk?
9. Sun SPARC, SunOS 5.7, SCSI disk?
(a) mount options: no logging, atime
(b) mount options: logging, atime
(c) mount options: logging, noatime
10. Sun SPARC E450, 4 CPUs,
(a) Baydel RAID
(b) SCSI disk
11. AIX 4.3.3, using JFS (default).

12. PC, AMD K7, 1000MHz, 512MB RAM, SuSE 7.3, kernel 2.4.10

WD1200BB
hdg: 234441648 sectors (120034 MB) w/2048KiB Cache, CHS=232581/16/63, UDMA(100)

(a) /home jfs
(b) /opt reiserfs
(c) /work ext3

13. HP-UX 11.00

14. PC, Pentium II, 360MHz, 512MB RAM, FreeBSD 4.6,

ad0: 8693MB <IBM-DJNA-370910> [17662/16/63] at ata0-master UDMA33
acd0: CDROM <CD-ROM 40X> at ata1-master PIO4

15. Intel IA64, 4 CPUs, 1GB RAM

scsi0 : ioc0: LSI53C1030, FwRev=01000000h, Ports=1, MaxQ=255, IRQ=52
  Vendor: MAXTOR    Model: ATLASU320_18_SCA  Rev: B120
  Type:   Direct-Access                      ANSI SCSI revision: 03
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
SCSI device sda: 35916548 512-byte hdwr sectors (18389 MB)
reiserfs: found format "3.6" with standard journal
reiserfs: using ordered data mode
Using r5 hash to sort names

16. Intel Pentium III, 650 MHz, 256MB RAM

da0 at ahc0 bus 0 target 0 lun 0
da0: <SEAGATE ST39175LW 0001> Fixed Direct Access SCSI-2 device
da0: 80.000MB/s transfers (40.000MHz, offset 15, 16bit), Tagged Queueing Enabled
da0: 8683MB (17783240 512 byte sectors: 255H 63S/T 1106C)

17. PC, Pentium III, 450MHz, 256MB RAM, FreeBSD 4.8, softupdates

ad0: 6187MB <FUJITSU MPD3064AT> [13410/15/63] at ata0-master UDMA33

18. PC, FreeBSD 4.10, softupdates

da3 at ahc0 bus 0 target 4 lun 0
da3: <IBM DNES-309170Y SA30> Fixed Direct Access SCSI-3 device
da3: 40.000MB/s transfers (20.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da3: 8748MB (17916240 512 byte sectors: 255H 63S/T 1115C)

19. PC, VIA C3, 667MHz, 256MB RAM, OpenBSD 3.2,

(a)
wd0 at pciide0 channel 0 drive 0: <IBM-DJNA-371350>
wd0: 16-sector PIO, LBA, 12949MB, 16383 cyl, 16 head, 63 sec, 26520480 sectors

(b)
wd1 at pciide0 channel 0 drive 1: <WDC WD1200BB-53CAA0>
wd1: 16-sector PIO, LBA, 114473MB, 16383 cyl, 16 head, 63 sec, 234441648 sectors

(c)
wd2 at pciide1 channel 0 drive 0: <Maxtor 6Y160P0>
wd2: 16-sector PIO, LBA48, 156334MB, 16383 cyl, 16 head, 63 sec, 320173056 sectors
wd2(pciide1:0:0): using PIO mode 4, Ultra-DMA mode 6

Meta Data Operations

In this section, some simple test programs are used that create some files, perform (sequential) read/write operations on them and remove them afterwards.

Entries in the following table are elapsed time in seconds (except for the first column which obviously refers to the machine description above). The program that has been used to produce these results is fsperf1.c.

machine 5000 100 -c 5000 100 -c -r 5000 100

1 50 49 48

1 42 48 51

2a 3 7 10

about 2200 tps about 1500 tps

2b 11 21

3 10 34 34

about 500 tps

4(a)i 126 125

4(a)ii 208 454

4b 43 48

7 7 13 16

5 9 8 9

8 133 201 603

9a 52 665

10a 9 9 12

11 89 139 233

Comments:

2a is probably so fast due to a disk cache, i.e., the data is probably not really written to disk (even though fsync() is used). The same might hold for 5. In case of 10a the RAID system has a (battery backed) RAM disk cache, which gives similar results without the risk of losing data in case of a power loss.
8 is unacceptably slow...

(2004-07-14) With and without fsync(2) (-S)

common parameters   machine   -c    -c -r   -S -c   -S -c -r
(5000 100)          17        42    42      2       3
                    10b       165   496     165     495
                    18        83    83      5       8
                    19a       8     7       1       3
                    19b       8     9       1       3
                    19c       7     9       1       2
(-s 32 5000 100)    17        109   109     8       9
                    10b       250   537     207     498
                    18        114   113     14      16
                    19b       87    81      3       5
                    19c       26    26      4       5

Comments:

Sun's FS is unbelievably slow. Moreover, it doesn't make a difference whether fsync(2) is used or not. That means the FS does not have any optimization like softupdates or journalling FSs have.
FreeBSD softupdates can optimize renaming in this test.
IDE is faster than SCSI for small sizes because of the cache; it most likely "cheats" (write cache is not disabled, hence the data may not really be on disk even if fsync(2) is used).

Next version: allow for hashing (00 - 99, up to two levels). Use enough files to defeat the (2MB) cache of IDE disks.

machine -h 1 -c 1000 1000 -h 1 -c -r 1000 1000

1 18 18

2a 24 24

2b 7 9

3 14 14

4(a)i 23 23

4(a)ii 33 77

4b 25 49

5 3 2

7 3 4

8 58 163

9a 51 139

11 28 48

Comments:

The fact that there is no difference between the two tests for 2a might again be attributed to the disk cache. The same effect can be observed for 1, 3, and 5.
The result for 4(a)i (softupdates) indicates that softupdates eliminates the two consecutive rename(2)s since no fsync(2) is issued in between. The results without softupdates 4(a)ii reflect this.
8, 9a, and 11 show the normal filesystem (UFS) behaviour.
7 is extremely fast.

Meta Data Operations: Existing Files

Next version fsperf1.c: allow for hashing (00 - 99, up to two levels). Use enough files to defeat the (2MB) cache of IDE disks. The parameters for the following table are 1000 operations and 1000 files, hence each file is used once. Additional parameters are listed in the heading. c: create, h 1: one level hashing, r: rename file twice, p: populate directories before test, then just reuse the files.

machine -h 1 -c -h 1 -c -r -p -h 1 -c -p -h 1 -c -r

1 32 31 18 17

2a 18 18 9 10

2b 10 10 8 10

5 2 1 2 1

6 2 2 4 4

7 2 4 2 3

8 58 165 78 178

9a 27 127 33 131

9c 13 51 37 55

11 28 48 28 48

Comments:

5 must be cheating (ext2, async).
Why are populated directories slower on 8?
Using logging (and noatime) on 9 makes rename() more than two times faster.

Writing a Logfile

Another test program (fsseq1.c) writes lines to a file and uses fsync(2) after a specified number (-C parameter).

20000 entries (10000 entries each for received/delivered, total 490000 bytes).

machine - -C 100 -C 50 -C 10 -C 5 -C 2 -f

1 1 4 6 17 32 78 150

2a 0 2 2 5 5 9 18

2b 1 0 1 3 4 10 20

3 1 2 3 9 16 37 68

5 1 1 2 6 12 27 56

7 0 4 8 39 79 198 410

8 1 7 13 60 120 299 598

9a 1 8 13 15 62 90 140

11 0 6 12 53 106 262 518

This clearly demonstrates the need for group commits. However, the program requires a lot of CPU since each line is generated by snprintf(). Hence the full I/O speed may not be reached. To confirm this, another program (fsseq2.c) is used that just writes a buffer with a fixed content to a file.
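
The idea of a group commit, i.e., calling fsync(2) once per group of entries instead of once per entry, can be sketched as follows. This is a minimal illustration and not the test program itself; the function name write_log, the line format, and the parameter commit_every (playing the role of the -C value) are assumptions.

```python
import os
import tempfile


def write_log(path, entries, commit_every):
    """Write `entries` lines to `path`, calling fsync after every `commit_every` lines.

    Returns the number of fsync calls made.
    """
    syncs = 0
    with open(path, "w") as fp:
        for i in range(entries):
            fp.write("entry %08d received\n" % i)
            if (i + 1) % commit_every == 0:
                # Flush the user-space buffer first, then commit the group.
                fp.flush()
                os.fsync(fp.fileno())
                syncs += 1
        if entries % commit_every != 0:
            # Commit the final partial group.
            fp.flush()
            os.fsync(fp.fileno())
            syncs += 1
    return syncs


with tempfile.TemporaryDirectory() as tmpdir:
    for group in (1, 10, 100):
        n = write_log(os.path.join(tmpdir, "log"), 1000, group)
        print("commit every %3d lines: %4d fsync calls" % (group, n))
```

The number of fsync calls drops in proportion to the group size, which is consistent with the run times in the table falling as the -C value grows.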

The following table lists the results for group commits (C) together with various buffer sizes (256, 1024, 4096, 8192, and 16384). As usual the entries are execution time in seconds. The program writes 2000 records in total, e.g., for size 16384 that is 31MB data.

machine C 256 1024 4096 8192 16384

5 1 4 5 10 20 34

2 2 4 6 12 22

5 1 2 5 7 15

10 1 1 3 6 12

50 1 0 3 5 10

100 0 1 3 5 10

7 1 1 5 20 40 44

2 1 5 11 23 29

5 1 5 9 12 13

10 1 2 3 6 7

50 0 1 1 2 3

100 0 1 1 1 3

8 1 3 10 45 95 109

2 2 11 23 52 59

5 3 11 19 24 32

10 2 5 6 15 21

50 1 2 3 8 13

100 0 1 3 6 13

9a 1 3 12 34 35 58

2 3 12 18 53 53

5 3 6 21 23 24

10 3 5 6 13 14

50 1 2 2 5 7

100 1 1 2 3 6

11 1 21 35 77 83 92

2 13 26 38 45 50

5 8 13 17 20 24

10 5 6 10 11 15

50 1 2 2 4 7

100 1 1 2 3 6

Comments:

7 is able to sustain about 10MB/s write rate, 9a reaches about 5MB/s, just like 11.
5 achieves about 3MB/s.

Yet another program (fsseq3.c) uses write() instead of fwrite(). This time the tests write 40000KB each, which makes it simpler to determine the throughput.

Note: as usual, these times are not very accurate (1s resolution), and hence the rate is inaccurate too. Machines:

1

C s records time KB/s

1 512 80000 1365 29

1 1024 40000 734 54

1 2048 20000 451 88

1 4096 10000 352 113

1 8192 5000 250 160

2 512 80000 736 54

2 1024 40000 453 88

2 2048 20000 354 112

2 4096 10000 382 104

2 8192 5000 225 177

5 512 80000 638 62

5 1024 40000 585 68

5 2048 20000 312 128

5 4096 10000 187 213

5 8192 5000 101 396

10 512 80000 561 71

10 1024 40000 296 135

10 2048 20000 161 248

10 4096 10000 88 454

10 8192 5000 60 666

50 512 80000 128 312

50 1024 40000 70 571

50 2048 20000 41 975

50 4096 10000 34 1176

50 8192 5000 29 1379

100 512 80000 73 547

100 1024 40000 43 930

100 2048 20000 33 1212

100 4096 10000 28 1428

100 8192 5000 27 1481

2b

C s records time KB/s

1 512 80000 165 242

1 1024 40000 90 444

1 2048 20000 54 740

1 4096 10000 28 1428

1 8192 5000 16 2500

2 512 80000 94 425

2 1024 40000 52 769

2 2048 20000 30 1333

2 4096 10000 17 2352

2 8192 5000 11 3636

5 512 80000 54 740

5 1024 40000 33 1212

5 2048 20000 19 2105

5 4096 10000 11 3636

5 8192 5000 8 5000

10 512 80000 31 1290

10 1024 40000 18 2222

10 2048 20000 11 3636

10 4096 10000 8 5000

10 8192 5000 6 6666

50 512 80000 11 3636

50 1024 40000 8 5000

50 2048 20000 6 6666

50 4096 10000 5 8000

50 8192 5000 4 10000

100 512 80000 10 4000

100 1024 40000 8 5000

100 2048 20000 5 8000

100 4096 10000 4 10000

100 8192 5000 5 8000

5

C s records time KB/s

1 512 80000 13440 2

1 1024 40000 6790 5

1 2048 20000 3451 11

1 4096 10000 1779 22

1 8192 5000 1007 39

2 512 80000 6790 5

2 1024 40000 3439 11

2 2048 20000 1763 22

2 4096 10000 909 44

2 8192 5000 471 84

5 512 80000 2763 14

5 1024 40000 1414 28

5 2048 20000 739 54

5 4096 10000 383 104

5 8192 5000 208 192

10 512 80000 1414 28

10 1024 40000 731 54

10 2048 20000 384 104

10 4096 10000 208 192

10 8192 5000 120 333

50 512 80000 312 128

50 1024 40000 174 229

50 2048 20000 101 396

50 4096 10000 64 625

50 8192 5000 46 869

100 512 80000 171 233

100 1024 40000 100 400

100 2048 20000 64 625

100 4096 10000 46 869

100 8192 5000 37 1081

6

C s records time KB/s

1 512 80000 130 307

1 1024 40000 93 430

1 2048 20000 78 512

1 4096 10000 23 1739

1 8192 5000 12 3333

2 512 80000 62 645

2 1024 40000 46 869

2 2048 20000 24 1666

2 4096 10000 13 3076

2 8192 5000 15 2666

5 512 80000 66 606

5 1024 40000 31 1290

5 2048 20000 18 2222

5 4096 10000 15 2666

5 8192 5000 10 4000

10 512 80000 28 1428

10 1024 40000 19 2105

10 2048 20000 13 3076

10 4096 10000 10 4000

10 8192 5000 10 4000

50 512 80000 14 2857

50 1024 40000 10 4000

50 2048 20000 10 4000

50 4096 10000 9 4444

50 8192 5000 7 5714

100 512 80000 11 3636

100 1024 40000 10 4000

100 2048 20000 8 5000

100 4096 10000 8 5000

100 8192 5000 8 5000

7

C s records time KB/s

1 512 80000 3347 11

1 1024 40000 1689 23

1 2048 20000 845 47

1 4096 10000 418 95

1 8192 5000 192 208

2 512 80000 1243 32

2 1024 40000 796 50

2 2048 20000 431 92

2 4096 10000 222 180

2 8192 5000 122 327

5 512 80000 655 61

5 1024 40000 268 149

5 2048 20000 161 248

5 4096 10000 108 370

5 8192 5000 58 689

10 512 80000 355 112

10 1024 40000 185 216

10 2048 20000 85 470

10 4096 10000 42 952

10 8192 5000 38 1052

50 512 80000 88 454

50 1024 40000 49 816

50 2048 20000 31 1290

50 4096 10000 18 2222

50 8192 5000 10 4000

100 512 80000 45 888

100 1024 40000 33 1212

100 2048 20000 19 2105

100 4096 10000 14 2857

100 8192 5000 14 2857

8

C s records time KB/s

1 512 80000 6302 6

1 1024 40000 3220 12

1 2048 20000 1695 23

1 4096 10000 949 42

1 8192 5000 552 72

2 512 80000 3183 12

2 1024 40000 1708 23

2 2048 20000 950 42

2 4096 10000 484 82

2 8192 5000 299 133

5 512 80000 1402 28

5 1024 40000 805 49

5 2048 20000 440 90

5 4096 10000 252 158

5 8192 5000 137 291

10 512 80000 783 51

10 1024 40000 395 101

10 2048 20000 211 189

10 4096 10000 122 327

10 8192 5000 87 459

50 512 80000 181 220

50 1024 40000 107 373

50 2048 20000 68 588

50 4096 10000 49 816

50 8192 5000 42 952

100 512 80000 111 360

100 1024 40000 70 571

100 2048 20000 50 800

100 4096 10000 40 1000

100 8192 5000 36 1111

9a

C s records time KB/s

1 512 80000 2638 15

1 1024 40000 1419 28

1 2048 20000 753 53

1 4096 10000 442 90

1 8192 5000 221 180

2 512 80000 1379 29

2 1024 40000 774 51

2 2048 20000 409 97

2 4096 10000 220 181

2 8192 5000 124 322

5 512 80000 644 62

5 1024 40000 382 104

5 2048 20000 198 202

5 4096 10000 105 380

5 8192 5000 58 689

10 512 80000 355 112

10 1024 40000 196 204

10 2048 20000 104 384

10 4096 10000 59 677

10 8192 5000 32 1250

50 512 80000 90 444

50 1024 40000 51 784

50 2048 20000 28 1428

50 4096 10000 19 2105

50 8192 5000 15 2666

100 512 80000 54 740

100 1024 40000 28 1428

100 2048 20000 20 2000

100 4096 10000 15 2666

100 8192 5000 14 2857

9b

C s records time KB/s

1 512 80000 2642 15

1 1024 40000 1312 30

1 2048 20000 723 55

1 4096 10000 376 106

1 8192 5000 185 216

2 512 80000 1363 29

2 1024 40000 699 57

2 2048 20000 359 111

2 4096 10000 185 216

2 8192 5000 104 384

5 512 80000 563 71

5 1024 40000 302 132

5 2048 20000 162 246

5 4096 10000 88 454

5 8192 5000 46 869

10 512 80000 299 133

10 1024 40000 161 248

10 2048 20000 87 459

10 4096 10000 46 869

10 8192 5000 24 1666

50 512 80000 81 493

50 1024 40000 44 909

50 2048 20000 35 1142

50 4096 10000 19 2105

50 8192 5000 13 3076

100 512 80000 51 784

100 1024 40000 35 1142

100 2048 20000 26 1538

100 4096 10000 15 2666

100 8192 5000 13 3076

9c

C s records time KB/s

1 512 80000 2576 15

1 1024 40000 1326 30

1 2048 20000 707 56

1 4096 10000 377 106

1 8192 5000 192 208

2 512 80000 1324 30

2 1024 40000 685 58

2 2048 20000 349 114

2 4096 10000 187 213

2 8192 5000 107 373

5 512 80000 578 69

5 1024 40000 313 127

5 2048 20000 163 245

5 4096 10000 89 449

5 8192 5000 46 869

10 512 80000 306 130

10 1024 40000 162 246

10 2048 20000 86 465

10 4096 10000 46 869

10 8192 5000 25 1600

50 512 80000 82 487

50 1024 40000 44 909

50 2048 20000 33 1212

50 4096 10000 19 2105

50 8192 5000 13 3076

100 512 80000 52 769

100 1024 40000 36 1111

100 2048 20000 25 1600

100 4096 10000 16 2500

100 8192 5000 13 3076

12a

C s records time KB/s

1 512 80000 65 615

1 1024 40000 61 655

1 2048 20000 59 677

1 4096 10000 5 8000

1 8192 5000 4 10000

2 512 80000 13 3076

2 1024 40000 8 5000

2 2048 20000 4 10000

2 4096 10000 4 10000

2 8192 5000 3 13333

5 512 80000 44 909

5 1024 40000 21 1904

5 2048 20000 13 3076

5 4096 10000 3 13333

5 8192 5000 3 13333

10 512 80000 12 3333

10 1024 40000 3 13333

10 2048 20000 3 13333

10 4096 10000 3 13333

10 8192 5000 5 8000

50 512 80000 11 3636

50 1024 40000 3 13333

50 2048 20000 5 8000

50 4096 10000 5 8000

50 8192 5000 4 10000

100 512 80000 5 8000

100 1024 40000 5 8000

100 2048 20000 5 8000

100 4096 10000 4 10000

100 8192 5000 3 13333

12b

C s records time KB/s

1 512 80000 124 322

1 1024 40000 87 459

1 2048 20000 72 555

1 4096 10000 20 2000

1 8192 5000 10 4000

2 512 80000 47 851

2 1024 40000 32 1250

2 2048 20000 16 2500

2 4096 10000 8 5000

2 8192 5000 5 8000

5 512 80000 56 714

5 1024 40000 27 1481

5 2048 20000 20 2000

5 4096 10000 5 8000

5 8192 5000 5 8000

10 512 80000 23 1739

10 1024 40000 17 2352

10 2048 20000 6 6666

10 4096 10000 3 13333

10 8192 5000 6 6666

50 512 80000 7 5714

50 1024 40000 4 10000

50 2048 20000 6 6666

50 4096 10000 6 6666

50 8192 5000 4 10000

100 512 80000 7 5714

100 1024 40000 6 6666

100 2048 20000 5 8000

100 4096 10000 4 10000

100 8192 5000 3 13333

12c

C s records time KB/s

1 512 80000 205 195

1 1024 40000 144 277

1 2048 20000 122 327

1 4096 10000 14 2857

1 8192 5000 7 5714

2 512 80000 34 1176

2 1024 40000 22 1818

2 2048 20000 13 3076

2 4096 10000 7 5714

2 8192 5000 5 8000

5 512 80000 96 416

5 1024 40000 48 833

5 2048 20000 20 2000

5 4096 10000 4 10000

5 8192 5000 4 10000

10 512 80000 36 1111

10 1024 40000 7 5714

10 2048 20000 5 8000

10 4096 10000 4 10000

10 8192 5000 3 13333

50 512 80000 12 3333

50 1024 40000 4 10000

50 2048 20000 4 10000

50 4096 10000 3 13333

50 8192 5000 3 13333

100 512 80000 7 5714

100 1024 40000 6 6666

100 2048 20000 3 13333

100 4096 10000 3 13333

100 8192 5000 3 13333

Raw Throughput

Very simple measurement of transfer rate:

time dd ibs=8192 if=/dev/zero obs=8192 count=5120 of=incq

machine s MB/s

1 11.6 3.6

2a 4.8 8.4

2b 1.9 20.9

5 10.83 3.9

6 0.65 61

7 1.0 40.0

8 14.8 2.8

9 6.3 6.6

11 6.98 6.0

12a 0.247 161

12b 0.401 99

12c 0.357 112

Comments:

The dd throughput is about twice as high as the maximum achieved via fsseq3 for 1, 2b, and 8.
5 seems to have a very bad fsync() implementation; even in the best case the throughput differs from the dd value by a factor of 3.
For 7 the difference is about a factor of 13, which means either that the dd time is flawed (quite likely, since it is very short) or that the fsync() calls are significant on that system.
The dd throughput for 40MB is absurdly high on the Linux 2.4.x systems. That might be due to the default async mount option. See below for a test with a larger amount of data:

dd ibs=8192 if=/dev/zero obs=8192 count=124000 of=incq

machine s MB/s

12a 24.762 39

12b 22.608 42

The data in this table is more plausible, even though 40MB/s is still very fast.
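The MB/s column follows from the dd parameters: bytes transferred divided by elapsed time. A minimal Python sketch of that arithmetic is given here; the function name is hypothetical, and it assumes 1 MB = 1024*1024 bytes, which the text does not state.

```python
def dd_throughput(block_size, count, seconds):
    """Return (total_bytes, MB/s) for a dd run.

    Assumption: 1 MB is taken as 1024*1024 bytes.
    """
    total_bytes = block_size * count
    return total_bytes, total_bytes / (1024 * 1024) / seconds


if __name__ == "__main__":
    # Parameters of the larger run above: obs=8192, count=124000.
    for machine, secs in (("12a", 24.762), ("12b", 22.608)):
        total, rate = dd_throughput(8192, 124000, secs)
        print(machine, total, round(rate, 1))
```

With count=5120 the same function gives 41943040 bytes, i.e. the 40MB mentioned in the comments.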

Writing a Logfile; 2nd Version

For comparison with the Berkeley DB performance data, more tests have been run with fsseq4 using different parameters. The number of records is 100000 unless otherwise noted; t/s is transactions (records written) per second. Notice: fsseq3 writes twice as many records as fsseq4 (one add and one delete entry each), and it calls fsync() twice as often (after the add and after the delete entry).
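The kind of test described here can be sketched as follows: write fixed-size records sequentially and call fsync() after every C records. This is not the fsseq4 source, only a minimal Python illustration; the function name and the reading of C as "records per fsync() call" are assumptions.

```python
import os
import tempfile
import time


def write_log(path, records, record_size, records_per_sync):
    """Write `records` fixed-size records, calling fsync() every `records_per_sync`.

    Returns (elapsed_seconds, number_of_fsync_calls).
    """
    payload = b"x" * record_size
    syncs = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    start = time.perf_counter()
    try:
        for i in range(1, records + 1):
            os.write(fd, payload)  # short writes are ignored in this sketch
            if i % records_per_sync == 0:
                os.fsync(fd)
                syncs += 1
        if records % records_per_sync:
            os.fsync(fd)  # flush the final, incomplete batch
            syncs += 1
    finally:
        os.close(fd)
    return time.perf_counter() - start, syncs


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for c in (1000, 100, 10):
            secs, syncs = write_log(os.path.join(tmp, "log"), 1000, 512, c)
            print(f"C={c}: {syncs} fsync calls, {secs:.3f}s")
```

Smaller values of C mean more fsync() calls for the same amount of data, which is the effect the tables below measure.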

1

C s time KB/s t/s

100000 20 1 1953 100000

10000 20 2 976 50000

1000 20 7 279 14285

100 20 20 97 5000

100000 100 3 3255 33333

10000 100 4 2441 25000

1000 100 8 1220 12500

100 100 57 171 1754

100000 512 15 3333 6666

10000 512 16 3125 6250

1000 512 17 2941 5882

100 512 67 746 1492

100000 1024 29 3448 3448

10000 1024 30 3333 3333

1000 1024 33 3030 3030

100 1024 77 1298 1298

100000 2048 60 3333 1666

10000 2048 60 3333 1666

1000 2048 64 3125 1562

100 2048 101 1980 990

2b

C s time KB/s t/s

100000 20 1 1953 100000

10000 20 1 1953 100000

1000 20 2 976 50000

100 20 2 976 50000

100000 100 2 4882 50000

10000 100 1 9765 100000

1000 100 2 4882 50000

100 100 7 1395 14285

100000 512 3 16666 33333

10000 512 3 16666 33333

1000 512 4 12500 25000

100 512 6 8333 16666

100000 1024 6 16666 16666

10000 1024 5 20000 20000

1000 1024 6 16666 16666

100 1024 8 12500 12500

100000 2048 12 16666 8333

10000 2048 12 16666 8333

1000 2048 15 13333 6666

100 2048 15 13333 6666

5

C s time KB/s t/s

100000 20 1 1953 100000

10000 20 1 1953 100000

1000 20 2 976 50000

100 20 9 217 11111

100000 100 3 3255 33333

10000 100 4 2441 25000

1000 100 5 1953 20000

100 100 15 651 6666

100000 512 16 3125 6250

10000 512 18 2777 5555

1000 512 22 2272 4545

100 512 75 666 1333

100000 1024 34 2941 2941

10000 1024 35 2857 2857

1000 1024 46 2173 2173

100 1024 139 719 719

100000 2048 67 2985 1492

10000 2048 79 2531 1265

1000 2048 95 2105 1052

100 2048 246 813 406

7

C s time KB/s t/s

100000 20 1 1953 100000

10000 20 1 1953 100000

1000 20 4 488 25000

100 20 31 63 3225

100000 100 2 4882 50000

10000 100 2 4882 50000

1000 100 6 1627 16666

100 100 33 295 3030

100000 512 8 6250 12500

10000 512 11 4545 9090

1000 512 15 3333 6666

100 512 50 1000 2000

100000 1024 11 9090 9090

10000 1024 10 10000 10000

1000 1024 14 7142 7142

100 1024 42 2380 2380

100000 2048 25 8000 4000

10000 2048 26 7692 3846

1000 2048 21 9523 4761

100 2048 42 4761 2380

8

C s time KB/s t/s

100000 20 3 651 33333

10000 20 3 651 33333

1000 20 3 651 33333

100 20 5 390 20000

100000 100 3 3255 33333

10000 100 4 2441 25000

1000 100 4 2441 25000

100 100 9 1085 11111

100000 512 5 10000 20000

10000 512 5 10000 20000

1000 512 7 7142 14285

100 512 20 2500 5000

100000 1024 8 12500 12500

10000 1024 8 12500 12500

1000 1024 9 11111 11111

100 1024 26 3846 3846

100000 2048 15 13333 6666

10000 2048 16 12500 6250

1000 2048 21 9523 4761

100 2048 36 5555 2777

11

C s time KB/s t/s

100000 20 1 1953 100000

10000 20 1 1953 100000

1000 20 4 488 25000

100 20 29 67 3448

100000 100 1 9765 100000

10000 100 2 4882 50000

1000 100 5 1953 20000

100 100 36 271 2777

100000 512 4 12500 25000

10000 512 5 10000 20000

1000 512 9 5555 11111

100 512 44 1136 2272

100000 1024 8 12500 12500

10000 1024 9 11111 11111

1000 1024 13 7692 7692

100 1024 54 1851 1851

100000 2048 15 13333 6666

10000 2048 17 11764 5882

1000 2048 22 9090 4545

100 2048 67 2985 1492

10a

C s time KB/s t/s

100000 20 2 976 50000

10000 20 1 1953 100000

1000 20 2 976 50000

100 20 3 651 33333

100000 100 2 4882 50000

10000 100 2 4882 50000

1000 100 2 4882 50000

100 100 6 1627 16666

100000 512 3 16666 33333

10000 512 3 16666 33333

1000 512 4 12500 25000

100 512 21 2380 4761

100000 1024 3 33333 33333

10000 1024 4 25000 25000

1000 1024 7 14285 14285

100 1024 41 2439 2439

100000 2048 4 50000 25000

10000 2048 5 40000 20000

1000 2048 12 16666 8333

100 2048 80 2500 1250

10b

C s time KB/s t/s

100000 20 1 1953 100000

10000 20 1 1953 100000

1000 20 4 488 25000

100 20 23 84 4347

100000 100 2 4882 50000

10000 100 2 4882 50000

1000 100 5 1953 20000

100 100 32 305 3125

100000 512 5 10000 20000

10000 512 5 10000 20000

1000 512 9 5555 11111

100 512 42 1190 2380

100000 1024 10 10000 10000

10000 1024 11 9090 9090

1000 1024 14 7142 7142

100 1024 59 1694 1694

100000 2048 21 9523 4761

10000 2048 21 9523 4761

1000 2048 25 8000 4000

100 2048 78 2564 1282

Comments:

Using record sizes that are not a divisor of the block size of the underlying filesystem reduces throughput since fsync() can't write whole blocks in that case.
10a behaves very badly for C=100. The reason for this is unknown. The times are especially bad compared to the Berkeley DB tests run on the same configuration (see Section 5.2.3).
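The first comment can be illustrated with a simple model: if every fsync() has to write each filesystem block touched since the previous call, then a batch that does not end on a block boundary causes the shared block to be written again by the next call. The sketch below counts block writes under that assumption; the model, the function name, and the 8192-byte block size are illustrative assumptions, not measurements.

```python
def block_writes(records, record_size, records_per_sync, block_size=8192):
    """Count block writes if each fsync() rewrites every block touched since the last one."""
    writes = 0
    offset = 0  # bytes written so far
    remaining = records
    while remaining > 0:
        batch = min(records_per_sync, remaining)
        end = offset + batch * record_size
        first_block = offset // block_size
        last_block = (end - 1) // block_size
        writes += last_block - first_block + 1
        offset = end
        remaining -= batch
    return writes


if __name__ == "__main__":
    for size in (512, 100):
        minimum = -(-1000 * size // 8192)  # ceiling division: blocks actually needed
        print(size, block_writes(1000, size, 10), minimum)
```

In this model, 512-byte records synced 16 at a time fill exactly one 8192-byte block per fsync(), whereas 100-byte records leave a partial block behind at almost every call.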

Harddisk Performance

Some performance data gathered from the WWW.

SR Office DriveMark 2002 in IO/Sec taken from [Ra01]:

Manufacturer Model I/O operations/second

Seagate Cheetah X15-36LP (36.7 GB Ultra160/m SCSI) 485

Maxtor Atlas 10k III (73 GB Ultra160/m SCSI) 455

Fujitsu MAM3367 (36 GB Ultra160/m SCSI) 446

IBM Ultrastar 36Z15 (36.7 GB Ultra160/m SCSI) 402

Western Digital Caviar WD1000BB-SE (100 GB ATA-100) 397

Seagate Cheetah 36ES (36 GB Ultra160/m SCSI) 373

Fujitsu MAN3735 (73 GB Ultra160/m SCSI) 369

Seagate Cheetah 73LP (73.4 GB Ultra160/m SCSI) 364

Western Digital Caviar WD1200BB (120 GB ATA-100) 337

Seagate Cheetah 36XL (36.7 GB Ultra 160/m SCSI) 328

IBM Deskstar 60GXP (60.0 GB ATA-100) 303

Maxtor DiamondMax Plus D740X (80 GB ATA-133) 301

Seagate Barracuda ATA IV (80 GB ATA-100) 296

Quantum Fireball Plus AS (60.0 GB ATA-100) 295

Quantum Atlas V (36.7 GB Ultra160/m SCSI) 269

Seagate Barracuda 180 (180 GB Ultra160/m SCSI) 249

Maxtor DiamondMax 536DX (100 GB ATA-100) 248

Seagate Barracuda 36ES (36 GB Ultra160/m SCSI) 222

Seagate U6 (80 GB ATA-100) 210

Samsung SpinPoint P20 (40.0 GB ATA-100) 192

ZD Business Disk WinMark 99 in MB/Sec

Manufacturer Model MB/second

Seagate Cheetah X15-36LP (36.7 GB Ultra160/m SCSI) 13.1

Maxtor Atlas 10k III (73 GB Ultra160/m SCSI) 12.0

IBM Ultrastar 36Z15 (36.7 GB Ultra160/m SCSI) 11.3

Fujitsu MAM3367 (36 GB Ultra160/m SCSI) 11.1

Seagate Cheetah 36ES (36 GB Ultra160/m SCSI) 10.5

Seagate Cheetah 73LP (73.4 GB Ultra160/m SCSI) 10.2

Seagate Cheetah 36XL (36.7 GB Ultra 160/m SCSI) 9.9

Western Digital Caviar WD1000BB-SE (100 GB ATA-100) 9.8

Fujitsu MAN3735 (73 GB Ultra160/m SCSI) 9.1

Western Digital Caviar WD1200BB (120 GB ATA-100) 8.9

IBM Deskstar 60GXP (60.0 GB ATA-100) 8.8

Seagate Barracuda ATA IV (80 GB ATA-100) 8.5

Maxtor DiamondMax Plus D740X (80 GB ATA-133) 8.0

Quantum Atlas V (36.7 GB Ultra160/m SCSI) 7.9

Quantum Fireball Plus AS (60.0 GB ATA-100) 7.7

Seagate Barracuda 36ES (36 GB Ultra160/m SCSI) 7.4

Seagate Barracuda 180 (180 GB Ultra160/m SCSI) 7.1

Maxtor DiamondMax 536DX (100 GB ATA-100) 6.9

Samsung SpinPoint P20 (40.0 GB ATA-100) 6.5

Seagate U6 (80 GB ATA-100) 6.3

The file and web server benchmarks (also available at [Ra01]) are not useful since they include 80 and 100 per cent read accesses, which is not really typical of MTA servers.

Performance of Berkeley DB

Some preliminary, very simple performance tests with Berkeley DB 4.0.14 have been made. Two benchmark programs have been used: bench_001 and bench_002, which use Btree and Queue as access methods, respectively. They are based on examples_c/bench_001.c that comes with Berkeley DB. Notice: the access method Queue requires fixed size records, and records are accessed by record numbers (simply increasing). This method may be used for the backup of the incoming EDB. Notice: the tests have not (yet) been run multiple times, at least not systematically. Testing showed that the runtimes may vary noticeably. However, the data can be used to show some trends.

Possible parameters are:

-n N number of records to write

-T N use transactions, synchronize after N transactions

-l N length of data part

-C N do a checkpoint every N actions and possibly remove logfile

Unless otherwise noted, the following tests have been performed on system 1, see Section 5.2.1. Number of records is 100000 unless otherwise noted, t/s is transactions (records written) per second.
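The KB/s and t/s columns in the following tables can be derived from the number of records, the data length (-l), and the real time. A minimal sketch, assuming 1 KB = 1024 bytes and truncation to integers (both inferred from the table values, not stated in the text); the function name is hypothetical.

```python
def derived_columns(records, data_length, real_seconds):
    """Return (KB/s, t/s), truncated to integers as the tables appear to do."""
    tps = records / real_seconds
    kbs = records * data_length / 1024 / real_seconds
    return int(kbs), int(tps)


if __name__ == "__main__":
    # First row of the first table below: 100000 records, -l 20, real 14.73s.
    print(derived_columns(100000, 20, 14.73))  # -> (132, 6788)
```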

Vary synchronization (-T):

Prg -T -l real user sys KB/s t/s

1 100000 20 14.73 5.99 1.00 132 6788

1 10000 20 14.64 5.85 1.29 133 6830

1 1000 20 18.14 6.02 1.10 107 5512

1 100 20 70.57 6.03 1.76 27 1417

2 100000 20 11.58 2.91 0.74 168 8635

2 10000 20 10.14 2.86 0.85 192 9861

2 1000 20 11.20 2.85 0.95 174 8928

2 100 20 68.71 2.73 1.61 28 1455

Vary data length, first program only:

Prg -T -l real user sys KB/s t/s

1 100000 20 14.39 5.93 1.16 135 6949

1 10000 20 16.77 5.91 1.16 116 5963

1 1000 20 16.58 5.91 1.13 117 6031

1 100 20 68.10 5.95 1.85 28 1468

1 100000 100 23.30 5.57 1.90 419 4291

1 10000 100 30.56 5.56 1.90 319 3272

1 1000 100 33.39 5.51 1.99 292 2994

1 100 100 82.58 5.47 2.62 118 1210

1 100000 512 96.03 7.69 4.78 520 1041

1 10000 512 94.12 7.39 5.03 531 1062

1 1000 512 97.67 7.20 5.15 511 1023

1 100 512 164.13 7.51 5.67 304 609

1 100000 1024 304.88 10.88 10.62 327 327

1 10000 1024 270.00 10.69 10.66 370 370

1 1000 1024 275.27 10.91 11.06 363 363

1 100 1024 346.10 11.01 12.09 288 288

1 100000 2048 788.88 22.18 27.59 253 126

The test has been aborted at this point. Maybe run it again later on.

Vary data length, second program only:

Prg -T -l real user sys KB/s t/s

2 100000 20 9.46 2.81 0.80 206 10570

2 10000 20 11.53 2.88 0.81 169 8673

2 1000 20 12.47 2.83 0.96 156 8019

2 100 20 67.91 2.80 1.59 28 1472

2 100000 100 13.57 2.92 1.20 719 7369

2 10000 100 18.62 3.07 1.17 524 5370

2 1000 100 19.04 2.92 1.20 512 5252

2 100 100 72.73 2.80 2.16 134 1374

2 100000 512 46.10 3.90 2.61 1084 2169

2 10000 512 53.55 3.84 2.79 933 1867

2 1000 512 66.71 3.65 3.05 749 1499

2 100 512 105.25 3.36 3.76 475 950

2 100000 1024 103.72 4.92 4.68 964 964

2 10000 1024 105.53 4.87 4.82 947 947

2 1000 1024 105.60 4.73 4.85 946 946

2 100 1024 145.14 4.73 5.84 688 688

2 100000 2048 194.70 7.44 8.09 1027 513

2 10000 2048 197.09 7.22 8.15 1014 507

2 1000 2048 200.09 7.10 8.70 999 499

2 100 2048 234.85 6.86 9.53 851 425

Put the directory for logfiles on a different disk (/extra/home/ca/tmp/db), using Btree.

Prg -T -l real user sys KB/s t/s

1 100000 20 14.90 6.05 0.96 131 6711

1 10000 20 14.46 5.95 1.12 135 6915

1 1000 20 17.70 5.83 1.08 110 5649

1 100 20 63.91 5.92 1.74 30 1564

1 100000 100 27.00 5.53 1.90 361 3703

1 10000 100 33.39 5.63 1.92 292 2994

1 1000 100 29.16 5.63 1.75 334 3429

1 100 100 72.18 5.44 2.42 135 1385

1 100000 512 96.94 7.49 5.09 515 1031

1 10000 512 107.99 7.34 5.17 463 926

1 1000 512 97.05 7.21 5.54 515 1030

1 100 512 145.15 7.85 5.36 344 688

1 100000 1024 268.88 10.67 11.54 371 371

1 10000 1024 279.65 11.02 11.05 357 357

1 1000 1024 304.07 10.58 11.69 328 328

1 100 1024 319.74 10.88 12.10 312 312

1 100000 2048 738.38 23.07 27.13 270 135

1 10000 2048 651.86 22.70 26.92 306 153

1 1000 2048 693.13 21.79 28.63 288 144

1 100 2048 724.68 22.51 29.04 275 137

Put the directory for logfiles on a different disk (/extra/home/ca/tmp/db), using Queue.

Prg -T -l real user sys KB/s t/s

2 100000 20 10.92 2.90 0.65 178 9157

2 10000 20 9.94 2.87 0.77 196 10060

2 1000 20 31.66 2.85 0.88 61 3158

2 100 20 60.74 2.93 1.36 32 1646

2 100000 100 13.62 3.09 0.95 717 7342

2 10000 100 19.30 3.02 1.17 505 5181

2 1000 100 15.55 3.16 1.08 628 6430

2 100 100 71.88 2.97 1.72 135 1391

2 100000 512 52.08 3.93 2.50 960 1920

2 10000 512 52.42 3.68 3.03 953 1907

2 1000 512 56.58 3.91 2.90 883 1767

2 100 512 95.38 3.74 3.64 524 1048

2 100000 1024 107.20 4.69 4.87 932 932

2 10000 1024 100.15 4.88 4.57 998 998

2 1000 1024 100.95 4.78 5.06 990 990

2 100 1024 139.38 4.71 5.61 717 717

2 100000 2048 187.78 7.68 8.41 1065 532

2 10000 2048 189.76 7.09 8.62 1053 526

2 1000 2048 201.95 7.37 8.65 990 495

2 100 2048 217.66 7.21 9.53 918 459

Machine 2b: Vary data length, first program:

Prg -T -l real user sys KB/s t/s

1 100000 20 21.56 9.04 1.88 90 4638

1 10000 20 13.02 9.58 1.92 150 7680

1 1000 20 12.64 9.40 1.81 154 7911

1 100 20 16.35 9.68 1.73 119 6116

1 100000 100 32.79 9.16 4.60 297 3049

1 10000 100 25.05 9.54 4.11 389 3992

1 1000 100 23.69 9.80 4.39 412 4221

1 100 100 28.51 10.25 3.89 342 3507

1 100000 512 47.67 13.82 13.65 1048 2097

1 10000 512 48.04 13.22 13.64 1040 2081

1 1000 512 46.35 13.16 14.54 1078 2157

1 100 512 52.10 13.78 11.93 959 1919

1 100000 1024 109.32 21.59 25.00 914 914

1 10000 1024 107.94 19.97 26.49 926 926

1 1000 1024 108.74 20.13 26.06 919 919

1 100 1024 113.14 20.01 26.45 883 883

1 100000 2048 240.16 44.55 55.72 832 416

1 10000 2048 262.05 43.58 54.94 763 381

1 1000 2048 245.93 41.17 57.54 813 406

1 100 2048 254.97 41.39 59.63 784 392

Vary data length, second program:

Prg -T -l real user sys KB/s t/s

2 100000 20 9.85 5.92 1.30 198 10152

2 10000 20 7.82 5.90 1.28 249 12787

2 1000 20 7.21 5.13 1.34 270 13869

2 100 20 10.36 5.79 1.23 188 9652

2 100000 100 10.22 5.84 2.73 955 9784

2 10000 100 10.54 6.11 2.72 926 9487

2 1000 100 10.68 6.12 2.40 914 9363

2 100 100 13.57 6.06 2.37 719 7369

2 100000 512 23.73 7.32 8.89 2107 4214

2 10000 512 25.36 7.42 8.44 1971 3943

2 1000 512 26.12 7.19 8.56 1914 3828

2 100 512 33.79 7.24 8.78 1479 2959

2 100000 1024 47.93 9.05 12.29 2086 2086

2 10000 1024 52.26 9.63 14.91 1913 1913

2 1000 1024 52.07 9.37 14.50 1920 1920

2 100 1024 58.91 9.49 14.52 1697 1697

2 100000 2048 74.59 15.42 20.55 2681 1340

2 10000 2048 72.47 14.99 21.50 2759 1379

2 1000 2048 78.38 14.54 21.93 2551 1275

2 100 2048 76.63 14.01 22.12 2609 1304

Machine 7: Vary data length, second program only; the times for a second test run are added on the right, these clearly show how wildly the results can vary.

Prg -T -l real user sys KB/s t/s real user sys

2 100000 20 5.20 2.00 0.30 375 19230 5.0 2.0 0.3

2 10000 20 6.20 2.00 0.30 315 16129 4.8 1.9 0.3

2 1000 20 7.10 2.00 0.30 275 14084 7.8 2.0 0.3

2 100 20 25.80 2.00 0.50 75 3875 28.5 2.0 0.5

2 100000 100 6.30 2.10 0.60 1550 15873 6.0 2.0 0.6

2 10000 100 6.50 2.10 0.60 1502 15384 7.6 2.1 0.6

2 1000 100 10.60 2.20 0.60 921 9433 11.5 2.0 0.6

2 100 100 36.50 2.10 0.80 267 2739 31.4 2.1 0.8

2 100000 512 33.40 2.70 2.80 1497 2994 18.5 2.6 2.0

2 10000 512 29.80 2.70 2.90 1677 3355 24.6 2.5 2.3

2 1000 512 29.30 2.60 2.50 1706 3412 32.9 2.5 2.8

2 100 512 65.90 2.60 2.90 758 1517 61.0 2.5 2.8

2 100000 1024 50.50 3.30 4.90 1980 1980 58.4 3.2 5.5

2 10000 1024 60.80 3.40 5.40 1644 1644 47.2 3.2 4.6

2 1000 1024 51.70 3.30 4.60 1934 1934 45.1 3.2 4.2

2 100 1024 89.70 3.20 5.60 1114 1114 82.0 3.2 4.9

2 100000 2048 90.20 4.40 8.90 2217 1108 86.9 4.3 9.1

2 10000 2048 92.80 4.30 9.10 2155 1077 67.1 4.3 7.3

2 1000 2048 93.50 4.60 7.80 2139 1069 66.0 4.2 7.0

2 100 2048 134.00 4.40 7.50 1492 746 107.7 4.2 6.5

Vary data length, first program only:

Prg -T -l real user sys KB/s t/s

1 100000 20 6.90 3.10 0.40 283 14492

1 10000 20 7.20 3.30 0.50 271 13888

1 1000 20 9.90 3.30 0.50 197 10101

1 100 20 28.90 3.20 0.60 67 3460

1 100000 100 11.30 3.40 1.00 864 8849

1 10000 100 12.20 3.30 1.00 800 8196

1 1000 100 14.00 3.30 1.10 697 7142

1 100 100 35.80 3.30 1.30 272 2793

1 100000 512 37.10 4.50 4.20 1347 2695

1 10000 512 50.00 4.60 4.50 1000 2000

1 1000 512 62.50 4.50 4.60 800 1600

1 100 512 68.60 4.50 4.60 728 1457

1 100000 1024 86.20 6.20 8.70 1160 1160

1 10000 1024 117.10 6.00 8.40 853 853

1 1000 1024 78.90 6.10 7.80 1267 1267

1 100 1024 109.60 6.10 7.40 912 912

1 100000 2048 225.80 10.90 15.90 885 442

1 10000 2048 259.40 10.80 16.30 771 385

1 1000 2048 382.60 10.90 17.40 522 261

1 100 2048 394.30 10.90 17.20 507 253

Machine 10a:

Prg -T -l real user sys KB/s t/s

1 100000 20 5.00 4.40 0.50 390 20000

1 10000 20 5.00 4.30 0.60 390 20000

1 1000 20 5.50 4.40 0.80 355 18181

1 100 20 9.00 4.50 3.90 217 11111

1 100000 100 6.10 4.70 1.20 1600 16393

1 10000 100 6.20 4.80 1.20 1575 16129

1 1000 100 6.70 4.60 1.80 1457 14925

1 100 100 10.90 5.00 4.30 895 9174

1 100000 512 13.30 6.50 5.10 3759 7518

1 10000 512 12.90 6.90 4.80 3875 7751

1 1000 512 14.00 7.00 5.00 3571 7142

1 100 512 19.00 7.10 8.40 2631 5263

1 100000 1024 19.70 8.80 8.40 5076 5076

1 10000 1024 19.30 9.20 8.20 5181 5181

1 1000 1024 19.90 9.20 8.70 5025 5025

1 100 1024 26.70 9.20 12.30 3745 3745

1 100000 2048 32.90 13.80 11.70 6079 3039

1 10000 2048 31.10 13.80 12.10 6430 3215

1 1000 2048 34.90 14.40 12.30 5730 2865

1 100 2048 41.30 14.10 16.10 4842 2421

Prg -T -l real user sys KB/s t/s

2 100000 20 4.70 4.20 0.30 415 21276

2 10000 20 4.70 4.00 0.50 415 21276

2 1000 20 5.20 4.20 0.70 375 19230

2 100 20 8.80 4.10 3.90 221 11363

2 100000 100 5.50 4.30 0.80 1775 18181

2 10000 100 5.70 4.30 0.80 1713 17543

2 1000 100 6.20 4.50 1.00 1575 16129

2 100 100 9.70 4.50 4.20 1006 10309

2 100000 512 12.50 5.50 2.30 4000 8000

2 10000 512 13.60 5.40 2.60 3676 7352

2 1000 512 11.70 5.10 3.30 4273 8547

2 100 512 14.50 5.70 6.40 3448 6896

2 100000 1024 17.90 6.80 3.90 5586 5586

2 10000 1024 17.30 6.70 4.60 5780 5780

2 1000 1024 18.40 6.60 4.60 5434 5434

2 100 1024 19.00 7.00 8.10 5263 5263

2 100000 2048 24.80 8.80 6.90 8064 4032

2 10000 2048 21.20 9.00 6.80 9433 4716

2 1000 2048 20.90 9.10 7.20 9569 4784

2 100 2048 24.00 8.90 11.30 8333 4166

General notice: the benchmark programs have been run while the machines were ``in use'', so some unusual results can be explained by the activity of other processes.

Comments:

Compared with fsseq3 (see Section 5.2.1), Berkeley DB has to write at least two files, the logfile itself and the database. Hence the throughput should be lower by at least a factor of 2. Additional disk head movements cause another slowdown.
For 1 it doesn't help much to put the directory for logfiles on a different disk, which is a bit surprising.
The throughput for Queue is about 30 to 200 per cent higher than for Btree. Since the former has only fixed record size and (almost) no search overhead, this is expected. However, it is not clear whether the Queue access method is flexible enough for the deferred EDB. Currently it doesn't seem so.

Miscellaneous about Performance

2004-03-26 Effect of logging to files via SM I/O.

On FreeBSD 4.9, UFS, softupdates, SCSI, smX.0.0.12, relay 5000 messages, 100 threads:

logging time

same disk, smioerr 137-141

same disk, smioout 104

RAM, smioerr 104

This means there is a performance hit of about 35 per cent if smioerr is used instead of smioout. The former uses line buffering, hence there are more writes involved.
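The difference between line buffering and full buffering can be made visible by counting the writes that reach the underlying file. The sketch below uses Python's io module as a stand-in for SM I/O (an assumption; smioerr and smioout are not modelled beyond their buffering mode), and the class and function names are hypothetical.

```python
import io


class CountingRaw(io.RawIOBase):
    """Raw sink that discards data and counts write() calls (stand-in for write(2))."""

    def __init__(self):
        super().__init__()
        self.calls = 0

    def writable(self):
        return True

    def write(self, data):
        self.calls += 1
        return len(data)


def count_writes(lines, line_buffering):
    """Write `lines` through an 8192-byte buffer and return the number of raw writes."""
    raw = CountingRaw()
    stream = io.TextIOWrapper(
        io.BufferedWriter(raw, buffer_size=8192),
        encoding="ascii",
        newline="\n",
        line_buffering=line_buffering,
    )
    for line in lines:
        stream.write(line)
    stream.flush()
    return raw.calls


if __name__ == "__main__":
    log = ["queued message %06d for delivery\n" % i for i in range(5000)]
    print("line buffered: ", count_writes(log, True))
    print("fully buffered:", count_writes(log, False))
```

With line buffering every log line results in its own write, while full buffering only writes when the buffer fills up.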

Performance of Various Programs

TCP/IP Performance

The program checks/t-net-0.c can be used for very simple performance tests of local TCP/IP (AF_INET or AF_LOCAL) communication. This program uses the SM I/O layer on top of sockets. Some of the available options are listed below:

-b n set buffer size to n, default 8192

-c n act as client, write n bytes

-s n act as server, read n bytes

-R n read and write n times

-u use a Unix domain socket
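A comparable measurement can be sketched with plain sockets. The following is not t-net-0.c: it assumes that one round consists of the client writing n bytes and reading n bytes back, it uses Python instead of the SM I/O layer, and all function names are hypothetical. Interpreter overhead means the absolute times are not comparable with the tables below; only the INET versus LOCAL comparison carries over.

```python
import socket
import threading
import time


def recv_exact(sock, size):
    """Read exactly `size` bytes or raise if the peer closes early."""
    data = bytearray()
    while len(data) < size:
        chunk = sock.recv(size - len(data))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        data.extend(chunk)
    return bytes(data)


def connected_pair(use_unix):
    """Return (client, server) stream sockets over AF_UNIX or loopback AF_INET."""
    if use_unix:
        return socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind(("127.0.0.1", 0))  # port 0: let the kernel choose
        listener.listen(1)
        client = socket.create_connection(listener.getsockname())
        server, _ = listener.accept()
    finally:
        listener.close()
    return client, server


def time_round_trips(use_unix, size, rounds):
    """Send `size` bytes and wait for the same number back, `rounds` times."""
    client, server = connected_pair(use_unix)

    def serve():
        with server:
            for _ in range(rounds):
                server.sendall(recv_exact(server, size))

    worker = threading.Thread(target=serve)
    worker.start()
    payload = b"x" * size
    start = time.perf_counter()
    with client:
        for _ in range(rounds):
            client.sendall(payload)
            recv_exact(client, size)
    elapsed = time.perf_counter() - start
    worker.join()
    return elapsed


if __name__ == "__main__":
    for size in (32, 1024):
        inet = time_round_trips(False, size, 10000)
        local = time_round_trips(True, size, 10000)
        print(f"{size} bytes: INET {inet:.2f}s, LOCAL {local:.2f}s")
```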

The numbers reference the machines listed in Section 5.2.1.1.

1:

-R -c time INET time LOCAL

100000 32 15 10

100000 64 19 12

100000 128 24 17

100000 256 31 25

100000 512 49 43

100000 1024 84 81

7:

-R -c time INET time LOCAL

100000 32 10 7

100000 64 11 7

100000 128 13 9

100000 256 17 14

100000 512 23 20

100000 1024 37 35

8:

-R -c time INET time LOCAL

100000 32 51 41

100000 64 57 46

100000 128 66 60

100000 256 97 86

100000 512 148 138

100000 1024 250 243

9:

-R -c time INET time LOCAL

100000 32 67 52

100000 64 74 59

100000 128 85 71

100000 256 114 97

100000 512 159 148

100000 1024 263 246

11:

-R -c time INET time LOCAL

100000 32 99 89

100000 64 108 94

100000 128 138 115

100000 256 141 151

100000 512 199 221

100000 1024 346 373

Notice: these times vary wildly since the machine is used by several people.

13:

-R -c time INET time LOCAL

100000 32 46 38

100000 64 59 48

100000 128 78 70

100000 256 120 110

100000 512 203 192

100000 1024 376 358

DB Lookup Performance

Just a preliminary number: on system 1 the example program examples_c/bench_001.c achieves about 1 to 1.5 million lookups per second (for a data length of 20 bytes and a cache size of 64MB, which more or less means direct memory access, no disk I/O). Compared with the results from 5.4.1 (100000 read/write rounds of 32 bytes take 10 to 15 seconds on system 1, i.e., about 7000 to 10000 per second), this is about a factor of 100 faster than performing lookups over a generic TCP/IP connection. This certainly must be taken into account when deciding how and where to incorporate DB lookups.

Using larger data sizes (256 to 512 bytes) and smaller caches (10000 bytes) causes a significant drop in performance: a sequential lookup of all data achieves between 10000 and 60000 lookups per second (on system 1).

Random access drops to as few as 1000 to 2000 lookups per second.

`snprintf` Performance

On AIX, sm_snprintf() is about 2 times slower than snprintf(). On SunOS 5.8 it is about 1.3 times slower; on FreeBSD there is no difference (which isn't surprising since it's almost the same code). It might make sense to use the native snprintf() version on some platforms; however, this isn't possible anymore due to the extensions in sm_snprintf() (which supports more format specifiers, e.g., for constant strings).


Claus Assmann


Sendmail X: Performance Tests and Results

SMTP Server Daemon

SMTP Sink

SMTP Sink with CDB

wiz

perf-lab

SMTP Relaying Using a Sendmail X Prototype

Various Linux FS

Various FreeBSD Results

FreeBSD 4.9, Softupdates, and fsync()

Disk I/O On FreeBSD

Various SunOS 5 Results

Various OpenBSD Results

Various AIX Results

Implementation of Queues and Caches

Filesystem Performance

Test Systems

Meta Data Operations

Meta Data Operations: Existing Files

Writing a Logfile

Raw Throughput

Writing a Logfile; 2nd Version

Harddisk Performance

Performance of Berkeley DB

Miscellaneous about Performance

Performance of Various Programs

TCP/IP Performance

DB Lookup Performance

snprintf Performance

Manufacturer	Model	MB/second
Seagate	Cheetah X15-36LP (36.7 GB Ultra160/m SCSI)	13.1
Maxtor	Atlas 10k III (73 GB Ultra160/m SCSI)	12.0
IBM	Ultrastar 36Z15 (36.7 GB Ultra160/m SCSI)	11.3
Fujitsu	MAM3367 (36 GB Ultra160/m SCSI)	11.1
Seagate	Cheetah 36ES (36 GB Ultra160/m SCSI)	10.5
Seagate	Cheetah 73LP (73.4 GB Ultra160/m SCSI)	10.2
Seagate	Cheetah 36XL (36.7 GB Ultra160/m SCSI)	9.9
Western Digital	Caviar WD1000BB-SE (100 GB ATA-100)	9.8
Fujitsu	MAN3735 (73 GB Ultra160/m SCSI)	9.1
Western Digital	Caviar WD1200BB (120 GB ATA-100)	8.9
IBM	Deskstar 60GXP (60.0 GB ATA-100)	8.8
Seagate	Barracuda ATA IV (80 GB ATA-100)	8.5
Maxtor	DiamondMax Plus D740X (80 GB ATA-133)	8.0
Quantum	Atlas V (36.7 GB Ultra160/m SCSI)	7.9
Quantum	Fireball Plus AS (60.0 GB ATA-100)	7.7
Seagate	Barracuda 36ES (36 GB Ultra160/m SCSI)	7.4
Seagate	Barracuda 180 (180 GB Ultra160/m SCSI)	7.1
Maxtor	DiamondMax 536DX (100 GB ATA-100)	6.9
Samsung	SpinPoint P20 (40.0 GB ATA-100)	6.5
Seagate	U6 (80 GB ATA-100)	6.3

-n N	number of records to write
-T N	use transactions, synchronize after N transactions
-l N	length of data part
-C N	do a checkpoint every N actions and possibly remove logfile

Prg	-T	-l	real	user	sys	KB/s	t/s
1	100000	20	14.73	5.99	1.00	132	6788
1	10000	20	14.64	5.85	1.29	133	6830
1	1000	20	18.14	6.02	1.10	107	5512
1	100	20	70.57	6.03	1.76	27	1417
2	100000	20	11.58	2.91	0.74	168	8635
2	10000	20	10.14	2.86	0.85	192	9861
2	1000	20	11.20	2.85	0.95	174	8928
2	100	20	68.71	2.73	1.61	28	1455

real	user	sys
5.0	2.0	0.3
4.8	1.9	0.3
7.8	2.0	0.3
28.5	2.0	0.5
6.0	2.0	0.6
7.6	2.1	0.6
11.5	2.0	0.6
31.4	2.1	0.8
18.5	2.6	2.0
24.6	2.5	2.3
32.9	2.5	2.8
61.0	2.5	2.8
58.4	3.2	5.5
47.2	3.2	4.6
45.1	3.2	4.2
82.0	3.2	4.9
86.9	4.3	9.1
67.1	4.3	7.3
66.0	4.2	7.0
107.7	4.2	6.5

logging	time
same disk, smioerr	137-141
same disk, smioout	104
RAM, smioerr	104

-b n	set buffer size to n, default 8192
-c n	act as client, write n bytes
-s n	act as server, read n bytes
-R n	read and write n times
-u	use a Unix domain socket

-R	-c	time INET	time LOCAL
100000	32	15	10
100000	64	19	12
100000	128	24	17
100000	256	31	25
100000	512	49	43
100000	1024	84	81
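
The kind of measurement shown above can be approximated with a small ping-pong test. This is only a sketch under stated assumptions, not the original test program: it assumes an echo-style exchange (the client writes a block, the server reads it and writes it back), it uses a connected local socket pair instead of separate client and server processes, and the names `recv_exact`, `echo_server` and `ping_pong` are invented for this illustration.

```python
import socket
import threading
import time


def recv_exact(sock, n):
    """Read exactly n bytes from sock; raise if the peer closes early."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def echo_server(sock, size, rounds):
    """Server side: read size bytes and write them back, rounds times."""
    with sock:
        for _ in range(rounds):
            sock.sendall(recv_exact(sock, size))


def ping_pong(size, rounds):
    """Client side: time rounds write/read exchanges of size bytes each."""
    client, server = socket.socketpair()  # connected local socket pair
    worker = threading.Thread(target=echo_server, args=(server, size, rounds))
    worker.start()
    payload = b"x" * size
    start = time.perf_counter()
    with client:
        for _ in range(rounds):
            client.sendall(payload)
            recv_exact(client, size)
    elapsed = time.perf_counter() - start
    worker.join()
    return elapsed


if __name__ == "__main__":
    # Same block sizes as the -c column above, but far fewer rounds.
    for size in (32, 64, 128, 256, 512, 1024):
        print(size, round(ping_pong(size, 1000), 3))
```

Absolute numbers from such a sketch are not comparable with the table, since interpreter overhead and thread scheduling dominate for small blocks; it only illustrates the structure of the test.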