To add to the pool of braindead benchmarks, though perhaps with a little more reason, here is one of my own; take it for what it is. If anything, it shows that performance is generally not the primary argument for choosing an intermediary. This is what I've been preaching: yes, performance is important, but most servers available today will handle a ridiculous amount of HTTP traffic.
The test runs against an AMD Phenom(tm) II X4 940 processor (very cheap), across a GigE network. There are two Linksys switches between the load-generating host and the server, but no routing or packet filtering. The payload is 100 bytes plus a fairly small header, and the test runs with keep-alive.
In all tests, all logging is disabled.
Varnish
The config is mostly the defaults. The main change was that I had to jack up the minimum number of threads; 200 seems to be a reasonable number for this test. During the test, the load goes to 300. This is Varnish v2.1.5, from the Fedora repository.
5719306 fetches on 450 conns, 450 max parallel, 5.776500E+08 bytes, in 60 seconds
101 mean bytes/fetch
95320.7 fetches/sec, 9.627390E+06 bytes/sec
msecs/connect: 3.955 mean, 6.453 max, 0.162 min
msecs/first-response: 3.546 mean, 1005.235 max, 0.076 min
Nginx
This is running the older v0.8.53, since that is what the Fedora repo makes available. The config had to be tuned some: increasing the number of worker processes, setting open_file_cache high, and raising the keepalive_requests setting (also high).
5848823 fetches on 450 conns, 450 max parallel, 5.848820E+08 bytes, in 60 seconds
100 mean bytes/fetch
97480.4 fetches/sec, 9.748040E+06 bytes/sec
msecs/connect: 1.340 mean, 3.558 max, 0.469 min
msecs/first-response: 3.522 mean, 280.463 max, 0.067 min
Apache Traffic Server
This is the winner, of course, otherwise I wouldn't have published these results ;). This is running ATS v2.1.8, with a mostly stock config. The primary configuration changes are setting the number of worker threads to 5 and turning off some verbose Via and server strings.
6944993 fetches on 450 conns, 450 max parallel, 6.945000E+08 bytes, in 60 seconds
100 mean bytes/fetch
115748.6 fetches/sec, 1.157486E+07 bytes/sec
msecs/connect: 1.805 mean, 2.995 max, 0.519 min
msecs/first-response: 1.736 mean, 218.573 max, 0.081 min
Update: I've updated the ATS numbers with the latest results from v2.1.9; they are marginally different.
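To put the three runs side by side, here is a small sketch that computes the relative throughput and how much of the link the payload uses. The numbers are copied from the summaries above; the function names are made up for illustration, and the 1 Gbit/s figure is the nominal GigE line rate, which is an assumption. The byte counts are treated as payload only, so headers and TCP/IP overhead are not included.

```python
# Figures copied from the benchmark summaries above.
RESULTS = {
    "Varnish": {"fetches_per_sec": 95320.7, "bytes_per_sec": 9.627390e6},
    "Nginx": {"fetches_per_sec": 97480.4, "bytes_per_sec": 9.748040e6},
    "Apache Traffic Server": {"fetches_per_sec": 115748.6, "bytes_per_sec": 1.157486e7},
}

# Assumption: nominal GigE line rate, ignoring framing and protocol overhead.
GIGE_BITS_PER_SEC = 1e9


def percent_faster(results, name, baseline):
    """Return how much faster `name` is than `baseline`, in percent."""
    ratio = results[name]["fetches_per_sec"] / results[baseline]["fetches_per_sec"]
    return (ratio - 1.0) * 100.0


def payload_link_share(bytes_per_sec, link_bits_per_sec=GIGE_BITS_PER_SEC):
    """Return the fraction of the link consumed by payload bytes alone."""
    return bytes_per_sec * 8.0 / link_bits_per_sec


for server, numbers in RESULTS.items():
    share = payload_link_share(numbers["bytes_per_sec"]) * 100.0
    print(f"{server}: {numbers['fetches_per_sec']:.1f} fetches/sec, "
          f"payload uses {share:.1f}% of the link")

for name, baseline in [
    ("Apache Traffic Server", "Varnish"),
    ("Apache Traffic Server", "Nginx"),
    ("Nginx", "Varnish"),
]:
    print(f"{name} vs {baseline}: {percent_faster(RESULTS, name, baseline):+.1f}%")
```

With these numbers, ATS comes out roughly 21% ahead of Varnish and 19% ahead of Nginx, while Nginx and Varnish are within about 2% of each other. In every run the payload accounts for less than a tenth of the nominal link rate, which fits the point that all three servers handle far more requests than most sites will ever see.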