Traffic Server

Filtering Drupal comment spam

I get a fair amount of comment spam on my blog, and even after I changed all comments to be moderated, the spammers still persist. I decided to do something about this, and working under the assumption that most spammers are from a few countries, I decided to implement a Geo-location filter for Apache Traffic Server. The code is currently available at http://svn.apache.org/repos/asf/trafficserver/plugins/geoip_acl/, and only works with MaxMind's APIs (but I'd be more than happy to add support for other Geo-location APIs). This plugin also requires PCRE, but that's already a requirement for building ATS, so shouldn't be a problem.

Once compiled and installed (see the README), setting this up is fairly straightforward. In my remap.config, I now have the following rule:

map http://www.ogre.com http://localhost:69 @plugin=geoip_acl.so @pparam=country \
       @pparam=regex::/home/server/etc/deny_spam.conf


This says to apply a country based Geo-location filter on this rule, using the additional configurations from deny_spam.conf. This file contains one single line:

^comment/       deny    CN RU IN

This might look draconian, but for now I'm disabling all comment posts from China, Russia and India. For more details on the plugin configurations and features, again see the README from the source above.
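As a rough illustration of how a rule like the one above behaves, here is a minimal Python sketch. It is not the plugin's code: the function names are hypothetical, and both the matching of the regex against the URL path without its leading slash and the handling of an "allow" action are assumptions. The README remains the reference for the real semantics.

```python
import re

def parse_rule(line):
    """Split a line of the form '<regex> <action> <country codes...>'."""
    pattern, action, *countries = line.split()
    return re.compile(pattern), action, set(countries)

def is_allowed(rule, path, country):
    """Decide whether a request for `path` from `country` passes the rule."""
    regex, action, countries = rule
    if not regex.search(path):
        return True  # the rule does not apply to this URL at all
    listed = country in countries
    # "deny" blocks listed countries; anything else is treated as an allow list (assumption)
    return not listed if action == "deny" else listed

rule = parse_rule("^comment/       deny    CN RU IN")
for path, country in [("comment/reply/42", "CN"),
                      ("comment/reply/42", "US"),
                      ("node/42", "CN")]:
    print(path, country, is_allowed(rule, path, country))
```

Only the comment URLs are affected; every other page on the site stays reachable from the listed countries.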

Enjoy!

Performance tuning

This section collects configurations that can improve performance. Some might not be appropriate for your environment, but all of these options are known to have an impact on performance.

Threads

This is probably the most important configuration for your system, particularly for benchmarking. The default configurations are good for most systems, but can obviously be improved. The relevant settings are:


CONFIG proxy.config.exec_thread.autoconfig INT 0
CONFIG proxy.config.exec_thread.autoconfig.scale FLOAT 2.0
CONFIG proxy.config.exec_thread.limit INT 5

CONFIG proxy.config.accept_threads INT 1

CONFIG proxy.config.cache.threads_per_disk INT 8

CONFIG proxy.config.task_threads INT 2

The first three control the number of network threads, which are the threads primarily responsible for handling and processing all requests.

The next setting controls the accept threads. 1 is almost always the best choice, but sometimes 0 (which moves accept back to the net-threads) or 2-3 might be optimal. It's really application specific.

Cache threads are set up per disk. 8 is usually good, but on a very active system with lots of disk activity, 16 or more threads might be necessary. These threads block on disk I/O, so having more of them generally won't make the system a lot slower (but don't go overboard). Going with too few threads obviously won't let you use the full I/O capacity of the disk.

Finally, there's a set of background task threads, meant to offload mundane tasks from the normal net-threads. This is currently only mildly used by the ATS core, but more and more tasks are moved over to these threads as we make progress. This allows the net-threads to both serve more requests and respond with lower latency. Additionally, plugins are encouraged to use this pool of threads for similar tasks. This means that the number of these threads depends not only on which version of ATS you are using (v3.0 only lightly uses these threads), but also on what plugins you are using. You will simply have to watch these threads, see if they are becoming bottlenecks, and then increase the number as necessary.
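To make the interaction between the first three settings concrete, here is a small Python sketch of how the network thread count could be derived. The function name is hypothetical, and truncating the scaled core count to an integer is an assumption about rounding; the sketch only encodes the idea that autoconfig scales with the number of cores, while a fixed limit is used when autoconfig is off.

```python
import os

def net_threads(autoconfig, scale, limit, cores=None):
    """Estimate the number of network threads for the given settings."""
    cores = cores or os.cpu_count() or 1
    if autoconfig:
        # autoconfig on: scale with the number of cores (rounding is an assumption)
        return max(1, int(cores * scale))
    # autoconfig off: the explicit limit is used as the thread count
    return limit

# The settings shown above: autoconfig off, so 5 threads regardless of cores
print(net_threads(autoconfig=0, scale=2.0, limit=5))
# The same box with autoconfig on, assuming 8 logical cores
print(net_threads(autoconfig=1, scale=2.0, limit=5, cores=8))
```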

 

Network related options

 

HTTP related options

 

May 2011 Apache Traffic Server performance

I just ran a small test against Apache Traffic Server, to see how performance has improved since last time. My test box is my desktop, an Intel Core i7-920 at 2.67GHz (no overclocking), and the client runs on a separate box, over a cheap GigE network (two switches in between). Here are the latest results:

2,306,882 fetches on 450 conns, 450 max parallel, 2.306882E+08 bytes, in 15 seconds
100 mean bytes/fetch
153,792.0 fetches/sec, 1.537920E+07 bytes/sec
msecs/connect: 5.326 mean, 13.078 max, 0.149 min
msecs/first-response: 2.094 mean, 579.752 max, 0.099 min

 

This is of course for very small objects (100 bytes) served out of RAM cache, with HTTP keep-alive. Still respectable: close to 154k QPS out of a very low end, commodity box.
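The rates in the output above follow directly from the totals. A quick Python check (the helper name is made up for illustration) divides the fetch and byte counts by the test duration:

```python
def rates(fetches, total_bytes, seconds):
    """Return (fetches/sec, bytes/sec, mean bytes/fetch) for one benchmark run."""
    return fetches / seconds, total_bytes / seconds, total_bytes / fetches

qps, bps, mean_size = rates(fetches=2_306_882, total_bytes=2.306882e8, seconds=15)
print(f"{qps:,.1f} fetches/sec, {bps:.6e} bytes/sec, {mean_size:.0f} bytes/fetch")
```

The small difference from the reported figures is expected if the run did not last exactly 15 seconds.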

Apache Traffic Server recipes

Apache Traffic Server is a high performance, customizable HTTP proxy server. This is a collection of small "recipes", showing how to accomplish various tasks. This cookbook is a work in progress, and I haven't quite decided yet how I want to handle this "documentation". Perhaps it belongs in the official Apache docs, but for now, I find it easier to use the tools I'm used to here on ogre.com to maintain this.

July 2010 Apache Traffic Server benchmark

I reran my benchmarks with the latest "trunk" of Apache Traffic Server, to make sure we're not regressing. I also tweaked the number of worker threads a little, since a gut feeling tells me that with Hyper Threading, our auto-scaling algorithm isn't optimal (and, it really isn't). Here are the latest numbers, running over a GigE network (two Linksys el-cheapo switches between clients and server):

3,160,237 fetches on 3,666 conns, 1,800 max parallell, 1.58012e+09 bytes in 30 seconds
500 mean bytes/fetch
105,341.10 fetches/sec, 5.26704e+07 bytes/sec
msecs/connect: 1.46781 mean, 6.674 max, 0.093666 min
msecs/first-response: 16.3333 mean, 615.34033 max, 0.121333 min

 

That is, 105k QPS (with keep-alive) for small objects, over the network. It's pushing about 52MB of payload per second at this speed, but remember the average size is very small (500 bytes). My box is an Intel i7 920, Quad core.

Why Traffic Server defaults to not allow forward proxying

We have discussed numerous times on the Apache mailing lists the reasons why Apache Traffic Server ships with a default configuration that is almost entirely locked down. Our argument has been that we want to make sure that someone testing TS does not accidentally set it up in the wild as an open proxy.

I recently moved all of www.ogre.com to be served via a TS reverse proxy setup. Within minutes of setting it up, and while watching the log files, I found entries like these in the logs:

1273927263.211 0 125.45.109.166 ERR_CONNECT_FAIL/404 485 GET http://proxyjudge1.proxyfire.net/fastenv - NONE/- tex -
1274057848.081 0 125.45.109.166 ERR_CONNECT_FAIL/404 485 GET http://proxyjudge1.proxyfire.net/fastenv - NONE/- tex -
1274236765.403 4 125.45.109.166 ERR_CONNECT_FAIL/404 485 GET http://proxyjudge1.proxyfire.net/fastenv - NONE/- tex -

Of course, my server doesn't allow this, since it's set up to only accelerate www.ogre.com traffic. The point is that I think we did the right thing by shipping Apache Traffic Server with a very restrictive configuration.
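Probes like the ones in the log above are easy to spot mechanically. The following Python sketch is one way to do it, under the assumption that the client IP is the third whitespace-separated field and the requested URL the seventh, as in the entries shown; the function name and the allowed-host set are hypothetical.

```python
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"www.ogre.com"}  # hosts this server is meant to accelerate

def foreign_requests(log_lines, allowed=ALLOWED_HOSTS):
    """Return (client_ip, host) pairs for requests aimed at other sites."""
    hits = []
    for line in log_lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # not a complete log entry
        client_ip, url = fields[2], fields[6]
        host = urlsplit(url).hostname
        if host and host not in allowed:
            hits.append((client_ip, host))
    return hits

sample = [
    "1273927263.211 0 125.45.109.166 ERR_CONNECT_FAIL/404 485 GET "
    "http://proxyjudge1.proxyfire.net/fastenv - NONE/- tex -",
]
print(foreign_requests(sample))
```

Any client that shows up here is asking the server to fetch a site it was never configured to serve, which is exactly what an open-proxy scanner does.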

Ogre is now running on Apache Traffic Server

I've just switched over to serve all of www.ogre.com out of Apache Traffic Server. The site is still managed and created using Apache HTTPD, PHP and Drupal, but that is running as an "origin" server to ATS. This gives me a few benefits over serving straight out of Apache HTTPD:

  • Static content is automatically "cached" on the ATS server, and it can serve such content very fast with low latency.
  • I can jack up keep-alive much higher than I dared to do with HTTPD. Fwiw, I still use the pre-fork MPM, so I have a limited number of processes and can't afford to tie those up with idle KA connections.
  • In a pinch, I could make the HTML generated by Drupal cacheable, and serve it straight out of ATS. I'm contemplating making this setting automatic, so when the load on the box hits a certain level, all HTML will also be cached by ATS. That would increase my capacity by at least an order of magnitude, I think.

This change required no changes on my Drupal site, but I did change the port on my Apache HTTPD virtual host:

NameVirtualHost 209.126.158.218:8080

<VirtualHost 209.126.158.218:8080>
    ServerName www.ogre.com
...

I then installed Apache Traffic Server to listen on port 80, and I also told it to only bind a specific IP on my server (I have three IPs for different things). I also increased the RAM cache size and Keep-Alive timeouts, so I now have these changes in etc/trafficserver/records.config:

CONFIG proxy.config.proxy_name STRING kramer3.ogre.com
CONFIG proxy.config.http.server_port INT 80
CONFIG proxy.config.http.keep_alive_no_activity_timeout_in INT 60
CONFIG proxy.config.http.keep_alive_no_activity_timeout_out INT 1
CONFIG proxy.config.http.transaction_no_activity_timeout_in INT 15
CONFIG proxy.config.http.transaction_no_activity_timeout_out INT 30
CONFIG proxy.config.cache.ram_cache.size LLONG 33554432

LOCAL proxy.local.incoming_ip_to_bind STRING 209.126.158.218

Next, I added a disk cache to use for ATS, etc/trafficserver/storage.config:

/disk/tmp 134217728

This creates a 128MB cache in /disk/tmp. I know, very small, but this is still experimental. Finally, I added a remapping rule to etc/trafficserver/remap.config:

map http://www.ogre.com/ http://www.ogre.com:8080/

After starting everything up, the entire site is now reverse proxied (or accelerated) through Apache Traffic Server! As you can see, the changes necessary to ATS are fairly small and pretty straightforward, and most of the default settings 'just work'. It's a miracle.
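The RAM cache and disk cache sizes in the configuration above are given in plain bytes, which makes them hard to read at a glance. A tiny Python helper (the name is made up for illustration) shows how the two values are derived:

```python
def mib(n):
    """Convert a size in MiB to bytes, as used in the config files."""
    return n * 1024 * 1024

print(mib(32))   # 33554432, the proxy.config.cache.ram_cache.size value
print(mib(128))  # 134217728, the size given for /disk/tmp in storage.config
```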

First "alpha" release of Apache Traffic Server

We've finally produced our first "alpha" release of Apache Traffic Server. It can be fetched from your local Apache Download Mirror. This first version, v2.0.0-alpha, should be reasonably stable as it does not contain a ton of improvements over the old Yahoo code. The 2.0.0 releases also only support Linux, but a number of 32-bit and 64-bit distros have been tested.

We're hoping to get some testing done on this release in the next week or two, so we can make the "final" 2.0.0 release. After that, the plan is to aggressively start making "developer" releases of trunk, which has some impressive improvements (like, up to 2x the performance in some cases). Apache Traffic Server is adopting the same versioning scheme as Apache HTTPD, so, v2.0.x is a "stable" release, while v2.1.x is a developer release. This implies, of course, that the next stable release will be v2.2.0, and we're hoping to get that out the door sometime this summer!
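The versioning scheme described above boils down to the parity of the minor version number. As a small illustration in Python (the function name is hypothetical):

```python
def is_stable(version):
    """Even minor number means a stable release, odd means a developer release."""
    minor = int(version.lstrip("v").split(".")[1])
    return minor % 2 == 0

for v in ("v2.0.0-alpha", "v2.1.0", "v2.2.0"):
    print(v, "stable" if is_stable(v) else "developer")
```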

People interested in Apache Traffic Server are highly encouraged to join our mailing lists (see the incubator page), or come join us on #traffic-server on freenode.net.

Traffic Server performance

I just got a shiny new desktop at home, a Quad Core i7 920, running 64-bit Linux 2.6.32 (stock Fedora Core 12). Nothing fancy, it's a $250 CPU, but I did some artificial tests of Traffic Server on it just to see where we are on a modern machine and kernel. The tests are done between two Linux boxes, both on a GigE network, with two Linksys switches in between the two boxes. So, this is definitely not a production quality network in any way. I ran with keep-alive enabled, doing 100 requests per connection, where each request fetches a 500 byte body out of cache (i.e. 100% cache hit ratio). I know, this is a completely unrealistic test, but it is still interesting to see what we can do. Here's the best run out of three:

2270442 fetches on 23464 conns, 2000 max parallell, 1.13522e+09 bytes in 30 seconds
500 mean bytes/fetch
75681.40 fetches/sec, 3.78406e+07 bytes/sec
msecs/connect: 0.3878585 mean, 11.1295 max, 0.089 min
msecs/first-response: 16.3695 mean, 459.379 max, 0.1295 min

So, over 75,000 requests per second, at 16ms latency, and I think some of that latency can be attributed to the two switches (wish I had a better setup). An interesting side note, the Traffic Server process can actually use about 470% CPU on this Quad Core box, so an extra 70% of a CPU is "gained" here from Hyper Threading. How does this compare to my old desktop? Well, the old system was a Core2, and in the same test, it was able to pull off around 35k QPS.

Anyways, these results aren't too shabby, and we've just started!

Caveat: This is using the "trunk" version of Traffic Server, the first release that is soon coming out won't be quite this fast.

Update: I updated the numbers with the results after forcing my CPU to run at its highest frequency at all times. No overclocking though, this is a standard i7 920 setup.

 

Traffic Server week 1

Traffic Server has been out for a full week now. And it's been great: the interest is huge (almost overwhelming), and surprisingly, lots of people want to participate and contribute. So far, we've already achieved:

  • 64-bit port on Linux!
  • Port to Solaris (and OpenSolaris I believe)
  • Port to Ubuntu (it required a lot of changes due to glibc changes)
  • MacOSX port is partially done.

Not bad for a week. If you are interested, check out our Wiki: http://cwiki.apache.org/confluence/display/TS/Traffic+Server.