Most clients support what we call Forward Proxying: you explicitly tell the client which server (and port) to use as a proxy. This has traditionally been done over HTTP, with the addition of support for the CONNECT method for HTTPS requests. We are now starting to see some clients supporting Forward Proxy over HTTPS, and you might wonder why. Well, a few reasons could include
I saw this tweet from Daniel Stenberg, looking for volunteers to implement support for this in curl. I don't know if he's got any takers yet :). Firefox and Chrome are both working on this feature, and Chrome already has the basics available. Since I work on a proxy server (Apache Traffic Server), I took the opportunity to test it with the latest Chrome. Lo and behold, it simply worked right out of the box! I started Chrome with this (OSX) command:
% /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --proxy-server=https://localhost:443
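To make the difference from plain HTTP proxying concrete, here is a minimal Python sketch of what a client does with an HTTPS forward proxy: it first negotiates TLS with the proxy itself, and only then sends the usual CONNECT request inside that encrypted channel. The function names and the 10 second timeout are my own assumptions for illustration; this is not how Chrome or curl implement it.

```python
import socket
import ssl


def build_connect_request(host: str, port: int) -> bytes:
    """Build the CONNECT request asking the proxy for a tunnel to host:port."""
    target = f"{host}:{port}"
    return f"CONNECT {target} HTTP/1.1\r\nHost: {target}\r\n\r\n".encode("ascii")


def open_tunnel(proxy_host: str, proxy_port: int, host: str, port: int,
                timeout: float = 10.0) -> ssl.SSLSocket:
    """Open a TLS connection to the proxy, then request a tunnel through it.

    Hypothetical helper: error handling is reduced to a status check.
    """
    context = ssl.create_default_context()
    raw = socket.create_connection((proxy_host, proxy_port), timeout=timeout)
    # The TLS handshake here is with the proxy, not with the final server.
    tls = context.wrap_socket(raw, server_hostname=proxy_host)
    tls.sendall(build_connect_request(host, port))
    status_line = tls.recv(4096).split(b"\r\n", 1)[0]
    parts = status_line.split()
    if len(parts) < 2 or parts[1] != b"200":
        tls.close()
        raise ConnectionError(f"proxy refused tunnel: {status_line!r}")
    return tls
```

With a plain HTTP proxy the CONNECT line travels in the clear; here even the name of the destination is hidden from anyone between the client and the proxy.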
So, I was going to check my email on my private IMAPS (TLS, port 993) mail server, and I got this warning about the certificate from my mail client (Apple Mail). Curious, I checked the certificate, and found this:
issuer=/C= /ST=Some-State/O=Blue Coat SG900 Series/OU=4312240020/CN=172.16.0.50
Note: This is the TLS handshake with the MITM proxy server:
issuer=/C= /ST=Some-State/O=Blue Coat SG900 Series/OU=4312240020/CN=172.16.0.50
No client certificate CA names sent
SSL handshake has read 1754 bytes and written 456 bytes
New, TLSv1/SSLv3, Cipher is DHE-RSA-AES256-SHA
Server public key is 2048 bit
Secure Renegotiation IS supported
Protocol : TLSv1
Cipher : DHE-RSA-AES256-SHA
Key-Arg : None
Start Time: 1397924326
Timeout : 300 (sec)
Verify return code: 21 (unable to verify the first certificate)
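A check like the one I did by eye can be scripted. The sketch below, in Python, splits an issuer line in the slash-separated form shown above into its fields and compares the organization against the one you expect for your own server. The function names and the "expected organization" idea are my assumptions, and the parser is deliberately naive: it would break on values that themselves contain a slash.

```python
def parse_issuer(line: str) -> dict:
    """Parse 'issuer=/C=../O=../CN=..' into a dict of field name to value."""
    _, _, dn = line.partition("=")  # drop the leading 'issuer' label
    fields = {}
    for part in dn.split("/"):
        if not part:
            continue
        key, _, value = part.partition("=")
        fields[key] = value.strip()
    return fields


def issuer_is_unexpected(line: str, expected_org: str) -> bool:
    """Return True when the certificate was issued by someone else."""
    return parse_issuer(line).get("O") != expected_org


ISSUER = ("issuer=/C= /ST=Some-State/O=Blue Coat SG900 Series"
          "/OU=4312240020/CN=172.16.0.50")
fields = parse_issuer(ISSUER)
print(fields["O"], fields["CN"])
```

For the issuer above, the organization field names the intercepting appliance rather than whoever issued my mail server's real certificate, which is exactly the mismatch that triggered the warning.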
I run Drupal behind an Apache Traffic Server caching proxy. In my setup, the proxy listens on port 80, and the real Apache HTTPD server listens on port 82 (which is firewalled off). In my Traffic Server remap.config, I have a rule like
map http://www.boot.org http://www.boot.org:82
Granted, in retrospect, this is not the best of setups, but it does cause serious problems with Drupal 7, whereas it does not cause problems with Drupal 6. In D7, the favicon.ico and all JS and CSS URLs in the head are created as absolute URLs. I don't set an explicit $base_url in my Drupal settings.php (more on that later), and this causes the URLs to get the wrong base! These URLs all end up with a form like
Yikes! This obviously fails, since port 82 is not accessible from the outside. Browsing the forums, the "solution" seems to be to set the $base_url in the Drupal settings.php configuration file, e.g.
$base_url = 'http://www.boot.org'; // NO trailing slash!
This does indeed solve the problem; however, it now breaks when I want to use e.g. https://www.boot.org for admin access. Besides, why these URLs should be absolute is a mystery to me; they certainly were not in D6.
The solution I'm ending up with is of course to change Apache Traffic Server to use what we call "pristine host headers", so that the Origin server (Apache HTTPD and Drupal) sees the original client Host: header. I could not get any help from the Drupal IRC, or forums, but if anyone has any insight on why D7 is doing this crazy stuff with absolute URLs, please post. In an ideal world, they really should change these to be relative, e.g. /misc/favicon.ico.
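Here is a small Python model of why the pristine host header fixes this. It assumes that, with no $base_url set, Drupal builds absolute URLs from the Host: header it receives, and that without pristine host headers the proxy rewrites Host: to match the remap target. Both function names are hypothetical, and this models the behavior only; it is not Traffic Server or Drupal code.

```python
from urllib.parse import urlsplit


def host_seen_by_origin(client_host: str, remap_target: str,
                        pristine_host_hdr: bool) -> str:
    """Return the Host: header value the origin server receives."""
    if pristine_host_hdr:
        return client_host  # pass the client's header through untouched
    # Assumed default: Host: is rewritten to the remap target's host:port.
    return urlsplit(remap_target).netloc


def absolute_url(host_header: str, path: str, scheme: str = "http") -> str:
    """Build an absolute URL the way an app without $base_url might."""
    return f"{scheme}://{host_header}{path}"


for pristine in (False, True):
    host = host_seen_by_origin("www.boot.org", "http://www.boot.org:82", pristine)
    print(pristine, absolute_url(host, "/misc/favicon.ico"))
```

In this model the rewritten header leaks the internal port into every generated URL, while the pristine header keeps the public host name, and the same reasoning holds for the https admin case, since nothing is hard-coded.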
I just ran a small test against Apache Traffic Server, to see how performance has improved since last time. My test box is my desktop, an Intel Core i7-920 at 2.67GHz (no overclocking), and the client runs on a separate box, over a cheap GigE network (two switches in between). Here are the latest results:
2,306,882 fetches on 450 conns, 450 max parallel, 2.306882E+08 bytes, in 15 seconds
100 mean bytes/fetch
153,792.0 fetches/sec, 1.537920E+07 bytes/sec
msecs/connect: 5.326 mean, 13.078 max, 0.149 min
msecs/first-response: 2.094 mean, 579.752 max, 0.099 min
This is of course for very small objects (100 bytes) served out of RAM cache, with HTTP keep-alive. Still respectable: close to 154k QPS out of a very low-end, commodity box.
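The summary lines follow directly from the first line of the output. As a quick sanity check, this Python sketch recomputes them from the raw totals; the function name is my own, and the numbers are copied from the run above.

```python
def throughput(fetches: int, total_bytes: int, seconds: float):
    """Return (fetches per second, bytes per second, mean bytes per fetch)."""
    return fetches / seconds, total_bytes / seconds, total_bytes / fetches


# Totals from the run above: 2,306,882 fetches, 2.306882E+08 bytes, 15 seconds.
qps, bytes_per_sec, mean_bytes = throughput(2_306_882, 230_688_200, 15)
print(f"{qps:,.1f} fetches/sec, {bytes_per_sec:.6E} bytes/sec, "
      f"{mean_bytes:.0f} mean bytes/fetch")
```

The recomputed rate differs from the reported one by a fraction of a fetch per second, presumably because the tool measured slightly more than exactly 15 seconds.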
We have discussed numerous times on the Apache mailing lists why Apache Traffic Server ships with a default configuration that is almost entirely locked down. Our argument has been that we want to make sure that someone testing TS does not accidentally set it up in the wild as an open proxy.
I recently moved all of www.ogre.com to be served via a TS reverse proxy setup. Within minutes from setting it up, and while watching the log files, I found entries like these in the logs:
1273927263.211 0 188.8.131.52 ERR_CONNECT_FAIL/404 485 GET http://proxyjudge1.proxyfire.net/fastenv - NONE/- tex -
1274057848.081 0 184.108.40.206 ERR_CONNECT_FAIL/404 485 GET http://proxyjudge1.proxyfire.net/fastenv - NONE/- tex -
1274236765.403 4 220.127.116.11 ERR_CONNECT_FAIL/404 485 GET http://proxyjudge1.proxyfire.net/fastenv - NONE/- tex -
Of course, my server doesn't allow this, since it's set up to only accelerate www.ogre.com traffic. But the point is, I think we've done the right thing by shipping Apache Traffic Server with a very restrictive configuration.
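Spotting these probes by eye gets old quickly. The Python sketch below flags any log line whose request URL points at a host the proxy is not supposed to serve. It assumes the whitespace-separated layout of the entries above, with the URL in the seventh field; the function name and the allowed-host set are hypothetical.

```python
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"www.ogre.com"}  # hosts this reverse proxy should serve


def is_open_proxy_probe(log_line: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Return True when the request targets a host we do not accelerate."""
    fields = log_line.split()
    if len(fields) < 7:
        return False  # not a line in the layout we understand
    host = urlsplit(fields[6]).hostname
    return host is not None and host not in allowed_hosts


line = ("1273927263.211 0 188.8.131.52 ERR_CONNECT_FAIL/404 485 GET "
        "http://proxyjudge1.proxyfire.net/fastenv - NONE/- tex -")
print(is_open_proxy_probe(line))
```

All three entries above are flagged by this check, since each one asks the proxy to fetch a page from a "proxy judge" site rather than from www.ogre.com.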
I've just switched over to serve all of www.ogre.com out of Apache Traffic Server. The site is still managed and created using Apache HTTPD, PHP and Drupal, but that is running as an "origin" server to ATS. This gives me a few benefits over serving straight out of Apache HTTPD:
This change required no changes on my Drupal site, but I did change the port on my Apache HTTPD virtual host:
I then installed Apache Traffic Server to listen on port 80, and I also told it to only bind a specific IP on my server (I have three IPs for different things). I also increased the RAM cache size and Keep-Alive timeouts, so I now have these changes in etc/trafficserver/records.config:
CONFIG proxy.config.proxy_name STRING kramer3.ogre.com
CONFIG proxy.config.http.server_port INT 80
CONFIG proxy.config.http.keep_alive_no_activity_timeout_in INT 60
CONFIG proxy.config.http.keep_alive_no_activity_timeout_out INT 1
CONFIG proxy.config.http.transaction_no_activity_timeout_in INT 15
CONFIG proxy.config.http.transaction_no_activity_timeout_out INT 30
CONFIG proxy.config.cache.ram_cache.size LLONG 33554432
LOCAL proxy.local.incoming_ip_to_bind STRING 18.104.22.168
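The records.config lines all share the same four-column shape (scope, name, type, value), so they are easy to read back programmatically. This Python sketch parses the lines above and converts the RAM cache size into MiB. The parser is my own simplification: it only treats INT and LLONG as integers, keeps everything else as a string, and is not the parser Traffic Server itself uses.

```python
RECORDS = """\
CONFIG proxy.config.proxy_name STRING kramer3.ogre.com
CONFIG proxy.config.http.server_port INT 80
CONFIG proxy.config.http.keep_alive_no_activity_timeout_in INT 60
CONFIG proxy.config.http.keep_alive_no_activity_timeout_out INT 1
CONFIG proxy.config.http.transaction_no_activity_timeout_in INT 15
CONFIG proxy.config.http.transaction_no_activity_timeout_out INT 30
CONFIG proxy.config.cache.ram_cache.size LLONG 33554432
LOCAL proxy.local.incoming_ip_to_bind STRING 18.104.22.168
"""


def parse_records(text: str) -> dict:
    """Map each setting name to its value, converting integer types."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        _scope, name, kind, value = line.split(None, 3)
        settings[name] = int(value) if kind in ("INT", "LLONG") else value
    return settings


settings = parse_records(RECORDS)
ram_mib = settings["proxy.config.cache.ram_cache.size"] // 2**20
print(f"RAM cache: {ram_mib} MiB")
```

The 33554432 byte RAM cache works out to 32 MiB, which is modest but sufficient for a small site like this one.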
Next, I added a disk cache to use for ATS, etc/trafficserver/storage.config:
This creates a 128MB cache in /disk/tmp. I know, very small, but this is still experimental. Finally, I added a remapping rule to etc/trafficserver/remap.config:
map http://www.ogre.com/ http://www.ogre.com:8080/
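To show what that one rule does to an incoming request, here is a rough Python model of a "map" rule as a simple prefix rewrite. The function names are hypothetical, and real remap.config handling supports more rule types and matching details than this sketch covers.

```python
def parse_map_rule(line: str):
    """Split a 'map <from> <to>' line into its (from, to) URL prefixes."""
    keyword, source, target = line.split()
    if keyword != "map":
        raise ValueError(f"unsupported rule type: {keyword}")
    return source, target


def apply_remap(rules, url: str):
    """Return the origin URL for a request, or None if no rule matches."""
    for source, target in rules:
        if url.startswith(source):
            return target + url[len(source):]
    return None  # no rule matched, so the request is not proxied


rules = [parse_map_rule("map http://www.ogre.com/ http://www.ogre.com:8080/")]
print(apply_remap(rules, "http://www.ogre.com/blog"))
print(apply_remap(rules, "http://proxyjudge1.proxyfire.net/fastenv"))
```

Requests for anything other than www.ogre.com match no rule in this model, which mirrors why a reverse proxy configured this way does not act as an open proxy.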
After starting everything up, the entire site is now reverse proxied (or accelerated) through Apache Traffic Server! As you can see, the changes necessary to ATS are fairly small and pretty straightforward; most of the default settings 'just work'. It's a miracle.
We've finally produced our first "alpha" release of Apache Traffic Server. It can be fetched from your local Apache Download Mirror. This first version, v2.0.0-alpha, should be reasonably stable as it does not contain a ton of improvements over the old Yahoo code. The 2.0.0 release also only supports Linux, but a number of 32-bit and 64-bit distros have been tested.
We're hoping to get some testing done on this release in the next week or two, so we can make the "final" 2.0.0 release. After that, the plan is to aggressively start making "developer" releases of trunk, which has some impressive improvements (like, up to 2x the performance in some cases). Apache Traffic Server is adopting the same versioning scheme as Apache HTTPD, so, v2.0.x is a "stable" release, while v2.1.x is a developer release. This implies, of course, that the next stable release will be v2.2.0, and we're hoping to get that out the door sometime this summer!
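The versioning rule is easy to express in code. This Python sketch assumes the even/odd convention implied by the examples above (v2.0.x and v2.2.0 stable, v2.1.x developer): an even minor number means stable, an odd one means developer. The function names are my own.

```python
def minor_version(version: str) -> int:
    """Extract the minor number from strings like 'v2.0.0-alpha' or '2.1.x'."""
    return int(version.lstrip("v").split(".")[1])


def is_stable(version: str) -> bool:
    """Assumed rule: even minor number is stable, odd is a developer release."""
    return minor_version(version) % 2 == 0


for v in ("v2.0.0-alpha", "v2.1.x", "v2.2.0"):
    print(v, "stable" if is_stable(v) else "developer")
```

Note that "stable" here refers only to the release series; the alpha tag on v2.0.0-alpha is a separate qualifier that this check ignores.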
People interested in Apache Traffic Server are highly encouraged to join our mailing lists (see the incubator page), or come join us on #traffic-server on freenode.net.
I just got a new shiny desktop at home, a Quad Core i7 920, running 64-bit Linux 2.6.32 (stock Fedora Core 12). Nothing fancy, it's a $250 CPU, but I did some artificial tests of Traffic Server on it just to see where we are on a modern machine and kernel. The tests are done between two Linux boxes, both on GigE network, and with two Linksys switches in between the two boxes. So, this is definitely not a production quality network in any way. I ran with keep-alive enabled, doing 100 requests per connection, each request fetches a 500 byte body out of cache (i.e. 100% cache hit ratio). I know, this is a completely unrealistic test, but it is still interesting to see what we can do. Here's the best run out of three:
2270442 fetches on 23464 conns, 2000 max parallel, 1.13522e+09 bytes in 30 seconds
500 mean bytes/fetch
75681.40 fetches/sec, 3.78406e+07 bytes/sec
msecs/connect: 0.3878585 mean, 11.1295 max, 0.089 min
msecs/first-response: 16.3695 mean, 459.379 max, 0.1295 min
So, over 75,000 requests per second, at 16ms latency, and I think some of that latency can be attributed to the two switches (wish I had a better setup). An interesting side note, the Traffic Server process can actually use about 470% CPU on this Quad Core box, so an extra 70% of a CPU is "gained" here from Hyper Threading. How does this compare to my old desktop? Well, the old system was a Core2, and in the same test, it was able to pull off around 35k QPS.
Anyways, these results aren't too shabby, and we've just started!
Caveat: This is using the "trunk" version of Traffic Server, the first release that is soon coming out won't be quite this fast.
Update: I updated the numbers with the results after forcing my CPU to run at the highest CPU frequency at all times. No overclocking though, this is a standard i7 920 setup.
Traffic Server has been out for a full week now. And it's been great, the interest is huge (almost overwhelming), and surprisingly, lots of people want to participate and contribute. So far, we've already achieved:
Not bad for a week. If you are interested, check out our Wiki: http://cwiki.apache.org/confluence/display/TS/Traffic+Server.
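A few of the figures quoted in this post can be derived from the raw output, which makes a handy cross-check. The Python sketch below recomputes the request rate, the average number of requests actually served per connection, the CPU "gained" beyond four full cores, and the speedup over the old Core2 box. The function names are mine, and the 35k QPS figure for the old system is only the approximate number mentioned above.

```python
def requests_per_connection(fetches: int, connections: int) -> float:
    """Average number of requests served on each keep-alive connection."""
    return fetches / connections


def extra_cpu_percent(observed_percent: float, physical_cores: int) -> float:
    """CPU usage beyond what the physical cores alone could provide."""
    return observed_percent - 100.0 * physical_cores


qps = 2_270_442 / 30                        # fetches per second
per_conn = requests_per_connection(2_270_442, 23_464)
ht_gain = extra_cpu_percent(470, 4)         # quad core, about 470% observed
speedup = qps / 35_000                      # versus roughly 35k QPS on the Core2

print(f"{qps:.1f} fetches/sec, {per_conn:.1f} requests/connection")
print(f"{ht_gain:.0f}% extra CPU, {speedup:.2f}x the old desktop")
```

The requests per connection come out slightly under the configured 100, which is what you would expect when the run is cut off at 30 seconds with connections still in flight.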