Caching yum repo data in Fedora

Since Comcast is nuts and meters my connection, and I run a lot of yum update and yum install in my day-to-day work, I decided to set up an HTTP cache. I decided against running a mirror, because I don't need everything Fedora provides, only a very small subset. Now, the modern Fedora DNF configuration uses metalink and HTTPS URLs, which are not (easily) cacheable. The first thing I ended up doing was to edit the yum.repos.d configuration files, for example fedora.repo:

[fedora]
name=Fedora $releasever - $basearch
failovermethod=priority
baseurl=http://some_cache.ogre.com/pub/fedora/linux/releases/$releasever/Everything/$basearch/os/
enabled=1
...

In particular, notice that I removed (commented out) the metalink configuration. Alternatively, I'm fairly certain you can keep the baseurl pointing to the normal download.fedoraproject.org server, and instead add a proxy= configuration option to /etc/dnf/dnf.conf. However, since I had to edit the .repo files anyway, I figured I might as well do it all there.
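If you have several .repo files to change, the edit can be scripted. Here is a minimal Python sketch of the idea, assuming the stock files carry a (possibly commented-out) baseurl= line plus a metalink= line; the function name point_repo_at_cache and the regular expression are my own, not something DNF provides:

```python
import re

CACHE_BASE = "http://some_cache.ogre.com"  # cache hostname from the example above

# Matches "baseurl=..." or "#baseurl=..." and captures the path after the host.
BASEURL_RE = re.compile(r"^#?\s*baseurl=https?://[^/]+(/.*)$")


def point_repo_at_cache(repo_text, cache_base=CACHE_BASE):
    """Return repo_text with baseurl aimed at the cache and metalink commented out."""
    out = []
    for line in repo_text.splitlines():
        match = BASEURL_RE.match(line)
        if match:
            # Keep the original path, swap in the cache host, drop any leading "#".
            out.append("baseurl=" + cache_base + match.group(1))
        elif line.startswith("metalink="):
            out.append("#" + line)
        else:
            out.append(line)
    return "\n".join(out) + "\n"
```

Back up the original file first, then feed its contents through the function and write the result back. Lines that match neither pattern pass through untouched.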

Next, it's time to set up a caching proxy; in my case, the obvious choice is Apache Traffic Server. What gets a little tricky here is that the Fedora download servers send a lot of redirects to the mirrors, which mostly defeats the purpose of caching. In my configuration (remap.config), I work around this by making sure ATS itself follows such redirects:

map http://some_cache.ogre.com http://download.fedoraproject.org \
    @plugin=conf_remap.so \
    @pparam=proxy.config.http.number_of_redirections=2 \
    @pparam=proxy.config.http.redirect_use_orig_cache_key=1
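To see why following the redirect inside the proxy matters, here is a toy Python model of the behavior. It is not ATS code: the hostnames, function names, and mirror rotation are all invented for illustration. The point is only that the final response gets stored under the URL the client originally asked for, so the next request hits the cache no matter which mirror the origin would have picked:

```python
# Toy model of a cache in front of a redirecting origin. Not ATS code:
# hostnames, function names, and the mirror rotation are all invented.
ORIGIN = "http://download.example"
MIRRORS = ["http://mirror-a.example", "http://mirror-b.example"]


def origin_fetch(url, attempt):
    """The origin host answers 302 to some mirror; mirrors answer 200."""
    if url.startswith(ORIGIN):
        mirror = MIRRORS[attempt % len(MIRRORS)]
        return 302, mirror + url[len(ORIGIN):]
    return 200, "bytes of " + url.rsplit("/", 1)[-1]


def cached_get(url, cache, stats, max_redirects=2):
    """Follow redirects inside the cache; store under the ORIGINAL url."""
    if url in cache:
        stats["hits"] += 1
        return cache[url]
    stats["misses"] += 1
    target = url
    for _ in range(max_redirects + 1):
        status, value = origin_fetch(target, stats["misses"])
        if status == 200:
            cache[url] = value
            return value
        target = value  # follow the Location ourselves, as the proxy would
    raise RuntimeError("too many redirects")


cache, stats = {}, {"hits": 0, "misses": 0}
url = ORIGIN + "/pub/fedora/foo.rpm"
cached_get(url, cache, stats)
cached_get(url, cache, stats)
print(stats)  # {'hits': 1, 'misses': 1}
```

If the cache instead passed the 302 back to the client, yum would go fetch from the mirror directly and nothing useful would ever be stored.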

Update: I attached a Lua plugin script that's a little more clever and does more aggressive caching of the various RPM file extensions. It combines all the logic above and this additional cache tweaking into one nice script.

FWIW, I tried using https://download.fedoraproject.org, but that did not seem to work :-/. I made a few other minor tweaks in ATS itself, but primarily this is running a stock ATS configuration out of the box (change the ports definition though). One thing to bear in mind is that the content from Fedora (and its mirrors) does not include a Cache-Control or Expires header. So, I ended up changing the ATS configuration (records.config) to allow heuristics based on Last-Modified:

CONFIG proxy.config.http.cache.required_headers INT 1
CONFIG proxy.config.http.cache.heuristic_min_lifetime INT 7200
CONFIG proxy.config.http.cache.heuristic_max_lifetime INT 172800
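To get a feel for what the min/max do, here is a rough Python sketch of a Last-Modified heuristic of the kind ATS applies: take a fraction of the object's age at fetch time and clamp it between the two limits. The 10% factor is an assumption on my part (as far as I know it matches the default of proxy.config.http.cache.heuristic_lm_factor, but check the documentation for your version), and heuristic_lifetime is just an illustrative name:

```python
from datetime import datetime, timedelta, timezone

MIN_LIFETIME = 7200     # heuristic_min_lifetime from the config above (seconds)
MAX_LIFETIME = 172800   # heuristic_max_lifetime from the config above (seconds)
LM_FACTOR = 0.10        # assumed fraction of the object's age; verify for your ATS


def heuristic_lifetime(date, last_modified, lm_factor=LM_FACTOR):
    """Seconds an object with no Expires/Cache-Control would be considered fresh."""
    age_at_fetch = max((date - last_modified).total_seconds(), 0)
    guess = age_at_fetch * lm_factor
    # Clamp the guess into [MIN_LIFETIME, MAX_LIFETIME].
    return round(min(max(guess, MIN_LIFETIME), MAX_LIFETIME))


now = datetime(2020, 1, 1, tzinfo=timezone.utc)
print(heuristic_lifetime(now, now - timedelta(hours=1)))   # clamped up to the minimum
print(heuristic_lifetime(now, now - timedelta(days=5)))    # 10% of five days
print(heuristic_lifetime(now, now - timedelta(days=365)))  # clamped down to the maximum
```

With these settings, a freshly rebuilt repomd.xml stays cached for two hours, while an RPM that has not changed in a year is kept for the full two days before ATS revalidates it.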

Obviously, set the min/max to numbers that are reasonable for your environment and needs. Enjoy!