Sylvain Mareschal, Ph.D.
Bioinformatics engineer
May 31, 2013 at 19:08
Yum manual update without network
If, like me, you have to deal with a facetious IT department, or simply with 56k-era internet access, you have found that keeping software up to date or installing something without managing the dependencies yourself is quite a challenge. So here is the solution I developed to update my Fedora 18 install on a computer without network access, using of course an alternative connection point to download the required data.

As if it wasn't tedious enough, I had a few more constraints: the only connection point available is a Windows XP machine, behind a firewall that blocks anything but HTTP traffic. As the only machine running Linux in my lab is the one to update (hence its isolation from the main network, thanks a lot for all this confidence in free software ...), building the package cache on the connected machine is out of the question: the cache needs to be built locally, on the machine to update, to get the proper list of packages and dependencies. The packages can then be downloaded from the connection point, and installed on the computer to update.

The solution I developed is meant to be stored on a USB pen drive (at least 1 GB if you don't update your system frequently), as some scripts need to be executed on the machine to update and others on the one connected to the internet. Here are the files; .bat scripts are for my Windows connection point and .bash scripts for my Fedora machine to update:

/getPackages
/getPackages/GnuWin32, GnuWin wget installation directory
/getPackages/packages, empty directory that will hold the downloaded packages
/getPackages/yum, empty directory that will hold the local yum cache and config
/getPackages/1_getRepodata.bat, to download the yum cache file
/getPackages/2_buildCache.bash, to build the yum cache from the downloaded files
/getPackages/3_getURL.bash, to list packages and dependencies
/getPackages/4_getPackages.bat, to download listed packages
/getPackages/5_installPackages.bash, to install downloaded packages
/getPackages/URL.txt, initially empty text file that will list the URLs of the packages to download
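
The empty parts of this layout can be created in one go from the Fedora side. This is only a sketch: the make_skeleton name and the example mount path are placeholders of mine, and the GnuWin32 directory still has to be filled by installing wget from the Windows machine.

```shell
#!/bin/sh
# Create the empty skeleton of the getPackages tree on the USB drive.
# $1: target directory (placeholder, depends on where your drive is mounted).
make_skeleton() {
    root="$1"
    mkdir -p "$root/GnuWin32" "$root/packages" "$root/yum" || return 1
    # URL.txt starts empty (an existing one is truncated); step 3 fills it.
    : > "$root/URL.txt"
}

# Example: make_skeleton /run/media/user/USBKEY/getPackages
```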


A few considerations on the yum system

In Linux systems, software is split into packages that rely heavily on one another, hence the problem of dependencies: as a required package may or may not already be installed (because it is needed by another program), the list of packages needed by the program to install must be compared to the list of packages already installed, to know which packages are to be downloaded. And each dependency may have its own dependencies, and so on ... As a consequence, the set of packages to download and install for the same program may vary dramatically between two distinct computers.

With the yum system, a few frequently updated files are needed to build a package cache, which is queried when commands like yum update or yum install ... are processed. This cache lists all the available packages and the various dependencies between them, to help select the ones to download. In the usual case, this cache is silently refreshed from a remote mirror just before the query if it is older than the limit defined in the yum configuration file (a few hours by default). The whole cache is not systematically refreshed as it is quite heavy: data like package file lists is refreshed only if the query has to deal with it (like yum provides ...).

Packages are standardized, i.e. names and versions should not be ambiguous, and gathered into repositories, websites offering everyone the opportunity to get the packages they need, in the versions they need. These repositories are mirrored, so with a "shopping list" of packages it is quite simple to get the software you need, even if the main Fedora repository is out of service.


Select mirrors

First we need to choose a remote mirror to get the package data (and later the packages themselves) from. The yum system usually picks the fastest mirror to update itself, but as most of them are FTP or RSYNC based (protocols that are generally faster than HTTP), I had to manually pick an HTTP one. To pick a mirror corresponding to your Fedora version and architecture, you can use the Fedora project mirror list. Let's work with the OVH one (fedora.mirrors.ovh.net), as it provides HTTP traffic with a satisfying bandwidth (2000 Mbits/sec) from my own country (France).

Don't forget to pick a mirror for each repository you are using (at least "fedora" and "updates").


1. Download the cache data

If you try to browse a mirror, you will find that a lot of files are available. Not all of them are required, as many are just different packagings of the same data (with or without compression, as flat file or database ...) provided for compatibility. Here is my own selection from a Fedora 18 mirror:
  • other.sqlite.bz2
  • filelists.sqlite.bz2
  • primary.sqlite.bz2
  • prestodelta.xml.gz
  • comps-f18.xml.gz
  • repomd.xml
  • pkgtags.sqlite.gz
  • updateinfo.xml.gz

To get all of these files, you can write a short script that makes use of the wget command on your internet-connected machine. Wget is a powerful tool to download files, and it has also been ported to Windows as a GnuWin package. You can find at the beginning of this article the batch script I use on Windows. On Linux the commands are the same, just add a shebang and execute it from your favorite shell.
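
For a Linux connection point, the download could be sketched as below. This is not the batch script mentioned above: the mirror path in BASE_URL, the yum/fedora/repodata layout and the list_repodata_urls helper are assumptions of mine. It relies on repomd.xml listing the location of every metadata file, which helps when the file names on the mirror carry a checksum prefix.

```shell
#!/bin/sh
# Hypothetical mirror root for one repository; replace it with the HTTP mirror you picked.
BASE_URL="${BASE_URL:-http://mirror.example.org/fedora/releases/18/Everything/x86_64/os}"

# Print the absolute URL of every metadata file referenced by a repomd.xml file.
# $1: path to a local copy of repomd.xml, $2: repository base URL.
list_repodata_urls() {
    grep -o 'href="[^"]*"' "$1" \
        | sed -e 's/^href="//' -e 's/"$//' \
        | while read -r href; do
            printf '%s/%s\n' "$2" "$href"
        done
}

# Typical use on the connected machine (needs network, so left commented out):
#   mkdir -p yum/fedora/repodata
#   wget -O yum/fedora/repodata/repomd.xml "$BASE_URL/repodata/repomd.xml"
#   list_repodata_urls yum/fedora/repodata/repomd.xml "$BASE_URL" > repodata_urls.txt
#   wget -P yum/fedora/repodata -i repodata_urls.txt
```

The same steps have to be repeated with another base URL and target directory for each repository in use (for instance "updates").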


2. Build the cache

To make use of these files, we need to mimic a local repository, telling yum to look for the files it needs in the directory of our choice rather than on the network. Default repositories are defined in /etc/yum.repos.d by *.repo files, usually with a "mirrorlist" tag / value pair. This tag tells yum where to find a list of mirrors from which it can pick one to download package data. As we have chosen to select the mirror ourselves, we use the "baseurl" tag instead, to define directly where the files are to be downloaded from, using the file:// protocol to refer to local files.

Repositories may be edited directly, or defined elsewhere to keep an untouched internet-based version. I chose to write an alternative yum.conf file containing the repository list rather than relying on the yum.repos.d directory, to keep the whole process on my USB pen drive. A few other options may be modified for a more comfortable use of local repositories, like "max_connections=1" to avoid simultaneous file copies (parallel transfers are useful when bandwidth is the limiting factor, but counterproductive with the file:// protocol). See man yum.conf for more details.
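
Such an alternative configuration file could look like the one generated below. It is a sketch under assumptions: the write_yum_conf helper, the yum/fedora and yum/updates directory names and the mount path are placeholders of mine, and gpgcheck=1 assumes the Fedora GPG keys are already imported in the RPM database (otherwise a gpgkey= line would be needed in each repository).

```shell
#!/bin/sh
# Write a standalone yum.conf whose repositories point at local directories.
# $1: absolute path of the getPackages directory (placeholder), $2: output file.
write_yum_conf() {
    root="$1"
    cat > "$2" <<EOF
[main]
cachedir=$root/yum/cache
keepcache=0
max_connections=1
gpgcheck=1

[cache.fedora]
name=Fedora (local metadata copy)
baseurl=file://$root/yum/fedora
enabled=1

[cache.updates]
name=Fedora updates (local metadata copy)
baseurl=file://$root/yum/updates
enabled=1
EOF
}

# Example: write_yum_conf /run/media/user/USBKEY/getPackages yum/yum.conf
```

Each baseurl must point at the directory that contains the repodata directory, not at the repodata directory itself.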

To force a cache build, yum provides the yum makecache command. If like me you opted for an alternative configuration file, the --config argument accepts the path of a configuration file to use in place of the usual one, and you may consider the alias command to define a version of yum that uses your configuration file. To use modified repositories while leaving the yum.repos.d content intact, we can give the modified repositories names with a distinct prefix, like "cache.fedora" for "fedora", and use --disablerepo=* --enablerepo=cache.* to filter the repositories to use. As these arguments will be needed for each yum command, you should definitely consider making an alias.
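
A minimal sketch of such a shortcut follows. It uses a shell function rather than an alias, because a function also works inside scripts; the local_yum name, the GP_ROOT default path and the YUM override are my own placeholders.

```shell
#!/bin/sh
# GP_ROOT is a placeholder for the getPackages directory on the USB drive.
GP_ROOT="${GP_ROOT:-/run/media/user/USBKEY/getPackages}"

# Run yum against the local "cache.*" repositories only.
# YUM can be overridden (e.g. YUM=echo) to print the arguments instead of running yum.
local_yum() {
    "${YUM:-yum}" --config="$GP_ROOT/yum/yum.conf" \
        --disablerepo='*' --enablerepo='cache.*' "$@"
}

# On the Fedora machine, as root:
#   local_yum makecache
```

The quotes around the patterns keep the shell from expanding the asterisks before yum sees them.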


3. Get package list

Now that we have an up-to-date cache, we can query yum. As yum can't connect to the internet to download the packages, we will limit ourselves to listing the package URLs. First we need to list the packages we want, using the yum list command. If we are looking for a single package to install this can be skipped, but to update the whole system yum list updates is especially useful.

Then we come to the critical point: resolving the dependencies. The yumdownloader command offers this service when the --resolve and --urls flags are provided.

Here again, for both yum list and yumdownloader, the --config, --disablerepo and --enablerepo arguments are required to work from the local repositories.
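
One detail is worth a sketch. Since the repositories are declared with a file:// baseurl, I would expect the URLs printed by yumdownloader to point at the USB drive rather than at the mirror, so the local prefix has to be swapped for the mirror's. The to_mirror_urls helper, the raw_urls.txt file and the paths below are assumptions of mine, not the original 3_getURL.bash.

```shell
#!/bin/sh
# Keep only the lines starting with the local prefix, replace that prefix
# with the mirror prefix, and drop duplicates.
# $1: local baseurl as written in yum.conf, $2: matching HTTP mirror URL.
to_mirror_urls() {
    awk -v from="$1" -v to="$2" \
        'index($0, from) == 1 { print to substr($0, length(from) + 1) }' | sort -u
}

# On the Fedora machine (yumdownloader ships with the yum-utils package):
#   yumdownloader --config="$GP_ROOT/yum/yum.conf" --disablerepo='*' \
#       --enablerepo='cache.*' --resolve --urls PACKAGE_NAMES > raw_urls.txt
#   to_mirror_urls "file://$GP_ROOT/yum/fedora" "$FEDORA_MIRROR" < raw_urls.txt > "$GP_ROOT/URL.txt"
#   to_mirror_urls "file://$GP_ROOT/yum/updates" "$UPDATES_MIRROR" < raw_urls.txt >> "$GP_ROOT/URL.txt"
```

Filtering on the prefix also drops the informational lines that yum tools may print before the URLs.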


4. Get packages

Let's go back to the connection point. With a URL list, what follows is quite obvious: a little wget and all the needed packages will be on the USB drive, ready to be installed on the other machine. Wget even offers a -i option that takes a URL list like the one we produced as input!
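
On a Linux connection point, the download and a quick completeness check could be sketched as follows. The missing_packages helper is a name of my own; it only compares file names, so it will not detect a truncated download.

```shell
#!/bin/sh
# Print the file name of every URL in the list that has no matching file
# in the packages directory. $1: URL list, $2: packages directory.
missing_packages() {
    while read -r url; do
        [ -n "$url" ] || continue          # skip empty lines
        file="${url##*/}"                  # keep what follows the last slash
        [ -f "$2/$file" ] || printf '%s\n' "$file"
    done < "$1"
}

# From the getPackages directory, on the connected machine:
#   wget -nc -P packages -i URL.txt     # -nc skips files already downloaded
#   missing_packages URL.txt packages   # prints nothing when all is there
```

Running the check before unplugging the drive saves one "seat switch" when a download failed.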


5. Install packages

On the machine to update, just list the package paths and pass them to yum install, and it's done. Don't forget to specify the alternative configuration file and to disable the standard repositories.
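
As a sketch, assuming the downloaded files sit in the packages directory of the drive (GP_ROOT, install_downloaded and the YUM override are placeholder names of mine):

```shell
#!/bin/sh
# GP_ROOT is a placeholder for the getPackages directory on the USB drive.
GP_ROOT="${GP_ROOT:-/run/media/user/USBKEY/getPackages}"

# Install every downloaded .rpm using the local repositories only.
# YUM can be overridden (e.g. YUM=echo) to print the arguments instead of running yum.
install_downloaded() {
    set -- "$GP_ROOT"/packages/*.rpm
    # If the glob matched nothing it stays unexpanded and names no existing file.
    [ -e "$1" ] || { echo "no .rpm file in $GP_ROOT/packages" >&2; return 1; }
    "${YUM:-yum}" --config="$GP_ROOT/yum/yum.conf" \
        --disablerepo='*' --enablerepo='cache.*' install "$@"
}

# On the Fedora machine, as root:
#   install_downloaded
```

Giving yum all the files in a single call lets it order the installation according to the dependencies between them.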


And here you are! As in the usual yum behavior, you can skip the cache building (steps 1 and 2) if it was done recently enough, so with two "seat switches" you can have your software installed on a machine without network access, with all its dependencies satisfied.