First, a disclaimer: the core of Ubuntu package management is the product of a lot of hard work by Debian developers. I'm going to refer to all of this mostly as Ubuntu package management because I'm using Ubuntu and, frankly, the distribution is just more relevant than Debian. I used Debian for about 2-3 years. It's a great distribution, and Ubuntu wouldn't be what it is without the hard work of Debian.

I was curious about how my Ubuntu box handles dependencies. I'm not a Linux gear head. Sure, I've used it for over 10 years but for me it's just a means to an end. It's a platform for me to write software, listen to music, browse the web and generally do what I want to do.

Given my shallow (but long) Linux background, I doubt I'll present anything amazing except perhaps a n00b perspective on how Ubuntu (really Debian) handles packages and their dependencies.

How APT Handles Dependencies

The Stone Ages

The thing that makes apt so great is that it manages dependencies for you. Lots of folks jumped to Red Hat back in the day because RPM made package management so easy: no more untar, make, make install dance to get new software, just a simple rpm -iv [package]. But when you were missing something that your new software depended on, it was off to rpmfind or something of that sort. You were still in dependency hell, albeit a hell that was a little less painful to navigate.

Along Comes APT

APT leap-frogged RPM by automatically handling dependencies for you. No more hunting dependencies down; APT takes care of it. Dependency hell hasn't gone away, it's just been automated and abstracted. So how does APT handle this?

The unit of work in apt package management is the .deb file. There are lots of tools for creating and working with .deb files: dpkg, dselect and, more recently, apt-get and synaptic come to mind. .deb files are [ar files http://en.wikipedia.org/wiki/Ar_(Unix)], an archive format normally used for static libraries. Inside the .deb file are tar files that contain the executables, metadata and configuration.

A .deb file autopsy

Reading references for what makes up a .deb file is great but it's more fun to tear apart a living .deb file to see what makes it tick. I'm gonna play with a package on my box and see what's inside. I'm a Java coder so let's take a look at ant.

  mkdir -p ~/tmp
  sudo cp /var/cache/apt/archives/ant_1.6.5-3ubuntu1_all.deb ~/tmp
  cd ~/tmp
  ar -x ant_1.6.5-3ubuntu1_all.deb

After that, we're left with three files next to the original .deb: control.tar.gz, data.tar.gz and debian-binary.

Now I'm going to untar those archives, create directories with the same name (as the archive) and put the contents in there. I'm just doing this to get a better understanding of what is where inside the entire structure of the .deb file.
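
That step can be sketched as a small shell function. This is only a sketch under my own assumptions: unpack_in_place is a name I made up (it is not a dpkg tool), and it assumes the tarballs are gzip-compressed, as they are in this package.

```shell
# Replace each tarball named as an argument with a directory of the
# same name that holds the tarball's contents.
# unpack_in_place is a hypothetical helper, not part of dpkg or apt.
unpack_in_place() (
    for archive in "$@"; do
        # Move the tarball aside so its name is free for the directory.
        mv "$archive" "$archive.orig" || exit 1
        mkdir "$archive" || exit 1
        tar -xzf "$archive.orig" -C "$archive" || exit 1
        rm "$archive.orig"
    done
)
```

From ~/tmp I'd call it as: unpack_in_place control.tar.gz data.tar.gz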

Here is what we're left with when I run tree on that directory.

  |-- control.tar.gz
  |   |-- control
  |   `-- md5sums
  |-- data.tar.gz
  |   `-- usr
  |       |-- bin
  |       |   `-- ant -> ../share/ant/bin/ant
  |       `-- share
  |           |-- ant
  |           |   |-- bin
  |           |   |   |-- ant
  |           |   |   |-- antRun
  |           |   |   |-- antRun.pl
  |           |   |   |-- complete-ant-cmd.pl
  |           |   |   `-- runant.pl
  |           |   `-- lib
  |           |       |-- ant-bootstrap.jar
  |           |       |-- ant-launcher.jar
  |           |       `-- ant.jar
  |           |-- doc
  |           |   `-- ant
  |           |       |-- README
  |           |       |-- README.Debian
  |           |       |-- TODO
  |           |       |-- changelog.Debian.gz
  |           |       `-- copyright
  |           |-- java
  |           |   |-- ant-1.6.jar -> ../ant/lib/ant.jar
  |           |   |-- ant-bootstrap.jar -> ../ant/lib/ant-bootstrap.jar
  |           |   |-- ant-launcher.jar -> ../ant/lib/ant-launcher.jar
  |           |   `-- ant.jar -> ../ant/lib/ant.jar
  |           `-- man
  |               |-- man1
  |               |   `-- ant.1.gz
  |               `-- man5
  |                   `-- build.xml.5.gz
  `-- debian-binary

Now, let's work up through the contents of the .deb file.

debian-binary

The debian-binary file just contains the text 2.0, which is the version number of the deb format. Move along, nothing to see here.

data.tar.gz

This contains the real payload of the package: the executables, libraries and documentation. Everything is laid out in the directory structure it will end up in, so the files and directories can just be copied to root and everything lands in the right place.

control.tar.gz

This is meta-data about the package.

md5sums

md5sums is, obviously, a collection of md5sums for files within the archive. It doesn't cover all the files, and I'm not sure what the criteria are for inclusion. Maybe any file that, when executed, could cause badness. It seems to be primarily executables and libraries.

Here are a few lines from the md5sums file in the postgres package:

  7c15ff5c1ee6e5d938f06e600b9a863a  usr/lib/postgresql/8.1/bin/pg_ctl
  bbfb08498980a65f0d72a8fdbe63cb6d  usr/lib/postgresql/8.1/bin/pg_resetxlog
  7cbc18eb384a6bd34f5ebea22d60e822  usr/lib/postgresql/8.1/bin/postgres
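
Because each line is a checksum followed by a path relative to the package root, the file can be handed straight to md5sum -c. Here's a minimal sketch, assuming the payload has been unpacked into a directory; verify_md5sums is my own made-up name, not something dpkg ships.

```shell
# Check every file listed in an md5sums file against an unpacked payload.
# $1: path to the md5sums file
# $2: directory the data tarball was unpacked into
# verify_md5sums is a hypothetical helper, not part of dpkg.
verify_md5sums() (
    # Resolve the md5sums path first, since we change directory below.
    sums=$(cd "$(dirname "$1")" && pwd)/$(basename "$1")
    # Paths in the file are relative to the payload root.
    cd "$2" && md5sum -c "$sums"
)
```

With the layout from the ant autopsy above, that would be: verify_md5sums control.tar.gz/md5sums data.tar.gz. It exits non-zero if any listed file is missing or doesn't match.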

Scripts

One thing we're not looking at in this example is the possibility of scripts being sent along inside the package. A more complex package, perhaps one that requires configuration or needs to run as a service, carries a series of scripts that are run at various points of the install. The scripts live at the root of the control.tar.gz file. The usual ones are preinst (run before the package is unpacked), postinst (run after it is unpacked, typically to configure it), prerm (run before the package is removed) and postrm (run after it is removed, to clean up).

For example, here's the postinst file for the postgres package on my system. You can see, among other things, it plays with configuring postgres to run as a service.

  #!/bin/sh -e
  
  VERSION=8.1
  
  
  if [ "$1" = configure ]; then
      . /usr/share/postgresql-common/maintscripts-functions
  
      configure_version $VERSION "$2"
  fi
  
  # Automatically added by dh_installinit
  if [ -x "/etc/init.d/postgresql-8.1" ]; then
          update-rc.d postgresql-8.1 defaults 19 >/dev/null
          if [ -x "`which invoke-rc.d 2>/dev/null`" ]; then
                  invoke-rc.d postgresql-8.1 start || exit $?
          else
                  /etc/init.d/postgresql-8.1 start || exit $?
          fi
  fi
  # End automatically added section

control (the star of the show)

Okay, this is why I went in reverse order through the tree. The control file is where all the magic happens. This humble little text file is what makes dependency management work in Debian and its derivative distributions.

Here are the contents of the control file from the ant .deb archive:

  Package: ant
  Version: 1.6.5-3ubuntu1
  Section: devel
  Priority: optional
  Architecture: all
  Depends: java-gcj-compat | java-virtual-machine, java-gcj-compat | java1-runtime | java2-runtime, libxerces2-java
  Recommends: ant-optional, ecj-bootstrap | ecj | java-compiler
  Suggests: ant-doc
  Conflicts: libant1.6-java, ant-doc (<= 1.6.5-1)
  Replaces: libant1.6-java, ant-doc (<= 1.6.5-1)
  Installed-Size: 1200
  Maintainer: Debian Java Maintainers <pkg-java-maintainers@lists.alioth.debian.org>
  Description: Java based build tool like make
   A system independent (i.e. not shell based) build tool that uses XML
   files as "Makefiles". This package contains the scripts and the core
   tasks libraries.
   .
   For more information see http://ant.apache.org/index.html.
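
One thing worth noticing in the Depends line: commas separate requirements that must all be met, while | separates alternatives, any one of which will do. Here's a minimal sketch that splits the field along those lines, assuming the control file arrives on standard input and the field sits on a single line (list_depends is a name I made up, not a dpkg tool).

```shell
# Print one requirement per line from the Depends field of a control
# file read on standard input. Alternatives inside a requirement stay
# on the same line, joined by " | ".
# list_depends is a hypothetical helper, not part of dpkg.
list_depends() {
    awk '/^Depends:/ {
        sub(/^Depends: */, "")           # drop the field name
        n = split($0, reqs, /, */)       # commas separate requirements
        for (i = 1; i <= n; i++) print reqs[i]
    }'
}
```

Run as list_depends < control on the ant control file, it prints three lines: the two Java runtime requirements, each with its alternatives, and then libxerces2-java on its own.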

How Packages Relate

As someone not very familiar with the internals of package management, I think it's interesting to note that there are different notions of relationships for Ubuntu packages. Here are all of the possible relationships that we can have in a control file.

In this table I'll refer to the package we're installing as this and the package described in the relationship as that.

relation     explanation
depends      won't install this until that is installed
recommends   this will work without that, but having that around would be a good thing
suggests     this will work without that, but that works nicely with this
pre-depends  a stronger depends: won't install this until that is installed and configured
conflicts    won't install this if that is on the system
provides     used for virtual packages, that's another discussion (that I know nothing of)
replaces     files in this may overwrite files from that; paired with conflicts, that gets removed to make way for this

On the Conflicts and Replaces lines we see the <= operator used. Operators like this can attach a version constraint to any package named in a relationship.

operator  explanation
<<        strictly earlier
<=        earlier or equal
=         exactly equal
>=        later or equal
>>        strictly later

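So this line in the control file, together with the matching Conflicts line, says that ant takes over from libant1.6-java (any version) and from ant-doc at version 1.6.5-1 or earlier. Those packages get removed to make way for it. Note that the version constraint applies only to ant-doc.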

  Replaces: libant1.6-java, ant-doc (<= 1.6.5-1)
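
If you want to check how two version strings actually compare, dpkg will tell you: its --compare-versions option accepts the same operators as the table above and reports the answer through its exit status. A minimal sketch follows. The version_satisfies wrapper is my own naming, and it assumes dpkg is on the PATH.

```shell
# Succeed when version $1 stands in relation $2 to version $3.
# $2 is one of the operators from the table: << <= = >= >>
# version_satisfies is a hypothetical wrapper around dpkg.
version_satisfies() {
    dpkg --compare-versions "$1" "$2" "$3"
}
```

For example, version_satisfies 1.6.5-1 '<=' 1.6.5-1 succeeds, while version_satisfies 1.6.5-3ubuntu1 '<=' 1.6.5-1 fails, so the ant-doc clause above would not match an ant-doc at 1.6.5-3ubuntu1. Remember to quote the operator so the shell doesn't treat it as a redirection.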

How to Visualize a Package Dependency

I wrote this script to get a directed graph (as a GIF file) of every single package installed on my Ubuntu machine and all the package dependencies. Feel free to hack at the script to get it to do one package at a time.

On my box it produced about 1700 GIF files and took maybe 15 minutes to finish. As an aside, it always amazes me how, on Unix, you can slam together simple utilities to do a custom task that their authors never intended. Unix rocks.

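  #!/bin/bash
  
  # This creates a GIF file for every installed package
  # that dpkg is aware of.
  
  # You may need to install these packages
  # sudo apt-get install apt-rdepends
  # sudo apt-get install graphviz
  
  # Only lines starting with "ii" are installed packages; this also skips
  # the header lines of dpkg -l. Any ":arch" suffix on the name is dropped.
  for ii in $(dpkg -l | awk '/^ii/ {sub(/:.*/, "", $2); print $2}' | sort -u); do
          # apt-rdepends -d writes the dependency graph in dot format
          apt-rdepends -d "$ii" > "$ii.dotty"
          dot -Gratio=auto -Tgif -o "$ii-dependency.gif" "$ii.dotty"
          rm "$ii.dotty"
  done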

Here are some examples of the dependency graphs we get after running the above script.

Besides being neato pictures, these graphs illustrate how, with some simple primitives, incredibly complex package dependencies can be resolved and managed.
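
To get a feel for how simple those primitives are, here's a minimal sketch that turns the Package and Depends fields of a single control file into the kind of text dot consumes. It isn't what apt-rdepends does internally (that tool walks the dependencies recursively). control_to_dot is a name I made up, and the sketch assumes the control file arrives on standard input with the Depends field on one line.

```shell
# Turn the Package and Depends fields of a control file (on stdin) into
# a Graphviz digraph with one edge per direct dependency. Version
# constraints are dropped and every alternative gets its own edge.
# control_to_dot is a hypothetical helper, not part of apt or dpkg.
control_to_dot() {
    awk '
        /^Package:/ { pkg = $2 }
        /^Depends:/ {
            sub(/^Depends: */, "")
            gsub(/ *\([^)]*\)/, "")           # strip "(>= 1.0)" style constraints
            sub(/ *$/, "")                    # strip trailing blanks
            n = split($0, deps, / *[,|] */)   # split on commas and pipes
        }
        END {
            print "digraph packages {"
            for (i = 1; i <= n; i++)
                if (!seen[deps[i]]++)         # skip repeated package names
                    printf "  \"%s\" -> \"%s\";\n", pkg, deps[i]
            print "}"
        }'
}
```

Fed the ant control file, this gives five edges out of "ant" (java-gcj-compat appears twice in Depends but is drawn once). Piping the result into dot -Tgif -o ant.gif should draw it.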

