First, a disclaimer: the core of Ubuntu package management is the product of a lot of hard work by Debian developers. I'm going to refer to all this mostly as Ubuntu package management because I'm using Ubuntu and frankly the distribution is just more relevant than Debian. I used Debian for about 2-3 years and it's a great distribution. Ubuntu wouldn't be what it is without the hard work of Debian.
I was curious about how my Ubuntu box handles dependencies. I'm not a Linux gear head. Sure, I've used it for over 10 years but for me it's just a means to an end. It's a platform for me to write software, listen to music, browse the web and generally do what I want to do.
Given my shallow (but long) Linux background I doubt I'll present anything amazing, except perhaps a n00b perspective on how Ubuntu (really Debian) handles packages and their dependencies.
How APT Handles Dependencies
The Stone Ages
The thing that makes apt so great is that it manages dependencies for you. Lots of folks jumped to Red Hat back in the day because RPM made package management so easy. No more untar, make, make install dance to get new software, just a simple rpm -iv [package]. But when you were missing something that your new software depended on, it was off to rpmfind or something of that sort. You were still in dependency hell, albeit a hell that was a little less painful to navigate.
Along Comes APT
APT leap-frogged RPM by automatically handling dependencies for you. No more hunting dependencies down, APT takes care of it. Dependency hell hasn't gone away, it's just been automated and abstracted. So how does APT handle this?
The unit of work in apt package management is the .deb file. There are lots of tools for creating and working with .deb files: dpkg, dselect and, more recently, apt-get and synaptic come to mind. .deb files are ar files (http://en.wikipedia.org/wiki/Ar_(Unix)), an archive format normally used for libraries. Inside the .deb file are tar files that contain the executables, metadata and configuration.
A .deb file autopsy
Reading references for what makes up a .deb file is great but it's more fun to tear apart a living .deb file to see what makes it tick. I'm gonna play with a package on my box and see what's inside. I'm a Java coder so let's take a look at ant.
mkdir ~/tmp
sudo cp /var/cache/apt/archives/ant_1.6.5-3ubuntu1_all.deb ~/tmp
cd ~/tmp
ar -x ant_1.6.5-3ubuntu1_all.deb
After that, this is what we're left with:
- control.tar.gz
- data.tar.gz
- debian-binary
Now I'm going to untar those archives, create directories with the same name (as the archive) and put the contents in there. I'm just doing this to get a better understanding of what is where inside the entire structure of the .deb file.
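Here's roughly how that untar step can be done. This is a minimal sketch, and unpack_in_place is a name I made up, not a Debian tool. Since a directory can't share a name with the tarball sitting next to it, the tarball is moved aside first.

```shell
# Unpack a .tar.gz into a directory that takes over the tarball's name.
# unpack_in_place is a hypothetical helper, not part of dpkg or apt.
unpack_in_place() {
    local archive=$1
    local moved="$archive.orig"
    mv "$archive" "$moved" || return 1         # free up the name for the directory
    mkdir "$archive" || return 1
    tar -xzf "$moved" -C "$archive" || return 1
    rm "$moved"                                # the tarball is no longer needed
}
```

Run from ~/tmp, that would be unpack_in_place control.tar.gz followed by unpack_in_place data.tar.gz.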
Here is what we're left with when I run tree on that directory.
|-- control.tar.gz
|   |-- control
|   `-- md5sums
|-- data.tar.gz
|   `-- usr
|       |-- bin
|       |   `-- ant -> ../share/ant/bin/ant
|       `-- share
|           |-- ant
|           |   |-- bin
|           |   |   |-- ant
|           |   |   |-- antRun
|           |   |   |-- antRun.pl
|           |   |   |-- complete-ant-cmd.pl
|           |   |   `-- runant.pl
|           |   `-- lib
|           |       |-- ant-bootstrap.jar
|           |       |-- ant-launcher.jar
|           |       `-- ant.jar
|           |-- doc
|           |   `-- ant
|           |       |-- README
|           |       |-- README.Debian
|           |       |-- TODO
|           |       |-- changelog.Debian.gz
|           |       `-- copyright
|           |-- java
|           |   |-- ant-1.6.jar -> ../ant/lib/ant.jar
|           |   |-- ant-bootstrap.jar -> ../ant/lib/ant-bootstrap.jar
|           |   |-- ant-launcher.jar -> ../ant/lib/ant-launcher.jar
|           |   `-- ant.jar -> ../ant/lib/ant.jar
|           `-- man
|               |-- man1
|               |   `-- ant.1.gz
|               `-- man5
|                   `-- build.xml.5.gz
`-- debian-binary
Now, let's work up through the contents of the .deb file.
debian-binary
The debian-binary file just contains the text 2.0. It refers to the deb format version number. Move along, nothing to see here.
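If you'd rather peek without extracting anything, ar can list the members of an archive (t) and print a single member to the terminal (p). A small sketch, with helper names that are my own invention:

```shell
# List the members of a .deb without extracting them.
# deb_members and deb_format_version are hypothetical helper names.
deb_members() {
    ar t "$1"
}

# Print the contents of the debian-binary member, i.e. the format version.
deb_format_version() {
    ar p "$1" debian-binary
}
```

For the ant package, deb_format_version ant_1.6.5-3ubuntu1_all.deb should print the same 2.0 described above.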
data.tar.gz
This contains the real payload of the package: the executables, libraries and documentation. Everything is laid out in the directory structure that will be used, so the files and directories can just be copied to root and everything goes in the right place.
control.tar.gz
This is meta-data about the package.
md5sums
md5sums is, as the name suggests, a collection of MD5 checksums for files within the archive. It doesn't cover every file, and I'm not sure what the criteria for inclusion are. My best guess is that symlinks are left out and that the usual packaging helper (dh_md5sums) skips configuration files by default. What's listed seems to be primarily executables and libraries.
Here's a few lines from the postgres package:
7c15ff5c1ee6e5d938f06e600b9a863a usr/lib/postgresql/8.1/bin/pg_ctl
bbfb08498980a65f0d72a8fdbe63cb6d usr/lib/postgresql/8.1/bin/pg_resetxlog
7cbc18eb384a6bd34f5ebea22d60e822 usr/lib/postgresql/8.1/bin/postgres
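Since the paths in md5sums are relative to the package root, the file can be fed straight to md5sum -c from inside the directory where data.tar.gz was unpacked. A minimal sketch (check_md5sums is a made-up name, and it assumes GNU md5sum):

```shell
# Verify unpacked package files against the md5sums file.
# check_md5sums is a hypothetical helper name.
#   $1: the md5sums file from control.tar.gz
#   $2: the directory where data.tar.gz was unpacked
check_md5sums() {
    local sums=$1 root=$2
    # The redirect is opened before the cd, so $sums may be a relative path.
    # --quiet only reports files that fail the check.
    ( cd "$root" && md5sum -c --quiet ) < "$sums"
}
```

The function succeeds only when every listed file matches its checksum.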
Scripts
One thing we're not looking at in this example is the possibility of scripts being shipped inside the package. A more complex package, perhaps one that requires configuration or needs to run as a service, carries a series of scripts that are run at various points of the install. The scripts live at the root of the control.tar.gz file. Here are the names of the scripts, when they're run and their general purpose.
- preinst Do this before the install happens. For example, if we're going to upgrade Postgres it'd be good to check if it's currently running and then shut it down.
- postinst Do this after the install happens. Ask the user for config information, like what port do you want your server running on. Usually this would start up a service if need be.
- prerm Do this before the package is removed. Stop daemons.
- postrm Do this after the package is removed. Do cleanup and modify links.
For example, here's the postinst file for the postgres package on my system. You can see that, among other things, it configures postgres to run as a service.
#!/bin/sh -e
VERSION=8.1
if [ "$1" = configure ]; then
    . /usr/share/postgresql-common/maintscripts-functions
    configure_version $VERSION "$2"
fi
# Automatically added by dh_installinit
if [ -x "/etc/init.d/postgresql-8.1" ]; then
    update-rc.d postgresql-8.1 defaults 19 >/dev/null
    if [ -x "`which invoke-rc.d 2>/dev/null`" ]; then
        invoke-rc.d postgresql-8.1 start || exit $?
    else
        /etc/init.d/postgresql-8.1 start || exit $?
    fi
fi
# End automatically added section
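To find out which of the four scripts a given package actually ships, it's enough to look at the root of the unpacked control.tar.gz. A quick sketch, with a helper name I made up:

```shell
# Print the maintainer scripts present in an unpacked control archive.
# list_maintainer_scripts is a hypothetical helper name.
#   $1: the directory where control.tar.gz was unpacked
list_maintainer_scripts() {
    local dir=$1 name
    for name in preinst postinst prerm postrm; do
        if [ -f "$dir/$name" ]; then
            echo "$name"
        fi
    done
}
```

For the ant package above this should print nothing, since its control.tar.gz only holds control and md5sums.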
control (the star of the show)
Okay, this is why I went in reverse order through the tree. The control file is where all the magic happens. This humble little text file is what makes dependency management work in Debian and its derivative distributions.
Here are the contents of the control file in the ant .deb archive:
Package: ant
Version: 1.6.5-3ubuntu1
Section: devel
Priority: optional
Architecture: all
Depends: java-gcj-compat | java-virtual-machine, java-gcj-compat | java1-runtime | java2-runtime, libxerces2-java
Recommends: ant-optional, ecj-bootstrap | ecj | java-compiler
Suggests: ant-doc
Conflicts: libant1.6-java, ant-doc (<= 1.6.5-1)
Replaces: libant1.6-java, ant-doc (<= 1.6.5-1)
Installed-Size: 1200
Maintainer: Debian Java Maintainers <pkg-java-maintainers@lists.alioth.debian.org>
Description: Java based build tool like make
 A system independent (i.e. not shell based) build tool that uses XML files as
 "Makefiles". This package contains the scripts and the core tasks libraries.
 .
 For more information see http://ant.apache.org/index.html.
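A relationship line such as Depends is a comma-separated list where every entry has to be satisfied, and each entry can offer alternatives separated by |. Here's a rough sketch for pulling a field out of a control file and splitting it into entries. Both function names are made up, and the field lookup ignores continuation lines (the ones starting with a space), so it isn't suitable for the long Description.

```shell
# Print the value of a single-line field from a control file.
# control_field and list_entries are hypothetical helper names.
control_field() {
    local file=$1 field=$2
    awk -v f="$field" -F': ' '$1 == f { sub(/^[^:]*: /, ""); print; exit }' "$file"
}

# Print one line per comma-separated entry of a relationship field.
# Alternatives inside an entry stay together, still separated by |.
list_entries() {
    control_field "$1" "$2" | tr ',' '\n' | sed 's/^ *//; s/ *$//'
}
```

Running list_entries control Depends against the file above gives three lines, one for each thing ant needs before it can be installed.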
How Packages Relate
As someone not very familiar with the internals of package management, I think it's interesting to note that there are different notions of relationships for Ubuntu packages. Here are the main relationships that we can have in a control file.
In this table I'll refer to the package we're installing as this and the package described in the relationship as that.
| relation | explanation |
|---|---|
| depends | won't install this until that is installed |
| recommends | this will work without that but having that around would be a good thing |
| suggests | this will work without that but that works nicely with this |
| pre-depends | a stronger depends, won't install this until that is installed and configured |
| conflicts | won't install this if that is on the system |
| provides | used for virtual packages, that's another discussion (that I know nothing of) |
| replaces | files in this may overwrite files from that; combined with conflicts, that gets removed in favor of this |
On the conflicts and replaces lines we see the <= expression used. These expressions can be used to describe versions on any of the packages.
| operator | explanation |
|---|---|
| << | strictly earlier |
| <= | earlier or equal |
| = | exactly equal |
| >= | later or equal |
| >> | strictly later |
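So this line in the control file says that ant is allowed to overwrite files owned by libant1.6-java (any version) and by ant-doc at version 1.6.5-1 or earlier. Because the same packages also appear on the Conflicts line, they get removed in favor of ant rather than living alongside it.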
Replaces: libant1.6-java, ant-doc (<= 1.6.5-1)
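To try these comparisons by hand, dpkg has a --compare-versions option that takes the same operators as the control file. Here's a small sketch that checks a version against a restriction written the way it appears in a control file. The satisfies name is my own, and the parsing assumes exactly one space between operator and version.

```shell
# Succeed when an installed version satisfies a restriction like "(<= 1.6.5-1)".
# satisfies is a hypothetical helper name; the comparison itself is done by dpkg.
satisfies() {
    local installed=$1 restriction=$2 op want
    restriction=${restriction#\(}      # drop the opening parenthesis
    restriction=${restriction%\)}      # drop the closing parenthesis
    op=${restriction%% *}              # everything before the first space
    want=${restriction#* }             # everything after the first space
    dpkg --compare-versions "$installed" "$op" "$want"
}
```

So satisfies 1.6.4-2 "(<= 1.6.5-1)" succeeds, while satisfies 1.6.5-3ubuntu1 "(<= 1.6.5-1)" fails because that version is later.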
How to Visualize a Package Dependency
I wrote this script to get a directed graph (as a GIF file) of every single package installed on my Ubuntu machine and all the package dependencies. Feel free to hack at the script to get it to do one package at a time.
On my box it produced about 1700 GIF files and took maybe 15 minutes to finish. As an aside, it always amazes me how on Unix you can slam together simple utilities to do a custom task that their authors never intended. Unix rocks.
#!/bin/bash
# This creates a GIF file for every installed package
# that dpkg is aware of.
# You may need to install these packages
# sudo apt-get install apt-rdepends
# sudo apt-get install graphviz
# Rows starting with "ii" are installed packages; matching on that
# also skips the header lines that dpkg -l prints.
for ii in $( dpkg -l | awk '/^ii/ {print $2}' ); do
    apt-rdepends -d "$ii" > "$ii.dotty"
    dot -Gratio=auto -Tgif -o "$ii-dependency.gif" "$ii.dotty"
    rm "$ii.dotty"
done
Here are some examples of the dependency graphs we get after running the above script.
Besides being neato pictures, these graphs illustrate how, with some simple primitives, incredibly complex package dependencies can be resolved and managed.
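If 1700 pictures is more than you want, the script cuts down easily to one package at a time. Here's a minimal sketch of that, using the same apt-rdepends and dot calls as the full script. The graph_package name and the optional output directory are my additions.

```shell
# Draw the dependency graph for a single package.
# graph_package is a hypothetical helper name.
#   $1: package name
#   $2: output directory (defaults to the current directory)
graph_package() {
    local pkg=$1 outdir=${2:-.} dotfile status
    dotfile=$(mktemp) || return 1
    # Only run dot when apt-rdepends produced a graph description.
    apt-rdepends -d "$pkg" > "$dotfile" &&
        dot -Gratio=auto -Tgif -o "$outdir/$pkg-dependency.gif" "$dotfile"
    status=$?
    rm -f "$dotfile"
    return $status
}
```

Calling graph_package ant leaves ant-dependency.gif in the current directory.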