Scratchpad


Monday, December 15 2008

Automatically creating a disk image with partitions and bootloader.

I'm often playing with tools to manipulate full system images for virtual machines and so I often need to create disk images.

The script that you can find here lets you create a disk image of an arbitrary size with partitions and a working grub bootloader. This kind of script can be a bit dangerous, so I'm putting it there just as an example; be careful. It uses qemu-img to create the disk image (which can easily be replaced by dd), sfdisk to create the partitions, and grub to install a boot loader.

Behind that, there are a couple of rarely documented issues.

Disk size

The PC partition table still uses the wonderful Cylinder/Head/Sector (a.k.a. CHS) scheme to address the disk. Of course, those values no longer correspond to any physical reality, but they are still here to annoy you. The idea with the PC partition table is that partitions can start and stop only at CHS boundaries. Typically, you count 255 heads and 63 sectors per track, and the number of cylinders depends on the disk size. Usually one sector is 512 bytes, so the math is easily done.
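
To make that math concrete, here is a minimal Python sketch. It assumes the conventional geometry quoted above (255 heads, 63 sectors per track, 512-byte sectors); the function name chs_summary is made up for the illustration.

```python
# Conventional "fake" geometry used by PC partition tables.
HEADS = 255
SECTORS_PER_TRACK = 63
SECTOR_SIZE = 512  # bytes

# One cylinder holds heads * sectors-per-track sectors.
CYLINDER_BYTES = HEADS * SECTORS_PER_TRACK * SECTOR_SIZE


def chs_summary(disk_bytes):
    """Return (whole_cylinders, leftover_bytes) for a disk of disk_bytes."""
    return divmod(disk_bytes, CYLINDER_BYTES)


if __name__ == "__main__":
    cylinders, leftover = chs_summary(1024 ** 3)  # a 1 GiB image
    print("bytes per cylinder:", CYLINDER_BYTES)
    print("whole cylinders:", cylinders, "leftover bytes:", leftover)
```

The leftover is the part of the image that no partition can use once everything is aligned on cylinders.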

It is not a problem to have a disk size not strictly aligned with a CHS boundary. The worst that can happen is that you lose the space after the last full cylinder (less than 8 MB with the geometry above), which is not an issue. However, when it comes to creating a disk image from a host, you can run into the following problem:

  • You create the image:
qemu-img create -f raw "${IMGFILE}" ${IMGSIZE}
  • Then you create the partition table. Here I'm using the -D option of sfdisk. It moves the first partition a bit forward (one head further). In practice, I do that to match the usual partition scheme of tools such as fdisk. The following command creates a single Linux partition that takes the whole disk and marks it as bootable.
sfdisk -D $IMGFILE <<EOF
,,L,*
;
;
;
EOF
  • And then, you want to format the partition. The partition does not start at the beginning of the file, so you first need to do some trickery using Linux loopback devices. The offset 32256 comes from 63 * 512, with 63 being the number of sectors per track and 512 the size of one sector.
losetup -o 32256 /dev/loop0 "${IMGFILE}"
  • Now, you have a block device, /dev/loop0, which corresponds to the partition, so you can format it. Be careful, there's no confirmation here:
mkfs.ext3 /dev/loop0

And now, if you look carefully, you have a problem. By default, mkfs determines the filesystem size from the size of the block device. But if your disk size is not aligned on a CHS boundary, there is more space in the disk image, and hence on /dev/loop0, than there really is in the partition. Put another way, the end of the filesystem will lie after the end of the partition. And of course, fsck won't be too thrilled about that.

There's a couple of solutions:

  • Explicitly set the number of blocks on the mkfs command line. E.g., mkfs.ext3 -b 4096 /dev/loop0 <number of block>. You can compute the number of blocks from the disk geometry.
  • Set the size of the disk to fit the geometry perfectly, so you don't have to teach mkfs about the size. That's the approach I take in the script, because the sfdisk output is a pain to parse to obtain the geometry.
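
Both solutions boil down to a bit of arithmetic, sketched below in Python. This is not taken from the script: the function names are made up, rounding the requested size up (rather than down) is an arbitrary choice, and the block count assumes the layout described above, i.e. a single partition starting 63 sectors into the image and ending on the last full cylinder.

```python
HEADS, SECTORS, SECTOR_SIZE = 255, 63, 512
CYLINDER_BYTES = HEADS * SECTORS * SECTOR_SIZE
FIRST_PARTITION_OFFSET = SECTORS * SECTOR_SIZE  # 32256, the losetup offset


def aligned_image_size(requested_bytes):
    """Round requested_bytes up to a whole number of cylinders."""
    cylinders = -(-requested_bytes // CYLINDER_BYTES)  # ceiling division
    return cylinders * CYLINDER_BYTES


def filesystem_blocks(image_bytes, block_size=4096):
    """Whole filesystem blocks that fit in the single partition."""
    whole_cylinders = image_bytes // CYLINDER_BYTES
    partition_bytes = whole_cylinders * CYLINDER_BYTES - FIRST_PARTITION_OFFSET
    return partition_bytes // block_size


if __name__ == "__main__":
    print(aligned_image_size(1024 ** 3))  # size to hand to qemu-img
    print(filesystem_blocks(1024 ** 3))   # block count to hand to mkfs
```

With the first function you create an image that needs no block count at all; with the second you keep an arbitrary image size and tell mkfs where to stop.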

Configuring grub

Grub can easily be put on a disk image. In its default setup, it needs a few files on the partition to be able to boot and show a menu. At the beginning, I was simply counting on the files provided by the Linux distribution I was putting on the disk. However, those can sometimes be incompatible with the grub version you're using on the host machine. So you need to put the grub files from the host machine there first. I'm doing something along these lines after having mounted the partition on $INSTDIR:

cp /boot/grub/{stage1,stage2,e2fs_stage1_5} $INSTDIR/local/grub
ln -s /boot/grub/menu.lst $INSTDIR/local/grub/menu.lst

I'm not putting the host file in the usual /boot/grub directory. The idea is that I'm going to put a full distribution image here, and I don't want to override the real grub files. Moreover, this way, if I reinstall grub from the virtual machine afterwards, it will probably do the right thing by using files from /boot/grub.

Now, I just need to set up grub using the following commands:

grub --batch <<EOF
device (hd0) ${IMGFILE}
root (hd0,0)
setup --prefix=/local/grub (hd0)
quit
EOF

There are two things here. I need to specify that hd0 is in fact my disk image using the device stanza, to avoid writing on the real disk. Then, I use --prefix in the setup command to make sure that grub will be using the files I copied from the host.
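
If you drive this from a program rather than a shell here-document, the same batch input can be generated and piped to grub. The Python sketch below is only an illustration: the function names are made up, and it assumes the legacy grub shell used above is installed on the host.

```python
import subprocess


def grub_batch_script(image_path, prefix="/local/grub"):
    """Build the text fed to `grub --batch` on stdin."""
    return "\n".join([
        "device (hd0) %s" % image_path,   # map hd0 to the image, not a real disk
        "root (hd0,0)",
        "setup --prefix=%s (hd0)" % prefix,
        "quit",
        "",
    ])


def install_grub(image_path, prefix="/local/grub"):
    """Run grub in batch mode on the image; raises if grub fails."""
    script = grub_batch_script(image_path, prefix)
    subprocess.run(["grub", "--batch"], input=script.encode(), check=True)


if __name__ == "__main__":
    # Only print the script here; call install_grub() to really run it.
    print(grub_batch_script("disk.img"), end="")
```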

Sunday, December 7 2008

Configuration of my network through an ALIX

My Media Center is located in the living room. Until now, to have networking on it, there was an old-fashioned network cable between the office and the living room, which prevented the door from being closed. So, I've configured one of my ALIX boxes to serve as a kind of wifi bridge for the media center.

The PC Engines / ALIX is a small box, with an x86 Geode processor in it, a wifi card and a network plug (some models have several network plugs).

There's no hard disk on it, but a flash card. I've installed Voyage Linux on it. While I usually don't like to install niche Linux distributions, having a flash card as disk means modifying a lot of things to avoid disk writes. Voyage Linux has the advantage of being based on a regular Debian, so there's still access to all the classical packages and configuration system. So far, I'm happy with it. Everything is read-only by default and you can easily remount the disk as read/write when needed.

My first aim was to have the ALIX act as a pure level-2 bridge, so the media center would have been able to talk directly with the dhcp server and so on. However, my wifi router is most probably crappy, and it was not possible; packets were discarded at its level. I suspect that it didn't like seeing several MAC addresses on a WPA authenticated connection.

To circumvent this problem, I chose (well, not much choice :) to have the ALIX act as a router. But to make access to the media center possible and transparent from the main network, the ALIX box does a 1:1 NAT between the IP of the media center on the media center network, and a "visible" IP on the main network. In practice the ALIX has:

  • One public IP, 192.168.1.3, on the main network, just to be able to access it.
  • One IP, 192.168.2.1, on the private network, to be able to act as a router.
  • And one extra IP, 192.168.1.42, which will correspond to the public IP of the media center.

So, here is the /etc/network/interfaces on the alix:

# Because we always need a loop back :)
auto lo
iface lo inet loopback
 
# The network interface on the private part, which acts 
# as a router. Nothing fancy here, static IP.
auto eth0
iface eth0 inet static
        address 192.168.2.1
        netmask 255.255.255.0
        broadcast 192.168.2.255
 
# The wifi interface, which is an atheros card, hence
# the name.
auto ath0
# It absolutely needs to be in manual mode to have wpa-* 
# stanzas working. It is possible to still have dhcp on top 
# of that with a default interface, but I don't need it
# here.
iface ath0 inet manual
        # Atheros network interfaces need to be 
        # instantiated from the generic wifi0 card. We want 
        # to be in managed mode (aka client of an access 
        # point), so wlanmode is 'sta'.
        pre-up wlanconfig ath0 create wlandev wifi0 \
                        wlanmode sta
        # We now configure a regular wpa_supplicant with 
        # the following two stanzas. That's an atheros 
        # card, so the driver is madwifi. All wpa 
        # configuration (essid, passphrase and so on) is 
        # in the wpa_supplicant.conf file.
        wpa-driver madwifi
        wpa-roam /etc/wpa_supplicant.conf
        # On a 'up' event (see 'man interfaces'), assign a 
        # static address. That's the 'public' address of the
        # ALIX, the one I use to connect by ssh on it.
        up ifconfig ath0 192.168.1.3
        # Since I'm in manual mode, I add the default 
        # gateway, which is my wifi router.
        up route add -net default gw 192.168.1.1 ath0
        # And when trying to ifdown this interface, clean
        # everything.
        down route del -net default gw 192.168.1.1 ath0
        post-down wlanconfig ath0 destroy
 
 
# And now, create a virtual interface on the wifi side, 
# which will be the visible IP of the media center.
auto ath0:42
iface ath0:42 inet static
        # Regular and boring static configuration of this 
        # IP.
        address 192.168.1.42
        netmask 255.255.255.0
        broadcast 192.168.1.255
        # And now, the interesting part. Those 4 commands 
        # tell iptables to forward everything that is 
        # coming for .1.42 to .2.42 and viceversa. So, at 
        # the IP level, access to the media center is 
        # completely transparent, as if it was on the main 
        # network. Since it's 1:1 NAT, we're using iptables 
        # in stateless mode, as we don't care about 
        # connection tracking.
        # For that to work, you need to have 
        # /proc/sys/net/ipv4/ip_forward set to 1; it's the 
        # default on voyage linux, but ymmv.
        post-up iptables -t nat -A PREROUTING \
                           -d 192.168.2.42 -j DNAT \
                           --to-destination 192.168.1.42
        post-up iptables -t nat -A PREROUTING \
                           -d 192.168.1.42 -j DNAT \
                           --to-destination 192.168.2.42
        post-up iptables -t nat -A POSTROUTING \
                           -s 192.168.1.42 -j SNAT \
                           --to-source 192.168.2.42
        post-up iptables -t nat -A POSTROUTING \
                           -s 192.168.2.42 -j SNAT \
                           --to-source 192.168.1.42

And that's it. I've just configured the media center to use 192.168.2.42 with 192.168.2.1 as gateway and everything went well.
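
If a second machine ever needs the same treatment, the four post-up lines can be generated instead of copied by hand. Here is a small Python sketch that simply reproduces the rules from the configuration above for any visible/private pair; the function name is made up and the output is written on single lines rather than with continuation backslashes.

```python
def one_to_one_nat_rules(visible_ip, private_ip):
    """Return the four post-up lines for one visible/private address pair."""
    # (chain, match option, matched address, target, rewrite option, new address)
    rules = [
        ("PREROUTING", "-d", private_ip, "DNAT", "--to-destination", visible_ip),
        ("PREROUTING", "-d", visible_ip, "DNAT", "--to-destination", private_ip),
        ("POSTROUTING", "-s", visible_ip, "SNAT", "--to-source", private_ip),
        ("POSTROUTING", "-s", private_ip, "SNAT", "--to-source", visible_ip),
    ]
    return [
        "post-up iptables -t nat -A %s %s %s -j %s %s %s" % rule
        for rule in rules
    ]


if __name__ == "__main__":
    for line in one_to_one_nat_rules("192.168.1.42", "192.168.2.42"):
        print(line)
```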

Monday, December 1 2008

Better integration of qemu and screen

I've updated the script to launch qemu in screen that I described in this post. You can get the new version here.

It now takes care of creating a new screen session if needed, and if launched from an existing screen, it will simply add new windows for the serial port and the monitor. Its usage changed too: you now need to specify the qemu binary to launch, since the wrapper now just adds the necessary options to the qemu command line. And it no longer closes the screen, so you can look at what qemu wrote on its output.

Basic usage : qemu-wrapper qemu -hda vm.img -nographic

Sunday, November 30 2008

How to boot a debian netinst over serial in a qemu without display

The debian netinst is able to work over serial, but if a graphics card is detected, the bootloader won't be shown over serial, which makes appending the options to display over serial a bit tricky. And of course, booting qemu with -nographic doesn't remove the graphics card.

The trick here is to use the monitor to send keys to the bootloader. You just have to send the keys that tell the kernel and the installer to boot and start using ttyS0, which means typing the following in the monitor:

sendkey i
sendkey n
sendkey s
sendkey t
sendkey a
sendkey l
sendkey l
sendkey spc
sendkey c
sendkey o
sendkey n
sendkey s
sendkey o
sendkey l
sendkey e
sendkey equal
sendkey t
sendkey t
sendkey y
sendkey shift-s
sendkey 0
sendkey ret

That will start the image install with argument console=ttyS0.
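
Typing those lines by hand gets old quickly, so here is a small Python sketch that generates them from a string. It only knows the key names that appear in the list above (spc, equal, ret and the shift- prefix) plus plain ASCII letters and digits; anything else raises an error rather than guessing a key name.

```python
import string

# Monitor key names for characters that are not plain letters or digits.
SPECIAL_KEYS = {" ": "spc", "=": "equal", "\n": "ret"}


def sendkey_commands(text):
    """Translate text into a list of qemu monitor 'sendkey' commands."""
    commands = []
    for char in text:
        if char in SPECIAL_KEYS:
            key = SPECIAL_KEYS[char]
        elif char in string.ascii_uppercase:
            key = "shift-" + char.lower()
        elif char in string.ascii_lowercase or char in string.digits:
            key = char
        else:
            raise ValueError("no key name known for %r" % char)
        commands.append("sendkey " + key)
    return commands


if __name__ == "__main__":
    print("\n".join(sendkey_commands("install console=ttyS0\n")))
```

The output can be pasted into the monitor as is.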

Sunday, November 9 2008

Automatic detection of mails moved from/to the spam folder

I'm currently using dspam to filter my mails. However, as I'm using IMAP, spam filtering is done server side. So, to identify false negatives (FN) and false positives (FP), I cannot use some built-in feature of my mail clients (I have several); I need to communicate with the server. Until recently, I was using the classic approach: when I got a FN or FP, I redirected the mail (with full headers) to a special address, which sent it to dspam, telling it that it was a misclassification.

The problem with this approach in practice is that to mark a FP/FN I need to retransmit the mail, and move it to the correct folder, which is redundant. Of course, most mail clients can help doing that with some configuration, but still, that's several operations where it is not really needed. Moreover, in the case of FN, it means sending through SMTP a spam, which can sometimes be a problem.

So, I've made a script which watches the content of the spam folder and detects mails which are added and removed. This way, to mark a FN as spam, I just need to move it to the Spam folder: the script will detect that a mail has been added, and will re-train dspam with the signature of the email. For FP, it's the same thing: I just need to move the mail out of the spam folder, the script will detect that and call dspam with the signature of the moved email.

The script is a single-file python script : dspam_auto.py

It works with Maildir style mailboxes, dspam and a mysql database. However, the principle is simple and can easily be adapted. The implementation is currently really dumb and could be enhanced (especially resource-wise, for the regular scan) but it's working.

The principle of the script is to scan the directory regularly to look for missing and added mails. The script must be plugged to the delivery system too (procmail in my case) to avoid trying to re-learn a spam already classified as spam.
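
The core of that principle is a set difference between two scans. The Python sketch below is not the code of dspam_auto.py (which tracks dspam signatures in a mysql table); it uses Maildir file names as keys and made-up function names, just to show the idea. Note that it cannot tell a mail moved out of the folder from a mail that was deleted.

```python
import os


def message_key(filename):
    """Maildir file names look like '<unique>:2,<flags>'; keep the unique part."""
    return filename.split(":", 1)[0]


def scan_spam_folder(maildir):
    """Return the set of message keys currently in a Maildir folder."""
    keys = set()
    for sub in ("cur", "new"):
        path = os.path.join(maildir, sub)
        if os.path.isdir(path):
            keys.update(message_key(name) for name in os.listdir(path))
    return keys


def diff_scans(previous, current, delivered_as_spam=frozenset()):
    """Compare two scans; return (false_negatives, false_positives).

    delivered_as_spam holds the keys announced by the delivery system,
    which dspam already classified as spam and must not be re-learned.
    """
    false_negatives = current - previous - set(delivered_as_spam)
    false_positives = previous - current
    return false_negatives, false_positives
```

A mail that shows up in the folder without having been announced at delivery time was moved there by hand, hence a FN; a mail that disappeared is treated as a FP.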

How to use it :

  • First, your dspam must be configured to put the signature in a header, not in the body of emails.
  • Download the script on the server, let's say in ~/bin/dspam_auto.py and don't forget to mark it as executable.
  • Edit the beginning of the script to adapt the settings to your configuration
    • DB_USER, DB_PASS, DB_NAME : Access to the mysql database. You can reuse the DB you're using for dspam.
    • DB_TABLE is the name of the table which will be used to store the script information. It must be a table that does not exist yet; the default value is usually fine.
    • DSPAM_USER is the name of the dspam user you're using; usually your login name.
    • DSPAM_UID is the uid of the user for the script. It's probably good practice to use the same as in dspam, but it's in practice independent. You can check for user/uid in the table 'dspam_virtual_uids' of your dspam database.
    • LOG_FILE : Where to log all of script runs. It's really useful for debugging or just checking that the script isn't going rogue.
    • The dspam command used to reclassify FN/FP is in the classify function; feel free to adapt it to your installation. E.g., the script is currently using the option --client, which you might not need.
  • Check that DRY_RUN is True ; that's needed to initialize correctly the script database without polluting the dspam database.
  • Initialize the database: ~/bin/dspam_auto.py init
  • Add a regular scan, in cron (using crontab -e as example, all on one line):
*/10 * * * * $HOME/bin/dspam_auto.py update $HOME/Maildir/.Spam

This line makes the scan run every 10 minutes, which is probably more than enough (especially since the current version of the script is not really nice to the database :). Note that the first scan will detect all existing spam as FN, so double-check that DRY_RUN is True before screwing up your dspam.


  • Modify the procmailrc to tell the script when each spam is detected. I have something like that in my .procmailrc :
# Spam filtering:
:0fw
| /usr/bin/dspam --stdout --deliver=spam,innocent --user pierre

# Tell the script for each detected spam
:0 ic
* ^X-DSPAM-Result: spam
| /home/pierre/dspam/dspam_auto.py push

# And deliver spam in the spam folder
:0:
* ^X-DSPAM-Result: spam
.Spam/

Note that the script is slightly racy, as calling the script and delivering the mail are not atomic. However, as long as you don't run the scan every 10 seconds it should not matter much, and it recovers from previous mistakes anyway. The way to implement that with no race condition would be to do the delivery ourselves, but I prefer not to for reliability reasons: if my script is screwed up, it won't trash mails.

  • Configuration is done. As long as you're in dry run mode, you can watch the effect of the script by moving mail in and out of the Spam folder. Typically, moving a spam out then in (don't forget to wait for the cron scan between operations) will produce this kind of log line:
INFO 2008-11-09 11:40:09,710 [dryrun] Classify command: /usr/bin/dspam --signature=4916ba3b179033708835974 --class=innocent --source=error --client --user pierre
INFO 2008-11-09 11:50:09,338 [dryrun] Classify command: /usr/bin/dspam --signature=4916ba3b179033708835974 --class=spam --source=error --client --user pierre
  • Once you think that you're all set (i.e., you've configured the above and at least one scan was fully done), you can set DRY_RUN to False and enjoy a simple way to mark FP and FN in imap :)
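
For reference, the command shown in the log lines above can be assembled as in the following Python sketch. It is an illustration, not the classify function of the script: the function names are made up, and the header name X-DSPAM-Signature is an assumption about how your dspam writes the signature.

```python
from email import message_from_string

DSPAM = "/usr/bin/dspam"


def signature_of(raw_message):
    """Return the dspam signature stored in the mail headers, or None."""
    # Assumption: the signature lives in the X-DSPAM-Signature header.
    return message_from_string(raw_message)["X-DSPAM-Signature"]


def classify_command(signature, new_class, user, client=True):
    """Build the dspam retraining command for one misclassified mail."""
    cmd = [DSPAM, "--signature=" + signature,
           "--class=" + new_class, "--source=error"]
    if client:
        cmd.append("--client")  # drop this if you don't run dspam in client mode
    cmd += ["--user", user]
    return cmd


if __name__ == "__main__":
    raw = "X-DSPAM-Result: spam\nX-DSPAM-Signature: 4916ba3b179033708835974\n\nbody\n"
    print(" ".join(classify_command(signature_of(raw), "innocent", "pierre")))
```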

Sunday, November 2 2008

Simple integration of Gallery2 in Dotclear2

This patch enables the integration of Gallery2 images in a Dotclear 2 blog.

The patch is really basic and crude; a plugin might be better but that's an adaptation from a patch for dotclear 1 :) Nevertheless it's working well. You can apply it with a simple patch -p1 < dc2-gallery2.patch from the directory of your dotclear2 installation.

It works only in wiki mode and adds a new kind of tag:

##folder/image.jpg##

That will put the corresponding image from the gallery in the blog post. You can easily get the path and image name from the Gallery URL of the image you want to insert, without the base path and the html extension.

The full tag syntax is similar to regular images:

##folder/image.jpg|position|size##

where:

  • position is L or G for a left position, R or D for a right position, or C for centered (default).
  • size determines the size of the image, in pixels.

Both are optional.
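
To make the syntax unambiguous, here is how such a tag could be parsed, sketched in Python. This is not the code of the patch: the function name, the returned tuple and the default size of 400 are only illustrative.

```python
import re

# ##path##, ##path|position## or ##path|position|size##
TAG_RE = re.compile(r"##([^#|]+)(?:\|([^#|]*))?(?:\|([^#|]*))?##")

# L/G mean left, R/D mean right, C means centered (the default).
POSITIONS = {"L": "left", "G": "left", "R": "right", "D": "right", "C": "center"}


def parse_gallery_tag(tag, default_size=400):
    """Return (path, position, size_in_pixels) for a gallery tag."""
    match = TAG_RE.fullmatch(tag)
    if match is None:
        raise ValueError("not a gallery tag: %r" % tag)
    path, position, size = match.groups()
    # An unknown position letter raises KeyError.
    position = POSITIONS[(position or "C").upper()]
    return path, position, int(size) if size else default_size


if __name__ == "__main__":
    print(parse_gallery_tag("##folder/image.jpg|L|300##"))
```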

The patch adds a few parameters available in the gallery section of the about:config of your blog instance:

  • gallery_enable : Activate gallery2 integration.
  • gallery_embed : Where to find the embed.php on the local server from the Gallery2 embedding system. E.g, /home/user/gallery-site/embed.php.
  • gallery_uri : Path relative to your domain of the gallery instance. E.g., /gallery/main.php.
  • gallery_size : Default size of images. E.g, 400.

Unfortunately, this patch doesn't yet support integrating a gallery hosted on another domain name.

Sunday, October 26 2008

US keyboard with non intrusive easy accents

I'm now always using a US keyboard. However, I still often need to write french text with those keyboards, hence needing accents. Of course, I don't like when what's written on the keyboard doesn't match what it does :-)

On classical linux, there is an international version of the US layout. However, it modifies "standard" behaviour; i.e., if you want to type a double quote, you need to press it twice because in this intl layout, it is, by default, a dead key. The same applies to a couple of other symbols, such as the single quote or tilde. So, I've made a custom version of the US layout, providing easy access to accents through alt-gr (both with dead keys and common accents in french), but without modifying the standard behaviour of keys.

It's partly based on us-intl, with deadkeys removed from default bindings, and the following bindings added (mainly):

  • altgr-` : dead grave accent
  • altgr-' : dead acute accent
  • altgr-^ : dead circumflex accent
  • altgr-" : dead diaeresis
  • altgr-a : à
  • altgr-e : é
  • altgr-c : ç
  • altgr-u : ù

You can find the xkbmap file here : us-custom

It still needs tuning, but it already works quite well for my purposes. You just need to put the file in /usr/share/X11/xkb/symbols and run setxkbmap us-custom (if you named the file us-custom) to load it.

Wednesday, October 22 2008

Using Qemu in screen

The following python script allows you to start a Qemu in a screen with separated screen windows for qemu monitor and for serial port.

#!/usr/bin/env python3
import os
import subprocess
import sys

# Qemu command. Parameters to this script will be added to the command.
CMD = ['qemu-system-i386',
       '-serial', 'pty', '-monitor', 'pty']
CMD += sys.argv[1:]

# Name of the current screen session; the script must be run inside screen.
SCREENNAME = os.environ['STY']

# Qemu is not really verbose, so we need to know in which order pty names
# will appear.
TITLES = ['Monitor', 'Serial']
PREFIX = 'char device redirected to '

# And then start qemu
p = subprocess.Popen(CMD, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, bufsize=0)

devcount = 0
# For obscure buffering reasons, iterating over stdout doesn't work.
while p.poll() is None:
    # The pipe yields bytes; decode them before matching and echoing.
    line = p.stdout.readline().decode(errors='replace')
    sys.stdout.write(line)
    if line.startswith(PREFIX):
        devname = line[len(PREFIX):].strip()
        try:
            title = TITLES[devcount]
        except IndexError:
            # More ptys than known titles: fall back to the device name.
            title = devname
        # This command adds a window to the current screen, using the given pty.
        subprocess.call(['screen', '-x', SCREENNAME, '-X', 'screen', '-t', title, devname])
        devcount += 1

To use it, put this script in something like qemu-wrapper, make it executable and do a screen qemu-wrapper <qemuargs> to start qemu inside the screen with the nice dedicated windows.

UPDATE: A new version is available here