Asterisk, and other worldly endeavours.

A blog by Leif Madsen

CentOS 5.8 On AWS EC2 With Xen Kernel (PVGRUB)


At CoreDial we’ve been using a lot of AWS EC2 lately for building sandbox infrastructure for testing. Part of the infrastructure is a voice platform utilizing Asterisk 1.4 and 1.8, and those voice platforms are using Zaptel and DAHDI respectively for use with MeetMe(). This hasn’t been an issue previously as our testing has either been on bare metal, or in other virtual machine systems where installation of a base image and standard kernel are not an issue.

However, with the introduction of a lot of EC2 instances in our testing process, we ran into issues building our own DAHDI RPMs, since there aren’t any EC2 kernel development packages outside of OpenSuSE (which we don’t use). After spending a day trying to hack around it, Kevin found a PDF from Amazon stating that AWS now supports loading your own kernels via PVGRUB. Great! If I can do that, then I can just continue using the same RPMs I’d be building anyway (albeit against the Xen-based kernel, but that’s easy to do in the spec file).

Unfortunately this was not nearly as trivial as it appeared at first. The first problem was that I had to figure out the correct magic kernel AKI that needed to be loaded, and the PDF wasn’t incredibly clear about which one to use. (There are two different styles of the AKI, one called “hd0” and another called “hd00”, which I’ll get into shortly.) After searching Google and looking through several forum posts and other blogs (linked at the end), I finally found a combination that seems to work for our imported CentOS 5.8 base image. Below is a list of the steps I executed after loading up an image from our base AMI:

  • yum install grub kernel-xen kernel-xen-devel
  • grub-install /dev/sda
  • cd /boot/
  • mkinitrd -f -v --allow-missing --builtin uhci-hcd --builtin ohci-hcd --builtin ehci-hcd --preload xennet --preload xenblk --preload dm-mod --preload linear --force-lvm-probe /boot/initrd-2.6.18-308.13.1.el5xen.img 2.6.18-308.13.1.el5xen
  • touch /boot/grub/menu.lst
  • cat /boot/grub/menu.lst
default 0
timeout 1

title EC2
     root (hd0)
     kernel /boot/vmlinuz-2.6.18-308.11.1.el5xen root=/dev/sda1
     initrd /boot/initrd-2.6.18-308.11.1.el5xen.img
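
The menu.lst above can also be generated rather than typed by hand, which keeps the kernel and initrd entries on the same version string as the one passed to mkinitrd. The following is only a minimal sketch: the function name render_menu_lst and its arguments are my own illustrative names, not part of the original steps, and the defaults assume the single-partition layout used here.

```shell
#!/bin/bash
# Sketch: print a PVGRUB menu.lst for a given Xen kernel version.
# render_menu_lst and its parameters are hypothetical names for illustration.

# render_menu_lst KERNEL_VERSION [GRUB_ROOT] [ROOT_DEVICE]
# Writes the file contents to stdout; nothing is written to disk.
render_menu_lst() {
    local kver="$1"
    local grub_root="${2:-(hd0)}"      # assumed default: single partition
    local root_dev="${3:-/dev/sda1}"   # assumed default root device

    cat <<EOF
default 0
timeout 1

title EC2
     root ${grub_root}
     kernel /boot/vmlinuz-${kver} root=${root_dev}
     initrd /boot/initrd-${kver}.img
EOF
}

# Preview the file for the version string used in the mkinitrd step.
render_menu_lst "2.6.18-308.13.1.el5xen"
```

On the instance, the output would be redirected as root into /boot/grub/menu.lst once the preview looks right. Whatever version is passed in has to match the vmlinuz and initrd files that actually exist in /boot/.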

Once the changes were made to the image, I took a snapshot of the running instance’s volume. I then created an image from the snapshot. When creating the image, I selected a new kernel ID. The kernel IDs for the various zones and architectures are listed in the PDF. As our base image was CentOS 5.8 i386 in the us-east-1 zone, I had to select between either aki-4c7d9525 or aki-407d9529. The relevant paragraph in the PDF seems to indicate there is a difference based on what type of machine you’re using, and references S3- or EBS-based images. We are using EBS-based images, so I tried the first one, which in the end failed miserably. After reading through the IonCannon blog post it became clear that the difference between the hd0 and hd00 AKIs is really whether you have a single partition, or multiple partitions with a separate /boot/ partition.

With that bit of knowledge, and knowing that we only had a single partition that contained our /boot/ directory, I knew to use aki-407d9529 (hd0). Another forum post also pointed out that I needed to enable some modules for the Xen kernel or the system wouldn’t boot (and I verified that by stepping through each of the steps listed above to make sure it was required). With those two major items checked off, I am now able to build an AMI that will load with a stock CentOS Xen kernel image, making it trivial to build RPMs against it.
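
For anyone scripting the snapshot and image-creation steps, here is a rough sketch of what they might look like with the AWS CLI. This is an assumption on my part: the post does not depend on any particular tool for this step. The volume and snapshot IDs, the image name centos58-pvgrub, and the /dev/sda1 root device name are placeholders, and the function only prints the commands rather than calling AWS.

```shell
#!/bin/bash
# Sketch: dry run of snapshot + image registration with a PVGRUB kernel ID.
# print_register_commands is a hypothetical helper; all IDs are placeholders.

# print_register_commands VOLUME_ID SNAPSHOT_ID AKI
# Only prints the AWS CLI commands; it does not contact AWS.
print_register_commands() {
    local volume_id="$1" snapshot_id="$2" aki="$3"

    echo "aws ec2 create-snapshot --volume-id ${volume_id}"
    echo "aws ec2 register-image --name centos58-pvgrub" \
         "--architecture i386 --kernel-id ${aki}" \
         "--root-device-name /dev/sda1" \
         "--block-device-mappings DeviceName=/dev/sda1,Ebs={SnapshotId=${snapshot_id}}"
}

# aki-407d9529 is the hd0 AKI for i386 in us-east-1 mentioned above.
print_register_commands vol-EXAMPLE snap-EXAMPLE aki-407d9529
```

The snapshot ID passed to the second command would be the one returned by the first, after the snapshot has completed.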

Note: If you do happen to use separate partitions, make sure you use the hd00 AKI. In the menu.lst you need to make sure to use root (hd0,0) instead of just (hd0). Additionally, your menu.lst file needs to live at /boot/boot/grub/menu.lst since AWS is going to look in the /boot/grub/menu.lst location on the /boot/ partition. On a single partition the file can just live at /boot/grub/menu.lst.
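
The two layouts can be summarized in a small lookup. This is just a sketch: pvgrub_settings is a made-up helper name, and the AKI IDs are the i386 / us-east-1 ones mentioned in this post, so other regions and architectures would need the IDs from Amazon's PDF.

```shell
#!/bin/bash
# Sketch: map a partition layout to the AKI, GRUB root and menu.lst path.
# pvgrub_settings is a hypothetical helper; AKIs are i386 / us-east-1 only.

# pvgrub_settings LAYOUT
# LAYOUT is "single" (one partition that contains /boot/) or
# "separate" (a dedicated /boot/ partition).
# Prints: AKI GRUB_ROOT MENU_LST_PATH
pvgrub_settings() {
    case "$1" in
        single)
            echo "aki-407d9529 (hd0) /boot/grub/menu.lst" ;;
        separate)
            echo "aki-4c7d9525 (hd0,0) /boot/boot/grub/menu.lst" ;;
        *)
            echo "usage: pvgrub_settings single|separate" >&2
            return 1 ;;
    esac
}

pvgrub_settings single
pvgrub_settings separate
```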

References

Written by Leif Madsen

2012/08/22 at 9:10 am

4 Responses


  1. It sounds like the difference between those AKIs is whether the root filesystem is on the *whole disk*, or using a partition table and located in the first partition. In GRUB syntax, ‘(hd0)’ is the entire first disk, where ‘(hd0,0)’ is the first partition of the first disk.

    Kevin P. Fleming

    2012/08/22 at 9:15 am

    • Correct 🙂

      Leif Madsen

      2012/08/22 at 9:16 am

      • Hi,

        1. I have installed Slackware 14.0 (64-bit) on my local machine.

        2. I created a 10 GB image file on the Slackware machine using the command below,
        and mounted the image:
        dd if=/dev/zero of=slack14.img bs=1M count=10075
        mount -o loop slack14.img /mnt/slack1464

        3. I formatted the image (slack14.img).

        4. I installed the custom packages through a Ruby script. The custom packages
        installed without any error.

        After that, when I try to chroot into the mounted image (/mnt/slack1464), I get this:

        root@slack1464bit:~# chroot /mnt/slack1464
        chroot: failed to run command ‘/bin/bash’: No such file or directory
        root@slack1464bit:~#

        Earlier I created .img images on Slackware 13.1 and 13.37 without any error.

        I am getting this error on Slackware 14.0 64-bit only.

        Thanks in advance.

        BY
        DAVID

        David

        2012/10/31 at 12:54 am

        • Hi,

          I installed Slackware 14.0 on my local machine and created a new
          10 GB image using the commands below:

          # dd if=/dev/zero of=Slack14.0 bs=1M count=10000
          # mke2fs -F -j Slack14.0
          # mount Slack14.0 /mnt/slackware14.0

          Then I installed the custom packages for Slackware 14.0.

          I referred to the link below and followed the steps in the PDF document:
          http://aws.typepad.com/aws/2010/07/u…mazon-ec2.html

          Download the kernel: http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.34.4.tar.gz

          After untarring the linux-2.6.34 kernel I got two folders (/lib/modules and /boot).
          I installed the /lib/modules folder under /lib and stored the boot files under /boot.

          I also set up the fstab entries, following the steps from the link above.
          Create /etc/fstab and add the following entries to it:

          /dev/xvda1 / ext3 defaults 1 1
          none /dev/pts devpts gid=5,mode=620 0 0
          none /dev/shm tmpfs defaults 0 0
          none /proc proc defaults 0 0
          none /sys sysfs defaults 0 0

          I then created the /boot/grub/menu.lst file with the following contents:
          default 0
          timeout 3
          title kernel-2.6.34
          root (hd0)
          kernel /boot/vmlinuz root=/dev/xvda1 xencons=xvda1 console=xvda1 ro

          I bundled the image with the AKI (aki-407d9529), uploaded it to Amazon
          and registered it. When I run the instance I am unable to log in to the
          server, and I get the error below:

          ec2-get-console-output i-ajht0c9

          [6535502.143450] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
          [6535502.143828] ctnetlink v0.93: registering with nfnetlink.
          [6535502.145187] ip_tables: (C) 2000-2006 Netfilter Core Team
          [6535502.145235] TCP cubic registered
          [6535502.145244] NET: Registered protocol family 17
          [6535502.245110] XENBUS: Device with no driver: device/console/0
          [6535502.247428] EXT3-fs: barriers not enabled
          [6535502.257460] EXT3-fs (xvda1): mounted filesystem with writeback data mode
          [6535502.257484] VFS: Mounted root (ext3 filesystem) readonly on device 202:1.
          [6535502.257779] Freeing unused kernel memory: 484k freed
          [6535502.257953] kjournald starting. Commit interval 5 seconds
          [6535502.471724] mount used greatest stack depth: 4296 bytes left
          [6535512.662690] touch used greatest stack depth: 4120 bytes left
          [6535576.762574] xenbus_dev_shutdown: device/console/0: Initialising != Connected, skipping
          [6535577.114477] Restarting system.

          Please help me fix this issue.

          MIKE

          MIKE

          2012/11/23 at 3:52 am

