New Use for Old Hardware: Network RAID Backup

Long unused, the old P166 PC seemed like it ought to have some use left in it. Then I remembered the 2nd hard drive wasting time in the other PC; the combination of the old PC and a 2nd drive seemed like a good candidate for a RAID, to be used as backup storage over the network. As it turns out, that was perfectly realistic and not too much work, at least if you know what you are doing. I learned a lot.

In the following I describe my actions in detail - but there is no reason for you to follow this too closely. E.g., the distro needn't be SuSE 8.0. Because the machine was so old and only had 32MB of memory, I wanted software as old as reasonably possible - but the kernel of 7.0 seemed to need a patch for RAID, so 8.0 it was. In retrospect something like Damn Small Linux might have been a better choice.

For the other machine, Knoppix is not mandatory either - but using it has the advantage of forgetting everything the next time you boot, thus giving you a clean slate for the next attempt. This is very important for me because I do a lot of experimenting. In particular, SSH settings don't need to be retained, so it doesn't bother me particularly to enter passwords.

In all probability the necessary software is already present after a normal GNU/Linux installation -- other than the RAID tool, 'mdadm', perhaps. Certainly Knoppix includes everything that's needed. It is up to you to ensure that the required packages have been installed. E.g., if you don't want NFS, then don't install it.

There are 4 major parts to this article: Network Installation, RAID, SSH, and NFS. They can be read pretty much independently from each other. In particular, if you don't need to do a network installation, you can skip that part, which overlaps somewhat with NFS.

Although not a prerequisite for understanding, in conjunction with this article you should read "Encrypted Storage with LUKS, RAID and LVM2" by René Pfeiffer in Linux Gazette #140. The two articles complement each other, the common element being RAID. There, the emphasis is on encryption and file systems; here, I focus on network access.

Network Installation

I hadn't intended to do a network installation, thinking that it would be great to let a CD version of Knoppix live in the drive. But the drive refuses to open once Knoppix is booted, and besides, how could Knoppix survive in 32MB? There was no other feasible way to get an operating system on the machine. Certainly having the OS installed improves performance but in this case performance is not that important.

In spite of possible first impressions, a network installation is not significantly different from installing from a CD or DVD. The only catch is that the installation program must give you the opportunity to tell it to use the network rather than the usual device.

The most time-consuming part was to make the partition with the OS as small as possible and then during installation omit as much software as possible - not at all a trivial task. Note that both drives had been partitioned in advance.

The following steps made the CD-drive on the 2nd PC available over the LAN:

  • Boot Knoppix (I used v5.0.1)
  • Check the name of the CD-ROM mount-point in "/etc/fstab", in this case "/media/cdrom"
  • Add it to "/etc/exports" along with the IP-address to be used by the 1st PC:
    	/media/cdrom 192.168.0.101(ro)
    
  • Include the IP-address in /etc/hosts.allow:
    	portmap: 192.168.0.101 : ALLOW
    	mountd: 192.168.0.101 : ALLOW
    
  • Start the network file system:
    	/etc/init.d/portmap start
    	/etc/init.d/nfs-kernel-server start
    

    Note: if this produces a segmentation fault, just try it again.

  • Put the SuSE 8.0 CD 1 in the CD-Drive and let Knoppix mount it.
    If this produces an error similar to this one --
    	mount: can't find /dev/hdc in /etc/fstab or /etc/mtab
    

    then just do it manually:

    	mount /dev/cdrom
    
  • Bring up the network:
    	ifconfig eth0 192.168.0.102 up
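
When experimenting, the server-side entries above tend to get typed more than once. As a rough sketch, they could be added with a small helper that refuses to duplicate a line. The helper name append_once and the ETC_DIR variable are made up for this illustration; by default the sketch writes to a scratch directory rather than /etc.

```shell
#!/bin/sh
# Sketch only: ETC_DIR defaults to a scratch directory so nothing under /etc
# is touched; set ETC_DIR=/etc (as root) to apply the entries for real.
ETC_DIR="${ETC_DIR:-./etc-sketch}"
CLIENT="192.168.0.101"        # the 1st PC, as in the steps above
mkdir -p "$ETC_DIR"

# append_once FILE LINE: add LINE to FILE unless an identical line exists.
append_once() {
    touch "$1"
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

append_once "$ETC_DIR/exports"     "/media/cdrom $CLIENT(ro)"
append_once "$ETC_DIR/hosts.allow" "portmap: $CLIENT : ALLOW"
append_once "$ETC_DIR/hosts.allow" "mountd: $CLIENT : ALLOW"
```

Running it a second time leaves both files unchanged, which is convenient when a step has to be repeated within the same Knoppix session.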
    

Once the CD-drive is available on the 2nd PC, installation can begin on the 1st one.

  • Insert the SuSE 8.0 boot diskette and boot the PC
  • From the menu select the following items:
    • "Installation"
      If offered a manual choice of installation device, select it; if not, the procedure will be unable to find an installation device and will activate manual selection anyhow.
    • "Kernel modules (hardware drivers)"
    • "Load Network Card Modules"
      Insert the modules diskette and then select "rtl8139"
    • "Start Installation / System"
    • "Start Installation/Update"
    • "Network"
    • "NFS"
    • "Automatic configuration via DHCP?"
      Respond "no", then enter the following as requested
      	IP-address	192.168.0.101
      	netmask		255.255.255.0	(default)
      	gateway		192.168.0.101	(default)
      	name server	<ESC>
      	NFS server	192.168.0.102
      	directory	/media/cdrom
      

      The directory entry must match the mount-point in /etc/fstab on the 2nd PC and, of course, the entry in /etc/exports there.

  • When YaST complains about not enough memory and wants to activate swap --
    "Please choose the swap partition"
    choose "hdc1".
  • At this point YaST starts reading from the CD on the 2nd PC and installation proceeds as usual, except that as little software as possible should actually be selected for installation.

If YaST asks for CDs beyond the first one, on the 2nd PC

	/etc/init.d/nfs-kernel-server stop
	umount /dev/cdrom

Remove the CD and insert the next as usual, then

	mount /dev/cdrom
	/etc/init.d/nfs-kernel-server start

Post-Installation Adjustments

Once the installation as such is finished, we need to make a couple of changes to let the PC boot unattended and permit remote access to it.

  • In /etc/inittab change the default runlevel from
    	5  Full multiuser with network and xdm
    

    to

    	3  Full multiuser with network
    
  • In /etc/init.d/network after "start)" add
    	/sbin/ifconfig eth0 192.168.0.101 up
    
  • Add to /etc/hosts.allow
    	sshd: 192.168.0.0/255.255.255.0 : ALLOW
    
  • And then, finally, using YaST add a user for remote access (logon from another machine).
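
The runlevel change can also be made non-interactively. The sketch below assumes the usual SysV init line of the form "id:5:initdefault:" in /etc/inittab (the list above only quotes the descriptive comments, so check your own file first); the function name and the scratch file are hypothetical.

```shell
#!/bin/sh
# set_default_runlevel FILE LEVEL: rewrite the "id:N:initdefault:" line,
# keeping the previous version of the file as FILE.bak.
# Assumption: the file uses the usual SysV init syntax for that line.
set_default_runlevel() {
    cp "$1" "$1.bak" &&
    sed "s/^id:[0-9]:initdefault:/id:$2:initdefault:/" "$1.bak" > "$1"
}

# Demonstration on a scratch copy; as root the real call would be
#   set_default_runlevel /etc/inittab 3
printf '%s\n' '# The default runlevel' 'id:5:initdefault:' > ./inittab.sketch
set_default_runlevel ./inittab.sketch 3
grep initdefault ./inittab.sketch
```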

RAID Setup

Once an operating system is available, setting up the RAID storage on the 1st PC is actually fairly straightforward. During partitioning, the partition type 0xfd had already been set for both /dev/hda2 and /dev/hdc2.

	Disk /dev/hda: 32 heads, 63 sectors, 4238 cylinders
	Units = sectors of 1 * 512 bytes

	   Device Boot    Start       End    Blocks   Id  System
	/dev/hda1            63    465695    232816+  83  Linux
	/dev/hda2        465696   8543807   4039056   fd  Linux raid autodetect

	Disk /dev/hdc: 15 heads, 63 sectors, 8894 cylinders
	Units = sectors of 1 * 512 bytes

	   Device Boot    Start       End    Blocks   Id  System
	/dev/hdc1            63    262709    131323+  82  Linux swap
	/dev/hdc2        262710   8404829   4071060   fd  Linux raid autodetect

For details beyond the scope of this article see the RAID HOWTO.

Before you go any further, you can make sure that your kernel supports RAID (almost certainly the case). If toward the end of the output of "dmesg | less" you see something like the following, you should be in good shape:

	SCSI subsystem driver Revision: 1.00
	request_module[scsi_hostadapter]: Root fs not mounted
	request_module[scsi_hostadapter]: Root fs not mounted
	md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
	md: Autodetecting RAID arrays.
	 [events: 00000000]
	md: invalid raid superblock magic on hda2
	md: hda2 has invalid sb, not importing!
	md: could not import hda2!
	 [events: f6928712]
	md: invalid raid superblock magic on hdc2
	md: hdc2 has invalid sb, not importing!
	md: could not import hdc2!
	md: autorun ...
	md: ... autorun DONE.
	NET4: Linux TCP/IP 1.0 for NET4.0
	IP Protocols: ICMP, UDP, TCP, IGMP

  • If not already available, download the mdadm tool as referenced in the HOWTO and install it
    	rpm -hUv mdadm-2.6-1.src.rpm
    
  • Create /etc/raidtab
    	raiddev /dev/md0
    		raid-level	1
    		nr-raid-disks	2
    		nr-spare-disks	0
    		persistent-superblock 1
    		chunk-size	4
    		device		/dev/hda2
    		raid-disk	0
    		device		/dev/hdc2
    		raid-disk	1
    
  • Prepare the RAID
    	mkraid /dev/md0
    
  • While mkraid is running (this may take quite a while, depending on the size of the RAID partitions) create a mount-point
    	mkdir /DATA
    
  • Make it accessible by everyone
    	chmod a+w /DATA
    
  • And add it to /etc/fstab
    	/dev/md0 /DATA ext2 defaults 1 2
    
  • Once mkraid is done, create a file system
    	mkfs.ext2 /dev/md0
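
The steps above rely on /etc/raidtab and mkraid from the older raidtools package. With mdadm alone, a roughly equivalent sequence might look like the following. This is a sketch under that assumption, not what was run on this machine; the run wrapper and the DRY_RUN variable are invented here so that the commands are only printed unless explicitly enabled.

```shell
#!/bin/sh
# Sketch only: with DRY_RUN=1 (the default here) commands are printed, not run.
# Creating an array destroys existing data on the member partitions.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# RAID 1 across the two partitions of type 0xfd prepared earlier.
run mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/hda2 /dev/hdc2
run mkfs.ext2 /dev/md0
run mdadm --detail /dev/md0      # summary of the array and its members
```

Setting DRY_RUN=0 as root would execute the commands instead of printing them.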
    

SSH Access

At this point the RAID storage can be used for simple backup. However, to enable network access to the network backup PC, /etc/hosts.allow on it must permit sshd (see the entry under NFS Access, below), or at least /etc/hosts.deny must not block sshd.

Assuming that the network is functional and sshd is up on the network backup PC, the following will copy a file to it from some other PC:

	scp /tmp/junk.exe web@192.168.0.101:/DATA/throw_this_away_now

The very first time you do this, you will be asked whether you trust this connection and if you approve, the host key will be permanently stored on the local PC (thus my preference for Knoppix when experimenting). Every time you use any ssh command you will have to enter the password of the remote user, unless you have gone to the trouble to set up appropriate keys on both machines. At Linuxmafia.com/ssh you can find lots of references to SSH including an FAQ.

It is just as easy to retrieve a file from the network backup PC

	scp web@192.168.0.101:/DATA/throw_this_away_now /tmp/junk.exe
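
For parking a whole directory rather than a single file, one approach is to pack it into a tarball first and copy that. A minimal sketch follows; the directory and archive names and the BACKUP_TARGET variable are placeholders chosen for illustration, and the scp step only runs when BACKUP_TARGET is set.

```shell
#!/bin/sh
# Sketch only: SRC_DIR, ARCHIVE and BACKUP_TARGET are illustrative names.
# The scp step runs only when BACKUP_TARGET is set, e.g.
#   BACKUP_TARGET=web@192.168.0.101:/DATA/ sh backup.sh
SRC_DIR="${SRC_DIR:-./sketch-data}"
ARCHIVE="${ARCHIVE:-./sketch-backup.tgz}"

# Create a little sample data if the source directory does not exist yet.
[ -d "$SRC_DIR" ] || { mkdir -p "$SRC_DIR" && echo "hello" > "$SRC_DIR/note.txt"; }

# Pack the directory, then list the archive as a basic sanity check.
tar -czf "$ARCHIVE" -C "$(dirname "$SRC_DIR")" "$(basename "$SRC_DIR")"
tar -tzf "$ARCHIVE" > /dev/null || exit 1

if [ -n "${BACKUP_TARGET:-}" ]; then
    scp "$ARCHIVE" "$BACKUP_TARGET"
fi
```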

You don't even really need a keyboard or monitor on that PC, since everything can be done through SSH; just log on as the remote user. After entering the password, as usual, you can switch to root:

	knoppix@4[knoppix]$ ssh -l web 192.168.0.101
	The authenticity of host '192.168.0.101 (192.168.0.101)' can't be established.
	RSA key fingerprint is 87:5f:41:fb:4d:32:9d:d3:f9:e4:d1:9d:6f:23:4a:fb.
	Are you sure you want to continue connecting (yes/no)? yes
	Warning: Permanently added '192.168.0.101' (RSA) to the list of known hosts.
	web@192.168.0.101's password:
	Last login: Sat Nov  3 13:48:43 2007 from 192.168.0.102
	Have a lot of fun...
	web@linux:~> su
	Password:
	linux:/home/web # cd
	linux:~ # ls
	.   .bash_history  .gnupg  .viminfo  bin      nfs_on
	..  .exrc          .mc     .xinitrc  nfs_off
	linux:~ # pwd
	/root
	linux:~ #

NFS Access

The following steps performed on the network backup PC will enable access to a directory on it from another machine on the LAN as if it were a directory local to that machine. For details beyond the scope of this article see the NFS HOWTO.

  • Initially only the portmapper is active
    	rpcinfo -p localhost
    	   program vers proto   port
    	    100000    2   tcp    111  portmapper
    	    100000    2   udp    111  portmapper
    
  • Add to /etc/hosts.deny
    	ALL : ALL
    
  • Add to /etc/hosts.allow
    	sshd: 192.168.0.0/255.255.255.0 : ALLOW
    	portmap: 192.168.0.0/255.255.255.0 : ALLOW
    	lockd: 192.168.0.0/255.255.255.0 : ALLOW
    	mountd: 192.168.0.0/255.255.255.0 : ALLOW
    	rquotad: 192.168.0.0/255.255.255.0 : ALLOW
    	statd: 192.168.0.0/255.255.255.0 : ALLOW
    
  • Add to /etc/exports
    	/DATA 192.168.0.0/255.255.255.0(rw,root_squash,sync,insecure)
    
  • Then start portmap and NFS
    	/etc/init.d/portmap restart
    	/etc/init.d/nfsserver restart
    

To access the network backup PC from another PC

  • If the network has not yet been brought up locally, do so
    	ifconfig eth0 192.168.0.102 up
    
  • Verify communication
    	knoppix@4[knoppix]$ rpcinfo -p 192.168.0.101
    	program vers proto   port
    	100000    2   tcp    111  portmapper
    	100000    2   udp    111  portmapper
    	100024    1   udp   1027  status
    	100024    1   tcp   1026  status
    	100003    2   udp   2049  nfs
    	100003    3   udp   2049  nfs
    	100021    1   udp   1028  nlockmgr
    	100021    3   udp   1028  nlockmgr
    	100021    4   udp   1028  nlockmgr
    	100005    1   udp   1029  mountd
    	100005    1   tcp   1027  mountd
    	100005    2   udp   1029  mountd
    	100005    2   tcp   1027  mountd
    	100005    3   udp   1029  mountd
    	100005    3   tcp   1027  mountd
    	knoppix@4[knoppix]$ 
    
  • And then mount the network drive locally
    	mount 192.168.0.101:/DATA /mnt
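
The rpcinfo check can be scripted so that the mount is only attempted when the required services are registered. In this sketch has_services is a hypothetical helper that reads "rpcinfo -p" output on standard input.

```shell
#!/bin/sh
# has_services NAME...: read "rpcinfo -p" output on stdin and succeed only
# if every named service appears in the last column.
has_services() {
    listing=$(cat)
    for svc in "$@"; do
        printf '%s\n' "$listing" |
            awk -v s="$svc" '$NF == s { found = 1 } END { exit !found }' ||
            return 1
    done
}

# Intended use (needs the network, so left commented out in this sketch):
#   rpcinfo -p 192.168.0.101 | has_services portmapper nfs mountd \
#       && mount 192.168.0.101:/DATA /mnt
```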
    

At this point, all data in the directory /DATA on the network backup PC can be accessed as if it were physically present on the local mount-point /mnt.

By the way, this is not only the case with the Knoppix environment we have been using here; if rpcinfo shows that NFS is functional, both Mandriva 2007 and Debian 4.0 XFce behave as described - and if the mount-point used is defined in a VirtualBox host as a "shared folder", it is available to a VM under that host.

Trust is Good, But ...

For the skeptics in the crowd and to satisfy my own idle curiosity, I dug out an ancient boot diskette, HAL 91 from 2001, with a kernel that doesn't know about RAID (2.0.39) and booted the machine. Here's what "fdisk" and "ls -l" had to say about the drives:

	Disk /dev/hda: 32 heads, 63 sectors, 4238 cylinders
	Units = sectors of 1 * 512 bytes

	   Device Boot    Start      End   Blocks   Id  System
	/dev/hda1            63   465695   232816+  83  Linux native
	/dev/hda2        465696  8543807  4039056   fd  Unknown
	total 259296

	     4 drwxr-xr-x   2 1000     1000         4096 Nov  4 15:13 Mail/
	     4 drwxr-xr-x  11 501      100          4096 Nov  4 14:56 Pictures/
	     4 drwxr-xr-x   3 root     root         4096 Nov  4 15:39 TEMP/
	     4 drwxrwxrwx   3 501      100          4096 Nov  4 20:47 article/
	    16 drwx------   2 root     root        16384 Nov  2 11:51 lost+found/
	   204 -rwxr-xr-x   1 501      100        201156 Nov  2 12:02 mdadm-2.6-1.src.rpm*
	259056 -rw-r--r--   1 501      100      265004960 Nov  4 15:35 web.tgz
	     4 -rw-r--r--   1 root     root          132 Nov  3 15:31 rpcinfo

	Disk /dev/hdc: 15 heads, 63 sectors, 8894 cylinders
	Units = sectors of 1 * 512 bytes

	   Device Boot    Start      End   Blocks   Id  System
	/dev/hdc1            63   262709   131323+  82  Linux swap
	/dev/hdc2        262710  8404829  4071060   fd  Unknown
	total 259296

	     4 drwxr-xr-x   2 1000     1000         4096 Nov  4 15:13 Mail/
	     4 drwxr-xr-x  11 501      100          4096 Nov  4 14:56 Pictures/
	     4 drwxr-xr-x   3 root     root         4096 Nov  4 15:39 TEMP/
	     4 drwxrwxrwx   3 501      100          4096 Nov  4 20:47 article/
	    16 drwx------   2 root     root        16384 Nov  2 11:51 lost+found/
	   204 -rwxr-xr-x   1 501      100        201156 Nov  2 12:02 mdadm-2.6-1.src.rpm*
	259056 -rw-r--r--   1 501      100      265004960 Nov  4 15:35 web.tgz
	     4 -rw-r--r--   1 root     root          132 Nov  3 15:31 rpcinfo

Indeed, unaware of what partition ID 0xfd means, this kernel just shows the contents. It does seem that the redundancy is working quite nicely!

Ahh, but what about behavior in failure-mode, an emergency? At least in this environment that would be a bit difficult to check out with any degree of certainty. I'm not about to open the PC and destroy one of the hard-drives just to make sure... However, for the curious, here is an excerpt from what "dmesg" reports immediately after a power-failure.

[ You can check the state of the RAID volumes by looking into the /proc filesystem. /proc/mdstat shows all active RAID devices and their states. A "U" means "up", a "_" denotes a missing device. --- René ]
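
A quick check along those lines could be scripted as below. The function name is made up, and the sample file merely imitates the usual /proc/mdstat layout for a RAID 1 with one missing member; it is not output from the machine described here.

```shell
#!/bin/sh
# degraded_arrays FILE: print the names of md devices whose status field
# (e.g. [UU] or [U_]) contains a "_", i.e. a missing member.
degraded_arrays() {
    awk '/^md[0-9]+ :/ { name = $1 }
         /\[[U_]+\]/   { if ($NF ~ /_/) print name }' "$1"
}

# Illustrative sample in the usual /proc/mdstat layout, NOT output captured
# from the machine described in this article.
cat > ./mdstat.sketch <<'EOF'
Personalities : [raid1]
md0 : active raid1 hda2[0]
      4038976 blocks [2/1] [U_]
unused devices: <none>
EOF

degraded_arrays ./mdstat.sketch   # on a live system: degraded_arrays /proc/mdstat
```

An empty result means that no array reported a missing member.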

Conclusion

For my purposes this is an optimal solution to my backup problem. At no further expense, it is now possible to back files up either via ssh or NFS - and the backup is doubly safe due to the RAID.

[ Make sure you have the disks of your RAID device connected to different controllers if using PATA hardware. There are situations when the kernel might deactivate the whole bus system of a single controller. This means both master and slave device will be unavailable. A RAID1 will be gone and a RAID5 will be invalid and require manual recovery (which may not always work as intended). --- René ]

Certainly this is not something any large organization would want to consider; old hardware is not as energy-efficient as modern equipment. Also, the drives are much closer to that feared MTBF than something you'd want to rely on in a commercial environment. Besides, while fine for parking a tarball, if you want to use this as a file server, you'll have to give some thought to UID/GID conflicts, etc. - at least if the operating systems involved have differing algorithms, as is the case with SuSE 8.0 and Knoppix 5.0.1.

But in this limited environment, where all the equipment gets shut down at the end of the day, none of that matters. Indeed, the RAID backup server doesn't even have to be turned on unless there is a need to back something up securely.

[ One last note about secure storage for backups: bear in mind that your backup usually holds a copy of your important data - and is thus a juicy target for break-in attempts. Someone getting hold of a backup system with an unlocked, unencrypted disk partition doesn't need any cracking expertise to get to the interesting details. --- René ]


By Edgar Howell
Edgar is a consultant in the Cologne/Bonn area in Germany. His day job involves helping a customer with payroll, maintaining ancient IBM Assembler programs, some occasional COBOL, and otherwise using QMF, PL/1 and DB/2 under MVS.