Archiving data

Paul Kerry, version: 2004-05-04


Disks available

Although the hard disks in the ultracam computer systems are large, they are not infinite.
You must regularly monitor disk usage with the df -h command, especially on ucam2, which can create 6GB of data per hour. If a disk is more than 90% full, I/O performance is impaired. If, for example, /data on ucam2 reaches 100% while data is being written, the system will crash. Move, delete or compress files before this situation arises.
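
The 90% check can also be scripted. The following is a minimal sketch and not part of the ULTRACAM software: the function name disks_over is illustrative, and df -P (the POSIX one-line-per-filesystem format) is used instead of df -h only because it is easier to parse.

```shell
# disks_over THRESHOLD: read `df -P` output on standard input and print the
# mount point and capacity of every filesystem at or above THRESHOLD percent.
# (Hypothetical helper; mount points containing spaces are not handled.)
disks_over() {
    awk -v limit="$1" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)
        if (pct + 0 >= limit + 0) print $6, $5
    }'
}

# Typical use on ucam2 or ultracam:
#   df -P | disks_over 90
```

No output means that every disk is below the threshold.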

ultracam disks

(information taken from df -h)

Filesystem  Size  Mounted on
/dev/sda2   27G   /
/dev/sda1   91M   /boot
/dev/sda4   38G   /home
/dev/sdb1   67G   /ultracam_data
/dev/hdb1   187G  /archive
/dev/hda1   56G   /win

The /ultracam_data disk is for copying data from ucam2 so backups can be created.

The /archive disk is for long term storage of data files. You should read the further instructions on the use of /archive.

The /win disk is the Windows XP Professional disk. Files can be copied from this disk but not to it.

ucam2 disks

(information taken from df -h)

Filesystem  Size  Mounted on
/dev/hda2   5.1G  /
/dev/hda1   99M   /boot
/dev/hdb1   20G   /ultracam
/dev/hdb2   94G   /data
/dev/sda1   69G   /data1
/dev/sdb1   69G   /data2
/dev/sdc1   69G   /data3
/dev/sdd1   69G   /data4
/dev/sde1   138G  /data5

The /ultracam disk contains the control software.

The /data disk is where the instrument writes data. (The ultracam software actually writes to /ultracam/data, so do not remove the data symbolic link in /ultracam.)
Ultracam raw data is written as two files: a data file (.dat) and an XML header file (.xml). Both file names start with the prefix "run", followed by a number from 001 to 999.
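
Because every run consists of a .dat and a .xml file, it can be worth checking that both halves are present before moving or archiving data. This is a small sketch under the naming scheme described above; the function name unpaired_runs is illustrative.

```shell
# unpaired_runs DIR: print every runNNN.dat or runNNN.xml in DIR whose
# partner file (same run number, other extension) is missing.
unpaired_runs() {
    for f in "$1"/run[0-9][0-9][0-9].dat "$1"/run[0-9][0-9][0-9].xml; do
        [ -e "$f" ] || continue          # the pattern matched nothing
        case $f in
            *.dat) partner=${f%.dat}.xml ;;
            *.xml) partner=${f%.xml}.dat ;;
        esac
        [ -e "$partner" ] || echo "$f"
    done
}

# Example: unpaired_runs /data
```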

/data1, /data2, /data3, /data4 and /data5 make up the SCSI data array which is housed in the lowest crate in the rack unit. The SCSI data array is not a RAID system. These disks are for saving recent data from the /data disk. The filesystem in use on these disks is reiserfs (the other disks are ext3 filesystems).


Copying data from ucam2 to ultracam

At the end of the night you should

on ucam2

on ultracam

Copying the files may take several hours as the network speed is around 10MB/sec.
DO NOT USE scp -r AS THIS COULD CRASH THE COMPUTER BY OVERLOADING THE FILESYSTEM BUFFERS.
You can automate copying the files by using a script that is provided as a template.

on ultracam
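
As an illustration of copying one file at a time without scp -r, here is a minimal sketch. It is not the template script itself: the function name copy_listed, the COPY_CMD variable and the use of scp -p are assumptions made for this example.

```shell
# Command used for each single-file copy. scp -p (preserve times and modes)
# is an assumption; note that -r is deliberately absent.
COPY_CMD=${COPY_CMD:-scp -p}

# copy_listed DEST_DIR: read one source path per line on standard input and
# copy the files to DEST_DIR one at a time, stopping at the first failure.
copy_listed() {
    dest=$1
    while IFS= read -r src; do
        # </dev/null stops scp/ssh from swallowing the rest of the list.
        $COPY_CMD "$src" "$dest/" </dev/null || {
            echo "copy failed: $src" >&2
            return 1
        }
        echo "copied $src"
    done
}

# Possible use on ultracam (destination directory is illustrative):
#   ssh ucam2 'ls /data/run*' | sed 's/^/ucam2:/' |
#       copy_listed /ultracam_data/2004-05-04
```

Copying sequentially keeps only one transfer active at a time, which is the point of avoiding the recursive copy.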


General notes about writing data files

DDS tapes, DVD-R and CD-R disks are kept in one of the ULTRACAM packing crates.

It is likely that you will need to use multiple DVD-Rs or tapes to write an entire observing run.
If you are observing on multiple nights, you are likely to end up with files of the same name, e.g. run001.dat and run001.xml from 1st Jan and run001.dat and run001.xml from 2nd Jan. Be careful that you do not copy files or create links with the same name onto your media. It is a good idea to create a separate subdirectory for each night, using the date of observation as the directory name.
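
If each night is kept in its own dated subdirectory, a quick check shows which file names would collide if the nights were ever merged into one directory. This is a sketch only; the function name duplicate_names and the example paths are illustrative.

```shell
# duplicate_names DIR...: print each file name that occurs in more than one
# of the given directories (for example one directory per night).
duplicate_names() {
    for d in "$@"; do
        ls "$d"
    done | sort | uniq -d
}

# Example:
#   duplicate_names /ultracam_data/2004-01-01 /ultracam_data/2004-01-02
```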


Tape Drive

The tape drive within ultracam is a Sony SDT-10000 DDS4 20GB (native) device. This also supports all lower DDS versions.
If the status light is on and then flashes off every 4 seconds, then the drive needs cleaning. There is a cleaning cartridge within the computing box file. Simply insert the cleaning tape and wait a few moments - the lights will flash while cleaning is in progress and then the tape will eject automatically. All the lights should now be off. Please write the date on the label so there is a record of how many times the tape has been used.

The tape drive names are

Tapes are written in the following style

When using multiple tar files on a tape, a marker is automatically inserted when you start the next write.

Useful tape drive commands (man mt for more information)
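
A few standard mt operations, wrapped in a small helper so the device name is set in one place. The device name /dev/nst0 (the usual Linux name for the first non-rewinding SCSI tape device) is an assumption, not taken from this system; the helper name tape and the MT variable are likewise illustrative.

```shell
TAPE=${TAPE:-/dev/nst0}   # assumed: non-rewinding SCSI tape device on Linux
MT=${MT:-mt}

# tape OPERATION [COUNT]: run one mt operation on $TAPE.
tape() {
    $MT -f "$TAPE" "$@"
}

#   tape status     show drive and tape status
#   tape rewind     rewind to the start of the tape
#   tape fsf 1      skip forward over one file marker (to the next tar file)
#   tape offline    rewind and eject the tape
```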

Writing files to tape

tar is an easy way of writing tapes.
At the time of writing this guide, the version in use is GNU tar 1.13.25-2 from the debian distribution.

Other useful tar options can be found via man tar or info tar, but selected ones are

GNU tar uses relative archives by default. This means that you can extract a tarfile into any location. Other tar programs usually use absolute archives which require root access on a machine to create the complete path from "/".
It is a good idea to write on the tape which method you used to write the tape in the first instance - in the future you may not remember how you did it and you will no doubt be pestering your system manager!
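
To illustrate a relative archive, the sketch below writes one night's directory so that the member names start at the directory name rather than at "/". The device name /dev/nst0 and the function name write_night are assumptions; setting TAPE to an ordinary file name gives a safe trial run.

```shell
# Archive destination: the tape device name is an assumption; any ordinary
# file name also works, which is handy for a trial run.
TAPE=${TAPE:-/dev/nst0}

# write_night PARENT_DIR NAME: archive PARENT_DIR/NAME with the member names
# stored relative to PARENT_DIR (NAME/run001.dat, ...), not as absolute paths.
write_night() {
    tar -cvf "$TAPE" -C "$1" "$2"
}

# Example: write_night /ultracam_data 2004-05-04
```

Here -C changes into the parent directory before the files are read, which is what keeps the stored paths relative.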

Verifying tapes
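
One way to verify an archive with GNU tar is its compare mode (-d), which checks each member against the file of the same name on disk. This is a sketch assuming the archive was written with member names relative to a parent directory and that the tape is positioned at the start of the archive; /dev/nst0 and the name verify_night are assumptions.

```shell
TAPE=${TAPE:-/dev/nst0}   # assumed device name; a plain file also works

# verify_night PARENT_DIR: compare every member of the archive on $TAPE with
# the file of the same relative name under PARENT_DIR. GNU tar reports each
# difference and returns a non-zero status if any are found.
verify_night() {
    tar -df "$TAPE" -C "$1"
}

# Example: verify_night /ultracam_data
```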

Extracting tapes
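
Because the archive is relative, it can be unpacked into any directory you can write to. A minimal sketch, again assuming /dev/nst0 as the device name and using the illustrative function name extract_night:

```shell
TAPE=${TAPE:-/dev/nst0}   # assumed device name; a plain file also works

# extract_night TARGET_DIR: create TARGET_DIR if necessary and unpack the
# archive on $TAPE beneath it.
extract_night() {
    mkdir -p "$1" && tar -xvf "$TAPE" -C "$1"
}

# Example: extract_night /home/observer/restore
```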


DVD-R Drive

The DVD writer is an IDE interfaced Sony DRU-500AX which supports CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-R, DVD-RW, DVD+R and DVD+RW.
At the time of writing, Linux software writes DVD-R and DVD-RW formats which are based on the standards from the DVD Forum. The other formats are available from Windows if required.

Currently supported maximum write speeds are...

CD-R    24x  NB: University of Sheffield Intenso disks are 16x.
DVD-R   4x   NB: University of Sheffield Verbatim disks are 4x, Sony disks are 2x.
DVD-RW  2x

Take note of the maximum write speed of the media you are using. DVD-R disks must be specified as being 4x capable to achieve the maximum speed otherwise the drive defaults to a lower writing speed automatically.

Take note that the stated capacity of media may not match the size reported by the filesystem. 4.7GB DVD-R disks, for example, hold around 4700000000 bytes, whereas 4.7GB measured in units of 1024^3 bytes (the units used by ls -h, df -h and du -h) is 5046586572 bytes. This can cause confusion if you are using those commands. When creating iso images, filesystem information is also written onto the disk, which takes a small amount of capacity.

If you do use the -h options, make sure that the files are 4.4GB or under.
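
The arithmetic can be checked directly: 4700000000 bytes is about 4.38 in the 1024^3-byte units that the -h options use. The sketch below converts a byte count and tests whether a directory of files fits; the function names are illustrative, and the check deliberately ignores the iso filesystem overhead, so leave some margin.

```shell
# Nominal capacity of a "4.7GB" DVD-R in bytes (the figure quoted above).
DVD_BYTES=4700000000

# to_binary_gb BYTES: print BYTES in units of 1024^3 bytes, the unit used by
# ls -h, df -h and du -h, to two decimal places.
to_binary_gb() {
    awk -v b="$1" 'BEGIN { printf "%.2f\n", b / (1024 * 1024 * 1024) }'
}

# fits_on_dvd DIR [LIMIT_BYTES]: succeed if the files in DIR (symbolic links
# are followed) add up to no more than the limit. Filesystem overhead in the
# iso image is NOT accounted for.
fits_on_dvd() {
    limit=${2:-$DVD_BYTES}
    total=0
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        total=$((total + $(wc -c < "$f")))
    done
    [ "$total" -le "$limit" ]
}

# to_binary_gb 4700000000    prints 4.38
# to_binary_gb 5046586572    prints 4.70
```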

The Yamaha SCSI CD-R drive is now disconnected and is retained as a spare drive in case of equipment failure.

Creating a CD-R

Creating a DVD-R

All data runs obtained in a night should be written uncompressed to DVD-R. This is the master copy and is kept at the University of Sheffield.

Always label the DVD BEFORE WRITING in case you scratch the surface layer. You should have the following details on both the disk and jewel case...

It takes approximately 15 minutes to write a full DVD-R at 4x speed.

Observers can make their own DVDs containing just the data they require.

Make sure that you verify the DVD after burning.

Instead of copying large amounts of data around the system, you can create symbolic links. A symbolic link is simply a pointer to a file in another directory.

A symbolic link is created by the following command...

If you use rm on a symbolic link, then the link is removed and not the actual file.
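
The general form is ln -s TARGET LINKNAME. The helper below is a small sketch around that command; the name link_into and the source path in the example are illustrative.

```shell
# link_into SRC_FILE LINK_DIR: create LINK_DIR/<name> pointing at SRC_FILE.
# SRC_FILE should be an absolute path so the link works from any directory.
link_into() {
    mkdir -p "$2" && ln -s "$1" "$2/$(basename "$1")"
}

# Example (source path illustrative):
#   link_into /ultracam_data/2004-04-29/run001.dat \
#             /home/star/dvd/2004-04-29/sym_links/disk1
#   rm /home/star/dvd/2004-04-29/sym_links/disk1/run001.dat   # removes only the link
```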

Examples of symbolic links can be found in the /home/star/dvd/2004-04-29/sym_links/disk1 directory on ultracam.

You can run full archive backups while the system is being used by observers if you use the nice command, examples of which are...
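
As one hedged illustration, an iso image can be built at low priority with nice. The priority value 19, the function name make_image and the MKISOFS variable are assumptions for this sketch; -r, -f and -o are standard mkisofs options.

```shell
MKISOFS=${MKISOFS:-mkisofs}

# make_image LINK_DIR IMAGE: build an iso image of LINK_DIR at low priority.
#   -r  Rock Ridge extensions with sensible ownership and permissions
#   -f  follow symbolic links, so the real files are stored, not the links
#   -o  write the image to the named file
make_image() {
    nice -n 19 $MKISOFS -r -f -o "$2" "$1"
}

# Example (image path illustrative):
#   make_image /home/star/dvd/2004-04-29/sym_links/disk1 /ultracam_data/disk1.iso
```

Only the image-building step is run under nice here; the writing step is left at normal priority.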

Do not use the other DVD drive or the /archive disk while writing a DVD as you may cause a buffer underrun (this is caused by simultaneously accessing IDE devices on the system and reducing data throughput).

Do not put nice before the cdrecord-prodvd command.

If you are unsure as to the speed capability of the disk, you may still specify speed=4 as the software automatically detects the speed by probing the disk when inserted in the drive.

Note that the -r option is now preferred over -R. Please review the manual pages for further information.

By using the -v option, the current status of the input and output buffers can be viewed as well as the writing speed and MBytes written. Do not worry if it appears that cdrecord-prodvd has hung at the start - it takes at least sixty seconds before the status information is echoed to the terminal. If the input or output buffers go to zero percent, then the disk will be corrupted. You can usually increase the buffer values by stopping other processes running on the system.

A typical successful writing session's last few lines of output should look like this...

If the "Min drive buffer fill" hits 0%, then the DVD is corrupt, probably because too many processes were running on the system. Try again with a new disk, making sure the system is in a quiescent state.

If you get a large number of SCSI errors instead of the output above, then the media itself most likely has a fault on the writing layer. Try again with a new disk.

Walk-through

You will more than likely require multiple disks so you need to work out which files will go onto which DVD.

In a different directory on ultracam, you are going to create a range of symbolic links to the files instead of copying large files around the system.
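
Working out which files go onto which DVD can be done by hand or with a simple sequential fill, sketched below. The function name plan_disks is illustrative, the limit is whatever byte count you choose to allow per disk, and the iso filesystem overhead is not included, so pick a limit a little below the media capacity.

```shell
# plan_disks DIR LIMIT_BYTES: walk the files in DIR in name order and print
# "diskN filename" for each, starting a new disk whenever adding the next
# file would exceed LIMIT_BYTES. A single file larger than the limit still
# gets a disk of its own, so check for that case separately.
plan_disks() {
    disk=1
    used=0
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        size=$(wc -c < "$f")
        if [ "$used" -gt 0 ] && [ $((used + size)) -gt "$2" ]; then
            disk=$((disk + 1))
            used=0
        fi
        used=$((used + size))
        echo "disk$disk $(basename "$f")"
    done
}

# Example: plan_disks /ultracam_data/2004-04-29 4600000000
```

The plan is only a listing; a run's .dat and .xml files can end up on different disks at a boundary, so adjust the result by hand before creating the links in disk1, disk2 and so on.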

Verify the DVD

Do not verify a DVD while writing another DVD as you could cause a buffer underrun to the disk being written.

You should have written all of the information on the label before verification.

Eject the DVD from the writer and put it in the DVD-ROM drive which has a much faster access time.
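
One way to verify the disk is a byte-for-byte comparison of every file against its original. This is a sketch assuming the DVD is already mounted; the mount point /mnt/dvd and the function name verify_disc are illustrative.

```shell
# verify_disc LINK_DIR MOUNT_DIR: compare every file in LINK_DIR byte for byte
# with the file of the same name under MOUNT_DIR; report each result and
# return non-zero if any file differs or is missing.
verify_disc() {
    bad=0
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        name=$(basename "$f")
        if cmp -s "$f" "$2/$name"; then
            echo "ok   $name"
        else
            echo "FAIL $name"
            bad=$((bad + 1))
        fi
    done
    [ "$bad" -eq 0 ]
}

# Example:
#   verify_disc /home/star/dvd/2004-04-29/sym_links/disk1 /mnt/dvd
```

cmp follows symbolic links, so the directory of links used to build the disk can serve as the reference.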

Eject the DVD and you are finished!


The /archive disk

The 187GB /archive disk should be used when /ultracam_data (or /home) is getting full.
Because /archive is an IDE disk, it is slower than the SCSI disks used elsewhere in the ultracam system. It is possible to overload the filesystem buffers if you try to copy too much data at the same time. You should therefore copy files individually or in small groups. It is a good idea to run diff afterwards to check the files.

