G53OPS - Operating Systems

This course is run at The University of Nottingham within the School of Computer Science & IT. The course is run by Graham Kendall (email: gxk@cs.nott.ac.uk)


File Structure

Files can be structured in various ways. One approach adopted by UNIX, MS-DOS and Windows is simply to have the file stored as a sequence of bytes. It is up to the program that accesses the file to interpret the byte sequence.
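
As a minimal sketch of this idea, the C program below imposes its own interpretation on the raw bytes, treating the first four bytes of the file as a little-endian integer. The filename data.bin and the chosen interpretation are purely illustrative; the operating system neither knows nor cares what the bytes mean.

#include <stdio.h>

/* Minimal sketch: the operating system hands us an
   uninterpreted byte stream; this program chooses to treat
   the first four bytes as a little-endian integer.
   The filename "data.bin" is only an example. */
int main(void)
{
    FILE *fp = fopen("data.bin", "rb");
    unsigned char b[4];

    if (fp == NULL || fread(b, 1, 4, fp) != 4)
        return 1;

    unsigned long value = b[0]
                        | ((unsigned long)b[1] << 8)
                        | ((unsigned long)b[2] << 16)
                        | ((unsigned long)b[3] << 24);
    printf("%lu\n", value);
    fclose(fp);
    return 0;
}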

This structure is useful in that programs can do whatever they like, as the operating system imposes no restrictions on the file's contents. In addition, the program can decide which extension to give the file so that the correct application is used if the file is opened via its association.

One way of adding structure to a file is to split it into fixed length records. This method is common on mainframes and can be traced back to the use of punched cards. When it became possible to store card images on disc, it was sensible to read 80 bytes at a time, as the programmer knew that each 80 bytes represented one card.

But a record size of 80 was just the first step. By defining the file with a different value, a different record size could be set up to meet the requirements of the programmer (e.g. 160 for print lines produced for a line printer).
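
A sketch in C of reading such a file of card images, 80 bytes at a time, might look as follows (the filename cards.dat is hypothetical):

#include <stdio.h>

#define RECORD_SIZE 80   /* one punched-card image */

/* Sketch: read a file of fixed length records one record at
   a time. The filename "cards.dat" is only an example. */
int main(void)
{
    FILE *fp = fopen("cards.dat", "rb");
    char record[RECORD_SIZE + 1];

    if (fp == NULL)
        return 1;

    while (fread(record, 1, RECORD_SIZE, fp) == RECORD_SIZE) {
        record[RECORD_SIZE] = '\0';
        printf("%s\n", record);   /* process one 80-byte record */
    }
    fclose(fp);
    return 0;
}

Changing RECORD_SIZE to 160 would give the line printer layout mentioned above.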

A development of the fixed length record was a file containing variable length records. It was quickly recognised that fixed length records could be wasteful of space. For example, if you defined a file to contain names and addresses, every record had to be the same size, even though some people have short names and some have long names. In fact, the field size was dictated by the longest record that could possibly be stored.

Some operating systems recognised this limitation and did not waste the space on disc, but the programmer still had to allow for the maximum record size in his/her program.

Variable length records allowed each field to have its size specified so that a surname field (for example) could have a different size to a line of an address.
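
One common way of implementing variable length records is to prefix each record with its length. The sketch below assumes that layout (a one-byte length followed by the data) and a hypothetical filename names.dat; real systems differ in the details of how lengths are recorded:

#include <stdio.h>

/* Sketch: each record is stored as a one-byte length followed
   by that many bytes of data, so a short surname no longer
   occupies the space a fixed length field would reserve.
   The layout and the filename "names.dat" are assumptions. */
int main(void)
{
    FILE *fp = fopen("names.dat", "rb");
    char buf[256];
    int len;

    if (fp == NULL)
        return 1;

    while ((len = fgetc(fp)) != EOF) {
        if (fread(buf, 1, (size_t)len, fp) != (size_t)len)
            break;
        buf[len] = '\0';
        printf("record: %s\n", buf);
    }
    fclose(fp);
    return 0;
}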

However, this method also has its problems. Consider what they are. I have given one problem at the end of this handout (see question 1).

A further type of file structure places an index (possibly more than one) within the file. Each index entry points to a record within the file. If the index is kept sorted on a key (the data itself does not have to be sorted) then random access is possible.

These types of file are often used on mainframe computers (and thus in large data processing systems) and, although the structure of a database is a lot more complicated, databases use the same concept of an index to access the data as fast as possible.
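
The sketch below illustrates the idea in C: one pass over the file builds an in-memory index of record offsets, after which a lookup can seek straight to the record it wants. The record layout, the eight-byte key at the start of each record, and the filename master.dat are all assumptions, and for brevity the index is searched linearly where a real system would keep it sorted and use a binary search.

#include <stdio.h>
#include <string.h>

#define RECORD_SIZE 80
#define MAX_RECORDS 1000
#define KEY_SIZE    8

struct index_entry {
    char key[KEY_SIZE + 1];
    long offset;
};

int main(void)
{
    static struct index_entry idx[MAX_RECORDS];
    char record[RECORD_SIZE + 1];
    int n = 0;
    FILE *fp = fopen("master.dat", "rb");

    if (fp == NULL)
        return 1;

    /* Pass 1: note where each record starts, keyed on its
       first eight bytes (an assumed layout). */
    while (n < MAX_RECORDS
           && fread(record, 1, RECORD_SIZE, fp) == RECORD_SIZE) {
        memcpy(idx[n].key, record, KEY_SIZE);
        idx[n].key[KEY_SIZE] = '\0';
        idx[n].offset = (long)n * RECORD_SIZE;
        n++;
    }

    /* Lookup: find the key in the index, then seek directly
       to its record without reading what precedes it. */
    const char *wanted = "SMITH   ";   /* hypothetical key */
    for (int i = 0; i < n; i++) {
        if (strcmp(idx[i].key, wanted) == 0) {
            fseek(fp, idx[i].offset, SEEK_SET);
            if (fread(record, 1, RECORD_SIZE, fp) == RECORD_SIZE) {
                record[RECORD_SIZE] = '\0';
                printf("%s\n", record);
            }
            break;
        }
    }
    fclose(fp);
    return 0;
}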

Due to the way these files are structured, they may need to be reorganised after a time or they will become inefficient. The reasons for this were covered in the lecture.

File Access

Above, whilst looking at file structures, we mentioned random access. We ought to say what we mean by this.

There are essentially two ways to access a file. The first is sequential access: we start at the beginning and read every record until we find the one we require. When mainframe operating systems kept the majority of their data on magnetic tape, sequential access was the only kind available. This led to a model of batch updating, which was explained in the lectures.

But as disc space became cheaper, more and more files were stored on disc, and this made random access possible. Or, more exactly, it allowed a program to access the record it required directly, without having to read all the data that preceded it.
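
In C, direct access amounts to seeking to the right offset before reading. This sketch assumes a file of 80-byte fixed length records; the filename and the record number are hypothetical:

#include <stdio.h>

#define RECORD_SIZE 80

/* Sketch of direct (random) access: to read record n we seek
   to n * RECORD_SIZE instead of reading every record before it.
   The filename and record number are only examples. */
int main(void)
{
    FILE *fp = fopen("cards.dat", "rb");
    char record[RECORD_SIZE + 1];
    long n = 42;   /* hypothetical record number */

    if (fp == NULL)
        return 1;

    if (fseek(fp, n * RECORD_SIZE, SEEK_SET) == 0
        && fread(record, 1, RECORD_SIZE, fp) == RECORD_SIZE) {
        record[RECORD_SIZE] = '\0';
        printf("%s\n", record);
    }
    fclose(fp);
    return 0;
}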
