What is the filesystem in Unix

Basic properties of Unix file systems

The easiest way to demonstrate the basic properties of Unix file systems is to compare them to a known file system. The DOS FAT system is ideal because it has many similarities to Unix and is well known.

In the DOS file system, every file has a few attributes that are stored in its directory entry on the hard disk. These attributes are:

  • File name (8 bytes)
  • Extension (3 bytes)
  • Size (4 bytes)
  • Date / time of the last write access (4 bytes)
  • Attribute byte (archive, directory, system, hidden, read-only) (1 byte)
  • Number of the first cluster used

In contrast, Unix files have a much larger set of attributes, the most important of which are:

  • File name (255 bytes)
  • Size (4 bytes)
  • Date of the last write access (4 bytes)
  • Date of the last access (also read) (4 bytes)
  • Date of the last status change (4 bytes)
  • Owner UID (2 bytes)
  • GroupID (2 bytes)
  • Number of file names (hard links) (2 bytes)
  • Access mode and file type (2 bytes)
  • Inode number
  • Multiple references to the data blocks
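
Most of the attributes listed above can be read back from a running system. The following is a minimal sketch in Python, assuming a Unix-like system; the helper name describe is made up for this illustration, and the dictionary keys are only labels chosen to match the list.

```python
import os
import tempfile


def describe(path):
    """Return the attributes from the list above for one file."""
    st = os.stat(path)
    return {
        "size": st.st_size,      # size in bytes
        "mtime": st.st_mtime,    # last write access
        "atime": st.st_atime,    # last access (including reads)
        "ctime": st.st_ctime,    # last status change
        "uid": st.st_uid,        # owner UID
        "gid": st.st_gid,        # group ID
        "nlink": st.st_nlink,    # number of file names (hard links)
        "mode": oct(st.st_mode), # access mode and file type
        "inode": st.st_ino,      # inode number
    }


if __name__ == "__main__":
    # Hypothetical example file, created only for this demonstration.
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"hello")
        tmp.flush()
        for key, value in describe(tmp.name).items():
            print(f"{key:>6}: {value}")
```

The file name itself does not appear in the result, because it is what was passed in to look the file up.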

Filenames

The first major difference is already visible in the file name. A Unix file name can be up to 255 characters long. Another major difference from DOS / Windows file names is that Unix strictly distinguishes between upper and lower case. This means that the file names

File1.txt
file1.txt
FILE1.TXT

denote three different files.
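
This can be tried out with a short Python sketch. It assumes a case-sensitive file system, as is usual on Linux; on a case-insensitive mount the three names would collapse into a single file. The function name create_and_list is made up for this illustration.

```python
import os
import tempfile


def create_and_list(names):
    """Create one empty file per name in a fresh directory and list it."""
    with tempfile.TemporaryDirectory() as directory:
        for name in names:
            with open(os.path.join(directory, name), "w"):
                pass  # an empty file is enough
        return sorted(os.listdir(directory))


if __name__ == "__main__":
    # Three separate entries are expected on a case-sensitive file system.
    print(create_and_list(["File1.txt", "file1.txt", "FILE1.TXT"]))
```
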
Theoretically, a Unix file name can contain almost any character, including spaces (only the slash and the null byte are excluded), but you make your life unnecessarily difficult with such file names. We will see later, when dealing with the shell, how such names can be used anyway. To start with, you are on the safe side if you use only letters, digits, dots, underscores and hyphens.

The dot has no special meaning in Unix file names, so a name can contain any number of them. The division into name and extension, as under DOS, does not exist here. This also means that Unix does not use an extension to find out what type of file it is.

The dot does have a special meaning when it is the first character of a file name: the normal ls command does not display such a file, so it is hidden, so to speak.
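
The hiding rule is purely a naming convention, which a few lines of Python can imitate. This is only a sketch of the behaviour described above, not the actual implementation of ls; the function name visible and the sample names are made up.

```python
def visible(names, show_all=False):
    """Mimic the default listing: skip names that start with a dot."""
    return [name for name in names if show_all or not name.startswith(".")]


if __name__ == "__main__":
    names = [".profile", "notes.txt", "archive.tar.gz", ".hidden.conf"]
    print(visible(names))                 # names with a leading dot are left out
    print(visible(names, show_all=True))  # everything is listed
```

Dots in the middle of a name, as in archive.tar.gz, play no part in the decision.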

Access date

In contrast to DOS, Unix keeps not just one date per file, but three. The first corresponds to the DOS date and records the last write access, i.e. it shows when the file was last changed. This date is called mtime (modification time).

The second date records the last access of any kind, so read accesses are also noted here. This date is called atime (access time).

The third date records the last status change, e.g. a change of access rights, owner or group. This date is called ctime (change time). Of course, it can only be updated if the data carrier is not mounted read-only, so it cannot work on CD-ROMs, for example.
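
The difference between mtime and ctime can be observed by changing only the access rights of a file. The following Python sketch assumes a writable Unix file system; the function names timestamps and chmod_effect are made up for this illustration.

```python
import os
import tempfile


def timestamps(path):
    """Return the three dates of a file in nanoseconds."""
    st = os.stat(path)
    return {"mtime": st.st_mtime_ns, "atime": st.st_atime_ns, "ctime": st.st_ctime_ns}


def chmod_effect(path):
    """Change only the access rights and return the dates before and after."""
    before = timestamps(path)
    os.chmod(path, 0o640)  # a status change, not a write to the content
    after = timestamps(path)
    return before, after


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "example.txt")  # hypothetical file
        with open(path, "w") as handle:
            handle.write("some content\n")
        before, after = chmod_effect(path)
        print("mtime unchanged:", before["mtime"] == after["mtime"])
        print("ctime moved on: ", after["ctime"] >= before["ctime"])
```
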

Owner

Each file belongs to one user. As a rule, the owner is the user who created the file. System files usually belong to the system administrator or to specific administrative users. It is not the user name but the numeric user ID (UID) that is stored in the file system.

Group membership

Each file belongs to exactly one group. This group membership can be changed as required. The owner of a file does not have to be a member of that group, although in practice this is usually the case. The group membership plays a major role in the question of access rights to a file.
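
Because only numeric IDs are stored with the file, names have to be looked up separately. The sketch below uses Python's pwd and grp modules for that lookup, assuming a Unix-like system; the function name ownership is made up, and an ID without an entry in the user or group database is reported as None.

```python
import grp
import os
import pwd
import tempfile


def ownership(path):
    """Return the stored numeric IDs and, where known, the matching names."""
    st = os.stat(path)
    try:
        user = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        user = None  # the UID has no entry in the user database
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = None  # the GID has no entry in the group database
    return {"uid": st.st_uid, "user": user, "gid": st.st_gid, "group": group}


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        print(ownership(tmp.name))
```
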

Number of filenames

Under Unix, a file can have several file names, so-called hard links. This field records how many names currently refer to the file. Hard links are explained in more detail later, in the section on Unix file types.
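
The counter can be watched by giving a file a second name. The following Python sketch assumes a file system that supports hard links; the function name link_demo and the file names are made up for this illustration.

```python
import os
import tempfile


def link_demo():
    """Give one file a second name and report link count and inode identity."""
    with tempfile.TemporaryDirectory() as directory:
        first = os.path.join(directory, "original.txt")
        second = os.path.join(directory, "second-name.txt")
        with open(first, "w") as handle:
            handle.write("same data\n")
        os.link(first, second)  # create a hard link: a second name for the file
        a, b = os.stat(first), os.stat(second)
        return a.st_nlink, a.st_ino == b.st_ino


if __name__ == "__main__":
    count, same_inode = link_demo()
    print("number of file names:", count)
    print("both names share one inode:", same_inode)
```
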

Access mode and file type

Every file has a so-called access mode, which describes what kind of file it is and who is allowed to do what with it. The 16-bit number that encodes this information is divided into five fields: one 4-bit field for the file type and four 3-bit fields for the rights.
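
The division of the 16-bit number can be reproduced with shifts and masks. This is a minimal Python sketch; the function name split_mode and the field labels are made up, and the field called "special" is the fourth 3-bit group that is mentioned again at the end of this section.

```python
import stat


def split_mode(mode):
    """Split a 16-bit access mode into its five fields."""
    return {
        "type": (mode >> 12) & 0o17,    # 4 bits: file type
        "special": (mode >> 9) & 0o7,   # 3 bits: additional entry for programs
        "owner": (mode >> 6) & 0o7,     # 3 bits: rights of the owner
        "group": (mode >> 3) & 0o7,     # 3 bits: rights of group members
        "others": mode & 0o7,           # 3 bits: rights of all others
    }


if __name__ == "__main__":
    mode = stat.S_IFREG | 0o640  # a regular file with the rights 640
    print(oct(mode), split_mode(mode))
```

In practice the standard helpers stat.S_IFMT and stat.S_IMODE do the same job for the type field and the twelve rights bits.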

The most important values are displayed by the ls -l command. The following abbreviations are used:

File type                      Rights of owner   Rights of group members   Rights of all others
-  Regular file                r  w  x           r  w  x                   r  w  x
d  Directory                   4  2  1           4  2  1                   4  2  1
l  Symbolic link               0-7               0-7                       0-7
b  Block-oriented device
c  Character-oriented device
s  Socket
p  Named pipe

The values of the individual rights for the owner, group members and all others are added up separately, so that each of the three yields one octal digit (0-7). The meaning of the individual letters and numbers is simple:

r or 4   Right to read
w or 2   Right to write
x or 1   Right to execute

Read permission means that a user who has this right can read the content of the file. Write permission means that the user can change the content of the file; whether a file may be deleted depends on write permission for the directory that contains it. Execute permission applies to programs: a user who has it may load and run the program. For directories, execute permission has a special meaning, namely the right to change into the directory.

So let's assume that the file Testfile.txt has the following mode string, output by the ls -l command:

  -rw-r-----

We would have to split the string as follows:

  File type   User   Group   Rest
  -           rw-    r--     ---

This shows that it is a regular file (first hyphen). The owner of the file (hans) has rw- rights, so he can read and write. Members of the user group are only allowed to read (r--) and the rest of the world has no rights to this file (---).

In numerical terms, we could represent the rights as follows:

  User    rw-   4 + 2 + 0 = 6
  Group   r--   4 + 0 + 0 = 4
  Rest    ---   0 + 0 + 0 = 0

The numerical value of the entire access mode in octal digits is therefore 640.
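
The addition can also be left to a few lines of Python. The sketch below converts a mode string of the form shown by ls -l (one file type character followed by three groups of rwx) into its octal digits; the names VALUES, triplet_to_digit and mode_string_to_octal are made up for this illustration, while stat.filemode is the standard library function for the opposite direction.

```python
import stat

# Value of each letter as given in the table of rights; a hyphen counts as 0.
VALUES = {"r": 4, "w": 2, "x": 1, "-": 0}


def triplet_to_digit(triplet):
    """Add up the values of one group of three rights, e.g. 'rw-' gives 6."""
    return sum(VALUES[char] for char in triplet)


def mode_string_to_octal(mode_string):
    """Convert e.g. '-rw-r-----' into the three octal digits '640'."""
    rights = mode_string[1:]  # drop the leading file type character
    digits = [triplet_to_digit(rights[i:i + 3]) for i in (0, 3, 6)]
    return "".join(str(digit) for digit in digits)


if __name__ == "__main__":
    print(mode_string_to_octal("-rw-r-----"))
    # The opposite direction, using the standard library:
    print(stat.filemode(stat.S_IFREG | 0o640))
```
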

For programs, there is a fourth specification, which is likewise formed from the octal digits 1, 2 and 4 or a sum of these digits. This fourth digit is placed BEFORE the three usual digits, so strictly speaking the file from the last example would have the numerical representation 0640. This additional entry will be described in more detail later, in the part on the user system.

 



© 1999, 2000, 2001 by F. Kalhammer