Process Scheduler in Linux Operating System

1. Goal:

Process scheduling is the heart of the Linux operating system. The process scheduler has the following responsibilities:

Allow processes to create new copies of themselves
Determine which process will have access to the CPU and effect the transfer between running processes
Receive interrupts and route them to the appropriate kernel subsystem
Send signals to user processes
Manage the timer hardware
Clean up process resources when a process finishes executing

The process scheduler also provides support for dynamically loaded modules. These modules represent kernel functionality that can be loaded after the kernel has started executing. The loadable module functionality is used by the virtual file system and the network interface.

2. External Interface:

The process scheduler provides two interfaces. First, it provides a limited system call interface that user processes may call. Second, it provides a rich interface to the rest of the kernel.

A process can create another process only by copying itself. At boot time, a Linux system has only one running process: init. This process then spawns others, which can in turn spawn copies of themselves through the fork() system call. The fork() call generates a new child process that is a copy of its parent. Upon termination, a user process (implicitly or explicitly) calls the _exit() system call.
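The fork-then-exit life cycle described above can be sketched from user space. The example below is a minimal illustration in Python, whose os module wraps the same system calls; the helper name spawn_child is hypothetical, and the use of waitpid() so the parent can collect the child's exit status is an addition not mentioned in the text.

```python
import os


def spawn_child(exit_code: int) -> int:
    """Fork a child that exits at once; return the exit code the parent observes."""
    pid = os.fork()  # returns 0 in the child, the child's PID in the parent
    if pid == 0:
        # Child: a copy of the parent. Terminate explicitly through _exit().
        os._exit(exit_code)
    # Parent: block until the child terminates, then decode its status word.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

Calling spawn_child(7) in the parent should return 7, since the child passes that value to _exit().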

Several routines are provided to handle loadable modules. A create_module() system call will allocate enough memory to load a module. The call will initialize the module structure, described below, with the name, size, starting address, and initial status for the allocated module. The init_module() system call loads the module from disk and activates it. Finally, delete_module() unloads a running module.

Timer management can be done through the setitimer() and getitimer() routines. The former sets a timer, while the latter reads a timer's current value.

Among the most important signal functions is signal(). This routine allows a user process to associate a handler function with a particular signal.
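As a rough illustration of how these three routines fit together, the sketch below arms a one-shot timer, reads it back, and catches the resulting signal. It uses Python's signal module, which wraps the same calls. The choice of the real-time timer (ITIMER_REAL) and its signal (SIGALRM), the 50ms delay, and the function name wait_for_alarm are assumptions made for the example, not details from the text. Signal handlers can only be installed from the main thread.

```python
import signal
import time


def wait_for_alarm(delay: float = 0.05) -> dict:
    """Arm a one-shot real-time timer and wait for its signal to be delivered."""
    fired = []

    def handler(signum, frame):
        # Runs when the timer expires and SIGALRM is delivered.
        fired.append(signum)

    # signal(): associate the handler with SIGALRM, remembering the old one.
    old = signal.signal(signal.SIGALRM, handler)
    try:
        # setitimer(): arm the timer to expire once, `delay` seconds from now.
        signal.setitimer(signal.ITIMER_REAL, delay)
        # getitimer(): read back (time remaining, repeat interval).
        remaining, interval = signal.getitimer(signal.ITIMER_REAL)
        deadline = time.monotonic() + 2.0  # safety limit for the wait loop
        while not fired and time.monotonic() < deadline:
            time.sleep(0.01)
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # disarm the timer
        signal.signal(signal.SIGALRM, old if old is not None else signal.SIG_DFL)
    return {"remaining": remaining, "interval": interval, "fired": fired}
```

The returned dictionary records what getitimer() reported just after arming and which signal numbers the handler saw.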

3. Subsystem Description:

The process scheduler subsystem is primarily responsible for the loading, execution, and proper termination of user processes. The scheduling algorithm is called at two different points during the execution of a user process. First, there are system calls that call the scheduler directly, such as sleep(). Second, the scheduling algorithm is called after every system call and after every slow system interrupt (described in a moment).

Signals can be considered an IPC mechanism and are therefore discussed in the inter-process communication section.

Interrupts allow hardware to communicate with the operating system. Linux distinguishes between slow and fast interrupts. A slow interrupt is a typical interrupt: other interrupts may be serviced while it is being processed, and once processing of a slow interrupt has completed, Linux conducts business as usual, such as calling the scheduling algorithm. A timer interrupt is an example of a slow interrupt. A fast interrupt is one that is used for much less complex tasks, such as processing keyboard input. Other interrupts are disabled while a fast interrupt is being processed, unless explicitly enabled by the fast interrupt handler.

Linux uses a timer interrupt that fires once every 10ms. Thus, according to the scheduler description given above, task rescheduling should occur at least once every 10ms.
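The arithmetic behind the 10ms figure can be made explicit. The snippet below is only a sketch: the constant and function names are illustrative and are not kernel identifiers.

```python
TICK_MS = 10  # timer interrupt period stated in the text, in milliseconds


def ticks_per_second(tick_ms: int = TICK_MS) -> int:
    """Number of timer interrupts, and so rescheduling opportunities, per second."""
    return 1000 // tick_ms


def ticks_to_ms(ticks: int, tick_ms: int = TICK_MS) -> int:
    """Convert a count of clock ticks into milliseconds."""
    return ticks * tick_ms
```

With a 10ms period the timer fires 100 times per second, so a process allowed to run for 20 ticks could hold the CPU for up to 200ms before a forced reschedule.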

4. Data Structures:

The structure task_struct represents a Linux task. There is a field that represents the process state; this may have the following values:

Running
Returning from system call
Processing an interrupt routine
Processing a system call
Ready
Waiting

In addition, there is a field that indicates the process's priority, and a field that holds the number of clock ticks (10ms intervals) for which the process can continue executing without forced rescheduling. There is also a field that holds the error number of the last faulting system call.
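The state, priority, tick-count, and error-number fields can be pictured with a small model. This is a simplified sketch and not the kernel's task_struct: the field names (counter, last_errno) and the rule that each timer tick decrements the count until it reaches zero are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    """The process states listed above."""
    RUNNING = auto()
    RETURNING_FROM_SYSCALL = auto()
    IN_INTERRUPT = auto()
    IN_SYSCALL = auto()
    READY = auto()
    WAITING = auto()


@dataclass
class Task:
    pid: int
    priority: int         # scheduling priority
    counter: int          # clock ticks left before a forced reschedule
    state: State = State.READY
    last_errno: int = 0   # error number of the last faulting system call


def timer_tick(task: Task) -> bool:
    """Account for one clock tick; return True once rescheduling is forced."""
    if task.counter > 0:
        task.counter -= 1
    return task.counter == 0
```

Under this model, a running task created with counter=3 survives two ticks, and the third tick reports that rescheduling is due.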

In order to keep track of all executing processes, a doubly linked list is maintained (through two fields that point to task_struct). Since every process is related to some other process, there are fields that describe a process's original parent, parent, youngest child, younger sibling, and older sibling.

There is a nested structure, mm_struct, which contains a process's memory management information (such as the start and end addresses of the code segment).

Process ID information is also kept within the task_struct. The process ID and group ID are stored. An array of group IDs is provided so that a process can be associated with more than one group. File-specific process data is located in an fs_struct substructure. This holds pointers to the inodes corresponding to the process's root directory and its current working directory.

All files opened by a process are tracked through a files_struct substructure of the task_struct. Finally, there are fields that hold timing information; for example, the amount of time the process has spent in user mode.

All executing processes have an entry in the process table. The process table is implemented as an array of pointers to task structures. The first entry in the process table is the special init process, which is the first process executed by the Linux system.
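The two bookkeeping structures described in this section, the doubly linked task list and the array-based process table, can be modelled together. The sketch below is a toy under stated assumptions: the class and attribute names are hypothetical, the list is made circular with init as its anchor, and init is given PID 1; none of these details come from the text.

```python
from typing import List, Optional


class TaskNode:
    """A stripped-down task entry: list links plus a parent pointer."""

    def __init__(self, pid: int, parent: Optional["TaskNode"] = None):
        self.pid = pid
        self.parent = parent
        self.next_task: Optional["TaskNode"] = None
        self.prev_task: Optional["TaskNode"] = None


class ProcessTable:
    """Fixed-size array of task references; slot 0 always holds init."""

    def __init__(self, size: int):
        self.slots: List[Optional[TaskNode]] = [None] * size
        init = TaskNode(pid=1)
        init.next_task = init.prev_task = init  # circular list of one node
        self.slots[0] = init

    def add(self, pid: int, parent: TaskNode) -> TaskNode:
        index = self.slots.index(None)  # raises ValueError if the table is full
        task = TaskNode(pid, parent)
        init = self.slots[0]
        tail = init.prev_task
        # Splice the new task in between the current tail and init.
        tail.next_task = task
        task.prev_task = tail
        task.next_task = init
        init.prev_task = task
        self.slots[index] = task
        return task

    def walk(self) -> List[int]:
        """Follow next_task links once around the list, starting at init."""
        init = self.slots[0]
        pids = [init.pid]
        node = init.next_task
        while node is not init:
            pids.append(node.pid)
            node = node.next_task
        return pids
```

The array gives direct access by slot, while the linked list lets code visit every task without scanning empty slots.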

Finally, a module structure is implemented to represent the loaded modules. This structure contains fields that are used to implement a list of module structures, a field that points to the module's symbol table, and another field that holds the name of the module. The module size (in pages) and a pointer to the starting memory for the module are also fields within the module structure.

5. Subsystem Structure:

The figure below shows the Process Scheduler subsystem. It is used to represent, collectively, process scheduling and management (i.e., loading and unloading), as well as timer management and module management functionality.

6. Subsystem Dependencies:

The process scheduler requires the memory manager to set up the memory mapping when a process is scheduled. Further, the process scheduler depends on the IPC subsystem for the semaphore queues that are used in bottom-half handling. Finally, the process scheduler depends on the file system to load loadable modules from the persistent device. All subsystems depend on the process scheduler, since they need to suspend user processes while hardware operations complete.

Taqi Shah Blogspot
