Sunday, July 19, 2009

System Startup and the Kernel

System Startup

* The Linux System Startup handout covers the sequence of startup steps in detail. The important parts to keep in mind are: the bootloader (LILO or GRUB), kernel messages (visible using dmesg), the init process and the rc scripts, and the roles of /etc/profile, .bash_profile, and .bashrc.

Bootloaders

* LILO (LInux LOader) is the most common bootloader and the one installed in the lab.

* The configuration file for LILO is /etc/lilo.conf.

* The bootloader's job is to select an operating system/kernel version to load.

* Typically the OS will be Linux but it is possible to use LILO to boot MS-DOS or Windows 9x on dual-boot machines.

* LILO can also boot different versions of kernels that are installed on the machine.

* LILO is responsible for the Red Hat logo screen that appears while the machine is starting up. It can be configured to allow the user to choose an operating system and/or kernel version. There is usually a default option as well if the user doesn't make a selection in a given period of time.

* LILO cannot be used to dual-boot OS/2, Windows NT/2000/XP. Those operating systems have to be installed in the MBR (Master Boot Record) and so conflict with LILO if it too is installed in the MBR.

* It is possible to use WinNT/2K/XP as the "master" bootloader which in turn can call LILO on a Linux partition, but this can be tricky to configure. See the articles: Linux + Windows 95 mini-HOWTO, and NT OS Loader + Linux mini-HOWTO in the Resources section of this site for more information.

* GRUB (GRand Unified Bootloader) is another bootloader. It's part of the GNU Project.

* GRUB is much more flexible than LILO, but in most respects does exactly the same thing as LILO.

The init Process

* The init process is the master process in Linux. It has process ID (PID) 1. init is responsible for launching all other processes, either directly or through the processes it starts.

* init is configured through the /etc/inittab file.

* init's job during startup is the following: set the default runlevel, run the rc.sysinit script (usually in /etc/rc.d), run the rc script (usually in /etc/rc.d), install an interrupt handler to catch Ctrl+Alt+Del sequences, and launch the tty processes that handle each of the virtual terminals. Each of these tty processes in turn runs the login process, which accepts the user's login name and password and then launches the user's shell. The shell, when started as a login shell, runs the /etc/profile script for global settings.

Runlevels

* A runlevel is a "mode of operation that provides a particular set of services."

* Linux has 7 runlevels: 0 through 6.

* Runlevels 0, 1, and 6 are standardized across distributions, but the meanings of the other runlevels can vary considerably.

* Runlevel 0: halts the system

* Runlevel 1: single-user text mode. Usually used to diagnose and repair serious problems.

* Runlevel 2: multi-user text mode without networking (Red Hat). Rarely used.

* Runlevel 3: multi-user text mode with networking (Red Hat). The usual runlevel for servers that don't use the X Window System GUI.

* Runlevel 4: not used (Red Hat)

* Runlevel 5: multi-user X Window System mode with networking (Red Hat). The usual runlevel for graphical workstations (as in the lab).

* Runlevel 6: reboots the system

* The runlevel program (in /sbin) displays the current and previous runlevels.
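
As a rough illustration, the Red Hat meanings listed above can be captured in a small shell function. The name describe_runlevel is hypothetical and not part of any distribution. The last few lines sketch how the output of the real runlevel command might be fed into it, assuming /sbin/runlevel is installed on the machine.

```shell
# Hypothetical helper: map a runlevel number to the Red Hat meaning listed above.
describe_runlevel() {
    case "$1" in
        0) echo "halt" ;;
        1) echo "single-user text mode" ;;
        2) echo "multi-user text mode without networking" ;;
        3) echo "multi-user text mode with networking" ;;
        4) echo "not used" ;;
        5) echo "multi-user X mode with networking" ;;
        6) echo "reboot" ;;
        *) echo "unknown runlevel: $1" ;;
    esac
}

# /sbin/runlevel prints the previous and current runlevels, for example "N 5"
# ("N" means there was no previous runlevel). Only call it if it is installed.
if [ -x /sbin/runlevel ]; then
    current=$(/sbin/runlevel | awk '{ print $2 }')
    echo "Current runlevel $current: $(describe_runlevel "$current")"
fi
```

Other distributions may assign different meanings to runlevels 2 through 5, so the table inside the function would need adjusting there.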

The rc Script

* The rc script (usually in /etc/rc.d) is called by init to launch the services depending on the current runlevel (also set by init).

* Simply put, rc runs all of the scripts in one of the /etc/rc.d/rcN.d directories (N = runlevel) in alphabetical order.

* In each rcN.d directory, there will be a number of files beginning with "K" (Kill) and "S" (Start). Following the K or S, a two digit number is used to precisely order the files. These files are usually symbolic links to scripts in the /etc/rc.d/init.d directory. Links are used so that multiple copies of script files don't have to be copied amongst the various runlevel directories. Each of these scripts is responsible for starting or stopping (killing) a service: printer daemons, web servers, databases, email, firewalls, etc.

* rc first runs all of the Kill scripts in order. This can be thought of as a "clean-up" operation for runlevels other than 0 and 6. As expected, the scripts listed in the rc0.d and rc6.d directories are mostly Kill scripts since there is very little that needs to be started when the system is shutting down.

* rc then runs all of the Start scripts in order.

* For runlevels 2 through 5, the last Start script is usually /etc/rc.d/rc.local. The rc.local script is a handy place to put any of your own service initialization for which you don't have a full startup script like those in the /etc/rc.d/init.d directory.

* The green "OK" messages that appear during startup (or shutdown) are produced as the rc script iterates through the rcN.d directory.
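
The Kill-then-Start ordering described above can be sketched with a throwaway directory tree. This is only a simplified imitation of what rc does, not the real script: the run_rc function, the three service names, and the K/S numbers are all made up for the example, and nothing under /etc is touched.

```shell
# Build a scratch stand-in for /etc/rc.d so the real system is left alone.
demo=$(mktemp -d)
mkdir -p "$demo/init.d" "$demo/rc3.d"

# Three fake service scripts that just report the argument they were given.
for svc in nfs network httpd; do
    printf '#!/bin/sh\necho "%s $1"\n' "$svc" > "$demo/init.d/$svc"
done

# Symbolic links in the runlevel directory: K = kill, S = start,
# followed by a two-digit number that fixes the order.
ln -s ../init.d/nfs     "$demo/rc3.d/K20nfs"
ln -s ../init.d/network "$demo/rc3.d/S10network"
ln -s ../init.d/httpd   "$demo/rc3.d/S85httpd"

# Simplified imitation of rc: all K links first, then all S links.
run_rc() {
    dir=$1
    # The shell expands globs in sorted order, which yields the numeric ordering.
    for link in "$dir"/K*; do
        if [ -f "$link" ]; then sh "$link" stop; fi
    done
    for link in "$dir"/S*; do
        if [ -f "$link" ]; then sh "$link" start; fi
    done
}

rc_output=$(run_rc "$demo/rc3.d")
echo "$rc_output"
rm -rf "$demo"
```

The links are run with sh so that the sketch does not depend on execute permissions in the scratch directory.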

The Shell rc Scripts

* The /etc/profile script contains global settings for all users (including root). This is the place to put any special settings that apply to everyone (e.g. "safe" versions of rm, mv, and cp).

* The .bash_profile script (in the user's home directory) contains settings that apply only to that particular user. .bash_profile is run once per login, when the user's login shell starts. It usually calls the .bashrc script to do most of the work.

* The .bashrc script (in the user's home directory) also contains user-specific settings, but it is run for every new interactive shell the user opens. This is usually the best place to put personal settings.
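
The hand-off from .bash_profile to .bashrc usually looks something like the sketch below. It works in a scratch directory standing in for a home directory, so no real startup files are changed. The variable names FROM_BASHRC and FROM_PROFILE are invented for the example.

```shell
# Scratch directory standing in for a user's home directory.
fake_home=$(mktemp -d)

# Stand-in for ~/.bashrc: settings wanted in every interactive shell.
cat > "$fake_home/.bashrc" <<'EOF'
FROM_BASHRC="set by .bashrc"
EOF

# Stand-in for ~/.bash_profile: hand off to .bashrc, then do login-only work.
cat > "$fake_home/.bash_profile" <<'EOF'
if [ -f "$HOME/.bashrc" ]; then
    . "$HOME/.bashrc"
fi
FROM_PROFILE="set by .bash_profile"
EOF

# Imitate a login by sourcing the profile with HOME pointed at the scratch directory.
profile_result=$(HOME="$fake_home"; . "$HOME/.bash_profile"; echo "$FROM_BASHRC / $FROM_PROFILE")
echo "$profile_result"
rm -rf "$fake_home"
```

Because the profile sources .bashrc, settings placed in .bashrc are picked up both at login and in later shells.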

Kernel

/boot

* The kernel is typically located in the /boot directory.

* The compressed version of the kernel is called "vmlinuz" ("vm" = Virtual Memory, "z" = Compressed/Zipped). This is usually a symbolic link to the full name of the compressed kernel, which may include the kernel version number.

* Many different kernel versions may be in /boot. The vmlinuz link will usually point to the most recent one, but LILO can still be used to select any particular kernel during startup.
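
A quick way to compare the running kernel with the images in /boot might look like this. It is a sketch that assumes kernel images follow the vmlinuz naming described above; on some machines (a container, for example) /boot may hold no kernel images at all, which the loop allows for.

```shell
# Version of the kernel that is running right now.
kernel_release=$(uname -r)
echo "Running kernel: $kernel_release"

# Show each kernel image or link in /boot. If the glob matches nothing,
# the loop sees the unexpanded pattern, so check before listing.
for image in /boot/vmlinuz*; do
    if [ -e "$image" ] || [ -L "$image" ]; then
        ls -l "$image"    # a symbolic link shows up as "name -> target"
    fi
done
```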

Compiling the Kernel

* While not common, it is possible to compile the kernel itself from source.

* The main reason for doing so is to trim away any unnecessary code, thereby producing a lean 'n' mean kernel that consumes less memory. This may be necessary to run Linux on a machine with limited hardware (especially memory).

* The sorts of things that can be pruned out of the kernel include: device drivers, floating-point emulation, SMP (multi-CPU) support, loadable modules, networking, PCI support, parallel port support, Plug 'n' Play card support, ISDN, sound, SCSI, etc.

Kernel Modules

* While Linux is thought of as a "monolithic" kernel (meaning that it is one big program with all of the device drivers built-in), recent versions of the kernel support loadable modules.

* These modules, usually device drivers, can be manually or automatically loaded as needed, rather than having to re-compile the kernel each time a new piece of hardware is added to the machine.

* The modules themselves are stored in the /lib/modules/kernel_version directory.

* The /boot/module-info file contains short descriptions of the modules installed on the system.

* The kerneld (replaced by kmod in more recent distributions) daemon process can automatically load or remove modules as they are needed.

* The /etc/modules.conf file (this used to be called conf.modules) holds module configuration, such as aliases and options, that the system consults when loading modules, including during startup.

* The lsmod command lists modules currently loaded in the kernel. Non-root users can run this command, but may have to run it as /sbin/lsmod since it won't be in the user's path.
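
The points above about the module directory and lsmod can be tried with a few lines of shell. This is a sketch only: it falls back to /sbin/lsmod when lsmod is not in the path, and it prints a message instead of failing if the command is not installed at all.

```shell
# Modules for the running kernel live under /lib/modules/<kernel version>.
module_dir="/lib/modules/$(uname -r)"
echo "Module directory: $module_dir"

# Locate lsmod whether or not /sbin is in the user's PATH.
if command -v lsmod >/dev/null 2>&1; then
    lsmod_cmd=lsmod
elif [ -x /sbin/lsmod ]; then
    lsmod_cmd=/sbin/lsmod
else
    lsmod_cmd=""
fi

if [ -n "$lsmod_cmd" ]; then
    "$lsmod_cmd" | head -n 5    # header line plus the first few loaded modules
else
    echo "lsmod not found on this machine"
fi
```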

Shell

Environment Variables

* Linux makes extensive use of environment variables for configuration or personalization settings.

* By convention, environment variable names are written entirely in uppercase.

* When referring to an environment variable, the variable is prefixed with "$".
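
A minimal sketch of setting and reading an environment variable follows. COURSE_NAME is a made-up name used only for this example.

```shell
# Create a variable and export it so that child processes inherit it.
COURSE_NAME="linux-admin"
export COURSE_NAME

# Prefix the name with $ to read the value.
echo "Course: $COURSE_NAME"

# A child shell sees the exported variable too.
child_view=$(sh -c 'echo "$COURSE_NAME"')
echo "Child sees: $child_view"
```

Without the export line, the variable would exist only in the current shell and the child would print an empty value.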

The Path

* Linux has the concept of a "path" very much like in Windows/MS-DOS. The path is a set of directories that the shell searches through to locate a command you enter on the command line.

* The current path is stored in the $PATH environment variable. Directories are separated by colons (:).

* Unlike MS-DOS, Linux does not consider the current working directory part of the path unless the "." directory is explicitly included in $PATH.

* To run a program or script in the current directory, the program name must be prefixed with "./" to indicate to the shell that you mean to execute the program in this directory.

* The current directory (.) is not usually included in $PATH for security reasons.
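
The following sketch lists the directories in $PATH, checks whether the current directory is among them, and runs a small script with the "./" prefix. The path_has_dot function and the hello script are hypothetical names created for the example.

```shell
# Show the search path one directory per line.
echo "$PATH" | tr ':' '\n'

# Hypothetical helper: succeed if the given path searches the current directory.
# An empty entry (leading, trailing, or doubled colon) also means ".".
path_has_dot() {
    case ":$1:" in
        *:.:*|*::*) return 0 ;;
        *) return 1 ;;
    esac
}

if path_has_dot "$PATH"; then
    echo "The current directory is searched"
else
    echo "The current directory is not searched; use ./name for local programs"
fi

# Run a program from the current directory by prefixing it with ./
scratch=$(mktemp -d)
printf '#!/bin/sh\necho "hello from the current directory"\n' > "$scratch/hello"
chmod +x "$scratch/hello"
hello_out=$(cd "$scratch" && ./hello)
echo "$hello_out"
rm -rf "$scratch"
```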

Aliases

* Aliases are handy synonyms for commands, or specific versions of commands.

* The current aliases can be listed by running the alias command without any parameters.

* Aliases can be deleted using the unalias command.
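
Defining, inspecting, and deleting an alias might look like the sketch below. The alias name ll is just a common choice for the example. The two status variables record whether the shell knows the alias before and after unalias.

```shell
# Define an alias: "ll" becomes shorthand for a long directory listing.
alias ll='ls -l'

# Asking for one alias by name succeeds while it exists.
if alias ll >/dev/null 2>&1; then defined_status=0; else defined_status=1; fi

# Delete it; asking for it afterwards fails.
unalias ll
if alias ll >/dev/null 2>&1; then removed_status=0; else removed_status=1; fi

echo "while defined: $defined_status, after unalias: $removed_status"
```

Note that bash expands aliases only in interactive shells by default, so an alias defined inside a script is normally not usable as a command there.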

Quotes

* Single quotes (') and double quotes (") have slightly different meanings to the shell.

* Use single quotes to specify literal strings (i.e. the shell won't evaluate anything within the string).

* Use double quotes if you want environment variables within the string to be evaluated.
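
The difference is easiest to see side by side. In this small sketch GREETING is a made-up variable name.

```shell
GREETING="hello"

single='Value: $GREETING'    # single quotes: kept exactly as typed
double="Value: $GREETING"    # double quotes: the variable is replaced by its value

echo "$single"
echo "$double"
```

The first echo shows the dollar sign and the variable name unchanged, while the second shows the text with hello substituted.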

Redirection

* Most Linux programs can be thought of as possessing one input and two outputs. The input is called Standard Input. The outputs are called Standard Output and Standard Error.

* By default, both Standard Output and Standard Error are sent to the terminal screen.

* By using the output redirection operators, the output can be diverted to files, or elsewhere.

* > redirects Standard Output

* 1> also redirects Standard Output

* 2> redirects Standard Error

* &> redirects both Standard Output and Standard Error (supported by bash)

* The output of programs can be redirected to a file, or perhaps /dev/null which just throws out anything it's given.

* If redirecting to a file, the file will be overwritten using the above operators. Use >> to append to the file.
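
The operators above can be tried out with a command that writes to both outputs. In this sketch the noisy function is a hypothetical stand-in for such a command, and all files live in a scratch directory. The combined case uses the portable form "> file 2>&1", which has the same effect as &> in bash.

```shell
workdir=$(mktemp -d)

# Hypothetical command that writes one line to each output stream.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

noisy >  "$workdir/out.txt"  2> "$workdir/err.txt"   # send each stream to its own file
noisy >  "$workdir/both.txt" 2>&1                    # both streams into one file
noisy >> "$workdir/out.txt"  2> /dev/null            # append stdout, throw stderr away

out_lines=$(grep -c 'to stdout' "$workdir/out.txt")  # counts the original line plus the appended one
err_text=$(cat "$workdir/err.txt")
both_lines=$(grep -c 'to' "$workdir/both.txt")

echo "out.txt lines: $out_lines, err.txt: $err_text, both.txt lines: $both_lines"
rm -rf "$workdir"
```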

Tarballs

* Large groups of files are usually distributed in what is called a compressed tarball format.

* A "tarball" is a file created using the tar command (Tape ARchiver) which can group together many files into a single archive file.

* The tar command does not compress its contents.

* Tarballs are usually identified by the filename suffix ".tar".

* The gzip (and gunzip) programs are used to compress (and uncompress) individual files or tarballs.

* A gzip-ed tarball usually has the suffix ".tar.gz", or ".tgz" for short.

* Examine the commands in the handout, or the man pages, for more details.
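
A typical round trip might look like the following sketch, which is not taken from the handout. It assumes a tar that understands the z option (GNU tar and most other current versions do), and the project directory and file names are invented for the example.

```shell
work=$(mktemp -d)
mkdir "$work/project"
echo "first file"  > "$work/project/a.txt"
echo "second file" > "$work/project/b.txt"

# Step 1: group the directory into a single uncompressed archive.
tar -cf "$work/project.tar" -C "$work" project

# Step 2: compress the archive; gzip replaces project.tar with project.tar.gz.
gzip "$work/project.tar"

# List the contents without extracting (z = pass the archive through gzip).
listing=$(tar -tzf "$work/project.tar.gz")
echo "$listing"

# Extract into a separate directory and read one file back.
mkdir "$work/restore"
tar -xzf "$work/project.tar.gz" -C "$work/restore"
restored=$(cat "$work/restore/project/a.txt")
echo "$restored"

rm -rf "$work"
```

The two steps can also be combined by adding z when creating the archive, as in tar -czf, which writes the compressed tarball directly.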

 
Things You Should Know About Linux !!!