By DAN CALLOWAY
Published 28 January 2010 @ 02:41 UTC

WEAVERVILLE, NC – An Operating System (OS) is designed to run on either desktop or network platforms. For the sake of brevity in this article, I will limit my discussion, for the most part, to user desktop platforms.

A desktop OS is essentially the interface between the hardware (including the CPU) and the user. It is primarily responsible for managing the hardware and the activities that run on the computer, as well as any applications running within the OS. Where one exists, the OS also provides the graphical user interface (GUI) that makes the computer easier to use. As the host for applications running on the computer, the OS is also responsible for scheduling system resources to support those applications and for protecting access to the hardware. When services are requested on the desktop, the kernel of the OS creates a process by assigning memory and other resources, establishing a priority for the process (in multi-tasking systems), loading program code into memory, and executing the program. The program then interacts with the user and/or other devices and performs its intended function.
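To make the process-creation steps above concrete, here is a minimal sketch in Python. It asks the OS to create a child process, which the kernel sets up with memory and resources, loads with program code, and executes. The helper name `run_child` is my own and purely illustrative; it assumes a Python interpreter is available at `sys.executable`.

```python
import subprocess
import sys


def run_child(code: str) -> tuple:
    """Ask the OS to create a process that runs a Python one-liner.

    Returns (exit_code, captured_stdout). run_child is a hypothetical
    helper written for this illustration only.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],  # program the kernel loads and executes
        capture_output=True,           # collect what the child writes to stdout
        text=True,
        check=False,
    )
    return result.returncode, result.stdout.strip()


exit_code, output = run_child("print(6 * 7)")
print(exit_code, output)  # prints "0 42"
```

The parent never touches memory allocation or program loading itself; it only makes the request and receives the result, which is the division of labor described above.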

Regardless of which OS is installed on the desktop, OSes provide application services both to programs running on the computer and to the user through Application Program Interfaces (APIs) or, in some instances, system calls. When invoked by the user or by another program running on the computer, system calls or APIs request services from the OS, pass parameters, and receive the results of the operation. As mentioned, users can interact with the OS through either the GUI or a Command-line Interface (CLI) to request services. On desktop computers, these interfaces are usually considered part of the OS. However, on larger multi-user systems running UNIX, UNIX-like, or VMS OSes on mainframes or mini-mainframes, the user interface is typically a program that runs outside of the OS itself.
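The request, pass-parameters, receive-results pattern can be seen in a small sketch that uses Python's `os` module, whose functions are thin wrappers over OS services. The function name `echo_through_kernel` is an illustrative assumption of mine, and the sketch is meant only for short messages that fit in the pipe's buffer.

```python
import os


def echo_through_kernel(message: bytes) -> bytes:
    """Send bytes through an OS-managed pipe and read them back.

    Hypothetical helper for illustration; intended for short messages.
    """
    read_fd, write_fd = os.pipe()              # request a service: create a pipe
    try:
        written = os.write(write_fd, message)  # pass parameters: descriptor and data
        return os.read(read_fd, written)       # receive the result of the operation
    finally:
        os.close(read_fd)                      # hand the descriptors back to the OS
        os.close(write_fd)


print(echo_through_kernel(b"hello"))  # prints b'hello'
```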

As parallelism increases on the desktop platform, that is, as more processors are added and processing moves to multi-core and multi-threaded environments, the main impact on the OS falls on the management of application workload, or process scheduling, which grows more complex as parallelism grows. Increasing parallelism would therefore have a detrimental impact on OS functionality unless the OS is redesigned to accommodate it. Frachtenburg and Etsion (n.d.) suggest that as the average desktop workload grows more parallel and more complex, current OSes are not adequate to support it. They contend that the parallel process scheduling required to run desktop platforms and their applications efficiently in a supercomputing environment cannot be achieved unless the OS is redesigned to handle the increased workload. Through case studies in their paper, Frachtenburg and Etsion demonstrate that one possible solution to this inadequacy might be to redesign OS process schedulers with an understanding of the requirements of all process classes and their mixes, as well as the abilities of the underlying architecture.

Frachtenburg and Etsion (n.d.) state: “The predominant approach to multiprocessing in general purpose [OSes] is to treat each processing element as an independent entity—processes/threads are migrated between processing elements in an attempt to balance cache affinity needs with CPU load imbalance” (p. 2). As a result, the general-purpose scheduler within the OS is too focused on handling a small set of requirements and misses the big picture, and overlooks two requirements that are critical in maintaining performance and efficiency for parallel desktop workloads: separation of co-interfering processes and co-scheduling of collaborating processes. Thus, these are two specific redesign considerations within the OS that Frachtenburg and Etsion suggest are necessary as parallelism is increased on the workstation.
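The two requirements named above, co-scheduling of collaborating processes and separation of co-interfering ones, can be illustrated with a toy placement routine. This is my own greedy sketch, not the scheduler design from Frachtenburg and Etsion's paper; the group names, the `interfering` set, and the function `gang_schedule` are all hypothetical.

```python
def gang_schedule(groups, interfering, cores):
    """Place groups of collaborating processes into time slots.

    groups: dict mapping a group name to its list of process names.
    interfering: set of frozenset pairs of group names that must not
        share a time slot.
    cores: number of cores available in each time slot.
    Returns one list of process names per time slot.
    """
    slots = []  # each slot records the groups and processes placed in it
    for name, procs in groups.items():
        if len(procs) > cores:
            raise ValueError(f"group {name!r} needs more than {cores} cores")
        for slot in slots:
            fits = len(slot["procs"]) + len(procs) <= cores
            clashes = any(
                frozenset((name, other)) in interfering
                for other in slot["groups"]
            )
            if fits and not clashes:
                break  # co-schedule the whole group in this slot
        else:
            slot = {"groups": set(), "procs": []}  # no suitable slot: open a new one
            slots.append(slot)
        slot["groups"].add(name)
        slot["procs"].extend(procs)
    return [slot["procs"] for slot in slots]


groups = {"video": ["decode", "render"], "backup": ["compress"], "editor": ["ui"]}
interfering = {frozenset(("video", "backup"))}
print(gang_schedule(groups, interfering, cores=4))
# prints [['decode', 'render', 'ui'], ['compress']]
```

Members of a group always land in the same slot, so they run at the same time, while the two groups marked as interfering never share one.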

Giacomoni and Vachharajani (n.d.) concur with Frachtenburg and Etsion (n.d.): realizing the potential of pipeline-parallel software as parallelism increases on the desktop requires a reexamination of some basic historical assumptions in OS design, including the purpose of time-sharing and the nature of applications. Multicore architectures make it possible to fully dedicate resources as needed without compromising existing OS services. Giacomoni and Vachharajani describe the minimal OS extensions necessary to support efficient pipeline-parallel applications on multicore systems and support their claims with evidence from the domain of network frame processing.

Giacomoni and Vachharajani (n.d.) contend that “maintaining a smoothly flowing pipeline, that is a pipeline where a datum is never waiting for processor time, requires the system to provide a zero-stall guarantee” (p. 4). Furthermore, “Pipelines implemented in hardware are based on this guarantee and ensure it by having every stage operate in lockstep with a uniform stage length of 1 cycle” (p. 4). Operating systems that run on single-processor desktops, in general, do not make this guarantee because they have been built on the principle of timesharing resources. Multicore systems are different, and OSes that support them “must be able to provide abundant processing resources permitting a system to use selective timesharing and fully dedicate resources to an application for an extended period of time. With dedicated resources it is possible to achieve the zero-stall guarantee” (Giacomoni & Vachharajani, n.d., p. 4). Giacomoni and Vachharajani argue that realizing these improvements requires the operating system to be redesigned to provide a zero-stall guarantee. Meeting this guarantee for any pipeline requires that the system (1) fully dedicate sufficient computational resources to the application and (2) provide a set of pipeline-able services. Finally, supporting a pipeline that spans multiple execution contexts requires a new abstraction that labels the pipeline as a single entity for resource allocation and security.
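The shape of such a software pipeline can be sketched with one worker thread per stage, connected by queues. This is only a structural illustration under my own assumptions: the function `run_pipeline` is hypothetical, and ordinary Python threads are time-shared by the interpreter and the OS, so the sketch shows how data flows between stages but does not provide the zero-stall guarantee the authors describe.

```python
import queue
import threading


def run_pipeline(stages, items):
    """Push items through a chain of stage functions, one thread per stage.

    Hypothetical helper for illustration; results keep their input order
    because each stage handles items one at a time, first in, first out.
    """
    done = object()  # sentinel marking the end of the stream
    queues = [queue.Queue() for _ in range(len(stages) + 1)]

    def worker(stage, inbox, outbox):
        while True:
            item = inbox.get()       # blocks (stalls) when no datum is ready
            if item is done:
                outbox.put(done)     # pass the end marker downstream
                return
            outbox.put(stage(item))

    threads = [
        threading.Thread(target=worker, args=(stage, queues[i], queues[i + 1]))
        for i, stage in enumerate(stages)
    ]
    for thread in threads:
        thread.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(done)

    results = []
    while True:
        item = queues[-1].get()
        if item is done:
            break
        results.append(item)
    for thread in threads:
        thread.join()
    return results


print(run_pipeline([lambda x: x + 1, lambda x: x * 2], [1, 2, 3]))
# prints [4, 6, 8]
```

On a system of the kind the authors propose, each stage would instead be given a fully dedicated core, and the whole chain would be labeled as one entity for resource allocation.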

References:

Frachtenburg, E., & Etsion, Y. (n.d.). Hardware Parallelism: Are Operating Systems Ready? (Case Studies in Mis Scheduling). Los Alamos National Laboratory, Modeling, Algorithms, and Informatics Group School of Computer Science and Engineering. Los Alamos, NM: Defense Advanced Research Projects Agency (DARPA).

Giacomoni, J., & Vachharajani, M. (n.d.). Operating System Support for Pipeline Parallelism on Multicore Architectures. University of Colorado at Boulder. Boulder: University of Colorado at Boulder.

Dan Calloway

by DAN CALLOWAY
Published 19 January 2010 @ 24:19 UTC

WEAVERVILLE, NC – Before discussing the effects that increasing chip densities might have on the traditional supercomputer and parallel computer architectures, it is important that we correctly define both concepts within the context of computing architecture.

The original or traditional computing concept was developed by the Hungarian mathematician John von Neumann, who first set out the concept of the electronic computer in papers written in 1945 (Barney, 2010). His electronic computer consisted of four main components: memory, control unit, arithmetic/logic unit, and input/output. This has been the design of computers for many years. The traditional supercomputer is based on a serial computing concept built upon the von Neumann design: a single CPU (central processing unit), a chip containing millions of silicon-based transistors, holds the control unit and the arithmetic/logic unit and is separate from the memory and input/output. Unlike the traditional supercomputer architecture, the parallel computing architecture is designed upon a multi-processing concept using multiple CPUs. These architectures are distinguished from one another according to Flynn’s Classical Taxonomy, which classifies multi-processor architectures along two independent dimensions, Instruction and Data, each of which can exist in one of two states: Single or Multiple. The four possible classifications of computing architectures, according to Flynn, are thus: (1) SISD: Single Instruction, Single Data (serial computing); (2) SIMD: Single Instruction, Multiple Data (parallel computing); (3) MISD: Multiple Instruction, Single Data (parallel computing); and (4) MIMD: Multiple Instruction, Multiple Data (parallel computing) (Barney, 2010).
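Because Flynn's taxonomy is just two independent two-state dimensions, the four classes can be derived mechanically. The short sketch below does so; the function name `flynn_class` and the idea of counting streams as integers are my own illustrative choices.

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Return the Flynn class for the given numbers of streams.

    One stream maps to "Single"; two or more map to "Multiple".
    flynn_class is a hypothetical helper for illustration.
    """
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("each dimension needs at least one stream")
    instruction = "S" if instruction_streams == 1 else "M"
    data = "S" if data_streams == 1 else "M"
    return f"{instruction}I{data}D"


print(flynn_class(1, 1))  # prints SISD (serial computing)
print(flynn_class(1, 8))  # prints SIMD
print(flynn_class(4, 1))  # prints MISD
print(flynn_class(4, 4))  # prints MIMD
```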

Nair (2002) discusses the evolution from the early serial-computing uni-processor computers of the 1940s, which grew into the traditional supercomputer architecture as transistor chip density was increased to deliver greater computing power, to the more advanced parallel-computing, multiple-processor superscalar concept of the 1980s found in modern-day parallel computing architectures. Increasing chip density in the uni-processors found in the traditional supercomputer architecture is constrained in several ways, among them: (1) transmission speed: the speed of a serial computer depends directly on how fast data can be transmitted through the hardware, and is limited by both the speed of light (30 cm/nanosecond) and the transmission limit of copper wire (9 cm/nanosecond); (2) limits to miniaturization: packing more transistors onto a single chip to bring processing elements closer together is bounded physically by the molecular or atomic size of the silicon-based transistors being used; and (3) economic limitations: adding expensive silicon-based transistors to a single chip to make the processor faster reaches a point of diminishing returns (Barney, 2010). Current parallel computer architectures, which employ multiple execution units, pipelined or multi-threaded instructions, and multi-core construction on a single chip, have similar limitations because their CPUs are constructed using identical silicon-based technology.

However, according to Nair (2002), even though increasing the number of transistors per chip (increasing chip density) has led to increased computing power, the performance benefit gained from each added transistor in parallel computing architectures has shrunk: the proportion of transistors essential to the processing of each instruction has become a smaller fraction of the total number of transistors deployed on each processor. Thus, according to Nair, the efficiency of transistors within each processor has been decreasing steadily as well.
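The transmission-speed limits cited from Barney (2010) above become more tangible as a distance per clock cycle. The sketch below works that out from the two figures in the text; the 3 GHz clock rate is my own assumption, chosen only as a round example, and `reach_per_cycle_cm` is a hypothetical helper.

```python
SPEED_OF_LIGHT_CM_PER_NS = 30.0  # figure cited in the text
COPPER_CM_PER_NS = 9.0           # figure cited in the text


def reach_per_cycle_cm(clock_ghz: float, speed_cm_per_ns: float) -> float:
    """Distance a signal can cover during one clock cycle.

    A clock of f GHz has a cycle time of 1/f nanoseconds, so the
    reach is speed * (1/f) = speed / f.
    """
    if clock_ghz <= 0:
        raise ValueError("clock rate must be positive")
    return speed_cm_per_ns / clock_ghz


# Assumed example clock rate of 3 GHz (not a figure from the sources).
print(reach_per_cycle_cm(3.0, SPEED_OF_LIGHT_CM_PER_NS))  # prints 10.0
print(reach_per_cycle_cm(3.0, COPPER_CM_PER_NS))          # prints 3.0
```

At that assumed clock rate, a signal in copper covers about 3 cm per cycle, which is why shortening the distance between processing elements matters so much for serial speed.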

Warren (2004) posits that sub-atomic alternatives to the silicon-based transistor technology used in both serial-computing supercomputer and parallel-computing architectures may soon be available to overcome the physical and cost limitations imposed by the atomic nature of the silicon substrate. Some of these alternatives include: (1) molecular computing, which uses molecule-sized switches between two electrodes or a protein called bacteriorhodopsin, which changes its structure when exposed to light and is also promising for the development of three-dimensional memory usable in both serial- and parallel-computing architectures; (2) biological or DNA computing, a promising alternative or complement to silicon-based transistor technology that can be used in parallel-computing architectures; (3) inorganic computing devices, which offer holographic memory storage using crystals of lithium niobate; and (4) quantum computing, which uses the electron spin states described by quantum physics, together with probability theory, to effect computing capability in parallel computing architectures.

This author concurs with Warren (2004) that alternatives to silicon-based transistor technology, such as those discussed above, are essential, since the miniaturization and speeding-up of conventional silicon technology are likely to come to an end within the decade. And, as predicted by Warren, silicon technology is still with us today due to advances in parallel computing, refinements in silicon technology, and a slowdown in the rate of progress that has extended its life expectancy perhaps until 2020.

In any event, the alternatives suggested by Warren (2004) will most definitely favor the parallel computing architecture over the serial-computing supercomputer architecture since the former has the greatest potential to eliminate the current silicon transistor technology that is quickly reaching its physical limitations in the quest to attain faster and faster computing power needed to solve increasingly complex problems that cannot be solved using traditional supercomputer architectures.

References:

Barney, B. (2010). Introduction to Parallel Computing. Retrieved January 19, 2010, from Lawrence Livermore National Laboratory: https://computing.llnl.gov/tutorials/parallel_comp/

Nair, R. (2002). Effect of increasing chip density on the evolution of computer architectures. IBM Journal of Research and Development, 46 (2/3), 223-234.

Warren, P. (2004). The future of computing – New architectures and new technologies. IEEE Proceedings – Nanobiotechnology, 151 (1), 1-9.
