Parallel computing is a computational technique in which the computation is carried out simultaneously by several independent computing resources. It is usually required when processing large amounts of data (in the financial industry, bioinformatics, etc.) or when carrying out enormous computing workloads. Parallel computing is also found in numerical calculation, for example when solving mathematical equations in physics (computational physics), chemistry (computational chemistry), and similar fields. To solve a problem, parallel computing requires a parallel machine infrastructure: many computers connected by a network and able to work in parallel.
This infrastructure needs supporting software, commonly referred to as middleware, whose role is to distribute work between the nodes of a parallel machine. The user must also write a parallel program: having a parallel machine does not mean that every program running on it is automatically processed in parallel. Parallel programming is a programming technique that allows commands or operations to be executed simultaneously (parallel computing), whether on a computer with a single processor or on one with many CPUs. When the work is carried out by separate computers connected in a network, the term more often used is distributed computing. The main purpose of parallel programming is to improve computing performance: the more things that can be done simultaneously (at the same time), the more work that can be completed.
The easiest analogy is cooking: if you can boil water while chopping onions, you need less time than if you do the two tasks sequentially. Likewise, chopping the onions takes less time if you do it together with someone else. Performance in parallel programming is measured by how much speedup is obtained from using parallel techniques. Informally, if chopping the onions alone takes 1 hour and, with the help of a friend, the two of you can do it in half an hour, you obtain a speedup of 2.
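Speedup can be expressed directly as the serial time divided by the parallel time; a minimal sketch of the onion example above (the timings are the illustrative figures from the text, not measurements):

```python
def speedup(serial_time, parallel_time):
    # Speedup = time taken alone / time taken with parallel workers.
    return serial_time / parallel_time

# Chopping onions alone: 1 hour; with a friend's help: 1/2 hour.
print(speedup(1.0, 0.5))  # 2.0
```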
Distributed processing is the ability to carry out data processing jointly between a central computer and several smaller computers that are interconnected through communication channels. Each computer has its own processor, so it can process part of the data separately; the partial results are then combined into a total solution. If one processor fails or has a problem, another processor takes over its job.
Parallel Computer Architecture
Embarrassingly parallel programming is used for problems that can be parallelized without any communication between the parts. This can be considered the ideal form of parallel programming, because with no communication cost, greater speedups can be achieved. Michael J. Flynn created one of the classification systems for computers and parallel programs, known as Flynn’s Taxonomy. Flynn groups computers and programs by the number of instruction streams executed and the number of data streams those instructions operate on. The taxonomy of the parallel processing models is based on the instruction stream and the data stream used:
1. SISD (Single Instruction stream, Single Data stream)
A single computer that has one control unit, one processor unit, and one memory unit. Instructions are carried out in sequence, although their execution stages may overlap. A single instruction stream is decoded for a single data stream.
2. SIMD (Single Instruction stream, Multiple Data streams)
A computer that has several processor units under the supervision of a common control unit. Each processor receives the same instruction from the control unit but operates on different data.
3. MISD (Multiple Instruction stream, Single Data stream)
To date this structure remains theoretical, and no computer uses this model.
4. MIMD (Multiple Instruction streams, Multiple Data streams)
A computer organization that has the ability to process several programs at the same time. In general, multiprocessors and multicomputers fall into this category.
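The embarrassingly parallel case mentioned above can be sketched with Python's standard multiprocessing module: each input is handled independently and the workers never communicate with each other (the squaring function here is just a hypothetical stand-in for any independent task):

```python
from multiprocessing import Pool

def work(x):
    # An independent task: no communication with any other worker.
    return x * x

if __name__ == "__main__":
    # Each input element is processed by whichever worker is free;
    # because the tasks share nothing, no coordination cost is paid.
    with Pool(processes=4) as pool:
        print(pool.map(work, range(8)))  # [0, 1, 4, 9, 16, 25, 36, 49]
```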
Parallel Processing Implementation on a Ray Tracing Engine Using MPI-based POV-Ray
This case implements parallel processing in the original version of the well-known ray tracing program POV-Ray. Parallelizing this algorithm involves several problems that commonly arise in parallel computing. The ray tracing process is complex and computationally intensive: rendering a POV-Ray scene can take hours or even days for very complex images. Hence the need to speed up this process, realized here through parallel processing.
The following is an example of an implementation case:
POV-Ray (Persistence of Vision Raytracer – www.povray.org) is a 3-dimensional rendering engine. The program reads information from an external text file and simulates light interacting with the objects in a scene to produce a realistic 3-dimensional image. Starting from a text file containing a description of the scene (objects, lights, point of view), the program renders the desired image. The algorithm works line by line. An interesting facility of POV-Ray is antialiasing, a technique that helps to get rid of sampling errors and can produce better results.
With antialiasing enabled, POV-Ray starts by tracing one ray for each pixel. If the color of a pixel differs from that of its neighboring pixels (the pixels to the left and above) by more than a threshold value, the pixel is supersampled by tracing a fixed number of additional rays. This technique, called supersampling, improves the final quality of an image but also increases the rendering time. After parsing the input data, POV-Ray decomposes the image into pixels for rendering, scanning each line horizontally from left to right. When a line is finished it is written to a file or displayed on the screen, and then the next line is computed, down to the last.
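Because each scanline is computed independently, the line-by-line decomposition described above maps naturally onto parallel workers. A minimal sketch, with a hypothetical render_line() standing in for the real per-pixel ray tracing:

```python
from multiprocessing import Pool

WIDTH = 8  # pixels per scanline (illustrative)

def render_line(y):
    # Hypothetical stand-in for tracing one ray per pixel of scanline y.
    return [(x + y) % 256 for x in range(WIDTH)]

if __name__ == "__main__":
    with Pool(4) as pool:
        # Scanlines are rendered in parallel and reassembled in order,
        # mirroring POV-Ray's top-to-bottom horizontal scan.
        image = pool.map(render_line, range(6))
    print(len(image), len(image[0]))  # 6 8
```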
- Message Passing Interface (MPI).
MPI is a programming standard that allows programmers to create applications that can be run in parallel. The work performed by an application can be divided and sent to the compute nodes; each compute node processes its part and returns the results to the head node. Designing a parallel application requires many considerations, including the latency of the network and the length of time each task occupies a processor.
MPI is a standard developed to create portable message-passing applications. A parallel computation consists of a number of processes, each of which works on some local data. Each process has local variables, and there is no mechanism by which a process can directly access the memory of another. Data is shared between processes by message passing, that is, by sending and receiving messages between processes.
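The send/receive model can be sketched with a pipe between two standard-library processes; the calls below play the role of MPI's MPI_Send and MPI_Recv, but this is only an analogy, not the actual MPI API:

```python
from multiprocessing import Process, Pipe

def worker(conn):
    # The worker owns its local data; sharing happens only through messages.
    local_data = conn.recv()       # receive a message (cf. MPI_Recv)
    conn.send(sum(local_data))     # send the result back (cf. MPI_Send)
    conn.close()

if __name__ == "__main__":
    parent_end, child_end = Pipe()
    p = Process(target=worker, args=(child_end,))
    p.start()
    parent_end.send([1, 2, 3, 4])  # message out to the other process
    print(parent_end.recv())       # 10
    p.join()
```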
MPI provides functions for exchanging messages. MPI is also used to:
1. write portable parallel code,
2. get high performance in parallel programming, and
3. handle problems involving irregular or dynamic data relationships that do not fit the data-parallel model well.
- PVM (Parallel Virtual Machine)
PVM is a software package that supports message passing for parallel computing between computers. PVM can run on various UNIX or Windows variants and is portable across many architectures, such as PCs, workstations, multiprocessors, and supercomputers.
The PVM system has two parts. The first is the daemon, pvmd, which runs on each machine of the virtual machine. The virtual machine is created when the user executes a PVM application; PVM can be started from the UNIX prompt on all hosts. The second part is the interface routine library, which provides many functions for communication between tasks. This library contains routines that can be called to send messages, create new processes, coordinate tasks, and configure the virtual machine.
One of the important mechanisms in PVM is the master and slave/worker program structure. The programmer must create a master code that coordinates the process and a slave code that receives, runs, and returns the results of the work to the master. The master code is executed first and then spawns the other processes. Each program is written in C or Fortran and is compiled on each computer. If the computers used for parallel computing all have the same architecture (for example, all Pentium 4), the program only needs to be compiled on one computer; the compiled result is then distributed to the other computers, which become the parallel computing nodes. The master program resides on only one node, while the slave program resides on all nodes.
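The master/slave mechanism can be sketched with standard-library queues: the master hands out tasks and collects results, which is the role the PVM master code plays (the real pvm_spawn and message-passing routines are not used here):

```python
from multiprocessing import Process, Queue

def slave(tasks, results):
    # Slave: receive work, run it, and return the result to the master.
    while True:
        n = tasks.get()
        if n is None:              # sentinel: no more work
            break
        results.put(n * n)

if __name__ == "__main__":
    tasks, results = Queue(), Queue()
    workers = [Process(target=slave, args=(tasks, results)) for _ in range(2)]
    for w in workers:
        w.start()
    for n in range(4):             # master distributes the work...
        tasks.put(n)
    for _ in workers:              # ...then tells each slave to stop
        tasks.put(None)
    collected = sorted(results.get() for _ in range(4))
    for w in workers:
        w.join()
    print(collected)  # [0, 1, 4, 9]
```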
Communication can take place if each computer has access rights to the filesystems of all the other computers. Filesystem access is done via the rsh protocol, which runs on UNIX or Windows. The following are the setup steps for each computer:
1. Create a hostfile listing the computer nodes and the user names that will be used for parallel computing. If the user names on all computers are the same (for example, the user name research on computers C1, C2, C3, and C4), this hostfile can be omitted; it is needed when the user names differ from one computer to another.
2. Register each computer's IP address in the /etc/hosts.allow and /etc/hosts.equiv files.
3. Hosts can also be added and deleted dynamically via the PVM console. This method can be used when an IP address is not defined in the hostfile.
A PVM program consists of a master and slaves, where the master program is executed first and then spawns the other processes by calling the pvm_spawn() routine to start one or more copies of a process. Routines in the C version of PVM carry the pvm prefix. The tasks that send and receive messages are identified by a TID (Task Identifier). TIDs are unique and are generated by the local pvmd. PVM contains several routines that return TID values, so a user application can identify the other tasks in the system.
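The idea of unique task identifiers can be illustrated with spawned standard-library processes, whose process IDs play a role loosely analogous to PVM's TIDs (this is not the PVM API; real TIDs are generated by pvmd for tasks started with pvm_spawn()):

```python
from multiprocessing import Process

def task():
    pass  # stand-in for the work a spawned task would perform

if __name__ == "__main__":
    # The "master" spawns several copies of the same task, as pvm_spawn()
    # does; each spawned process carries a unique identifier (here, a pid).
    procs = [Process(target=task) for _ in range(3)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    ids = {p.pid for p in procs}
    print(len(ids))  # 3: every task has a distinct identifier
```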