Parallel & Distributed Computing
CS441 Lecture Notes - Ryan Turnquist
In keeping with the goal of making computing faster, designers have essentially two choices: make a single-processor computer faster, or split the computation among multiple processors. The first idea is what we have been talking about so far in this class, and it has been accomplished through caching and a host of other means. Unfortunately, it has an upper limit on performance growth due to the physical properties of materials, and eventually we will reach that limit. The other option is to use multiple processors and split the workload among them. This is called parallel computing, and there are numerous ways of arranging it.
Architecture
Among other criteria, parallel computers can be divided by their memory architecture: shared memory and distributed memory. Shared memory parallel computers are defined by the ability of all CPUs to access all memory as a global address space. There are two main types of shared memory: uniform and non-uniform.
Uniform Memory Access (UMA) is characterized by the use of identical processors that have equal access times to memory. Symmetric Multiprocessor (SMP) machines, like a dual-core processor, are the most common representation of UMA machines today.
Non-Uniform Memory Access (NUMA) is characterized by CPUs whose access times are not equal to all memory. These are often made by physically linking multiple SMPs, and access is slower across the link. A multi-CPU computer (one with more than one actual piece of silicon) is a NUMA machine.
Cache coherency can be an issue and is usually solved at the
hardware level with snooping or snarfing. Advantages of shared memory systems include ease of programming (as
compared to distributed memory systems) and fast data sharing between CPUs. Scalability and cost are the biggest
disadvantages of shared memory parallel computers - as the number of processors increases, the traffic to memory and cache increases geometrically.
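The programming-ease advantage comes from every thread seeing the same address space. A minimal sketch (using Python's standard threading module; the shared counter is an illustrative example, not from the notes):

```python
# Sketch: the shared-memory model. Every thread sees the same
# address space, so sharing data is trivial -- but updates must
# be synchronized, the software analogue of keeping caches coherent.
import threading

counter = 0              # lives in the single, global address space
lock = threading.Lock()  # prevents lost updates from concurrent writes

def worker(iterations):
    global counter
    for _ in range(iterations):
        with lock:       # without the lock, increments can be lost
            counter += 1

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000: all four threads updated the same shared variable
```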
Distributed memory systems are characterized by each processor having its own memory. Because of this, accessing the memory of another processor is up to the programmer and results in non-uniform access times. These systems are much more scalable and cost-effective, but harder to program, since it is up to the programmer to handle communication among processors. Cache coherency, however, is not an issue.
The organization of concurrent systems ranges from as tightly coupled as a multithreaded or multicore CPU to as loosely coupled as grid computing. The two main organizations I want to talk about are clusters and grid computing (distributed computing).
Cluster Computing
Cluster computers are very tightly coupled computers that can be viewed as a single computer in most respects. Each node of the cluster is commonly connected to the others over a high-speed local area network. Clusters are highly economical, providing performance or reliability more cost-effectively than a comparable single computer. Scaling is also much easier in a cluster computer than in other types of parallel computing. There are generally three categories of cluster: High Availability, High Performance and Load-Balancing.
Grid Computing (Distributed Computing)
Grid or distributed computing is very similar to cluster computing; in fact, clusters are basically a special case of distributed computing. The biggest differences are that in grid computing, each node does not usually work only on group tasks, nodes do not fully trust one another, nodes are much more heterogeneous, and they are generally separated geographically. Distributed computing almost always falls into one of the following architectures: Client-Server, N-Tier, Tightly Coupled (Clustered) or P2P.
Some well-known examples include SETI@home and Folding@home.
While distributed computing allows users and computing resources around the world to solve problems that would otherwise take an unreasonable amount of time, there are many pitfalls that make it difficult to implement.
Poor planning can lead to system unreliability if critical nodes fail or synchronization is inadequate. Troubleshooting also becomes much more difficult when dealing with many different systems and platforms. Many problems are not suited for distributed computation, especially those with low parallelism or high communication needs, which leads to the problem of overhead: if the bandwidth is too low or the communication requirements are too high, the overall efficiency and performance gain may be lower than computing on a different type of system. Proprietary data generated by volunteers could also pose a legal problem. Another major problem can be the sheer amount of raw data generated (especially in biogenetic problems), which is meaningless until it is sifted through and presented in a usable way - and that could take longer than the actual problem solving.
Along with conceptual problems, there are many implementation problems that can be mostly summarized by the eight fallacies of distributed computing written by Peter Deutsch and James Gosling. Many of these seem to be networking issues but what is distributed computing without the network…
There are many interesting ideas about making distributed computing work and trying to solve the network issues listed above.
Remote Procedure Calls (RPC) - This is a way to call procedures on a networked machine without the hassle of explicitly coding the interaction between the computers. It gives the programmer the ability to call a remote procedure just as they would a local one.
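A minimal sketch of the idea, using Python's standard xmlrpc modules (the add() procedure is a made-up example, and the server runs in a background thread only so the sketch is self-contained; in practice client and server would be on different machines):

```python
# Sketch: a remote procedure call. The client calls add() as if it
# were local; the RPC library marshals the call over the network.
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

def add(a, b):
    return a + b

# Server side: expose add() on an OS-assigned port.
server = SimpleXMLRPCServer(("localhost", 0), logRequests=False)
server.register_function(add)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: looks like an ordinary local call, but the arguments
# and result travel over HTTP to the server above.
client = ServerProxy(f"http://localhost:{port}")
result = client.add(2, 3)
server.shutdown()
print(result)  # 5
```

The whole point is that the network plumbing is hidden: the programmer writes client.add(2, 3), not socket code, which is exactly the convenience (and, per the fallacies above, the danger) of RPC.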
Remote Method Invocation (RMI) - This is the OO version of RPC and was originally developed for Java. No surprise there, if you read about the formation of the fallacies list, you get a sense that the Sun people have been striving for this kind of thing.
While nearly any programming language with enough access to the system can be used for parallel computing, and specifically distributed computing, there are many, many languages tailored just for this. Many of them are adaptations of C/C++ or Java, and some have gone the other way - for example, Alef, by Rob Pike, became a thread library for C.