Excerpt from Performance of Shared Memory in a Parallel Computer
The particular application that motivated this study is the performance analysis of parallel computers, especially vector machines in which processors and memories are connected by a crossbar. This means there is a communication path between each processor and memory that does not conflict with the path between any other processor and memory. However, if a memory module is addressed by more than one processor during an instruction cycle, the different accesses must be serviced sequentially, and the program cannot advance until all memory requests are satisfied. In such a case, the time to perform an instruction increases linearly with the length of the maximum request queue. Consequently, the hardware designer wishes the memory requests to be spread as uniformly as possible on average.
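To make the cost model concrete, the following is a minimal Python sketch of one instruction cycle. It is an illustration only and is not taken from the paper: the function names, the list-based representation of requests, the `base_time` and `service_time` parameters, and the assumption that each processor picks a module independently and uniformly at random are all hypothetical choices made for this example.

```python
import random
from collections import Counter


def max_queue_length(requests):
    """Return the longest per-module request queue for one instruction cycle.

    `requests[i]` is the index of the memory module addressed by processor i
    (a hypothetical representation chosen for this sketch).
    """
    if not requests:
        return 0
    return max(Counter(requests).values())


def instruction_time(requests, base_time=1.0, service_time=1.0):
    """Assumed linear cost model: a fixed overhead plus one service slot
    for each request queued at the busiest module."""
    return base_time + service_time * max_queue_length(requests)


def average_max_queue(num_processors, num_modules, trials=10_000, seed=0):
    """Monte Carlo estimate of the mean maximum queue length, assuming each
    processor addresses a module independently and uniformly at random."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        requests = [rng.randrange(num_modules) for _ in range(num_processors)]
        total += max_queue_length(requests)
    return total / trials
```

For instance, `max_queue_length([0, 1, 2, 3])` is 1 because every processor addresses a different module, whereas `max_queue_length([0, 0, 0, 1])` is 3 because three accesses to module 0 must be serviced one after another. Calling `average_max_queue(8, 8)` gives a rough sense of how often conflicts arise even when requests are uniform on average.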