List of Moderators (Updated: 23 Aug

List of Moderators (Updated: 23 Aug

Post by Arch Robis » Sun, 02 Dec 1990 22:05:00



I'm looking for references (either pro or con) on the following claim:

        A p-processor parallel processor can never exhibit speedup
        of greater than p.

I remember there being a pro and con article in one of the ACM SIG newletters
a while back.  Thanks in advance for any pointers.

Arch D. Robison
University of Illinois at Urbana-Champaign


UUCP: {pur-ee,convex}!uiucdcs!robison

 
 
 

List of Moderators (Updated: 23 Aug

Post by David F. Ko » Sun, 02 Dec 1990 16:09:00



> I'm looking for references (either pro or con) on the following claim:

>    A p-processor parallel processor can never exhibit speedup
>    of greater than p.

Try these, all short:

D. Parkinson, "Parallel Efficiency can be greater than unity",
Parallel Computing 3 (1986) 261-262.

V. Faber, Olaf M. Lubeck and Andrew B. White, Jr, "Superlinear speedup
of an efficient sequential algorithm is not possible", Parallel
Computing 3 (1986) 259-260.

"Isomorphic Computers, Ltd", International Journal of Parallel
Programming, Vol 16, No 2, 1987, pp 179-182.

David Kotz
Department of Computer Science, Duke University, Durham, NC 27706


UUCP:   decvax!duke!dfk

 
 
 

List of Moderators (Updated: 23 Aug

Post by David F. Ko » Sun, 02 Dec 1990 14:21:00



> I'm looking for references (either pro or con) on the following claim:

>    A p-processor parallel processor can never exhibit speedup
>    of greater than p.

Try these, all short:

D. Parkinson, "Parallel Efficiency can be greater than unity",
Parallel Computing 3 (1986) 261-262.

V. Faber, Olaf M. Lubeck and Andrew B. White, Jr, "Superlinear speedup
of an efficient sequential algorithm is not possible", Parallel
Computing 3 (1986) 259-260.

"Isomorphic Computers, Ltd", International Journal of Parallel
Programming, Vol 16, No 2, 1987, pp 179-182.

David Kotz
Department of Computer Science, Duke University, Durham, NC 27706


UUCP:   decvax!duke!dfk

 
 
 

List of Moderators (Updated: 23 Aug

Post by Matthew Huntba » Sun, 02 Dec 1990 19:53:00



> I'm looking for references (either pro or con) on the following claim:

>    A p-processor parallel processor can never exhibit speedup
>    of greater than p.

Depends on the algorithm. In parallel heuristic search superlinear
speedup can happen easily. The reference I have to hand is:

G-J.Li and B.W.Wah. How to cope with anomalies in parallel approximate
branch-and-bound algorithms.
In National Conference on Artificial Intelligence (AAAI-84) pp.212-215.

though I think there was something in a recent CACM, possibly by the
same authors.

As a simple demonstration, suppose you are searching a tree with one branch
which your heuristic tells you is promising, but which after a lot of
computation (say 11 time units) does not yield a result, and another branch
which your heuristic tells you is less promising but which does in fact yield
a result after a small amount of computation (say 1 time unit).

On a single processor you could search the whole large branch first
before turning to the small branch and finding your solution: total time
12 units.

With two processors you could assign one branch to each processor. Let us
assume there is a time delay of one unit shipping the less promising
branch to a remote processor, and a time delay of another unit getting the
result back. You would still get you result in 3 units of time.

So 2 processors give you a speedup of 4   Q.E.D.

 
 
 

List of Moderators (Updated: 23 Aug

Post by Magnus M Halldorss » Sun, 02 Dec 1990 18:41:00



Quote:> Depends on the algorithm. In parallel heuristic search superlinear
> speedup can happen easily.
>...
> As a simple demonstration, suppose you are searching a tree with one branch
> which your heuristic tells you is promising, but which after a lot of
> computation (say 11 time units) does not yield a result, and another branch
> which your heuristic tells you is less promising but which does in fact yield
> a result after a small amount of computation (say 1 time unit).

> On a single processor you could search the whole large branch first
> before turning to the small branch and finding your solution: total time
> 12 units.

But you can always simulate the parallel processors by traversing the
tree breadth-first. That will require a stack and extra memory, but no
more than the memory used by all the parallel processors combined.
  For this algorithm, the parallel version would be an example of
AND-parallelism, while for the standard sequential algorithm, it would
be an example of OR-parallelism.
  The m*is, you can only get superlinear speedup if the
corresponding sequential algorithm you're comparing with is less optimal.

Magnus

 
 
 

List of Moderators (Updated: 23 Aug

Post by James Ve » Sun, 02 Dec 1990 18:41:00




 >> I'm looking for references (either pro or con) on the following claim:
 >>
 >>       A p-processor parallel processor can never exhibit speedup
 >>       of greater than p.
 >
 >Depends on the algorithm. In parallel heuristic search superlinear
 >speedup can happen easily. The reference I have to hand is:
 >[...]
 >As a simple demonstration, suppose you are searching a tree with one branch
 >which your heuristic tells you is promising, but which after a lot of
 >computation (say 11 time units) does not yield a result, and another branch
 >which your heuristic tells you is less promising but which does in fact yield
 >a result after a small amount of computation (say 1 time unit).
 >
 >On a single processor you could search the whole large branch first
 >before turning to the small branch and finding your solution: total time
 >12 units.
 >
 >With two processors you could assign one branch to each processor. Let us
 >assume there is a time delay of one unit shipping the less promising
 >branch to a remote processor, and a time delay of another unit getting the
 >result back. You would still get you result in 3 units of time.
 >
 >So 2 processors give you a speedup of 4   Q.E.D.

One has to be careful in comparing apples to oranges.  Here you are
comparing two different algorithms.  Just because you run the same
program on each of your many processors as you ran on your
uniprocessor does not mean that your global algorithm is consistent.

In the case described above, the uniprocessor is running an algorithm
that says "exhaustively search a branch until you find an answer or it
fails to yield a result", while the multiprocessor is running an
algorithm that says "search N (the number of processors) branches
simultaneously until your find an answer replacing branches that fail
to yield a result with new branches."

To realistically compare the performance increase of the multiprocessor
over the uniprocessor you should run the second algorithm on the
uniprocessor.  This could be done by specifying N and looping over N
branches one level at a time.

--
James S. Vera      |      Internet           |Standard Disclaimers


"When I was young it seemed that life was so wonderful..." - Super*

 
 
 

List of Moderators (Updated: 23 Aug

Post by Gary L Da » Sun, 02 Dec 1990 18:42:00



>As a simple demonstration,.... [had to edit out excessive included text]
>So 2 processors give you a speedup of 4   Q.E.

Wait!  You still have to search your promising branch thoroughly on the
first processor; that's 11 steps, not 1.  The fastest turnaround time
if you have each branch ALREADY residing on seperate processors is 11
if they are searched simultaneously; master-slave, it's 13.--
~~~~~~~~~~~~~~~~~~~~~~~~ je me souviens ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~



 
 
 

List of Moderators (Updated: 23 Aug

Post by Marco Valtor » Sun, 02 Dec 1990 20:16:00


If you search breadth-first, instead of depth-first,
you will return a solution in 2 units of time on a serial
machine.

Should I conclude that the speedup in parallel search is
always less than 1 ?  :-)

Marco Valtorta                  usenet: ...!ncrae!usceast!mgv

University of South Carolina    tel.: (1)(803)777-4641
Columbia, SC 29208              tlx: 805038 USC
U.S.A.                          fax: (1)(803)777-3065
usenet from Europe: ...!mcvax!uunet!ncrlnk!ncrae!usceast!mgv

 
 
 

List of Moderators (Updated: 23 Aug

Post by Matthew Huntba » Sun, 02 Dec 1990 15:40:00


What I was assuming was that you have an algorithm where the any solution
is acceptable and on the whole a depth-first search leads to solutions quickest.
It just happens that in this particular example, the heuristics lead you down
the wrong path. I assumed the two processors still searched depth-first, so a
completely breadth-first search would not give a quicker solution.

I agree that bringing in parallelism could be said to have altered the algorithm
by introducing some breadth-first search into what was a purely depth-first
search. But this is what is generally accepted by the term "superlinear
speedup"