San Diego State University

Let S be a list of n items in random order

Problem - Find the k'th smallest item in the list

Sequential_Select( S, k ), Q is a constant

Step 1

- if | S| < Q then sort S and return k'th element
- else
- subdivide S into |S |/Q sublists of Q elements each

Step 2

- Sort each sublist and determine its median

Step 3

- Call Sequential_Select to find m, median of the |S |/Q medians found in step 2

Step 4

- Let L = elements of S that are less than m
- Let E = elements of S that are equal to m
- Let G = elements of S that are greater than m

Step 5

- if | L | >= k then return Sequential_Select( L, k )
- if | L |+| E | >= k then return m
- return Sequential_Select( G , k -| L |-| E |)
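The five steps above can be sketched in Python as follows. This is a minimal illustration, not the notes' own code: the function name `sequential_select`, the 1-based rank `k`, the default `Q=5`, and the choice of the lower median for even-length lists are all assumptions made for the example.

```python
def sequential_select(S, k, Q=5):
    """Return the k'th smallest item of S (k is 1-based). Assumes Q >= 5."""
    if not 1 <= k <= len(S):
        raise ValueError("k must be between 1 and len(S)")
    # Step 1: small lists are sorted directly.
    if len(S) < Q:
        return sorted(S)[k - 1]
    # Step 1 (else branch): split S into sublists of Q items (the last may be shorter).
    sublists = [S[i:i + Q] for i in range(0, len(S), Q)]
    # Step 2: sort each sublist and take its (lower) median.
    medians = [sorted(sub)[(len(sub) - 1) // 2] for sub in sublists]
    # Step 3: recursively find m, the median of the medians.
    m = sequential_select(medians, (len(medians) + 1) // 2, Q)
    # Step 4: three-way partition of S around m.
    L = [v for v in S if v < m]
    E = [v for v in S if v == m]
    G = [v for v in S if v > m]
    # Step 5: recurse only into the part that contains rank k.
    if len(L) >= k:
        return sequential_select(L, k, Q)
    if len(L) + len(E) >= k:
        return m
    return sequential_select(G, k - len(L) - len(E), Q)


print(sequential_select([7, 2, 9, 4, 1, 8, 3], 3))  # prints 3
```

Because m itself always lands in E, both L and G are strictly smaller than S, so the recursion terminates.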

Define

**t(n)**= time required in worst case to find the k'th smallest item in a list of n items

Step 1

- if | S| < Q then sort S and return k'th element
- else
- subdivide S into |S |/Q sublists of Q elements each

Step 2

- Sort each sublist and determine its median

Step 3

- Call Sequential_Select to find m, median of the |S |/Q medians found in step 2

This takes t( n / Q )

Step 4

- Let L = elements of S that are less than m
- Let E = elements of S that are equal to m
- Let G = elements of S that are greater than m

Step 5

- if | L | >= k then return Sequential_Select( L, k )
- if | L |+| E | >= k then return m
- return Sequential_Select( G , k -| L |-| E |)

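Claim: At least |S|/4 items of S are greater than or equal to m

proof:

- At least |S|/(2Q) of the sublist medians are greater than or equal to m, since m is the median of the |S|/Q medians
- Each median is the median of a sublist of size Q
- So each median has at least Q/2 items of its sublist greater than or equal to it
- So at least |S|/(2Q) * Q/2 = |S|/4 items of S are greater than or equal to m

By the same argument, at least |S|/4 items of S are less than or equal to m.

We have

- | L | <= 3*|S|/4, since L contains none of the items greater than or equal to m
- | G | <= 3*|S|/4, since G contains none of the items less than or equal to m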
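The bounds on | L | and | G | can be checked empirically. The sketch below is only an illustration under stated assumptions: Q = 5, a list length that is a multiple of Q so that every sublist is full, and sorting used as a stand-in for the recursive call that finds m. The helper names `median_of_medians` and `partition_sizes` are hypothetical.

```python
import random


def median_of_medians(S, Q=5):
    """Pivot m: the median of the medians of consecutive sublists of Q items."""
    sublists = [S[i:i + Q] for i in range(0, len(S), Q)]
    medians = [sorted(sub)[(len(sub) - 1) // 2] for sub in sublists]
    # Sorting stands in for the recursive call; only the value of m matters here.
    return sorted(medians)[(len(medians) - 1) // 2]


def partition_sizes(S, Q=5):
    """Return (|L|, |G|) for the partition of S around the median of medians."""
    m = median_of_medians(S, Q)
    return sum(v < m for v in S), sum(v > m for v in S)


random.seed(0)
n = 1000  # a multiple of Q, so every sublist is full
worst_L = worst_G = 0
for _ in range(200):
    S = [random.randrange(10 * n) for _ in range(n)]
    size_L, size_G = partition_sizes(S)
    worst_L = max(worst_L, size_L)
    worst_G = max(worst_G, size_G)

# Neither part ever exceeds three quarters of the list.
print(worst_L <= 3 * n // 4, worst_G <= 3 * n // 4)  # prints True True
```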

We have:

t( n ) = k* n + t( n / Q ) + t( 3n/4), where k = c1 + c2 + c3 is a constant (not the rank k passed to Sequential_Select)

Need Q so that n / Q + 3n/4 < n

Any Q >= 5 will work, pick 5

t( n ) = k* n + t( n / 5 ) + t( 3n/4)

Assume that t( m ) <= c*m for all m < n

We get:

t( n ) <= k* n + c* n / 5 + c* 3n/4

- = k* n + c* 19*n/20, let c = 20k
- = k* n + 19k*n
- = 20k*n = c*n
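The bound t( n ) <= 20k*n can also be checked numerically. The sketch below evaluates the recurrence directly, assuming k = 1, integer division for n/5 and 3n/4, and a base case of k*n for lists shorter than 5; these choices are illustrative and not taken from the notes.

```python
from functools import lru_cache

K = 1  # the constant k = c1 + c2 + c3; the value 1 is assumed for illustration


@lru_cache(maxsize=None)
def t(n):
    """Worst-case cost from the recurrence t(n) = k*n + t(n/5) + t(3n/4)."""
    if n < 5:
        return K * n  # assumed base case: small lists cost linear time
    return K * n + t(n // 5) + t(3 * n // 4)


for n in (10, 1000, 10**6):
    # Each line shows n, the recurrence value, and the claimed bound 20*k*n.
    print(n, t(n), 20 * K * n)
```

In every row the middle value stays at or below the last one, which is what the induction with c = 20k predicts.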

Let S be a list of n items in random order

We have N processors ( I will use P = N )

Determine x such that N = n^(1-x), 0 < x < 1

Each processor will get n/N = n^x elements

M is an array in shared memory

Problem - Find the k'th smallest item in the list

Parallel_Select( S, k )

Step 1

- if | S| < 5 then sort S and return k'th element
- else
- subdivide S into P sublists of |S|/P elements each
- Pi gets sublist Si

Step 2

- Each processor Pi determines mi, the median of its sublist, using Sequential_Select( Si, |Si|/2 )
- Set M[i] = mi

Step 3 Find m, the median of M

- m = Parallel_Select( M, |M|/2 )

Step 4a

- Let Li = elements of Si that are less than m
- Let Ei = elements of Si that are equal to m
- Let Gi = elements of Si that are greater than m

Step 4b Construct L, E, G from Li, Ei, Gi, i = 1 to P

- Let L = elements of S that are less than m
- Let E = elements of S that are equal to m
- Let G = elements of S that are greater than m
- Perform a pre-scan on |L1|, |L2|, |L3|, ... |LP| to get 0, |L1|, |L1| + |L2|, |L1| + |L2| + |L3|, etc.
- Now processor Pi places its list Li starting in location |L1| + |L2| + |L3| + ... + |Li-1| of array L (assuming L starts at location 0)
- Do the same for E and G
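The pre-scan placement in Step 4b can be sketched as below. The processors are simulated by an ordinary loop, and the function name `place_with_prescan` is hypothetical; on a real shared-memory machine each loop iteration would be one processor writing its own disjoint slice.

```python
from itertools import accumulate


def place_with_prescan(parts):
    """Concatenate per-processor lists using an exclusive prefix sum of their sizes."""
    sizes = [len(p) for p in parts]
    # Pre-scan (exclusive scan): 0, |L1|, |L1|+|L2|, ...
    offsets = [0] + list(accumulate(sizes))[:-1]
    out = [None] * sum(sizes)
    # Processor i writes its list starting at offsets[i]; the slices never overlap.
    for part, start in zip(parts, offsets):
        out[start:start + len(part)] = part
    return offsets, out


offsets, L = place_with_prescan([[3, 1], [], [2], [0, 4, 5]])
print(offsets)  # prints [0, 2, 2, 3]
print(L)        # prints [3, 1, 2, 0, 4, 5]
```

Because every start position is known before any write happens, no processor has to wait for another while copying.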

Step 5

- if | L | >= k then return Parallel_Select( L, k )
- if | L |+| E | >= k then return m
- return Parallel_Select( G , k -| L |-| E |)
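The whole of Parallel_Select can be simulated sequentially to check its logic. The sketch below rests on several assumptions that are not in the notes: processors are simulated by list comprehensions, the number of processors at each level is chosen as |S|^(1-x) with x = 0.5 (capped at |S|/2 so that M is always shorter than S), sorting stands in for Sequential_Select on each sublist, and the names `gather` and `parallel_select` are hypothetical.

```python
from itertools import accumulate


def gather(parts):
    """Step 4b: place each processor's list at the offset given by a pre-scan."""
    offsets = [0] + list(accumulate(len(p) for p in parts))[:-1]
    out = [None] * sum(len(p) for p in parts)
    for part, start in zip(parts, offsets):
        out[start:start + len(part)] = part
    return out


def parallel_select(S, k, x=0.5):
    """Return the k'th smallest item of S (1-based), simulating the processors."""
    if not 1 <= k <= len(S):
        raise ValueError("k must be between 1 and len(S)")
    # Step 1: small lists are sorted directly.
    if len(S) < 5:
        return sorted(S)[k - 1]
    # Step 1 (else branch): P simulated processors, each with about |S|/P items.
    P = max(2, min(int(len(S) ** (1 - x)), len(S) // 2))
    size = -(-len(S) // P)  # ceiling of |S|/P
    sublists = [S[i:i + size] for i in range(0, len(S), size)]
    # Step 2: each processor finds the median of its sublist
    # (sorting is a stand-in for Sequential_Select here).
    M = [sorted(Si)[(len(Si) - 1) // 2] for Si in sublists]
    # Step 3: m is the median of M.
    m = parallel_select(M, (len(M) + 1) // 2, x)
    # Step 4a: each processor partitions its own sublist around m.
    Ls = [[v for v in Si if v < m] for Si in sublists]
    Es = [[v for v in Si if v == m] for Si in sublists]
    Gs = [[v for v in Si if v > m] for Si in sublists]
    # Step 4b: build L, E, G from the per-processor pieces.
    L, E, G = gather(Ls), gather(Es), gather(Gs)
    # Step 5: recurse only into the part that contains rank k.
    if len(L) >= k:
        return parallel_select(L, k, x)
    if len(L) + len(E) >= k:
        return m
    return parallel_select(G, k - len(L) - len(E), x)
```

For example, `parallel_select(list(range(100, 0, -1)), 10)` returns 10, the tenth smallest of the values 1 to 100.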

Define

**t(n)**= time required in worst case to find the k'th smallest item in a list of n items using Parallel_Select

Step 1

- if | S| < 5 then sort S and return k'th element
- else
- subdivide S into P sublists of |S|/P elements each
- Pi gets sublist Si

Step 2

- Each processor Pi determines mi, the median of its sublist, using Sequential_Select( Si, |Si|/2 )

Step 3

- m = Parallel_Select( M, |M|/2 )

Step 4a

- Let Li = elements of Si that are less than m
- Let Ei = elements of Si that are equal to m
- Let Gi = elements of Si that are greater than m

Step 4b Construct L, E, G from Li, Ei, Gi, i = 1 to P

Step 5

- if | L | >= k then return Parallel_Select( L, k )
- if | L |+| E | >= k then return m
- return Parallel_Select( G , k -| L |-| E |)

We have

- | L | <= 3*|S|/4
- | G | <= 3*|S|/4

We have:

t( n ) = c1*log( P ) + c2 * |S|/P + t( P ) + c3 * |S|/P + c4 * log( P ) + t( 3n/4)

But:

P = n^(1-x) and |S|/P = n^x, so log( P ) <= log( n )

So

t( n ) <= c1*log( n ) + c2 * n^x + t( n^(1-x) ) + c3 * n^x + c4 * log( n ) + t( 3n/4)

Which gives:

t( n ) = O( n^x ) = O( |S|/P )