SDSU CS 660: Combinatorial Algorithms
Review of Mathematical Analysis of Algorithms

San Diego State University -- This page last updated 8/29/95
----------

Contents of Intro Lecture

  1. References
  2. Mathematical Analysis of Algorithms
    1. Model of Computing
    2. Asymptotic Notation
  3. Timing Analysis
    1. Timing in C on Rohan
    2. Handling Measurement Errors
    3. Estimating Complexity from Timing Results
    4. Mathematical Analysis and Timing Code

References


Introduction to Algorithms, Cormen, Leiserson, Rivest, Chapters 1-4

Mathematical Analysis of Algorithms


Model of Computing


If analysis of algorithms is the answer, what is the question?





Given two or more algorithms for the same task, which is better?
Under which conditions is bubble sort better than insertion sort?

What computing resources does an algorithm require?
How long will it take bubble sort to sort a list of N items?






The goal of mathematical analysis is a function that gives the resources an algorithm requires



On what computer?
What is a Computer?
Random-access machine (RAM)
Single processor
Instructions executed sequentially
Each operation requires the same amount of time

Single cost vs. Lg(N) cost
Time required for basic operation?
3 + 6
1234!
Insertion Sort
A[0] = - infinity
for K = 2 to N do
begin
    J = K;
    Key = A[J];
    while Key < A[J-1] do
    begin
        A[J] = A[J-1];
        J = J - 1;
    end while;
    A[J] = Key;
end for;
Complexity
Resources required by the algorithm as a function of the input size



Worst-case Analysis
Complexity of an algorithm based on worst input of each size

Average-case Analysis
Complexity of an algorithm averaged over all inputs of each size
Insertion Sort
                Comparisons         Element moves
worst case      (N+1)N/2 - 1        (N-1)N/2
average case    (N+1)N/4 - 1/2      (N-1)N/4


Asymptotic Notation


Asymptotically tight bound



Asymptotic upper bounds
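For reference, the two notions named above are usually defined as follows in the cited text (Cormen, Leiserson, and Rivest). The constants c, c_1, c_2, and n_0 are the customary symbols from that definition and are assumed here, not taken from these notes.

```latex
\Theta(g(n)) = \{\, f(n) : \exists\, c_1, c_2, n_0 > 0 \text{ such that }
    0 \le c_1\, g(n) \le f(n) \le c_2\, g(n) \text{ for all } n \ge n_0 \,\}

O(g(n)) = \{\, f(n) : \exists\, c, n_0 > 0 \text{ such that }
    0 \le f(n) \le c\, g(n) \text{ for all } n \ge n_0 \,\}
```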
Common Myths and Errors
instead of:
or even that there is an n such that

Let f(n) = 2n + 10, and g(n) = n then
f(n) = O(g(n)) but f(n) > g(n)
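The claim can be checked directly against the definition of O. The constants c = 3 and n_0 = 10 below are one convenient choice among many, picked for this illustration.

```latex
f(n) = 2n + 10 \;\le\; 2n + n \;=\; 3n \;=\; 3\, g(n) \quad \text{for all } n \ge 10,
\qquad \text{while} \qquad
f(n) - g(n) = n + 10 > 0 \quad \text{for all } n \ge 0 .
```

So f(n) = O(g(n)) holds even though f(n) is larger than g(n) for every n.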



Bubble vs. Insertion Sort
                  Worst case    Average case
Bubble sort       Θ(n*n)        Θ(n*n)
Insertion Sort    Θ(n*n)        Θ(n*n)


Bubble Sort
                Comparisons         Element moves
worst case      (N-1)N/2            3(N-1)N/2
average case    (N-1)N/2            3(N-1)N/4
best case       (N-1)N/2            0

Insertion Sort
                Comparisons         Element moves
worst case      (N+1)N/2 - 1        (N-1)N/2
average case    (N+1)N/4 - 1/2      (N-1)N/4
best case       N - 1               0
Bubble vs. Insertion Sort: Timing Results
Worst Case
N       Bubble    Insertion
100     1         1
200     5         3
400     19        11
800     79        42
1600    317       166



Average Case
N       Bubble    Insertion
100     1         0
200     3         1
400     14        5
800     56        21
1600    228       84




What is wrong with this Picture?

Timing Analysis


Timing in C on Rohan



int main(void)
{
    int k, iterations;

    for (iterations = 0; iterations < 50; iterations++)
    {
        start();                        /* start the timer */
        for (k = 0; k < 2000000; k++)   /* do some work */
            k = k;
        stop();                         /* stop the timer */
        printf("Time taken: %lu\n", report());
    }
    return 0;
}
Result on Rohan

Time    Frequency Occurred
30       2
31       2
32       9
33      10
34      11
35       9
36       5
37       1
39       1
Source for Timing C Code on Rohan

#include <stdio.h>
#include <sys/times.h>
#include <limits.h>

static struct tms start_time;   /* Stores the starting time */
static struct tms stop_time;    /* Stores the ending time */

/* Record the process times at the start of the timed region. */
void start(void)
{
    times(&start_time);
}

/* Record the process times at the end of the timed region. */
void stop(void)
{
    times(&stop_time);
}

/* User CPU time between start() and stop(), in clock ticks. */
unsigned long report(void)
{
    return (unsigned long)(stop_time.tms_utime - start_time.tms_utime);
}

int main(void)
{
    int k, iterations;

    for (iterations = 0; iterations < 50; iterations++)
    {
        start();                        /* start the timer */
        for (k = 0; k < 2000000; k++)   /* do some work */
            k = k;
        stop();                         /* stop the timer */
        printf("Time taken: %lu\n", report());
    }
    return 0;
}

Handling Measurement Errors[1]


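Repeat a measurement n times
Let the measurements be labeled $x_1, x_2, \ldots, x_n$

Let $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ and $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$

The confidence interval for the true measurement is[2]:

$\bar{x} \pm \frac{t\,s}{\sqrt{n}}$

The value of t determines the probability that the true measurement is in the interval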

When n >= 50
Probability    50%     80%     90%     95%     99%
Value of t     0.67    1.28    1.64    1.96    2.58
In Example

$\bar{x} = 33.70$, $s^2 = 3.15$ so $s = 1.78$, selecting t = 1.96 we get

$33.70 \pm 1.96 \cdot 1.78 / \sqrt{50} = 33.70 \pm 0.49$

95% confidence interval is (33.21, 34.19)
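Student t table - When n < 50
n      90%      95%      99%
1      3.078    6.314    31.821
2      1.886    2.920    6.965
3      1.638    2.353    4.541
4      1.533    2.132    3.747
5      1.476    2.015    3.365
6      1.440    1.943    3.143
7      1.415    1.895    2.998
8      1.397    1.860    2.896
9      1.383    1.833    2.821
10     1.372    1.812    2.764
20     1.325    1.725    2.528
30     1.310    1.697    2.457
40     1.303    1.684    2.423

Note: these entries are one-sided quantiles of the t distribution with n degrees of freedom. As n grows they approach 1.28, 1.64, and 2.33, not the two-sided values 1.64, 1.96, and 2.58 listed for n >= 50.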

Estimating Complexity from Timing Results

Fun with Functions

Let f(n) = 3n*n + 4n + 5 and g(n) = 3n*n


Fact: g(n) is an approximation of f(n)


Notation: f(n) = g(n) + O(n)

n       f(n)      g(n)      % error
1       12        3         75.00%
10      345       300       13.04%
20      1285      1200      6.61%
30      2825      2700      4.42%
40      4965      4800      3.32%
50      7705      7500      2.66%
60      11045     10800     2.22%
70      14985     14700     1.90%
80      19525     19200     1.66%
90      24665     24300     1.48%
100     30405     30000     1.33%
200     120805    120000    0.67%
300     271205    270000    0.44%
Eyeballing Complexity

Let $f(n) = a n^k$ then $f(2n)/f(n) = 2^k$
Timing Results
N       Bubble    Insertion
100     1         1
200     5         3
400     19        11
800     79        42
1600    317       166
Plotting Complexity
Cubic or Quadratic[3]?
Plotting Complexity: Engineer's Method (Modified)

Let $f(n) = a n^k$ then $\log_b f(n) = \log_b a + k \log_b n$

Let b = 2 and $n = 2^m$ then $\lg f(n) = \lg a + k m$

Plotting Complexity: Transform the Axis

Let $f(n) = a n^k$ and $J = n^k$ (or $n = J^{1/k}$) then:

$g(J) = f(J^{1/k}) = a (J^{1/k})^k = aJ$

So g(J) is linear!
Example
n      f(n) = 5n*n + n + 3    J = n*n
1      9                      1
10     513                    100
20     2023                   400
30     4533                   900
40     8043                   1600
50     12553                  2500
60     18063                  3600
Which is Quadratic?




Mathematical Analysis and Timing Code



Bubble sort worst case is Θ(n*n)

Complexity is an*n + bn + c

Timing Results Worst Case
N      Bubble Sort
400    20
500    31
600    45
700    61
800    79

Least Squares fit of data to an*n + bn + c

Bubble sort worst case is 0.0001143n*n + 0.01084n - 2.738

Predicted vs. Actual Time for Bubble Sort
N       Actual    Predicted    % Error
900     105       99.601       5.14%
1000    124       122.402      1.29%
1100    149       147.489      1.01%
2000    496       476.142      4.00%
2400    713       681.646      4.40%