Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

CIS 256 (Java - data structures) Goal: This lab will introduce you to several so

ID: 3831698 • Letter: C

Question

CIS 256 (Java - data structures)

Goal: This lab will introduce you to several sorting algorithms and allow you

to compare their running times in practice.

Part I: Identifying Sort Algorithms (1 point)

----------------------------------------------

To help build your intuition of how these sorting algorithms work, take a look

at three algorithms in action. Run a web browser, make sure its has Java

enabled, and go to the Web page

   http://www.cs.berkeley.edu/~jrs/61b/lab/MysterySort

Each histogram shown illustrates an array of ints. The length of each bar is

proportional to the integer stored at that position in the array. When you

click on a histogram, an animated algorithm sorts the array.

The histograms are numbered 1, 2, 3, and 4, each number corresponding to a

different sorting algorithm you have seen in class. Identify each algorithm.

How can you tell which is which? Which is fastest on the random data? What

about the presorted data and the reverse sorted data?

One of the algorithms is quicksort. Which element does it choose for the pivot

in a partition? (The pivot is not randomly chosen.) How can you tell?

Part II: Timing Sort Algorithms (2 points)

-------------------------------------------

Part I gave you a rough idea of the relative speeds of the three sorting

algorithms. For greater accuracy, we will now time several algorithms.

SortPerf.java runs a sorting algorithm (chosen from a selection of several) on

a randomly generated array of specified size, and outputs the data to a file.

Compile and run SortPerf.

javac -g SortPerf.java

java SortPerf select 5 175 200 select.dat

The first argument is the sorting algorithm ("select", "merge", "insert", or

"quick"). The second is both the size of the smallest array to sort and the

increment between array sizes. The third is the size of the largest array to

sort. The fourth is the number of trials for each size of array. Because

timings are easily perturbed by other processes running on the same

workstation, you must run each test for a few hundred trials to get accurate

timings. The last argument is the name of the file in which SortPerf reports

the total running time, in milliseconds, of all trials for each array size.

The sample command above runs selection sort on arrays of size 5, 10, 15, ...,

175, each 200 times, placing the timing results in the file select.dat. If you

notice that the timings look a little bumpy, i.e., the running time does not

appear to be strictly increasing with input size, close any unnecessary

applications you have running, then try running SortPerf several times until

you get a fairly smooth set of numbers. When SortPerf is running, be careful

not to touch the mouse or keyboard, as any user input increases the running

time.

Now run SortPerf for the same sizes using mergesort, insertion sort, and

quicksort. Save these data to different files--say, merge.dat, insert.dat, and

quick.dat. To make this easier, we have also provided a file called runrace,

which you can execute to run all four sorting algorithms. Read runrace to see

how it works.

Use the program "gnuplot" to plot the timing results. At the prompt in an

xterm (not emacs), type

gnuplot

You will see a ">" prompt. Plot the files using a command like

gnuplot> plot "select.dat" with linesp 1, "merge.dat" with linesp 2,

            "insert.dat" with linesp 3, "quick.dat" with linesp 4

[Type all this on just one line, though.]

You should observe that the algorithms behave asymptotically as expected, but

it is unclear which algorithm is faster for small array sizes. At what values

of n do you observe cross-overs in execution time between the three sorting

algorithms? To help see these cross-overs better, edit the runrace file to use

the command

java SortPerf <sort> 1 50 1000 <sort>.dat

for each sorting algorithm. This gives a clearer picture of what happens for

array sizes less than 50. Again run the script and plot the results with

gnuplot.

PART III: Building a Better Sort (1 point)

-------------------------------------------

Based on your observations in Part II, write a sorting algorithm that works

well on the random data for all sizes of n. It won't be a completely new

sorting algorithm, but should combine the sorting algorithms described here in

a very simple way. Implement your solution in YourSort.java. Run your

algorithm from SortPerf by calling the sorting method "best". Modify runrace

to generate timings for your new algorithm, and plot the results.

Check-off

---------

1 point:   What are sorting algorithms 1, 2, and 3 in Part I? How do you know?

           How is the pivot element chosen? How do you know?

3 points: Show the plotted performance of the sorting algorithms. The third

           point is for including your algorithm in the plot, and demonstrating

           that it's roughly as good as any of the others for any array size.

-------------------------------------------------------------------------------------------------------------------------------------------------------------

/* Sort.java */

/**
* Contains several sorting routines implemented as static methods.
* Arrays are rearranged with smallest item first using compares.
* @author Kathy Yelick and Mark Allen Weiss
**/
public final class Sort {

/**
* Simple insertion sort.
* @param a an array of int items.
**/
public static void insertionSort(int[] a) {
for (int p = 1; p < a.length; p++) {
int tmp = a[p];
int j = p;

for(; (j > 0) && (tmp < a[j - 1]); j--) {
a[j] = a[j - 1];
}
a[j] = tmp;
}
}

/**
* Shellsort, using a sequence suggested by Gonnet.
* @param a an array of int items.
**/
public static void shellsort(int[] a) {
for (int gap = a.length / 2; gap > 0;
gap = (gap == 2) ? 1 : (int) (gap / 2.2)) {
for (int i = gap; i < a.length; i++) {
int tmp = a[i];
int j = i;

for (; (j >= gap) && (tmp < a[j - gap]); j -= gap) {
a[j] = a[j - gap];
}
a[j] = tmp;
}
}
}

/**
* Standard heapsort.
* @param a an array of int items.
**/
public static void heapsort(int[] a) {
for (int i = a.length / 2; i >= 0; i--) {
percDown(a, i, a.length);
}
for (int i = a.length - 1; i > 0; i--) {
swapReferences(a, 0, i);
percDown(a, 0, i);
}
}

/**
* Internal method for heapsort.
* @param i the index of an item in the heap.
* @return the index of the left child.
**/
private static int leftChild(int i) {
return 2 * i + 1;
}

/**
* Internal method for heapsort, used in deleteMax and buildHeap.
* @param a an array of int items.
* @index i the position from which to percolate down.
* @int n the logical size of the binary heap.
**/
private static void percDown(int[] a, int i, int n) {
int child;
int tmp;

for (tmp = a[i]; leftChild(i) < n; i = child) {
child = leftChild(i);
if ((child != n - 1) && (a[child] < a[child + 1])) {
child++;
}
if (tmp < a[child]) {
a[i] = a[child];
} else {
break;
}
}
a[i] = tmp;
}

/**
* Mergesort algorithm.
* @param a an array of int items.
**/
public static void mergeSort(int[] a) {
int[] tmpArray = new int[a.length];

mergeSort(a, tmpArray, 0, a.length - 1);
}

/**
* Internal method that makes recursive calls.
* @param a an array of int items.
* @param tmpArray an array to place the merged result.
* @param left the left-most index of the subarray.
* @param right the right-most index of the subarray.
**/
private static void mergeSort(int[] a, int[] tmpArray, int left, int right) {
if (left < right) {
int center = (left + right) / 2;
mergeSort(a, tmpArray, left, center);
mergeSort(a, tmpArray, center + 1, right);
merge(a, tmpArray, left, center + 1, right);
}
}

/**
* Internal method that merges two sorted halves of a subarray.
* @param a an array of int items.
* @param tmpArray an array to place the merged result.
* @param leftPos the left-most index of the subarray.
* @param rightPos the index of the start of the second half.
* @param rightEnd the right-most index of the subarray.
**/
private static void merge(int[] a, int[] tmpArray, int leftPos, int rightPos,
int rightEnd) {
int leftEnd = rightPos - 1;
int tmpPos = leftPos;
int numElements = rightEnd - leftPos + 1;

// Main loop
while (leftPos <= leftEnd && rightPos <= rightEnd) {
if (a[leftPos] < a[rightPos]) {
tmpArray[tmpPos++] = a[leftPos++];
} else {
tmpArray[tmpPos++] = a[rightPos++];
}
}
while (leftPos <= leftEnd) {
tmpArray[tmpPos++] = a[leftPos++];
}
while(rightPos <= rightEnd) {
tmpArray[tmpPos++] = a[rightPos++];
}
// Copy TmpArray back
for (int i = 0; i < numElements; i++, rightEnd--) {
a[rightEnd] = tmpArray[rightEnd];
}
}

/**
* Quicksort algorithm.
* @param a an array of int items.
**/
public static void quicksort(int[] a) {
quicksort(a, 0, a.length - 1);
}

/**
* Method to swap two ints in an array.
* @param a an array of ints.
* @param index1 the index of the first int to be swapped.
* @param index2 the index of the second int to be swapped.
**/
public static void swapReferences(int[] a, int index1, int index2) {
int tmp = a[index1];
a[index1] = a[index2];
a[index2] = tmp;
}

/**
* This is a generic version of C.A.R Hoare's Quick Sort algorithm. This
* will handle arrays that are already sorted, and arrays with duplicate
* keys.
*
* If you think of an array as going from the lowest index on the left to
* the highest index on the right then the parameters to this function are
* lowest index or left and highest index or right. The first time you call
* this function it will be with the parameters 0, a.length - 1.
*
* @param a an integer array
* @param lo0 left boundary of array partition
* @param hi0 right boundary of array partition
**/
private static void quicksort(int a[], int lo0, int hi0) {
int lo = lo0;
int hi = hi0;
int mid;

if (hi0 > lo0) {

// Arbitrarily establishing partition element as the midpoint of
// the array.
swapReferences(a, lo0, (lo0 + hi0)/2);
mid = a[(lo0 + hi0) / 2];

// loop through the array until indices cross.
while (lo <= hi) {
// find the first element that is greater than or equal to
// the partition element starting from the left Index.
while((lo < hi0) && (a[lo] < mid)) {
lo++;
}

// find an element that is smaller than or equal to
// the partition element starting from the right Index.
while((hi > lo0) && (a[hi] > mid)) {
hi--;
}
// if the indices have not crossed, swap them.
if (lo <= hi) {
swapReferences(a, lo, hi);
lo++;
hi--;
}
}

// If the right index has not reached the left side of array
// we must now sort the left partition.
if (lo0 < hi) {
quicksort(a, lo0, hi);
}

// If the left index has not reached the right side of array
// must now sort the right partition.
if (lo < hi0) {
quicksort(a, lo, hi0);
}
}
}

/**
* Internal insertion sort routine.
* @param a an array of int items.
* @param low the left-most index of the subarray.
* @param n the number of items to sort.
**/
private static void insertionSort(int[] a, int low, int high) {
for (int p = low + 1; p <= high; p++) {
int tmp = a[p];
int j;

for (j = p; (j > low) && (tmp < a[j - 1]); j--) {
a[j] = a[j - 1];
}
a[j] = tmp;
}
}

/**
* Implementation of the SelectionSort algorithm on integer arrays.
* @param a an array of int items.
**/
public static void selectionSort(int[] A) {
int i;
for (i = A.length-1; i > 0; i--) {
// outer loop invariant: A[i..n-1] is sorted.

// find maximum value in A[0..i]
int maxIndex = 0;
int j;
for (j = 1; j <= i; j++) {
   /* inner loop invariant: for all k < j, A[maxIndex] >= A[k] */

   if (A[maxIndex] < A[j]) {
   maxIndex = j;
}
}

// swap largest (A[maxIndex]) into A[i] to maintain invariant.
swapReferences(A, i, maxIndex);
}
}

}

----------------------------------------------------------------------------------------------------------------------------------------------------------------------

/* SortPerf.java */

import java.util.Random;
import java.io.*;

class SortPerf {

private static final long INIT_SEED = 1234567;

/**
* Times a sorting algorithm on data for arrays of size incN, 2 * incN, ...,
* maxN and writes the timing data to "outFileName".
*
* For an array of each size, sorts the random data numTrials times to
* produce an total running time.
**/
public static void main(String[] argv) throws IOException {

String outFileName = "";
int incN = 0;
int maxN = 0;
int numTrials = 0;
String sortAlg = "";
if (argv.length != 5) {
printUsage();
return;
} else try {
sortAlg = argv[0];
incN = Integer.parseInt(argv[1]);
maxN = Integer.parseInt(argv[2]);
numTrials = Integer.parseInt(argv[3]);
outFileName = argv[4];
} catch (Exception e) {
printUsage();
return;
}
PrintWriter timings = new PrintWriter(new FileOutputStream(outFileName));
timings.println("# Results for " + numTrials + " trials");
timings.println("# n time (msec)");
timings.println("# ---------------");

timeSort(timings, incN, maxN, numTrials, sortAlg);
timings.close();
System.out.println("done! results in `" + outFileName + "'");
}

/**
* Times a sorting algorithm on data for arrays of size incN, 2 * incN, ...,
* maxN. Performs numTrials trials and computes the total running time.
*/
private static void timeSort(PrintWriter timings, int incN, int maxN,
           int numTrials, String sortAlg) {
Timer stopWatch = new Timer();
for (int n = incN; n <= maxN; n += incN) {
System.out.println("timing n == " + n + " ... ");
stopWatch.reset();
int[][] data = new int[numTrials + 1][];
for (int t = 0; t < numTrials + 1; t++) {
   data[t] = randomData(n);
}
if (sortAlg.equals("insert")) {
   Sort.insertionSort(data[numTrials]); // sort once without counting
   stopWatch.start();
   for (int t = 0; t < numTrials; t++) {
   Sort.insertionSort(data[t]);
   }
   stopWatch.stop();
} else if (sortAlg.equals("select")) {
   Sort.selectionSort(data[numTrials]); // sort once without counting
   stopWatch.start();
   for(int t = 0; t < numTrials; t++) {
   Sort.selectionSort(data[t]);
   }
   stopWatch.stop();
} else if (sortAlg.equals("merge")) {
   Sort.mergeSort(data[numTrials]); // sort once without counting
   stopWatch.start();
   for (int t = 0; t < numTrials; t++) {
   Sort.mergeSort(data[t]);
   }
   stopWatch.stop();
} else if (sortAlg.equals("quick")) {
   Sort.quicksort(data[numTrials]); // sort once without counting
   stopWatch.start();
   for(int t = 0; t < numTrials; t++) {
   Sort.quicksort(data[t]);
   }
   stopWatch.stop();
} else if (sortAlg.equals("best")) {
   YourSort.sort(data[numTrials]); // sort once without counting
   stopWatch.start();
   for(int t = 0; t < numTrials; t++) {
   YourSort.sort(data[t]);
   }
   stopWatch.stop();
} else {
   printUsage();
   return;
}
long totalTime = stopWatch.elapsed();
timings.println(n + " " + totalTime);
}
}

// Prints the contents of A.
private static void printData(int[] A) {
for (int i = 0; i < A.length - 1; i++) {
System.out.print(A[i] + ", ");
}
if (A.length - 1 >= 0) {
System.out.println(A[A.length - 1]);
}
}

/**
* Assumes n > 0
* Returns an array of `n' randomly selected integers.
**/
private static int[] randomData(int n) {

// choose same sequence of random numbers so that
// we can fairly compare our sorting algorithms

Random randGen = new Random(INIT_SEED);

int[] newData = new int[n];
for (int i = 0; i < n; i++) {
newData[i] = randGen.nextInt();
}

return newData;
}

/** Print a message saying how the main method should be called. */
private static void printUsage() {
System.out.println("Usage:");
System.out.println(" java SortPerf <sort> <incr> <max> <runs> <outfile>");
System.out.println(" sort - one of insert, merge, quick, or best");
System.out.println(" incr - the initial array size and increment");
System.out.println(" max - the maximum array size");
System.out.println(" runs - the number of runs for each size");
System.out.println(" outfile - is the name of the timing output file");
}
}

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

/* YourSort.java */

public class YourSort {

public static void sort(int[] A) {
// Place your Part III code here.
}
}

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

/* Timer.java */

/**
* Implements a simple stopwatch/timer class based on wall-clock time.
**/

/**
* RUNNING() == true <==> start() called with no corresponding
* call to stop()
*
* All times are given in units of msec.
**/
public class Timer {

private boolean running;
private long tStart;
private long tFinish;
private long tAccum;

/**
* Initializes Timer to 0 msec
**/
public Timer() {
reset();
}

/**
* Starts the timer. Accumulates time across multiple calls to start.
**/
public void start() {
running = true;
tStart = System.currentTimeMillis();
tFinish = tStart;
}

/**
* Stops the timer. returns the time elapsed since the last matching call
* to start(), or zero if no such matching call was made.
**/
public long stop() {
tFinish = System.currentTimeMillis();
if (running) {
running = false;

long diff = tFinish - tStart;
tAccum += diff;
return diff;
}
return 0;
}

/**
* if RUNNING() ==> returns the time since last call to start()
* if !RUNNING() ==> returns total elapsed time
**/
public long elapsed() {
if (running) {
return System.currentTimeMillis() - tStart;
}

return tAccum;
}

/**
* Stops timing, if currently RUNNING(); resets
* accumulated time to 0.
*/
public void reset() {
running = false;
tStart = 0;
tFinish = 0;
tAccum = 0;
}

}

Thank you very much!

Explanation / Answer

import java.util.Random;
import java.io.*;

class SortPerf {

private static final long INIT_SEED = 1234567;

/**
* Times a sorting algorithm on data for arrays of size incN, 2 * incN, ...,
* maxN and writes the timing data to "outFileName".
*
* For an array of each size, sorts the random data numTrials times to
* produce an total running time.
**/
public static void main(String[] argv) throws IOException {

String outFileName = "";
int incN = 0;
int maxN = 0;
int numTrials = 0;
String sortAlg = "";
if (argv.length != 5) {
printUsage();
return;
} else try {
sortAlg = argv[0];
incN = Integer.parseInt(argv[1]);
maxN = Integer.parseInt(argv[2]);
numTrials = Integer.parseInt(argv[3]);
outFileName = argv[4];
} catch (Exception e) {
printUsage();
return;
}
PrintWriter timings = new PrintWriter(new FileOutputStream(outFileName));
timings.println("# Results for " + numTrials + " trials");
timings.println("# n time (msec)");
timings.println("# ---------------");

timeSort(timings, incN, maxN, numTrials, sortAlg);
timings.close();
System.out.println("done! results in `" + outFileName + "'");
}

/**
* Times a sorting algorithm on data for arrays of size incN, 2 * incN, ...,
* maxN. Performs numTrials trials and computes the total running time.
*/
private static void timeSort(PrintWriter timings, int incN, int maxN,
           int numTrials, String sortAlg) {
Timer stopWatch = new Timer();
for (int n = incN; n <= maxN; n += incN) {
System.out.println("timing n == " + n + " ... ");
stopWatch.reset();
int[][] data = new int[numTrials + 1][];
for (int t = 0; t < numTrials + 1; t++) {
   data[t] = randomData(n);
}
if (sortAlg.equals("insert")) {
   Sort.insertionSort(data[numTrials]); // sort once without counting
   stopWatch.start();
   for (int t = 0; t < numTrials; t++) {
   Sort.insertionSort(data[t]);
   }
   stopWatch.stop();
} else if (sortAlg.equals("select")) {
   Sort.selectionSort(data[numTrials]); // sort once without counting
   stopWatch.start();
   for(int t = 0; t < numTrials; t++) {
   Sort.selectionSort(data[t]);
   }
   stopWatch.stop();
} else if (sortAlg.equals("merge")) {
   Sort.mergeSort(data[numTrials]); // sort once without counting
   stopWatch.start();
   for (int t = 0; t < numTrials; t++) {
   Sort.mergeSort(data[t]);
   }
   stopWatch.stop();
} else if (sortAlg.equals("quick")) {
   Sort.quicksort(data[numTrials]); // sort once without counting
   stopWatch.start();
   for(int t = 0; t < numTrials; t++) {
   Sort.quicksort(data[t]);
   }
   stopWatch.stop();
} else if (sortAlg.equals("best")) {
   YourSort.sort(data[numTrials]); // sort once without counting
   stopWatch.start();
   for(int t = 0; t < numTrials; t++) {
   YourSort.sort(data[t]);
   }
   stopWatch.stop();
} else {
   printUsage();
   return;
}
long totalTime = stopWatch.elapsed();
timings.println(n + " " + totalTime);
}
}

// Prints the contents of A.
private static void printData(int[] A) {
for (int i = 0; i < A.length - 1; i++) {
System.out.print(A[i] + ", ");
}
if (A.length - 1 >= 0) {
System.out.println(A[A.length - 1]);
}
}

/**
* Assumes n > 0
* Returns an array of `n' randomly selected integers.
**/
private static int[] randomData(int n) {

// choose same sequence of random numbers so that
// we can fairly compare our sorting algorithms

Random randGen = new Random(INIT_SEED);

int[] newData = new int[n];
for (int i = 0; i < n; i++) {
newData[i] = randGen.nextInt();
}

return newData;
}

/** Print a message saying how the main method should be called. */
private static void printUsage() {
System.out.println("Usage:");
System.out.println(" java SortPerf <sort> <incr> <max> <runs> <outfile>");
System.out.println(" sort - one of insert, merge, quick, or best");
System.out.println(" incr - the initial array size and increment");
System.out.println(" max - the maximum array size");
System.out.println(" runs - the number of runs for each size");
System.out.println(" outfile - is the name of the timing output file");
}
}