CIS 256 (Java - data structures) Goal: This lab will introduce you to several so
ID: 3831698 • Letter: C
Question
CIS 256 (Java - data structures)
Goal: This lab will introduce you to several sorting algorithms and allow you
to compare their running times in practice.
Part I: Identifying Sort Algorithms (1 point)
----------------------------------------------
To help build your intuition of how these sorting algorithms work, take a look
at three algorithms in action. Run a web browser, make sure its has Java
enabled, and go to the Web page
http://www.cs.berkeley.edu/~jrs/61b/lab/MysterySort
Each histogram shown illustrates an array of ints. The length of each bar is
proportional to the integer stored at that position in the array. When you
click on a histogram, an animated algorithm sorts the array.
The histograms are numbered 1, 2, 3, and 4, each number corresponding to a
different sorting algorithm you have seen in class. Identify each algorithm.
How can you tell which is which? Which is fastest on the random data? What
about the presorted data and the reverse sorted data?
One of the algorithms is quicksort. Which element does it choose for the pivot
in a partition? (The pivot is not randomly chosen.) How can you tell?
Part II: Timing Sort Algorithms (2 points)
-------------------------------------------
Part I gave you a rough idea of the relative speeds of the three sorting
algorithms. For greater accuracy, we will now time several algorithms.
SortPerf.java runs a sorting algorithm (chosen from a selection of several) on
a randomly generated array of specified size, and outputs the data to a file.
Compile and run SortPerf.
javac -g SortPerf.java
java SortPerf select 5 175 200 select.dat
The first argument is the sorting algorithm ("select", "merge", "insert", or
"quick"). The second is both the size of the smallest array to sort and the
increment between array sizes. The third is the size of the largest array to
sort. The fourth is the number of trials for each size of array. Because
timings are easily perturbed by other processes running on the same
workstation, you must run each test for a few hundred trials to get accurate
timings. The last argument is the name of the file in which SortPerf reports
the total running time, in milliseconds, of all trials for each array size.
The sample command above runs selection sort on arrays of size 5, 10, 15, ...,
175, each 200 times, placing the timing results in the file select.dat. If you
notice that the timings look a little bumpy, i.e., the running time does not
appear to be strictly increasing with input size, close any unnecessary
applications you have running, then try running SortPerf several times until
you get a fairly smooth set of numbers. When SortPerf is running, be careful
not to touch the mouse or keyboard, as any user input increases the running
time.
Now run SortPerf for the same sizes using mergesort, insertion sort, and
quicksort. Save these data to different files--say, merge.dat, insert.dat, and
quick.dat. To make this easier, we have also provided a file called runrace,
which you can execute to run all four sorting algorithms. Read runrace to see
how it works.
Use the program "gnuplot" to plot the timing results. At the prompt in an
xterm (not emacs), type
gnuplot
You will see a ">" prompt. Plot the files using a command like
gnuplot> plot "select.dat" with linesp 1, "merge.dat" with linesp 2,
"insert.dat" with linesp 3, "quick.dat" with linesp 4
[Type all this on just one line, though.]
You should observe that the algorithms behave asymptotically as expected, but
it is unclear which algorithm is faster for small array sizes. At what values
of n do you observe cross-overs in execution time between the three sorting
algorithms? To help see these cross-overs better, edit the runrace file to use
the command
java SortPerf <sort> 1 50 1000 <sort>.dat
for each sorting algorithm. This gives a clearer picture of what happens for
array sizes less than 50. Again run the script and plot the results with
gnuplot.
PART III: Building a Better Sort (1 point)
-------------------------------------------
Based on your observations in Part II, write a sorting algorithm that works
well on the random data for all sizes of n. It won't be a completely new
sorting algorithm, but should combine the sorting algorithms described here in
a very simple way. Implement your solution in YourSort.java. Run your
algorithm from SortPerf by calling the sorting method "best". Modify runrace
to generate timings for your new algorithm, and plot the results.
Check-off
---------
1 point: What are sorting algorithms 1, 2, and 3 in Part I? How do you know?
How is the pivot element chosen? How do you know?
3 points: Show the plotted performance of the sorting algorithms. The third
point is for including your algorithm in the plot, and demonstrating
that it's roughly as good as any of the others for any array size.
-------------------------------------------------------------------------------------------------------------------------------------------------------------
/* Sort.java */
/**
* Contains several sorting routines implemented as static methods.
* Arrays are rearranged with smallest item first using compares.
* @author Kathy Yelick and Mark Allen Weiss
**/
public final class Sort {
/**
* Simple insertion sort.
* @param a an array of int items.
**/
public static void insertionSort(int[] a) {
for (int p = 1; p < a.length; p++) {
int tmp = a[p];
int j = p;
for(; (j > 0) && (tmp < a[j - 1]); j--) {
a[j] = a[j - 1];
}
a[j] = tmp;
}
}
/**
* Shellsort, using a sequence suggested by Gonnet.
* @param a an array of int items.
**/
public static void shellsort(int[] a) {
for (int gap = a.length / 2; gap > 0;
gap = (gap == 2) ? 1 : (int) (gap / 2.2)) {
for (int i = gap; i < a.length; i++) {
int tmp = a[i];
int j = i;
for (; (j >= gap) && (tmp < a[j - gap]); j -= gap) {
a[j] = a[j - gap];
}
a[j] = tmp;
}
}
}
/**
* Standard heapsort.
* @param a an array of int items.
**/
public static void heapsort(int[] a) {
for (int i = a.length / 2; i >= 0; i--) {
percDown(a, i, a.length);
}
for (int i = a.length - 1; i > 0; i--) {
swapReferences(a, 0, i);
percDown(a, 0, i);
}
}
/**
* Internal method for heapsort.
* @param i the index of an item in the heap.
* @return the index of the left child.
**/
private static int leftChild(int i) {
return 2 * i + 1;
}
/**
* Internal method for heapsort, used in deleteMax and buildHeap.
* @param a an array of int items.
* @index i the position from which to percolate down.
* @int n the logical size of the binary heap.
**/
private static void percDown(int[] a, int i, int n) {
int child;
int tmp;
for (tmp = a[i]; leftChild(i) < n; i = child) {
child = leftChild(i);
if ((child != n - 1) && (a[child] < a[child + 1])) {
child++;
}
if (tmp < a[child]) {
a[i] = a[child];
} else {
break;
}
}
a[i] = tmp;
}
/**
* Mergesort algorithm.
* @param a an array of int items.
**/
public static void mergeSort(int[] a) {
int[] tmpArray = new int[a.length];
mergeSort(a, tmpArray, 0, a.length - 1);
}
/**
* Internal method that makes recursive calls.
* @param a an array of int items.
* @param tmpArray an array to place the merged result.
* @param left the left-most index of the subarray.
* @param right the right-most index of the subarray.
**/
private static void mergeSort(int[] a, int[] tmpArray, int left, int right) {
if (left < right) {
int center = (left + right) / 2;
mergeSort(a, tmpArray, left, center);
mergeSort(a, tmpArray, center + 1, right);
merge(a, tmpArray, left, center + 1, right);
}
}
/**
* Internal method that merges two sorted halves of a subarray.
* @param a an array of int items.
* @param tmpArray an array to place the merged result.
* @param leftPos the left-most index of the subarray.
* @param rightPos the index of the start of the second half.
* @param rightEnd the right-most index of the subarray.
**/
private static void merge(int[] a, int[] tmpArray, int leftPos, int rightPos,
int rightEnd) {
int leftEnd = rightPos - 1;
int tmpPos = leftPos;
int numElements = rightEnd - leftPos + 1;
// Main loop
while (leftPos <= leftEnd && rightPos <= rightEnd) {
if (a[leftPos] < a[rightPos]) {
tmpArray[tmpPos++] = a[leftPos++];
} else {
tmpArray[tmpPos++] = a[rightPos++];
}
}
while (leftPos <= leftEnd) {
tmpArray[tmpPos++] = a[leftPos++];
}
while(rightPos <= rightEnd) {
tmpArray[tmpPos++] = a[rightPos++];
}
// Copy TmpArray back
for (int i = 0; i < numElements; i++, rightEnd--) {
a[rightEnd] = tmpArray[rightEnd];
}
}
/**
* Quicksort algorithm.
* @param a an array of int items.
**/
public static void quicksort(int[] a) {
quicksort(a, 0, a.length - 1);
}
/**
* Method to swap two ints in an array.
* @param a an array of ints.
* @param index1 the index of the first int to be swapped.
* @param index2 the index of the second int to be swapped.
**/
public static void swapReferences(int[] a, int index1, int index2) {
int tmp = a[index1];
a[index1] = a[index2];
a[index2] = tmp;
}
/**
* This is a generic version of C.A.R Hoare's Quick Sort algorithm. This
* will handle arrays that are already sorted, and arrays with duplicate
* keys.
*
* If you think of an array as going from the lowest index on the left to
* the highest index on the right then the parameters to this function are
* lowest index or left and highest index or right. The first time you call
* this function it will be with the parameters 0, a.length - 1.
*
* @param a an integer array
* @param lo0 left boundary of array partition
* @param hi0 right boundary of array partition
**/
private static void quicksort(int a[], int lo0, int hi0) {
int lo = lo0;
int hi = hi0;
int mid;
if (hi0 > lo0) {
// Arbitrarily establishing partition element as the midpoint of
// the array.
swapReferences(a, lo0, (lo0 + hi0)/2);
mid = a[(lo0 + hi0) / 2];
// loop through the array until indices cross.
while (lo <= hi) {
// find the first element that is greater than or equal to
// the partition element starting from the left Index.
while((lo < hi0) && (a[lo] < mid)) {
lo++;
}
// find an element that is smaller than or equal to
// the partition element starting from the right Index.
while((hi > lo0) && (a[hi] > mid)) {
hi--;
}
// if the indices have not crossed, swap them.
if (lo <= hi) {
swapReferences(a, lo, hi);
lo++;
hi--;
}
}
// If the right index has not reached the left side of array
// we must now sort the left partition.
if (lo0 < hi) {
quicksort(a, lo0, hi);
}
// If the left index has not reached the right side of array
// must now sort the right partition.
if (lo < hi0) {
quicksort(a, lo, hi0);
}
}
}
/**
* Internal insertion sort routine.
* @param a an array of int items.
* @param low the left-most index of the subarray.
* @param n the number of items to sort.
**/
private static void insertionSort(int[] a, int low, int high) {
for (int p = low + 1; p <= high; p++) {
int tmp = a[p];
int j;
for (j = p; (j > low) && (tmp < a[j - 1]); j--) {
a[j] = a[j - 1];
}
a[j] = tmp;
}
}
/**
* Implementation of the SelectionSort algorithm on integer arrays.
* @param a an array of int items.
**/
public static void selectionSort(int[] A) {
int i;
for (i = A.length-1; i > 0; i--) {
// outer loop invariant: A[i..n-1] is sorted.
// find maximum value in A[0..i]
int maxIndex = 0;
int j;
for (j = 1; j <= i; j++) {
/* inner loop invariant: for all k < j, A[maxIndex] >= A[k] */
if (A[maxIndex] < A[j]) {
maxIndex = j;
}
}
// swap largest (A[maxIndex]) into A[i] to maintain invariant.
swapReferences(A, i, maxIndex);
}
}
}
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
/* SortPerf.java */
import java.util.Random;
import java.io.*;
class SortPerf {
private static final long INIT_SEED = 1234567;
/**
* Times a sorting algorithm on data for arrays of size incN, 2 * incN, ...,
* maxN and writes the timing data to "outFileName".
*
* For an array of each size, sorts the random data numTrials times to
* produce an total running time.
**/
public static void main(String[] argv) throws IOException {
String outFileName = "";
int incN = 0;
int maxN = 0;
int numTrials = 0;
String sortAlg = "";
if (argv.length != 5) {
printUsage();
return;
} else try {
sortAlg = argv[0];
incN = Integer.parseInt(argv[1]);
maxN = Integer.parseInt(argv[2]);
numTrials = Integer.parseInt(argv[3]);
outFileName = argv[4];
} catch (Exception e) {
printUsage();
return;
}
PrintWriter timings = new PrintWriter(new FileOutputStream(outFileName));
timings.println("# Results for " + numTrials + " trials");
timings.println("# n time (msec)");
timings.println("# ---------------");
timeSort(timings, incN, maxN, numTrials, sortAlg);
timings.close();
System.out.println("done! results in `" + outFileName + "'");
}
/**
* Times a sorting algorithm on data for arrays of size incN, 2 * incN, ...,
* maxN. Performs numTrials trials and computes the total running time.
*/
private static void timeSort(PrintWriter timings, int incN, int maxN,
int numTrials, String sortAlg) {
Timer stopWatch = new Timer();
for (int n = incN; n <= maxN; n += incN) {
System.out.println("timing n == " + n + " ... ");
stopWatch.reset();
int[][] data = new int[numTrials + 1][];
for (int t = 0; t < numTrials + 1; t++) {
data[t] = randomData(n);
}
if (sortAlg.equals("insert")) {
Sort.insertionSort(data[numTrials]); // sort once without counting
stopWatch.start();
for (int t = 0; t < numTrials; t++) {
Sort.insertionSort(data[t]);
}
stopWatch.stop();
} else if (sortAlg.equals("select")) {
Sort.selectionSort(data[numTrials]); // sort once without counting
stopWatch.start();
for(int t = 0; t < numTrials; t++) {
Sort.selectionSort(data[t]);
}
stopWatch.stop();
} else if (sortAlg.equals("merge")) {
Sort.mergeSort(data[numTrials]); // sort once without counting
stopWatch.start();
for (int t = 0; t < numTrials; t++) {
Sort.mergeSort(data[t]);
}
stopWatch.stop();
} else if (sortAlg.equals("quick")) {
Sort.quicksort(data[numTrials]); // sort once without counting
stopWatch.start();
for(int t = 0; t < numTrials; t++) {
Sort.quicksort(data[t]);
}
stopWatch.stop();
} else if (sortAlg.equals("best")) {
YourSort.sort(data[numTrials]); // sort once without counting
stopWatch.start();
for(int t = 0; t < numTrials; t++) {
YourSort.sort(data[t]);
}
stopWatch.stop();
} else {
printUsage();
return;
}
long totalTime = stopWatch.elapsed();
timings.println(n + " " + totalTime);
}
}
// Prints the contents of A.
private static void printData(int[] A) {
for (int i = 0; i < A.length - 1; i++) {
System.out.print(A[i] + ", ");
}
if (A.length - 1 >= 0) {
System.out.println(A[A.length - 1]);
}
}
/**
* Assumes n > 0
* Returns an array of `n' randomly selected integers.
**/
private static int[] randomData(int n) {
// choose same sequence of random numbers so that
// we can fairly compare our sorting algorithms
Random randGen = new Random(INIT_SEED);
int[] newData = new int[n];
for (int i = 0; i < n; i++) {
newData[i] = randGen.nextInt();
}
return newData;
}
/** Print a message saying how the main method should be called. */
private static void printUsage() {
System.out.println("Usage:");
System.out.println(" java SortPerf <sort> <incr> <max> <runs> <outfile>");
System.out.println(" sort - one of insert, merge, quick, or best");
System.out.println(" incr - the initial array size and increment");
System.out.println(" max - the maximum array size");
System.out.println(" runs - the number of runs for each size");
System.out.println(" outfile - is the name of the timing output file");
}
}
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
/* YourSort.java */
public class YourSort {
public static void sort(int[] A) {
// Place your Part III code here.
}
}
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
/* Timer.java */
/**
* Implements a simple stopwatch/timer class based on wall-clock time.
**/
/**
* RUNNING() == true <==> start() called with no corresponding
* call to stop()
*
* All times are given in units of msec.
**/
public class Timer {
private boolean running;
private long tStart;
private long tFinish;
private long tAccum;
/**
* Initializes Timer to 0 msec
**/
public Timer() {
reset();
}
/**
* Starts the timer. Accumulates time across multiple calls to start.
**/
public void start() {
running = true;
tStart = System.currentTimeMillis();
tFinish = tStart;
}
/**
* Stops the timer. returns the time elapsed since the last matching call
* to start(), or zero if no such matching call was made.
**/
public long stop() {
tFinish = System.currentTimeMillis();
if (running) {
running = false;
long diff = tFinish - tStart;
tAccum += diff;
return diff;
}
return 0;
}
/**
* if RUNNING() ==> returns the time since last call to start()
* if !RUNNING() ==> returns total elapsed time
**/
public long elapsed() {
if (running) {
return System.currentTimeMillis() - tStart;
}
return tAccum;
}
/**
* Stops timing, if currently RUNNING(); resets
* accumulated time to 0.
*/
public void reset() {
running = false;
tStart = 0;
tFinish = 0;
tAccum = 0;
}
}
Thank you very much!
Explanation / Answer
import java.util.Random;
import java.io.*;
class SortPerf {
private static final long INIT_SEED = 1234567;
/**
* Times a sorting algorithm on data for arrays of size incN, 2 * incN, ...,
* maxN and writes the timing data to "outFileName".
*
* For an array of each size, sorts the random data numTrials times to
* produce an total running time.
**/
public static void main(String[] argv) throws IOException {
String outFileName = "";
int incN = 0;
int maxN = 0;
int numTrials = 0;
String sortAlg = "";
if (argv.length != 5) {
printUsage();
return;
} else try {
sortAlg = argv[0];
incN = Integer.parseInt(argv[1]);
maxN = Integer.parseInt(argv[2]);
numTrials = Integer.parseInt(argv[3]);
outFileName = argv[4];
} catch (Exception e) {
printUsage();
return;
}
PrintWriter timings = new PrintWriter(new FileOutputStream(outFileName));
timings.println("# Results for " + numTrials + " trials");
timings.println("# n time (msec)");
timings.println("# ---------------");
timeSort(timings, incN, maxN, numTrials, sortAlg);
timings.close();
System.out.println("done! results in `" + outFileName + "'");
}
/**
* Times a sorting algorithm on data for arrays of size incN, 2 * incN, ...,
* maxN. Performs numTrials trials and computes the total running time.
*/
private static void timeSort(PrintWriter timings, int incN, int maxN,
int numTrials, String sortAlg) {
Timer stopWatch = new Timer();
for (int n = incN; n <= maxN; n += incN) {
System.out.println("timing n == " + n + " ... ");
stopWatch.reset();
int[][] data = new int[numTrials + 1][];
for (int t = 0; t < numTrials + 1; t++) {
data[t] = randomData(n);
}
if (sortAlg.equals("insert")) {
Sort.insertionSort(data[numTrials]); // sort once without counting
stopWatch.start();
for (int t = 0; t < numTrials; t++) {
Sort.insertionSort(data[t]);
}
stopWatch.stop();
} else if (sortAlg.equals("select")) {
Sort.selectionSort(data[numTrials]); // sort once without counting
stopWatch.start();
for(int t = 0; t < numTrials; t++) {
Sort.selectionSort(data[t]);
}
stopWatch.stop();
} else if (sortAlg.equals("merge")) {
Sort.mergeSort(data[numTrials]); // sort once without counting
stopWatch.start();
for (int t = 0; t < numTrials; t++) {
Sort.mergeSort(data[t]);
}
stopWatch.stop();
} else if (sortAlg.equals("quick")) {
Sort.quicksort(data[numTrials]); // sort once without counting
stopWatch.start();
for(int t = 0; t < numTrials; t++) {
Sort.quicksort(data[t]);
}
stopWatch.stop();
} else if (sortAlg.equals("best")) {
YourSort.sort(data[numTrials]); // sort once without counting
stopWatch.start();
for(int t = 0; t < numTrials; t++) {
YourSort.sort(data[t]);
}
stopWatch.stop();
} else {
printUsage();
return;
}
long totalTime = stopWatch.elapsed();
timings.println(n + " " + totalTime);
}
}
// Prints the contents of A.
private static void printData(int[] A) {
for (int i = 0; i < A.length - 1; i++) {
System.out.print(A[i] + ", ");
}
if (A.length - 1 >= 0) {
System.out.println(A[A.length - 1]);
}
}
/**
* Assumes n > 0
* Returns an array of `n' randomly selected integers.
**/
private static int[] randomData(int n) {
// choose same sequence of random numbers so that
// we can fairly compare our sorting algorithms
Random randGen = new Random(INIT_SEED);
int[] newData = new int[n];
for (int i = 0; i < n; i++) {
newData[i] = randGen.nextInt();
}
return newData;
}
/** Print a message saying how the main method should be called. */
private static void printUsage() {
System.out.println("Usage:");
System.out.println(" java SortPerf <sort> <incr> <max> <runs> <outfile>");
System.out.println(" sort - one of insert, merge, quick, or best");
System.out.println(" incr - the initial array size and increment");
System.out.println(" max - the maximum array size");
System.out.println(" runs - the number of runs for each size");
System.out.println(" outfile - is the name of the timing output file");
}
}