

Question

This assignment asks you to sort the lines of an input file (or from standard input) and print the sorted lines to an output file (or standard output). Your program, called bstsort (binary search tree sort), will take the following command line arguments:

% bstsort [-c] [-o output_file_name] [input_file_name]

If -c is present, the program must compare the strings case-sensitively; otherwise, the comparison is case-insensitive. If output_file_name is given with the -o option, the program writes the sorted lines to that file; otherwise, it writes to standard output. Similarly, if input_file_name is given, the program reads from that file; otherwise, it reads from standard input. You must use getopt() to parse the command line arguments to determine the cases. All strings will be no more than 100 characters long.

In addition to parsing and processing the command line arguments, your program needs to do the following:

1. You need to construct a binary search tree as you read from input. A binary search tree is a binary tree in which each node can have at most two child nodes (one on the left and one on the right); either or both can be empty. If a child node exists, it is the root of a binary search tree (which we call a subtree). Each node contains a key (in our case, a string) and a count of how many times that string occurred in the input. If the left subtree of a node exists, it contains only nodes with keys less than the node's key. If the right subtree of a node exists, it contains only nodes with keys greater than the node's key. You can look up binary search tree on the web. Note that you do not need to balance the binary search tree (that is, you can ignore all those rotation operations) in this assignment.
2. Initially the tree is empty (that is, the root is null). The program reads from the input file (or stdin) one line at a time. If the line is not an empty line, it should create a tree node that stores (a copy of) the string (you shall remove the trailing line feed) and a count of 1 indicating this is the first occurrence of that string, and then insert the tree node into the binary search tree. An empty line indicates the end of input for stdin; an empty line or end of file indicates the end of input for an input file.
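The two steps above can be sketched as follows. This is only an illustration under assumptions: the names `struct node`, `compare_fn`, `compare_exact`, and `bst_insert` are hypothetical, and the comparator shown is a simplified case-sensitive stand-in for the two comparison functions the assignment asks for.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node layout: key, occurrence count, two children. */
struct node {
    char *key;
    int count;
    struct node *left;
    struct node *right;
};

/* Comparator type: negative, zero, or positive result. */
typedef int (*compare_fn)(const char *, const char *);

/* Simplified case-sensitive comparator, present only to make the sketch runnable. */
int compare_exact(const char *a, const char *b)
{
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return (unsigned char)*a - (unsigned char)*b;
}

/*
 * Inserts key below *root. A new key gets a node with count 1;
 * a key that is already present only has its count incremented.
 * Returns 0 on success, -1 if malloc fails.
 */
int bst_insert(struct node **root, const char *key, compare_fn cmp)
{
    /* Walk down until an empty child slot or an equal key is found. */
    while (*root != NULL) {
        int order = cmp(key, (*root)->key);
        if (order == 0) {
            (*root)->count++;
            return 0;
        }
        root = (order < 0) ? &(*root)->left : &(*root)->right;
    }

    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->key = malloc(strlen(key) + 1);
    if (n->key == NULL) {
        free(n);
        return -1;
    }
    strcpy(n->key, key);  /* the node owns its own copy of the string */
    n->count = 1;
    n->left = n->right = NULL;
    *root = n;
    return 0;
}
```

One consequence of counting duplicates in a single node: with a case-insensitive comparator, lines that differ only in letter case share a node, so only the first spelling is stored. Whether that matches the intended output is an open question the assignment text does not settle.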

You must develop two string comparison functions, one case-sensitive and the other case-insensitive. You must not use the strcmp() and strcasecmp() functions provided by the C library; you must implement your own versions.

Once the program has read all the input (when EOF is returned), it performs an in-order traversal of the binary search tree to print all the strings, one line at a time, to the output file or stdout. If there are duplicates, then include all duplicates.

Before the program ends, it must reclaim the tree! You can do this by performing a post-order traversal, i.e., reclaiming the children nodes before reclaiming the node itself. Make sure you also reclaim the memory occupied by the string as well.
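The two traversals can be sketched like this, assuming a hypothetical `struct node` that stores the key and its count as described earlier; the function names `print_in_order` and `free_post_order` are likewise only illustrative.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical node layout matching the description: key, count, two children. */
struct node {
    char *key;
    int count;
    struct node *left;
    struct node *right;
};

/* In-order: left subtree, then the node itself, then the right subtree. */
void print_in_order(const struct node *n, FILE *out)
{
    if (n == NULL)
        return;
    print_in_order(n->left, out);
    for (int i = 0; i < n->count; i++)  /* one line per occurrence */
        fprintf(out, "%s\n", n->key);
    print_in_order(n->right, out);
}

/* Post-order: both children are released before the node that points to them. */
void free_post_order(struct node *n)
{
    if (n == NULL)
        return;
    free_post_order(n->left);
    free_post_order(n->right);
    free(n->key);  /* the string was allocated separately from the node */
    free(n);
}
```

Freeing in post-order matters because a node's child pointers must still be readable when the children are visited; releasing the node first would leave them dangling.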

It is required that you use getopt for processing the command line and use malloc/free functions for dynamically allocating and deallocating nodes and the buffers for the strings. It is required that you implement your own string comparison functions instead of using the corresponding libc functions.

Here's an example:

Please submit your work through blackboard as one zip file. Follow the instructions below carefully (to avoid unnecessary loss of grade):

You should submit the source code and the Makefile. One should be able to create the executable by simply 'make'. The Makefile should also contain a 'clean' target for cleaning up the directory (removing all temporary files, object files and executable files). Make sure you don't include intermediate files: *.o, executables, *~, etc., in your submission. (There'll be a penalty for including unnecessary intermediate files). Only three files should be included unless permission is given for more, those would be bstsort.c, bstsort.h, and Makefile.


Explanation / Answer

bstsort.c

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include "bstsort.h"

void usage(){
fprintf(stderr, "usage: bstsort [-c] [-o output_file_name] [input_file_name]\n");
exit(1);
}

struct Node{
char* data;
struct Node* left;
struct Node* right;

};

struct Tree{
struct Node* root;
};


struct Tree* create_binary_tree(){
struct Tree* t = (struct Tree*)malloc(sizeof(struct Tree));
t->root = 0;
return t;
}

void insert_node (struct Node* node, struct Node* current, int cs){
/* compare once, then descend into exactly one subtree */
if(strCompare(node->data, current->data, cs) > 0){
    /* add greater string to right subtree */
    if(!current->right){
      current->right = node;
    } else{
      insert_node(node, current->right, cs);
    }
} else {
    /* add lower or equal string to left subtree; duplicates get their own node */
    if(!current->left){
      current->left = node;
    } else{
      insert_node(node, current->left, cs);
    }
}
}

void add_to_tree(struct Tree* t, char* data, int cs){
/* allocate room for a whole node, not just for a pointer to one */
struct Node* node = (struct Node*)malloc(sizeof(struct Node));
if(!node){
    fprintf(stderr, "ERROR: out of memory\n");
    exit(1);
}
node->data = data;
node->left = 0;
node->right = 0;

if(!t->root){
    t->root = node;
} else {
    insert_node(node, t->root, cs);
}
}

void print_node (struct Node* n, FILE* fout){
/* in-order traversal: left subtree, node, right subtree */
if(!n){
    return;
}
print_node(n->left, fout);
/* stored strings have no line feed, so add one when printing */
fprintf(fout, "%s\n", n->data);
print_node(n->right, fout);
}

void print_tree (struct Tree* t, FILE* fout){
print_node(t->root, fout);
}

void free_node (struct Node* n){
/* post-order traversal: reclaim both children before the node itself */
if(!n){
    return;
}
free_node(n->left);
free_node(n->right);
free(n->data);
free(n);
}

void free_tree (struct Tree* t){
free_node(t->root);
free(t);
}

int numchar(char* x){
int length;
length = 0;

while(*x++ != '\0'){
    length++;
}
return length;
}

void ditto(char* a, char* b){
/* copy b into a, including the terminating '\0' */
while((*a++ = *b++));
}

int main(int argc, char* argv[])
{
int c;
int cs = 0;            /* 1 = case sensitive comparison */
char* outname = NULL;  /* argument of -o, if given */
FILE* fin = stdin;
FILE* fout = stdout;
struct Tree* tree;
char* x;
char* line = NULL;     /* getline() allocates and grows this buffer */
size_t len = 0;
ssize_t n;

/* "o:" tells getopt that -o takes an argument, delivered in optarg */
while((c = getopt(argc, argv, "co:")) != -1){
    switch(c) {
      case 'c':
        cs = 1;
        break;
      case 'o':
        outname = optarg;
        break;
      case '?':
      default:
        usage();
    }
}
argc -= optind;
argv += optind;

/* at most one positional argument (the input file) is allowed */
if(argc > 1){
    fprintf(stderr, "ERROR: wrong arguments!\n");
    usage();
}

/* open input and output files */
if(argc == 1){
    fin = fopen(argv[0],"r");
    if(!fin) {
      fprintf(stderr, "ERROR: can't open file (to read): %s\n", argv[0]);
      return 1;
    }
}
if(outname){
    fout = fopen(outname,"w");
    if(!fout) {
      fprintf(stderr, "ERROR: can't open file (to write): %s\n", outname);
      if(fin != stdin){
        fclose(fin);
      }
      return 1;
    }
}

/* create a binary tree */
tree = create_binary_tree();

/* read data and create binary tree; an empty line or EOF ends the input */
while((n = getline(&line, &len, fin)) > 0){
    /* remove the trailing line feed */
    if(line[n - 1] == '\n'){
      line[n - 1] = '\0';
    }
    if(line[0] == '\0'){
      break;
    }
    /* allocate space (including the terminator) and copy line into memory */
    x = (char*)malloc(sizeof(char)*(numchar(line) + 1));
    if(!x){
      fprintf(stderr, "ERROR: out of memory\n");
      return 1;
    }
    ditto(x, line);
    /* add line to tree */
    add_to_tree(tree, x, cs);
}
free(line);

/* print tree, then reclaim it */
print_tree(tree, fout);
free_tree(tree);

/* close input and output files */
if(fin != stdin){
    fclose(fin);
}
if(fout != stdout){
    fclose(fout);
}

return 0;
}

bstsort.h

#ifndef BSTSORT_H
#define BSTSORT_H

void strCompareUsage();
int compareWithCase(char*, char*);
int compareWithOutCase(char*, char*);
int strCompare(char*, char*, int);

#endif /* BSTSORT_H */

strCompare.c

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include "bstsort.h"

void strCompareUsage();
int compareWithCase(char* str1, char* str2);
int compareWithOutCase(char* str1, char* str2);
int strCompare(char* string1, char* string2, int flag);

void strCompareUsage()
{
fprintf(stderr, "usage: strCompare [-w] [string1] [string2]\n");
exit(1);
}

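int compareWithCase(char* str1, char* str2){
/* advance while both strings still have characters and they match */
while(*str1 != '\0' && *str1 == *str2){
    str1++;
    str2++;
}

/* difference at the first mismatch; the terminating '\0' counts as 0,
   so a proper prefix sorts first and equal strings give 0 */
return (unsigned char)*str1 - (unsigned char)*str2;
}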

/* fold an ASCII uppercase letter to lowercase; leave other characters alone */
static int toLower(char ch){
if(ch >= 'A' && ch <= 'Z'){
    return ch - 'A' + 'a';
}
return (unsigned char)ch;
}

int compareWithOutCase(char* str1, char* str2){
/* advance while the case-folded characters match */
while(*str1 != '\0' && toLower(*str1) == toLower(*str2)){
    str1++;
    str2++;
}

/* negative if string1 sorts first, positive if string2 does, 0 on a match */
return toLower(*str1) - toLower(*str2);
}

int strCompare(char* string1, char* string2, int cs){
if(cs)
    return compareWithCase(string1, string2);
else
    return compareWithOutCase(string1, string2);
}


Makefile

# recipe lines must begin with a tab character
all: bstsort

bstsort: bstsort.o strCompare.o
	gcc bstsort.o strCompare.o -o bstsort

bstsort.o: bstsort.c bstsort.h
	gcc -Wall -c bstsort.c -o bstsort.o

strCompare.o: strCompare.c bstsort.h
	gcc -Wall -c strCompare.c -o strCompare.o

clean:
	rm -f bstsort core *~ *.o strCompare

.PHONY: all clean