Question

Please follow the instruction list in the photos. A sample file has been given for the directory names, along with the contents of one sample cranfieldXXXX file.

doclist.txt File

cranfield0001
cranfield0002
cranfield0003
cranfield0004
cranfield0005
cranfield0006
cranfield0007
cranfield0008
cranfield0009
cranfield0010
cranfield0011
cranfield0012
cranfield0013
cranfield0014
cranfield0015
cranfield0016
cranfield0017
cranfield0018
cranfield0019
cranfield0020
cranfield0021
cranfield0022
cranfield0023
cranfield0024
cranfield0025
cranfield0026
cranfield0027
cranfield0028
cranfield0029
cranfield0030
cranfield0031
cranfield0032
cranfield0033
cranfield0034
cranfield0035
cranfield0036
cranfield0037
cranfield0038
cranfield0039
cranfield0040
cranfield0041
cranfield0042
cranfield0043
cranfield0044
cranfield0045
cranfield0046
cranfield0047
cranfield0048
cranfield0049
cranfield0050
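
Because the names follow a fixed pattern (the prefix cranfield plus a four-digit, zero-padded number), a list like the one above could be generated rather than typed. The following is a minimal sketch under that assumption; the helper names make_doc_name and write_doclist are hypothetical and not part of the assignment.

```cpp
#include <iomanip>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: builds "cranfield" followed by a zero-padded,
// four-digit document number, e.g. 7 -> "cranfield0007".
std::string make_doc_name(int number)
{
    std::ostringstream name;
    name << "cranfield" << std::setw(4) << std::setfill('0') << number;
    return name.str();
}

// Hypothetical helper: writes the names for documents first..last,
// one per line, to any output stream (a std::ofstream opened on
// doclist.txt, or std::cout for a quick check).
void write_doclist(std::ostream &out, int first, int last)
{
    for (int n = first; n <= last; ++n)
        out << make_doc_name(n) << '\n';
}
```

Calling write_doclist with a std::ofstream opened on doclist.txt and the range 1 to 50 would produce the fifty names listed above, one per line.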

cranfield0001 File

<DOC>
<DOCNO>
1
</DOCNO>
<TITLE>
experimental investigation of the aerodynamics of a
wing in a slipstream .
</TITLE>
<AUTHOR>
brenckman,m.
</AUTHOR>
<BIBLIO>
j. ae. scs. 25, 1958, 324.
</BIBLIO>
<TEXT>
an experimental study of a wing in a propeller slipstream was
made in order to determine the spanwise distribution of the lift
increase due to slipstream at different angles of attack of the wing
and at different free stream to slipstream velocity ratios . the
results were intended in part as an evaluation basis for different
theoretical treatments of this problem .
the comparative span loading curves, together with supporting
evidence, showed that a substantial part of the lift increment
produced by the slipstream was due to a /destalling/ or boundary-layer-control
effect . the integrated remaining lift increment,
after subtracting this destalling lift, was found to agree
well with a potential flow theory .
an empirical evaluation of the destalling effects was made for
the specific configuration of the experiment .
</TEXT>
</DOC>

2. The corpus consists of a list of files whose names follow the pattern cranfieldxxxx, where the xxxx identifies different files, e.g. cranfield0001, cranfield0002, ..., cranfield0050. The words in the corpus will help us to build a dictionary later on in this assignment.

Part 2: Using fstream and argv/argc (20 pts)

1. (10 pts) Once you have created doclist.txt successfully with the right format, do the following:
* Read the string name doclist.txt into your program using argv[].
* Using the file streams, open doclist.txt as we learned in class (See Week 6).
* Report a suitable error message (on std::cerr) if exactly 1 file-path is not supplied as the command-line argument. Make up your own error message (See Exercise 6).
* Report a suitable error message (on std::cerr) if the file cannot be successfully opened for reading. Make up your own error message (See Exercise 6).

2. (10 pts) Use the content of doclist.txt (names of files) to open the cranfieldxxxx documents of your corpus one at a time.
* You will need a new fstream that will open and close before being reused to open the next file. This process continues until your program has passed over all the documents.
* Warning: doclist.txt does not indicate where your files currently are (target directory). You will need to concatenate the substring "cranfieldxxxx" with another substring that indicates the path of the corpus.
* Report a suitable error message (on std::cerr) if the file cannot be successfully opened for reading. Make up your own error message (See Exercise 6).

Note: At this point you can test whether your code is functional. You can integrate a std::cout statement and write the content of each document to standard output (screen). If you have done this, you are in good shape to proceed to the next part of the assignment.

Part 3: Nested Loops and maps (20 pts)
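
The warning in item 2 amounts to string concatenation: the directory that holds the corpus is prepended to each name read from doclist.txt. Here is a minimal sketch, assuming a POSIX-style / separator; the function name corpus_path is hypothetical.

```cpp
#include <string>

// Hypothetical helper: joins the corpus directory and a document name,
// inserting a '/' only when the directory does not already end with one.
// An empty directory means "current directory", so the name is returned as-is.
std::string corpus_path(const std::string &directory, const std::string &doc_name)
{
    if (directory.empty())
        return doc_name;
    if (directory.back() == '/')
        return directory + doc_name;
    return directory + "/" + doc_name;
}
```

For example, corpus_path("corpus", "cranfield0001") yields "corpus/cranfield0001", which can be passed directly to a std::ifstream constructor. The directory name "corpus" is only an illustration.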

Explanation / Answer

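Compiling instructions: g++ -std=c++11 filename.cpp, then run the resulting executable with the path to doclist.txt as its only argument, e.g. ./a.out doclist.txt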

Please set the path_to_corpus variable in the code to the directory that holds the corpus so that it runs successfully.

Read the comments for more clarity on the implementation

#include <fstream>
#include <iostream>
#include <string>
#include <unordered_map>
using namespace std;

int main(int argc, char **argv)
{
   unordered_map<string,int> mydict; // Map to store frequency of tokens
   string path_to_corpus = ""; // Mention path to the corpus here ending with a /

   // Exactly one command-line argument (the path to doclist.txt) is required
   if(argc != 2)
   {
       cerr << "Invalid number of arguments: expected the path to doclist.txt" << endl;
       return 1;
   }

   // Checking if the docs file was successfully opened
   ifstream docsfile(argv[1]);
   if(!docsfile.is_open())
   {
       cerr << "Unable to open docsfile: " << argv[1] << endl;
       return 1;
   }

   // Read one document name per iteration; the loop ends when extraction fails,
   // which avoids processing a stale name at end of file
   string filename;
   while(docsfile >> filename)
   {
       // A fresh stream per document; it closes when it goes out of scope
       string fullpath = path_to_corpus + filename;
       ifstream currfile(fullpath);

       // checking if the corpus file can be opened
       if(!currfile.is_open())
       {
           cerr << "Unable to open corpus file: " << fullpath << endl;
           continue;
       }

       // operator>> splits on whitespace, so each extraction yields one token
       string token;
       while(currfile >> token) mydict[token]++;
   }

   int size_of_vocab = mydict.size();
   int totalwords = 0;

   // iterating over all keys and summing their frequencies
   for(const auto &val : mydict) totalwords += val.second;

   cout << "Vocabulary Size : " << size_of_vocab << endl;
   cout << "Total Words : " << totalwords << endl;
   return 0;
}