

Question

Need help with a Python assignment. The goal of the assignment is to use the provided data file to develop an algorithm that will automatically score the sentiment of a new review that the user inputs. Before a user-input review can be scored, the program would need to execute a training phase based on the reviews that are provided in the data file. The training phase would process the file as follows:

1. Read in a review.
2. Assign each word in the review the score attributed to the review.
3. Enter an object into a dictionary where each entry is keyed by the word. The corresponding total score and number of occurrences are grouped as the value (you can create a simple class to store the score and occurrences, or simply store them in a list). If a word already exists in the dictionary, the program would update the score and number of occurrences. For instance, if a review with a score of 3 contains the word "amazing" and the dictionary already contains an entry for that word with 32 as the total score and 8 as the number of occurrences, the dictionary entry would be updated to 35 and 9, respectively.
4. Repeat Step 1 until all data is entered.

Once the training phase is completed, the program should prompt the user to input a movie review, which it automatically scores based on the overall average score of the words in the review as determined by the training phase. The average score of each word is computed by dividing the total score for the entry by the number of occurrences.

Your responsibility is to implement the training phase as detailed in steps 1-4 above. Your implementation must be based on the Python dictionary class.

Example output:

Enter a review Press return to exit
A weak script that ends with a quick and boring finale
The review has an average value of 1.79128
Negative Sentiment

Enter a review Press return to exit
Loved every minute of it
The review has an average value of 2.39219
Positive Sentiment
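
Steps 1-4 above can be sketched with a plain dictionary whose values are two-element lists. This is only a minimal illustration, not the required solution: the function names `update`, `train`, and `average` are hypothetical, and words are split on whitespace only.

```python
def update(table, score, text):
    """Steps 2 and 3: credit the review's score to every word it contains."""
    for word in text.split():
        # Each value is [total score, number of occurrences]
        entry = table.setdefault(word, [0, 0])
        entry[0] += score
        entry[1] += 1


def train(reviews):
    """Steps 1 and 4: process every (score, text) pair in turn."""
    table = {}
    for score, text in reviews:
        update(table, score, text)
    return table


def average(table, word):
    """Average score of a word: total score divided by occurrences."""
    total, count = table[word]
    return total / count


# The worked example from the question: "amazing" starts at a total of 32
# over 8 occurrences, then appears in a review scored 3.
table = {"amazing": [32, 8]}
update(table, 3, "an amazing film")
print(table["amazing"])  # [35, 9]
```

Storing the total rather than the average keeps each update to two additions; the average is computed only when a word is looked up.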

Explanation / Answer

File name: training_reviews.txt
Reviews are separated by the "$$EOD_REVIEW$$" delimiter. The first line of each review contains the user's rating; the subsequent lines contain the review text. The format is shown below.

------------------------------------ training_reviews.txt -----------------------------

4.35
This movie is great amazing movie
I loved it very much !!!
$$EOD_REVIEW$$

1.8
This is the worst movie of all time. I really hate it

$$EOD_REVIEW$$

1.50
A weak script that ends in a boring finale and is really a bad movie

$$EOD_REVIEW$$

4.50
This movie is great amazing fantastic
superb

$$EOD_REVIEW$$

3.5
Loved every minute of it

$$EOD_REVIEW$$

------------------------------------------- End of the file ------------------------------------------
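
As a rough sketch of how this layout could be read, the snippet below splits the text on the delimiter and separates the rating line from the review body. The function name `parse_reviews` and the `DELIMITER` constant are assumptions made for illustration, and the sample string simply repeats the first two reviews shown above.

```python
DELIMITER = "$$EOD_REVIEW$$"


def parse_reviews(file_text):
    """Return a list of (score, text) pairs from the delimited format."""
    reviews = []
    for chunk in file_text.split(DELIMITER):
        chunk = chunk.strip()
        if not chunk:
            continue  # skip the empty piece after the final delimiter
        # The first line holds the rating; everything after it is the text
        first_line, _, body = chunk.partition("\n")
        # Collapse line breaks and repeated spaces into single spaces
        reviews.append((float(first_line), " ".join(body.split())))
    return reviews


sample = """4.35
This movie is great amazing movie
I loved it very much !!!
$$EOD_REVIEW$$

1.8
This is the worst movie of all time. I really hate it

$$EOD_REVIEW$$
"""

for score, text in parse_reviews(sample):
    print(score, "|", text)
```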


import re

class WordStatistic:

    def __init__(self, keyword, averageScore = 0, occurences = 0):
        self.keyword = keyword
        self.averageScore = averageScore
        self.occurences = occurences

    def getWord(self) :
        return self.keyword

    def getAverageScore(self) :
        return self.averageScore

    def getOccurences(self) :
        return self.occurences

    def addNewScore(self, newScore) :
        # Recover the running total from the average, then fold in the new score
        oldScoreSum = self.averageScore * self.occurences
        self.occurences = self.occurences + 1
        self.averageScore = (oldScoreSum + newScore) / float(self.occurences)

    def printWordStatistic(self) :
        print("Word          : %s" % self.keyword)
        print("Occurrences   : %d" % self.occurences)
        print("Average Score : %s" % self.averageScore)

# Starting Training Phase

wordDictionary = {}
# Read the whole file; the with-block closes it automatically
with open("training_reviews.txt", 'r') as fileInstance :
    fileText = fileInstance.read()

# Assuming that each review is separated by the following delimiter
reviewSplits = fileText.split("$$EOD_REVIEW$$")
for review in reviewSplits :
        review = review.strip()
        if review == "" :
            continue
        # In each review, the first line contains the score and the
        # subsequent lines contain the text
        lineSplits = review.split("\n")
        score = float(lineSplits[0].strip())
        for i in range(1, len(lineSplits)) :
            # Split each line of the review into words on any whitespace
            wordSplits = re.split(r"\s+", lineSplits[i])
            for word in wordSplits :
                if word == "" :
                    continue
                # If it is already present, then update the score and count
                # Otherwise just add the new entry to the dictionary
                if word in wordDictionary :
                    wordStatistic = wordDictionary.get(word)
                    wordStatistic.addNewScore(score)
                else :
                    wordStatistic = WordStatistic(word, score, 1)
                    wordDictionary[word] = wordStatistic

# Training Phase Completed


# To print the statistics of all words in the dictionary
def printAllWordStatistic(wordDictionary) :
    for wordStatistic in wordDictionary.values() :
        wordStatistic.printWordStatistic()

# To rate a review based on the training data
def calculateAverageOfReview(review) :
    # Split on any run of whitespace (spaces, tabs, newlines)
    wordSplits = review.split()

    averageScore = 0.0
    totalCount = 0
    for word in wordSplits :
        # Only words seen during training contribute to the score
        if word in wordDictionary :
            averageScore += wordDictionary.get(word).getAverageScore()
            totalCount = totalCount + 1
    if totalCount != 0 :
        return averageScore / totalCount
    return -1


# User Review Input
# raw_input exists only in Python 2; fall back to input on Python 3
try :
    readLine = raw_input
except NameError :
    readLine = input

while True :
    print(" Enter a review, (enter empty-line to save) : ")
    multiLines = []
    while True:
        line = readLine()
        if line:
            multiLines.append(line)
        else:
            break
    inputReview = ' '.join(multiLines)

    averageScore = calculateAverageOfReview(inputReview)
    if averageScore != -1 :
        print("The review has an average value of %.5f" % averageScore)
        if averageScore >= 2.50 :
            print("Positive Review")
        else :
            print("Negative Review")
    else :
        # None of the words in the review were seen during training
        print("Unable to rate the review")

    if readLine(" Do you want to continue ? (Y/N) : ") != "Y" :
        print("Quitting the session. Good Bye !")
        break
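
One limitation of the listing above is that words are matched exactly as typed, so "time." (with the full stop) and "Loved" are different dictionary keys from "time" and "loved". A possible refinement, offered only as a sketch, is to normalize each word before it is stored or looked up. The helper name `normalize` is hypothetical and is not part of the answer's code; it would have to be applied in both the training loop and `calculateAverageOfReview` to have any effect.

```python
import string


def normalize(text):
    """Lowercase the text and strip punctuation from the ends of each word."""
    words = []
    for raw in text.split():
        word = raw.strip(string.punctuation).lower()
        if word:  # drop tokens that consisted only of punctuation
            words.append(word)
    return words


print(normalize("This is the worst movie of all time. I really hate it !!!"))
```

Only leading and trailing punctuation is removed, so contractions such as "don't" keep their apostrophe.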