Question

In this assignment we are going to work with a larger collection of tweets (10,000) that are available here:
http://rasinsrv07.cstcis.cti.depaul.edu/CSC455/Assignment5.txt

A. Using Python, identify the top 5 most frequent terms (words separated by ' ') that are
at least 4 characters long (i.e. ignore articles such as "a" or "the" and any other short
terms) in the text of the tweets. It is up to you whether you prefer to use the contents of
the loaded database (reading tweets from SQLite, which contains fewer tweets) or the
contents of the original Assignment5.txt file (reading tweets directly from the file).

Explanation / Answer

#!/usr/bin/python

def printMaximum():
    words = []   # list to hold unique words of at least 4 characters
    counts = []  # counts[k] is the number of times words[k] occurs
    with open('Assignment5.txt', 'r') as f:
        for line in f:
            for word in line.split():
                if len(word) >= 4:
                    if word not in words:
                        # First occurrence: record the word with a count of 1.
                        words.append(word)
                        counts.append(1)
                    else:
                        # Repeat occurrence: increment the matching count.
                        counts[words.index(word)] += 1

    # Print the top 5 most frequent words (fewer if the file has fewer words).
    for i in range(min(5, len(words))):
        index = 0
        highest = 0
        for j in range(len(counts)):
            if counts[j] > highest:
                highest = counts[j]
                index = j

        print(words[index])
        counts[index] = 0  # zero the count so the next pass finds the next word

printMaximum()
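
The list-based lookup above rescans the word list for every word it reads, which can get slow on 10,000 tweets. As a possible alternative, here is a minimal sketch that does the same counting with collections.Counter from the standard library. The names top_terms and print_top_terms are made up for this illustration, and the sketch assumes, like the answer above, that every whitespace-separated token on a line counts as a term and that Assignment5.txt has been downloaded to the working directory.

```python
from collections import Counter


def top_terms(lines, n=5, min_length=4):
    """Return the n most frequent terms that have at least min_length characters."""
    counts = Counter()
    for line in lines:
        # Split on whitespace and keep only the terms that are long enough.
        counts.update(word for word in line.split() if len(word) >= min_length)
    # most_common returns (term, count) pairs, highest count first.
    return counts.most_common(n)


def print_top_terms(path):
    """Print the top terms found in the file at path (hypothetical helper)."""
    with open(path, 'r') as f:
        for term, count in top_terms(f):
            print(term, count)
```

Calling print_top_terms('Assignment5.txt') would print each of the five terms with its count. Because the counts live in a dictionary-like object, each word is looked up in roughly constant time instead of by scanning a list.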
