I have to answer the questions in the script below based on a csv file similar t
Question
I have to answer the questions in the script below based on a CSV file similar to the picture below. I had to shorten the picture because there are over 1000 records, so you only see a few of them. I have completed the first 7 questions and need help with 8-11.
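import json
import numpy as np
import pandas as pd
resdf=pd.read_csv('res.csv',encoding='utf8')
# pd.lib.infer_dtype is gone from current pandas; pd.api.types.infer_dtype is the public equivalent
resdf.apply(lambda x: pd.api.types.infer_dtype(x.values))
rawdata=resdf.values
rawdata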
"""
question-1: Find how many different zipCode in the table
"""
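zipvar=rawdata[:,1]
zips = np.unique(zipvar)  # renamed so the built-in zip() is not shadowed
a=len(zips)  # count distinct values; count_nonzero would skip a 0 or empty entry
a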
"""
question-2: Find how many different councilDistrict in the table
"""
couvar=rawdata[:,3]
cou = np.unique(couvar)
b=len(cou)  # count distinct values; count_nonzero would skip a 0 or empty entry
b
"""
question-3: Find how many different zipCode in the table
"""
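zipvar=rawdata[:,1]
zips = np.unique(zipvar)  # renamed so the built-in zip() is not shadowed
c=len(zips)  # count distinct values; count_nonzero would skip a 0 or empty entry
c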
"""
question-4: Find how many different policeDistrict in the table
"""
polvar=rawdata[:,4]
pol = np.unique(polvar)
d=len(pol)  # count distinct values; count_nonzero would skip a 0 or empty entry
d
"""
question-5: Find out which policeDistrict has the largest number of
restaurants. If you got more than one policeDistricts, put them in a list,
e.g. ['SOUTHERN','NORTHERN']
"""
z,indices = np.unique(rawdata[:,4], return_inverse=True)
e=z[np.argmax(np.bincount(indices))]
e
"""
question-6: Find out which policeDistrict has the largest number of
restaurants. If you got more than one policeDistricts, put them in a list,
e.g. ['SOUTHERN','NORTHERN']
"""
z,indices = np.unique(rawdata[:,4], return_inverse=True)
f=z[np.argmax(np.bincount(indices))]
f
"""
question-7: Find out which zipCode has the largest number of
restaurants. If you got more than one zipCode, put them in a list,
e.g. [21215,21217]
"""
z,indices = np.unique(rawdata[:,1], return_inverse=True)
g=z[np.argmax(np.bincount(indices))]
g
"""
question-8: List all different neighborhood in the SOUTHERN policeDistrict.
Put your answer in a list, e.g. ['Cherry Hill', 'Curtis Bay', 'Federal Hill']
"""
"""
question-9: After finding out all BURGER KING restaurants. Find out in which
policeDistrict, it has more than one BURGER KING restaurants. If you got
more than one policeDistricts, put them in a list,
e.g. ['SOUTHERN','NORTHERN'].
NOTE: the name like BURGER KING # 10293 is also a BURGER KING restaurant,
"""
"""
question-10: Are there any replicated names in the location 1 column? If
the answer is yes, put them in a list,
e.g. ['Hopkins Pl Baltimore, MD','Hayward Ave Baltimore, MD']. If not, assign
empty list to it.
"""
"""
question-11: How many different zipCodes are used in the CENTRAL
policeDistrict?
"""
"""
Fill in your answer into the following structure. e.g.
suppose a=[3 4 5] and b=39
answer={'question-1':a,'question-2':b}
"""
answer={'question-1':a ,'question-2':b ,
'question-3':c ,'question-4':d ,
'question-5':e ,'question-6':f ,
'question-7':g ,'question-8':h ,
'question-9':i ,'question-10':j ,
'question-11':k }
"""
DON'T CHANGE THE FOLLOWING CODE
"""
with open('myoutput.txt','w') as outfile:
    json.dump(answer,outfile)
Explanation / Answer
I have answered questions 8 to 11 using the DataFrame (resdf) instead of the raw array.
Feel free to ask if anything is unclear.
"""
question-8: List all different neighborhood in the SOUTHERN policeDistrict.
Put your answer in a list, e.g. ['Cherry Hill', 'Curtis Bay', 'Federal Hill']
"""
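# unique() drops repeats, since the question asks for the different neighborhoods.
# Adjust the column name if your CSV header spells it "neighborhood".
h=list(resdf["neighbourhood"][resdf["policeDistrict"]=='SOUTHERN'].unique())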
"""
question-9: After finding out all BURGER KING restaurants. Find out in which
policeDistrict, it has more than one BURGER KING restaurants. If you got
more than one policeDistricts, put them in a list,
e.g. ['SOUTHERN','NORTHERN'].
NOTE: the name like BURGER KING # 10293 is also a BURGER KING restaurant,
"""
# Match every name containing BURGER KING, including variants such as "BURGER KING # 10293"
bk=resdf[resdf["name"].str.contains("BURGER KING", na=False, regex=False)]
counts=bk["policeDistrict"].value_counts()
i=list(counts[counts > 1].index)
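To see why substring matching matters for question 9, here is a small self-contained sketch. The sample rows and the helper name `districts_with_repeats` are made up for illustration (the real res.csv is not shown in the post); only the column names `name` and `policeDistrict` are taken from the code above.

```python
import pandas as pd

# Hypothetical sample rows, NOT taken from res.csv
sample = pd.DataFrame({
    "name": ["BURGER KING", "BURGER KING # 10293", "BURGER KING",
             "SUBWAY", "BURGER KING #4"],
    "policeDistrict": ["SOUTHERN", "SOUTHERN", "NORTHERN",
                       "SOUTHERN", "CENTRAL"],
})

def districts_with_repeats(df, brand):
    """Return the districts that hold more than one restaurant whose name contains `brand`."""
    # Plain substring match, so "BURGER KING # 10293" is counted as well
    mask = df["name"].str.contains(brand, na=False, regex=False)
    counts = df.loc[mask, "policeDistrict"].value_counts()
    return sorted(counts[counts > 1].index)

result = districts_with_repeats(sample, "BURGER KING")
print(result)  # ['SOUTHERN']
```

An exact comparison such as `df["name"] == "BURGER KING"` would find only one SOUTHERN row in this sample and return an empty list.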
"""
question-10: Are there any replicated names in the location 1 column? If
the answer is yes, put them in a list,
e.g. ['Hopkins Pl Baltimore, MD','Hayward Ave Baltimore, MD']. If not, assign
empty list to it.
"""
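# unique() lists each repeated location once, even if it occurs three or more times.
# Adjust the column name if your CSV header is "Location 1" (with a space).
j=list(resdf["Location1"][resdf.duplicated("Location1")].unique())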
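The behaviour of `duplicated` is easy to check on a few rows. The addresses below are invented for illustration; the sketch only shows that a value occurring three times is flagged twice, so the flagged rows still need de-duplicating before they go into the answer list.

```python
import pandas as pd

# Hypothetical addresses, NOT taken from res.csv
sample = pd.DataFrame(
    {"Location1": ["A St", "B Ave", "A St", "A St", "C Rd", "B Ave"]}
)
col = sample["Location1"]

# duplicated() marks every occurrence after the first one
repeats_raw = list(col[col.duplicated()])
# unique() keeps each repeated value once, in order of appearance
repeats_unique = list(col[col.duplicated()].unique())

print(repeats_raw)     # ['A St', 'A St', 'B Ave']
print(repeats_unique)  # ['A St', 'B Ave']
```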
"""
question-11: How many different zipCodes are used in the CENTRAL
policeDistrict?
"""
k=resdf["zipCode"][resdf["policeDistrict"]=='CENTRAL'].nunique()
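One more note on questions 5 to 7: they ask for a list when several values tie for the largest count, but `np.argmax` returns only the first maximum. A possible way to handle ties is sketched below. The helper name `most_frequent` and the sample array are hypothetical, and the function always returns a list, so take the single element out yourself if the grader expects a scalar when there is no tie.

```python
import numpy as np

def most_frequent(values):
    """Return every value that reaches the highest count, as a plain list."""
    uniq, counts = np.unique(values, return_counts=True)
    # Keep all values whose count equals the maximum, not just the first one
    winners = uniq[counts == counts.max()]
    return winners.tolist()

# Hypothetical district column with a two-way tie
districts = np.array(["SOUTHERN", "NORTHERN", "SOUTHERN", "NORTHERN", "CENTRAL"])
tied = most_frequent(districts)
print(tied)  # ['NORTHERN', 'SOUTHERN']
```

With the script above, the call would be `most_frequent(rawdata[:,4])` for the police districts, assuming that column holds values of a single type.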