Question

A database contains 100 documents out of which only 10 documents are relevant for a given query. Two search engines, A and B, report the following documents.
System A: RRNNNRNNNR RNNRNNNNRN NNRNNNNNRN and so on
System B: RRRNNNNNNN NRRNNNNNNR NNNNNRNNNN and so on
where R represents a relevant document and N represents a non-relevant document. A search engine expert wants to estimate the accuracy of both systems by computing the following, so that he can decide how many documents to display.
(a) Using the PR-Curve info, compute the Precision at 40% Recall.
(b) If the expert decides to display only the first 15 documents for both of these search engines, calculate Precision, Recall, and F-measure.

Explanation / Answer

a) See below

(i) Search engine A reaches 40% recall (4 of the 10 relevant documents retrieved) once it has returned RRNNNRNNNR. At that point it has returned 10 documents, of which 4 are relevant, so precision is 4/10 = 40%.

(ii) Search engine B reaches 40% recall (4 of the 10 relevant documents retrieved) once it has returned RRRNNNNNNN NR. At that point it has returned 12 documents, of which 4 are relevant, so precision is 4/12 = 33.3%.
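
The counting in part (a) can be checked with a short script. This is only a sketch: the function name precision_when_found is made up for illustration, and the strings hold just the first 30 results of each system, since the question ends each listing with "and so on". It reports precision at the rank where the 4th relevant document appears, with no interpolation.

```python
# First 30 results of each system as given in the question ("and so on" is omitted).
SYSTEM_A = "RRNNNRNNNR RNNRNNNNRN NNRNNNNNRN".replace(" ", "")
SYSTEM_B = "RRRNNNNNNN NRRNNNNNNR NNNNNRNNNN".replace(" ", "")


def precision_when_found(ranking, relevant_needed):
    """Return precision at the rank where the relevant_needed-th 'R' appears.

    Hypothetical helper for illustration. Returns None if the listed
    results never contain that many relevant documents.
    """
    hits = 0
    for rank, label in enumerate(ranking, start=1):
        if label == "R":
            hits += 1
            if hits == relevant_needed:
                return hits / rank
    return None


# 40% recall with 10 relevant documents in total means 4 relevant documents found.
print(f"A: {precision_when_found(SYSTEM_A, 4):.3f}")  # A: 0.400
print(f"B: {precision_when_found(SYSTEM_B, 4):.3f}")  # B: 0.333
```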

b) When only 15 documents are displayed

(i) Search engine A returns RRNNNRNNNR RNNRN, which contains 6 relevant documents. Recall is 6/10 = 60% and precision is 6/15 = 40%. The corresponding F-measure is (2*precision*recall)/(precision+recall) = (2*(6/15)*(6/10))/((6/15)+(6/10)) = 0.48.

(ii) Search engine B returns RRRNNNNNNN NRRNN, which contains 5 relevant documents. Recall is 5/10 = 50% and precision is 5/15 = 33.3%. The corresponding F-measure is (2*precision*recall)/(precision+recall) = (2*(5/15)*(5/10))/((5/15)+(5/10)) = 0.4.
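
The part (b) arithmetic can be reproduced the same way. Again a sketch under stated assumptions: precision_recall_f is a hypothetical helper name, the cutoff k = 15 and the total of 10 relevant documents come from the question, and only the first 15 listed results of each system are used.

```python
# First 15 results of each system, taken from the question.
TOP15_A = "RRNNNRNNNR RNNRN".replace(" ", "")
TOP15_B = "RRRNNNNNNN NRRNN".replace(" ", "")


def precision_recall_f(ranking, k, total_relevant):
    """Return (precision, recall, F-measure) for the first k results.

    Hypothetical helper for illustration. F-measure is the harmonic
    mean of precision and recall, defined as 0 when nothing relevant
    is retrieved.
    """
    hits = ranking[:k].count("R")
    precision = hits / k
    recall = hits / total_relevant
    if hits == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


for name, ranking in (("A", TOP15_A), ("B", TOP15_B)):
    p, r, f = precision_recall_f(ranking, k=15, total_relevant=10)
    print(f"{name}: precision={p:.3f} recall={r:.3f} F={f:.3f}")
# A: precision=0.400 recall=0.600 F=0.480
# B: precision=0.333 recall=0.500 F=0.400
```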