You are reminded that this work is for credit towards the composite mark in CE306/CE706 and that the work you submit must therefore be your own. Any material you make use of, whether from textbooks, the Web or any other source, must be acknowledged as a comment in the program, and the extent of the reference indicated.
To properly evaluate a system, your test information needs must be germane (relevant) to the documents in the test document collection, and appropriate for predicted usage of the system. Given information needs and documents, you need to collect relevance assessments. This is a time-consuming and expensive process involving human beings (in this case you). For tiny collections, exhaustive judgments of relevance for each query and document pair can be obtained. For large modern collections, it is usual for relevance to be assessed only for a subset of the documents for each query. The most standard approach is pooling, where relevance is assessed over a subset of the collection that is formed from the top k documents returned by many different IR systems (usually the ones to be evaluated).
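The pooling step described above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function name `build_pool`, the system labels and the document IDs are all hypothetical, and each system is assumed to return a ranked list of document IDs for a query.

```python
from typing import Dict, List, Set


def build_pool(runs: Dict[str, List[str]], k: int = 10) -> Set[str]:
    """Return the union of the top-k document IDs from each system's ranking."""
    pool: Set[str] = set()
    for ranked_ids in runs.values():
        # Only the first k results of each system contribute to the pool.
        pool.update(ranked_ids[:k])
    return pool


# Hypothetical ranked results for one query from two systems.
runs = {
    "system_1": ["d3", "d7", "d1", "d9"],
    "system_2": ["d7", "d2", "d3", "d8"],
}

pool = build_pool(runs, k=3)
print(sorted(pool))  # ['d1', 'd2', 'd3', 'd7']
```

Because documents retrieved by both systems are counted once, the pool for a query is usually smaller than k times the number of systems.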
Document collection (dataset): for this assignment you will use the same dataset that you used in the first assignment (Wikipedia Movie Plots for CE306, COVID-19 Open Research Dataset for CE706).
This task comes in stages. Marks are given for each stage. The stages are as follows:
Tasks in summary: Using the dataset from assignment 1, decide on 3 pieces of information you want to learn from the dataset. Use your original IR system from assignment 1 and a modified version to retrieve the answers from the dataset. You will then create a pool and assess the relevance of the documents in the pool for each of the queries. Finally, you will compare both systems in terms of precision at 5 (P@5) and recall at 5 (R@5).
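As a rough sketch of the final comparison step, P@5 and R@5 can be computed from a system's ranked list and the set of documents judged relevant for a query. The function names and document IDs below are hypothetical. The sketch assumes the common convention that P@k divides by k even when fewer than k documents are returned, and that recall is measured against the relevant documents identified in the pool.

```python
from typing import List, Set


def precision_at_k(ranked_ids: List[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of the top-k retrieved documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / k


def recall_at_k(ranked_ids: List[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of the known relevant documents that appear in the top k."""
    if not relevant:
        return 0.0  # Assumption: define recall as 0 when nothing is relevant.
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / len(relevant)


# Hypothetical ranking and relevance judgments for one query.
ranked = ["d3", "d7", "d1", "d9", "d4", "d2"]
relevant = {"d3", "d1", "d2", "d8"}

print(precision_at_k(ranked, relevant, k=5))  # 0.4
print(recall_at_k(ranked, relevant, k=5))     # 0.5
```

Here two of the top five documents (d3 and d1) are relevant, out of four relevant documents in total, which gives P@5 = 2/5 and R@5 = 2/4.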
You will have noticed that the percentages above only add up to 90%. This is because one of the important aspects of the project is that your work should be well documented. 10% of your mark will come from this. The report should contain:
The report does not need to be long, provided that it addresses all the above points.
The backend search engine to be used is Elasticsearch. Apart from that, you are free to write additional code in any language of your choice and to employ any open-source tool that you find suitable.
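For illustration only, the sketch below builds two Elasticsearch Query DSL request bodies as plain Python dictionaries: a baseline `match` query and a modified `multi_match` query with a boosted field. The field names `plot` and `title`, the boost value and the function names are assumptions and will depend on how your own index was set up in assignment 1. This is only one possible way for two systems to differ.

```python
from typing import Any, Dict


def baseline_query(text: str, size: int = 10) -> Dict[str, Any]:
    """Request body for a simple full-text match on one (assumed) field."""
    return {
        "size": size,  # number of hits to return
        "query": {"match": {"plot": text}},  # "plot" is a hypothetical field name
    }


def modified_query(text: str, size: int = 10) -> Dict[str, Any]:
    """Request body that searches two (assumed) fields and boosts the title."""
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text,
                # "title^2" doubles the weight of matches in the title field.
                "fields": ["title^2", "plot"],
            }
        },
    }


body_1 = baseline_query("heist goes wrong")
body_2 = modified_query("heist goes wrong")
print(body_1["query"]["match"]["plot"])          # heist goes wrong
print(body_2["query"]["multi_match"]["fields"])  # ['title^2', 'plot']
```

Each dictionary can be serialised to JSON and sent as the body of a `_search` request against your index.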
You should submit:
Your work should be submitted as a single PDF file via the electronic submission system. Please check the details of the submission deadline with the CSEE School Office.
The guidelines about late assignments are explained in the students’ handbook.
CE306 or CE706 – Information Retrieval 2021
Include here the selected information needs and how they will be represented as a query.
IR systems (Task 2)
Include here the details of your two IR systems and the difference between them.
Pool method (Task 3)
For each method retrieve the top 10 documents. Therefore for each query, you will have a maximum of 20 documents.
|Query||# different documents||IDs of the documents retrieved by System 1||IDs of the documents retrieved by System 2|
To be consistent across all the queries, you need to define criteria for judging whether a document is relevant to an information need. The same criteria should be used for all the queries. Note that merely containing the same words as the query is not a valid criterion.
|Query||ID of relevant documents|
Include here the details of how you did this step, including any issues that you had and how you dealt with them. You may include screenshots to clarify.
|System 1||System 2|
Discussion: Include the discussion of your solution focusing on the comparison of both systems.