Attached Files:
Details on the assignment.pdf (35.621 KB)
Assignment1-sample solution.pdf (331.527 KB)
Step 1:
Use the 20% subset of the KDDTrain+.txt file from the NSL-KDD dataset, located at https://www.unb.ca/cic/datasets/nsl.html, for this exercise. Read the information on that page to understand the dataset, then scroll to the end of the page for the ‘download this dataset’ link.
Step 2:
The next page asks you to fill out a form; your organization is UMBC and your job title is student. Download the NSL-KDD zip file, unzip it, and locate KDDTrain+_20Percent.TXT (you can change the extension to .csv).
Step 3:
Create 2 training sets by selecting samples from this data set and evaluate them using decision trees (such as J48 in Weka). You can use random sampling or any other selective sampling technique. Compare the decision trees you find and describe any key changes between the trees.
Comment on why these changes may be occurring by looking at the class distribution in your samples or the size of your training samples.
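As a rough illustration of the sampling and class-distribution comparison described above, here is a minimal Python sketch using only the standard library. It is not part of the assignment or the sample solution. The function names are hypothetical, and it assumes the class label is the second-to-last field on each line (followed by a difficulty score), which you should verify against your copy of the file.

```python
import csv
import random
from collections import Counter, defaultdict

# Assumption: the class label is the second-to-last field of each record,
# with a difficulty score as the last field. Check this against your file.
LABEL_COL = -2


def load_rows(path):
    """Read a comma-separated file (no header row) into a list of rows."""
    with open(path, newline="") as f:
        return [row for row in csv.reader(f) if row]


def random_sample(rows, n, seed):
    """Simple random sample of n rows without replacement."""
    rng = random.Random(seed)
    return rng.sample(rows, n)


def stratified_sample(rows, fraction, seed):
    """Sample the same fraction from every class, keeping at least one row per class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[LABEL_COL]].append(row)
    sample = []
    for members in by_class.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


def class_distribution(rows):
    """Return the share of each class label in rows."""
    counts = Counter(row[LABEL_COL] for row in rows)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}
```

For example, you could call load_rows on your renamed KDDTrain+_20Percent.csv, draw one set with random_sample and another with stratified_sample, and compare the two results of class_distribution before building a tree on each set. A class that is rare in one sample and better represented in the other is one plausible reason for the two trees to differ.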
You may use alternate analysis techniques such as clustering and associations to supplement your analysis (although this is not required).
Submit a Word document of your assignment. Please make sure to include the decision tree snapshots and other relevant snapshots in your assignment. You do not need to include snapshots of every intermediate step or analysis.
You can use Weka or any other alternative data mining tool for this assignment.
Note that this is not the KDD Cup 99 dataset, but all preprocessing steps of homework 1 are still relevant for this dataset. The sample solution remains a guide for this homework.
IF YOU HAVE ANY CLARIFICATION QUESTIONS ON THIS ASSIGNMENT, REACH OUT.
Details on the assignment are in the attached PDF. Feel free to break the assignment up as follows:
Week 1 – Data Statistics/Preprocessing
Week 2 – Results/Overall Discussion
Feel free to listen to this as well.
It works through the assignment sample solution and gives you an idea of the expectations for your assignment.

