Insert answers to ALL questions in THIS DOCUMENT. When finished, upload the edited version of THIS DOCUMENT to iLearn, with the following naming convention: yourLastname_yourFirstname_ITS836Assignment2.docx.
Answer ALL questions. If you see “[show code]”, provide the R source code you used to get to this point (i.e., include all commands since the previous question) and any output. Type the code in; do not insert a screenshot. If your output is a plot, use RStudio’s plot Export option to copy it to the clipboard and paste it into this document.
Part A – Association Rules
The data for Part A of this assignment was generated using the method by Agrawal and Srikant (random.patterns) to simulate transactions (random.transactions) that contain correlated items. There are 10,000 transactions, drawn from 100 available items. The average transaction length is 10 items. Note: You will need to load the arules R package to complete this assignment.
Tips: Read the arules library documentation: https://cran.r-project.org/web/packages/arules/arules.pdf. Review the examples for: read.transactions(), itemFrequencyPlot(), and itemFrequency(). Refer to sections 5.5.3 and 7.2.5 in your textbook.
- Import the AssociationRules.csv transaction data file and create an item frequency plot and an item frequency table.
- [show code, including any libraries] Which item was the most frequent item bought in the store?
- [show code] How many items were bought in the largest transaction?
Mine the association rules with a minimum support of 1% and a minimum confidence of 0%.
- [show code] How many rules appear in the data?
- [show code] How many rules are observed when the minimum confidence is 50%?
- [no code] Explain how the specified confidence impacts the number of rules.
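Tip: The following is a minimal base R sketch of how support and confidence are defined and how a minimum confidence threshold filters candidate rules. It uses a small made-up set of transactions, not AssociationRules.csv, and the helper names (support, confidence, count_rules) are hypothetical illustrations rather than arules functions. Your answers should still come from the arules package.

```r
# Toy transactions (hypothetical data, not from AssociationRules.csv)
transactions <- list(
  c("bread", "milk"),
  c("bread", "butter"),
  c("bread", "milk", "butter"),
  c("milk"),
  c("bread", "milk")
)

# Support: fraction of transactions that contain every item in `items`
support <- function(items) {
  mean(sapply(transactions, function(t) all(items %in% t)))
}

# Confidence of the rule lhs => rhs: support(lhs and rhs) / support(lhs)
confidence <- function(lhs, rhs) {
  support(c(lhs, rhs)) / support(lhs)
}

# A few hand-picked candidate rules (hypothetical)
rules <- data.frame(
  lhs = c("bread", "milk", "butter", "bread"),
  rhs = c("milk", "bread", "bread", "butter"),
  stringsAsFactors = FALSE
)
rules$support <- mapply(function(l, r) support(c(l, r)), rules$lhs, rules$rhs)
rules$confidence <- mapply(confidence, rules$lhs, rules$rhs)

# Number of candidate rules that meet a minimum confidence
count_rules <- function(min_conf) sum(rules$confidence >= min_conf)

print(rules)
print(sapply(c(0, 0.6, 0.9), count_rules))
```

In this toy example all four candidate rules pass a minimum confidence of 0, three pass at 0.6, and one passes at 0.9. Each rule's support is unaffected by the confidence threshold.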
Part B – Naïve Bayes
In Part B of this assignment you will train a Naïve Bayes classifier on categorical data and predict individuals’ incomes.
- Import the nbtrain.csv file.
- Use the first 9010 records as training data and the remaining 1000 records as testing data.
- Construct the Naïve Bayes classifier from the training data, according to the formula “income ~ age + sex + educ”. To do this, use the “naiveBayes” function from the “e1071” package. Provide the model’s a priori and conditional probabilities.
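Tip: The following is a minimal base R sketch of what the a priori and conditional probabilities of a Naïve Bayes model represent, and how they combine into a prediction. The data frame is made up (it is not nbtrain.csv and omits age for brevity), no smoothing is applied, and the variable names are hypothetical. For the assignment itself, use the naiveBayes function from e1071 as instructed.

```r
# Toy training data (hypothetical values, not taken from nbtrain.csv)
train <- data.frame(
  income = c("low", "low", "high", "high", "low", "high"),
  sex    = c("F", "M", "M", "F", "F", "M"),
  educ   = c("hs", "hs", "college", "college", "college", "hs"),
  stringsAsFactors = TRUE
)

# A priori probabilities: P(income)
prior <- prop.table(table(train$income))

# Conditional probabilities: P(sex | income) and P(educ | income).
# margin = 1 normalizes within each income class, so each row sums to 1.
cond_sex  <- prop.table(table(train$income, train$sex), margin = 1)
cond_educ <- prop.table(table(train$income, train$educ), margin = 1)

# Posterior for a new record with sex = "M" and educ = "college":
# multiply the prior by each conditional, then normalize across classes.
score <- prior * cond_sex[, "M"] * cond_educ[, "college"]
posterior <- score / sum(score)

print(prior)
print(cond_sex)
print(cond_educ)
print(posterior)
```

The class with the largest posterior is the predicted income level for that record.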