Competencies
In this project, you will demonstrate your mastery of the following competencies:
• Apply statistical techniques to address research problems
• Perform regression analysis to address an authentic problem
Overview
The purpose of this project is to have you complete all of the steps of a real-world linear regression research project: developing a research question, completing a comprehensive statistical analysis, and summarizing your research conclusions.
Scenario
You have been hired by the D. M. Pan National Real Estate Company to develop a model to predict median housing prices for homes sold in 2019. The CEO of D. M. Pan wants to use this information to help the company's real estate agents better determine how to use square footage as a benchmark for listing prices on homes. Your task is to provide a report predicting median housing prices based on square footage.
To complete this task, use the provided real estate data set for all U.S. home sales, as well as the national descriptive statistics and graphs.
Directions
Using the Project One Template located in the What to Submit section, generate a report, including your tables and graphs, to determine whether the square footage of a house is a good indicator of what the listing price should be. Reference the National Statistics and Graphs document for national comparisons and the Real Estate County Data spreadsheet for your statistical analysis (both are found in the Supporting Materials section).
Note: Present your data in clearly labeled tables and graphs.
Specifically, include the following in your report:
Introduction
A. Describe the report: Give a brief description of the purpose of your report.
   a. Define the question your report is trying to answer.
   b. Explain when using linear regression is most appropriate.
      i. When using linear regression, what would you expect the scatterplot to look like?
   c. Explain the difference between response and predictor variables in a linear regression to justify the selection of variables.
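To make the roles of the two variables concrete, the sketch below fits a least-squares line with square footage as the predictor (x) and listing price as the response (y). The function name `fit_line` and the five data points are hypothetical and chosen only for illustration; they are not taken from the Real Estate County Data spreadsheet.

```python
from statistics import mean


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-deviations and sum of squared deviations of the predictor.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical values, for illustration only.
square_feet = [1000, 1500, 2000, 2500, 3000]               # predictor (x)
listing_price = [150000, 200000, 250000, 300000, 350000]   # response (y)

slope, intercept = fit_line(square_feet, listing_price)
print(f"predicted price = {intercept:.0f} + {slope:.0f} * square feet")
```

The slope is the estimated change in the response for each additional unit of the predictor, which is why the variable you want to predict goes on the y side.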
Data Collection
A. Sampling the data: Select a random sample of 50 counties.
   a. Identify your response and predictor variables.
B. Scatterplot: Create a scatterplot of your response and predictor variables to ensure they are appropriate for developing a linear model.
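One way to draw the random sample is sketched below, assuming the county rows have already been loaded into a Python list. The helper `sample_counties`, the seed value, and the generated `counties` rows are hypothetical stand-ins; the actual column layout of the Real Estate County Data spreadsheet may differ.

```python
import random


def sample_counties(rows, n=50, seed=None):
    """Return n rows chosen at random without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    return rng.sample(rows, n)


# Hypothetical stand-in for the spreadsheet: (county, listing price, square feet).
counties = [(f"County {i}", 200000 + 500 * i, 1500 + 2 * i) for i in range(1, 301)]

sample = sample_counties(counties, n=50, seed=2019)
listing_price = [row[1] for row in sample]  # response variable (y-axis of the scatterplot)
square_feet = [row[2] for row in sample]    # predictor variable (x-axis of the scatterplot)
print(len(sample), "counties sampled")
```

Sampling without replacement ensures that no county appears twice, and recording the seed lets a reader regenerate the same 50 counties.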
Data Analysis
A. Histogram: For your two variables, create histograms.
B. Summary statistics: For your two variables, create a table to show the mean, median, and standard deviation.
C. Interpret the graphs and statistics:
   a. Based on your graphs and sample statistics, interpret the center, spread, shape, and any unusual characteristics (outliers, gaps, etc.) for the two variables.
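The summary statistics requested above can be computed with Python's standard `statistics` module, as in the minimal sketch below. The `summarize` helper and both data lists are hypothetical, and the sketch assumes the sample standard deviation (n - 1 denominator) is the one wanted, since the data are a sample of counties.

```python
from statistics import mean, median, stdev


def summarize(values):
    """Return the mean, median, and sample standard deviation of a list of numbers."""
    return {"mean": mean(values), "median": median(values), "std_dev": stdev(values)}


# Hypothetical values, for illustration only.
variables = {
    "square feet": [1200, 1500, 1800, 2100, 4000],
    "listing price": [180000, 210000, 240000, 275000, 520000],
}

print(f"{'variable':<15}{'mean':>12}{'median':>12}{'std dev':>12}")
for name, values in variables.items():
    stats = summarize(values)
    print(f"{name:<15}{stats['mean']:>12.1f}{stats['median']:>12.1f}{stats['std_dev']:>12.1f}")
```

In these made-up lists the mean is larger than the median for both variables, the pattern that a single unusually large value (a possible outlier) or a right-skewed histogram tends to produce.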