Question

Predicting the Amount of Money Spent on Insured Customers

For this assignment, we will be analyzing insured customers' data for an insurance company.

Based on sample data consisting of the profiles of insured customers, we want to predict the dollar amount the insurance company spends on each insured customer.

Insured Customers' Data

The insured customers' data is in a CSV file. It has the following columns:

1. age

2. sex (female, male)

3. bmi

4. children

5. smoker (yes, no)

6. region (northeast, northwest, southeast, southwest)

7. expenses

The value we want to predict is expenses.

The necessary files are on OneDrive:

https://1drv.ms/u/s!Al0FoC_cg4VI3r5Y-ORAr_DjO5etwQ

https://1drv.ms/u/s!Al0FoC_cg4VI3r5X-v6AWSBI2zapLw

Explanation / Answer

I wrote R code for this problem. Before running it, open the data in Excel and copy it (including the header row) to the clipboard.

The R code is:


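# Read the data copied from Excel (the "clipboard" connection works on Windows)
b = read.table("clipboard", header = TRUE)
head(b, 10)
attach(b)  # makes the columns (age, bmi, ...) available by name
# Code the categorical columns as integers, with levels in alphabetical order;
# factor() is needed in R >= 4.0, where text columns are no longer read as factors
x1 = as.numeric(factor(sex))     # female = 1, male = 2
x2 = as.numeric(factor(smoker))  # no = 1, yes = 2
x3 = as.numeric(factor(region))  # northeast = 1, ..., southwest = 4
# Fit a multiple linear regression of expenses on all six predictors
l = lm(expenses~age+x1+bmi+children+x2+x3)
summary(l)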
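
To make the numeric coding concrete, here is a small sketch of how as.numeric maps factor levels to integers. The vectors below are made-up illustrative values, not rows from the assignment's data; the only thing they are meant to show is that R sorts factor levels alphabetically by default and numbers them from 1.

```r
# Hypothetical example values, NOT taken from the assignment's CSV file
sex    <- factor(c("male", "female", "female", "male"))
smoker <- factor(c("no", "yes", "no", "no"))
region <- factor(c("southwest", "northeast", "southeast", "northwest"))

# Levels are sorted alphabetically, and the integer codes follow that order
levels(region)                  # "northeast" "northwest" "southeast" "southwest"

sex_code    <- as.numeric(sex)     # female = 1, male = 2
smoker_code <- as.numeric(smoker)  # no = 1, yes = 2
region_code <- as.numeric(region)  # northeast = 1, ..., southwest = 4

data.frame(sex, sex_code, smoker, smoker_code, region, region_code)
```

Note that coding region as 1 to 4 treats the four regions as equally spaced steps on a scale. A common alternative, not used in this answer, is to leave such columns as factors in the formula so that lm builds indicator (dummy) variables for them.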

The output of summary(l) is:


> summary(l)

Call:
lm(formula = expenses ~ age + x1 + bmi + children + x2 + x3)

Residuals:
   Min     1Q Median     3Q    Max
-11340  -2811  -1021   1407  29740

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -35152.71    1174.34 -29.934  < 2e-16 ***
age            257.27      11.89  21.646  < 2e-16 ***
x1            -131.15     332.80  -0.394  0.69359
bmi            332.64      27.72  12.000  < 2e-16 ***
children       479.56     137.64   3.484  0.00051 ***
x2           23819.32     411.83  57.838  < 2e-16 ***
x3            -353.49     151.92  -2.327  0.02013 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 6060 on 1331 degrees of freedom
Multiple R-squared: 0.7508, Adjusted R-squared: 0.7496
F-statistic: 668.2 on 6 and 1331 DF, p-value: < 2.2e-16


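Thus the fitted regression equation is:


expenses = -35152.71 + (257.27)*age - (131.15)*sex + (332.64)*bmi + (479.56)*children + (23819.32)*smoker - (353.49)*region

Here sex, smoker, and region stand for the numeric codes x1, x2, and x3 used in the model (sex: female = 1, male = 2; smoker: no = 1, yes = 2; region: northeast = 1, northwest = 2, southeast = 3, southwest = 4).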

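Thus we can predict the expenses for a customer by substituting that customer's known values into the regression equation. The model explains about 75% of the variation in expenses (R-squared = 0.7508), and every predictor except sex (p = 0.69) is significant at the 5% level.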
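
As a worked illustration of that substitution, the sketch below plugs one customer into the fitted coefficients from the summary output, including the intercept. The customer's values are hypothetical (made up for this example), and the names coefs, customer, and predicted are just illustrative.

```r
# Coefficients copied from the summary(l) output above
coefs <- c(intercept = -35152.71, age = 257.27, sex = -131.15, bmi = 332.64,
           children = 479.56, smoker = 23819.32, region = -353.49)

# Hypothetical customer (made-up values): 40-year-old female, BMI 30,
# 2 children, non-smoker, southeast region.
# Numeric codes: female = 1, smoker "no" = 1, southeast = 3
customer <- c(intercept = 1, age = 40, sex = 1, bmi = 30,
              children = 2, smoker = 1, region = 3)

# Multiply each coefficient by the matching value and add everything up
predicted <- sum(coefs * customer)
predicted  # about 8704.11
```

If the fitted model object l is still in the session, predict(l, newdata = data.frame(age = 40, x1 = 1, bmi = 30, children = 2, x2 = 1, x3 = 3)) should give essentially the same number, up to rounding of the printed coefficients.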