This is my complete code. I cannot work out what corr_value should be: the last step fails with "TypeError: float argument required, not DataFrame". Any ideas would be very helpful.
from sklearn.datasets import load_boston
import pandas as pd
boston = load_boston()
dataset = pd.DataFrame(data=boston.data, columns=boston.feature_names)
dataset['target'] = boston.target
print(dataset.head())
X = dataset.drop('target', axis = 1 )
Y = dataset["target"]
corr_value = X.corr()
print(corr_value.head())
import statsmodels.api as sm
statsModel = sm.OLS(Y, X)
fittedModel = statsModel.fit()
print(fittedModel.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:                 target   R-squared:                       0.959
Model:                            OLS   Adj. R-squared:                  0.958
Method:                 Least Squares   F-statistic:                     891.1
Date:                Mon, 02 Jul 2018   Prob (F-statistic):               0.00
Time:                        22:36:03   Log-Likelihood:                -1523.8
No. Observations:                 506   AIC:                             3074.
Df Residuals:                     493   BIC:                             3129.
Df Model:                          13
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
CRIM          -0.0916      0.034     -2.675      0.008      -0.159      -0.024
ZN             0.0487      0.014      3.379      0.001       0.020       0.077
INDUS         -0.0038      0.064     -0.059      0.953      -0.130       0.123
CHAS           2.8564      0.904      3.160      0.002       1.080       4.633
NOX           -2.8808      3.359     -0.858      0.392      -9.481       3.720
RM             5.9252      0.309     19.168      0.000       5.318       6.533
AGE           -0.0072      0.014     -0.523      0.601      -0.034       0.020
DIS           -0.9680      0.196     -4.947      0.000      -1.352      -0.584
RAD            0.1704      0.067      2.554      0.011       0.039       0.302
TAX           -0.0094      0.004     -2.393      0.017      -0.017      -0.002
PTRATIO       -0.3924      0.110     -3.571      0.000      -0.608      -0.177
B              0.0150      0.003      5.561      0.000       0.010       0.020
LSTAT         -0.4170      0.051     -8.214      0.000      -0.517      -0.317
==============================================================================
Omnibus:                      204.050   Durbin-Watson:                   0.999
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             1372.527
Skew:                           1.609   Prob(JB):                    9.11e-299
Kurtosis:                      10.399   Cond. No.                     8.50e+03
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 8.5e+03. This might indicate that there are
strong multicollinearity or other numerical problems.
r_squared = fittedModel.rsquared
###End code(approx 1 line)
with open("output.txt", "w") as text_file:
    text_file.write("corr= %f\n" % corr_value)
    text_file.write("rsquared= %f\n" % r_squared)
TypeError                                 Traceback (most recent call last)
<ipython-input-11-b55e673c78db> in <module>()
      3 ###End code(approx 1 line)
      4 with open("output.txt", "w") as text_file:
----> 5     text_file.write("corr= %f\n" % corr_value)
      6     text_file.write("rsquared= %f\n" % r_squared)

TypeError: float argument required, not DataFrame
What I have tried:
text_file.write("corr= %f\n"
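
For reference, a minimal sketch of why the write fails and one possible way around it. X.corr() returns a whole correlation matrix (a DataFrame), while "%f" can format only a single number. The sketch assumes the exercise expects one pairwise coefficient. The column pair ("CRIM", "PTRATIO") is an arbitrary choice for illustration, the small frame is made-up data standing in for X, and corr_matrix, line and matrix_text are names introduced here.

```python
import pandas as pd

# Made-up stand-in for X; the values are illustrative only, not Boston data.
X = pd.DataFrame({
    "CRIM": [0.1, 0.2, 0.4, 0.8],
    "PTRATIO": [15.0, 16.5, 17.0, 21.0],
    "RM": [6.5, 6.1, 6.3, 5.5],
})

# DataFrame.corr() returns a square DataFrame: one row and column per feature.
corr_matrix = X.corr()

# Assumption: a single coefficient is wanted. Selecting one cell by row and
# column label yields a scalar, which "%f" can format.
corr_value = corr_matrix.loc["CRIM", "PTRATIO"]
line = "corr= %f\n" % corr_value

# Alternative: if the whole matrix should be written, format it as text with "%s".
matrix_text = "corr=\n%s\n" % corr_matrix.to_string()

print(line, end="")
print(matrix_text, end="")
```

Which cell to pick (or whether the full matrix is wanted) depends on what the exercise asks for; r_squared already works with "%f" because fittedModel.rsquared is a single float.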