This is my complete code, and I cannot work out what should be assigned to corr_value. I get the error "TypeError: float argument required, not DataFrame" at the last step. Any ideas would be very helpful.

Python
from sklearn.datasets import load_boston
import pandas as pd
boston = load_boston()
dataset = pd.DataFrame(data=boston.data, columns=boston.feature_names)
dataset['target'] = boston.target
print(dataset.head())

X = dataset.drop('target', axis = 1 )
Y = dataset["target"]

# Now the dataframe X has just the features that influence the target.
# Print the correlation matrix for dataframe X. Use the '.corr()' function to compute
# the correlation matrix.
# From the correlation matrix, note down the correlation value between 'CRIM' and
# 'PTRATIO' and assign it to the variable 'corr_value'.
corr_value = X.corr()
print(corr_value.head())

import statsmodels.api as sm
statsModel = sm.OLS(Y, X)
fittedModel = statsModel.fit()
print(fittedModel.summary())


                            OLS Regression Results
==============================================================================
Dep. Variable:                 target   R-squared:                       0.959
Model:                            OLS   Adj. R-squared:                  0.958
Method:                 Least Squares   F-statistic:                     891.1
Date:                Mon, 02 Jul 2018   Prob (F-statistic):               0.00
Time:                        22:36:03   Log-Likelihood:                -1523.8
No. Observations:                 506   AIC:                             3074.
Df Residuals:                     493   BIC:                             3129.
Df Model:                          13
Covariance Type:            nonrobust
==============================================================================
                coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
CRIM         -0.0916      0.034     -2.675      0.008      -0.159      -0.024
ZN            0.0487      0.014      3.379      0.001       0.020       0.077
INDUS        -0.0038      0.064     -0.059      0.953      -0.130       0.123
CHAS          2.8564      0.904      3.160      0.002       1.080       4.633
NOX          -2.8808      3.359     -0.858      0.392      -9.481       3.720
RM            5.9252      0.309     19.168      0.000       5.318       6.533
AGE          -0.0072      0.014     -0.523      0.601      -0.034       0.020
DIS          -0.9680      0.196     -4.947      0.000      -1.352      -0.584
RAD           0.1704      0.067      2.554      0.011       0.039       0.302
TAX          -0.0094      0.004     -2.393      0.017      -0.017      -0.002
PTRATIO      -0.3924      0.110     -3.571      0.000      -0.608      -0.177
B             0.0150      0.003      5.561      0.000       0.010       0.020
LSTAT        -0.4170      0.051     -8.214      0.000      -0.517      -0.317
==============================================================================
Omnibus:                      204.050   Durbin-Watson:                   0.999
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             1372.527
Skew:                           1.609   Prob(JB):                    9.11e-299
Kurtosis:                      10.399   Cond. No.                     8.50e+03
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 8.5e+03. This might indicate that there are
strong multicollinearity or other numerical problems.


r_squared = fittedModel.rsquared
###End code(approx 1 line)
with open("output.txt", "w") as text_file:
    text_file.write("corr= %f\n" % corr_value)
    text_file.write("rsquared= %f\n" % r_squared)



TypeError                                 Traceback (most recent call last)
<ipython-input-11-b55e673c78db> in <module>()
      3 ###End code(approx 1 line)
      4 with open("output.txt", "w") as text_file:
----> 5     text_file.write("corr= %f\n" % corr_value)
      6     text_file.write("rsquared= %f\n" % r_squared)

TypeError: float argument required, not DataFrame


What I have tried:

text_file.write("corr= %f\n"
Updated 18-Jul-18 22:34pm
Comments
Richard MacCutchan 16-Jul-18 1:30am    
The last line in your code sample is incomplete. Also you have not explained which line of code throws the error.
Member 13895806 18-Jul-18 13:15pm    
I have updated the details now. The following line throws the error:

----> 5 text_file.write("corr= %f\n" % corr_value)
Richard MacCutchan 19-Jul-18 3:21am    
I do not know what sample you copied that from, but it does not even look like a valid Python statement. If you want to use formatted output, see "7. Input and Output" in the Python 3.7.0 documentation.

1 solution

The error is quite clear:
corr_value is of type DataFrame while the "%f" format requires the argument to be of type float.
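
A minimal sketch of why the write fails, using a small made-up DataFrame in place of the Boston data (the column names `a` and `b` and their values are illustrative assumptions, not from the question). The `%f` placeholder accepts a single number, so passing the whole matrix returned by `.corr()` raises a `TypeError`, while a single cell formats without trouble. The exact wording of the message differs between Python versions.

```python
import pandas as pd

# Made-up data standing in for the feature frame; not the Boston dataset
frame = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0], "b": [2.0, 1.0, 4.0, 3.0]})
matrix = frame.corr()  # a 2x2 DataFrame, not a single number

try:
    "corr= %f\n" % matrix  # same pattern as the failing line in the question
    failed = False
except TypeError as exc:
    failed = True
    print("formatting the whole matrix failed:", exc)

# A single cell of the matrix is a scalar, so %f accepts it
cell = float(matrix.loc["a", "b"])
print("corr= %f" % cell)
```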

Even the comment in your code states that the corr() function computes a matrix and you have to use that to determine the correlation value.

I can't answer how to do that because I don't know what the functions are doing, how that matrix is organised, and what the members represent.
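
One way to get the single number the assignment asks for, sketched on a small made-up frame that only borrows the column names `CRIM`, `PTRATIO` and `RM` from the question (the values are assumptions, not the real data). pandas returns the correlation matrix as a DataFrame whose rows and columns are both labelled with the feature names, so one cell can be picked out by label with `.loc`.

```python
import pandas as pd

# Hypothetical stand-in for the feature frame X; the values are invented
X = pd.DataFrame({
    "CRIM": [0.1, 0.2, 0.4, 0.8, 1.6],
    "PTRATIO": [15.0, 16.5, 17.0, 19.0, 20.5],
    "RM": [6.5, 6.4, 6.0, 5.8, 5.5],
})

corr_matrix = X.corr()  # square DataFrame indexed by column name on both axes

# Pick the one cell at row 'CRIM', column 'PTRATIO' and make it a plain float
corr_value = float(corr_matrix.loc["CRIM", "PTRATIO"])

# The %f format now receives a float instead of a DataFrame
line = "corr= %f\n" % corr_value
print(line, end="")
```

With the real data, the same lookup on `X.corr()` should give a float that the `%f` write in the question accepts.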
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
