I am having trouble running the multiple linear regression exercise below using statsmodels.

This code is given:

from sklearn.datasets import load_boston
import pandas as pd
boston = load_boston()
dataset = pd.DataFrame(data=boston.data, columns=boston.feature_names)
dataset['target'] = boston.target
print(dataset.head())

Now I have been asked to do the following:

- From the above output you can see the various attributes of the dataset.
- The 'target' column has the dependent values (housing prices), and the rest of the columns are the independent values that influence the target values.
- Let's find the relation between 'housing price' and 'average number of rooms per dwelling' using statsmodels.
- Assign the values of column "RM" (average number of rooms per dwelling) to variable X.
- Similarly, assign the values of the 'target' (housing price) column to variable Y.
- Sample code: values = data_frame['attribute_name']
- import statsmodels.api as sm
- Initialise the OLS model by passing target (Y) and attribute (X). Assign the model to variable 'statsModel'.
- Fit the model and assign it to variable 'fittedModel'. Make sure you add a constant term to input X.
- Sample code for initialization: sm.OLS(target, attribute)
- Print the summary of fittedModel using the summary() function.
- From the summary report, note down the R-squared value and assign it to variable 'r_squared' in the cell below.

Can someone please help me implement these items?

What I have tried:

i) X = dataset.drop('target', axis = 1)
ii) Y = dataset['target']
iii) X.corr()
iv) corr_value = <something>
v) import statsmodels.api as sm
I was not able to do the rest.
Updated 17-Oct-20 7:03am

Quote:
Can someone please help me implement these items?

Help you fix your work: Yes!
Do your homework: No.

You show no attempt to solve the problem yourself and you ask no question; your main effort is pasting the requirements. You just want us to do your homework.
Homework problems are simplified versions of the kind of problems you will have to solve in real life; their purpose is learning and practice.

We do not do your homework.
Homework is not set to test your skill at begging other people to do your work. It is set to make you think, and to help your teacher check your understanding of the courses you have taken and the problems you have in applying them.
Any failure on your part will help your teacher spot your weaknesses and plan remedial action.
Any failure on your part will also help you learn what works and what doesn't; this is called 'trial and error' learning.
So give it a try: reread your lessons and start working. If you are stuck on a specific problem, show your code and explain the exact problem, and we might help.
 
Comments
mmpss248 17-Oct-20 11:03am    
I am not a Python expert. I work with Oracle and am learning Python by myself. I tried my best but got stuck, so I posted here.
mmpss248 17-Oct-20 11:05am
If you want to help me, please do. There is a lot of difference between "begging" and "requesting", and I hope you understand that. If you can help, please help; otherwise there is no need for these degrading or demotivating comments. They aren't worth anything.
mmpss248 17-Oct-20 11:34am
I tried this by myself and by searching on Google, but I can't get it to work:
from sklearn.datasets import load_boston
import pandas as pd
boston = load_boston()
dataset = pd.DataFrame(data=boston.data, columns=boston.feature_names)
dataset['target'] = boston.target
print(dataset.head())

X = dataset["RM"]
Y = dataset["target"]

import statsmodels.api as sm
X = sm.add_constant(X)
statsModel = sm.OLS(Y, X)
fittedModel = statsModel.fit()

print(fittedModel.summary())

r_squared = fittedModel.rsquared
with open("output.txt", "w") as text_file:
    text_file.write("rsquared= %f\n" % r_squared)
Patrice T 17-Oct-20 12:14pm    
Use Improve question to update your question, so that everyone can see this information.
I got the solution. Thank you.

from sklearn.datasets import load_boston
import pandas as pd
boston = load_boston()
dataset = pd.DataFrame(data=boston.data, columns=boston.feature_names)
dataset['target'] = boston.target
print(dataset.head())

X = dataset["RM"]
Y = dataset["target"]

import statsmodels.api as sm
import statsmodels.formula.api as smf

X = sm.add_constant(X)
statsModel = sm.OLS(Y, X)
fittedModel = statsModel.fit()

print(fittedModel.summary())

r_squared = <type the value here>  # from the output of the previous step
Got the value - 0.48400
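
For anyone wondering what the R-squared figure in the summary measures, here is a minimal sketch that computes it by hand with NumPy only, as 1 - SS_res / SS_tot for a least-squares line with an intercept. The data is synthetic and made up for illustration (it is not the Boston data), so the printed number is not expected to match the value above.

```python
import numpy as np

# ASSUMPTION: made-up data, used only to illustrate the formula.
rng = np.random.default_rng(1)
x = rng.normal(6.3, 0.7, size=100)
y = -34.0 + 9.0 * x + rng.normal(0.0, 6.0, size=100)

# A column of ones plays the role of the constant term added by add_constant.
A = np.column_stack([np.ones_like(x), x])

# Ordinary least squares: coefficients minimising the squared residuals.
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ coef

# R-squared = 1 - (residual sum of squares / total sum of squares).
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(f"rsquared = {r_squared:.3f}")
```

In the statsmodels version, the same quantity can be read from fittedModel.rsquared instead of being copied from the printed summary.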
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
