MMPSS24 in Python statsmodels - OLS linear regression

Question

I am having trouble running the linear regression described below using statsmodels.

The following code is given:

from sklearn.datasets import load_boston
import pandas as pd
boston = load_boston()
dataset = pd.DataFrame(data=boston.data, columns=boston.feature_names)
dataset['target'] = boston.target
print(dataset.head())

I have been asked to do the following:

- From the above output you can see the various attributes of the dataset.
- The 'target' column holds the dependent values (housing prices) and the rest of the columns are the independent values that influence the target values.
- Let's find the relation between 'housing price' and 'average number of rooms per dwelling' using statsmodels.
- Assign the values of column "RM" (average number of rooms per dwelling) to variable X.
- Similarly, assign the values of the 'target' (housing price) column to variable Y.
- Sample code: values = data_frame['attribute_name']
- import statsmodels.api as sm
- Initialise the OLS model by passing target (Y) and attribute (X). Assign the model to variable 'statsModel'.
- Fit the model and assign it to variable 'fittedModel'; make sure you add a constant term to input X.
- Sample code for initialization: sm.OLS(target, attribute)
- Print the summary of fittedModel using the summary() function.
- From the summary report, note down the R-squared value and assign it to variable 'r_squared' in the below cell.

Can someone please help me to implement these items?

What I have tried:

i) X = dataset.drop('target', axis = 1)
ii) Y = dataset['target']
iii) X.corr()
iv) corr_value = <something>
v) import statsmodels.api as sm
I am not able to do the remaining steps.

Posted 16-Oct-20 12:06pm

mmpss248

Updated 17-Oct-20 7:03am

2 solutions

Solution 1

Quote:
Can someone please help me to implement these items?

Help you to fix your work: Yes!
Doing your homework: No.

You show no attempt to solve the problem yourself and you ask no specific question; your main effort is pasting the requirements. You just want us to do your homework.
Homework problems are simplified versions of the kind of problems you will have to solve in real life; their purpose is learning and practice.

We do not do your homework.
Homework is not set to test your skills at begging other people to do your work. It is set to make you think, and to help your teacher check your understanding of the courses you have taken and the problems you have applying them.
Any failure on your part will help your teacher spot your weaknesses and set remedial actions.
Any failure on your part will also help you learn what works and what doesn't; this is called 'trial and error' learning.
So give it a try, reread your lessons and start working. If you are stuck on a specific problem, show your code and explain the exact problem, and we might help.

Posted 16-Oct-20 12:13pm

Patrice T

Comments

mmpss248 17-Oct-20 11:03am

I am not a Python expert. I work with Oracle and am learning Python by myself. I tried my best but got stuck, so I posted here.

mmpss248 17-Oct-20 11:05am

If you want to help me, please do. There is a lot of difference between begging and requesting, and I hope you understand that. If you can help, please help; otherwise there is no need for these degrading or demotivating comments. They are not worth anything.

mmpss248 17-Oct-20 11:34am

I tried this by working on it myself and searching on Google, but I can't get it to work:
from sklearn.datasets import load_boston
import pandas as pd
boston = load_boston()
dataset = pd.DataFrame(data=boston.data, columns=boston.feature_names)
dataset['target'] = boston.target
print(dataset.head())

# Predictor (average rooms per dwelling) and dependent variable (price)
X = dataset["RM"]
Y = dataset["target"]

import statsmodels.api as sm
X = sm.add_constant(X)  # add the intercept term
statsModel = sm.OLS(Y, X)
fittedModel = statsModel.fit()

print(fittedModel.summary())

r_squared = fittedModel.rsquared
with open("output.txt", "w") as text_file:
    text_file.write("rsquared= %f\n" % r_squared)

Patrice T 17-Oct-20 12:14pm

Use Improve question to update your question, so that everyone can see this information.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

mmpss248 · Accepted Answer · 2020-10-17T07:03:00

I got the solution. Thank you.

from sklearn.datasets import load_boston
import pandas as pd
boston = load_boston()
dataset = pd.DataFrame(data=boston.data, columns=boston.feature_names)
dataset['target'] = boston.target
print(dataset.head())

# Predictor (average rooms per dwelling) and dependent variable (price)
X = dataset["RM"]
Y = dataset["target"]

import statsmodels.api as sm
import statsmodels.formula.api as smf

X = sm.add_constant(X)  # add the intercept term
statsModel = sm.OLS(Y, X)
fittedModel = statsModel.fit()

print(fittedModel.summary())

r_squared = 0.48400  # typed in from the R-squared line of the summary printed in the previous step

The value I got was 0.48400.
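
For reference, the R-squared figure reported in the summary is the share of the variance in the target that the fitted line explains, that is 1 - SS_res / SS_tot. The following plain-Python sketch computes it for a single predictor. The helper name `simple_ols_r_squared` and the sample numbers are hypothetical, chosen only for illustration; they are not the Boston data and are not part of statsmodels.

```python
def simple_ols_r_squared(x, y):
    """Fit y = intercept + slope * x by least squares and return
    (intercept, slope, r_squared). Illustrative helper, not a library API."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Closed-form least-squares estimates for one predictor.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual and total sums of squares.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)

    return intercept, slope, 1.0 - ss_res / ss_tot


# Made-up sample: rooms per dwelling versus price.
rooms = [5.0, 6.0, 6.5, 7.0, 8.0]
price = [15.0, 21.0, 24.0, 30.0, 38.0]

intercept, slope, r_squared = simple_ols_r_squared(rooms, price)
print("intercept:", intercept, "slope:", slope, "R-squared:", r_squared)
```

A value near 1 means the line accounts for most of the variation in the target, while a value around 0.48 means roughly half of it remains unexplained by the number of rooms alone.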