scikit-learn provides a function named "train_test_split" that divides a dataset into two parts: a training dataset and a test dataset. Here is an example of how to use it.
Run the code in a Jupyter Notebook. Note: the example uses our own CSV file. Unless a CSV with the same name and columns is in your folder, the code will not work.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=101)
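This takes the datasets X and y and randomly assigns 40% of the rows to the test dataset (test_size=0.4) and the remaining 60% to the train dataset, splitting X and y on the same rows. Setting random_state=101 makes the split reproducible. Through tuple unpacking, the 4 resulting datasets are assigned to X_train, X_test, y_train, and y_test.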
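To check the split proportions without the CSV, here is a minimal sketch on a small synthetic DataFrame. The column names and the `_demo` variables are made up for illustration and are not part of the original dataset.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical 10-row stand-in for the CSV used in this post.
df_demo = pd.DataFrame({
    "feature_a": np.arange(10),
    "feature_b": np.arange(10) * 2,
    "target": np.arange(10) * 3,
})
X_demo = df_demo[["feature_a", "feature_b"]]
y_demo = df_demo["target"]

# test_size=0.4 sends 40% of the rows to the test set and 60% to the train set.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.4, random_state=101
)

print(len(X_tr), len(X_te))  # 6 train rows, 4 test rows

# X and y are split on the same rows, so their indexes stay aligned.
print(list(X_tr.index) == list(y_tr.index))
```

With 10 rows, the train set gets 6 rows and the test set gets 4, and each row lands in exactly one of the two sets.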
Create Model
Linear Regression Model
# Define the features (X) and the target (y); run this before calling train_test_split(X, y, ...)
X = df[['Avg. Area Income', 'Avg. Area House Age', 'Avg. Area Number of Rooms', 'Avg. Area Number of Bedrooms', 'Area Population']]
y = df['Price']
from sklearn.linear_model import LinearRegression
lm = LinearRegression()
Train Model
lm.fit(X_train, y_train)
print(lm.intercept_)
lm.coef_
X_train.columns
cdf = pd.DataFrame(lm.coef_, X.columns, columns=['Coeff'])
cdf
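Each value in the coefficient table is the estimated change in the predicted target for a one-unit increase in that feature, with the other features held fixed. The sketch below illustrates this on synthetic data where the true coefficients are known. The data, column names, and `_demo` variables are assumptions for illustration, not the housing CSV from this post.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical noise-free data: target = 3*feature_a + 2*feature_b + 5
rng = np.random.default_rng(0)
X_demo = pd.DataFrame({
    "feature_a": rng.normal(size=100),
    "feature_b": rng.normal(size=100),
})
y_demo = 3 * X_demo["feature_a"] + 2 * X_demo["feature_b"] + 5

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.4, random_state=101
)

# Fit on the training rows only.
lm_demo = LinearRegression()
lm_demo.fit(X_tr, y_tr)

# One row per feature, holding its fitted coefficient.
cdf_demo = pd.DataFrame(lm_demo.coef_, X_demo.columns, columns=["Coeff"])

print(lm_demo.intercept_)  # close to 5
print(cdf_demo)            # close to 3 for feature_a and 2 for feature_b
```

Because the synthetic target has no noise, the fitted intercept and coefficients match the values used to build it, up to floating-point error. On real data such as the housing CSV, the coefficients are estimates only.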