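I want to split a data table named my_data2 into two samples (a training sample and a test sample). How can I fit a logistic regression on the first part of my table (the first sample) and then run the prediction on the second part? Thank you. Here is my code: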
import numpy as np
from statsmodels.formula.api import logit
FNAME2 = "C:/Users/lenovo/Desktop/Nouveau dossier (2)/table.csv"
# my_data and index_to_use are not shown here; np.savetxt returns None, so its result is not assigned
np.savetxt(FNAME2, my_data[index_to_use], delimiter=",")
my_data2 = np.genfromtxt(FNAME2, delimiter=',')
x = my_data2[:, 1]
a = my_data2[:, 3]
# x takes the values 1 and 2
print(x)
# convert the binary series from (1, 2) to (0, 1), using the same table that a comes from
x = my_data2[:, 1] - 1
print(x)
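form = 'x ~ a'
# pass the columns by name so the formula can look up x and a
affair_model = logit(form, {'x': x, 'a': a})
affair_result = affair_model.fit()
print(affair_result.summary())
# with no arguments, predict() returns fitted probabilities for the rows used in the fit
print(affair_result.predict())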
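
To make the goal concrete, here is a minimal sketch of the split I have in mind. It is not tested on my real file. Everything in it is an assumption for illustration: the synthetic `table` stands in for my_data2 (column 1 holds the 1/2 outcome and column 3 the predictor, as above), the 70/30 split by row position is arbitrary, and the 0.5 cutoff is just a common default.

```python
import numpy as np
from statsmodels.formula.api import logit

# Synthetic stand-in for my_data2 (assumption: same column layout as above).
rng = np.random.RandomState(0)
n_rows = 200
table = np.zeros((n_rows, 4))
table[:, 3] = rng.normal(size=n_rows)
prob = 1.0 / (1.0 + np.exp(-(0.5 + 1.5 * table[:, 3])))
table[:, 1] = (rng.uniform(size=n_rows) < prob) + 1  # outcome coded 1 and 2

# Split by row position: first 70% of rows for training, the rest for testing.
n_train = (n_rows * 7) // 10
train, test = table[:n_train], table[n_train:]

# Fit the logistic regression on the training rows only (outcome recoded to 0/1).
train_data = {"x": train[:, 1] - 1, "a": train[:, 3]}
result = logit("x ~ a", train_data).fit(disp=0)

# Predict probabilities for the held-out rows, then threshold at 0.5.
test_prob = np.asarray(result.predict({"a": test[:, 3]}))
test_label = (test_prob >= 0.5).astype(int)

# Share of test rows where the predicted class matches the observed one.
accuracy = np.mean(test_label == (test[:, 1] - 1))
print(accuracy)
```

If the rows of the file are ordered in some meaningful way, shuffling the row indices (for example with `rng.permutation(n_rows)`) before slicing would probably be safer than splitting by position.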