I have a table that looks like the following:
Row  Student  Major
1    Jenn     Math
2    Jenn     Science
3    Jenn     CS
4    Mark     English
5    Mark     History
6    Steve    Math
7    Steve    Science
8    Steve    Engineering
9    Ann      Biology
10   Ann      Chemistry
I am looking for patterns in consecutive rows where a student changed majors from Math to Science and then to a third major. I want to flag the row where that pattern completes, so the result would be:
Row  Name   Major        Flagged_Values
1    Jenn   Math
2    Jenn   Science
3    Jenn   CS           1
4    Mark   English
5    Mark   History
6    Steve  Math
7    Steve  Science
8    Steve  Engineering  1
9    Ann    Biology
10   Ann    Chemistry
In this case, students Jenn and Steve meet that criterion, so their third row gets flagged with a 1.
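To make the rule concrete, here is a minimal in-memory sketch in plain Java, separate from the database code. It assumes the rows have already been read in Row order as {name, major} pairs. The class name MajorPatternFlagger and its flag method are hypothetical names made up for this illustration, and a null entry in the pattern stands for "any major".

```java
import java.util.List;

// Hypothetical helper for illustration only; not part of the original code.
final class MajorPatternFlagger {

    /**
     * Returns one flag per row. A row gets 1 when it is the last of
     * pattern.length consecutive rows that all belong to the same student
     * and whose majors match the pattern in order. A null pattern entry
     * matches any major. Each row is a {name, major} pair.
     */
    static int[] flag(List<String[]> rows, String... pattern) {
        int[] flags = new int[rows.size()];
        if (pattern.length == 0) {
            return flags; // nothing to match
        }
        for (int end = pattern.length - 1; end < rows.size(); end++) {
            int start = end - pattern.length + 1;
            String name = rows.get(end)[0];
            boolean match = true;
            for (int k = 0; k < pattern.length && match; k++) {
                String[] row = rows.get(start + k);
                boolean sameStudent = row[0].equals(name);
                boolean majorOk = pattern[k] == null
                        || pattern[k].equalsIgnoreCase(row[1]);
                match = sameStudent && majorOk;
            }
            flags[end] = match ? 1 : 0;
        }
        return flags;
    }
}
```

Calling flag(rows, "Math", "Science", null) on the ten sample rows marks rows 3 and 8, which matches the Flagged_Values column above.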
I then flag those students who majored in Math, Science and Engineering consecutively (which would be Steve in this case) and add another column with a 1 in it. My question is: if I have two separate statements to flag these patterns, and I later want to flag rows where both criteria are met, how would I do that? Would I use the addBatch method? I have code, but it doesn't yield the desired results. I am using Java (JDBC) with an SQL library against an Oracle database.
// Attempt 1: flag Math -> Science -> any third major (does not give the desired result)
String x = "SELECT STUDENTS.ROW, STUDENTS.MAJOR, STUDENTS.NAME " +
    "CASE WHEN prior_row.NAME IS NOT NULL " +
    "AND EXISTS(SELECT 'x' FROM STUDENTS prior_row " +
    "WHERE STUDENTS.MAJOR = prior_row.MAJOR " +
    "AND STUDENTS.ROW > prior_row.ROW + 1 " +
    "SELECT STUDENTS.MAJOR, STUDENTS.ROW, STUDENTS.NAME WHERE " +
    "MAJOR < (SELECT MAJOR FROM STUDENTS WHERE MAJOR = 'MATH' " +
    "AND WHERE MAJOR > (SELECT MAJOR FROM STUDENTS WHERE MAJOR = 'SCIENCE' THEN 1 ELSE NULL END Flagged_Values";
st.addBatch(x);
// Attempt 2: flag Math -> Science -> Engineering (does not give the desired result)
String y = "SELECT STUDENTS.ROW, STUDENTS.MAJOR, STUDENTS.NAME " +
    "CASE WHEN previous.NAME IS NOT NULL " +
    "AND EXISTS(SELECT 'y' FROM STUDENTS previous " +
    "WHERE STUDENTS.MAJOR = previous.MAJOR " +
    "AND STUDENTS.ROW > previous.ROW + 1 " +
    "SELECT STUDENTS.MAJOR, STUDENTS.ROW, STUDENTS.NAME WHERE " +
    "MAJOR < (SELECT THE_OUTCOME FROM STUDENTINFO WHERE MAJOR ='Math' " +
    "AND WHERE MAJOR > (SELECT MAJOR FROM STUDENTS WHERE MAJOR = 'SCIENCE' " +
    "AND WHERE MAJOR > (SELECT MAJOR FROM STUDENTS WHERE MAJOR = 'Engineering' " +
    "THEN 1 ELSE NULL END Flag ";
st.addBatch(y);
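One way this might be expressed as a single query is with Oracle's LAG analytic function, which reads a value from an earlier row within an ordered partition. The sketch below is an illustration under stated assumptions, not a tested solution: it assumes the row-number column is called ROW_NUM (ROW is a reserved word in Oracle), that ordering each student's rows by ROW_NUM gives the consecutive sequence, and that major names match case exactly. The class FlagQuery, the aliases PREV1 and PREV2, and the BOTH_FLAGS column are made-up names. It runs the statement with executeQuery rather than addBatch, since JDBC batches are meant for updates and not for statements that return a result set.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical sketch; ROW_NUM, PREV1, PREV2 and BOTH_FLAGS are assumed names.
final class FlagQuery {

    // Innermost query: fetch the previous two majors for the same student.
    // Middle query: compute each flag on its own.
    // Outer query: combine the two flags into BOTH_FLAGS.
    static final String SQL =
        "SELECT f.*, " +
        "       CASE WHEN FLAGGED_VALUES = 1 AND FLAG = 1 THEN 1 END AS BOTH_FLAGS " +
        "FROM ( " +
        "  SELECT ROW_NUM, NAME, MAJOR, " +
        "         CASE WHEN PREV2 = 'Math' AND PREV1 = 'Science' " +
        "              THEN 1 END AS FLAGGED_VALUES, " +
        "         CASE WHEN PREV2 = 'Math' AND PREV1 = 'Science' " +
        "               AND MAJOR = 'Engineering' THEN 1 END AS FLAG " +
        "  FROM ( " +
        "    SELECT ROW_NUM, NAME, MAJOR, " +
        "           LAG(MAJOR, 1) OVER (PARTITION BY NAME ORDER BY ROW_NUM) AS PREV1, " +
        "           LAG(MAJOR, 2) OVER (PARTITION BY NAME ORDER BY ROW_NUM) AS PREV2 " +
        "    FROM STUDENTS " +
        "  ) " +
        ") f " +
        "ORDER BY ROW_NUM";

    /** Runs the query and prints one line per row; SQL NULL flags print as 0. */
    static void printFlags(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(SQL)) {
            while (rs.next()) {
                System.out.println(rs.getInt("ROW_NUM") + " "
                        + rs.getString("NAME") + " "
                        + rs.getString("MAJOR") + " "
                        + rs.getInt("FLAGGED_VALUES") + " "
                        + rs.getInt("FLAG") + " "
                        + rs.getInt("BOTH_FLAGS"));
            }
        }
    }
}
```

Computing both flags in the same inner query is what makes the combined column straightforward: the outer query only has to test the two flag columns, so no second statement or batch is needed.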
In both statements I am trying to compare consecutive rows. My second question: is this considered data mining? Any help would be appreciated.