Data Employed in the Statistical Analysis of the First Empirical Study
File: CLActivity.csv
CSV-file with information related to the execution of CL sessions.
- On-line visualization: CLActivity.csv
| Column | Description |
|---|---|
| UserID | Integer as user identification to differentiate students on the empirical study |
| NroUSP | Integer as user identification to differentiate students on the school |
| Type | Type of CL session in which the student with UserID participated in the empirical study |
| CLGroup | Name of the CL group of which the student with UserID is a member |
| CLRole | The CL role assigned for the student with UserID |
| PlayerRole | The player role assigned for the student with UserID in ont-gamified CL sessions |
| ParticipationLevel | Participation level of students in the CL sessions |
| NroNecessaryInteractions | Number of necessary interactions performed by the student with UserID |
| NroDesiredInteractions | Number of desired interactions performed by the student with UserID |
| NroTotalInteractions | Number of necessary and desired interactions performed by the student with UserID |
| NroSolutionReviewed (Apprentice) | Number of solutions sent by the apprentice student with UserID and reviewed by a master student |
| NroSolutionReviewed (Master) | Number of solutions reviewed by the master student with UserID |
| NroSolutionWithoutReviewed (Apprentice) | Number of solutions sent by the apprentice student with UserID that were not reviewed by a master student |
| NroSolutionWithoutReviewed (Master) | Number of solutions sent to the master student with UserID that are still pending review |
| (n) … | Number of necessary interactions with the identifier (n) carried out by the student with UserID |
| (x) … | Number of desired interactions with the identifier (x) carried out by the student with UserID |
The possible values for the column ParticipationLevel are:
- none is the participation level in which the students did not interact with other members in the CL sessions.
- incomplete is the participation level in which the students interacted in the CL sessions, but they did not complete all the necessary interactions.
- semicomplete is the participation level in which the students interacted in the CL sessions performing all the necessary interactions, but did not respond to all the requests made by other members of the CL group.
- complete is the participation level in which the students interacted in CL sessions performing all the necessary interactions, and they answered all the requests made by other members of the CL group.
File: SignedUpParticipants.csv
CSV-file with the list of all students enrolled as participants.
- On-line visualization: SignedUpParticipants.csv
- R script used to generate this file: 00-processing-mysql.R (more info)
| Column | Description |
|---|---|
| UserID | Integer as user identification to differentiate students on the empirical study |
| NroUSP | Integer as user identification to differentiate students on the school |
| Type | Type of CL session in which the student with UserID participated in the empirical study |
| CLGroup | Name of the CL group of which the student with UserID is a member |
| CLRole | The CL role assigned for the student with UserID |
| PlayerRole | The player role assigned for the student with UserID in ont-gamified CL sessions |
File: EffectiveParticipants.csv
CSV-file with the list of students with effective participation.
- On-line visualization: EffectiveParticipants.csv
- R script used to generate this file: 00-processing-mysql.R (more info)
| Column | Description |
|---|---|
| UserID | Integer as user identification to differentiate students on the empirical study |
| NroUSP | Integer as user identification to differentiate students on the school |
| Type | Type of CL session in which the student with UserID participated in the empirical study |
| CLGroup | Name of the CL group of which the student with UserID is a member |
| CLRole | The CL role assigned for the student with UserID |
| PlayerRole | The player role assigned for the student with UserID in ont-gamified CL sessions |
effective: A student with effective participation is a student who interacted at least once with another member of the CL group by following the necessary interactions indicated in the CSCL script. In other words, it is a student who had a complete, semicomplete or incomplete participation level in the CL session.
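The definition of effective participation can be sketched as a simple filter over the ParticipationLevel column. This is a minimal illustration, not necessarily how 00-processing-mysql.R derives the file: the function names and the sample rows are hypothetical, and it assumes the levels are stored as the lowercase strings listed for CLActivity.csv.

```python
# Participation levels that count as effective, per the definition above.
EFFECTIVE_LEVELS = {"complete", "semicomplete", "incomplete"}


def is_effective(participation_level):
    """Return True when the given level counts as effective participation."""
    return participation_level.strip().lower() in EFFECTIVE_LEVELS


def effective_participants(rows):
    """Keep only the rows whose ParticipationLevel is effective."""
    return [row for row in rows if is_effective(row["ParticipationLevel"])]


# Hypothetical rows shaped like CLActivity.csv records (values invented).
rows = [
    {"UserID": "1", "ParticipationLevel": "complete"},
    {"UserID": "2", "ParticipationLevel": "none"},
    {"UserID": "3", "ParticipationLevel": "incomplete"},
]
effective_ids = [row["UserID"] for row in effective_participants(rows)]
```

Rows read with `csv.DictReader` have the same dictionary shape, so the filter could be applied to them directly.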
Data Related to the Students’ Motivation
File: SourceIMIWithCareless.csv
CSV-file with responses of the IMI questionnaire. These responses include careless responses.
- On-line visualization: SourceIMIWithCareless.csv
- R script used to generate this file: 00-processing-mysql.R (more info)
| Column | Description |
|---|---|
| UserID | Integer as user identification to differentiate students on the empirical study |
| NroUSP | Integer as user identification to differentiate students on the school |
| Gamified | Type of CL session in which the student with UserID participated in the empirical study |
| Group | Name of the CL group of which the student with UserID is a member |
| CLRole | The CL role assigned for the student with UserID |
| ItemX | Value from a 7-point Likert scale for the item with identification ItemX |
File: SourceIMI.csv
CSV-file with responses of the IMI questionnaire, with careless responses removed from the data through the process detailed in the file: outliers-motivation-surveys.pdf
- On-line visualization: SourceIMI.csv
- R script used to generate this file: 01-removing-careless-motivation.R (more info)
| Column | Description |
|---|---|
| UserID | Integer as user identification to differentiate students on the empirical study |
| NroUSP | Integer as user identification to differentiate students on the school |
| Type | Type of CL session in which the student with UserID participated in the empirical study |
| CLGroup | Name of the CL group of which the student with UserID is a member |
| CLRole | The CL role assigned for the student with UserID |
| PlayerRole | The player role assigned for the student with UserID in ont-gamified CL sessions |
| ItemX | Value from a 7 point Likert scale for the item with identification ItemX |
File: IMI.csv
CSV-file with the responses validated through the factor analysis and reliability test detailed in the file: validation-motivation-surveys.pdf
- On-line visualization: IMI.csv
- R script used to generate this file: 00-reliability-analysis-IMI.R (more info)
| Column | Description |
|---|---|
| UserID | Integer as user identification to differentiate students on the empirical study |
| NroUSP | Integer as user identification to differentiate students on the school |
| Type | Type of CL session in which the student with UserID participated in the empirical study |
| CLGroup | Name of the CL group of which the student with UserID is a member |
| CLRole | The CL role assigned for the student with UserID |
| PlayerRole | The player role assigned for the student with UserID in ont-gamified CL sessions |
| ItemX | Value from a 7 point Likert scale for the item with identification ItemX |
| Interest/Enjoyment | Mean of the values in the items related to Interest/Enjoyment. This value is calculated as IE = (Item22IE + Item09IE + Item12IE + Item24IE + Item21IE + Item01IE)/6 |
| Perceived Choice | Mean of the values in the items related to Perceived Choice. This value is calculated as PC = (40-(Item17PC + Item15PC + Item06PC + Item02PC + Item08PC))/5 |
| Pressure/Tension | Mean of the values in the items related to Pressure/Tension. This value is calculated as PT = (Item16PT + Item14PT + Item18PT + 8-Item11PT)/4 |
| Effort/Importance | Mean of the values in the items related to Effort/Importance. This value is calculated as EI = (Item03EI + 16-(Item13EI + Item07EI))/3 |
| Intrinsic Motivation | Mean of the values in the subscales related to Intrinsic Motivation. This value is calculated as IM = (IE + PC + EI + 8-PT)/4 |
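The subscale formulas in the table can be sketched in Python as follows. This is only an illustration of the arithmetic, not the code of 00-reliability-analysis-IMI.R: the function name `imi_scores` is hypothetical, and it assumes each record is available as a mapping from the item names used in the formulas to numeric responses on the 7-point scale. The constants 8, 16 and 40 reverse-score one, two and five items respectively (8 minus the response for each reversed item).

```python
def imi_scores(items):
    """Compute the IMI subscale means and the overall score for one record."""
    ie = (items["Item22IE"] + items["Item09IE"] + items["Item12IE"]
          + items["Item24IE"] + items["Item21IE"] + items["Item01IE"]) / 6
    # All five Perceived Choice items are reverse-scored: 5 * 8 = 40.
    pc = (40 - (items["Item17PC"] + items["Item15PC"] + items["Item06PC"]
                + items["Item02PC"] + items["Item08PC"])) / 5
    # Only Item11PT is reverse-scored in Pressure/Tension.
    pt = (items["Item16PT"] + items["Item14PT"] + items["Item18PT"]
          + 8 - items["Item11PT"]) / 4
    # Item13EI and Item07EI are reverse-scored: 2 * 8 = 16.
    ei = (items["Item03EI"] + 16 - (items["Item13EI"] + items["Item07EI"])) / 3
    # Pressure/Tension enters the overall score reversed.
    im = (ie + pc + ei + 8 - pt) / 4
    return {"IE": ie, "PC": pc, "PT": pt, "EI": ei, "IM": im}


ITEM_NAMES = [
    "Item22IE", "Item09IE", "Item12IE", "Item24IE", "Item21IE", "Item01IE",
    "Item17PC", "Item15PC", "Item06PC", "Item02PC", "Item08PC",
    "Item16PT", "Item14PT", "Item18PT", "Item11PT",
    "Item03EI", "Item13EI", "Item07EI",
]

# Invented records: every item answered with the midpoint (4) or the maximum (7).
midpoint = imi_scores({name: 4 for name in ITEM_NAMES})
maximum = imi_scores({name: 7 for name in ITEM_NAMES})
```

A record answered with the scale midpoint on every item yields 4 on every subscale, which is a quick sanity check that the reverse-scoring constants are consistent.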
File: InterestEnjoyment.csv
CSV-file with the IRT-based estimates of Interest/Enjoyment. These estimates were calculated through the building process of RSM-based instruments detailed in the file: irt-instruments.pdf
- On-line visualization: InterestEnjoyment.csv
- R script used to generate this file: 00-rsm-motivation-measurement-building.R (more info)
| Column | Description |
|---|---|
| UserID | Integer as user identification to differentiate students on the empirical study |
| NroUSP | Integer as user identification to differentiate students on the school |
| Type | Type of CL session in which the student with UserID participated in the empirical study |
| CLGroup | Name of the CL group of which the student with UserID is a member |
| CLRole | The CL role assigned for the student with UserID |
| PlayerRole | The player role assigned for the student with UserID in ont-gamified CL sessions |
| Score | Score calculated as the sum of the items in each record |
| theta | Estimate of the latent trait in logit scale |
| error | Standard error for the estimate of the latent trait theta |
| Outfit | Outlier-sensitive fit statistic based on the chi-square test. Values greater than 2 distort or degrade the measurement system. |
| Infit | Inlier-pattern-sensitive fit statistic based on the chi-square test. Values greater than 2 distort or degrade the measurement system. |
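As a small illustration of how the fit columns might be used, the sketch below flags the records whose Outfit or Infit exceeds 2. The function name `misfitting_users` and the sample rows are hypothetical, and the values are parsed from strings on the assumption that the rows come from a CSV reader such as `csv.DictReader`.

```python
def misfitting_users(rows, threshold=2.0):
    """Return the UserIDs whose Outfit or Infit is greater than the threshold."""
    return [
        row["UserID"]
        for row in rows
        if float(row["Outfit"]) > threshold or float(row["Infit"]) > threshold
    ]


# Invented rows shaped like the records of the IRT-based estimate files.
rows = [
    {"UserID": "10", "Outfit": "0.95", "Infit": "1.02"},
    {"UserID": "11", "Outfit": "2.40", "Infit": "1.10"},
    {"UserID": "12", "Outfit": "1.30", "Infit": "2.05"},
]
flagged = misfitting_users(rows)
```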
File: PerceivedChoice.csv
CSV-file with the IRT-based estimates of Perceived Choice. These estimates were calculated through the building process of RSM-based instruments detailed in the file: irt-instruments.pdf
- On-line visualization: PerceivedChoice.csv
- R script used to generate this file: 00-rsm-motivation-measurement-building.R (more info)
| Column | Description |
|---|---|
| UserID | Integer as user identification to differentiate students on the empirical study |
| NroUSP | Integer as user identification to differentiate students on the school |
| Type | Type of CL session in which the student with UserID participated in the empirical study |
| CLGroup | Name of the CL group of which the student with UserID is a member |
| CLRole | The CL role assigned for the student with UserID |
| PlayerRole | The player role assigned for the student with UserID in ont-gamified CL sessions |
| Score | Score calculated as the sum of the items in each record |
| theta | Estimate of the latent trait in logit scale |
| error | Standard error for the estimate of the latent trait theta |
| Outfit | Outlier-sensitive fit statistic based on the chi-square test. Values greater than 2 distort or degrade the measurement system. |
| Infit | Inlier-pattern-sensitive fit statistic based on the chi-square test. Values greater than 2 distort or degrade the measurement system. |
File: PressureTension.csv
CSV-file with the IRT-based estimates of Pressure/Tension. These estimates were calculated through the building process of RSM-based instruments detailed in the file: irt-instruments.pdf
- On-line visualization: PressureTension.csv
- R script used to generate this file: 00-rsm-motivation-measurement-building.R (more info)
| Column | Description |
|---|---|
| UserID | Integer as user identification to differentiate students on the empirical study |
| NroUSP | Integer as user identification to differentiate students on the school |
| Type | Type of CL session in which the student with UserID participated in the empirical study |
| CLGroup | Name of the CL group of which the student with UserID is a member |
| CLRole | The CL role assigned for the student with UserID |
| PlayerRole | The player role assigned for the student with UserID in ont-gamified CL sessions |
| Score | Score calculated as the sum of the items in each record |
| theta | Estimate of the latent trait in logit scale |
| error | Standard error for the estimate of the latent trait theta |
| Outfit | Outlier-sensitive fit statistic based on the chi-square test. Values greater than 2 distort or degrade the measurement system. |
| Infit | Inlier-pattern-sensitive fit statistic based on the chi-square test. Values greater than 2 distort or degrade the measurement system. |
File: EffortImportance.csv
CSV-file with the IRT-based estimates of Effort/Importance. These estimates were calculated through the building process of RSM-based instruments detailed in the file: irt-instruments.pdf
- On-line visualization: EffortImportance.csv
- R script used to generate this file: 00-rsm-motivation-measurement-building.R (more info)
| Column | Description |
|---|---|
| UserID | Integer as user identification to differentiate students on the empirical study |
| NroUSP | Integer as user identification to differentiate students on the school |
| Type | Type of CL session in which the student with UserID participated in the empirical study |
| CLGroup | Name of the CL group of which the student with UserID is a member |
| CLRole | The CL role assigned for the student with UserID |
| PlayerRole | The player role assigned for the student with UserID in ont-gamified CL sessions |
| Score | Score calculated as the sum of the items in each record |
| theta | Estimate of the latent trait in logit scale |
| error | Standard error for the estimate of the latent trait theta |
| Outfit | Outlier-sensitive fit statistic based on the chi-square test. Values greater than 2 distort or degrade the measurement system. |
| Infit | Inlier-pattern-sensitive fit statistic based on the chi-square test. Values greater than 2 distort or degrade the measurement system. |
File: IntrinsicMotivation.csv
CSV-file with the IRT-based estimates of Intrinsic Motivation. These estimates were calculated through the building process of RSM-based instruments detailed in the file: irt-instruments.pdf
- On-line visualization: IntrinsicMotivation.csv
- R script used to generate this file: 00-rsm-motivation-measurement-building.R (more info)
| Column | Description |
|---|---|
| UserID | Integer as user identification to differentiate students on the empirical study |
| NroUSP | Integer as user identification to differentiate students on the school |
| Type | Type of CL session in which the student with UserID participated in the empirical study |
| CLGroup | Name of the CL group of which the student with UserID is a member |
| CLRole | The CL role assigned for the student with UserID |
| PlayerRole | The player role assigned for the student with UserID in ont-gamified CL sessions |
| Score | Score calculated as the sum of the items in each record |
| theta | Estimate of the latent trait in logit scale |
| error | Standard error for the estimate of the latent trait theta |
| Outfit | Outlier-sensitive fit statistic based on the chi-square test. Values greater than 2 distort or degrade the measurement system. |
| Infit | Inlier-pattern-sensitive fit statistic based on the chi-square test. Values greater than 2 distort or degrade the measurement system. |
Data Related to the Learning Outcomes
File: PreAMCscr.csv
CSV-file with information from the programming problem tasks solved by the students throughout the pretest phase, and scored using a rule defined by the teacher of the course.
- On-line visualization: PreAMCscr.csv
- R script used to generate this file: 00-processing-amc.R (more info)
| Column | Description |
|---|---|
| UserID | Integer as user identification to differentiate students on the empirical studies |
| NroUSP | Integer as user identification to differentiate students on the school |
| Type | Type of CL session in which the student with UserID participated in the empirical study |
| CLGroup | Name of the CL group of which the student with UserID is a member |
| CLRole | The CL role assigned for the student with UserID |
| PlayerRole | The player role assigned for the student with UserID in ont-gamified CL sessions |
| QuX | Teacher-based score for the AMC question with identification QuX |
| score | Score calculated as the sum of scores obtained in all the questions |
Columns with name of QuX can have values Qu = {Re, Un, Ap, An, Ev} and X = {1, 2, 3} to represent questions classified according to the Bloom and SOLO taxonomies. Qu = Re: Remember level, Qu = Un: Understand level, Qu = Ap: Apply level, Qu = An: Analysing level, Qu = Ev: Evaluation level, X = 1: unistructural level, X = 2: multistructural level, and X = 3: relational level. (more information of these taxonomies in https://dl.acm.org/citation.cfm?id=1379265 and in https://doi.org/10.1145/2676723.2677311)
Teacher-based scoring rules for the columns QuX
score(QuX) = (NBC/NB - NMC/NM) * w(QuX)
- NB: is the number of correct responses to the question.
- NBC: is the count of correct responses which have been checked.
- NM: is the number of wrong responses to the question.
- NMC: is the count of wrong responses which have been checked.
- w(QuX): is the weight for the question (QuX), whose value is defined by the teacher according to the level of difficulty he or she inferred. These weights are the following:
  - w(Re1) = 0.6 for the question with remember-unistructural level
  - w(Re2) = 0.7 for the question with remember-multistructural level
  - w(Un1) = 0.8 for the question with understand-unistructural level
  - w(Un2) = 0.9 for the question with understand-multistructural level
  - w(Ap1) = 1.0 for the question with apply-unistructural level
  - w(Ap2) = 1.0 for the question with apply-multistructural level
  - w(Ap3) = 1.1 for the question with apply-relational level
  - w(An3) = 1.2 for the question with analyse-relational level
  - w(Ev1) = 1.3 for the question with evaluate-unistructural level
  - w(Ev2) = 1.4 for the question with evaluate-multistructural level
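The teacher-based rule above can be sketched in Python as follows. The function name `teacher_score` is hypothetical and this is not the code of 00-processing-amc.R. The source does not say how a question with no correct or no wrong responses (NB = 0 or NM = 0) is handled, so rejecting that case with an error is an assumption of this sketch.

```python
# Weights w(QuX) as listed above for the pretest questions.
WEIGHTS = {
    "Re1": 0.6, "Re2": 0.7, "Un1": 0.8, "Un2": 0.9, "Ap1": 1.0,
    "Ap2": 1.0, "Ap3": 1.1, "An3": 1.2, "Ev1": 1.3, "Ev2": 1.4,
}


def teacher_score(question, nb, nbc, nm, nmc):
    """score(QuX) = (NBC/NB - NMC/NM) * w(QuX).

    nb  -- number of correct responses in the question
    nbc -- number of correct responses that were checked
    nm  -- number of wrong responses in the question
    nmc -- number of wrong responses that were checked
    """
    if nb <= 0 or nm <= 0:
        # Assumption: the rule is undefined here, so the sketch refuses it.
        raise ValueError("NB and NM must both be positive")
    return (nbc / nb - nmc / nm) * WEIGHTS[question]


# Invented answers: all correct responses checked and no wrong ones,
# then half of each checked, which cancels out to zero.
full_credit = teacher_score("Ap1", nb=2, nbc=2, nm=3, nmc=0)
no_credit = teacher_score("Ap1", nb=2, nbc=1, nm=2, nmc=1)
```

Because the wrong-response ratio is subtracted, a student who checks every option scores zero, and a student who checks only wrong options receives a negative score.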
File: PosAMCscr.csv
CSV-file with information from the programming problem tasks solved by the students throughout the posttest phase, and scored using a rule defined by the teacher of the course.
- On-line visualization: PosAMCscr.csv
- R script used to generate this file: 00-processing-amc.R (more info)
| Column | Description |
|---|---|
| UserID | Integer as user identification to differentiate students on the empirical studies |
| NroUSP | Integer as user identification to differentiate students on the school |
| Type | Type of CL session in which the student with UserID participated in the empirical study |
| CLGroup | Name of the CL group of which the student with UserID is a member |
| CLRole | The CL role assigned for the student with UserID |
| PlayerRole | The player role assigned for the student with UserID in ont-gamified CL sessions |
| QuX | Teacher-based score for the AMC question with identification QuX |
| score | Score calculated as the sum of scores obtained in all the questions |
Columns with name of QuX can have values Qu = {Re, Un, Ap, An, Ev} and X = {A, B, C} to represent questions classified according to the Bloom and SOLO taxonomies. Qu = Re: Remember level, Qu = Un: Understand level, Qu = Ap: Apply level, Qu = An: Analysing level, Qu = Ev: Evaluation level, X = A: unistructural level, X = B: multistructural level, and X = C: relational level. (more information of these taxonomies in https://dl.acm.org/citation.cfm?id=1379265 and in https://doi.org/10.1145/2676723.2677311)
Teacher-based scoring rules for the columns QuX
score(QuX) = (NBC/NB - NMC/NM) * w(QuX)
- NB: is the number of correct responses to the question.
- NBC: is the count of correct responses which have been checked.
- NM: is the number of wrong responses to the question.
- NMC: is the count of wrong responses which have been checked.
- w(QuX): is the weight for the question (QuX), whose value is defined by the teacher according to the level of difficulty he or she inferred. These weights are the following:
  - w(ReA) = 0.6 for the question with remember-unistructural level
  - w(ReB) = 0.7 for the question with remember-multistructural level
  - w(UnA) = 0.8 for the question with understand-unistructural level
  - w(UnB) = 0.9 for the question with understand-multistructural level
  - w(ApA) = 1.0 for the question with apply-unistructural level
  - w(ApB) = 1.0 for the question with apply-multistructural level
  - w(ApC) = 1.1 for the question with apply-relational level
  - w(AnC) = 1.2 for the question with analyse-relational level
  - w(EvA) = 1.3 for the question with evaluate-unistructural level
  - w(EvB) = 1.4 for the question with evaluate-multistructural level
File: PreAMC.csv
CSV-file with information from the programming problem tasks solved by the students throughout the pretest phase, and scored using the GPCM-based rule detailed in the file: irt-instruments.pdf (page 342).
- On-line visualization: PreAMC.csv
- R script used to generate this file: 00-processing-amc.R (more info)
| Column | Description |
|---|---|
| UserID | Integer as user identification to differentiate students on the empirical studies |
| NroUSP | Integer as user identification to differentiate students on the school |
| Type | Type of CL session in which the student with UserID participated in the empirical study |
| CLGroup | Name of the CL group of which the student with UserID is a member |
| CLRole | The CL role assigned for the student with UserID |
| PlayerRole | The player role assigned for the student with UserID in ont-gamified CL sessions |
| QuX | GPCM-based score for the AMC question with identification QuX |
Columns with name of QuX can have values Qu = {Re, Un, Ap, An, Ev} and X = {1, 2, 3} to represent questions classified according to the Bloom and SOLO taxonomies. Qu = Re: Remember level, Qu = Un: Understand level, Qu = Ap: Apply level, Qu = An: Analysing level, Qu = Ev: Evaluation level, X = 1: unistructural level, X = 2: multistructural level, and X = 3: relational level. (more information of these taxonomies in https://dl.acm.org/citation.cfm?id=1379265 and in https://doi.org/10.1145/2676723.2677311)
GPCM-based scoring rules for the columns QuX
Let NBC be the number of correct responses that have been checked in the question QuX, NM be the number of wrong responses, and NMC be the number of wrong responses that have been checked in the question QuX. Then the GPCM-scoring rule for the n-th question in the multiple-choice questionnaire is given by:
score(n) = 0 ; if NBC = 0
score(n) = (NBC * (NM+1)) - NMC; otherwise
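The GPCM-based rule above can be sketched as a small Python function. The name `gpcm_score` is hypothetical and the sample counts are invented. The sketch only restates the two cases of the rule and is not the code of 00-processing-amc.R.

```python
def gpcm_score(nbc, nm, nmc):
    """GPCM-based score for one multiple-choice question.

    nbc -- number of correct responses that were checked
    nm  -- number of wrong responses in the question
    nmc -- number of wrong responses that were checked
    """
    if nbc == 0:
        return 0
    return nbc * (nm + 1) - nmc


# Invented question with 3 wrong responses available.
nothing_correct = gpcm_score(nbc=0, nm=3, nmc=2)
two_correct_one_wrong = gpcm_score(nbc=2, nm=3, nmc=1)
one_correct_all_wrong = gpcm_score(nbc=1, nm=3, nmc=3)
```

Since NMC cannot exceed NM, multiplying by NM + 1 means that one more correctly checked response always outweighs any number of wrongly checked ones, so the scores are ordered first by NBC and then by NMC.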
File: PosAMC.csv
CSV-file with information from the programming problem tasks solved by the students throughout the posttest phase, and scored using the GPCM-based rule detailed in the file: irt-instruments.pdf (page 342).
- On-line visualization: PosAMC.csv
- R script used to generate this file: 00-processing-amc.R (more info)
| Column | Description |
|---|---|
| UserID | Integer as user identification to differentiate students on the empirical studies |
| NroUSP | Integer as user identification to differentiate students on the school |
| Type | Type of CL session in which the student with UserID participated in the empirical study |
| CLGroup | Name of the CL group of which the student with UserID is a member |
| CLRole | The CL role assigned for the student with UserID |
| PlayerRole | The player role assigned for the student with UserID in ont-gamified CL sessions |
| QuX | GPCM-based score for the AMC question with identification QuX |
Columns with name of QuX can have values Qu = {Re, Un, Ap, An, Ev} and X = {A, B, C} to represent questions classified according to the Bloom and SOLO taxonomies. Qu = Re: Remember level, Qu = Un: Understand level, Qu = Ap: Apply level, Qu = An: Analysing level, Qu = Ev: Evaluation level, X = A: unistructural level, X = B: multistructural level, and X = C: relational level. (more information of these taxonomies in https://dl.acm.org/citation.cfm?id=1379265 and in https://doi.org/10.1145/2676723.2677311)
GPCM-based scoring rules for the columns QuX
Let NBC be the number of correct responses that have been checked in the question QuX, NM be the number of wrong responses, and NMC be the number of wrong responses that have been checked in the question QuX. Then the GPCM-scoring rule for the n-th question in the multiple-choice questionnaire is given by:
score(n) = 0 ; if NBC = 0
score(n) = (NBC * (NM+1)) - NMC; otherwise
File: PreGuttmanVPL.csv
CSV-file with information from the programming problem tasks solved by the students throughout the pretest phase, and scored with Guttman-based rules detailed in the file: irt-instruments.pdf (pages 342-343).
- On-line visualization: PreGuttmanVPL.csv
- R script used to generate this file: 00-processing-vpl.R (more info)
| Column | Description |
|---|---|
| UserID | Integer as user identification to differentiate students on the empirical studies |
| PXs0 | Guttman-based score for the programming problem task with identification PX and rule s0. |
| PXs1 | Guttman-based score for the programming problem task with identification PX and rule s1. |
| PXs2 | Guttman-based score for the programming problem task with identification PX and rule s2. |
| PXs3 | Guttman-based score for the programming problem task with identification PX and rule s3. |
Guttman-structure scoring rules for the columns PXsY
rule s0: score(Q)
- 0: when the solution is incorrect (Q = 0), and the solving time is irrelevant
- 1: when the solution is correct (Q = 1), and the solving time is irrelevant
rule s1: score(Q x T50)
- (0,x) = 0: when the solution is incorrect (Q = 0) and the solving time is irrelevant
- (1,0) = 1: when the solution is correct (Q = 1) and the solving time is greater than the median (t > T50)
- (1,1) = 2: when the solution is correct (Q = 1) and the solving time is less than the median (t < T50)
rule s2: score(Q x T66 x T33)
- (0,x,x) = 0: when the solution is incorrect (Q = 0) and the solving time is irrelevant
- (1,0,x) = 1: when the solution is correct (Q = 1) and the solving time is greater than the 66th percentile (t > T66)
- (1,1,0) = 2: when the solution is correct (Q = 1) and the solving time is greater than the 33rd percentile (t > T33)
- (1,1,1) = 3: when the solution is correct (Q = 1) and the solving time is less than the 33rd percentile (t < T33)
rule s3: score(Q x T75 x T50 x T25)
- (0,x,x,x) = 0: when the solution is incorrect (Q = 0) and the solving time is irrelevant
- (1,0,x,x) = 1: when the solution is correct (Q = 1) and the solving time is greater than the 75th percentile (t > T75)
- (1,1,0,x) = 2: when the solution is correct (Q = 1) and the solving time is greater than the median (t > T50)
- (1,1,1,0) = 3: when the solution is correct (Q = 1) and the solving time is greater than the 25th percentile (t > T25)
- (1,1,1,1) = 4: when the solution is correct (Q = 1) and the solving time is less than the 25th percentile (t < T25)
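The four Guttman-structure rules share one pattern: an incorrect solution scores 0, and a correct solution scores 1 plus the number of time thresholds it beats. The sketch below illustrates that pattern. The function name `guttman_score` is hypothetical and the percentile values are taken as inputs, because the text does not state over which set of solving times they are computed. Treating a time exactly equal to a threshold as not beating it is also an assumption.

```python
def guttman_score(correct, solving_time, thresholds=()):
    """Guttman-structure score for one programming problem task.

    correct      -- True when the solution is correct (Q = 1)
    solving_time -- time t taken to solve the task
    thresholds   -- percentile times for the rule:
                    () for s0, (T50,) for s1,
                    (T66, T33) for s2, (T75, T50, T25) for s3
    """
    if not correct:
        return 0
    # The thresholds are nested, so counting how many are beaten
    # reproduces the (1,1,...,0,x) patterns of the rules.
    return 1 + sum(1 for limit in thresholds if solving_time < limit)


# Invented percentile times (in seconds) for rule s3: T75, T50, T25.
s3 = (60, 40, 20)
scores = [
    guttman_score(False, 10, s3),  # incorrect, time irrelevant
    guttman_score(True, 70, s3),   # slower than T75
    guttman_score(True, 50, s3),   # between T50 and T75
    guttman_score(True, 30, s3),   # between T25 and T50
    guttman_score(True, 10, s3),   # faster than T25
]
```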
File: PosGuttmanVPL.csv
CSV-file with information from the programming problem tasks solved by the students throughout the posttest phase, and scored with Guttman-based rules detailed in the file: irt-instruments.pdf (pages 342-343).
- On-line visualization: PosGuttmanVPL.csv
- R script used to generate this file: 00-processing-vpl.R (more info)
| Column | Description |
|---|---|
| UserID | Integer as user identification to differentiate students on the empirical studies |
| PXs0 | Guttman-based score for the programming problem task with identification PX and rule s0. |
| PXs1 | Guttman-based score for the programming problem task with identification PX and rule s1. |
| PXs2 | Guttman-based score for the programming problem task with identification PX and rule s2. |
| PXs3 | Guttman-based score for the programming problem task with identification PX and rule s3. |
Guttman-structure scoring rules for the columns PXsY
rule s0: score(Q)
- 0: when the solution is incorrect (Q = 0), and the solving time is irrelevant
- 1: when the solution is correct (Q = 1), and the solving time is irrelevant
rule s1: score(Q x T50)
- (0,x) = 0: when the solution is incorrect (Q = 0) and the solving time is irrelevant
- (1,0) = 1: when the solution is correct (Q = 1) and the solving time is greater than the median (t > T50)
- (1,1) = 2: when the solution is correct (Q = 1) and the solving time is less than the median (t < T50)
rule s2: score(Q x T66 x T33)
- (0,x,x) = 0: when the solution is incorrect (Q = 0) and the solving time is irrelevant
- (1,0,x) = 1: when the solution is correct (Q = 1) and the solving time is greater than the 66th percentile (t > T66)
- (1,1,0) = 2: when the solution is correct (Q = 1) and the solving time is greater than the 33rd percentile (t > T33)
- (1,1,1) = 3: when the solution is correct (Q = 1) and the solving time is less than the 33rd percentile (t < T33)
rule s3: score(Q x T75 x T50 x T25)
- (0,x,x,x) = 0: when the solution is incorrect (Q = 0) and the solving time is irrelevant
- (1,0,x,x) = 1: when the solution is correct (Q = 1) and the solving time is greater than the 75th percentile (t > T75)
- (1,1,0,x) = 2: when the solution is correct (Q = 1) and the solving time is greater than the median (t > T50)
- (1,1,1,0) = 3: when the solution is correct (Q = 1) and the solving time is greater than the 25th percentile (t > T25)
- (1,1,1,1) = 4: when the solution is correct (Q = 1) and the solving time is less than the 25th percentile (t < T25)
File: GainSkillsKnowledge.csv
CSV-file with the IRT-based estimates of Skill/Knowledge gains. These estimates were calculated through the stacking process based on the Generalized Partial Credit Model (GPCM), and detailed in the file: irt-instruments.pdf
- On-line visualization: GainSkillsKnowledge.csv
- R script used to generate this file: 00-gpcm-learning-outcomes-measurement-building.R (more info)
| Column | Description |
|---|---|
| UserID | Integer as user identification to differentiate students on the empirical studies |
| QuX | GPCM-based score for the AMC question with identification QuX |
| PXsY | Guttman-based score for the programming problem task with identification PX and rule sY. |
| pre.PersonScores | Score calculated as the sum of items used during the pretest phase |
| pos.PersonScores | Score calculated as the sum of items used during the posttest phase |
| pre.theta | Estimate of the latent trait in logit scale for the pretest phase |
| pos.theta | Estimate of the latent trait in logit scale for the posttest phase |
| pre.sd.error | Standard error for the estimate of the latent trait theta calculated in the pretest phase |
| pos.sd.error | Standard error for the estimate of the latent trait theta calculated in the posttest phase |
| gain.theta | Estimate of the difference of latent traits (pos.theta - pre.theta) in logit scale |
Columns with name of QuX can have values Qu = {Re, Un, Ap, An, Ev} and X = {1, 2, 3} or X = {A, B, C} to represent questions classified according to the Bloom and SOLO taxonomies. Qu = Re: Remember level, Qu = Un: Understand level, Qu = Ap: Apply level, Qu = An: Analysing level, Qu = Ev: Evaluation level, X = 1 or X = A: unistructural level, X = 2 or X = B: multistructural level, and X = 3 or X = C: relational level. (more information of these taxonomies in https://dl.acm.org/citation.cfm?id=1379265 and in https://doi.org/10.1145/2676723.2677311)
GPCM-based scoring rules for the columns QuX
Let NBC be the number of correct responses that have been checked in the question QuX, NM be the number of wrong responses, and NMC be the number of wrong responses that have been checked in the question QuX. Then the GPCM-scoring rule for the n-th question in the multiple-choice questionnaire is given by:
score(n) = 0 ; if NBC = 0
score(n) = (NBC * (NM+1)) - NMC; otherwise
Guttman-structure scoring rules for the columns PXsY
rule s0: score(Q)
- 0: when the solution is incorrect (Q = 0), and the solving time is irrelevant
- 1: when the solution is correct (Q = 1), and the solving time is irrelevant
rule s1: score(Q x T50)
- (0,x) = 0: when the solution is incorrect (Q = 0) and the solving time is irrelevant
- (1,0) = 1: when the solution is correct (Q = 1) and the solving time is greater than the median (t > T50)
- (1,1) = 2: when the solution is correct (Q = 1) and the solving time is less than the median (t < T50)
rule s2: score(Q x T66 x T33)
- (0,x,x) = 0: when the solution is incorrect (Q = 0) and the solving time is irrelevant
- (1,0,x) = 1: when the solution is correct (Q = 1) and the solving time is greater than the 66th percentile (t > T66)
- (1,1,0) = 2: when the solution is correct (Q = 1) and the solving time is greater than the 33rd percentile (t > T33)
- (1,1,1) = 3: when the solution is correct (Q = 1) and the solving time is less than the 33rd percentile (t < T33)
rule s3: score(Q x T75 x T50 x T25)
- (0,x,x,x) = 0: when the solution is incorrect (Q = 0) and the solving time is irrelevant
- (1,0,x,x) = 1: when the solution is correct (Q = 1) and the solving time is greater than the 75th percentile (t > T75)
- (1,1,0,x) = 2: when the solution is correct (Q = 1) and the solving time is greater than the median (t > T50)
- (1,1,1,0) = 3: when the solution is correct (Q = 1) and the solving time is greater than the 25th percentile (t > T25)
- (1,1,1,1) = 4: when the solution is correct (Q = 1) and the solving time is less than the 25th percentile (t < T25)