Bayesian Generative Modelling of Student Results in Course Networks

DOI:

https://doi.org/10.18608/jla.2023.7957

Keywords:

statistical modeling, learning analytics, Bayesian generative modeling, curriculum evaluation, data and tools report

Abstract

We present an innovative modelling technique that simultaneously constrains student performance, course difficulty, and the sensitivity with which a course can differentiate between students by means of grades. Grade lists are the only required input. We construct networks of courses in which each edge represents the population of students who took both of the courses it connects. Using idealized experiments and two real-world data sets, we show that the model, although simple in its set-up, constrains the properties of courses very well, provided the data set meets some basic requirements: (1) significant overlap in student populations, and thus information exchange through the network; (2) non-zero variance in the grades for a given course; and (3) some correlation between grades for different courses. The model can then be used to evaluate a curriculum, a course, or even subsets of students for a wide variety of applications, ranging from program accreditation to exam fraud detection. We publicly release the code with examples that fully recreate the results presented here.
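
The course network described in the abstract can be illustrated with a minimal sketch. This is not the authors' released code: the record layout `(student_id, course_id, grade)`, the toy data, and the function names `build_course_network` and `grade_variance` are all hypothetical. The sketch only builds the edges (shared student populations) and checks requirements (1) and (2); it does not implement the Bayesian model itself or the correlation check in requirement (3).

```python
from collections import defaultdict
from itertools import combinations
from statistics import pvariance

# Hypothetical grade list: (student_id, course_id, grade) records.
grades = [
    ("s1", "math", 7.0), ("s1", "physics", 6.5),
    ("s2", "math", 5.5), ("s2", "physics", 6.0), ("s2", "chem", 8.0),
    ("s3", "physics", 9.0), ("s3", "chem", 8.5),
]


def build_course_network(records):
    """Map each course pair to the set of students who took both courses."""
    courses_by_student = defaultdict(set)
    for student, course, _grade in records:
        courses_by_student[student].add(course)

    edges = defaultdict(set)
    for student, courses in courses_by_student.items():
        # Sort so that each undirected edge has a single canonical key.
        for a, b in combinations(sorted(courses), 2):
            edges[(a, b)].add(student)
    return dict(edges)


def grade_variance(records):
    """Population variance of the grades awarded in each course."""
    grades_by_course = defaultdict(list)
    for _student, course, grade in records:
        grades_by_course[course].append(grade)
    return {c: pvariance(g) for c, g in grades_by_course.items()}


edges = build_course_network(grades)
variances = grade_variance(grades)

# Requirement (1): edges with many shared students carry more information.
for (a, b), students in sorted(edges.items()):
    print(f"{a} -- {b}: {len(students)} shared student(s)")

# Requirement (2): flag courses whose grades do not vary at all.
flat_courses = [c for c, v in variances.items() if v == 0]
print("courses with zero grade variance:", flat_courses)
```

In a real data set, the edge weights (sizes of the shared populations) would indicate how much information can flow between two courses, and courses listed in `flat_courses` would be ones the model cannot constrain from grades alone.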


Published

2023-12-15

How to Cite

Haas, M. R., Caprani, C., & van Beurden, B. (2023). Bayesian Generative Modelling of Student Results in Course Networks. Journal of Learning Analytics, 10(3), 135–152. https://doi.org/10.18608/jla.2023.7957

Section

Data and Tools Reports