Causal Inference and Bias in Learning Analytics
A Primer on Pitfalls Using Directed Acyclic Graphs
DOI: https://doi.org/10.18608/jla.2022.7577
Keywords: learning analytics, causal inference, LA, directed acyclic graphs, DAG, research design, observational research, bias, research paper
Abstract
As a research field geared toward understanding and improving learning, Learning Analytics (LA) must be able to provide empirical support for causal claims. However, as a highly applied field, LA cannot always rely on tightly controlled randomized experiments, which are not always feasible or even desirable. Instead, researchers often work with observational data, from which they may be reluctant to draw causal inferences. The past decades have seen considerable progress on causal inference in the absence of experimental data. This paper introduces directed acyclic graphs (DAGs), an increasingly popular tool for visually assessing the validity of causal claims. On this basis, three fundamental pitfalls are outlined: confounding bias, overcontrol bias, and collider bias. The paper then shows how these pitfalls may be present in the published LA literature, alongside possible remedies. Finally, this approach is discussed in light of practical constraints and the need for theoretical development.
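One of the pitfalls named above, collider bias, lends itself to a brief illustration. The following sketch (not from the paper; variable names like "effort" and "ability" are hypothetical labels chosen for the learning-analytics setting) simulates two independent causes of a shared outcome and shows how conditioning on that outcome, here by analyzing only students who passed, induces a spurious association between the causes:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Two independent causes (hypothetical labels for illustration).
effort = rng.normal(size=n)
ability = rng.normal(size=n)

# Collider: both causes influence whether a student passes.
passed = (effort + ability + rng.normal(size=n)) > 0

# Unconditionally, the two causes are (near) uncorrelated.
r_all = np.corrcoef(effort, ability)[0, 1]

# Conditioning on the collider -- restricting the sample to passers --
# induces a spurious negative association between effort and ability.
r_passed = np.corrcoef(effort[passed], ability[passed])[0, 1]

print(f"overall r = {r_all:.3f}, among passers r = {r_passed:.3f}")
```

The same mechanism operates whenever a sample is implicitly selected on an outcome, e.g., studying only students who completed a MOOC.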
License
Copyright (c) 2022 Journal of Learning Analytics
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.