The Impact of Attribute Noise on the Automated Estimation of Collaboration Quality Using Multimodal Learning Analytics in Authentic Classrooms

Pankaj Chejara; Luis P. Prieto; Yannis Dimitriadis; María Jesús Rodríguez-Triana; Adolfo Ruiz-Calleja; Reet Kasepalu; Shashi Kant Shankar

doi:10.18608/jla.2024.8253

Authors

Pankaj Chejara Tallinn University, Estonia. https://orcid.org/0000-0002-7630-5789
Luis P. Prieto Universidad de Valladolid, Valladolid, Spain https://orcid.org/0000-0002-0057-0682
Yannis Dimitriadis Universidad de Valladolid, Valladolid, Spain https://orcid.org/0000-0001-7275-2242
María Jesús Rodríguez-Triana Tallinn University, Tallinn, Estonia https://orcid.org/0000-0001-8639-1257
Adolfo Ruiz-Calleja Tallinn University, Tallinn, Estonia https://orcid.org/0000-0003-1717-6304
Reet Kasepalu Tallinn University, Tallinn, Estonia https://orcid.org/0000-0003-3389-8673
Shashi Kant Shankar AMMACHI Labs, Kerala, India https://orcid.org/0000-0001-8266-3681

DOI:

https://doi.org/10.18608/jla.2024.8253

Keywords:

multimodal learning analytics, collaboration, collaborative learning, computer-supported collaborative learning, machine learning, research paper

Abstract

Multimodal learning analytics (MMLA) research has shown the feasibility of building automated models of collaboration quality using artificial intelligence (AI) techniques (e.g., supervised machine learning (ML)), thus enabling the development of monitoring and guiding tools for computer-supported collaborative learning (CSCL). However, the practical applicability and performance of these automated models in authentic settings remain largely under-researched. In such settings, the quality of data features (attributes) is often degraded by noise, which is referred to as attribute noise. This paper systematically explores the impact of attribute noise on the performance of different collaboration-quality estimation models. We also compare ML algorithms in terms of their capability to deal with attribute noise. We employ four ML algorithms that have often been used for collaboration-quality estimation tasks due to their high performance: random forest, naive Bayes, decision tree, and AdaBoost. Our results show that random forest and decision tree outperformed the other algorithms for collaboration-quality estimation tasks in the presence of attribute noise. The study contributes to the MMLA (and learning analytics (LA) in general) and CSCL fields by illustrating how attribute noise impacts collaboration-quality model performance and which ML algorithms seem to be more robust to noise and thus more likely to perform well in authentic settings. Our research outcomes offer guidance to fellow researchers and developers of (MM)LA systems employing AI techniques with multimodal data to model collaboration-related constructs in authentic classroom settings.
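
To make the notion of attribute noise concrete, the following is a minimal sketch of one common way to inject it artificially into a feature table. It is an illustration under stated assumptions, not the procedure used in the paper: the function name `add_attribute_noise`, the `noise_level` parameter, and the corruption rule (replacing a value with a uniform draw from that attribute's observed range) are all hypothetical choices made for this example.

```python
import random


def add_attribute_noise(rows, noise_level, seed=0):
    """Return a noisy copy of `rows` (a list of numeric feature lists).

    Assumed noise scheme (not taken from the paper): each attribute value
    is corrupted independently with probability `noise_level`, and a
    corrupted value is replaced by a uniform draw from the range observed
    for that attribute in the clean data. Class labels are not touched.
    """
    if not 0.0 <= noise_level <= 1.0:
        raise ValueError("noise_level must be in [0, 1]")
    rng = random.Random(seed)  # seeded so that experiments are repeatable

    # Observed (min, max) per attribute, computed column by column.
    columns = list(zip(*rows))
    ranges = [(min(col), max(col)) for col in columns]

    noisy = []
    for row in rows:
        noisy_row = []
        for value, (low, high) in zip(row, ranges):
            if rng.random() < noise_level:
                value = rng.uniform(low, high)  # corrupt this attribute value
            noisy_row.append(value)
        noisy.append(noisy_row)
    return noisy


# Hypothetical features per group: [share of speaking time, edits per minute]
clean = [[0.0, 10.0], [1.0, 30.0], [0.5, 20.0]]
noisy = add_attribute_noise(clean, noise_level=0.3, seed=42)
```

A robustness comparison of the kind described in the abstract would then train each algorithm on data corrupted at increasing noise levels and track how its performance changes; that step is omitted here because it depends on the dataset and evaluation protocol.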
