Assessment and evaluation of an artificial intelligence-based deep learning curriculum

Authors

  • Aam Hamdani Universitas Pendidikan Indonesia
  • Amay Suherman Universitas Pendidikan Indonesia
  • Bambang Darmawan Universitas Pendidikan Indonesia
  • Enda Permana Universitas Pendidikan Indonesia

DOI:

https://doi.org/10.17509/curricula.v4i2.95294

Keywords:

artificial intelligence, authentic assessment, deep learning curriculum, learning evaluation, personalized learning

Abstract

The rapid development of Artificial Intelligence (AI) and the implementation of deep learning curricula in education have made reform of learning assessment and evaluation necessary. This article examines the conceptual framework and assessment practices appropriate for AI-based deep learning curricula: how assessments can measure critical thinking, collaboration, knowledge transfer, and data literacy; how AI-based platforms enable personalization, automated assessment, and learning analytics; and the ethical and validity challenges of using AI for evaluation. The study uses a systematic literature review and a comparative analysis of current practices in schools and higher education. The results indicate that AI-based assessment can improve learning effectiveness by providing real-time feedback and progress tracking; however, its success depends heavily on authentic task design, teacher readiness, infrastructure, and ethics and data security arrangements. The article recommends that educational institutions combine authentic assessments (portfolios, projects, observations) with AI metrics and conduct regular algorithm audits to maintain the validity, reliability, and fairness of assessment.



Published

2025-12-26

How to Cite

Hamdani, A., Suherman, A., Darmawan, B., & Permana, E. (2025). Assessment and evaluation of an artificial intelligence-based deep learning curriculum. Curricula: Journal of Curriculum Development, 4(2), 1833-1846. https://doi.org/10.17509/curricula.v4i2.95294
