learning with insight

Anthropomorphic Pedagogical Agents (APA)

APAs embody human traits such as analytical reasoning and insightfulness. They interact with humans by detecting and exhibiting emotion, communicating through natural language, and enacting a pedagogical role.


Researcher – Ireti Fakinlede

❚  Description

Language acquisition, the process by which a human acquires the ability to recognize and produce language that other speakers can understand, remains a hotly debated topic in linguistics. Nativist theorists offer their observations of the human language acquisition process as proof that human beings possess specialized innate neurological support for linguistic competence (e.g., phonology, lexicon, and grammar), that is, a prewiring to adapt to language, which is called ‘the language instinct’ and also referred to as ‘the language acquisition device.’ These views are opposed by behaviorists, constructivists, and social interaction theorists, who emphasize the role of social interaction in shaping human language acquisition. Constructivists focus on the formative years during which human learners gain linguistic competence through interaction with and mimicry of mature speakers of the language. We take the stance that the two positions are not necessarily mutually exclusive. Both favor the existence of cognitive support for language acquisition; the difference is whether those tools have been adapted specially for language or are the same tools that support other human learning abilities. The literature shows that some aspects of language acquisition may be explained by innate abilities and others as a consequence of socialization. Whether genetically, cognitively, or socially inspired, generic or specialized, we take as given the existence of a language acquisition device, understood as the mix of cognitive tools, knowledge, and learning mechanisms by which human children acquire their first language. This project proposes to artificially produce such a device, named GIOIA, and aims to effectively develop pragmatic competence in conversational software agents.

❚  Publications

Fakinlede, I., Kumar, V., Wen, D., & Kinshuk. (2013). Conversational forensics: Building conversational pedagogical agents with attitude. In proceedings of the 2013 IEEE Fifth International Conference on Technology for Education, Kharagpur, India, 18–20 December (pp. 65-68). doi: 10.1109/T4E.2013.52

Fakinlede, I., Kumar, V., & Wen, D. (2013). Knowledge representation for context and sentiment analysis. In proceedings of the 2013 IEEE 13th International Conference on Advanced learning technologies, Beijing, China, 15–18 July (pp. 493-494). doi: 10.1109/ICALT.2013.158

Fakinlede, I., Kumar, V., Wen, D., & Graf, S. (2013). Auto generating APA persona. In proceedings of the 2013 IEEE 13th International Conference on Advanced learning technologies, Beijing, China, 15–18 July (pp. 473-474). doi: 10.1109/ICALT.2013.181

Big Data Analytics – Technologies, Standards

Scaling learning analytics systems to the big data level requires the ingestion of multimodal data, standardization of the data of interest, and the integration of large-scale storage and data processing technologies.

PROJECT | Gravité – Analytics Platform (2017-present)

Researcher – Jeremie Seanosky

❚  Description

The Gravité system is a generic big data learning analytics platform that has so far been tested with coding and writing analytics software. Gravité has a central REST API-based hub, implemented in NodeJS, that handles all the data traffic to and from sensors (i.e., software sensors embedded within a text editor or an integrated development environment). The hub receives all learning events (in JSON format) from the sensors and stores them in a NoSQL database (MongoDB). The analytics engines query the hub for ‘unprocessed’ learning events, which they process. The output from the processing engines is then sent back through the hub and into the database, from which dashboards (i.e., visualizations) query the hub for processed results and metrics. Gravité also includes custom encryption to secure the data in transit. This encryption is generic so that it works with the different sensors that we attach. We designed Gravité to be modular, so that adding new sensors is seamless, as is extending the processing side.
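The event flow described above (sensor to hub, hub to engine, engine back to hub, hub to dashboard) can be sketched in a few lines. This is only an illustrative in-memory model, not the Gravité implementation: the real hub is a NodeJS REST API backed by MongoDB, and the class name `Hub`, its method names, the event fields, and the toy word-count engine are all assumptions made for this example.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Hub:
    """In-memory stand-in for the REST hub and its document store (hypothetical)."""
    events: List[dict] = field(default_factory=list)
    results: List[dict] = field(default_factory=list)

    def post_event(self, payload: str) -> int:
        """Sensor -> hub: accept one JSON learning event and store it."""
        event = json.loads(payload)
        event["id"] = len(self.events)
        event["processed"] = False
        self.events.append(event)
        return event["id"]

    def get_unprocessed(self) -> List[dict]:
        """Engine -> hub: fetch the events that no engine has handled yet."""
        return [e for e in self.events if not e["processed"]]

    def post_result(self, event_id: int, metrics: Dict[str, float]) -> None:
        """Engine -> hub: store the metrics and mark the source event as done."""
        self.results.append({"event_id": event_id, "metrics": metrics})
        self.events[event_id]["processed"] = True

    def get_results(self) -> List[dict]:
        """Dashboard -> hub: read the processed results."""
        return list(self.results)


def word_count_engine(hub: Hub) -> int:
    """Toy analytics engine: counts whitespace-separated tokens per event."""
    pending = hub.get_unprocessed()
    for event in pending:
        words = float(len(event["text"].split()))
        hub.post_result(event["id"], {"words": words})
    return len(pending)


hub = Hub()
hub.post_event(json.dumps({"sensor": "editor", "student": "s1", "text": "hello big data"}))
hub.post_event(json.dumps({"sensor": "ide", "student": "s2", "text": "int x = 1;"}))
handled = word_count_engine(hub)
```

Because engines only ever ask the hub for unprocessed events, a new engine or sensor can be attached without changing the others, which is the modularity property the description emphasizes.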

❚  Publications

Seanosky, J., Jacques, D., Kumar, V., & Kinshuk (2016). Security and Privacy in Bigdata Learning Analytics. In Proceedings of the 3rd International Symposium on Big Data and Cloud Computing Challenges (ISBCC–16’), Chennai, India, March 10–11 (pp. 43-55). Springer International Publishing. doi: 10.1007/978-3-319-30348-2_4

PROJECT | OpenACRE – Caliper and Open Analytics Collaborative Research Environment (2015-2016)

Researchers – David Boulanger, Jeremie Seanosky, Rahim Virani

❚  Description

This project aims to develop solutions that capture learning events in a standardized and interoperable manner. Caliper events, arising from a variety of sensors attached to numerous learning tools used by learners, are generated, transmitted, and stored in OpenACRE. Analytics solutions can then be applied to sanitized data to extract, analyze, or generate insights. OpenACRE was designed to be scalable and to support semi-structured data and was, therefore, implemented using big data technologies (e.g., Spark, Kafka, HDFS, MongoDB). It also keeps a history of the data, paving the way for continuous improvement of both research and learning processes. It allows for the integration of third-party datasets, where applicable, such as those from textbook vendors and MOOCs. OpenACRE supports the integration of learning tools (e.g., MI-Writer) to provide storage of learner data and engine support for real-time or pseudo real-time feedback. The research team evaluated a few other learning analytics frameworks (e.g., xAPI) based on the available documentation and elected to use Caliper because of its modularity and ontology-based approach.
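To make the idea of a standardized, interoperable learning event concrete, the sketch below builds and checks a small event record. The field names (`actor`, `action`, `object`, `eventTime`) loosely follow the actor/action/object shape of Caliper events, but this is not a conforming Caliper implementation: the identifiers, the `type` values, and the helper names `make_event` and `missing_fields` are assumptions made for illustration.

```python
import json
from datetime import datetime, timezone

# Fields every event in this sketch must carry (an assumption, not the full standard).
REQUIRED_FIELDS = ("type", "actor", "action", "object", "eventTime")


def make_event(actor_id: str, action: str, object_id: str, when: datetime) -> dict:
    """Build one learning event with a timestamp normalized to UTC."""
    return {
        "type": "Event",
        "actor": {"id": actor_id, "type": "Person"},
        "action": action,
        "object": {"id": object_id, "type": "DigitalResource"},
        "eventTime": when.astimezone(timezone.utc).isoformat(),
    }


def missing_fields(event: dict) -> list:
    """Return the required fields that an incoming event lacks."""
    return [name for name in REQUIRED_FIELDS if name not in event]


event = make_event(
    "urn:example:student:42",          # hypothetical identifier
    "Modified",
    "urn:example:essay:7",             # hypothetical identifier
    datetime(2016, 1, 15, 12, 0, tzinfo=timezone.utc),
)
wire_format = json.dumps(event)        # what a sensor would transmit
incomplete = missing_fields({"actor": {"id": "urn:example:student:42"}})
```

Validating the envelope at ingestion time is one way to keep events from heterogeneous sensors comparable before any analytics are applied to them.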

❚  Publications

Field, J., Lewkow, N., Zimmerman, N., Riedesel, M., Essa, A., Boulanger, D., Seanosky, J., Kumar, V.S., Kinshuk, Kode, S. (2016). A scalable learning analytics platform for automated writing feedback, Educational Data Mining, pp. 688-693, Raleigh, NC, USA, June 29 – July 2.

Lewkow, N., Field, J., Zimmerman, N., Riedesel, M., Essa, A., Boulanger, D., Seanosky, J., Kumar, V.S., Kinshuk, Kode, S. (2016). A scalable learning analytics platform for automated writing feedback, ACM conference on Learning @ Scale, pp. 109-112, Edinburgh, UK, April 25–26. doi: 10.1145/2876034.2893380

Kumar, V.S., Kinshuk, Somasundaram, T.S., Boulanger, D., Seanosky, J., & Vilela, M. (2015). Big data learning analytics: A new perspective. In Kinshuk, & R. Huang (Eds.), Ubiquitous learning environments and technologies (pp. 139–158). Berlin, Germany: Springer Berlin Heidelberg. doi: 10.1007/978-3-662-44659-1_8

Seanosky, J., Boulanger, D., Kumar, V., & Kinshuk (2015). Unfolding learning analytics for big data. In G. Chen, V. Kumar, Kinshuk, R. Huang, & S.C. Kong (Eds.), Emerging issues in smart learning. In proceedings of the International Conference on Smart Learning Environment 2014, The Hong Kong Institute of Education, Hong Kong, 24–25 July (pp. 377–384). Berlin, Germany: Springer Berlin Heidelberg. doi: 10.1007/978-3-662-44188-6_52


Researchers – David Boulanger, Jeremie Seanosky, Colin Pinnell, Jason Bell

❚  Description

Much has been written on big data and learning analytics individually, but an important gap lies between the two areas. Our research proposes a generic competence-based learning analytics framework called LAMBDA as a potential candidate to reach big data through learning analytics. LAMBDA provides generic and customizable functionalities to sense data, shape data to represent actual learning episodes, analyze learning experiences, measure the impact of learning activities, and offer reflection/regulation opportunities to improve the learning process. Its applicability to any learning domain, as well as its ability to extract evidence to assess both cognitive and metacognitive skills, makes it an ideal vehicle to expand the volume, velocity, and variety of learning data. For more information about some of the tools that were integrated with LAMBDA, please consult the CODEX, SCALE, MI-DASH, and SCRL projects.

PROJECT | P-PSO – Parallel Particle Swarm Optimization

Researcher: Kannan Govindarajan

❚  Description

P-PSO is a scalable algorithm that ingests, in real time, large volumes of learner data generated from learners’ interactions with their learning environment and learning activities. It continuously, accurately, and efficiently clusters learners according to various sets of learning traits (e.g., study habits, competence growth, engagement, motivation, attitude) to reveal the diverse groups to which each learner belongs. It measures and reports its clustering performance in terms of processing time, acceleration, and inter/intra cluster distances. In particular, the inter/intra cluster distances provide insights that guide learners towards better performance and highlight the level of cohesion among learners of the same group.
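The sketch below illustrates the general technique of particle swarm optimization applied to clustering, together with the intra- and inter-cluster distances mentioned above. It is a small sequential version written for readability, not the parallel P-PSO algorithm itself: the fitness function (mean distance to the nearest centroid), the swarm parameters `w`, `c1`, and `c2`, and the two-feature learner data are all assumptions chosen for this example.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda c: dist(p, centroids[c])) for p in points]


def fitness(points, centroids):
    """Mean distance from each point to its nearest centroid (lower is better)."""
    return sum(min(dist(p, c) for c in centroids) for p in points) / len(points)


def pso_cluster(points, k, particles=12, iters=60, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for k centroids; each particle holds one full set of centroids."""
    rng = random.Random(seed)
    dim = len(points[0])
    pos = [[list(rng.choice(points)) for _ in range(k)] for _ in range(particles)]
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(particles)]
    best = [[c[:] for c in p] for p in pos]            # personal best positions
    best_fit = [fitness(points, p) for p in pos]
    g = min(range(particles), key=lambda i: best_fit[i])
    gbest, gbest_fit = [c[:] for c in best[g]], best_fit[g]   # swarm-wide best
    for _ in range(iters):
        for i in range(particles):
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    # Inertia + pull toward personal best + pull toward global best.
                    vel[i][j][d] = (w * vel[i][j][d]
                                    + c1 * r1 * (best[i][j][d] - pos[i][j][d])
                                    + c2 * r2 * (gbest[j][d] - pos[i][j][d]))
                    pos[i][j][d] += vel[i][j][d]
            f = fitness(points, pos[i])
            if f < best_fit[i]:
                best[i], best_fit[i] = [c[:] for c in pos[i]], f
                if f < gbest_fit:
                    gbest, gbest_fit = [c[:] for c in pos[i]], f
    return gbest, gbest_fit


def cluster_distances(points, centroids):
    """Mean intra-cluster distance and smallest distance between two centroids."""
    labels = assign(points, centroids)
    intra = sum(dist(p, centroids[l]) for p, l in zip(points, labels)) / len(points)
    inter = min(dist(a, b) for i, a in enumerate(centroids) for b in centroids[i + 1:])
    return intra, inter


# Hypothetical learners described by (study hours, engagement score).
learners = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, fit = pso_cluster(learners, k=2)
intra, inter = cluster_distances(learners, centroids)
```

A low intra-cluster distance indicates cohesive groups of learners, while a large inter-cluster distance indicates that the groups are clearly distinct. In a parallel variant, the per-particle fitness evaluations are the natural unit of work to distribute.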

❚  Publications

Somasundaram, T.S., Govindarajan, K., & Kumar, V.S. (2016), Swarm Intelligence (SI) based Profiling and Scheduling of Big Data Applications, IEEE Conference on Big Data, Session on Intelligent Data Mining, pp. 1875-1880, Washington, DC, USA, Dec 5-8. doi: 10.1109/BigData.2016.7840806

Govindarajan, K., Boulanger, D., Seanosky, J., Bell, J., Pinnell, C., & Kumar, V. S. (2016). Assessing Learners’ Progress in a Smart Learning Environment using Bio-Inspired Clustering Mechanism. In E. Popescu, Kinshuk, M. K. Khribi, R. Huang, M. Jemni, N-S. Chen, & D.G. Sampson (Eds.), Innovations in Smart Learning (pp. 49-58). Springer Singapore. 3rd International Conference on Smart Learning Environments, pp. n/a, Tunis, Tunisia, September 28-30. doi: 10.1007/978-981-10-2419-1_9.

Govindarajan, K., Kumar, V.S., Kinshuk (2015). Parallel Particle Swarm Optimization (PPSO) Clustering for Learning Analytics, IEEE International Conference on Big Data (IEEE Big Data 2015), pp. 1461-1465, Santa Clara, CA, USA, doi: 10.1109/BigData.2015.7363907.

Govindarajan, K., Boulanger, D., Seanosky, J., Bell, J., Pinnell, C., Kumar, V. S., & Somasundaram, T. S. (2015, July). Performance Analysis of Parallel Particle Swarm Optimization Based Clustering of Students. In Advanced Learning Technologies (ICALT), 2015 IEEE 15th International Conference on Advanced Learning Technologies, Hualien, Taiwan (pp. 446–450). doi: 10.1109/ICALT.2015.136

Govindarajan, K., Somasundaram, T.S., Kumar, V., & Kinshuk. (2013). Continuous clustering in big data learning analytics. In proceedings of the 2013 IEEE Fifth International Conference on Technology for Education, Kharagpur, India, 18-20 December (pp. 61–64). doi: 10.1109/T4E.2013.23

Govindarajan, K., Somasundaram, T.S., Kumar, V.S., & Kinshuk. (2013). Particle swarm optimization (PSO)-based clustering for improving the quality of learning using cloud computing. In proceedings of the 2013 IEEE 13th International Conference on Advanced learning technologies, Beijing, China, 15–18 July (pp. 495–497). doi: 10.1109/ICALT.2013.160

Coding Analytics

Coding analytics captures, assesses, and provides timely feedback to learners on their coding processes, testing, and debugging habits with the goal of fostering high-quality, professional programming skills demanded in the industry.

PROJECT | CODEX 2.0 – Coding Experiences

Researchers: Jeremie Seanosky, David Boulanger, Rebecca Guillot

❚  Description

Coding is a highly structured learning domain with very practical, precise, and measurable outcomes, making it amenable to automated scoring. Machine learning, along with its plethora of deep composite and hybrid architectures, proves adequate to extract the patterns in students’ code that human graders look for when assigning rubric scores (functionality, documentation, testing, optimization, code quality). In addition to considering submitted assignment deliverables, a machine-learned automated scoring system also considers the coding process through which the deliverables went, making it possible not only to assign holistic and rubric scores to coding assignments but also to measure the underlying competences and confidence of the student that shaped these learning artifacts. This system is currently in development and will assist tutors in turning their summative feedback into post hoc formative feedback that will help students not only at the learning outcome level but also at the competence level.

❚  Publications

Boulanger, D., Seanosky, J., Guillot, R., Guillot, I., Guillot, C., Fraser, S., Kumar, V., & Kinshuk (2019, accepted). Assessing Learning Analytics Impact on Coding Competence Growth. International Conference on Advanced Learning Technologies (ICALT 2019), July 15-18, Maceió-AL, Brazil.

Guillot, R., Seanosky, J., Guillot, I., Boulanger, D., Guillot, C., Kumar, K., Fraser, S.N., & Kinshuk. (2018). Assessing Learning Analytics Systems Impact by Summative Measures, International Conference on Advanced Learning Technologies (ICALT 2018), July 9-13, Mumbai, India (pp. 188-190). doi: 10.1109/ICALT.2018.00051

PROJECT | CODEX 1.0 – Coding Experiences

Researchers: Jeremie Seanosky, David Boulanger, Rebecca Guillot

❚  Description

In this research project (CODEX for CODing EXperiences), we developed a coding analytics system for the Java programming language that automatically and continuously captures students’ code in their development environment of choice, at home or in the lab; performs real-time formative assessment on the incremental pieces of the code; and provides real-time feedback on changes to programming skills. The system features three components: 1) a sensor capturing coding episodes, 2) a module assessing low-level coding competences, and 3) feedback visualizations. A plugin was developed for the NetBeans IDE to capture students’ coding episodes periodically as they write Java programs and every time they save their code. Each coding event consists of 1) the source code of the file on which the student is currently working, 2) the time at which the code capture is made, 3) the student ID, 4) the absolute file path of the code file, and 5) whether the coding event was triggered when the student saved their code or after an elapsed time. These coding events are then sent to a learning analytics (LA) server for further analysis. In addition, an offline buffering mechanism keeps coding events on the student’s computer if the LA server is not accessible or internet connectivity is lost. Those temporarily buffered data are automatically pushed to the LA server when connectivity is restored.
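The five-field coding event and the offline buffering behaviour can be sketched as follows. This is an illustration under stated assumptions rather than the CODEX plugin, which runs inside NetBeans: the names `CodingEvent` and `BufferedSender`, the JSON encoding, and the `fake_send` stand-in for the LA server are hypothetical, and the buffer here lives in memory whereas the description says events are kept on the student's computer.

```python
import json
import time
from collections import deque
from dataclasses import asdict, dataclass
from typing import Callable, Deque


@dataclass
class CodingEvent:
    source_code: str    # 1) contents of the file being edited
    captured_at: float  # 2) capture time (seconds since the epoch)
    student_id: str     # 3) student ID
    file_path: str      # 4) absolute path of the code file
    trigger: str        # 5) "save" or "timer"


class BufferedSender:
    """Queues events and retries in order until the server accepts them."""

    def __init__(self, send: Callable[[str], bool]) -> None:
        self._send = send                    # returns True when delivery succeeded
        self._pending: Deque[str] = deque()

    def submit(self, event: CodingEvent) -> None:
        self._pending.append(json.dumps(asdict(event)))
        self.flush()

    def flush(self) -> int:
        """Send buffered events oldest first; stop at the first failure."""
        sent = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending(self) -> int:
        return len(self._pending)


# Hypothetical LA server whose availability can be toggled.
server = {"online": False, "received": []}


def fake_send(payload: str) -> bool:
    if server["online"]:
        server["received"].append(json.loads(payload))
    return server["online"]


sender = BufferedSender(fake_send)
sender.submit(CodingEvent("class A {}", time.time(), "s1", "/home/s1/A.java", "timer"))
sender.submit(CodingEvent("class A { int x; }", time.time(), "s1", "/home/s1/A.java", "save"))
queued_while_offline = sender.pending

server["online"] = True      # connectivity restored
flushed = sender.flush()
```

Sending the oldest event first preserves the order of captures, which matters when the coding process is later reconstructed from the stored events.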

The coding events are then stored on the LA server, which allows the coding process to be reconstructed if necessary. Code parsers retrieve the abstract syntax tree of each code file, and the Java “building blocks” that students used to write their code are then analyzed. Students’ levels of competence in using each of these basic coding constructs are quantitatively assessed, and a competence portfolio is created for each student. This competence portfolio can be consulted by each novice programmer and presents information as a bar chart, where the horizontal axis represents the various Java building blocks to be mastered and the vertical axis the proficiency level of the learner per coding construct. Students (and tutors) can also compare individual performance against the class average and against the top student in the class. Finally, students and tutors can monitor the growth of these competences over time to detect whether a student is at risk of failing the course.
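The idea of walking an abstract syntax tree to find the building blocks a student used, then comparing a student against the class, can be sketched as below. Two substitutions are made for the sake of a self-contained example: Python's standard `ast` module stands in for the Java parsers used by CODEX, and raw usage counts stand in for the proficiency levels, whose actual scoring model is not described here. The construct labels and helper names are assumptions.

```python
import ast
from collections import Counter

# Building blocks tracked in this sketch (hypothetical selection).
CONSTRUCTS = {
    ast.For: "for loop",
    ast.While: "while loop",
    ast.If: "if statement",
    ast.FunctionDef: "function",
}


def count_constructs(source: str) -> Counter:
    """Parse the source and count each tracked construct in its syntax tree."""
    counts = Counter()
    for node in ast.walk(ast.parse(source)):
        label = CONSTRUCTS.get(type(node))
        if label is not None:
            counts[label] += 1
    return counts


def compare(student: Counter, cohort: list) -> dict:
    """Per construct: (student value, class average, class maximum)."""
    labels = sorted({label for c in cohort for label in c} | set(student))
    return {
        label: (
            student.get(label, 0),
            sum(c.get(label, 0) for c in cohort) / len(cohort),
            max(c.get(label, 0) for c in cohort),
        )
        for label in labels
    }


code_a = """
def total(values):
    result = 0
    for v in values:
        if v > 0:
            result += v
    return result
"""

code_b = """
for i in range(3):
    for j in range(3):
        print(i, j)
"""

portfolio_a = count_constructs(code_a)
portfolio_b = count_constructs(code_b)
report = compare(portfolio_a, [portfolio_a, portfolio_b])
```

Each entry of `report` holds the three values a bar chart would need for one construct: the student's own bar, the class average, and the top student's value.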

❚  Publications

Kumar, V., Seanosky, J., Boulanger, D., Guillot, R., Guillot, C., & Guillot, I. (2018). CODEX: A Formative Assessment Tool to Analyze Beginners’ Coding Experiences. In Educational Communications and Technology (Chinese Journal), 2018(3), pp. 58-64.

Seanosky, J., Guillot, I., Boulanger, D., Guillot, R., Guillot, C., Kumar, V., Fraser, S.N., Aljojo, N. and Munshi, A. (2017, July). Real-time visual feedback: a study in coding analytics. In Advanced Learning Technologies (ICALT), 2017 IEEE 17th International Conference on (pp. 264-266). doi: 10.1109/ICALT.2017.38

Kumar, V., Kinshuk, Somasundaram, T.S., Harris, S., Boulanger, D., Seanosky, J., Paulmani, G., & Panneerselvam, K. (2015). An approach to measure coding competency evolution. In M. Chang, & Y.Li (Eds.), Smart learning environments (pp. 27–43). Berlin, Germany: Springer Berlin Heidelberg. doi: 10.1007/978-3-662-44447-4_2

PROJECT | MI-DASH – Mixed-Initiative Dashboard

Researchers: Jeremie Seanosky, David Boulanger, Rebecca Guillot

❚  Description

MI-DASH is a web-based dashboard for students, teachers, school administrators, and anyone with whom these educational stakeholders want to share their data. It allows the individual and group learning performance of students to be monitored in real time and promotes reflective and self-regulatory activities. MI-DASH is the single interface through which learners and teachers consult the system’s evaluation of the students’ performance in any learning domain, such as math, science, writing, reading, music, and coding. MI-DASH also includes an embedded chat feature where students can communicate with teachers and/or peers for guidance or answers to questions they have about the feedback generated by learning analytics. Our vision for MI-DASH is that it becomes a ubiquitous, customizable, and interactive learning companion, accessible through mobile devices, tablets, laptops, and desktop computers.

Communication Skills Analytics

Communication skills analytics assesses and nurtures skills for effective verbal and non-verbal communication, including listening to and understanding complex language and articulating and pronouncing ideas thoroughly.

PROJECT | ListenEx – Measuring the Listening Skills of Online Learners

Researcher: Isabelle Guillot

❚  Description

Aural interactions are a significant portion of online learning, particularly in recently emerging synchronous applications that give learners real-time interactions. To be successful, online learners require effective listening skills. However, there are hardly any training opportunities available for online listening, and analysis of training sessions involving listening is nonexistent. This research explores a computational mechanism as part of an analytics tool called ListenEx to measure and improve the listening skills of online learners. ListenEx presents learners with audio or video content of varying lengths, multiple types (documentary, news, storytelling, conference, etc.), different speeds, diverse levels of vocabulary, varied cultural influences, targeted workplace conversation, and contrasting degrees of distraction (pictures, animations, irrelevant material). Tasks that students are expected to complete include identifying the main topic, linking essential pieces of information, locating important details, answering specific questions about the content, and paraphrasing their understanding. Learner responses to these tasks are analyzed to measure the level of listening comprehension. ListenEx also aims to measure the time interval between the question and the answer, the quality of the response, the correctness of the response, the number of times the learner listened to the source, chunking time, the number of times an answer is changed, and so on. These measures can then be used to provide a continuous analysis of listening skills as learners study through a curriculum. (Retrieved from:

FUTURE PROJECT | SpeakEx – Speaking Analytics

❚  Description

SpeakEx aims at parsing translated text material arising from speech utterances and at presenting a scaffolded dashboard for feedback, reflection, and regulation opportunities to learners. The types of feedback include grammar-based feedback, feedback on the pacing of the speech, feedback on the breaks in speech, and identification of misspoken words.

FUTURE PROJECT | Interview Mastery Analytics

❚  Description

Skills needed to succeed in an interview include mastery of the subject, comprehension of the question, quick thinking, body language, pacing of answers, and dress code. This project aims at developing a system simulating an online interview environment, conducting different types of simulated interviews, observing the response of the candidate, offering feedback, and allowing reflection/regulation opportunities to improve the interviewee’s skills.

Competence Analytics

Competence analytics measures the knowledge and skills of learners independently of the instructional materials at their disposal during the learning processes. It also estimates the retention level of acquired knowledge and the transfer effect of learning one concept over another, which makes it possible to fill knowledge gaps in a new domain.

PROJECT | SCALE – Smart Competence Analytics on Learning

Researchers: David Boulanger, Jeremie Seanosky

❚  Description

SCALE is a smart competence analytics technology that transforms your learning traces into standardized measurements of competences that will help you assess how proficient you are in the concepts introduced in your course. SCALE will also allow you to evaluate how confident you are at solving a particular exercise and how confident you are in the overall learning domain. SCALE’s mission is to provide you with a scale that will help you measure and optimize your learning as it occurs, in any learning area. SCALE stores the analysis results in a MySQL database, ready to be used by the visualization and reporting tools of learning dashboards such as MI-DASH.

The following is an excerpt from [1]:

SCALE consists basically of three processing layers: parsing, inferencing, and profiling. Essentially, SCALE captures and parses (parsing) a student’s interactions with a learning activity (e.g., writing an English essay) and encodes it in a way that the computer will understand what the student has done. Then the machine will search for patterns (inferencing) in the student’s work (e.g., use of synonym words in two consecutive sentences) and will map the evidence found to one or more competences (e.g., local cohesion in English writing). Finally, all evidences cumulated for each competence will be aggregated and input in a math model for quantitative competence assessment.

To apply SCALE in a specific course, SCALE requires 1) the specification of the learners’ potential interactions with the courses’ learning activities, 2) explicit coding of the patterns (i.e., application of a concept or a misconception) to be extracted from the students’ interactions, 3) the specification of the impact that the application of a concept or misconception will have on the quantitative assessment of a competence, and 4) the explicit definition of the course’s learning outcomes in terms of competence development.

The first processing layer (parsing) in SCALE involves analyzing the data collected from the student’s interactions with a learning activity so that the machine can understand the value in the student’s work from a teacher perspective. In technical terms, SCALE will expand the dataset through domain-specific parsers. In the context of English writing, this may imply submitting the student’s work (e.g., a descriptive paragraph) to a natural language processor and a grammar/spell checker (domain-specific parsers) so that the machine can perceive the same as a human teacher when grading an assignment. The original data along with the results of the natural language processing are then stored in an RDF ontology (called interaction ontology) for further analysis in the inferencing layer. The storage of all data in an ontological format allows to translate the results from disparate parsing tools in a standardized language that is the language of the learning domain. Thus, the parsing layer in SCALE requires the integration of domain-specific parsers, the definition of the entities (e.g., words, phrases, errors) and relationships (e.g., word dependencies) defining the learning domain in question (e.g., English writing), and the specification of the resulting interaction ontology. This process is generic and can be applied in any learning domain.

SCALE’s second layer of processing (inferencing) encloses the teacher’s expertise and assesses the student’s work. Basically, it looks for patterns showing skillful application of concepts or patterns demonstrating the presence of misconceptions. Next, these patterns will be associated with one or more competences. As the evidence of a student’s proficiency in a particular competence increases, SCALE will quantitatively assess the competence of the student following the scoring guidelines input in the system. Technically speaking, the patterns and the scoring guidelines are programmed as a set of rules (if/then statements) using the vocabulary defined in the interaction ontology that has been generated by the parsing layer. SCALE will then employ a Java ontological rule-based reasoning engine called BaseVISor to extend the system’s capability to recognize more complex learning patterns.

BaseVISor’s compact and intuitive XML syntax (to write rules) requires only a low level of expertise to code the rules. Moreover, the rules can be easily written in pseudo-code by a knowledge expert (e.g., English teacher) and implemented by a non-expert programmer (e.g., who is not knowledgeable in English teaching).

In the third processing layer, SCALE profiles students individually and collectively and provides students with a formative feedback that shows the progression of their proficiency and confidence at different levels of granularity (i.e., learning activity, competence, learning domain).
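The three layers described in the excerpt (parsing, inferencing, profiling) can be sketched end to end on a toy writing sample. This is a simplified illustration, not SCALE: plain Python lists stand in for the RDF interaction ontology, Python `if` rules stand in for BaseVISor's XML rules, and the share of positive evidence stands in for the math model, which the excerpt does not specify. The synonym table, the misspelling list, and all function names are hypothetical.

```python
import re
from collections import defaultdict

# Tiny hypothetical knowledge sources used by the rules below.
SYNONYMS = {frozenset({"big", "large"}), frozenset({"quick", "fast"})}
MISSPELLINGS = {"teh", "recieve"}


def parse(text: str) -> list:
    """Parsing layer: split text into sentences, each a list of lowercase words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [re.findall(r"[a-z']+", s.lower()) for s in sentences]


def infer(sentences: list) -> list:
    """Inferencing layer: if/then rules that emit (competence, +1 or -1) evidence."""
    evidence = []
    # Rule 1: synonyms used in two consecutive sentences show local cohesion.
    for prev, cur in zip(sentences, sentences[1:]):
        if any(frozenset({a, b}) in SYNONYMS for a in prev for b in cur):
            evidence.append(("local cohesion", 1))
    # Rule 2: each word is positive or negative evidence for spelling.
    for words in sentences:
        for word in words:
            evidence.append(("spelling", -1 if word in MISSPELLINGS else 1))
    return evidence


def profile(evidence: list) -> dict:
    """Profiling layer: share of positive evidence per competence."""
    positive, total = defaultdict(int), defaultdict(int)
    for competence, sign in evidence:
        total[competence] += 1
        if sign > 0:
            positive[competence] += 1
    return {c: positive[c] / total[c] for c in total}


essay = "The dog was big. A large cat chased teh dog."
sentences = parse(essay)
scores = profile(infer(sentences))
```

Keeping the rules separate from the parser mirrors the division of labour in the excerpt: a domain expert can state a rule in pseudo-code ("if two consecutive sentences share a synonym, count it toward local cohesion") and a programmer can encode it without touching the parsing layer.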

❚  Publications

[1] Boulanger, D., Seanosky, J., Clemens, C., & Kumar, V. (2016, July). SCALE: A Smart Competence Analytics Solution for English Writing. In 2016 IEEE 16th International Conference on Advanced Learning Technologies (ICALT) (pp. 468-472). doi: 10.1109/ICALT.2016.108

Boulanger, D., Seanosky, J., Pinnell, C., Bell, J., Kumar, V.S., Kinshuk. (2016). SCALE: A competence analytics framework. In Y. Li and M. Chang (Eds.), State-of-the-Art and future directions of smart learning (International Conference on Smart Learning Environments, December 23–25, 2015, Sinaia, Romania), Springer, pp. 19–30. Chapter DOI: 10.1007/978-981-287-868-7_3

Govindarajan, K., Kumar, V. S., Boulanger, D., Seanosky, J., Bell, J., Pinnell, C., & Somasundaram, T. S. (2016). Software-Defined Networking (SDN)-Based Network Services for Smart Learning Environments: The Role of SDN in Smart Competence LEarning Analytics platform (SCALE). In State-of-the-Art and Future Directions of Smart Learning. Springer Singapore. [Springer]. Paper presented at the 2015 International Conference on Smart Learning Environments, Sinaia, Romania, 23–25 December (pp. 69-76).

Boulanger, D., Seanosky, J., Kumar, V., Panneerselvam, K., & Somasundaram, T. S. (2015). Smart learning analytics. In G. Chen, V. Kumar, Kinshuk, R. Huang, & S.C. Kong (Eds.), Emerging issues in smart learning. In proceedings of the International Conference on Smart Learning Environment 2014, The Hong Kong Institute of Education, Hong Kong, 24–25 July (pp. 289–296). Springer, Berlin, Heidelberg. doi: 10.1007/978-3-662-44188-6_39

Kumar, V., Boulanger, D., Seanosky, J., Kinshuk, Panneerselvam, K., & Somasundaram, T.S. (2014). Competence analytics. Journal of computers in education, 1(4), 251-270. doi: 10.1007/s40692-014-0018-6

Healthcare Analytics

Healthcare analytics uses education data to automate the detection of mental health disorders such as ADHD and to remedy these disorders through educational rehabilitation.

PROJECT | MHADS – Mental Health Analysis and Diagnostic Service

Researcher: Diane Mitchnick

❚  Description

Previous research has shown a link between Attention Deficit Hyperactivity Disorder (ADHD), a mental health disorder, and writing difficulties. Students with ADHD have an increased likelihood of having writing difficulties, and writing difficulties are rarely present without ADHD or another mental health disorder. However, the presence of writing difficulties does not necessarily indicate the presence of a Written Language Disorder (WLD), a learning disorder. There are other physical and behavioral factors of ADHD that can contribute to a student having a WLD as well. People diagnosed with ADHD are often inattentive, overly impulsive, and hyperactive. ADHD is often diagnosed through psychiatric assessments with additional input from physical/neurological evaluations. People diagnosed with WLD often make multiple spelling, grammar, and punctuation mistakes, write sentences that lack cohesion and topic flow, and have trouble completing written assignments. Typically, WLD is also diagnosed through psychological educational assessments with additional input from physical/neurological evaluations. Hence, the Mental Health Analysis and Diagnostic Service (MHADS), a web-based tool, was developed to assist patients and medical practitioners in understanding the process of diagnosing adult ADHD. MHADS takes information from various sources, and through a series of questions, exam results, and performance test scores, it offers a more reliable diagnosis of the disorder. It also aims to detect the potential for ADHD based on the writing competences of learners. This research demonstrated through MHADS’ integrated computational model (an artificial neural network), which was trained on data from a systematic review, that a strong statistical association exists between WLD and physical and behavioral aspects of ADHD.

❚  Publications

Mitchnick, D., Clemens, C., Kagereki, J., Kumar, V., & Fraser, S. (2017). Measuring the written language disorder among students with attention deficit hyperactivity disorder. Journal of Writing Analytics, 1.

Mitchnick, D., Kumar, V., Kinshuk, & Fraser, S. (2016). Using Healthcare Analytics to Determine an Effective Diagnostic Model for ADHD in Students. 2016 IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI), (pp. 1-4). Las Vegas February 24-27 doi:10.1109/BHI.2016.7467133

PROJECT | Synthetic Biology Analytics

Researcher: Bertrand Sodjahin

❚  Description

Pseudomonas aeruginosa is a bacterial organism notable for its ubiquity in the ecosystem and its capacity to resist antibiotics. It can survive at length in any environment. This is of particular medical concern because Pseudomonas aeruginosa can live on hospital surfaces and in the water supply. Being a common origin of hospital-acquired infections, it causes various diseases. Not only do 40% of mechanically ventilated patients with Pseudomonas aeruginosa succumb to their condition, but this bacterium also affects animals and plants. It has been shown that Pseudomonas aeruginosa can be isolated from water in a number of intensive care units. Understanding how it survives in water is important in designing strategies for the prevention and treatment of the resulting infections. Furthermore, identifying the survival mechanism in the absence of nutrients is beneficial because Pseudomonas aeruginosa and related organisms are capable of bioremediation. We hypothesize that Pseudomonas aeruginosa is capable of long-term survival due to the presence of particular genes, which encode persistence proteins. With readily available Pseudomonas aeruginosa gene expression data from previous works, we propose to use machine learning, a computer science technique, to identify genes involved in this bacterium’s survival by analyzing their response to low-nutrient water. We then establish gene interaction and regulatory networks that are conducive to the comprehension of the survival mechanism. Subsequently, we develop a Pseudomonas aeruginosa persistence model from which a general bacterial persistence model is inferred. With inductive logic programming (ILP), we also aspire to investigate possible unknown environmental risk factors associated with Pseudomonas aeruginosa.


❚  Publications

Sodjahin, B., Kumar, V.S., Lewenza, S., Reckseidler-Zenteno, S. (2017). Probabilistic Graphs to Model Pseudomonas Aeruginosa Survival Mechanism and Infer Low Nutrient Water Response Genes, IEEE Congress on Evolutionary Computation, pp. 2552-2558, Donostia – San Sebastián, Spain, June 5-8. doi: 10.1109/CEC.2017.7969615

Sodjahin, B., Kumar, V.S., Lewenza, S., Reckseidler-Zenteno, S. (2017). Bayesian Networks to Model Pseudomonas Aeruginosa Survival Mechanism and Identify Low Nutrient Response Genes in Water, Canadian Conference on Artificial Intelligence, pp. 341-347, Edmonton, Canada, May 16-19, Springer.

Sodjahin, B., Kumar, V.S., Lewenza, S., Reckseidler-Zenteno, S. (2017). Modeling Pseudomonas Aeruginosa Survival Mechanism and Identifying Low Nutrient Response Genes in Water with Machine Learning Techniques, Poster, ISCB NGS-Barcelona – Structural Variation and Population Genomics Conference, (pp. N/A), Barcelona, Spain, 3-4 April.

Sodjahin, B., Reckseidler-Zenteno, S., Lewenza, S., Kumar, V., & Wang, J. (2015). Identification of Low Nutrient Response Genes in the Bacterium Pseudomonas aeruginosa with Hierarchical Clustering. In M. Chang, & F. Al-Shamali (Eds.), 2015 Proceedings of Science and Technology Innovations (pp. 1–16). Faculty of Science and Technology, Athabasca University, Canada.

Industry Training Analytics

Industry training analytics assesses operators’ skill development during learning-by-doing processes done within immersive environments (virtual and augmented reality) in both standard and emergency settings.

PROJECT | STAGE – Smart Training with Analytics and Graphics Environment

Researchers: Jeremie Seanosky, Rebecca Guillot, Isabelle Guillot, Claudia Guillot

❚  Description

STAGE offers a new approach to training, combining micro-learning, incremental/telescopic learning (5 stages: Learn, Associate, Understand, Explore, Troubleshoot), and practice until full mastery in a self-paced formative method via leading-edge immersive technologies that make learning real and fun. STAGE aims at increasing the efficiency and quality of training (faster knowledge acquisition) to ensure full mastery of operations of complex machinery in settings of standard and emergency operating procedures in the oil/gas industry.

PROJECT | ART – Augmented Reality Training/Testing

Researcher: Rebecca Guillot

❚  Description

The goal of the ART project was to transition from text-based training (i.e., PeT) to interactive training environments modeled according to the Task-related Knowledge Structures (TKS) theory, in which tasks are defined as goals reached by following one or more strategies. The operator is then responsible for completing a procedure according to his/her own plan of actions, executed on the appropriate equipment items (objects). The interactive training environment was programmed using ActionScript in Adobe Flash CS6. All of the trainee’s interactions and steps within the assigned training scenarios were then recorded and analyzed to determine the time the trainee spent on each decision and in each location, track the trainee’s choices and the number of times s/he changed his/her mind, and count how many attempts were made before reaching the right solution.

PROJECT | PeT – Procedure e-training Tool

Researchers: David Boulanger, Jeremie Seanosky, Rebecca Guillot

❚  Description

PeT is a Procedure e-Training tool that enables the energy industry to train and re-certify operators in standard and emergency operating procedures. PeT tracks the knowledge and behavior of operators through highly monitored multiple-choice questionnaires. Managers, trainers, and operators can visualize operator performance in a learning analytics dashboard and see how proficient the operator is in every step of every operating procedure. Additionally, PeT makes it possible to track the behavior of an operator in an emergency setting to ensure that the operator not only has the proper knowledge to perform the right actions in the proper sequence but can also perform the critical actions typical of emergency situations within the required time. In emergency situations, operators often have only a few minutes to intervene, with no possibility of consulting any resource. The time constraint as well as the high risk of serious consequences can hinder the operator’s ability to intervene properly. The speed and correctness of the operator’s actions will make the difference in terms of expensive material damage, productivity loss, injuries, and even death.

PeT is built upon a six-factor confidence model. This model tracks the operator’s behavior by recording the amount of time the operator takes to answer a question, the number of times s/he revised his/her answer, the number of different answers s/he selected, his/her reaction time (that is, the time before giving a first answer), the total number of selections made in answer to a question, and the correctness of the final answer to a question. The model describes the operator’s degree of hesitation and pinpoints any area in which the operator needs further training.
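As an illustration only, the six factors could be derived from a timestamped log of answer selections roughly as follows. This is a minimal Python sketch, not PeT's actual implementation: the function name, the log format (pairs of time in seconds and selected option), and the rule that a revision is any selection differing from the previous one are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SixFactors:
    total_time: float        # time taken to answer the question
    revisions: int           # times the operator changed the selected answer
    distinct_answers: int    # number of different answers selected
    reaction_time: float     # time before the first answer was given
    total_selections: int    # total number of selections made
    correct: bool            # correctness of the final answer


def six_factors(shown_at: float, submitted_at: float,
                selections: List[Tuple[float, str]],
                correct_option: str) -> SixFactors:
    """Compute the six factors for one question (hypothetical log format)."""
    if not selections:
        raise ValueError("at least one selection is required")
    ordered = sorted(selections)                 # order by timestamp
    options = [option for _, option in ordered]
    # Assumption: a revision is a selection that differs from the previous one.
    revisions = sum(1 for prev, cur in zip(options, options[1:]) if cur != prev)
    return SixFactors(
        total_time=submitted_at - shown_at,
        revisions=revisions,
        distinct_answers=len(set(options)),
        reaction_time=ordered[0][0] - shown_at,
        total_selections=len(options),
        correct=options[-1] == correct_option,
    )


# Invented example: the operator picks B, switches to C, then returns to B.
factors = six_factors(0.0, 15.0, [(4.0, "B"), (9.0, "C"), (12.5, "B")], "B")
```

A long reaction time combined with several revisions on a correctly answered question would suggest hesitation even though the final answer is right, which is the kind of signal the model is described as surfacing.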

The data collected from these training and evaluation sessions will give organizations the ability to manage and optimize their knowledge assets and human capital.

❚  Publications

Boulanger, D., Seanosky, J., Baddeley, M., & Kumar, V. (2014, December). Learning Analytics in the Energy Industry: Measuring Competences in Emergency Procedures. In Technology for Education (T4E), 2014 IEEE Sixth International Conference on, Kerala, India, 18–21 December (pp. 148–155). doi: 10.1109/T4E.2014.44

Instructional Designs Analytics

Evaluates the quality and effectiveness of learning resources and course designs on student performance and provides recommendations on how to improve the learning experience.

PROJECT | MI-IDEM – Mixed-Initiative Instructional Design Evaluation Model

Researchers: Lino Forner, Jeremie Seanosky, David Boulanger, Jason Bell

❚  Description

Traditionally, the quality of a course offering is measured based on learner feedback at the end of the course. This project offers a method to measure the quality of a course offering continually, formatively, and summatively, using factors such as the quality of resources used, learner motivation, learner capacity, learner competence growth, and instructor competence. These factors are represented in a Bayesian belief network (BBN) in a system called MI-IDEM. MI-IDEM receives streams of data corresponding to these factors as and when they become available, which leads to estimates of the quality of the course offering based on individual factors as well as an overall quality of the offering. Continuous, formative, and summative course quality measurements are imperative to identify weaknesses in the learning process of students and to assist them when they need help. We argue for a comprehensive measurement of course quality and for ensuing initiatives to personalize and adapt course offerings. This work presents two case studies of this novel approach: first, measurement of the quality of a course offering in a blended online learning environment and second, measurement of the quality of a training course offering in an industry environment. (Adapted from [1]).
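To make the idea of updating a quality estimate as evidence streams in more concrete, here is a minimal Python sketch of the simplest possible belief network: one hidden "course quality" node with three observed factor nodes that are conditionally independent given quality. The structure, the factor names, and every probability below are invented for illustration and are not taken from MI-IDEM.

```python
# Assumed prior belief that the course offering is of good quality.
PRIOR_GOOD = 0.5

# Assumed conditional probabilities:
# factor -> (P(factor is high | quality good), P(factor is high | quality poor))
LIKELIHOOD = {
    "resource_quality": (0.8, 0.3),
    "learner_motivation": (0.7, 0.4),
    "competence_growth": (0.9, 0.2),
}


def posterior_good(evidence, prior=PRIOR_GOOD):
    """Return P(quality good | evidence) by Bayes' rule.

    `evidence` maps a factor name to True (observed high) or False (observed low).
    Factors that have not been observed yet are simply left out.
    """
    p_good, p_poor = prior, 1.0 - prior
    for factor, is_high in evidence.items():
        high_if_good, high_if_poor = LIKELIHOOD[factor]
        p_good *= high_if_good if is_high else 1.0 - high_if_good
        p_poor *= high_if_poor if is_high else 1.0 - high_if_poor
    return p_good / (p_good + p_poor)


# Evidence arrives incrementally, and the estimate is refreshed each time.
after_first = posterior_good({"resource_quality": True})
after_second = posterior_good({"resource_quality": True, "competence_growth": True})
```

A full BBN would allow dependencies among the factors themselves; the sketch only shows how each new observation shifts the overall estimate.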

❚  Publications

[1] Seanosky, J., Boulanger, D., Pinnell, C., Bell, J., Forner, L., Baddeley, M., … Kumar, V. S. (2016). Measurement of Quality of a Course. In B. Gros, Kinshuk, & M. Maina (Eds.), The Future of Ubiquitous Learning: Learning Designs for Emerging Pedagogies (pp. 199–216). Berlin, Heidelberg: Springer Berlin Heidelberg.

Forner, L., Kumar, V.S., & Kinshuk. (2013). Assessing design of online courses using Bayesian belief networks. In proceedings of the 2013 IEEE Fifth International Conference on Technology for Education, Kharagpur, India, 18–20 December (pp. 36–42). doi: 10.1109/T4E.2013.17

PROJECT | Curricular Analytics

Researcher: Geetha Paulmani

❚  Description

Curricular learning analytics explores relations between the efficiency of student learning and curricular elements such as topic coverage, prerequisite relations among topics, learning outcomes, instructor effectiveness, student workload, students’ capacity, cultural constraints, socio-economic/political influences, and so on. We are looking for insights and evidence from learner interactions with respect to the curricular elements that allow us to measure the degree of achievement of curricular outcomes. This will require the identification of learning analytics opportunities in each course of a higher education institution’s program; estimation of the volume, velocity, and variety of data to be collected from sensors from students taking these courses; the determination of the kind of support to be offered to students and teachers; and the measurement of the effectiveness of that support. As a first step, we propose to simulate, with synthetic data, an interactive dashboard encompassing the whole curriculum, which is intended to be used by students, teachers, parents, university administrators, and politicians.

❚  Publications

Pinnell, C., Paulmani, G., Kumar, V.S., Kinshuk. (2017). Curricular and learning analytics: a big data perspective. In B. Daniel, R. Butson (Eds.), Big Data and Learning Analytics in Higher Education, (pp. 125–145), Springer. DOI: 10.1007/978-3-319-06520-5_9

Lean & Agile Collaborations

Adapting the Lean & Agile methods of project management to software development projects in the context of academia-academia or industry-academia collaborations.

PROJECT | Optimization of Research Project Management

Researcher: Isabelle Guillot

❚  Description

Successful industry-academia research collaborations (IARCs) in the software development area can be challenging. The literature identifies best practices in IARCs along with process frameworks with the aim of ensuring successful outcomes for both industry and academia, namely: funding opportunities for universities, training and employment possibilities for students, new knowledge leading to innovative products for industry, and on-time delivery of software benefiting the economy, the institution, and the community. This research investigates ways in which core principles of the project management approach, Agile, and the Scrum framework can be applied and lead to the success of IARCs. In addition to IARCs’ common challenges, additional challenges are often faced such as short-term software development projects accomplished by small geographically distributed teams. Early and frequent customer-centric software delivery, constant communications, responsiveness to change, and highly motivated individuals are often key in terms of realizing the positive outcomes in spite of the obstacles inherent to IARCs. (Adapted from [1])

❚  Publications

[1] Guillot, I., Paulmani, G., Kumar, V.S., Fraser, S.N. (2017). Case Studies of Industry-Academia Research Collaborations for Software Development with Agile. In C. Gutwin, S. Ochoa, J. Vassileva, & T. Inoue (Eds.), Collaboration and Technology: 23rd International Conference, CYTED-RITOS International Workshop on Groupware (CRIWG 2017), Saskatoon, Canada, August 9-11. LNCS 10391, (pp. 196–212). Springer, Cham. doi: 10.1007/978-3-319-63874-4_15

Math Analytics

Knowledge Space Theory makes it possible to capture the dependencies between a set of related math concepts to form a knowledge space within which the student’s learning state is constantly recorded and his/her overall learning path tracked. Through continuous adaptive formative assessments, gaps in the student’s comprehension are easily detected and remedied. This is crucial to help students build their confidence.
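The following Python sketch illustrates the core idea under simplifying assumptions: concepts are linked by prerequisite relations, a knowledge state is the set of concepts a student has mastered, and the "outer fringe" is the set of concepts the student is ready to learn next. The concept names and prerequisite map are invented for the example.

```python
# Hypothetical prerequisite map: concept -> concepts that must be mastered first.
PREREQS = {
    "counting": set(),
    "addition": {"counting"},
    "subtraction": {"counting"},
    "multiplication": {"addition"},
    "division": {"multiplication", "subtraction"},
}


def is_valid_state(state):
    """A knowledge state is feasible if every mastered concept has its prerequisites mastered."""
    return all(PREREQS[concept] <= state for concept in state)


def outer_fringe(state):
    """Concepts not yet mastered whose prerequisites are all mastered (ready to learn)."""
    return {c for c in PREREQS if c not in state and PREREQS[c] <= state}


current_state = {"counting", "addition"}
ready_next = outer_fringe(current_state)   # {"subtraction", "multiplication"}
```

An adaptive assessment can then pick its next question from the outer fringe, and a failed item that lies inside the recorded state signals a gap to be remedied.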

PROJECT | MATHEX – Mathematics Experiences

Researcher: Rebecca Guillot

❚  Description

The goal of this project is to study students’ physical, affective, problem-solving, and metacognitive behavior as they work on math exercises. MATHEX also focuses on finding ways to strengthen students’ weaker skills, create opportunities to motivate them, and continuously assess their progress. Mathematics is a key subject in every child’s education. Yet how many children are failing in math, and why? How can we prevent any child from failing? How can we capture the curiosity of the most reluctant ones? How can we create an appetite for the subject in those who dislike it? How can we build self-confidence in students who think negatively about their abilities? The main goal of this research project is to identify and meet the individual needs of every student. To do so, MATHEX is creating software that captures every aspect of the student’s behavior while working on math. These captures will allow teachers to detect and remedy gaps in the student’s comprehension more easily. Detecting those gaps is only the beginning of the process: every gap or weakness has a root cause, and it is crucial to identify these roots clearly in order to work on the actual causes rather than only on the apparent effects. MATHEX also focuses on finding ways and technologies to strengthen students’ weakest skills, build motivation by revealing their strengths, and constantly evaluate their progress by guiding them along optimal learning paths that lead to success.

MATHEX is a system under development targeting K-12 students. It aims to 1) formally model knowledge components in K-12 math curricula using the Knowledge Space Theory, 2) continuously record standardized math experiences, 3) offer adapted guidance so that students can engage with mathematics at levels comfortable to them, 4) explore the effectiveness of the tools used to write math expressions (handwriting, typing, mouse clicking) and of different types of interactive components, and 5) identify the most efficient patterns of competence development and study habits.

❚  Publications

Guillot, R., Boulanger, D., Seanosky, J., Kumar, V., & Kinshuk (2015). Enhancing Mathematical Problem-Solving Experiences through Learning Analytics. In M. Chang, & F. Al-Shamali (Eds.), 2015 Proceedings of Science and Technology Innovations (pp. 57–73). Faculty of Science and Technology, Athabasca University, Canada.

Music Analytics

Assessing music skills, whether singing, playing instruments, or understanding the theoretical concepts of music, is a challenging but exciting task. Nowadays, technologies such as computer vision and digital audio processing make it possible to capture valuable data during the learning process, measure learners’ level of engagement and music skills, and provide timely feedback to musicians in training and to teachers.

PROJECT | The Objective Ear

Researcher: Joel Burrows

❚  Description

The objective ear is an application that, given a pair of performances of a piece of music, judges the amount of progress made between the two performances. The application has two components: an evaluator and a classifier. The evaluator component analyzes each performance to generate a vector of metrics. These vectors are subtracted from each other to give a vector of differences. The difference vector is used as input to a decision tree, a machine learning classifier, which assigns a level of progress to the pair of performances. Testing of the classifier shows that the application provides accurate assessments and could be used in music education environments to aid students in assessing their progress, and to provide useful data on how music students progress. (Retrieved from:
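The evaluator/classifier pipeline described above can be sketched in Python as follows. This is only an illustration: the three metric names, the progress labels, and the hand-written decision tree with its thresholds are assumptions made for the example, whereas the application described here uses its own metrics and a decision tree obtained through machine learning.

```python
# Hypothetical metrics produced by the evaluator for one performance (scores in 0..1).
METRICS = ("pitch_accuracy", "rhythm_accuracy", "tempo_stability")


def difference_vector(earlier, later):
    """Subtract the earlier performance's metrics from the later one's."""
    return tuple(later[m] - earlier[m] for m in METRICS)


def classify_progress(diff):
    """A tiny hand-written decision tree over the difference vector.

    The thresholds are invented; a real classifier would learn them from
    labelled pairs of performances.
    """
    d_pitch, d_rhythm, d_tempo = diff
    mean_change = (d_pitch + d_rhythm + d_tempo) / 3
    if mean_change < -0.05:
        return "regression"
    if mean_change < 0.05:
        return "no clear progress"
    if d_pitch >= 0.10 and d_rhythm >= 0.10:
        return "strong progress"
    return "some progress"


earlier = {"pitch_accuracy": 0.60, "rhythm_accuracy": 0.55, "tempo_stability": 0.70}
later = {"pitch_accuracy": 0.75, "rhythm_accuracy": 0.70, "tempo_stability": 0.72}
level = classify_progress(difference_vector(earlier, later))
```

Working on differences rather than raw scores means the classifier judges change between the two performances, independent of the student's absolute level.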

❚  Publications

Burrows, J., Kumar, V.S. (2018). The Objective Ear: assessing the progress of a music task. In International Journal of Smart Learning Environments, Springer, 5:13 (open access).

Burrows, J., Kumar, V.S., Dewan, A., & Kinshuk (2018). Assessing A Music Student’s Progress, International Conference on Advanced Learning Technologies (ICALT 2018), July 9-13, Mumbai, India (pp. 202-206). doi: 10.1109/ICALT.2018.00055.

Burrows J., & Kumar V.S. (2018). The Objective Ear: Assessing the Progress of a Music Task. In: Chang M. et al. (eds) Challenges and Solutions in Smart Learning. Lecture Notes in Educational Technology. Springer, Singapore, pp: 107-112. Presented at the 2018 International Conference on Smart Learning Environments (ICSLE), Beijing, China. doi: 10.1186/s40561-018-0062-1.

PROJECT | MUSIX – Music Experiences

Researchers: Claudia Guillot, Rebecca Guillot

❚  Description

Today’s music education typically consists of lessons given to students at fixed time intervals, such as once a week. Such a teaching norm can hamper students’ learning efficiency, since teachers cannot know the difficulties students face or help them at the time they study music. Furthermore, teachers become aware of the progress made by their students only when they assess them. This research looks at the possibility of teachers becoming aware of the ‘traceable efforts’ made by the students as they practice and learn during both formal instructional sessions (e.g., classrooms, music labs) and informal practice sessions (e.g., online practice, peer-to-peer social exposure to musical practice). Contemporary music training software does not capture the learning traces observed in such formal or informal sessions, nor does it allow the integration of data from formal music lessons with data from informal music practice. To address this gap, an analytics software package, namely MUSIX, is proposed that enables music students to track their own understanding of music theory, their own challenges in playing a musical instrument, and their own capacity in specific singing techniques. MUSIX’s goal is to track students’ activities through instructional sessions, computer-based exercises, games, and quizzes that will be created within the context of a learning management system to help students learn music at their own pace and explicitly capture the growth in music competence and music confidence. (Retrieved from:

❚  Publications

Guillot, C., Guillot, R., & Kumar, V. (2016). MUSIX: Learning Analytics in Music Teaching. In Y. Li and M. Chang (Eds.), State-of-the-Art and future directions of smart learning (International Conference on Smart Learning Environments, December 23–25, 2015, Sinaia, Romania), Springer, pp. 269–273. Chapter DOI: 10.1007/978-981-287-868-7_31

Guillot, C., Guillot, R., Kumar, V., & Kinshuk (2015). Enhancing Music Prowess through Analytics. In M. Chang, & F. Al-Shamali (Eds.), 2015 Proceedings of Science and Technology Innovations (pp. 93–104). Faculty of Science and Technology, Athabasca University, Canada.

Guillot, R., & Guillot, I. (2015). Music and the Making of Modern Science. Journal of Educational Technology & Society, 18(3), 328-330. (Book Review)

Observational Study Approach

The Internet of Things improves the precision with which learners can be non-intrusively observed, increasing the potential of observational studies to approximate randomized block designs and estimate causal effects without the ethical concerns and recruitment limitations of randomized experiments.

PROJECT | Inferring Drivers of Academic Performance from Observational Data

Researcher: David Boulanger

❚  Description

Traditionally, the research community denotes randomized experiments as the gold standard for science research. Nonetheless, completely randomized experiments – inherently disposed to bias – have raised certain ethical concerns in educational settings. Accordingly, researchers are investigating observational studies as an alternative to the randomized experiments in educational research. Observational study refers to research that explores cause and effect whereby researchers limit their control of independent variables for ethical and logistical reasons. Nevertheless, many researchers believe that observational studies overestimate treatment effects, which can reduce their validity. In addition, the results of observational studies may contain undetected confounding bias, thus leaving them also subject to debate. On the other side, several researchers also maintain that the benefits of randomized experiments should not be oversimplified. As a result, researchers are exploring the compatibility of both study types. The literature has shown that observational studies – using larger and more diverse population samples over longer follow-up periods – can supplement the findings of randomized experiments.

This research proposes an observational study built on the matching techniques prescribed by King (Why Propensity Scores Should Not Be Used for Matching), in which increasingly available new sensors can better observe and/or record teaching and learning experiences in real time. It will demonstrate how learning analytics processes can incorporate observational sensors and advance blocked randomized experiments that measure the impact of analytics. Basically, such techniques strive to enable teachers to step into the roles of analytics researchers using interactive analyses. In addition, this research sets forward matching techniques such as Propensity Score Matching, Coarsened Exact Matching, and Mahalanobis Distance Matching along with their corresponding imbalance metrics (i.e., L1 vector norm, Average Mahalanobis Imbalance, and Difference in Means) to measure and control for sources of bias.
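As a rough illustration of one of these techniques, the Python sketch below applies a simplified Coarsened Exact Matching to invented data and reports the L1 imbalance before and after matching. The covariate names, bin widths, and records are hypothetical; the L1 measure is computed as half the sum of absolute differences between the treated and control relative frequencies over the coarsened cells, and the stratum weights that a full CEM procedure would attach to matched units are omitted.

```python
from collections import Counter


def coarsen(unit, bin_widths):
    """Map a unit's covariates to a cell of coarse bins."""
    return tuple(int(unit[name] // width) for name, width in bin_widths.items())


def l1_imbalance(treated, control, bin_widths):
    """Half the summed absolute difference between the two groups' cell frequencies."""
    t = Counter(coarsen(u, bin_widths) for u in treated)
    c = Counter(coarsen(u, bin_widths) for u in control)
    cells = set(t) | set(c)
    return 0.5 * sum(abs(t[x] / len(treated) - c[x] / len(control)) for x in cells)


def coarsened_exact_match(treated, control, bin_widths):
    """Keep only units whose coarsened cell contains both treated and control units."""
    shared = ({coarsen(u, bin_widths) for u in treated}
              & {coarsen(u, bin_widths) for u in control})
    keep = lambda units: [u for u in units if coarsen(u, bin_widths) in shared]
    return keep(treated), keep(control)


# Invented data: students who did (treated) or did not (control) use an analytics tool.
BINS = {"prior_grade": 10, "age": 5}
treated = [{"prior_grade": 72, "age": 21}, {"prior_grade": 78, "age": 22},
           {"prior_grade": 91, "age": 34}]
control = [{"prior_grade": 75, "age": 23}, {"prior_grade": 55, "age": 40},
           {"prior_grade": 71, "age": 20}]

before = l1_imbalance(treated, control, BINS)
matched_t, matched_c = coarsened_exact_match(treated, control, BINS)
after = l1_imbalance(matched_t, matched_c, BINS)
```

An L1 value of 0 means the two groups have identical coarsened covariate distributions, and a value of 1 means they do not overlap at all; matching trades sample size for lower imbalance.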

Propensity Score Matching is the most popular matching technique in observational studies. This research demonstrates its suboptimality by presenting an observational study with both randomized and non-randomized data using R libraries (MatchingFrontier, CEM, and MatchIt) and the web application framework for R called Shiny. We measure the data imbalance accuracy of the proposed observational study design using the three aforementioned matching techniques and compare their measured levels of imbalance against the level of data imbalance found in an equivalent randomized experiment.

Excerpt adapted from:

Kumar, V. S., Fraser, S., & Boulanger, D. (2018). Open Research and Observational Study for 21st Century Learning. In M. Chang, E. Popescu, Kinshuk, N.-S. Chen, M. Jemni, R. Huang, & J. M. Spector (Eds.), Challenges and Solutions in Smart Learning (pp. 121–126). Singapore: Springer Singapore.

Reading Analytics

Analyzes reading behaviors, recommends concepts that are ready to be learned based on the learner’s profile, and suggests when to transition from practice to reading (and vice versa) to optimize learning.


Researchers: David Boulanger, Jeremie Seanosky, Rebecca Guillot

❚  Description

This research synthesized recent developments in intelligent textbooks over the last five years and identified potential research areas of interest to the AIED community. It characterized traits that make a textbook intelligent and highlighted hot spots in the AIED community such as a) the prediction of academic performance based on students’ reading behaviors, b) the assessment of learner skills based on their reading behaviors, and c) the automatic extraction of concepts taught in textbooks and their interdependencies (e.g., prerequisite, outcome, currency). This review of literature exposed the key components of adaptivity that lead to full-fledged personalization, advocating the need for intelligent adaptivity as a trade-off between personalized provision of reading/learning materials and development and measurement of self-regulatory traits and grit. This research project proposes to embed observational research methods as part of intelligent e-textbooks to automatically and continually infer causality between reading habits, reading activities, subject-matter competences, and metacognitive competences. It also aims at making traditional printed textbooks smart by incorporating interactive components (e.g., videos, animations, 3D modeling, etc.) through augmented reality. Augmented reality observations, along with embedded eye-tracking devices, can capture a rich collection of interaction and physiological data to advance research on optimal reading behaviors and on the effectiveness of merging state-of-the-art technology with traditional media.

Adapted from:

Boulanger, D., & Kumar, V. (submitted). An Overview of Recent Developments in Intelligent e-Textbooks and Reading Analytics. In 2019 AIED Workshop on Intelligent Textbooks.

Regulation Analytics

Self-regulated learning analytics implies assessing the student’s ability to set goals, craft a plan to reach those goals, stick to the plan when working toward those goals, and monitor his/her progress to adapt the goals and the learning path as needed. It also provides students with feedback to guide them toward desirable self-regulatory traits.

PROJECT | Regulate/Manage/Chat

Researchers: Rebecca Guillot, Isabelle Guillot

❚  Description

This project empowers students to manage their own learning processes by integrating tools within a unified learning analytics dashboard, where learners can analyze their competence development and learning performance. This allows them to self-regulate their learning, learn and practice management skills, and better connect with instructors and peers for help. First, students will be able to set goals in terms of competence growth by specifying the level of proficiency they want to reach for each competence, by which deadline, and using which strategy. Second, learners will be presented with a Kanban-like board listing the tasks to be performed to complete all coursework. Students will be able to modify the list of tasks at will and update the completion status of a task as they work on it (e.g., to do, in progress, done). Students will also be able to set goals in terms of tasks or subtasks to be performed by a given date. Finally, the students’ metacognitive skills will be quantitatively assessed based on their interactions with these tools, that is, how the students perceived opportunities to set goals, how often they actually set goals, how frequently they monitored their performance in order to reach their goals, and whether they adapted their goals when these became unrealistic.
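One way such a quantitative assessment might be derived from dashboard interactions is sketched below in Python. The event names, the log format, and the three indicators are assumptions made for illustration; the project does not specify how its metrics are computed.

```python
from collections import Counter


def regulation_summary(events, weeks):
    """Summarize self-regulation indicators from (timestamp, event_kind) pairs.

    Hypothetical event kinds: "goal_prompt_shown", "goal_set",
    "dashboard_viewed", and "goal_revised".
    """
    counts = Counter(kind for _, kind in events)
    prompts = counts["goal_prompt_shown"]
    return {
        # Share of goal-setting opportunities the student actually took up.
        "goal_setting_rate": counts["goal_set"] / prompts if prompts else 0.0,
        # How often the student checked his/her performance.
        "monitoring_per_week": counts["dashboard_viewed"] / weeks,
        # Whether any goal was adapted after being set.
        "adapted_goals": counts["goal_revised"] > 0,
    }


# Invented two-week log for one student.
log = [
    (1, "goal_prompt_shown"), (1, "goal_set"),
    (3, "dashboard_viewed"), (5, "dashboard_viewed"),
    (8, "goal_prompt_shown"),
    (9, "dashboard_viewed"), (10, "goal_revised"), (12, "dashboard_viewed"),
]
summary = regulation_summary(log, weeks=2)
```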

PROJECT | SCRL – Self-/Co-Regulated Learning

Researchers: Colin B. Pinnell, Jason Bell, Moushir M. El-Bishouty, Lanqin Zheng

❚  Description

SCRL proactively engages learners in self-regulated and co-regulated activities based on Winne and Hadwin’s information processing perspective, where regulation is experienced as a four-phase iterative process of 1) defining tasks, 2) setting goals and planning how to achieve them, 3) enacting tactics/strategies, and 4) adapting. SCRL can be customized to observe, promote, and measure other types of regulation models. Moreover, the tool monitors learners’ competence by analyzing their performance and interaction data captured by a set of software sensors that measure competence development. The current phase of the project focuses on two application domains: programming and writing competences. The developed sensors capture students’ interactions while using learning platforms, such as learning management systems (LMS) and integrated development environments (IDE). The captured data constitute the raw data that SCRL uses for monitoring, measuring, and evaluating students’ regulated learning and competence development. SCRL engages learners in creating self-regulated learning (SRL)/co-regulated learning (CRL)-specific activities, traces learners’ interactions related to those SRL/CRL-specific activities, and analyzes/measures the degree of regulation exhibited in these activities. It then infers the relation between enacted SRL/CRL activities and learning performance and models SRL/CRL traits in a distributed causal network to observe the evolution of SRL/CRL traits in each learner. Specifically, SCRL can engage and measure the following based entirely on students’ interactions with the software agents:

  1. Task perception: SCRL engages the self-regulation model at the task perception level by providing an easily understood view on participant competences and activities. It actively engages the participant by calling attention to deficient areas through suggesting initiatives and alerting students to focus on areas through the activation of triggers.
  2. Goal setting/planning: SCRL engages goal setting by suggesting default initiatives for all participants within a field, as well as suggesting initiatives to address problem areas or areas of interest. SCRL aids in planning paths to those goals through the initiative design process, which makes explicit the steps that the participant may take towards their goal.
  3. Enacting: SCRL engages the enactment of initiatives through monitoring participant behavior in relation to the competences addressed by their initiatives. Through this monitoring, SCRL can identify issues in enactment and keep a record of all enactment details available to it.
  4. Adaptation: SCRL engages the adaptation of initiatives through enactment monitoring over time. By keeping a record of all changes within the domain of a monitored competence, SCRL can identify enactment problems and suggest changes to some or all steps of an initiative plan. Such changes can then be made in the initiative design interface.

❚  Publications

Zheng, L., El-Bishouty, M.M., Pinnell, C., Bell, J., Kumar, V., & Kinshuk (2015). A framework to automatically analyze regulation. In G. Chen, V. Kumar, Kinshuk, R. Huang, & S.C. Kong (Eds.), Emerging issues in smart learning. In proceedings of the International Conference on Smart Learning Environment 2014, The Hong Kong Institute of Education, Hong Kong, 24–25 July (pp. 23–30). Berlin, Germany: Springer Berlin Heidelberg. doi: 10.1007/978-3-662-44188-6_3

Zheng, L., Kumar, V., & Kinshuk. (2014). The role of co-regulation in synchronous online learning environment. In proceedings of the 2014 IEEE Sixth International Conference on Technology for Education, Clappana, Kerala, India, 18–21 December (pp. 241–244). doi: 10.1109/T4E.2014.49

PROJECT | SDLeX – Self-Directed Learning Experiences

Researcher: Stella Lee

❚  Description

SDLeX aims to analyze learner interactions on self-directed learning environments and capture holistic learner experiences to provide opportunities for reflection and regulation. It captures learner’s perceptions, physical responses, motivations, emotions, and social reactions that emerge from interacting with a learning environment to offer a positive learner experience and measure such experience.

❚  Publications

Lee, S., Barker, T., & Kumar, V.S. (2016). Effectiveness of a learner-directed model for e-learning. Journal of Educational Technology & Society, 19(3), 221-233.

Lee, S., Barker, T., & Kumar, V. (2011). Learning preferences and self-regulation – Design of a learner-directed e-learning model. In T. Kim, H. Adeli, H.K. Kim, H.J. Kang, K.J. Kim, A. Kiumi, & B.H. Kang (Eds.), Software engineering, business continuity, and education. In proceedings of the International Conferences, ASEA, DRBC and EL 2011, Held as Part of the Future Generation Information Technology Conference, FGIT 2011, in Conjunction with GDC 2011, Jeju Island, Korea, December 8-10 (pp. 579–589). Berlin, Germany: Springer Berlin Heidelberg. doi: 10.1007/978-3-642-27207-3_63

Lee, S., Barker, T., & Kumar, V. (2011). Models of eLearning: The development of a learner-directed adaptive eLearning system. In S. Greener & A. Rospigliosi (Eds.), In proceedings of the 10th European conference on e-learning, Brighton, UK, November 10–11.

Kumar, V., Lee, S., El-Kadi, M., Manimalar, P.-D., Somasundaram, T.S., & Sidhan, M. (2009). Open instructional design. In proceedings of the International Workshop on Technology for Education, Bangalore, India, 4–6 August (pp. 42–48). doi: 10.1109/T4E.2009.5314104

Research Analytics

Research analytics aims at promoting and facilitating the sharing and integration of scientific study results in order to discover new and finer-grained insights in education and to optimize the learning and teaching processes and the environments in which they occur. The ultimate goal is to prevent actionable insights from lying dormant for decades and instead to maximize the benefits of research for the education community.

PROJECT | RPA – Research Publications Analytics

Researcher: Jeremie Seanosky

❚  Description

Research Publications Analytics provides an interface for researchers to add their publications to a database, including functionality to import one’s publications directly from Google Scholar. RPA analyzes each publication entry for missing elements. It also provides a way to generate PDF reports that include indexed values for the quality of the publication avenue.
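The missing-element check mentioned above could look roughly like the following minimal sketch. The field names and the REQUIRED_FIELDS tuple are hypothetical; the source does not specify which elements RPA checks or how entries are stored.

```python
# Hypothetical required fields; RPA's actual schema is not given in the source.
REQUIRED_FIELDS = ("authors", "year", "title", "venue", "pages", "doi")


def missing_elements(entry: dict) -> list:
    """Return the required fields that are absent or blank in a publication entry."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = entry.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Example entry with a blank page range and no DOI.
entry = {
    "authors": "Lee, S., Barker, T., & Kumar, V.",
    "year": 2011,
    "title": "Models of eLearning",
    "venue": "10th European Conference on e-Learning",
    "pages": "",
}
print(missing_elements(entry))
```

A report generator could then list these fields next to each entry so that the researcher knows what to complete.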

PROJECT | xDesign – Experiment Design Simulated Environment

Researcher: Moushir El-Bishouty

❚  Description

This research offers a simulation environment for learners to design an experiment and analyze the effect of a study using simulated data and research processes. The simulation environment includes interactions on human research ethics, data validation, and statistical analysis.

Sentiment Analytics

Sentiment analytics detects the learner’s emotions during the learning process and identifies their cause using computer vision, audio, neurological, physical, and text data.


Researcher: Steven Harris

❚  Description

This research presents the initial results of a text-based natural language processing classifier that detects frustrated or confused students in an online learning environment so that instructors can be alerted to potential problems as soon as possible. The research investigates whether an analysis of student interactions in a digital learning environment can successfully identify individual students experiencing academic difficulties or, more broadly, potential content deficiencies. The raw data for our work consists of individual forum posts and messages from a single course, observed over several offerings. Pre-processing includes anonymizing the data to replace student names with unmapped serial numbers and to obscure any other personally identifiable information. Our next step was to manually annotate a section of the data so that it could be used to train a set of candidate algorithms that, according to the literature, have been successful in similar applications, and to evaluate their performance on the task. To this end, 500 pieces of text were initially pulled from the data and annotated: first as to whether or not they contained an opinion (objective vs. subjective), and second, where the sentences did contain an opinion, whether the orientation was positive or negative. Three classifiers were developed: Naïve Bayes, a pure SVM, and a hybrid SVM that uses principal component analysis to reduce variability. The results showed that both SVM-based classifiers performed at a level comparable to that of manual classification, though the hybrid SVM classifier performed slightly better. In addition to textual analysis, we also aim to develop a multiple-media sentiment analysis engine in which other sources of information, such as facial expressions and physiology trackers, can be used to augment sentiment observations.
Secondary areas of related research include using similar algorithms to identify areas of weak learning content, based on identifying areas where students regularly seem to have problems, and ultimately even identifying student learning styles and suggesting additional materials that may aid in their success with the course. (Adapted from:
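To make the classification step concrete, here is a minimal sketch of a multinomial Naïve Bayes text classifier with add-one smoothing, written with the Python standard library only. It is not the project's classifier: the class name, the whitespace tokenization, and the six toy posts are invented for illustration, and the SVM and PCA-based variants are not reproduced.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesText:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in text.lower().split():
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy posts; the study's annotated data is not reproduced here.
posts = [
    "i am totally lost on this assignment",
    "this unit is confusing and frustrating",
    "i do not understand the instructions at all",
    "great explanation thanks that was clear",
    "the examples were helpful and easy to follow",
    "thanks i enjoyed this unit",
]
labels = ["negative", "negative", "negative", "positive", "positive", "positive"]

model = NaiveBayesText().fit(posts, labels)
print(model.predict("the instructions are confusing"))
print(model.predict("thanks that was helpful"))
```

In the two-stage annotation scheme described above, one such model could separate objective from subjective posts and a second could assign polarity to the subjective ones.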

❚  Publications

Harris, S. & Kumar, V.S. (2018). TutorAlert: Identifying Student Difficulty in a Digital Learning Environment, International Conference on Advanced Learning Technologies (ICALT 2018), July 9-13, Mumbai, India (pp. 199-201). doi: 10.1109/ICALT.2018.00054

Harris, S.C., Zheng, L., Kumar, V., & Kinshuk. (2014). Multi-dimensional sentiment classification in online learning environment. In proceedings of the 2014 IEEE Sixth International Conference on Technology for Education, Clappana, Kerala, India, 18–21 December (pp. 172–175). doi: 10.1109/T4E.2014.50

Sport Analytics

Sport analytics consists of designing sportswear that tracks players’ movements as well as developing sensors in sports equipment that assess players’ dexterity and velocity. The analyzed data make it possible to provide concise, real-time feedback that helps teams optimize their performance and assists referees/umpires in the decision-making process, with careful attention paid to not interfering with the sport itself.

PROJECT | BOOTT – Badminton Officials Online Testing and Training

Researchers: Jeremie Seanosky, Rebecca Guillot

❚  Description

This research aims to develop a real-time reconstruction of badminton matches from camera sensor data into scenes viewable in virtual reality (VR). The project will integrate VR technology with real-time gameplay to increase audience engagement and the accessibility of the sport. This VR reconstruction will allow audiences to review plays during the match at game speed and in slow motion.

The VR reconstruction can be made available to audiences viewing courtside and online, allowing for a more immersive viewing experience. It also provides additional tools for use during a match, including viewing shots from different angles and at different speeds immediately, without waiting for a replay after the fact.

The scope of this research is enhancing the experience, accessibility, and enjoyment of the audience (spectators and coaches), in person and online. This research seeks to use VR to enhance audience interest by enabling audiences to contextualize plays in a match. The project seeks to add viewpoints by providing another viewing dimension through VR and by constructing the tools with which audiences can socially share specific points of interest in a match. For example, due to the speed of play in badminton, it can be difficult for a spectator to observe many technical aspects of an individual play without being at an ideal viewing location and focusing on a specific position of play. Similarly, spectators and coaches may miss certain plays and notice them only when viewing video of the match later.

Recent research related to sports viewership has focused on smart arenas and on using data to improve the experience of attending sporting events. This research narrows the focus to the actual viewing of the match. The viewing experience can be improved by making a form of virtual instant replay available to viewers attending the event and to those watching over the Internet.

PROJECT | Slapshot

Researchers: Geoffrey Glass, Colin Pinnell

❚  Description

Analytics is about awareness of the states of competence of users. Users can become aware of their own states of competence at different levels. Analytics measures the skills of each user and engages them in taking initiatives to hop from one competence state to the next. The hops happen mostly gradually, depending on the capacity of the user, punctuated by dramatic jumps. Analytics identifies such scenarios where dramatic jumps are necessary and offers the information needed to enact such jumps. Learning analytics, in the context of ice hockey, is the study of detection, analysis, and generation of moments of progress awareness about skaters, goaltenders, and team experiences. By employing recent advances in statistics, machine learning algorithms and sensor technologies, this project aims to build a big data learning analytics infrastructure that provides progress awareness scenarios and helps minor and junior hockey players, coaches, and parents assess and improve player and team performance using data from sensors and existing game records and statistics. The project entails hardware and software development, novel analytics, and user experience considerations. More precisely, it aims at inferring play skills, tactics, and strategies from observational data from sensors, video, subject matter experts, and reflection/regulation activities of players.

Traffic Analytics

Traffic analytics applies techniques developed in educational data mining and learning analytics, together with smart technologies, to better assess and improve traffic conditions.

PROJECT | TADA – Traffic Analytics with Data Accretion

Researchers: Liao Ming, Raushan Kumar Singh

❚  Description

TADA stands for Traffic Analytics with Data Accretion, a new tool/technique that allows contextualization of sensor data from physical sensors (e.g., GPS, vehicle sensors, traffic sensors) and from personal observations using physiological sensors. Contemporary mathematical and computational models are good at predicting the flow density of traffic; however, they are impractical for automating the flow of traffic. Hence, we propose a novel technology that integrates mathematical and causal traffic models with the goal of optimizing the flow of traffic in real time. The technology will be able to use historical data as well as real-time data. The proposed technology can be embedded in each traffic light controller along a major artery in an urban center. These controllers will dynamically synchronize/regulate the timing of the lights depending on traffic data. Moreover, the number of vehicles crossing a traffic light will be estimated through computer vision techniques. The traffic lights will wirelessly communicate and cooperate with each other, in a semi-automatic fashion, with the goal of a smoother flow of traffic, thereby saving "liquid gold" (fuel).

Classical traffic models are mostly based on the treatment of vehicles on the road, their statistical distribution, or their density and average velocity as a function of space and time. Most models employ a passive approach to traffic optimization by using techniques ranging from cellular automata, particle-hopping, car-following, and gas-kinetics up to fluid dynamics. That is, traffic data is collated a priori, and the models are validated post hoc. In a compelling argument for the need to change the manual adjustments to traffic signals, the literature shows, using limited simulation models, that the best traffic signal performance could be achieved using reinforcement learning.

We propose to use a dynamic Bayesian belief network (DBBN) to model and simulate urban ground traffic behavior and to show how the DBBN optimizes traffic flow in real time by controlling the states of traffic signals. GPS and GIS (geographic information system) technologies enable real-time access to traffic data such as a vehicle’s location, speed, and direction. This real-time data, in combination with data specific to road segments, is sufficient to model entities affecting the flow of traffic in a DBBN.
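As a rough illustration of the idea, the sketch below implements the simplest possible dynamic Bayesian network: one hidden congestion variable per time slice with a single speed observation (that is, a hidden Markov model), updated by exact filtering. The states, all probabilities, and the signal-timing rule are assumptions made for the example; they are not taken from the TADA project, whose network would model many more entities.

```python
# The states of the hidden congestion variable and all probabilities below are
# illustrative assumptions, not values from the TADA project.
STATES = ("free", "congested")

# P(state_t | state_{t-1}): traffic conditions tend to persist between slices.
TRANSITION = {
    "free": {"free": 0.8, "congested": 0.2},
    "congested": {"free": 0.3, "congested": 0.7},
}

# P(observed speed band | state): a GPS-derived reading for the road segment.
EMISSION = {
    "free": {"fast": 0.9, "slow": 0.1},
    "congested": {"fast": 0.2, "slow": 0.8},
}


def filter_step(belief, observation):
    """One predict-then-update step of exact filtering over the two time slices."""
    predicted = {
        s: sum(belief[prev] * TRANSITION[prev][s] for prev in STATES)
        for s in STATES
    }
    unnormalized = {s: predicted[s] * EMISSION[s][observation] for s in STATES}
    total = sum(unnormalized.values())
    return {s: p / total for s, p in unnormalized.items()}


def choose_green_seconds(belief, short=20, long=45):
    """Hypothetical control rule: extend the green phase when congestion is likely."""
    return long if belief["congested"] > 0.5 else short


belief = {"free": 0.5, "congested": 0.5}
for reading in ["slow", "slow", "fast"]:
    belief = filter_step(belief, reading)
    print(reading, round(belief["congested"], 3), choose_green_seconds(belief))
```

Because the conditional probability tables are explicit dictionaries, an end user could inspect or adjust them directly, which is the kind of manipulability the Bayesian approach is meant to offer.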

Although the proposed Bayesian technique has no more intrinsic ability to represent causation than a traditional mathematical model, it is more explicit and is directly manipulable by end users (e.g., the traffic control department of a city). Also, Bayesian models use the same amount of information as is available to contemporary mathematical and statistical models; hence, model updates and validation metrics are comparable, if not identical, across all models. Finally, in line with the goals of contemporary traffic control models, the proposed Bayesian traffic signal control has the ability to 1) model the causal elements of traffic situations, 2) control traffic signals using probabilistic inferences, 3) regulate traffic for vehicle-specific situations (e.g., ambulances, fire engines, VIP vehicles), and 4) predict undesirable traffic situations and propose timely alternatives.

You will find in the material below some initial research done with a deep learning approach:




User Experience

User experience research identifies and minimizes barriers to learning analytics adoption in order to analyze the effectiveness of analytics-generated feedback and optimize its educational benefits.

PROJECT | Reducing Barriers to Learning Analytics Adoption

Researcher: Isabelle Guillot

❚  Description

Assessing the user experience of learning analytics tools is central to their successful adoption. As learning analytics (LA) tools are increasingly adopted across diverse student-facing applications, the need for a valid, generalized, and standardized instrument to measure the user experience of these tools becomes vital. This research investigates frameworks and instruments (e.g., the 3-TUM [three-tier Technology Use Model]) for understanding student experiences with LA tools through the constructs of perceived self-efficacy, perceived satisfaction, perceived usefulness, behavioral intention, system quality, and effectiveness. The goal of this project is to minimize barriers to LA adoption by continuously improving learner experiences based on the user feedback received. In particular, this research is interested in 1) exploring, executing, and assessing recruitment and retention strategies for smart learning environment studies; 2) assessing the user experience of LA tools; 3) assessing the technology acceptance of LA, augmented reality (AR), and virtual reality (VR); and 4) developing protocols to improve the technology acceptance of users.

❚  Publications

Guillot, I., Guillot, C., Guillot, R., Seanosky, J., Boulanger, D., Fraser, S.N., Kumar, V., & Kinshuk (2019). Challenges in recruiting and retaining participants for smart learning environment studies. In: Chang M. et al. (eds) Foundations and Trends in Smart Learning. Lecture Notes in Educational Technology. Springer, Singapore (pp. 61-66). (ICSLE 2019), March 18-20, Denton, USA. doi: 10.1007/978-981-13-6908-7_8

Writing Analytics

Assisting a student during the writing process is a colossal task that requires combining the forces of state-of-the-art natural language processing and deep learning techniques. Predicting essay final scores and rubric scores is just one of many ways of providing formative feedback to students to help them reinforce their writing skills.


Researchers: David Boulanger, Jeremie Seanosky, Rebecca Guillot

❚  Description

This research project aims at developing generalizable automated essay scoring models. Several approaches have been tested for the prediction of essay holistic scores, such as multiple linear regression, deep neural networks (NN), and recurrent neural networks (RNN). Moreover, state-of-the-art linguistic analysis tools were leveraged to extract more than 1,400 writing metrics. Our team has worked extensively with one of the most popular freely available essay datasets, namely the eighth essay dataset from the Automated Student Assessment Prize contest (622 words on average, written by Grade 10 students, narrative, scoring scale between 10 and 60, rubric scoring, no bias in the way holistic scores were derived, tested the writing ability of students). The contest's datasets have served as a point of reference against which researchers have measured and compared the performance of their automated essay scoring models. Our research has demonstrated the limitations of this dataset (e.g., the distribution of essay scores was imbalanced, preventing the machine from adequately learning what characterizes high-quality essays) and highlighted the data requirements that training generalizable scoring models imposes. As part of our work, we also developed holistic and rubric scoring models to tentatively explain how holistic scores were derived from rubric scores. It was found that the level of agreement between human and machine graders reached on holistic scores did not translate into comparable levels of agreement on rubric scores, which were significantly lower. Our next step is to build a hybrid RNN+NN architecture in order to locate the text passages that contributed to the holistic and rubric scores of an essay and to report the suboptimal writing metrics associated with these lower-quality text passages.
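The description above does not name the metric used to measure agreement between human and machine graders. As an assumption, the sketch below uses quadratic weighted kappa, the metric adopted by the Automated Student Assessment Prize contest; the function name and the example scores are invented for illustration.

```python
def quadratic_weighted_kappa(human, machine, min_score, max_score):
    """Agreement between two integer score lists: 1.0 is perfect, 0.0 is chance level."""
    if max_score <= min_score:
        raise ValueError("max_score must be greater than min_score")
    n = max_score - min_score + 1
    total = len(human)

    # Observed joint counts and each rater's marginal histogram.
    observed = [[0] * n for _ in range(n)]
    hist_h, hist_m = [0] * n, [0] * n
    for h, m in zip(human, machine):
        observed[h - min_score][m - min_score] += 1
        hist_h[h - min_score] += 1
        hist_m[m - min_score] += 1

    numerator = denominator = 0.0
    for i in range(n):
        for j in range(n):
            weight = (i - j) ** 2 / (n - 1) ** 2  # quadratic disagreement penalty
            expected = hist_h[i] * hist_m[j] / total  # count expected by chance
            numerator += weight * observed[i][j]
            denominator += weight * expected
    if denominator == 0:  # both raters gave every essay the same single score
        return 1.0
    return 1.0 - numerator / denominator


# Invented scores on a 1-5 scale for six essays.
human_scores = [2, 3, 4, 4, 5, 1]
machine_scores = [2, 3, 3, 4, 5, 2]
print(round(quadratic_weighted_kappa(human_scores, machine_scores, 1, 5), 3))
```

Computing the same statistic separately for the holistic score and for each rubric dimension would allow the two levels of agreement to be compared directly.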

A learning analytics system has been extended to address the English writing domain (in addition to coding) for a breadthwise expansion. The system has also been infused with analytics solutions targeting competence, grade prediction, and regulation traits, thus offering deeper insights. Writing experiences of students on the Moodle learning management system are incrementally tracked. A dashboard for students and a dashboard for teachers have been designed to monitor the performance of each individual student and of the overall classrooms and to support regulation activities among students. Students receive feedback in the form of rubric scores on their essays, which they can submit as many times as the teacher allows in order to observe how they progress during the writing process. Teachers, for their part, are empowered to quickly identify lagging or at-risk students, giving them the opportunity to remedy these situations promptly.

❚  Publications

Boulanger, D., & Kumar, V. (2019, accepted). Shedding Light on the Automated Essay Scoring Process. In Educational Data Mining (EDM 2019), Poster, July 2-5, Montreal, Canada.

Boulanger, D., & Kumar, V. (2018). Deep Learning in Automated Essay Scoring. International Conference on Intelligent Tutoring Systems (ITS) 2018. In R. Nkambou, R. Azevedo, & J. Vassileva (Eds.), Lecture Notes in Computer Science. Springer, Switzerland, pp. 294-299. Presented at the 2018 Intelligent Tutoring Systems (ITS), 14th International Conference, Montreal, Canada, June 11-15, 2018.

Boulanger, D., Clemens, C., Seanosky, J., Fraser, S., & Kumar, V.S. (2018, accepted). Performance analysis of a serial NLP pipeline for scaling analytics of academic writing process. In D. Sampson, D. Ifenthaler, J.M. Spector, P. Isaías, & S. Sergis (Eds.), Learning Technologies for Transforming Teaching, Learning and Assessment at Large Scale. Springer.

Kumar, V.S., Fraser, S.N., Boulanger, D. (2017). Discovering the predictive power of five baseline writing competences, Journal of Writing Analytics, 1 (1), pp. N/A.

Boulanger, D., Seanosky, J., Guillot, R., & Suresh, V. (2017). Breadth and Depth of Learning Analytics. In E. Popescu, Kinshuk, M. K. Khribi, R. Huang, M. Jemni, N.-S. Chen, & D. G. Sampson (Eds.), Innovations in Smart Learning (pp. 221–225). Singapore: Springer Singapore.


Researcher: Clayton Clemens

❚  Description

The concept of translating natural language into a form that computers can understand is decades old, and the applications of that quantification to education have followed not far behind. The field of natural language processing has attempted to solve the problem of natural language ambiguity through the analysis and the tagging of written text. Building upon this, other systems are capable of producing a number of metrics that describe the properties of text. These analytics have been used in studies to assess learner competences in more detail, usually for the purposes of automated essay scoring. These studies usually focus on a final product, however, after the student has completed their work. There is little consideration for the writing process itself, that is, the manner in which the work was actually done. There is no data on how a student’s competences in writing may evolve within a single composition, or over the course of many compositions. Software that supports the real-time recording of writing and derives analytics from each ‘snapshot’ will allow for the consideration of this evolution on a large scale. Having such data will enable the determination of how competences grow and interrelate and to what degree this growth is unique to students or intrinsic to the way writing is taught. Analytics at this level will provide educators with deeper, more meaningful insights into the myriad aspects of student composition and enable them to offer assistance and feedback that they were previously unable to provide. (Retrieved from:
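The snapshot idea can be sketched as follows: each recorded state of a composition is turned into a small set of metrics, and the sequence of metric sets traces how the text evolves. The Snapshot class, its fields, and the three metrics are hypothetical stand-ins; the project's actual recording format and its much richer metric set are not described here.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """State of a composition at one moment; both fields are hypothetical."""
    seconds_elapsed: int
    text: str


def snapshot_metrics(snapshot):
    """Derive a few simple text metrics from one snapshot."""
    words = snapshot.text.split()
    normalized = snapshot.text.replace("!", ".").replace("?", ".")
    sentences = [s for s in normalized.split(".") if s.strip()]
    unique = {w.lower().strip(".,;:!?") for w in words}
    return {
        "seconds_elapsed": snapshot.seconds_elapsed,
        "word_count": len(words),
        "sentence_count": len(sentences),
        "type_token_ratio": len(unique) / len(words) if words else 0.0,
    }


def metric_trajectory(snapshots):
    """Metrics for every snapshot, in time order, so growth can be inspected."""
    ordered = sorted(snapshots, key=lambda s: s.seconds_elapsed)
    return [snapshot_metrics(s) for s in ordered]


# Invented snapshots of one short composition, deliberately out of order.
snapshots = [
    Snapshot(60, "The river was cold."),
    Snapshot(30, "The river"),
    Snapshot(120, "The river was cold. We crossed it anyway."),
]
for row in metric_trajectory(snapshots):
    print(row)
```

Comparing such trajectories within one composition, or across many compositions by the same student, is what distinguishes process-level analytics from scoring only the final product.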

❚  Publications

Clemens, C., Kumar, V.S., Boulanger, D., Seanosky, J., Kinshuk. (2018). Learning traces, competence assessment, and causal inference for English composition. In Frontiers of Cyberlearning: Emerging technologies for teaching and learning. J.M. Spector, V.S. Kumar, A. Essa, Y-M. Huang, R. Koper, R.A.W. Tortorella, T-W. Chang, Y. Li, Z. Zhang (editors). Springer Singapore, pp. 49-68.

Boulanger, D., Seanosky, J., Clemens, C., Kumar, V.S., Kinshuk. (2016). SCALE: A smart competence analytics solution for English writing, International Conference on Advanced Learning Technologies, pp. 468-472, Austin, TX, USA, July 25-28. doi: 10.1109/ICALT.2016.108

Clemens, C., Kumar, V., & Mitchnick, D. (2013). Writing-based learning analytics for education, workshop on learning analytics. In proceedings of the 2013 IEEE 13th International Conference on Advanced Learning Technologies, Beijing, China, 15–18 July (pp. 504-505). doi: 10.1109/ICALT.2013.164

Clemens, C., Chang, M., Wen, D., Kumar, V., Lin, O., & Kinshuk. (2011). Traces of writing competency – Surfing the classroom, social, and virtual worlds. In proceedings of the 2011 11th IEEE International Conference on Advanced Learning Technologies, Athens, Georgia, USA, 6–8 July (pp. 625–626). doi: 10.1109/ICALT.2011.193

Kumar, V., Chang, M., & Leacock, T.L. (2011). Mobile computing and mixed-initiative support for writing competence. In S. Graf, F. Lin, Kinshuk, & R. McGreal (Eds.), Intelligent and adaptive learning systems: Technology enhanced support for learners and teachers (pp. 327–341). Hershey, PA: IGI Global. doi: 10.4018/978-1-60960-842-2.ch021

Kumar, V., Chang, M., & Leacock, T. (2011). Ubiquitous writing: Writing technologies for situated mediation and proactive assessment. Ubiquitous learning: An international journal, 3(3), 173–188.

Kumar, V., Chang, M., & Leacock, T. (2010). Ubiquitous writing: In search of traces of writing skills and opportunities for proactive feedback. Paper presented at the International Conference on Ubiquitous Learning, Vancouver, Canada, 10–11 December (pp. N/A).