The handling of large amounts of data to analyze certain behaviors has reached great popularity in the decade 2010–2020. This phenomenon has been called Big Data. In the field of education, the analysis of these large amounts of data, generated largely by students, has begun to be introduced in order to improve the teaching-learning process. The objective of this paper is to analyze the scientific production on Big Data in education in the databases Web of Science (WOS), Scopus, ERIC, and PsycINFO. A bibliometric study was carried out on a sample of 1491 scientific documents. Among the results, the increase in publications in 2017 and the emergence of certain journals, countries, and authors as references in the subject matter stand out. Finally, potential explanations for the study findings and suggestions for future research are discussed.
Big Data, education, bibliometric study, Internet
Big Data is a currently fashionable concept that has been present in the specialized literature for more than a decade. It alludes to the large amount of data generated at every moment as a result of technological evolution and the interactions of people in digital spaces (Waller and Fawcett 2013). However, it is only recently that it has reached its greatest impact as an object of research, as a result of technological advances and the development of platforms for interaction among users and between users and content, which produce an enormous amount of data (Ghani et al.).
Specifically, Big Data refers to the large volume of data generated by the development of technology and the continuous actions and interactions of users in digital environments (Hussain and Cambria 2018). Two concepts related to Big Data are data learning mining and learning analytics. Data learning mining comprises the techniques and procedures used to extract useful and relevant information from the large amount of data reported by educational platforms (Menon et al. 2017). Learning analytics, in turn, is a construct derived from data mining that alludes to the management, processing, and analysis of students’ educational data, which are studied with the purpose of improving and optimizing the learning process (Liang et al. 2016).
That is why, today, society is in what experts call the Big Data era, which poses new challenges and offers new benefits through the analysis of all the data generated in highly quantified environments (Pugna et al. 2019). Since the arrival of the new millennium, the Internet and the Web have recorded data on users, their movements, and their interactions, creating a large bank of useful and relevant information whose analysis offers great potential for studying the needs and demands of people (Chen et al. 2012; Khan et al. 2018). Technological development and the emergence of popular social networks have led people to become active agents in digital media, exponentially multiplying the amount of data generated (Ni et al. 2016).
All this has led to great interest among researchers in studying the enormous presence of data in every aspect of people’s lives (Williamson 2015; Williams et al. 2017). Thus, the European Commission stated that the Horizon 2020 report would be a major step towards the study of Big Data, with the aim of developing strategies to conduct research and innovation in this field of knowledge (Jin et al. 2015). The purpose of Big Data analysis is to collect data from various electronic sources and transform them into relevant information in order to improve the services that users habitually access (Jagadish 2016).
Big Data is nourished by an era marked by the connectivity of people (Veltri 2017), in which creating content, sharing it, and interacting with other users in the community are everyday activities (Hussain and Cambria 2018). This provides a great opportunity to learn about people's needs as well as their psychological state and their behavior in virtual spaces (Eichstaedt et al. 2015). Given the peculiarities of the society in which we live, data are growing at great speed (Al Nuaimi et al. 2015). So much so that volume, velocity, variety, veracity, and value are now regarded as fundamental characteristics inherent to Big Data.
These data are unstructured and come in various formats such as text, image, voice, and video (Injadat et al. 2016). In order to analyze all the data in the digital environment, the concept of data science arises with the intention of managing and interpreting the data by means of specialized programs with high processing capacity (Hicks and Irizarry 2018). These developments have led to the evolution of predictive analytics (Waller and Fawcett 2013), which makes it possible to adapt services to the current trends demanded by users (Saiki et al. 2018). The data are therefore used to predict and make decisions about the future (Ghani et al.), based on a strategic design that analyzes the requirements of the audience (Perlado-Lamo-de-Espinosa et al. 2019).
According to Moreno-Carriles (2018), the literature reveals that the treatment of Big Data has expanded into different fields of action, such as security, customer service, public services, preservation of the environment, the economy, and finance, in addition to education, which is the field of interest in this study. Big Data, which has mainly been exploited in the business world, is now also being widely used in education (Aretio 2017), placing us in a new phase of teaching and learning based on the study of data generated by students (Gibson 2017). The data derived from the different educational agents (teachers and learners) are currently being processed in order to improve the quality and experience of learning processes in digital environments (Liang et al. 2016).
Likewise, the data produced by educational content management platforms are being used to develop tools and services adapted to the particularities of contemporary education, which is highly conditioned by the development of educational technology (Merceron et al. 2015). The immersion of students in distance and ubiquitous education has generated a great flow of data about their activity (Seufert et al. 2019).
However, experts such as Menon et al. (2017) consider that data mining techniques in the field of education are, to this day, not completely successful, so not all meaningful and valuable information is extracted. This is because the handling and treatment of Big Data require teachers to collaborate with specialists in order to obtain the relevant information from the data reported by the use of educational digital tools and resources (Huda et al. 2017). These tools and resources allow learners to perform all kinds of actions in virtual spaces, and the data generated are used to obtain knowledge about their activity, performance, and satisfaction (Elia et al. 2019).
An effective analysis of Big Data contributes to the promotion of new and better educational experiences (Reidenberg and Schaub 2018). It also improves the didactic programming tasks that teachers carry out with the help of scientists specializing in data analysis, and it supports an efficient selection of strategies and decisions for approaching the formative process. These strategies can be matched to the demands of a learning group that is increasingly familiar with technology and that seeks innovative learning as a result of the study of data (Huda et al. 2018). All of this is based on a predictive analysis of the data collected (Daniel 2015; Daniel 2017).
Therefore, Big Data and the analytics of the interactions of educational agents in virtual environments are positioned as new ways to address the shortcomings of the educational system (Picciano 2012), improving productivity, innovation (Sanchez and Ball 2015), and the personalization of learning (Dishon 2017). Accordingly, the objective of this study is to analyze the scientific output, understood as the published articles on Big Data in education, in the Web of Science (WOS), Scopus, ERIC, and PsycINFO databases. The following research questions were identified:
RQ1. What is the state of scientific production over time?
RQ2. Which journals and countries concentrate the greatest scientific production on Big Data in education?
RQ3. Which articles have the greatest impact in the area of Big Data in education?
RQ4. What are the main lines of research in this field that are derived from the keywords of scientific articles?
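As an illustration of how questions such as RQ1 and RQ4 can be approached, the following is a minimal sketch in Python that counts documents per year and tallies keyword frequencies. It is not the procedure used in the study: the `records` list, its field names, and the two helper functions are hypothetical and serve only to show the kind of counting a bibliometric analysis relies on.

```python
from collections import Counter

# Hypothetical sample records (an assumption for illustration only);
# a real study would export such metadata from WOS, Scopus, ERIC, or PsycINFO.
records = [
    {"year": 2015, "keywords": ["big data", "learning analytics"]},
    {"year": 2017, "keywords": ["big data", "education"]},
    {"year": 2017, "keywords": ["Big Data", "data mining"]},
    {"year": 2018, "keywords": ["education", "learning analytics"]},
]


def publications_per_year(records):
    """Count documents per publication year (RQ1), ordered by year."""
    counts = Counter(r["year"] for r in records)
    return dict(sorted(counts.items()))


def keyword_frequencies(records):
    """Count in how many documents each keyword appears (RQ4)."""
    counts = Counter()
    for r in records:
        # Normalize case and whitespace so "Big Data" and "big data" match;
        # the set prevents double counting within a single document.
        counts.update({k.strip().lower() for k in r["keywords"]})
    return counts


per_year = publications_per_year(records)
top_keywords = keyword_frequencies(records).most_common(3)
print(per_year)
print(top_keywords)
```

In practice the keyword counts would typically feed a co-occurrence analysis, but simple frequencies of this kind are enough to show which years and terms dominate a sample.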
Full paper: Big Data in Education. A Bibliometric Review. Soc. Sci. 2019.