This paper proposes a set of non-parametric statistical functions designed to analyse non-linear systems within the context of refined educational management. Recognizing the limitations of traditional parametric methods, particularly their reliance on stringent distributional assumptions, this research develops a comprehensive framework that accommodates the complexities inherent in educational data, which often exhibit non-normal distributions and diverse variable types. The proposed functions enable robust evaluation of relationships among educational metrics, facilitating improved hypothesis testing and the identification of meaningful patterns in student performance and teaching effectiveness. Furthermore, the study discusses the practical application of these methods in educational settings, demonstrating their utility in evaluating program efficacy and informing data-driven decision-making processes. Through methodological innovation, this work aims to enhance analytical rigor in educational research, ultimately contributing to improved management strategies and optimized learning outcomes in dynamic educational environments. Future work will explore integration with federated learning for multi-institutional data collaboration.
In the realm of educational management, the analysis of complex, nonlinear systems often presents significant challenges due to the multifaceted nature of educational data, which frequently includes variables such as student performance, demographic factors, and institutional policies. Traditional parametric statistical methods, reliant on assumptions of normality and linearity, can fall short in accurately modelling these relationships, particularly in cases where data distributions are non-standard or when sample sizes are small. Non-parametric statistical methods provide a robust alternative by offering flexibility to analyse ordinal and nominal data without stringent distributional requirements. Techniques such as the Mann-Whitney U test, Kruskal-Wallis H test, and Spearman’s rank correlation are particularly useful in this context, as they effectively assess differences and correlations amidst the inherent nonlinearities and complexities of educational systems. By leveraging these non-parametric approaches, educational researchers can uncover significant insights that inform decision-making processes, helping to enhance teaching efficacy and improve student outcomes. Furthermore, such insights can significantly enhance the career development of students [1-6].
The traditional education and scientific research management models face numerous challenges that can hinder their effectiveness and adaptability in an ever-evolving global landscape. The key challenges are outlined below. For traditional education management models, they relate to standardization versus individuality, outdated curricula, inequity in access, resistance to change, and teacher training; for scientific research management models, they relate to funding constraints, the reproducibility crisis, interdisciplinary collaboration, and publication pressure. Specifically, traditional education systems often prioritize standardized testing and curriculum over personalized learning experiences. This can lead to disengagement among students who may not fit the standard learning mold [7]. Many curricula do not keep pace with technological and societal advancements, leaving students ill-prepared for the modern workforce and critical global issues [8]. There are significant disparities in access to quality education, particularly between urban and rural areas, and among different socio-economic groups. This inequity exacerbates existing social problems [9]. Educational institutions often resist implementing innovative teaching methods and technologies due to bureaucratic inertia or lack of resources [10]. Teachers often receive inadequate training to adapt to new pedagogies or technologies, which can limit their effectiveness in the classroom [11]. Turning to scientific research management models, research funding is often competitive, scarce, and tied to short-term projects, which can limit innovative long-term research [12]. Many scientific fields are grappling with issues around the reproducibility of research findings, which raises questions about the reliability of published results.
Traditional research models often silo disciplines, making interdisciplinary collaboration difficult despite the importance of breadth in tackling complex global challenges [13]. Researchers often face immense pressure to publish frequently, which can lead to lower quality research and practices like “salami slicing” data [14]. Both traditional education and scientific research management face significant challenges that require innovative solutions and a shift in mindset to address effectively.
Modern research and education systems face mounting complexity due to interdisciplinary demands and diverse stakeholder inputs. Traditional management models, reliant on static thresholds (e.g., fixed funding criteria), struggle to adapt to ambiguous scenarios like evaluating emergent research fields or student potential. Hybrid AI approaches, such as fuzzy-neural networks, offer a promising solution: fuzzy logic interprets qualitative data (e.g., peer reviews), while neural networks optimize decisions using historical trends. For instance, previous studies demonstrated an improvement in grant allocation fairness using such systems [15-22]. This paper proposes a framework bridging these technologies, targeting scalability and transparency in institutional decision-making.
Figure 1 illustrates the proposed method that was designed to generate expected outcomes from the domain of education and scientific research management.
The motivation for developing new methods for dimensionality reduction in high-dimensional data is rooted in the complexities and challenges posed by the inherent characteristics of such datasets. High-dimensional data often contains a vast number of features, which can lead to various issues including:
Curse of dimensionality: As the dimensionality of the data increases, the volume of the space increases exponentially, making data points sparse. This sparsity complicates the identification of meaningful patterns and relationships between features, ultimately affecting the performance of statistical analyses and machine learning algorithms.
Increased overfitting risk: Models trained on high-dimensional data are prone to overfitting due to the abundance of features, particularly when the sample size is limited. This overfitting results in poor generalization to unseen data, compromising the predictive accuracy of the model.
Computational complexity: Processing and analysing high-dimensional data is computationally intensive. High dimensionality can significantly increase the time and resources required for model training and evaluation, leading to inefficiencies in practical applications.
Difficulty in feature interpretation: A large number of features can obscure the underlying structure of the data, making it challenging to identify and interpret significant variables that contribute to the outcomes of interest. Simplifying the feature space can enhance the interpretability of the results.
To address these challenges, there is a pressing need for innovative dimensionality reduction techniques that effectively capture the essential structures and relationships within high-dimensional datasets. By focusing on reducing dimensionality while retaining the informative aspects of the data, such methods can significantly improve statistical analysis, enhance feature extraction, and ultimately lead to better predictive performance.
The development of advanced dimensionality reduction techniques will not only streamline data processing but also empower researchers and practitioners across various fields, including bioinformatics, finance, and social sciences, to leverage high-dimensional data more effectively, thereby uncovering valuable insights and improving decision-making processes.
Development of a comprehensive framework for non-parametric statistical functions in nonlinear systems: The proposed non-parametric statistical functions are designed within a structured framework that addresses the complexities of nonlinear systems prevalent in educational data. This framework integrates various statistical techniques tailored for different types of data distributions, including ordinal and nominal scales. By encompassing methods such as the Mann-Whitney U test, Kruskal-Wallis H test, and Spearman’s rank correlation, the framework provides researchers with a versatile toolkit capable of addressing a wide range of analytical scenarios. This comprehensive approach allows for the identification of meaningful relationships among variables without the constraints imposed by parametric assumptions, thereby ensuring greater statistical validity and robustness when dealing with heterogeneous educational datasets.
Methodological innovation in analysing complex relationships: The proposed non-parametric functions introduce innovative methodologies specifically adapted for examining the intricate relationships often found in educational systems. By incorporating techniques that can effectively handle non-linear interactions among variables, these statistical functions enable a more nuanced understanding of educational phenomena. For instance, the use of rank-based methods allows researchers to discern underlying trends and correlations that may not be apparent through traditional linear analyses. This methodological advancement enhances the capability to conduct robust hypothesis testing, enabling educational managers to derive actionable insights and make informed decisions based on a deeper appreciation of the multifactorial influences on student outcomes and institutional performance.
Practical application of non-parametric statistical functions in educational management: The applicability of the proposed non-parametric statistical functions in the field of education represents a significant contribution to both academic research and practical management strategies. By employing these advanced methods, educators and administrators can rigorously assess the effectiveness of teaching interventions, evaluate program efficacy and analyze student performance metrics across diverse demographics. For example, non-parametric techniques can facilitate the comparison of student assessment scores across different groups without the assumption of normality, providing reliable evaluations of educational equity and quality. Furthermore, the ability to apply these functions in real-time data analysis supports ongoing evaluation processes, allowing institutions to adapt strategies swiftly in response to evolving educational needs. Consequently, this practical application of non-parametric methods fosters a data-driven culture in educational management, ultimately contributing to enhanced educational practices and improved student outcomes.
The structure of the paper is outlined as follows: Section II presents an overview of statistical test methods, reviews the relevant literature to identify the current gap in related work, and then presents our research motivation and objectives. Section III presents the proposed method with the proposed non-parametric statistical functions. In Section IV, we describe the simulation of the proposed method with corresponding outcomes. Section V presents the conclusion.
Statistical testing methods are essential for validating whether the performance differences between models are statistically significant, thereby reducing the likelihood of chance results. These methods ensure that experimental findings are reproducible and robust, thereby supporting the validity of scientific conclusions. Furthermore, statistical tests can identify the true impact of different models, features, or algorithmic improvements, helping to avoid misleading optimizations.
Figure 2 shows ten statistical testing methods: the t-test, Chi-Square Test, Analysis of Variance (ANOVA), Mann-Whitney U Test, Kolmogorov-Smirnov (K-S) Test, Wilcoxon Signed-Rank Test, Kruskal-Wallis H Test, Fisher's exact test, McNemar Test, and Cochran's Q Test [1996].
t-test: The t-test is a statistical hypothesis test used to determine whether there is a significant difference between the means of two groups. It assesses the likelihood that any observed difference between group means could have occurred by random chance alone. The t-test can be applied in various scenarios, including independent samples, paired samples, and one-sample tests, making it a versatile tool in statistical analysis for comparing means and drawing inferences about populations based on sample data [23]. The test statistic for the independent-samples t-test is given by:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

where:
$\bar{x}_1$ and $\bar{x}_2$ are the means of Sample 1 and Sample 2, respectively,
$s_1^2$ and $s_2^2$ are the variances of Sample 1 and Sample 2, respectively,
$n_1$ and $n_2$ are the sample sizes of Sample 1 and Sample 2, respectively.
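As a minimal sketch of the independent-samples statistic described above, the following Python function divides the difference of the two sample means by the square root of the sum of each sample variance over its sample size. The function name and the toy scores are illustrative assumptions, not part of the original study.

```python
import math
import statistics


def independent_t_statistic(sample1, sample2):
    """t statistic for two independent samples (unpooled variances)."""
    mean1, mean2 = statistics.mean(sample1), statistics.mean(sample2)
    # statistics.variance uses the n - 1 (sample) denominator.
    var1, var2 = statistics.variance(sample1), statistics.variance(sample2)
    n1, n2 = len(sample1), len(sample2)
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    return (mean1 - mean2) / standard_error


# Hypothetical scores, for illustration only.
t_value = independent_t_statistic([2, 4, 6], [1, 2, 3])
print(round(t_value, 4))  # 1.5492
```

For these toy samples the means are 4 and 2 and the variances are 4 and 1, so the statistic is 2 / sqrt(4/3 + 1/3), which is about 1.549.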
Fisher: Fisher's exact test is a statistical significance test used to determine the association between two categorical variables in a contingency table, particularly when sample sizes are small and the expected frequencies in any of the cells are low. Unlike chi-square tests, which rely on large-sample approximations, Fisher's exact test calculates the exact probability of observing the data under the null hypothesis of independence between the variables. It is particularly useful for analysing 2 x 2 contingency tables and is commonly applied in clinical research and other disciplines where small sample sizes can lead to unreliable results from other methods [24].
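The core formula for Fisher's exact test is given by:

$$p = \frac{\binom{a+b}{a}\binom{c+d}{c}}{\binom{n}{a+c}} = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{a!\,b!\,c!\,d!\,n!}$$

where:
a, b, c and d are the frequencies in the 2 x 2 contingency table, which is given as

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

where the table consists of counts a, b, c and d, representing the frequencies of occurrences for the respective categories,
n is the total sample size, defined as n = a+b+c+d,
$\binom{n}{k}$ is the number of combinations, which is given as

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$

where ! denotes the factorial function.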
Hypergeometric distribution: Assuming that the row and column variables are independent, the elements in the sample follow a hypergeometric distribution. The entries in the contingency table represent the sampling outcomes, while the hypergeometric distribution quantifies the probability of a specific combination occurring.
Combinations calculation: Fisher's exact test calculates the probabilities of all possible configurations of the contingency table and computes the overall significance based on the probabilities of each configuration. The probability of each possible combination is given by the hypergeometric distribution, considering the constraints of the specific table configuration.
p-value calculation: The total p-value is obtained by summing the probabilities of all table configurations that are at least as extreme as the observed one. If the computed p-value is less than the specified significance level α, then the null hypothesis of independence is rejected.
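The three steps above (hypergeometric probability, enumeration of configurations, and p-value summation) can be sketched with the standard library alone. This is an illustrative sketch: the function names are hypothetical, and "as extreme" is taken here to mean "no more probable than the observed table", which is one common two-sided convention and an assumption on our part.

```python
from math import comb


def table_probability(a, b, c, d):
    """Hypergeometric probability of one 2 x 2 table with fixed margins."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


def fisher_exact_two_sided(a, b, c, d):
    """Sum the probabilities of all tables no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    p_observed = table_probability(a, b, c, d)
    p_value = 0.0
    # x is the top-left cell; the fixed margins determine the other three cells.
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        p_table = table_probability(x, row1 - x, col1 - x, row2 - (col1 - x))
        if p_table <= p_observed * (1 + 1e-9):  # small tolerance for float ties
            p_value += p_table
    return p_value


# Toy table with a = 3, b = 1, c = 1, d = 3 (illustrative counts only).
p_point = table_probability(3, 1, 1, 3)
p_two_sided = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_point, 4), round(p_two_sided, 4))  # 0.2286 0.4857
```

For this table the five possible configurations have probabilities 1/70, 16/70, 36/70, 16/70 and 1/70, so the observed table has probability 16/70 and the two-sided p-value is 34/70.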
LSTMs are specialized types of Recurrent Neural Networks (RNNs) designed to effectively capture temporal dependencies and manage sequential data. Introduced by Hochreiter and Schmidhuber in 1997, LSTMs address the vanishing gradient problem commonly associated with traditional RNNs, enabling the modelling of long-range dependencies in sequences. The architecture of an LSTM unit comprises 3 primary gates: the input, forget, and output gates, respectively. These gates regulate the flow of information within the cell state and serve as a memory mechanism. The input gate determines which new information should be added to the cell state, the forget gate controls which information should be discarded, and the output gate determines which information to output based on the current cell state.
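Figure 3 shows the calculation process of the LSTM. The forget gate is expressed as follows:

$$f_t = \sigma\left(w_t \cdot [h_{t-1}, x_t] + b_t\right)$$

The input gate is represented by:

$$i_t = \sigma\left(w_i \cdot [h_{t-1}, x_t] + b_i\right)$$

$$\tilde{C}_t = \tanh\left(w_c \cdot [h_{t-1}, x_t] + b_c\right), \qquad C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$

The output gate is denoted by:

$$o_t = \sigma\left(w_o \cdot [h_{t-1}, x_t] + b_o\right), \qquad h_t = o_t \odot \tanh(C_t)$$

where $f_t$ denotes the output of the forget gate, which ranges between 0 and 1; $w_t$ represents the weight matrix associated with the forget gate; $b_t$ represents the bias vector; and $\sigma$ denotes the sigmoid activation function, given by:

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

where $w_c$ represents the weight matrix and $b_c$ the bias vector of the candidate cell state, $h_{t-1}$ indicates the hidden state from the previous time step, $x_t$ is the input at the current time step, $w_i$ denotes the weight matrix for the input gate, $b_i$ represents the corresponding bias vector, $w_o$ and $b_o$ are the weight matrix and bias vector of the output gate, and $C_{t-1}$ indicates the memory-cell state from the previous time step.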
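To make the gate computations concrete, the following NumPy sketch performs one step of a standard LSTM cell: the forget, input and output gates are sigmoid functions of the concatenated previous hidden state and current input, and the cell state is updated from the previous cell state and a tanh candidate. The parameter dictionary, its key names and the layer sizes are hypothetical choices for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; params maps a gate name to (weight, bias)."""
    z = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    w_f, b_f = params["forget"]
    w_i, b_i = params["input"]
    w_c, b_c = params["candidate"]
    w_o, b_o = params["output"]
    f_t = sigmoid(w_f @ z + b_f)             # what to discard from the cell state
    i_t = sigmoid(w_i @ z + b_i)             # what new information to add
    c_tilde = np.tanh(w_c @ z + b_c)         # candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde       # updated cell state
    o_t = sigmoid(w_o @ z + b_o)             # what to expose as output
    h_t = o_t * np.tanh(c_t)                 # new hidden state
    return h_t, c_t


hidden, inputs = 3, 2                        # hypothetical sizes
rng = np.random.default_rng(0)
params = {
    name: (rng.normal(size=(hidden, hidden + inputs)), np.zeros(hidden))
    for name in ("forget", "input", "candidate", "output")
}
h_t, c_t = lstm_step(rng.normal(size=inputs), np.zeros(hidden), np.zeros(hidden), params)
print(h_t.shape, c_t.shape)
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate to 0, so the new cell state is simply half of the previous one, which is a convenient sanity check.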
Figure 4 depicts the framework of the proposed statistical test method and shows how the method can be used, together with the related functions, to test any new education method.
The proposed non-parametric distribution function is given by:
The proposed multidimensional probability density function was given by:
where x, y, z are continuous random variables that fall within the intervals [a,b], [c,d], and [e,f], respectively. The terms of the density denote different functions chosen according to the variables and the data size.
The proposed multidimensional probability density function:
The proposed cumulative distribution function was given by:
where $F(x,y,z)$ is obtained from the proposed PDF and gives the probability that the random variables $X$, $Y$, $Z$ are less than or equal to $x$, $y$, $z$, respectively, that is, $F(x,y,z) = P(X \le x,\, Y \le y,\, Z \le z)$.
Figure 5 shows the difference between the usual statistical functions and the proposed multidimensional statistical functions. Specifically, it indicates the advantages of analyzing data from different dimensions based on the features of the data. In principal component analysis (PCA), all eigenvalues are sorted in descending order, and the eigenvectors corresponding to the top k largest eigenvalues are taken to form a projection matrix, $V_k = [v_1, v_2, v_3, \ldots, v_k] \in \mathbb{R}^{d \times k}$. The reduced-dimensional data is $Z = X'V_k$, where $Z \in \mathbb{R}^{n \times k}$ is the data after dimensionality reduction. The advantages of the method are that it reduces data dimensionality and decreases computational complexity, removes redundant features and improves model performance, and requires no label information, making it suitable for unsupervised learning. Its shortcomings are that PCA assumes that the data follow a linear structure and cannot effectively handle nonlinear data, that each new feature after dimensionality reduction is a linear combination of the original features, whose physical meaning is difficult to explain directly, and that PCA is greatly affected by outliers in the data.
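The PCA projection described above can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, assuming that the data matrix is mean-centered before the eigen-decomposition of its covariance matrix; the function name and the data are hypothetical.

```python
import numpy as np


def pca_project(X, k):
    """Project the rows of X onto the top-k principal components."""
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)   # d x d covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # re-sort in descending order
    V_k = eigvecs[:, order[:k]]              # d x k projection matrix
    Z = X_centered @ V_k                     # n x k reduced data
    return Z, V_k, eigvals[order]


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                # synthetic: n = 100 samples, d = 5 features
Z, V_k, eigvals = pca_project(X, k=2)
print(Z.shape, V_k.shape)  # (100, 2) (5, 2)
```

The variance of each column of Z equals the corresponding eigenvalue, which is why sorting the eigenvalues in descending order keeps the directions that retain the most variance.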
Proposed formula for nonlinear data in classification method:
where hθ(x) is the probability that the data point x belongs to the positive class, and θ is the model parameter. WT is the transpose of the normal vector W, x is the feature vector, and b is the bias. δ is introduced to deal with noise and with points that cannot be cleanly separated, subject to
Assuming we want to evaluate whether a new teaching method is more effective than a traditional approach, we conduct an experiment involving two independent groups of students: one group receives instruction through traditional teaching methods, while the other group utilizes the new teaching method. Our objective is to compare the average scores of students in the final exam from these two groups to determine if the new teaching method significantly enhances student performance.
Traditional method group: Students in this group use the traditional teaching method, with their scores randomly generated from a specified normal distribution.
New method group: Students in this group employ the new teaching method, with their scores generated from another normal distribution.
t - test: We will utilize an independent two-sample t - test to compare the average scores of the two groups, with the hypothesis that the new teaching method results in higher average scores for students.
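The experiment above can be sketched as a short simulation. The group size, means and standard deviations below are assumed values chosen for illustration (the text does not specify them), and the one-sided test uses SciPy's `ttest_ind` with `alternative="greater"`, which requires SciPy 1.6 or later.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Assumed simulation settings (not taken from the paper).
n_students = 100
traditional = rng.normal(loc=70.0, scale=10.0, size=n_students)
new_method = rng.normal(loc=75.0, scale=10.0, size=n_students)

# H0: the two group means are equal.
# H1: the new-method mean is greater than the traditional mean.
result = stats.ttest_ind(new_method, traditional,
                         equal_var=False, alternative="greater")

alpha = 0.05
print(f"t = {result.statistic:.3f}, p = {result.pvalue:.4f}")
print("reject H0" if result.pvalue < alpha else "fail to reject H0")
```

Setting `equal_var=False` selects the unpooled (Welch) form of the statistic, which does not require the two groups to share a variance.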
Figure 6 shows the distribution of scores for the two groups. The box plot makes it easy to observe the median, interquartile range, and outliers; it clearly displays the central tendency and dispersion of the data, helping to identify the distribution and potential outliers. The histogram displays the frequency distribution of the two sets of scores, overlaid with a kernel density estimation curve that shows the shape of the distribution. This facilitates checking the normality assumption, which is crucial for t-tests. The mean bar chart displays the average score and standard error of each group; the standard error bars show the difference and uncertainty between the two group means, helping to evaluate the magnitude of the difference between groups.
Assuming we want to evaluate whether a new teaching method is more effective than a traditional approach, we conduct an experiment involving two independent groups of students: one group receives instruction through traditional teaching methods, while the other group utilizes the new teaching method. Our objective is to compare the attendance rate of students in the final exam from these two groups to determine if the new teaching method significantly enhances student performance.
Figure 7 shows the effects of traditional methods versus innovative strategies on attendance rates across grade levels. The analysis aims to elucidate the impact of these differing approaches on student attendance, providing insights into their relative effectiveness. Specifically, the stacked bar chart compares the distribution of the methods across grade levels through the height and color of the bars, showing whether there are significant proportional differences. The heat map displays the frequencies of the cross-tabulation through color intensity, making it easy to see which combinations have higher or lower attendance frequencies. The mosaic chart displays the combined frequency of grade levels and methods through the size of each rectangular area: the larger the area, the higher the frequency, presenting a clear proportional relationship.
The dataset contains 150 samples of researchers with three discrete numerical features (each on a 1-9 scale): Project count, Article count, and Funding amount, along with a 1-5 job title category and a randomly generated binary assessment outcome (Result). Since the outcome column lacks meaningful correlations with the features, it is recommended to generate discriminative probabilities using a logistic function applied to a weighted composite of the numerical features (e.g., value = 1*Project + 3*Article + 5*Funding), and then derive the Result through binomial sampling. This approach creates structured data suitable for classification model training, job title performance analysis, or research resource allocation studies.
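A minimal sketch of this data-generation recipe is given below. The weights 1, 3 and 5 come from the example in the text; the random seed and the standardization of the composite before the logistic function are our own assumptions, added so that the probabilities do not saturate near 1.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 150

# Three discrete features on a 1-9 scale and a 1-5 job-title category.
project = rng.integers(1, 10, size=n_samples)
article = rng.integers(1, 10, size=n_samples)
funding = rng.integers(1, 10, size=n_samples)
job_title = rng.integers(1, 6, size=n_samples)

# Weighted composite from the text: 1*Project + 3*Article + 5*Funding.
value = 1 * project + 3 * article + 5 * funding

# Assumption: standardize the composite before the logistic function so
# that the resulting probabilities are spread over (0, 1).
score = (value - value.mean()) / value.std()
probability = 1.0 / (1.0 + np.exp(-score))

# Binomial (Bernoulli) sampling of the binary assessment outcome.
result = rng.binomial(1, probability)

print(result.mean())
```

Because the outcome is now drawn from a probability that increases with the composite score, a classifier trained on these features has a real signal to learn, unlike with a purely random Result column.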
Figure 8 shows nine statistical plots that display various relationships among three variables: Project, Article, and Funding, along with Job Title. These plots, through the combination of scatter and kernel density plots, illustrate the complex relationships among the variables of Project, Article, Funding, and Job Title. Different colours and symbols are used to distinguish different subsets of data, aiding in the analysis and understanding of data distribution and correlations. The Job Title was divided into 5 classes: 1 denotes Assistant, 2 Lecturer, 3 Senior Lecturer, 4 Associate Professor, and 5 Professor.
Figure 9 demonstrates the distribution of the assessment results, comparing predicted and real outcomes based on 3 principal components: the research project, research funding, and research article. Specifically, the simulation dataset was split into 80% for training, 10% for validation, and 10% for testing through stratified random sampling. Bootstrap sampling was used to extract m samples from the dataset to generate each sub-dataset.
Each sub-dataset was equal to the original dataset in size (i.e., m = n), but samples can be repeated. On average, about 63.2% of the original samples were included in a sub-dataset, while the rest formed the out-of-bag (OOB) data. Each time a tree is grown, a random subset of the d features is selected at each split node. A decision tree hb(x) is built using a dataset Db. Each tree uses recursive splitting, taking as the split point the θ that maximizes the information gain (argmax over θ). The final Random Forest (RF) output was the ensemble of all tree predictions. The model was trained using backpropagation with a defined learning rate and batch size. Cross-validation techniques were employed to mitigate overfitting and enhance the model's prediction accuracy. Two results are obtained when the proposed method is applied. The third diagram illustrates that age is positively correlated with overall research scores. The scores of vertical projects are higher, showing an exponential increase. The suggested model is a well-structured and theoretically solid regression tool, especially suitable for tasks that require consideration of target distribution characteristics and model interpretability.
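The 63.2% figure follows from sampling with replacement: the chance that a given sample appears at least once in a bootstrap sub-dataset of size n is 1 - (1 - 1/n)^n, which approaches 1 - 1/e, about 0.632, as n grows. The following standard-library sketch checks this empirically; the dataset size and seed are arbitrary illustrative choices.

```python
import random


def bootstrap_indices(n, rng):
    """Draw n indices with replacement from range(n), i.e. m = n."""
    return [rng.randrange(n) for _ in range(n)]


rng = random.Random(0)
n = 10_000
in_bag = set(bootstrap_indices(n, rng))     # distinct samples that were drawn
out_of_bag = set(range(n)) - in_bag         # samples never drawn (OOB data)

in_bag_fraction = len(in_bag) / n
print(f"in-bag: {in_bag_fraction:.3f}, out-of-bag: {len(out_of_bag) / n:.3f}")
```

The out-of-bag samples, roughly 36.8% of the data for each tree, can serve as a held-out set for estimating that tree's error without a separate validation split.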
The results of the simulation were evaluated by principal component analysis. Figure 10 shows the confusion matrices of the different classification models. Specifically, the precision of the proposed method for the different types was higher than that of the other models, and its recall and F1-score were also better. Figure 11 illustrates the significance of each feature in influencing model predictions. Notably, credit score and loan amount emerge as predominant factors affecting the outcomes. A confusion matrix is employed to delineate the accuracy of the model's classifications, detailing the counts of both correctly and incorrectly classified instances. The ROC curve serves as a measure of the overall performance of the classification model, with an AUC value approaching 1 indicating superior model performance. The figure also exhibits the correlation between credit score and loan default rates, revealing that individuals with lower credit scores exhibit a higher prevalence of default.
Several aspects of the proposed method, and the problems that have been evaluated and addressed, are discussed below.
The first question concerns how to improve the p-value in the statistical test method. The p-value is given by:
where X is the observation result, μ is the assumed parameter, σ is the estimated parameter value, and n is the sample size.
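As a hedged illustration, one common reading of these symbols is the z-test, in which the statistic z = (X - μ) / (σ / √n) is converted to a two-sided p-value through the standard normal distribution. This reading, the two-sided choice, and the numbers below are assumptions for illustration and are not confirmed by the text.

```python
import math
from statistics import NormalDist


def z_test_p_value(x_bar, mu, sigma, n):
    """Two-sided p-value of a z-test (an assumed reading of the symbols)."""
    z = (x_bar - mu) / (sigma / math.sqrt(n))
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Hypothetical numbers: observed mean 52, assumed mean 50, sigma 10, n = 100.
p = z_test_p_value(52.0, 50.0, 10.0, 100)
print(round(p, 4))  # 0.0455
```

Here z = 2, so the p-value falls just below the conventional 0.05 level; with the same difference and n = 25 the statistic would drop to z = 1 and the result would no longer be significant.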
Figure 11 shows the p-values of nine different statistical test methods: the t-test, Chi-Square Test, Mann-Whitney U test, K-S test, Wilcoxon test, Kruskal-Wallis test, Fisher Exact Test, McNemar Test, and the proposed method. Related work on these statistical test methods is summarized in Table 1.
In Table 1, if the p-value is less than the preset significance level (such as 0.05), the probability of observing the current data is low when the null hypothesis is true, and there is therefore sufficient evidence to reject the null hypothesis. Conversely, if the p-value is greater than or equal to the significance level, there is not enough evidence to reject the null hypothesis. The second problem relates to how to improve accuracy in the test algorithm. The proposed method includes decision tree and random forest models, which increased the accuracy of the statistical test method, as detailed in Table 2.
Table 1: Comparison of different statistical methods.
| Reference | Statistical test methods | p-value | Accuracy |
| [3,26] | t-test | < 0.05 | ≈ 0.8 |
| [26] | Chi-Square Test | ≈ 0.2081 | |
| [27] | Mann-Whitney U | ≈ 0.0002 | - |
| [26] | K-S Statistic | < 0.05 | |
| [28] | Wilcoxon test statistic | ≈ 0.0073 | |
| [29] | Kruskal-Wallis | < 0.05 | |
| [1] | Fisher Exact Test | ≈ 0.3391 | ≈ 0.8 |
| [2] | McNemar Test | ≈ 0.0339 | |
| | KNN | | ≈ 0.94 |
| | Random forest | | ≈ 0.96 |
| | t-SNE | | ≈ 0.93 |
| | PCA | | ≈ 0.95 |
| | LSTM | | ≈ 0.94 |
| | Proposed method | < 0.05 | ≈ 0.95 |
Table 2 summarizes the key differences between decision tree and random forest models, highlighting their respective strengths and weaknesses in various contexts. The traditional random forest (Table 2) relies on an ensemble of decision trees, whereas the proposed method introduces joint optimization of the probability density function (Formula 11) and the cumulative distribution function (Formula 13), enhancing its ability to test statistical significance on high-dimensional data (Figures 9, 10, 12).
Table 2: Comparison features.
| Feature | Decision Tree | Random Forest |
| Generalization Ability | Prone to Overfitting | Strong Anti-Overfitting Capability |
| Training Speed | Fast | Slower |
| Prediction Accuracy | Moderate; Highly Affected by Noise | High; Robust Performance |
| Interpretability | High (Easy to Visualize) | Low (Black Box Model) |
| Applicable Scenarios | Simple and Intuitive Problems, Easily Interpretable Contexts | Complex Datasets with Significant Feature Interactions |
Figure 13 shows ROC Curve and AUC Value Analysis. The Area Under the Curve (AUC) for the random forest model is significantly higher than that of the decision tree, indicating a superior classification capability. This enhanced AUC value reflects the random forest's ability to distinguish between classes more effectively across various threshold settings [30-35].
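The AUC has a rank-based interpretation that connects it to the non-parametric tests discussed earlier: it equals the probability that a randomly chosen positive instance receives a higher score than a randomly chosen negative one, which is the Mann-Whitney U statistic divided by the number of positive-negative pairs. The sketch below computes it directly from that definition; the labels and scores are toy values, not results from this study.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Toy labels and classifier scores, for illustration only.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc_from_scores(labels, scores))  # 0.75
```

Three of the four positive-negative pairs are ordered correctly here, giving 0.75; a model that separates the classes perfectly would score 1.0, and random scoring would give about 0.5.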
The proposed non-parametric statistical functions offer a robust framework tailored for analysing nonlinear systems in educational management. By circumventing the assumptions inherent in traditional parametric methods, these functions enhance the ability to identify and interpret complex relationships within educational data, which often exhibit non-normal distributions and intricate interactions among variables. The framework established through these non-parametric functions facilitates the effective assessment of diverse educational metrics, enabling stakeholders to derive meaningful insights from various data types, including ordinal and nominal scales. In applying this framework within educational contexts, practitioners can more accurately evaluate the efficacy of teaching practices, identify factors influencing student performance, and inform policy decisions that impact educational outcomes. The integration of these non-parametric approaches supports data-driven decisions that account for the multifaceted nature of education systems, ultimately contributing to improved learning environments and increased student success. Thus, the introduction of these non-parametric statistical functions not only advances methodological rigor in educational research but also enhances the practical relevance of findings in real-world educational settings. As such, this approach represents a significant step forward in the quest for effective educational management and reform. Further applications include adaptive curriculum design via real-time feedback systems and interdisciplinary extensions to healthcare analytics.
This work is licensed under a Creative Commons Attribution 4.0 International License.