
Journal of Biometrics and Biostatistics

Volume 13, Issue 12 (2022)

Research

K-Nearest Neighbours and K-Fold Cross Validation for Big Data of Covid 19

Kuntoro Kuntoro*

K-Nearest Neighbours (KNN) is among the most popular models in machine learning and is used to solve classification problems. K-Fold Cross-Validation is an important tool for assessing how well a KNN model performs on the available data. Compared with traditional statistical methods, both algorithms can be applied effectively to big data. This study implements a supervised machine learning approach using the KNN and K-Fold Cross-Validation algorithms. For the learning process, COVID-19 data were obtained from a website. Four predictors (new cases, reproduction rate, new cases in ICU, and new hospitalized cases) were selected to predict the target: whether new cases will survive or die. After cleaning, 13,223 of 132,645 records were selected; these are referred to as the original data sets. Because running K-Fold Cross-Validation in Python raised a UserWarning, the original data sets were replicated to give 264,441 records; these are referred to as the replicated data sets. KNN predicted the target less accurately on the original data sets than on the replicated data sets (75% vs. 92%). The number of neighbours (K) was lower for the original data sets than for the replicated data sets (7 vs. 12). K-Fold Cross-Validation gave a much lower mean accuracy on the original data sets than on the replicated data sets (0.054 vs. 0.998). With the replicated data sets, mean accuracy stayed consistent up to 5 splits, whereas with the original data sets a mean accuracy was obtained only with 2 splits. When using big data from various sources, it is recommended to use appropriate Python libraries that can effectively remove not-a-number (NaN) values and messy records. It is also recommended to develop a combined, comprehensive algorithm for KNN and K-Fold Cross-Validation.
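The two techniques named in this abstract can be sketched in a few lines. The example below is only an illustration under stated assumptions: the abstract does not say which Python library was used, so this version uses the standard library alone, and the four-feature data are synthetic stand-ins rather than the study's COVID-19 records. The function names `knn_predict` and `k_fold_accuracy` are hypothetical.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Classify x by majority vote among its k nearest training points."""
    # Squared Euclidean distance preserves the neighbour ordering.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def k_fold_accuracy(X, y, k_neighbours, n_splits, seed=0):
    """Return (mean accuracy, per-fold accuracies) for KNN under K-fold CV."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices into n_splits roughly equal folds.
    folds = [idx[i::n_splits] for i in range(n_splits)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k_neighbours) == y[i]
            for i in fold
        )
        scores.append(correct / len(fold))
    return sum(scores) / len(scores), scores


# Synthetic data (assumption): two well-separated classes, four predictors each.
rng = random.Random(42)
X, y = [], []
for label, centre in ((0, 0.0), (1, 10.0)):
    for _ in range(100):
        X.append([rng.gauss(centre, 1.0) for _ in range(4)])
        y.append(label)

mean_acc, fold_scores = k_fold_accuracy(X, y, k_neighbours=7, n_splits=5)
print("per-fold accuracy:", fold_scores)
print("mean accuracy:", mean_acc)
```

Each record is held out exactly once, so the mean over folds estimates accuracy on data the classifier has not seen during fitting.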

Research

Socioeconomic Inequalities and Factors Contributing to Under-Five Mortality in Uttar Pradesh: A Decomposition Analysis

Neha Mishra* and Sheela Mishra

Background: Childhood mortality in India has declined substantially over the last three decades (1992-2021), from 119 to 42 per 1000 live births. However, this decline does not necessarily imply a reduction in inequalities, which persist in both access to quality care and health outcomes among under-five children in Uttar Pradesh (India).

Objective: To estimate and quantify the prevailing socio-economic inequalities contributing to under-five mortality in Uttar Pradesh, along with their temporal trends over 2005–2021.

Methods: The last three rounds of the National Family Health Survey (NFHS) were used to estimate and quantify socioeconomic inequalities in under-five mortality and the factors contributing to them, using concentration indices (CI), concentration curves (CCs) and decomposition analysis.

Results: During 2015-16 and 2019-21, socio-economic inequalities in the under-five mortality rate (U5MR) were highly concentrated among women aged 35 years or more, women with primary education, women belonging to Scheduled Castes/Tribes (SC/ST), and Hindus. During 2005-06, inequalities were highly concentrated among women aged 25-34 years, women belonging to SC/ST and OBC caste groups, and Hindus. Overall, mother's education and place of residence were the main factors explaining U5MR inequality in all three time periods.

Conclusion: The findings suggest that more effort is needed in Uttar Pradesh to narrow income-related U5MR inequalities. Reducing inequality effectively requires not only narrowing the income gap but also raising mothers' level of education, since educational attainment is critical in imparting the feelings of self-worth and confidence that bring about changes in health-related behaviour.
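For readers unfamiliar with the concentration index used in the Methods, the sketch below computes it with the commonly used covariance form, CI = 2 cov(h, r) / mean(h), where h is the health outcome and r is each individual's fractional rank in the socioeconomic distribution. It is an illustration only: the abstract does not give the authors' exact estimator, NFHS survey weights are not applied here, tied ranks are not averaged, and the function name and toy data are hypothetical.

```python
def concentration_index(health, ses):
    """Concentration index via the covariance form: 2 * cov(h, r) / mean(h).

    health: outcome per individual (e.g. 1 = under-five death, 0 = survived)
    ses:    socioeconomic score per individual (higher = better off)
    Assumes no survey weights and no tied ses values.
    """
    n = len(health)
    # Fractional rank: poorest gets 0.5/n, richest gets (n - 0.5)/n.
    order = sorted(range(n), key=lambda i: ses[i])
    rank = [0.0] * n
    for pos, i in enumerate(order):
        rank[i] = (pos + 0.5) / n
    mean_h = sum(health) / n
    mean_r = sum(rank) / n
    cov = sum((h - mean_h) * (r - mean_r) for h, r in zip(health, rank)) / n
    return 2 * cov / mean_h


# Toy data (assumption): deaths occur only in the two poorest households.
wealth = [1, 2, 3, 4]
deaths = [1, 1, 0, 0]
ci_poor = concentration_index(deaths, wealth)

# Deaths spread evenly across the wealth distribution.
ci_equal = concentration_index([1, 1, 1, 1], wealth)

print(ci_poor, ci_equal)
```

A negative value indicates that the outcome (here, mortality) is concentrated among the poor; a value of zero indicates no socioeconomic gradient.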
