Feature Selection and Reduction in Happiness Index Analysis: A Systematic Literature Review
Abstract
This study investigates the role and effectiveness of feature selection and feature reduction techniques in improving the accuracy, validity, and efficiency of predictive models for survey-based happiness indices. A Systematic Literature Review (SLR) was conducted following the PRISMA 2020 protocol, evaluating 40 peer-reviewed articles published between 2020 and 2025. The results demonstrate that feature selection methods, namely wrapper, filter, and embedded approaches, can significantly enhance model performance, yielding higher coefficients of determination (R²) and lower prediction errors. Furthermore, identifying relevant features has been shown to improve the construct validity and reliability of happiness indicators. Integrating feature selection and feature reduction techniques also contributes to more efficient and stable models, particularly in high-dimensional data contexts. However, the small number of studies directly addressing happiness and the methodological heterogeneity across works limit the generalizability of the findings. This review provides valuable insights for establishing evidence-based practices and guiding strategic developments in future happiness index analytics.
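As a minimal illustration of the filter family of methods discussed in the abstract, the sketch below compares a regression model's R² with and without filter-based feature selection on synthetic, survey-like, high-dimensional data. This example is not drawn from any of the reviewed studies; the dataset, feature counts, and scikit-learn components are illustrative assumptions.

```python
# Minimal sketch (illustrative only): filter-based feature selection on
# synthetic high-dimensional data, comparing R2 before and after selection.
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# 200 "respondents", 50 candidate indicators, only 5 truly informative
X, y = make_regression(n_samples=200, n_features=50, n_informative=5,
                       noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Baseline: fit on all 50 features
r2_all = r2_score(y_te, LinearRegression().fit(X_tr, y_tr).predict(X_te))

# Filter approach: keep the 5 features with the strongest univariate
# association with the target (F-test), then refit
sel = SelectKBest(f_regression, k=5).fit(X_tr, y_tr)
model = LinearRegression().fit(sel.transform(X_tr), y_tr)
r2_sel = r2_score(y_te, model.predict(sel.transform(X_te)))

print(f"R2 with all features: {r2_all:.3f}")
print(f"R2 with selected features: {r2_sel:.3f}")
```

Filter methods such as this one score each feature independently of the downstream model, which keeps them cheap in high-dimensional settings; wrapper and embedded methods instead use the model itself (e.g., recursive elimination or LASSO penalties) to judge feature subsets.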
References
T. Greyling and S. Rossouw, “Development and Validation of a Real-Time Happiness Index Using Google Trends™,” J Happiness Stud, vol. 26, no. 3, Mar. 2025, doi: 10.1007/s10902-025-00881-9.
J. Vendrow, J. Haddock, D. Needell, and L. Johnson, “Feature selection from lyme disease patient survey using machine learning,” Algorithms, vol. 13, no. 12, Dec. 2020, doi: 10.3390/a13120334.
M. Jamei, A. S. Mohammed, I. Ahmadianfar, M. M. S. Sabri, M. Karbasi, and M. Hasanipanah, “Predicting Rock Brittleness Using a Robust Evolutionary Programming Paradigm and Regression-Based Feature Selection Model,” Applied Sciences (Switzerland), vol. 12, no. 14, Jul. 2022, doi: 10.3390/app12147101.
P. Agrawal, H. F. Abutarboush, T. Ganesh, and A. W. Mohamed, “Metaheuristic algorithms on feature selection: A survey of one decade of research (2009-2019),” IEEE Access, vol. 9, pp. 26766–26791, 2021, doi: 10.1109/ACCESS.2021.3056407.
O. Manullang, C. Prianto, and N. H. Harani, “Analisis Sentimen Untuk Memprediksi Hasil Calon Pemilu Presiden Menggunakan Lexicon Based dan Random Forest,” Jurnal Ilmiah Informatika (JIF), vol. 11, no. 2, pp. 160–169, 2023, doi: 10.33884/jif.v11i02.7987.
R. Jain and W. Xu, “RHDSI: A novel dimensionality reduction based algorithm on high dimensional feature selection with interactions,” Inf Sci (N Y), vol. 574, pp. 590–605, Oct. 2021, doi: 10.1016/j.ins.2021.06.096.
H. Gunduz, “An efficient dimensionality reduction method using filter-based feature selection and variational autoencoders on Parkinson’s disease classification,” Biomed Signal Process Control, vol. 66, Apr. 2021, doi: 10.1016/j.bspc.2021.102452.
A. A. Aouragh, M. Bahaj, and F. Toufik, “Diabetes Prediction: Optimization of Machine Learning through Feature Selection and Dimensionality Reduction,” International journal of online and biomedical engineering, vol. 20, no. 8, pp. 100–114, May 2024, doi: 10.3991/ijoe.v20i08.47765.
H. B. Andrews, L. R. Sadergaski, and S. K. Cary, “Pursuit of the Ultimate Regression Model for Samarium(III), Europium(III), and LiCl Using Laser-Induced Fluorescence, Design of Experiments, and a Genetic Algorithm for Feature Selection,” ACS Omega, vol. 8, no. 2, pp. 2281–2290, Jan. 2023, doi: 10.1021/acsomega.2c06610.
T. Wang, “A combined model for short-term wind speed forecasting based on empirical mode decomposition, feature selection, support vector regression and cross-validated lasso,” PeerJ Comput Sci, vol. 7, pp. 1–23, 2021, doi: 10.7717/peerj-cs.732.
B. H. Nguyen, B. Xue, and M. Zhang, “A survey on swarm intelligence approaches to feature selection in data mining,” Swarm Evol Comput, vol. 54, May 2020, doi: 10.1016/j.swevo.2020.100663.
D. Bender, D. J. Licht, and C. Nataraj, “A novel embedded feature selection and dimensionality reduction method for an SVM type classifier to predict periventricular leukomalacia (PVL) in neonates,” Applied Sciences (Switzerland), vol. 11, no. 23, Dec. 2021, doi: 10.3390/app112311156.
P. Dhal and C. Azad, “A comprehensive survey on feature selection in the various fields of machine learning,” Applied Intelligence, vol. 52, no. 4, pp. 4543–4581, Mar. 2022, doi: 10.1007/s10489-021-02550-9.
D. A. Otchere, T. O. A. Ganat, J. O. Ojero, B. N. Tackie-Otoo, and M. Y. Taki, “Application of gradient boosting regression model for the evaluation of feature selection techniques in improving reservoir characterisation predictions,” J Pet Sci Eng, vol. 208, Jan. 2022, doi: 10.1016/j.petrol.2021.109244.
M. R. Islam, A. A. Lima, S. C. Das, M. F. Mridha, A. R. Prodeep, and Y. Watanobe, “A Comprehensive Survey on the Process, Methods, Evaluation, and Challenges of Feature Selection,” IEEE Access, vol. 10, pp. 99595–99632, 2022, doi: 10.1109/ACCESS.2022.3205618.
M. Ashraf et al., “A Survey on Dimensionality Reduction Techniques for Time-Series Data,” IEEE Access, vol. 11, pp. 42909–42923, 2023, doi: 10.1109/ACCESS.2023.3269693.
Y. Liu, P. Pi, and S. Luo, “A semi-parametric approach to feature selection in high-dimensional linear regression models,” Comput Stat, vol. 38, no. 2, pp. 979–1000, Jun. 2023, doi: 10.1007/s00180-022-01254-z.
S. Çelik, B. Doğanlı, M. Ü. Şaşmaz, and U. Akkucuk, “Accuracy Comparison of Machine Learning Algorithms on World Happiness Index Data,” Mathematics, vol. 13, no. 7, Apr. 2025, doi: 10.3390/math13071176.
A. Stelmokienė and G. Jarašiūnaitė-Fedosejeva, “Is Leadership Position Related To More Social Inclusion, Happiness, And Satisfaction With Life? The Importance Of Power Distance Index,” Business: Theory and Practice, vol. 24, no. 1, pp. 148–159, Jan. 2023, doi: 10.3846/btp.2023.16705.
M. Li, H. Wang, L. Yang, Y. Liang, Z. Shang, and H. Wan, “Fast hybrid dimensionality reduction method for classification based on feature selection and grouped feature extraction,” Expert Syst Appl, vol. 150, Jul. 2020, doi: 10.1016/j.eswa.2020.113277.
C. H. Feng, M. L. Disis, C. Cheng, and L. Zhang, “Multimetric feature selection for analyzing multicategory outcomes of colorectal cancer: random forest and multinomial logistic regression models,” Laboratory Investigation, vol. 102, no. 3, pp. 236–244, Mar. 2022, doi: 10.1038/s41374-021-00662-x.
S. Li, J. Yu, H. Kang, and J. Liu, “Genomic Selection in Chinese Holsteins Using Regularized Regression Models for Feature Selection of Whole Genome Sequencing Data,” Animals, vol. 12, no. 18, Sep. 2022, doi: 10.3390/ani12182419.
J. J. A. Mendes Junior, M. L. B. Freitas, H. V. Siqueira, A. E. Lazzaretti, S. F. Pichorim, and S. L. Stevan, “Feature selection and dimensionality reduction: An extensive comparison in hand gesture classification by sEMG in eight channels armband approach,” Biomed Signal Process Control, vol. 59, May 2020, doi: 10.1016/j.bspc.2020.101920.
Y. Lyu, Y. Feng, and K. Sakurai, “A Survey on Feature Selection Techniques Based on Filtering Methods for Cyber Attack Detection,” Information, vol. 14, no. 3, Mar. 2023, doi: 10.3390/info14030191.
B. Zuhri and N. H. Harani, “Studi Literatur: Optimasi Algoritma Machine Learning Untuk Prediksi Penerimaan Mahasiswa Pascasarjana,” Jurnal Informatika dan Teknologi Komputer (J-ICOM), vol. 5, no. 1, pp. 1–10, 2024, doi: 10.55377/j-icom.v5i1.8074.
X. Chen, X. Zhu, Y. Lu, and Z. Pu, “Non-negative low-rank adaptive preserving sparse matrix regression model for supervised image feature selection and classification,” IET Image Process, vol. 17, no. 7, pp. 2056–2071, May 2023, doi: 10.1049/ipr2.12772.
F. Hassan et al., “A hybrid approach for intrusion detection in vehicular networks using feature selection and dimensionality reduction with optimized deep learning,” PLoS One, vol. 20, no. 2, Feb. 2025, doi: 10.1371/journal.pone.0312752.
S. Zhang, J. Zhao, J. Yang, J. Xie, and Z. Sun, “Feature Selection and Regression Models for Multisource Data-Based Soil Salinity Prediction: A Case Study of Minqin Oasis in Arid China,” Land (Basel), vol. 13, no. 6, Jun. 2024, doi: 10.3390/land13060877.
K. Kaushik, A. Bhardwaj, A. Aggarwal, and M. Kumar, “Enumerating happiness index during COVID-19 lockdowns using artificial intelligence techniques,” International Journal of Technology Management and Sustainable Development, vol. 22, pp. 35–52, May 2023, doi: 10.1386/tmsd_00066_1.
R. Allu and V. N. R. Padmanabhuni, “Convex Least Angle Regression Based LASSO Feature Selection and Swish Activation Function Model for Startup Survival Rate,” Cybernetics and Information Technologies, vol. 23, no. 4, pp. 110–127, Nov. 2023, doi: 10.2478/cait-2023-0039.
R. Wang et al., “State of Health Estimation for Lithium-Ion Batteries Using Enhanced Whale Optimization Algorithm for Feature Selection and Support Vector Regression Model,” Processes, vol. 13, no. 1, Jan. 2025, doi: 10.3390/pr13010158.
B. Ahadzadeh et al., “Improved binary differential evolution with dimensionality reduction mechanism and binary stochastic search for feature selection,” Appl Soft Comput, vol. 151, Jan. 2024, doi: 10.1016/j.asoc.2023.111141.
R. Zhu et al., “Well-Production Forecasting Using Machine Learning with Feature Selection and Automatic Hyperparameter Optimization,” Energies (Basel), vol. 18, no. 1, Jan. 2025, doi: 10.3390/en18010099.
C. Gakii, P. O. Mireji, and R. Rimiru, “Graph Based Feature Selection for Reduction of Dimensionality in Next-Generation RNA Sequencing Datasets,” Algorithms, vol. 15, no. 1, Jan. 2022, doi: 10.3390/a15010021.
M. Mokhtia, M. Eftekhari, and F. Saberi-Movahed, “Dual-manifold regularized regression models for feature selection based on hesitant fuzzy correlation,” Knowl Based Syst, vol. 229, Oct. 2021, doi: 10.1016/j.knosys.2021.107308.
A. R. Patil and S. Kim, “Combination of ensembles of regularized regression models with resampling-based lasso feature selection in high dimensional data,” Mathematics, vol. 8, no. 1, Jan. 2020, doi: 10.3390/math8010110.
D. Ueno, H. Kawabe, S. Yamasaki, T. Demura, and K. Kato, “Feature selection for RNA cleavage efficiency at specific sites using the LASSO regression model in Arabidopsis thaliana,” BMC Bioinformatics, vol. 22, no. 1, Dec. 2021, doi: 10.1186/s12859-021-04291-5.
B. Tang, Y. Wang, Y. Chen, M. Li, and Y. Tao, “A Novel Early-Stage Lung Adenocarcinoma Prognostic Model Based on Feature Selection With Orthogonal Regression,” Front Cell Dev Biol, vol. 8, Jan. 2021, doi: 10.3389/fcell.2020.620746.
W. K. Hong and T. D. Pham, “Reverse designs of doubly reinforced concrete beams using Gaussian process regression models enhanced by sequence training/designing technique based on feature selection algorithms,” Journal of Asian Architecture and Building Engineering, vol. 21, no. 6, pp. 2345–2370, 2022, doi: 10.1080/13467581.2021.1971999.
Y. Huang, Z. Shen, F. Cai, T. Li, and F. Lv, “Adaptive graph-based generalized regression model for unsupervised feature selection,” Knowl Based Syst, vol. 227, Sep. 2021, doi: 10.1016/j.knosys.2021.107156.
V. Moreido, B. Gartsman, D. P. Solomatine, and Z. Suchilina, “How well can machine learning models perform without hydrologists? Application of rational feature selection to improve hydrological forecasting,” Water (Switzerland), vol. 13, no. 12, Jun. 2021, doi: 10.3390/w13121696.
M. Mokhtia, M. Eftekhari, and F. Saberi-Movahed, “Feature selection based on regularization of sparsity based regression models by hesitant fuzzy correlation,” Applied Soft Computing Journal, vol. 91, Jun. 2020, doi: 10.1016/j.asoc.2020.106255.