Abstract
Corporate environmental information disclosure manipulation (EIDM) is highly concealed, which makes manipulation behavior difficult to identify and judge. Compared with traditional methods, machine learning techniques handle large, complex datasets well and can achieve higher accuracy. This research applies machine learning to construct an identification model for EIDM behavior. Based on the “public pressure” theory, we improve the detection indicators in three respects: public pressure, corporate governance, and financial indicators. By combining environmental pollution penalty cases of Chinese listed companies from 2011 to 2020 with a pressure pool indicator system, we establish a training set and a test set to compare the identification ability of logistic regression (LR), decision tree (DT), support vector machine (SVM), backpropagation (BP) neural network, and random forest (RF) models. During the initial phase of model training, hyperparameters are tuned for each model to maximize its performance. To handle imbalanced data, we compare two oversampling techniques: the borderline synthetic minority oversampling technique (Borderline SMOTE) and adaptive synthetic sampling (ADASYN). The results indicate that Borderline SMOTE yields better recognition than ADASYN and that the Borderline SMOTE-RF model outperforms the LR, DT, BP, and SVM models. We hope this research can serve as a reference for regulatory authorities, accelerate the improvement of the mandatory environmental information disclosure (EID) system for listed companies, strengthen the identification and early warning of EIDM, and promote higher EID quality. © The Author(s), under exclusive licence to Springer Nature B.V. 2024.
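To make the oversampling step named in the abstract more concrete, the following is a minimal from-scratch sketch of the Borderline-SMOTE1 idea: only minority samples whose neighborhood is dominated (but not entirely occupied) by the majority class are used as seeds for synthetic points. It is not the authors' implementation. The function names `nearest` and `borderline_smote`, the parameters `m`, `k`, and `n_new`, and the toy data are all assumptions made for illustration, and a real study would more likely rely on an established library.

```python
import math
import random


def nearest(point, candidates, k):
    """Return indices of the k candidates closest to point (Euclidean)."""
    order = sorted(range(len(candidates)),
                   key=lambda i: math.dist(point, candidates[i]))
    return order[:k]


def borderline_smote(X, y, minority=1, m=5, k=5, n_new=None, seed=0):
    """Return synthetic minority samples seeded from 'danger' points only.

    Hypothetical helper: a sketch of Borderline-SMOTE1, not a library API.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    minority_pts = [X[i] for i in minority_idx]

    # Step 1: a minority point is in 'danger' when at least half, but not
    # all, of its m nearest neighbours belong to the majority class.
    danger = []
    for i in minority_idx:
        others = [j for j in range(len(X)) if j != i]
        neigh = nearest(X[i], [X[j] for j in others], m)
        n_major = sum(1 for t in neigh if y[others[t]] != minority)
        if len(neigh) / 2 <= n_major < len(neigh):
            danger.append(X[i])
    if not danger or len(minority_pts) < 2:
        return []

    # Step 2: interpolate between a danger point and one of its k nearest
    # minority neighbours. By default, generate enough points to balance.
    if n_new is None:
        n_new = len(y) - 2 * len(minority_idx)
    synthetic = []
    for _ in range(max(n_new, 0)):
        p = rng.choice(danger)
        pool = [q for q in minority_pts if q is not p]
        q = pool[rng.choice(nearest(p, pool, min(k, len(pool))))]
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(p, q)])
    return synthetic


# Toy data (assumed): 8 majority points (label 0), 5 minority points (label 1).
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5],
     [1.5, 1.5], [1.2, 1.8], [1.8, 1.2],
     [2, 2], [2.2, 2.2], [4, 4], [4.2, 4], [4, 4.2]]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

synthetic = borderline_smote(X, y, m=3)
print(len(synthetic), "synthetic minority samples generated")
```

In this toy set, only the two minority points sitting next to the majority cluster qualify as seeds, so the new samples concentrate near the class boundary. The synthetic rows would be appended to the training set only, never the test set, before fitting a classifier such as a random forest.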
| Original language | English |
|---|---|
| Number of pages | 40 |
| Journal | Environment, Development and Sustainability |
| Online published | 11 Sept 2024 |
| DOIs | |
| Publication status | Online published - 11 Sept 2024 |
| Externally published | Yes |
Research Keywords
- Adaptive synthetic sampling
- Borderline synthetic minority oversampling technique
- Environmental information disclosure manipulation
- Machine learning
- Random Forest model
Fingerprint
Dive into the research topics of 'Identifying environmental information disclosure manipulation behavior via machine learning'. Together they form a unique fingerprint.