In object-oriented software development, a plethora of studies have been carried out to present the application of machine learning algorithms for fault prediction. Furthermore, it has been empirically validated that an ensemble method can improve classification performance as compared to a single classifier. But, due to the inherent differences among machine learning and data mining approaches, the classification performance of ensemble methods will be varied. In this study, we investigated and evaluated the performance of different ensemble methods with itself and base-level classifiers, in predicting the faults proneness classes. Subsequently, we used three ensemble methods AdaboostM1, Vote and StackingC with five base-level classifiers namely Naivebayes, Logistic, J48, VotedPerceptron and SMO in Weka tool. In order to evaluate the performance of ensemble methods, we retrieved twelve datasets of open source projects from PROMISE repository. In this experiment, we used k-fold (k=10) cross-validation and ROC analysis for validation. Besides, we used recall, precision, accuracy, F-value measures to evaluate the performance of ensemble methods and base-level Classifiers. Finally, we observed significant performance improvement of applying ensemble methods as compared to its base-level classifier, and among ensemble methods we observed StackingC outperformed other selected ensemble methods for software fault prediction.