Data mining based investigation of the impact of imbalanced dataset over fractured zone detection
Accuracy, Classifier, Fractured Reservoirs, Random Forest, Support Vector Machine
Several studies in recent years have sought to discriminate between fractured zones (FZs) and non-fractured zones (NFZs) in oil wells. These studies applied data mining techniques to petrophysical logs (PLs) with generally valuable results; however, separating the two classes remains difficult because the data are imbalanced and are usually analyzed without first being balanced. We studied the importance of using balanced data to detect fractured zones from PLs. We applied Random Forest and Support Vector Machine classifiers to PLs from eight oil wells drilled into a fractured carbonate reservoir, using both imbalanced and balanced datasets, and validated the results against image logs. A significant difference between accuracy and precision indicates imbalanced data in which fractured zones form the minority class. The results indicated that accuracy is similar for the imbalanced and balanced datasets, but that precision improves significantly with balancing, regardless of how low or high the calculated indices are.
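The gap between accuracy and precision described above can be illustrated with a small worked example. This is only a sketch: the confusion-matrix counts below are hypothetical numbers chosen for illustration, not results from the study, and the function names are assumptions introduced here.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp, fp):
    """Fraction of samples predicted as fractured that really are fractured."""
    predicted_positive = tp + fp
    return tp / predicted_positive if predicted_positive else 0.0


# Hypothetical counts for 1000 depth samples, of which only 50 are fractured
# (the minority class). These values are illustrative, not from the paper.
tp, fp, fn, tn = 20, 30, 30, 920

acc = accuracy(tp, fp, fn, tn)
prec = precision(tp, fp)

# The many correctly labelled non-fractured samples dominate accuracy,
# while precision depends only on the samples predicted as fractured.
print(f"accuracy={acc:.2f} precision={prec:.2f} gap={acc - prec:.2f}")
```

With these assumed counts, accuracy is high because the non-fractured majority is mostly classified correctly, while precision on the fractured class stays low. A large gap of this kind is the signal the abstract associates with an imbalanced dataset.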
date: 2021-05-25
