In this paper, a new deep rule-based approach using high-level ensemble feature descriptor is proposed for aerial scene classification. By creating an ensemble of three pre-trained deep convolutional neural networks for feature extraction, the proposed approach is able to extract more discriminative representations from the local regions of aerial images. With a set of massively parallel IF...THEN rules built upon the prototypes identified through a self-organizing, nonparametric, transparent and highly human-interpretable learning process, the proposed approach is able to produce the state-of-the-art classification results on the unlabeled images outperforming the alternatives. Numerical examples on benchmark datasets demonstrate the strong performance of the proposed approach.