Classification of Remote Sensing Image on Vision Transformers and Transfer Learning
Abstract
Remote sensing images are a valuable source of Earth observation data because they enable the measurement and monitoring of detailed features on the Earth's surface. Advances in Earth observation technology have increased the need for intelligent Earth observation approaches. A key problem in understanding these images is assigning each one to the correct predefined semantic category. Classifying remote sensing images is challenging because of considerable within-class differences, a high degree of resemblance between classes, large disparities in object and scene size, and the presence of many commonplace objects. The proposed technique for classifying remote sensing images is based on Vision Transformers and captures long-range relationships between patches through the attention module. We examine how different batch sizes affect the performance of the proposed approach. With a larger batch size the model learns faster than with a smaller one, but larger batch sizes require more resources. Moreover, too large a batch size can degrade the model's performance: in our case, the accuracy is 69.26% with a batch size of 200 but 62.50% with a batch size of 1000, a reduction of around 7 percentage points. The experimental results show that the proposed technique achieves higher classification accuracy than current approaches. The study also demonstrates that combining data augmentation strategies can improve classification accuracy further.
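To make the idea of attention between patches concrete, the following is a minimal sketch in plain Python, not the model used in this study. It assumes a toy 8 x 8 single-channel image, 4 x 4 patches, and identity projections (queries, keys, and values are the raw patch vectors), with no learned weights, positional embeddings, or class token. The function names split_into_patches and self_attention are hypothetical and chosen only for illustration.

```python
import math
import random


def split_into_patches(image, patch):
    """Split an H x W image (list of rows) into flattened, non-overlapping patches."""
    h, w = len(image), len(image[0])
    patches = []
    for top in range(0, h - patch + 1, patch):
        for left in range(0, w - patch + 1, patch):
            patches.append([image[top + i][left + j]
                            for i in range(patch) for j in range(patch)])
    return patches


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention.

    Assumption: identity projections, so Q = K = V = tokens. A real
    Vision Transformer learns separate projection matrices.
    """
    d = len(tokens[0])
    weights = []
    for q in tokens:
        # Similarity of this patch to every patch, including distant ones.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights.append(softmax(scores))
    # Each output is a weighted mix of all patch vectors.
    outputs = [[sum(w * v[c] for w, v in zip(row, tokens)) for c in range(d)]
               for row in weights]
    return outputs, weights


random.seed(0)
image = [[random.random() for _ in range(8)] for _ in range(8)]  # toy image
patches = split_into_patches(image, patch=4)   # 4 patches of 16 values each
outputs, weights = self_attention(patches)     # weights is a 4 x 4 matrix
```

Each row of weights sums to 1 and records how strongly one patch attends to every other patch, regardless of how far apart they are in the image. This is the property that lets the attention module relate distant regions of a scene.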