Abstract: An important challenge that existing work has yet to address is the relatively small variation among audio representations compared with the rich content provided by remote sensing (RS) ...