The Swedish Defence Research Agency (FOI) has published the report Introduction to Multimodal Models, which provides an overview of the latest developments in multimodal neural networks. These AI models combine data from multiple sources, such as text, images, and sound, to build a more accurate representation of complex situations. The technology has potential for military applications but also raises ethical questions, the agency writes in a press release.

– The strength of multimodal models lies in their ability to evaluate different types of information simultaneously. This makes them promising for managing complex environments, such as on the battlefield, but they are still resource-intensive, says Edward Tjörnhammar, researcher at FOI and one of the report's authors, in the press release.

The report highlights areas of use such as the analysis of satellite images, battlefield sounds, and geopositioning. At the same time, the development raises questions about how autonomous decisions in weapon systems and information management affect morality and ethics within the military.

– More autonomous decisions in the decision chain mean that fewer moral decisions are made by humans. This is a central challenge, says Tjörnhammar.

FOI emphasises that commercial solutions often pave the way for military applications but stresses the importance of integrating moral considerations into the development of AI for defence. The report also warns of potential misuse of the technology, which calls for a democratic conversation about its role in society.

The report concludes by stating that multimodal AI models are likely to profoundly impact both everyday life and the future of defence.
