Abstract: The paper introduces VATMAN (Video-Audio-Text Multimodal Abstractive summarizatioN), a novel approach that generates hierarchical multimodal summaries using Trimodal Hierarchical ...