Retrieval-Augmented Generation for Identifying ATT&CK Technique

Sheng-Shan Chen,
Kai-Siang Cao,
Chung-Kuan Chen,
Chin-Yu Sun

Abstract


Cyber Threat Intelligence (CTI) analysis faces significant challenges due to the scale and complexity of threat data. Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) offer promising solutions; however, existing approaches often struggle with limited accuracy and hallucination. We propose an enhanced RAG framework that incorporates fine-tuned BERT embeddings for semantic retrieval and technique annotation, coupled with structured prompt generation to guide LLMs toward more precise and context-aware threat analysis. Compared with traditional encoder-only architectures, our framework substantially improves both accuracy and efficiency. Experiments conducted on the MITRE ATT&CK database and recent open-source threat reports demonstrate that our model achieves an F1-score of 0.93, outperforming state-of-the-art baselines including GPT-4 and LLaMA-3. These results highlight the potential of advanced RAG architectures to enable scalable, accurate, and trustworthy automated CTI analysis.
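
The retrieve-then-prompt flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The three-entry technique table, the function names, and the prompt wording are all hypothetical, and a toy bag-of-words vector stands in for the fine-tuned BERT embeddings so that the example runs with the standard library alone.

```python
import math
import re
from collections import Counter

# Hypothetical mini knowledge base (assumption): a few ATT&CK technique IDs
# with paraphrased one-line descriptions, not the full MITRE ATT&CK database.
TECHNIQUES = {
    "T1566": "Phishing: adversaries send phishing email messages with "
             "malicious attachment or link to gain access",
    "T1059": "Command and Scripting Interpreter: adversaries abuse command "
             "shell and script interpreters such as PowerShell to execute commands",
    "T1486": "Data Encrypted for Impact: adversaries encrypt files and data "
             "on target systems to interrupt availability, as in ransomware",
}


def embed(text):
    """Toy bag-of-words vector (assumption): a stand-in for a BERT embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(sentence, k=2):
    """Return the k technique IDs most similar to the sentence, with scores."""
    query = embed(sentence)
    scored = [(cosine(query, embed(desc)), tid) for tid, desc in TECHNIQUES.items()]
    scored.sort(reverse=True)
    return [(tid, score) for score, tid in scored[:k]]


def build_prompt(sentence, candidates):
    """Assemble a structured prompt that restricts the LLM to retrieved candidates."""
    lines = [
        "Task: identify the ATT&CK technique described in the report sentence.",
        f"Report sentence: {sentence}",
        "Candidate techniques:",
    ]
    for tid, score in candidates:
        lines.append(f"- {tid}: {TECHNIQUES[tid]} (similarity {score:.2f})")
    lines.append("Answer with one technique ID from the candidates, or NONE.")
    return "\n".join(lines)


if __name__ == "__main__":
    report = "The attackers sent a phishing email with a malicious attachment."
    print(build_prompt(report, retrieve(report)))
```

Restricting the answer to retrieved candidates is one plausible way a structured prompt could limit hallucinated technique IDs. In a real system, `embed` would be replaced by the fine-tuned encoder and the resulting prompt would be passed to the LLM.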


Citation Format:
Sheng-Shan Chen, Kai-Siang Cao, Chung-Kuan Chen, Chin-Yu Sun, "Retrieval-Augmented Generation for Identifying ATT&CK Technique," Communications of the CCISA, vol. 31, no. 3, pp. 20-39, Aug. 2025.

Published by Chinese Cryptology and Information Security Association (CCISA), Taiwan, R.O.C.
CCISA Editorial Office
E-mail: ccisa.editor@gmail.com