Malware Classification by Deep Learning Using Characteristics of Hash Functions

Takahiro Baba, Kensuke Baba, Toshihiro Yamauchi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


As the Internet develops, the number of Internet of Things (IoT) devices increases. Simultaneously, the risk of IoT devices being infected with malware also increases. Thus, malware detection has become an important issue. Dynamic analysis logs are effective at detecting malware, but it takes time to collect a large amount of data because the malware must be executed at least once before the logs can be collected. Moreover, dynamic analysis logs are affected by external factors such as the execution environment. A malware detection method that uses a static property analysis log could solve these problems. In this study, deep learning (DL) was used as a machine learning method because DL is effective for large-scale data and can automatically extract features. Research has been conducted on malware detection using static properties of portable executable (PE) files, establishing that such detection is possible. However, research on malware detection using hash functions such as fuzzy hash and peHash is lacking. Therefore, we investigated the characteristics of hash values in malware classification. Moreover, when the surface analysis log is viewed in chronological order, the data are considered to have concept drift characteristics. Therefore, we compared malware detection performance using data with the concept drift property. We found that the hash function could be used to prevent performance degradation even with concept drift data. In an experiment combining PE surface information and hash values, concept drift showed the highest performance for certain data.
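
The chronological evaluation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. The sample records, the `first_seen` field, the `ngram_features` helper, and the bucket count are all hypothetical, and the hash strings are made-up placeholders rather than real fuzzy-hash output. The sketch assumes one plausible encoding, in which a hash string becomes a fixed-length vector of character-bigram counts, and it splits samples by time so that a model would be trained on older samples and tested on newer ones, which is the setting where concept drift would appear.

```python
import zlib
from datetime import date


def ngram_features(hash_str, n=2, buckets=64):
    """Count character n-grams of a hash string into a fixed-length vector."""
    vec = [0] * buckets
    for i in range(len(hash_str) - n + 1):
        gram = hash_str[i:i + n]
        # crc32 is deterministic across runs, unlike Python's built-in hash()
        vec[zlib.crc32(gram.encode("utf-8")) % buckets] += 1
    return vec


def chronological_split(samples, cutoff):
    """Return (train, test): samples first seen before `cutoff`, then the rest."""
    ordered = sorted(samples, key=lambda s: s["first_seen"])
    train = [s for s in ordered if s["first_seen"] < cutoff]
    test = [s for s in ordered if s["first_seen"] >= cutoff]
    return train, test


# Hypothetical records; the hash strings are placeholders, not real hash values.
samples = [
    {"fuzzy_hash": "96:abcDEF123:xyz789", "first_seen": date(2020, 3, 1), "label": 1},
    {"fuzzy_hash": "48:QwErTy456:LmN012", "first_seen": date(2019, 7, 15), "label": 0},
    {"fuzzy_hash": "96:abcDEF999:xyz000", "first_seen": date(2021, 1, 10), "label": 1},
]

train, test = chronological_split(samples, cutoff=date(2020, 6, 1))
X_train = [ngram_features(s["fuzzy_hash"]) for s in train]
X_test = [ngram_features(s["fuzzy_hash"]) for s in test]
```

Comparing accuracy under this time-ordered split with accuracy under a random split is one simple way to see how much a given feature set degrades as the data drift.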

Original language: English
Title of host publication: Advanced Information Networking and Applications - Proceedings of the 36th International Conference on Advanced Information Networking and Applications AINA-2022
Editors: Leonard Barolli, Farookh Hussain, Tomoya Enokido
Publisher: Springer Science and Business Media Deutschland GmbH
Number of pages: 12
ISBN (Print): 9783030995867
Publication status: Published - 2022
Event: 36th International Conference on Advanced Information Networking and Applications, AINA 2022 - Sydney, Australia
Duration: Apr 13 2022 → Apr 15 2022

Publication series

Name: Lecture Notes in Networks and Systems
Volume: 450 LNNS
ISSN (Print): 2367-3370
ISSN (Electronic): 2367-3389


Conference: 36th International Conference on Advanced Information Networking and Applications, AINA 2022


Keywords

  • Deep learning
  • Fuzzy hash
  • Malware detection
  • PE file
  • peHash

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Signal Processing
  • Computer Networks and Communications

