by Sisiaridis, Dimitrios; Markowitch, Olivier
Reference: Advances in Intelligent Systems and Computing, 733, pages 310-321
Publication: Published, 2018
Peer-reviewed article
Abstract: Feature extraction is the first task in pre-processing input logs to detect cybersecurity threats and attacks using machine learning. When analyzing heterogeneous data derived from different sources, this task is time-consuming and difficult to manage efficiently. In this paper we present an approach for handling feature extraction for security analytics of heterogeneous data derived from different network sensors. The approach is implemented in Apache Spark, using its Python API, pyspark.