High performance deep learning techniques for big data analytics
Author(s) - Li Maozhen
Publication year - 2018
Publication title - Concurrency and Computation: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.5032
Subject(s) - big data , deep learning , computer science , data science , artificial intelligence , analytics , machine learning , data analysis , artificial neural network , data mining
The past few years have witnessed the momentum of big data, which continues to receive a growing effort from both academia and industry. The challenge with big data is how to extract meaningful information and knowledge from it. Recently, deep learning1 as an advanced machine learning technique has been widely taken up by the research community due to its multi-layered structure and its effectiveness in feature extraction. It is therefore critical to explore advanced and high performance deep learning techniques for big data analytics, especially for heterogeneous big data analytics covering the processes of data acquisition, feature extraction and representation, time series data analysis, knowledge representation, and semantic modeling. This special issue aimed to solicit high quality research articles and reviews reflecting the advances in deep learning for big data analytics characterized by high volume, velocity, variety, and veracity. Potential topics included parallel deep neural networks for high-volume data analytics, high performance deep neural networks for data stream analytics, semantic modeling in big data analytics, optimized architectural designs of deep neural networks, parameter tuning in deep neural networks, and distributed deep neural networks for big data analytics. We received a large number of submissions for this special issue and conducted a rigorous review process. The papers included in this special issue cover a wide scope of topics related to deep learning. A portion of the papers focus on traditional machine learning. Xu et al2 present a hybrid interpretable model for fraud detection in user credit transactions. Yan et al3 optimize a neural network with genetic algorithms for finance early warning in the insurance sector.
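As a generic illustration of the genetic algorithm approach mentioned above (not the model of Yan et al3), a GA can evolve the weights of even a single linear neuron by selection, crossover, and mutation. The sketch below uses synthetic data and invented hyperparameters:

```python
import random

random.seed(1)

# Synthetic data for a single linear neuron y = w0*x + w1 (true weights: w0=2, w1=1)
DATA = [(x, 2 * x + 1) for x in range(-5, 6)]

def fitness(w):
    # Negative mean squared error: higher is better
    w0, w1 = w
    return -sum((w0 * x + w1 - y) ** 2 for x, y in DATA) / len(DATA)

def evolve(pop_size=40, generations=60, mutation=0.2):
    # Initial population of random weight vectors
    pop = [[random.uniform(-5, 5), random.uniform(-5, 5)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]                # selection (elitist)
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            child = [random.choice(pair) for pair in zip(a, b)]      # crossover
            child = [g + random.gauss(0, mutation) for g in child]   # mutation
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()
print(best)  # close to the true weights [2, 1]
```

The same loop scales in principle to full neural network weight vectors, which is the general idea behind GA-based network optimization.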
Building on compressive sensing theory,4-6 Jin et al7 focus on adaptive learning for target tracking, accounting for the fact that individual samples make different contributions to the learning process. Zhang et al8 present a transfer learning based online algorithm for multi-person tracking in surveillance videos. The algorithm takes the background information into account during tracking inference and deals with object occlusions. It features online transfer learning, utilizing the knowledge extracted from the current state of the tracked targets to generate tracking decisions. Petri nets are also used for supervised and unsupervised learning.9 In the works of Tian et al10 and Wang et al,11 Petri nets are optimized for learning activities from business processes. It is worth noting that a large portion of the papers included in this special issue target deep learning using neural networks such as Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks. Cheng et al12 investigate data-driven pedestrian re-identification based on a hierarchical semantic representation. A CNN is enhanced with semantic representations based on middle-level attributes. The attribute features that contribute to the modeling not only encode the color and shape information of low-level features but also associate semantics with those features. A deep CNN is introduced in the work of Zhao et al13 for drowsy student state detection, combined with the AdaBoost face detection algorithm14 and PERCLOS drowsiness judgment.15 Liu et al16 extract features from EEG recordings based on a hybrid dimension feature reduction scheme for emotion detection. A total of 14 features are extracted and further reduced with the Principal Component Analysis (PCA) method. Li et al17 present a semantically enhanced convolutional deep Boltzmann machine that combines a convolutional neural network with a deep Boltzmann machine for image classification.
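The PCA reduction step used by Liu et al16 can be sketched generically: project the centered feature vectors onto the directions of maximal variance. A minimal sketch (the 14-feature shape mirrors the description above; the data here are synthetic, not EEG recordings):

```python
import numpy as np

def pca_reduce(X, n_components):
    """Reduce feature dimensionality by projecting onto the
    top principal components (directions of maximal variance)."""
    X_centered = X - X.mean(axis=0)
    # Covariance matrix of the centered features (d x d)
    cov = np.cov(X_centered, rowvar=False)
    # eigh returns eigenvalues in ascending order for symmetric matrices
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Keep the eigenvectors with the largest eigenvalues
    top = eigvecs[:, np.argsort(eigvals)[::-1][:n_components]]
    return X_centered @ top

# e.g. 200 samples, each described by 14 extracted features
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 14))
X_reduced = pca_reduce(X, n_components=5)
print(X_reduced.shape)  # (200, 5)
```

In practice the number of retained components is usually chosen so that the kept eigenvalues explain a target fraction of the total variance.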
In the model of Li et al,17 two hidden deep learning models are employed to extract image semantics, and the high-level semantics of an image are obtained by learning from the input image. CNNs have also been widely applied in video content analysis. Wang et al18 apply CNNs for automatic recognition of human actions in surveillance videos. In addition to image classification and video content analysis, CNNs are applied in other areas such as natural language processing. For example, Yao et al19 apply a CNN to sentence similarity computation. The incorporation of word embeddings in the CNN reduces the training time compared with other neural networks such as LSTM networks and deep CNNs, and facilitates the processing of sentences of any length. Similarly, word embeddings are employed in the training process in the work of Yan et al20 for the retrieval of word semantics. Wang et al21 apply deep neural networks such as CNNs and LSTMs to process EEG signals for brain-computer interaction. Hadoop,22 as an open source implementation of the MapReduce model,23 has been widely employed in dealing with ever growing data sets by utilizing the resources of computer clusters. It has been widely recognized that training deep neural networks such as CNN and LSTM models can be time consuming, especially on large data sets. Li et al24 speed up the computation of the Hartley transform by distributing the workload across a number of computers using the Hadoop framework. Liu et al25 parallelize a gene expression programming algorithm for function mining in big data settings. Similarly, Zhang et al26 apply Hadoop to speed up the computation in power load forecasting by clustering spatial-temporal load data. Shou and Li27 parallelize the process of local outlier detection in large data set summarization, although Hadoop is not employed in this work.
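The MapReduce model underlying Hadoop splits a computation into independent map tasks whose intermediate key-value pairs are grouped by key and combined by reduce tasks. A minimal in-process sketch of the idea (a word-count toy, not the Hadoop API itself):

```python
from collections import defaultdict
from itertools import chain

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document;
    # on a cluster, each document set would be a separate map task
    return chain.from_iterable(
        ((word, 1) for word in doc.split()) for doc in documents
    )

def reduce_phase(pairs):
    # Shuffle/reduce: group intermediate pairs by key and sum the counts
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

docs = ["big data deep learning", "deep learning for big data analytics"]
print(reduce_phase(map_phase(docs)))
# {'big': 2, 'data': 2, 'deep': 2, 'learning': 2, 'for': 1, 'analytics': 1}
```

The speedup in the Hadoop-based papers above comes from running the map phase in parallel across the machines of a cluster, with the framework handling data distribution and fault tolerance.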