Hadoop Based Wavelet Histogram for Big Data in Cloud


Jeong-Joon Kim, Journal of Information Processing Systems Vol. 13, No. 4, pp. 668-676, Aug. 2017  

DOI: 10.3745/JIPS.04.0036
Keywords: Big data, histogram, MapReduce, wavelet

Abstract

The importance of big data has grown with the spread of smartphones, the web, and social networking services (SNS). As a result, MapReduce, which can process big data efficiently, is receiving worldwide attention for its scalability and stability. Because big data is characterized by large volume, rapid generation, and varied properties, it is more efficient to process summary information about the data than the data itself. The wavelet histogram, a typical technique for generating data summaries, can produce an optimal summary that does not lose the information in the original data. Systems that generate wavelet histograms with MapReduce have therefore been actively studied. However, existing approaches generate the wavelet histogram through one or more MapReduce jobs, so generation is slow, and the error of the data restored from the wavelet histogram is likely to become large. The MapReduce-based wavelet histogram generation system developed in this paper generates the wavelet histogram in a single MapReduce job, which greatly increases generation speed. In addition, because the wavelet histogram is generated according to an error bound specified by the user, the error of the data restored from the wavelet histogram can be controlled. Finally, we verified the efficiency of the developed system through performance evaluation.
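
To make the idea of an error-bounded wavelet histogram concrete, the following is a minimal in-memory sketch in Python. It is not the system described in the paper and does not model the MapReduce job. It assumes an unnormalized Haar wavelet, an input whose length is a power of two, and a simple greedy rule that drops the smallest coefficients as long as the maximum absolute reconstruction error stays within the bound. The function names (haar_transform, haar_inverse, summarize, reconstruct) are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def haar_transform(values: List[float]) -> List[float]:
    """Unnormalized Haar decomposition: [overall average, details coarse-to-fine]."""
    n = len(values)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    current = [float(v) for v in values]
    details: List[float] = []
    while len(current) > 1:
        pairs = range(0, len(current), 2)
        averages = [(current[i] + current[i + 1]) / 2 for i in pairs]
        diffs = [(current[i] - current[i + 1]) / 2 for i in pairs]
        details = diffs + details  # coarser levels end up in front
        current = averages
    return current + details


def haar_inverse(coeffs: List[float]) -> List[float]:
    """Rebuild the original values from the coefficient layout used above."""
    current = [coeffs[0]]
    pos = 1
    while pos < len(coeffs):
        diffs = coeffs[pos:pos + len(current)]
        expanded: List[float] = []
        for avg, diff in zip(current, diffs):
            expanded.extend([avg + diff, avg - diff])
        pos += len(current)
        current = expanded
    return current


def summarize(values: List[float], error_bound: float) -> Dict[int, float]:
    """Greedy sketch (an assumption, not the paper's method): zero out
    coefficients from smallest to largest magnitude, keeping a coefficient
    only if dropping it would push the max absolute error past the bound."""
    coeffs = haar_transform(values)
    kept = list(coeffs)
    for i in sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k])):
        saved = kept[i]
        kept[i] = 0.0
        restored = haar_inverse(kept)
        if max(abs(a - b) for a, b in zip(values, restored)) > error_bound:
            kept[i] = saved  # dropping this coefficient violates the bound
    return {i: c for i, c in enumerate(kept) if c != 0.0}


def reconstruct(summary: Dict[int, float], n: int) -> List[float]:
    """Restore approximate values from the sparse coefficient summary."""
    coeffs = [0.0] * n
    for i, c in summary.items():
        coeffs[i] = c
    return haar_inverse(coeffs)


if __name__ == "__main__":
    data = [2, 2, 0, 2, 3, 5, 4, 4]  # toy frequency counts
    summary = summarize(data, error_bound=1.0)
    print(summary)
    print(reconstruct(summary, len(data)))
```

With an error bound of 0 the sketch keeps every nonzero coefficient and restores the data exactly. A larger bound lets it store fewer coefficients at the cost of a bounded reconstruction error, which is the trade-off the abstract describes.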



Cite this article
[APA Style]
Kim, J. (2017). Hadoop Based Wavelet Histogram for Big Data in Cloud. Journal of Information Processing Systems, 13(4), 668-676. DOI: 10.3745/JIPS.04.0036.

[IEEE Style]
J. Kim, "Hadoop Based Wavelet Histogram for Big Data in Cloud," Journal of Information Processing Systems, vol. 13, no. 4, pp. 668-676, 2017. DOI: 10.3745/JIPS.04.0036.

[ACM Style]
Jeong-Joon Kim. 2017. Hadoop Based Wavelet Histogram for Big Data in Cloud. Journal of Information Processing Systems, 13, 4, (2017), 668-676. DOI: 10.3745/JIPS.04.0036.