Data compression in multimedia databases

Pallavi Mahajan

Assistant Professor, RRMK Arya Mahila Mahavidyalaya, Pathankot

Neelam

Assistant Professor, RRMK Arya Mahila Mahavidyalaya, Pathankot

The growth of internet technology allows users to use and share multimedia data such as images and videos. In this paper we survey research papers that describe techniques for processing multimedia data. To illustrate the approach, we discuss the split-and-merge architecture used for video encoding, which handles the size of the input video file through dynamic resource provisioning. The conventional approach to transcoding such data requires costly hardware because of high-definition features. We propose a new system that fuses several of these techniques and works on the images and videos used on the internet.

KEYWORDS: Hadoop Distributed File System (HDFS); Singular Value Decomposition (SVD)

INTRODUCTION

Data compression deals with large amounts of data that require substantial processing, storage, and communication resources. In this study, we design and implement data compression based on the MapReduce strategy and Hadoop to address these problems.

We use the MapReduce framework because it provides a specific programming model and a runtime system for processing and creating large data sets, and it is amenable to a variety of real-world tasks. A MapReduce program runs in two main steps, mapping and reducing, which are defined by a mapper and a reducer. The mapping step receives the input data set and feeds each data element to the mapper as a key-value pair. In the reducing step, the reducer processes all outputs from the mapper and generates the final result by merging them. Hadoop applications use HDFS (Hadoop Distributed File System), which creates replicas of data blocks and distributes them across the compute nodes of a cluster. The implementation is divided into two parts:

The first part stores the large volume of data in HDFS. The second part processes the data stored in HDFS using MapReduce and JAI (Java Advanced Imaging), compressing it into the target formats. The implementation starts by reading multimedia data as input through the RecordReader provided by the InputFormat class. InputFormat transforms the data into sets of keys and values, which are then passed to the mapper.

The mapper processes the video data using JAI and a compression module, which compresses the video data into formats suitable for smartphones, pads, and personal computers. After compression completes, the mapper sends the result to OutputFormat, and the RecordWriter of the OutputFormat class writes the result as a file to HDFS.
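The key-value flow described above can be imitated in a few lines of Python. This is only an in-process sketch of the map, shuffle, and reduce phases, not the Hadoop API; the names `run_mapreduce`, `size_by_extension`, and `total`, as well as the sample records, are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Tiny in-process imitation of the map, shuffle, and reduce phases."""
    intermediate = defaultdict(list)
    # Map phase: every (key, value) input record may emit several pairs.
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            # Shuffle: group the intermediate values by their key.
            intermediate[out_key].append(out_value)
    # Reduce phase: merge all values that share a key into one result.
    return {k: reducer(k, vs) for k, vs in sorted(intermediate.items())}


def size_by_extension(name, size_mb):
    # Hypothetical mapper: emit (file extension, size in MB).
    yield name.rsplit(".", 1)[-1], size_mb


def total(_key, values):
    # Hypothetical reducer: sum the sizes collected for one extension.
    return sum(values)


records = [("a.mp4", 700), ("b.jpg", 4), ("c.mp4", 300), ("d.jpg", 6)]
result = run_mapreduce(records, size_by_extension, total)
print(result)  # {'jpg': 10, 'mp4': 1000}
```

In a real Hadoop job the records would come from an InputFormat and the reducer output would be written through an OutputFormat; here plain Python lists and dictionaries stand in for both.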

RELATED WORK

In [1], the authors use Singular Value Decomposition (SVD) to express image data in terms of a number of eigenvectors that depends on the dimensions of the image. The psychovisual redundancies in an image are exploited for compression, so an image can be compressed without perceptibly affecting its quality.

In [2], the authors use the mean squared error (MSE) and the compression ratio as thresholding parameters for reconstruction. SVD is applied to a variety of images for experimentation, and the work concentrates on reducing the number of eigenvalues required to reconstruct an image.

In [3], the authors present a data compression algorithm called J-bit encoding (JBE). This algorithm manipulates each bit of data inside a file to minimize its size without losing any data after decoding, so it is classified as lossless compression. The algorithm is intended to be combined with other data compression algorithms to optimize the compression ratio, and its performance is measured by comparing combinations of different data compression algorithms.

In [4], the authors describe run-length encoding (RLE), one of the basic techniques for data compression. The idea is this: if a data item d occurs n consecutive times in the input stream, replace the n occurrences with the single pair nd. RLE is mainly used to compress runs of the same byte, and it is useful when repetition occurs often in the data.
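The run-length idea can be made concrete with a short Python sketch. It stores each run as a (count, byte) pair, which is one possible encoding of the pair nd; the function names `rle_encode` and `rle_decode` are hypothetical.

```python
def rle_encode(data: bytes) -> list[tuple[int, int]]:
    """Collapse each run of identical bytes into a (count, byte) pair."""
    pairs = []
    for byte in data:
        if pairs and pairs[-1][1] == byte:
            # Same byte as the current run: extend the run by one.
            pairs[-1] = (pairs[-1][0] + 1, byte)
        else:
            # A different byte starts a new run of length one.
            pairs.append((1, byte))
    return pairs


def rle_decode(pairs) -> bytes:
    """Expand every (count, byte) pair back into its run."""
    return b"".join(bytes([byte]) * count for count, byte in pairs)


encoded = rle_encode(b"aaaabbbcca")
print(encoded)  # [(4, 97), (3, 98), (2, 99), (1, 97)]
```

Decoding the pairs restores the input exactly, which is why RLE is lossless. The final pair also shows its weakness: a run of length one costs more to store than the byte it replaces.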

PROPOSED ALGORITHM

• Design Considerations:

• An image or video file is uploaded.

• Frames are distributed appropriately among the parallel systems.

• The SVD algorithm is used for compression.

• All frames are merged sequentially.

• If the input is a video file, the video and audio files are combined.

• The final compressed file is obtained as the result.

• Description of the Proposed Algorithm:

The aim of the proposed algorithm is to design a system that compresses high-definition videos and images using the Hadoop Distributed File System and MapReduce. The proposed algorithm consists of four main steps.
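The split, compress, and merge flow listed in the design considerations can be sketched in Python as follows. This is a single-machine illustration under stated assumptions: a thread pool stands in for the parallel systems, zlib stands in for the SVD compression stage, and the function names and the synthetic frames are hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def split(frames, n_workers):
    """Cut the frame list into contiguous chunks, at most one per worker."""
    size = -(-len(frames) // n_workers)  # ceiling division
    return [frames[i:i + size] for i in range(0, len(frames), size)]


def compress_chunk(chunk):
    # Stand-in compressor: zlib, NOT the SVD stage described in the text.
    return [zlib.compress(frame) for frame in chunk]


def split_compress_merge(frames, n_workers=4):
    if not frames:
        return []
    chunks = split(frames, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map returns results in input order, so frame order survives.
        compressed_chunks = list(pool.map(compress_chunk, chunks))
    # Merge: flatten the per-worker results back into one frame sequence.
    return [frame for chunk in compressed_chunks for frame in chunk]


frames = [bytes([i]) * 1000 for i in range(10)]  # hypothetical raw frames
compressed = split_compress_merge(frames)
print(len(compressed))  # 10
```

Keeping the chunks contiguous and merging them in submission order is what makes the sequential merge step trivial; a scheme that interleaved frames across workers would need explicit frame indices to restore the order.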

DATA COMPRESSION VIA SVD

A. Data Matrix

Consider, for example, that measurements from m different meters, taken over t time instants, need to be transmitted through the communications network of a smart distribution system to serve as inputs for a given application. Let this set of measurements be put in the form of a matrix X, with each row of X containing the measurements taken from a given meter at each time instant.

B. Data Compression

Data matrix X can be factorized into three matrices by applying the SVD. To simplify the notation, matrix $V^T$ is hereafter written as just V:

$$X_{(m \times t)} = U_{(m \times m)}\, \Sigma_{(m \times t)}\, V_{(t \times t)}$$

where the diagonal matrix Σ contains the singular values (SVs) of X, ordered from the highest to the lowest. Data compression can be achieved by taking advantage of the fact that many matrices occurring in practice exhibit some kind of structure that leads to only a few SVs being non-negligible. In such cases, a good approximation of matrix X can be obtained by keeping only the SVs found to be significant in matrix Σ. Assume that r SVs are to be retained in Σ and let matrix $X_R$ denote the approximated matrix X.
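The approximation can be written out explicitly. The form below is a sketch that is consistent with the element count used in the compression ratio of this paper; the subscripted names $U_r$, $\Sigma_r$, and $V_r$ are notation introduced here for illustration, with Σ denoting the diagonal matrix of SVs:

$$X_{R\,(m \times t)} = U_{r\,(m \times r)}\, \Sigma_{r\,(r \times r)}\, V_{r\,(r \times t)}$$

Here $U_r$ holds the first r columns of U, $\Sigma_r$ is the leading r × r block of Σ, and $V_r$ holds the first r rows of V (which stands for $V^T$ in the simplified notation). With $\sigma_i$ denoting the i-th SV, the error of this approximation in the Frobenius norm depends only on the discarded SVs:

$$\lVert X - X_R \rVert_F = \sqrt{\sum_{i > r} \sigma_i^2}$$

so the approximation is good exactly when the SVs beyond the r-th are small.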

C. Compression Ratio

The extent of compression achieved by a coding scheme can be measured by a CR. The term CR has been defined in several ways in the literature. In many contexts, the CR is computed by dividing the size of the original data by the size of the compressed data. A CR = 4, for example, means that the data has been compressed with the ratio 4:1.

Alternatively, it can be said that the volume of the compressed data is 25% of the original data. In this paper, the CR is computed as the ratio between the total number of elements in the original matrix X (the measurements) and the total number of elements in the submatrices needed to compute matrix $X_R$:

$$\mathrm{CR} = \frac{m \times t}{(m + t + 1) \times r}$$
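As a check on the denominator, the count can be worked through step by step. The sizes m = 100, t = 1000, and r = 5 are hypothetical values chosen only for illustration. Keeping r SVs means storing r columns of U, the r SVs themselves, and r rows of V:

$$m r + r + r t = (m + t + 1)\, r$$

$$\mathrm{CR} = \frac{100 \times 1000}{(100 + 1000 + 1) \times 5} = \frac{100000}{5505} \approx 18.2$$

Under these assumed sizes, the compressed representation holds roughly 5.5% of the original number of elements.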

D. Loss of Information

As discussed, lossy compression methods can be very effective for data compression, but this comes at a cost: information is lost and cannot be retrieved when the original data is reconstructed. Data compression should therefore be carried out so that a good tradeoff between the CR and the loss of information is achieved. In other words, data compression should not lose so much information that the reconstructed matrix is of limited use to the applications that would employ it as input data. In this paper, the loss of information is measured in terms of the mean absolute error (MAE) and the mean percentage error (MPE) observed when comparing the reconstructed data matrix with the original one.
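The two error measures can be illustrated with a short Python sketch. The text does not give their formulas, so the definitions used here are assumptions: MAE is taken as the average of |x − x̂| over all entries, and MPE as the average of 100 · |x − x̂| / |x| over the nonzero entries. The function names and the sample matrices are hypothetical.

```python
def mean_absolute_error(original, reconstructed):
    """Average of |x - x_hat| over all matrix entries (assumed definition)."""
    diffs = [abs(x - y)
             for row_o, row_r in zip(original, reconstructed)
             for x, y in zip(row_o, row_r)]
    return sum(diffs) / len(diffs)


def mean_percentage_error(original, reconstructed):
    """Average of 100 * |x - x_hat| / |x| over entries with x != 0 (assumed)."""
    ratios = [100.0 * abs(x - y) / abs(x)
              for row_o, row_r in zip(original, reconstructed)
              for x, y in zip(row_o, row_r) if x != 0]
    return sum(ratios) / len(ratios)


X = [[10.0, 20.0], [40.0, 50.0]]      # hypothetical original data
X_hat = [[11.0, 18.0], [40.0, 55.0]]  # hypothetical reconstruction
mae = mean_absolute_error(X, X_hat)
mpe = mean_percentage_error(X, X_hat)
print(mae, mpe)  # 2.0 7.5
```

The MAE is expressed in the units of the measurements, while the MPE is relative, so the two can rank reconstructions differently when the entries of X span a wide range of magnitudes.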

PSEUDO CODE

Step 1: Form data matrix X

Step 2: Perform SVD to obtain matrices U, Σ, and V.

Step 3: Based on a value of r chosen to achieve a given CR, retain only the r largest SVs in Σ, together with the corresponding r columns of U and r rows of V.

Step 4: Reconstruct matrix X by computing $X_R$.

Step 5: Evaluate the loss of information by computing the MAE and MPE.

Step 6: End.
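Steps 1 to 5 can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data matrix is synthetic, the helper names are hypothetical, r is chosen as the largest rank that still meets the target CR, and the MAE and MPE use the common absolute-error definitions, which the text does not spell out.

```python
import numpy as np


def rank_for_cr(m, t, target_cr):
    """Largest r whose CR = m*t / ((m + t + 1) * r) still meets the target."""
    r = int(m * t / ((m + t + 1) * target_cr))
    return max(1, min(r, m, t))


def svd_compress(X, target_cr):
    m, t = X.shape
    r = rank_for_cr(m, t, target_cr)
    # Step 2: thin SVD; s holds the singular values in descending order.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Step 3: keep the r largest SVs with the matching columns and rows.
    return U[:, :r], s[:r], Vt[:r, :]


def reconstruct(U_r, s_r, Vt_r):
    # Step 4: X_R = U_r * diag(s_r) * Vt_r
    return (U_r * s_r) @ Vt_r


# Step 1: hypothetical data matrix, m = 20 meters by t = 200 time instants.
rng = np.random.default_rng(0)
profile = np.sin(np.linspace(0, 4 * np.pi, 200)) + 2.0
X = np.outer(rng.uniform(0.5, 1.5, 20), profile)
X = X + 0.01 * rng.standard_normal((20, 200))

U_r, s_r, Vt_r = svd_compress(X, target_cr=9.0)
X_R = reconstruct(U_r, s_r, Vt_r)

# Step 5: loss of information (assumed definitions of MAE and MPE).
mae = np.mean(np.abs(X - X_R))
mpe = 100.0 * np.mean(np.abs(X - X_R) / np.abs(X))
cr = X.size / ((X.shape[0] + X.shape[1] + 1) * len(s_r))
print(len(s_r), round(cr, 2))  # 2 9.05
```

The synthetic matrix is close to rank one by construction, so a small r already reconstructs it well; real measurements may need a larger r for the same error, which lowers the achievable CR.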

CONCLUSION AND FUTURE WORK

In this project, we compress any high-definition video uploaded by a user. The video data must be stored in a database, so we use the Hadoop Distributed File System (HDFS) to ease the storage problem. The project is useful for companies that work with large volumes of multimedia data, and it helps reduce buffering time and give satisfactory results to users watching the video. We set up a parallel distributed environment by establishing the nodes on our laptops. In this environment the data is distributed and processed in parallel, so the end state (the compressed video) is reached in less time. The proposed module is based on Hadoop HDFS and the MapReduce framework for distributed parallel processing of large-scale video data. We redesigned and implemented InputFormat and OutputFormat in the MapReduce framework for image data, and we used the JAI library to convert the video format and compress the videos.

REFERENCES

[1] T. White, “Hadoop: The Definitive Guide,” 4th ed. O'Reilly Media, 2015, ISBN 978-1-491-90163-2.

[2] A. Ghassemi, S. Bavarian, and L. Lampe, “Cognitive radio for smart grid communications,” in Proc. 1st IEEE Int. Conf. Smart Grid Commun., Gaithersburg, MD, USA, 2010, pp. 297–302.

[3] G. T. Heydt, “The next generation of power distribution systems,” IEEE Trans. Smart Grid, vol. 1, no. 3, pp. 225–335, Dec. 2010.

[4] G. Artale et al., “Medium voltage smart grid: Experimental analysis of secondary substation narrow band power line communication,” IEEE Trans. Instrum. Meas., vol. 62, no. 9, pp. 2391–2398, Sep. 2013.

[5] K. Shvachko, H. Kuang, S. Radia, and R. Chansler, “The Hadoop distributed file system,” in Proc. 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies, pp. 1–10, May 2010.

[6] A. Ghassemi, S. Bavarian, and L. Lampe, “Cognitive radio for smart grid communications,” in Proc. 1st IEEE Int. Conf. Smart Grid Commun., Gaithersburg, MD, USA, 2010, pp. 297–302.

[7] I. Ahmad, X. Wei, Y. Sun, and Y.-Q. Zhang, “Video transcoding: An overview of various techniques and research issues,” IEEE, 2005.

[8] M. Kim, S. Han, Y. Cui, H. Lee, and C. Jeong, “A Hadoop-based multimedia transcoding system for processing social media in the PaaS platform of SMCCSE.”
