Award Date

August 2015

Degree Type

Thesis

Degree Name

Master of Science in Computer Science

Department

Computer Science

First Committee Member

Kazem Taghva

Second Committee Member

Kazem Taghva

Third Committee Member

Ajoy Datta

Fourth Committee Member

Laxmi Gewali

Fifth Committee Member

Ebrahim Saberinia

Number of Pages

63

Abstract

In this thesis, we present a summary of our activities associated with the storage and query processing of the Google 1T 5-gram data set. We first give a brief introduction to some of the implementation techniques for the relational algebra, followed by a MapReduce implementation of the same operators. We then implement a database schema in Hive for the Google 1T 5-gram data set.

The thesis further examines query processing with Hive and Pig in the Hadoop setting.

More specifically, we report statistics for our queries in this environment.

Keywords

Hive

Disciplines

Computer Sciences

Language

English

