Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

The relational database model gained popularity and many proponents during the 1

ID: 404059 • Letter: T

Question

The relational database model gained popularity and many proponents during the 1980's and 1990's, a period of time in which IT organizations were seeking to consolidate data from multiple individual information systems into a single data repository which could then be accessed by all of the company's application systems.  Keep in mind that this was before the trend toward Enterprise Information Systems (EIS) in which the applications themselves were also brought together under a single umbrella in a comprehensive system such as SAP as done.  Relational structures were widely discussed and analyzed, raising relational methodologies to a high degree of development, and almost all major systems included relational database technology.

Since that time, other database models were introduced, most notably the Object Oriented Database Model, which provides an excellent tools for modern businesses.

However, it seems that, up to the present time, the relational model continues as the most prominently used database management tool.  Any thoughts about why companies have not moved in significant numbers toward new database models?

Explanation / Answer

Traditionally, data warehouses have been implemented through relational

databases, particularly those optimized for a specic workload known as online

analytical processing (OLAP). A number of vendors oer parallel databases,

but customers nd that they often cannot cost-eectively scale to the crushing

amounts of data an organization needs to deal with today. Parallel databases

are often quite expensive|on the order of tens of thousands of dollars per

terabyte of user data. Over the past few years, Hadoop has gained popularity

as a platform for data-warehousing. Hammerbacher [68], for example, discussed

Facebook's experiences with scaling up business intelligence applications with

Oracle databases, which they ultimately abandoned in favor of a Hadoop-based

solution developed in-house called Hive (which is now an open-source project).

Pig [114] is a platform for massive data analytics built on Hadoop and capable of

handling structured as well as semi-structured data. It was originally developed

by Yahoo, but is now also an open-source project.

Given successful applications of Hadoop to data-warehousing and complex

analytical queries that are prevalent in such an environment, it makes sense to

examine MapReduce algorithms for manipulating relational data. This section

focuses specically on performing relational joins in MapReduce. We should

stress here that even though Hadoop has been applied to process relational

data, Hadoop is not a database. There is an ongoing debate between advocates of parallel databases and proponents of MapReduce regarding the merits

of both approaches for OLAP-type workloads. Dewitt and Stonebraker, two

107