The relational database model gained popularity and many proponents during the 1
ID: 404059 • Letter: T
Question
The relational database model gained popularity and many proponents during the 1980's and 1990's, a period of time in which IT organizations were seeking to consolidate data from multiple individual information systems into a single data repository which could then be accessed by all of the company's application systems. Keep in mind that this was before the trend toward Enterprise Information Systems (EIS) in which the applications themselves were also brought together under a single umbrella in a comprehensive system such as SAP as done. Relational structures were widely discussed and analyzed, raising relational methodologies to a high degree of development, and almost all major systems included relational database technology.
Since that time, other database models were introduced, most notably the Object Oriented Database Model, which provides an excellent tools for modern businesses.
However, it seems that, up to the present time, the relational model continues as the most prominently used database management tool. Any thoughts about why companies have not moved in significant numbers toward new database models?
Explanation / Answer
Traditionally, data warehouses have been implemented through relational
databases, particularly those optimized for a specic workload known as online
analytical processing (OLAP). A number of vendors oer parallel databases,
but customers nd that they often cannot cost-eectively scale to the crushing
amounts of data an organization needs to deal with today. Parallel databases
are often quite expensive|on the order of tens of thousands of dollars per
terabyte of user data. Over the past few years, Hadoop has gained popularity
as a platform for data-warehousing. Hammerbacher [68], for example, discussed
Facebook's experiences with scaling up business intelligence applications with
Oracle databases, which they ultimately abandoned in favor of a Hadoop-based
solution developed in-house called Hive (which is now an open-source project).
Pig [114] is a platform for massive data analytics built on Hadoop and capable of
handling structured as well as semi-structured data. It was originally developed
by Yahoo, but is now also an open-source project.
Given successful applications of Hadoop to data-warehousing and complex
analytical queries that are prevalent in such an environment, it makes sense to
examine MapReduce algorithms for manipulating relational data. This section
focuses specically on performing relational joins in MapReduce. We should
stress here that even though Hadoop has been applied to process relational
data, Hadoop is not a database. There is an ongoing debate between advocates of parallel databases and proponents of MapReduce regarding the merits
of both approaches for OLAP-type workloads. Dewitt and Stonebraker, two
107