Commit 8a19652

Author: John Sichi

HIVE-2598. Update README.txt file to use description from wiki
(Carl Steinbach via jvs)

git-svn-id: https://svn.apache.org/repos/asf/hive/trunk@1203885 13f79535-47bb-0310-9956-ffa450edef68
1 parent 4910f33 commit 8a19652

1 file changed, +24 -11 lines changed

README.txt

@@ -1,14 +1,27 @@
-Apache Hive @VERSION@
-=================
-
-Apache Hive is a data warehouse system for Hadoop that facilitates
-easy data summarization, ad-hoc querying and analysis of large
-datasets stored in Hadoop compatible file systems. Hive provides a
-mechanism to put structure on this data and query the data using a
-SQL-like language called HiveQL. At the same time this language also
-allows traditional map/reduce programmers to plug in their custom
-mappers and reducers when it is inconvenient or inefficient to express
-this logic in HiveQL.
+Apache Hive (TM) @VERSION@
+======================
+
+The Apache Hive (TM) data warehouse software facilitates querying and
+managing large datasets residing in distributed storage. Built on top
+of Apache Hadoop (TM), it provides:
+
+* Tools to enable easy data extract/transform/load (ETL)
+
+* A mechanism to impose structure on a variety of data formats
+
+* Access to files stored either directly in Apache HDFS (TM) or in other
+data storage systems such as Apache HBase (TM)
+
+* Query execution via MapReduce
+
+Hive defines a simple SQL-like query language, called QL, that enables
+users familiar with SQL to query the data. At the same time, this
+language also allows programmers who are familiar with the MapReduce
+framework to be able to plug in their custom mappers and reducers to
+perform more sophisticated analysis that may not be supported by the
+built-in capabilities of the language. QL can also be extended with
+custom scalar functions (UDF's), aggregations (UDAF's), and table
+functions (UDTF's).
 
 Please note that Hadoop is a batch processing system and Hadoop jobs
 tend to have high latency and incur substantial overheads in job
