Apache MRQL: Another Powerful Open-Source Incubator Project from Apache

Date: 2015-05-25 16:47:38

Tags: distributed, hadoop, mrql, bigdata

MRQL (pronounced "miracle") is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop, Hama, Spark, and Flink.

MRQL (the MapReduce Query Language) is also the name of the system's SQL-like query language for large-scale data analysis on a cluster of computers. The MRQL query processing system can evaluate MRQL queries in four modes:

  1. in Map-Reduce mode using Apache Hadoop,
  2. in BSP mode (Bulk Synchronous Parallel mode) using Apache Hama,
  3. in Spark mode using Apache Spark, and
  4. in Flink mode using Apache Flink.

The MRQL query language is powerful enough to express most common data analysis tasks over many forms of raw in-situ data, such as XML and JSON documents, binary files, and CSV documents. MRQL is more powerful than other current high-level MapReduce languages, such as Hive and Pig Latin, because it can operate on more complex data and supports more powerful query constructs, which eliminates the need for explicit MapReduce code. With MRQL, users can express complex data analysis tasks, such as PageRank, k-means clustering, and matrix factorization, using SQL-like queries exclusively, while the MRQL query processing system compiles these queries to efficient Java code.
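To make concrete what "explicit MapReduce code" means, here is a minimal sketch in plain Python of the map, shuffle, and reduce steps behind a simple "total price per customer" aggregation, the kind of work a single SQL-like group-by query would replace. It is neither MRQL syntax nor the Java code MRQL generates; the sample records and all function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical sample records (customer_key, total_price); not from the article.
ORDERS = [("c1", 10.0), ("c2", 5.0), ("c1", 2.5), ("c3", 7.0), ("c2", 1.0)]


def map_phase(records, mapper):
    """Apply the mapper to every record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle_phase(pairs):
    """Group values by key, as a MapReduce framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def total_per_customer(orders):
    """Hand-written equivalent of a 'sum of price, grouped by customer' query."""
    mapper = lambda order: [(order[0], order[1])]   # emit (customer_key, price)
    reducer = lambda key, values: sum(values)       # add up prices per customer
    return reduce_phase(shuffle_phase(map_phase(orders, mapper)), reducer)


if __name__ == "__main__":
    print(total_per_customer(ORDERS))
```

Even for this one aggregation, the programmer has to write the mapper, the reducer, and the wiring between them by hand. A declarative query states only the grouping key and the aggregate, and leaves the choice of execution plan to the query processor.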

mrql source code

Original article: http://blog.csdn.net/wzhg0508/article/details/45969669
