博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
【03】把 Elasticsearch 当数据库使:简单指标
阅读量:6198 次
发布时间:2019-06-21

本文共 6200 字,大约阅读时间需要 20 分钟。

使用 可以用 SQL 进行 elasticsearch 的查询。今天需要做一些最简单的聚合查询

COUNT(*)

SQL

$ cat << EOF | ./es_query.py http://127.0.0.1:9200select count(*) from quote                                           EOF
{"count(*)": 20994400}

Elasticsearch

{  "aggs": {},   "size": 0}
{  "hits": {    "hits": [],     "total": 20994400,     "max_score": 0.0  },   "_shards": {    "successful": 1,     "failed": 0,     "total": 1  },   "took": 26,   "timed_out": false}

这个就不算聚合,只是看了一下最终满足过滤条件的 total hits count。

COUNT(ipo_year)

这个和 COUNT(*) 的区别是 COUNT(ipo_year) 要求字段必须有值才算一个。

$ cat << EOF | ./es_query.py http://127.0.0.1:9200select count(ipo_year) from symbolEOF
{"count(ipo_year)": 2898}

Elasticsearch

{  "aggs": {    "count(ipo_year)": {      "value_count": {        "field": "ipo_year"      }    }  },   "size": 0}
{  "hits": {    "hits": [],     "total": 6714,     "max_score": 0.0  },   "_shards": {    "successful": 1,     "failed": 0,     "total": 1  },   "took": 55,   "aggregations": {    "count(ipo_year)": {      "value": 2898    }  },   "timed_out": false}

Profile

[  {    "query": [      {        "query_type": "MatchAllDocsQuery",        "lucene": "*:*",        "time": "0.3204170000ms",        "breakdown": {          "score": 0,          "create_weight": 10688,          "next_doc": 278660,          "match": 0,          "build_scorer": 31069,          "advance": 0        }      }    ],    "rewrite_time": 2279,    "collector": [      {        "name": "MultiCollector",        "reason": "search_multi",        "time": "2.957183000ms",        "children": [          {            "name": "TotalHitCountCollector",            "reason": "search_count",            "time": "0.2319240000ms"          },          {            "name": "ValueCountAggregator: [count(ipo_year)]",            "reason": "aggregation",            "time": "1.999916000ms"          }        ]      }    ]  }]

这是我们的第一个聚合例子。可以从profile结果看出来,其实现方式在采集文档的时候加上了ValueCountAggregator统计了字段非空的文档数量。

COUNT(DISTINCT ipo_year)

SQL

$ cat << EOF | ./es_query.py http://127.0.0.1:9200select count(distinct ipo_year) from symbolEOF
{"count(distinct ipo_year)": 39}

Elasticsearch

{  "aggs": {    "count(distinct ipo_year)": {      "cardinality": {        "field": "ipo_year"      }    }  },   "size": 0}
{  "hits": {    "hits": [],     "total": 6714,     "max_score": 0.0  },   "_shards": {    "successful": 1,     "failed": 0,     "total": 1  },   "took": 24,   "aggregations": {    "count(distinct ipo_year)": {      "value": 39    }  },   "timed_out": false}

Profile

[  {    "query": [      {        "query_type": "MatchAllDocsQuery",        "lucene": "*:*",        "time": "0.2033600000ms",        "breakdown": {          "score": 0,          "create_weight": 7501,          "next_doc": 162905,          "match": 0,          "build_scorer": 32954,          "advance": 0        }      }    ],    "rewrite_time": 2300,    "collector": [      {        "name": "MultiCollector",        "reason": "search_multi",        "time": "2.438386000ms",        "children": [          {            "name": "TotalHitCountCollector",            "reason": "search_count",            "time": "0.2240230000ms"          },          {            "name": "CardinalityAggregator: [count(distinct ipo_year)]",            "reason": "aggregation",            "time": "1.471620000ms"          }        ]      }    ]  }]

这个例子里 ValueCountAggregator 变成了 CardinalityAggregator

SUM(market_cap)

MIN/MAX/AVG/SUM 这几个简单的聚合也是支持的

$ cat << EOF | ./es_query.py http://127.0.0.1:9200select sum(market_cap) from symbol         EOF
{"sum(market_cap)": 11454155180142.0}

Elasticsearch

{  "aggs": {    "sum(market_cap)": {      "sum": {        "field": "market_cap"      }    }  },   "size": 0}
{  "hits": {    "hits": [],     "total": 6714,     "max_score": 0.0  },   "_shards": {    "successful": 1,     "failed": 0,     "total": 1  },   "took": 15,   "aggregations": {    "sum(market_cap)": {      "value": 11454155180142.0    }  },   "timed_out": false}

Profile

[  {    "query": [      {        "query_type": "MatchAllDocsQuery",        "lucene": "*:*",        "time": "0.2026870000ms",        "breakdown": {          "score": 0,          "create_weight": 8097,          "next_doc": 163069,          "match": 0,          "build_scorer": 31521,          "advance": 0        }      }    ],    "rewrite_time": 2151,    "collector": [      {        "name": "MultiCollector",        "reason": "search_multi",        "time": "2.461247000ms",        "children": [          {            "name": "TotalHitCountCollector",            "reason": "search_count",            "time": "0.3302140000ms"          },          {            "name": "SumAggregator: [sum(market_cap)]",            "reason": "aggregation",            "time": "1.102363000ms"          }        ]      }    ]  }]

过滤 + 聚合

SQL

$ cat << EOF | ./es_query.py http://127.0.0.1:9200select sum(market_cap) from symbol where ipo_year=1998EOF
{"sum(market_cap)": 107049150786.0}

Elasticsearch

{  "query": {    "term": {      "ipo_year": 1998    }  },   "aggs": {    "sum(market_cap)": {      "sum": {        "field": "market_cap"      }    }  },   "size": 0}
{  "hits": {    "hits": [],     "total": 56,     "max_score": 0.0  },   "_shards": {    "successful": 1,     "failed": 0,     "total": 1  },   "took": 11,   "aggregations": {    "sum(market_cap)": {      "value": 107049150786.0    }  },   "timed_out": false}

Profile

[  {    "query": [      {        "query_type": "TermQuery",        "lucene": "ipo_year:`N",        "time": "0.4526400000ms",        "breakdown": {          "score": 0,          "create_weight": 220579,          "next_doc": 159412,          "match": 0,          "build_scorer": 72649,          "advance": 0        }      }    ],    "rewrite_time": 3750,    "collector": [      {        "name": "MultiCollector",        "reason": "search_multi",        "time": "0.2203470000ms",        "children": [          {            "name": "TotalHitCountCollector",            "reason": "search_count",            "time": "0.009478000000ms"          },          {            "name": "SumAggregator: [sum(market_cap)]",            "reason": "aggregation",            "time": "0.1557820000ms"          }        ]      }    ]  }]

query 过滤完,然后再计算 aggs

转载地址:http://bsnca.baihongyu.com/

你可能感兴趣的文章
(喷血分享)利用.NET生成数据库表的创建脚本,类似SqlServer编写表的CREATE语句...
查看>>
[译] 为用户提供安全可靠的体验
查看>>
卷积神经网络—基本部件(2)
查看>>
Android 系统开发_四大组件篇 -- 探讨 Activity 的生命周期
查看>>
各个大厂裁员情况,已经慌的一B
查看>>
php.类与对象
查看>>
想入门数据科学领域?明确方向更重要
查看>>
【速记】借助ES6的模版字符串,在不用Babel插件的情况下实现一个轻量级类JSX功能...
查看>>
PHP算法之四大基础算法
查看>>
JavaScript常用数组操作方法,包含ES6方法
查看>>
从0开始用python写一个命令行小游戏(六)
查看>>
「Do.003」 adb无线连接Android设备
查看>>
Canvas 核心技术
查看>>
Nginx 部署静态页面
查看>>
SpringMVC之源码分析--ViewResolver(五)
查看>>
[LintCode] Permutation in String
查看>>
关于在一段文本上加修饰划线的效果
查看>>
如何基于MySQL及Redis搭建统一的KV存储服务
查看>>
zookeeper面试题
查看>>
2017年laravel行业调查结果
查看>>