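I want to analyze table usage on Vertica to find the following:

  1. The tables hit most often by queries
  2. The tables receiving the most write queries
  3. The tables receiving the most read queries

So I am looking for help with SQL queries, or if anyone has any documentation, please point me in the right direction. Thank you.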

1 Answer

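Here, I created a function QTYPE() that classifies requests of type 'QUERY' as SELECT, INSERT or MODIFY (meaning DELETE, UPDATE, MERGE). The reason for the distinction is that, in Vertica, an UPDATE or MERGE is actually a DELETE followed by an INSERT.

I used two regular expressions of some complexity: the first looks for [schema.]tablename after a JOIN or FROM keyword, the second looks for [schema.]tablename after the UPDATE, INSERT INTO, MERGE INTO and DELETE FROM keywords. Then I join back to the tables system table in order to a) keep only tables that really exist, and b) add the schema name if it is missing.

The final report would be: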
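
To see what the first regular expression picks up, it can be tried on a literal statement. This is only a sketch: the sample statement and the aliases first_tb and second_tb are invented for illustration, while the pattern and the REGEXP_SUBSTR arguments are the ones used in the full query further down.

```sql
-- Illustrative only: the statement text and the column aliases are made up.
-- Argument 4 of REGEXP_SUBSTR is the occurrence to return;
-- 'i' makes the match case-insensitive.
SELECT
  LTRIM(REGEXP_SUBSTR(stmt,'(?<=(from|join))\s+(\w+\.)?\w+\b',1,1,'i')) AS first_tb
, LTRIM(REGEXP_SUBSTR(stmt,'(?<=(from|join))\s+(\w+\.)?\w+\b',1,2,'i')) AS second_tb
FROM (
  SELECT 'SELECT * FROM public.foo f JOIN bar b ON f.id=b.id' AS stmt
) s;
```

The cross join with the integer series in the full query does the same thing at scale: it asks for occurrence 1, 2, 3 and so on, and keeps only the occurrences that return something.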

 qtype  |                        tbname                        | tx_count 
--------+------------------------------------------------------+----------
 INSERT | dbadmin.nrm_cpustats_rate                            |       74  
 INSERT | dbadmin.v_poll_item                                  |       39  
 INSERT | dbadmin.child                                        |       32  
 INSERT | dbadmin.tbid                                         |       32  
 INSERT | dbadmin.etl_group_membership                         |       12  
 INSERT | dbadmin.sensor_oco                                   |       11  
 INSERT | webanalytics.webtraffic_part                         |       10  
 INSERT | webanalytics.webtraffic_new_design_platform_datadate |        9   
 MODIFY | cp.foo                                               |        2   
 MODIFY | public.foo                                           |        2   
 MODIFY | taboola_tests.foo                                    |        2   
 SELECT | dbadmin.flext                                        |      112 
 SELECT | dbadmin.children                                     |      112 
 SELECT | dbadmin.ffoo                                         |      112 
 SELECT | dbadmin.demovals                                     |      112 
 SELECT | dbadmin.allbut4                                      |      112 
 SELECT | dbadmin.allcols                                      |      112 
 SELECT | dbadmin.allbut1                                      |      112 
 SELECT | dbadmin.flx                                          |      112 

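Here are the function definition and the CREATE TABLE statement that collect the statistics you are looking for, and finally the query that gets the "hit parade" of the most-touched tables ...

Note that it can become a long-runner if there is a lot of history in your query_requests table ...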
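
Before running it, it may help to check how much history is involved. A rough sketch, assuming the start_timestamp column of query_requests and an arbitrary seven-day window; neither appears in the statements below, and the aliases are invented:

```sql
-- Assumption: query_requests exposes start_timestamp; the 7-day window is arbitrary.
SELECT
  COUNT(*) AS total_queries
, SUM(CASE WHEN start_timestamp >= CURRENT_TIMESTAMP - INTERVAL '7 days'
           THEN 1 ELSE 0 END) AS last_7_days
FROM query_requests
WHERE request_type='QUERY'
  AND success;
```

If the total is very large, adding the same start_timestamp predicate to both WHERE clauses of tblist should limit the work to that window.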

CREATE OR REPLACE FUNCTION qtype(sql VARCHAR(64000))
RETURN VARCHAR(8) AS BEGIN
  RETURN
    CASE UPPER(REGEXP_SUBSTR(sql,'\w+')::VARCHAR(16))
      WHEN 'SELECT' THEN 'SELECT'
      WHEN 'WITH'   THEN 'SELECT'
      WHEN 'AT'     THEN 'SELECT'
      WHEN 'INSERT' THEN 'INSERT'
      WHEN 'DELETE' THEN 'MODIFY'
      WHEN 'UPDATE' THEN 'MODIFY'
      WHEN 'MERGE'  THEN 'MODIFY'
      ELSE UPPER(REGEXP_SUBSTR(sql,'\w+')::VARCHAR(16))
    END
  ;
END;

DROP TABLE IF EXISTS table_op_stats;
CREATE TABLE table_op_stats AS 
WITH
-- need 1000 integers - up to ~400 source tables found in 1 select
i(i) AS (
  SELECT MICROSECOND(tm)
  FROM (
              SELECT TIMESTAMPADD(MICROSECOND,   1,'2000-01-01'::TIMESTAMP)
    UNION ALL SELECT TIMESTAMPADD(MICROSECOND,1000,'2000-01-01'::TIMESTAMP)
  ) l(ts)
  TIMESERIES tm AS '1 MICROSECOND' OVER(ORDER BY ts)
)
,
tblist AS (
-- selects can touch several tables, each found after a JOIN or FROM keyword,
-- hence the look-behind regular expression
SELECT 
    QTYPE(request) AS qtype
  , transaction_id
  , statement_id
  , i
  , LTRIM(REGEXP_SUBSTR(request,'(?<=(from|join))\s+(\w+\.)?\w+\b',1,i,'i')) as tbname
  FROM query_requests CROSS JOIN i
  WHERE request_type='QUERY'
    AND success
    AND LTRIM(REGEXP_SUBSTR(request,'(?<=(from|join))\s+(\w+\.)?\w+\b',1,i,'i')) <> ''
  UNION ALL
  -- insert/delete/update/merge queries only affect one table each
  SELECT
    QTYPE(request) AS qtype
  , transaction_id
  , statement_id
  , 1 AS i
  -- lazy .*? so the capture is the first name after the keyword, not the last word on the line
  , LTRIM(REGEXP_SUBSTR(request,'(insert\s+.*?into\s+|update\s+.*?|merge\s+.*?into|delete\s+.*?from)\s*((\w+\.)?\w+)\b',1,1,'i',2)) as tbname
  FROM query_requests
  WHERE request_type='QUERY'
    AND success
    AND QTYPE(request) <> 'SELECT'
)
,
-- join back to the "tables" system table - removes queries from correlation names, and adds schema name if needed
real_tables AS (
  SELECT
    qtype
  , transaction_id
  , statement_id
  , i
, CASE WHEN SPLIT_PART(tbname,'.',2)=''
    THEN table_schema||'.'||tbname
    ELSE tbname
  END AS tbname
  FROM tblist
  JOIN tables ON CASE WHEN SPLIT_PART(tbname,'.',2)=''
                   THEN tbname=table_name
                   ELSE SPLIT_PART(tbname,'.',1)=table_schema AND SPLIT_PART(tbname,'.',2)=table_name
                 END
)
SELECT
  qtype
, transaction_id
, statement_id
, i
, tbname
FROM real_tables;
-- Time: First fetch (0 rows): 42483.769 ms. All rows formatted: 42484.324 ms

-- the query at the end:
WITH grp AS (
  SELECT
    qtype
  , tbname
  , COUNT(*) AS tx_count
  FROM table_op_stats
  GROUP BY 1,2
)
SELECT
  *
FROM grp
LIMIT 8 OVER(
  PARTITION BY qtype
  ORDER BY tx_count DESC
);
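
For the first point of the question (the tables hit most often, regardless of query type), the same table_op_stats table can be aggregated without qtype. A minimal sketch; the limit of 10 is arbitrary:

```sql
-- Overall ranking across SELECT, INSERT and MODIFY; LIMIT 10 is an arbitrary choice.
SELECT
  tbname
, COUNT(*) AS tx_count
FROM table_op_stats
GROUP BY 1
ORDER BY 2 DESC
LIMIT 10;
```

Note that table_op_stats holds one row per table reference, so a table referenced twice in the same statement is counted twice.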

Answered 2021-08-23T00:17:11.240