sql - 选择查询中的依赖列

Question

我有一个表Employee：

+-------+---------+-------------+
| Empid |  Name   |   Mailid    |
+-------+---------+-------------+
|   001 | abc def | abc@xyz.com |
|   002 | hij klm | hij@xyz.com |
|   003 | nop qrs | nop@xyz.com |
+-------+---------+-------------+

我可以查询它：

select 
    Empid, 
    Left(Name,3),
    'hardcoded string',
    Mailid,
    Replace(Mailid,'@xyz.com','') 
from Employee;

如何证明上述查询直接或间接引用了员工中的特定列？例如：

col1 Empid
col2 Name
col3 N/A
col4 mailid
col5 mailid

请帮助任何指向实用程序的指针来实现这一点。

score 2 · Accepted Answer

要进行这种分析，您需要一个完整的 (T-)SQL 解析器，它可以访问数据库的当前结构（提示：列表达式可以变得任意复杂，包括子选择和一切）。

更新：我在下面的查询计划的想法不起作用。虽然该计划很好地跟踪列引用，但它没有以正确的顺序显示输出列。啊！我目前看不到如何治愈，属性中没有序数。所以，对不起，不起作用。我还是把它留给参考，也许其他人有基于此的想法。

想到的一件事是查看查询计划。您必须解析 XML 计划以查看某个列的值的来源。虽然这可能比直接查看 SQL 更简单，但它仍然不是微不足道的。查询计划也可能变得相当复杂。但是为了说明原理，让我们看一下 AdventureWorks DB 上的类似查询：

SELECT 
  rowguid, 
  Left(JobTitle,3),
  'hardcoded string',
  BirthDate,
  Replace(BirthDate,'@xyz.com','') 
FROM 
  HumanResources.Employee

此语句的 XML 查询计划如下所示：

<?xml version="1.0" encoding="utf-16"?>
<ShowPlanXML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Version="1.2" Build="11.0.3128.0" xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
  <BatchSequence>
    <Batch>
      <Statements>
        <StmtSimple StatementCompId="1" StatementEstRows="290" StatementId="1" StatementOptmLevel="TRIVIAL" StatementSubTreeCost="0.00807444" StatementText="Select rowguid, Left(JobTitle,3),'hardcoded string',BirthDate,Replace(BirthDate,'@xyz.com','') FROM HumanResources.Employee" StatementType="SELECT" QueryHash="0x5F035E11344539B" QueryPlanHash="0x717B3D06C26C61ED" RetrievedFromCache="false">
          <StatementSetOptions ANSI_NULLS="true" ANSI_PADDING="true" ANSI_WARNINGS="true" ARITHABORT="true" CONCAT_NULL_YIELDS_NULL="true" NUMERIC_ROUNDABORT="false" QUOTED_IDENTIFIER="true" />
          <QueryPlan CachedPlanSize="16" CompileTime="14" CompileCPU="12" CompileMemory="152">
            <MemoryGrantInfo SerialRequiredMemory="0" SerialDesiredMemory="0" />
            <OptimizerHardwareDependentProperties EstimatedAvailableMemoryGrant="418321" EstimatedPagesCached="104580" EstimatedAvailableDegreeOfParallelism="2" />
            <RelOp AvgRowSize="4045" EstimateCPU="2.9E-05" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="290" LogicalOp="Compute Scalar" NodeId="0" Parallel="false" PhysicalOp="Compute Scalar" EstimatedTotalSubtreeCost="0.00807444">
              <OutputList>
                <ColumnReference Database="[AdventureWorks2012]" Schema="[HumanResources]" Table="[Employee]" Column="BirthDate" />
                <ColumnReference Database="[AdventureWorks2012]" Schema="[HumanResources]" Table="[Employee]" Column="rowguid" />
                <ColumnReference Column="Expr1003" />
                <ColumnReference Column="Expr1004" />
                <ColumnReference Column="Expr1005" />
              </OutputList>
              <ComputeScalar>
                <DefinedValues>
                  <DefinedValue>
                    <ColumnReference Column="Expr1003" />
                    <ScalarOperator ScalarString="substring([AdventureWorks2012].[HumanResources].[Employee].[JobTitle],(1),(3))">
                      <Intrinsic FunctionName="substring">
                        <ScalarOperator>
                          <Identifier>
                            <ColumnReference Database="[AdventureWorks2012]" Schema="[HumanResources]" Table="[Employee]" Column="JobTitle" />
                          </Identifier>
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="(1)" />
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="(3)" />
                        </ScalarOperator>
                      </Intrinsic>
                    </ScalarOperator>
                  </DefinedValue>
                  <DefinedValue>
                    <ColumnReference Column="Expr1004" />
                    <ScalarOperator ScalarString="'hardcoded string'">
                      <Const ConstValue="'hardcoded string'" />
                    </ScalarOperator>
                  </DefinedValue>
                  <DefinedValue>
                    <ColumnReference Column="Expr1005" />
                    <ScalarOperator ScalarString="replace(CONVERT_IMPLICIT(varchar(40),[AdventureWorks2012].[HumanResources].[Employee].[BirthDate],121),'@xyz.com','')">
                      <Intrinsic FunctionName="replace">
                        <ScalarOperator>
                          <Convert DataType="varchar" Length="40" Style="121" Implicit="true">
                            <ScalarOperator>
                              <Identifier>
                                <ColumnReference Database="[AdventureWorks2012]" Schema="[HumanResources]" Table="[Employee]" Column="BirthDate" />
                              </Identifier>
                            </ScalarOperator>
                          </Convert>
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="'@xyz.com'" />
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="''" />
                        </ScalarOperator>
                      </Intrinsic>
                    </ScalarOperator>
                  </DefinedValue>
                </DefinedValues>
                <RelOp AvgRowSize="80" EstimateCPU="0.000476" EstimateIO="0.00756944" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="290" LogicalOp="Clustered Index Scan" NodeId="1" Parallel="false" PhysicalOp="Clustered Index Scan" EstimatedTotalSubtreeCost="0.00804544" TableCardinality="290">
                  <OutputList>
                    <ColumnReference Database="[AdventureWorks2012]" Schema="[HumanResources]" Table="[Employee]" Column="JobTitle" />
                    <ColumnReference Database="[AdventureWorks2012]" Schema="[HumanResources]" Table="[Employee]" Column="BirthDate" />
                    <ColumnReference Database="[AdventureWorks2012]" Schema="[HumanResources]" Table="[Employee]" Column="rowguid" />
                  </OutputList>
                  <IndexScan Ordered="false" ForcedIndex="false" ForceScan="false" NoExpandHint="false">
                    <DefinedValues>
                      <DefinedValue>
                        <ColumnReference Database="[AdventureWorks2012]" Schema="[HumanResources]" Table="[Employee]" Column="JobTitle" />
                      </DefinedValue>
                      <DefinedValue>
                        <ColumnReference Database="[AdventureWorks2012]" Schema="[HumanResources]" Table="[Employee]" Column="BirthDate" />
                      </DefinedValue>
                      <DefinedValue>
                        <ColumnReference Database="[AdventureWorks2012]" Schema="[HumanResources]" Table="[Employee]" Column="rowguid" />
                      </DefinedValue>
                    </DefinedValues>
                    <Object Database="[AdventureWorks2012]" Schema="[HumanResources]" Table="[Employee]" Index="[PK_Employee_BusinessEntityID]" IndexKind="Clustered" />
                  </IndexScan>
                </RelOp>
              </ComputeScalar>
            </RelOp>
          </QueryPlan>
        </StmtSimple>
      </Statements>
    </Batch>
  </BatchSequence>
</ShowPlanXML>

在这里，SQL Server 已经完成了解析 SQL 并确定列值最终来自何处的所有艰苦工作（然后优化计划，但这在这里没有意义）。

第一个<OutputList>对应于 SELECT 中的列列表。它有 5 列引用。从它们的属性可以看出，其中两个直接对应于表列，而其他三个只是引用了一些任意命名的表达式。现在您可以搜索这些表达式（如果需要，递归），它们又具有其他列引用（或者没有，如果“Expr1004”仅引用一个常量）。

因此，由于您只需要查看计划的几个元素，因此您很有可能找到源列（请注意，给定的输出列可能有多个源列）。

要获取 XML 查询计划，只需执行

SET SHOWPLAN_XML ON
<your statement>
SET SHOWPLAN_XML OFF

输出是计划（不执行语句 - 它只是估计的计划，但在这里并不重要）。

希望有帮助。

score 0 · Accepted Answer

像这样的东西？

Select Empid, Left(Name,3) as Name, 'hardcoded string' as [N/A], Mailid as mailid, Replace(Mailid,'@xyz.com','') as mailid2
from Employee

sql - 选择查询中的依赖列

2 回答 2

Related

Reference