I am importing some IIS logs into Power Pivot for analysis, using the following command:
LogParser.exe "
SELECT
EXTRACT_TOKEN(LogFileName, 5, '\\') As LogFile,
LogRow,
to_localtime(to_timestamp(date,time)) as LOG_DTTM,
cs-UserName as ClientUserName,
cs-Method,
cs-Uri-Stem as UriStem,
cs-Uri-Query as UriQuery,
sc-Status as Status,
sc-SubStatus as SubStatus,
time-Taken as ElapsedTimeMS,
c-Ip As ClientIP,
s-ComputerName as ComputerName,
s-Ip as ServerIP,
s-Port as Port,
sc-Win32-Status as Win32Status,
cs(User-Agent) as UserAgent
INTO IIS_LOG_PROD_STAGING
FROM somefile.log" -o:SQL -oConnString:"Driver=SQL Server;Server=MY_SERVER_NAME; Database=MY_DATABASE_NAME;Trusted_Connection=yes" -createTable:ON -e:10 -transactionRowCount:-1
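...My question is: should I split the discrete parts of the DateTime column into separate columns at the database storage level, or should I leave that to calculated columns in the PowerPivot model?
Marco Russo seems to suggest at least splitting the DATE out into its own column: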
http://sqlblog.com/blogs/marco_russo/archive/2011/09/01/separate-date-and-time-in-powerpivot-and-bis-tabular.aspx
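"PowerPivot still reads the column as a DateTime, but the hour/minute/seconds are gone and the number of unique values is reduced to the number of distinct days in your data. Of course, it is easier to join a Calendar table, too!"
That seems to make sense. But if I already know I want to analyze at the HourOfDay, DayOfWeek, DayOfMonth, etc. level, should I split those out into separate database columns as well?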
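To make the cardinality point in the quote concrete, here is a rough Python sketch. It is not part of my actual pipeline: the sample timestamps (one row every 5 seconds over 3 days) are made up, and every column name except LOG_DTTM is hypothetical. It only counts how many distinct values each candidate column would hold.

```python
from datetime import datetime, timedelta

# Hypothetical sample data: one log row every 5 seconds across 3 full days.
start = datetime(2011, 9, 1, 0, 0, 0)
rows = [start + timedelta(seconds=5 * i) for i in range(3 * 86400 // 5)]


def distinct(values):
    """Count the distinct values a column would contain."""
    return len(set(values))


# LOG_DTTM is the combined column from the query above; the other
# names are illustrative candidates for split-out columns.
cardinality = {
    "LOG_DTTM": distinct(rows),
    "LogDate": distinct(ts.date() for ts in rows),
    "LogTime": distinct(ts.time() for ts in rows),
    "HourOfDay": distinct(ts.hour for ts in rows),
    "DayOfWeek": distinct(ts.isoweekday() for ts in rows),
    "DayOfMonth": distinct(ts.day for ts in rows),
}

for column, count in cardinality.items():
    print(f"{column}: {count} distinct values")
```

With this sample, the combined column has one distinct value per row, while the date column collapses to one value per day and HourOfDay can never exceed 24 values. That appears to be the effect the quote describes, but it does not by itself answer whether the split belongs in the database or in calculated columns.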