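When we run searches on our SharePoint instance, we see a "View duplicates" link for some files in the search results.
Is there any way to report all of these duplicates?
I have seen this SQL for finding duplicates based on their md5 hash: http://social.technet.microsoft.com/forums/en-US/sharepointsearch/thread/8a8b25d9-a3ac-45df-86de-2a3a7838a534 and I have corrected the SQL for SharePoint 2010 compatibility here: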

    -- Step 1: get all files with short names, md5 signatures, and size
    SELECT  md5 ,
            RIGHT(DisplayURL, CHARINDEX('/', REVERSE(DisplayURL)) - 1) AS ShortFileName ,
            DisplayURL AS Url ,
            llVal / 1024 AS FileSizeKb
    INTO    #listingFilesMd5Size
    FROM    SearchServiceApplication_CrawlStore.dbo.MSSCrawlURL y
            INNER JOIN SearchServiceApplication_PropertyStore.dbo.MSSDocProps dp ON ( y.DocID = dp.DocID )
    WHERE   dp.pid = 58 -- File size
            AND llVal > 1024 * 10 -- 10 KB minimum in size
            AND md5 <> 0
            AND CHARINDEX('/', REVERSE(DisplayURL)) > 1

    -- Step 2: filter duplicated items
    SELECT  COUNT(*) AS NbDuplicates ,
            md5 ,
            ShortFileName ,
            FileSizeKb
    INTO    #duplicates
    FROM    #listingFilesMd5Size
    GROUP BY md5 ,
            ShortFileName ,
            FileSizeKb
    HAVING  COUNT(*) > 1
    ORDER BY COUNT(*) DESC

    DROP TABLE #listingFilesMd5Size

    -- Step 3: show the report with search URLs
    SELECT  * ,
            NbDuplicates * FileSizeKb AS TotalSpaceKb ,
            'http://srv-moss/SearchCenter/Pages/results.aspx?k=' + ShortFileName AS SearchUrl
    FROM    #duplicates
    --ORDER BY NbDuplicates * FileSizeKb DESC

    DROP TABLE #duplicates

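But this only matches exact duplicates, whereas I am interested in what SharePoint itself considers duplicates, as indicated by the "View duplicates" link in the search results.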
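To make the limitation concrete, here is a minimal sketch that does not touch the SharePoint databases at all. It only shows that an exact hash changes when the content differs by a single character. The sample strings are made up, and I am assuming (without having confirmed it) that the md5 column behaves like an exact content hash of this kind.

```sql
-- Hypothetical sample strings, not SharePoint data.
-- The second string differs from the first only by a trailing period.
SELECT  HASHBYTES('MD5', 'Quarterly report, final version')  AS HashA ,
        HASHBYTES('MD5', 'Quarterly report, final version.') AS HashB
-- HashA and HashB come out different, so a GROUP BY on the hash
-- would put these two near-identical texts into separate groups.
```

This is why grouping by md5 cannot surface near-duplicates of the kind the search results page seems to detect.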
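I have seen that there is a managed property "DuplicateHash", but it isn't documented anywhere, and I can't find a way to access it through the object model.
Thanks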