
I wrote a PowerShell v3 script to list all the files owned by a user on our file server. It ran for 3 hours before I stopped it. It has to go through 619,238 files and 57,452 folders (517 GB). What order of magnitude of run time should I expect? Is there a way to improve the speed?

I tried to do this using pipes, but failed to get anything to work.

It's running on a VMware virtual machine running Windows Server 2008 R2 SP1 with 4 GB of memory. The script uses about 60% of the CPU while it runs.

Here is the code I wrote. I am very new to PowerShell, but I have a lot of experience with Perl. My co-workers said to write a bat file.

<#
.SYNOPSIS
   C:\ams\psscripts\list-files.ps1
.DESCRIPTION
   List all the files that a given user owns
.PARAMETER none
   username: user 
   logfile: path to log file. This is optional. If omitted, the log file is created as "u:\scratch\<$username>-files.txt"
.EXAMPLE
    C:\ams\psscripts\list-files.ps1 plo
    Example: C:\ams\psscripts\list-files.ps1 plo u:\scratch\log.txt

#>

param (
    [string]$username,
    [string]$logfile
    )


# Load modules
Set-ExecutionPolicy Unrestricted
Import-Module ActiveDirectory
Add-PSSnapin Quest.ActiveRoles.ADManagement

function printHelp {
    Write-Host "This script will find all the files owned by a user. It scans \\dfs\groups"
    Write-Host "C:\ams\psscripts\list-files.ps1 user logfile (optional)"
    Write-Host "Example: C:\ams\psscripts\list-files.ps1 plo"
    Write-Host "Example: C:\ams\psscripts\list-files.ps1 plo u:\scratch\log.txt"
}

if ($logfile -eq "") {
    $logfile = "u:\scratch\" + $username + "-files.txt"
    Write-Host "Setting log file to $logfile"
}

# you must use a UNC path
[String]$path = "\\dfs\u$\groups"
[String]$AD_username = "AMS\" + $username

# check that we have a valid AD user
if (!(Get-QADUser $AD_username)){
    Write-Host "ERROR: Not a valid AD User: $AD_username"
    Exit 0
}

Write-Output "Listing all files owned by $username from $path" | Out-File -FilePath $logfile 
Write-Host "Listing all files owned by $username from $path"
$d = Get-Date
Write-Output $d | Out-File -FilePath $logfile -Append

$files = Get-ChildItem $path -Recurse
Foreach ($file in $files)
{
    $f = Get-Acl $file.FullName

    $d = [string]::Compare($file.FullName, $username, $True)
    if (($f.Owner -eq $username) -or ($f.Owner -eq $AD_username))
    {
        Write-Host "$($file.FullName)"
        Write-Output $file.FullName | Out-File -FilePath $logfile -Append
    }
}

Write-Host "Completed"
exit 0

My next step is to modify the script above to find the files owned by a given user and change the owner to that user's manager.

This is the script I found to change the owner. It will be in a loop which walks the filesystem. Is this a good way to do it?

$username="dnp"
$domain="ams"
$ID = new-object System.Security.Principal.NTAccount($domain, $username)

# file to change owner. must be UNC path
$path = "\\dfs\c$\ams\psscripts\test.txt"
write-host $path
$acl = get-acl $path
$acl.SetOwner($ID)
set-acl -path $path -aclObject $acl

Thanks, Dan


2 Answers


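I don't know what order of magnitude this should take, but to improve the speed you can start by removing the redundant comparison in the foreach loop: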

$d = [string]::Compare($file.FullName, $username, $True)

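String comparisons are expensive, and you never use $d. You also compare the owner against both $username and "AMS\" + $username, which again is expensive, and I don't see why you need to compare against both. I have modified your script to add timing. I suggest trying it on a subset of the files to get some empirical data for calculating how long the full set will take. Keep in mind that the total size of the files doesn't matter here, because you are not processing their contents, only their attributes.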

<#
.SYNOPSIS
   C:\ams\psscripts\list-files.ps1
.DESCRIPTION
   List all the files that a given user owns
.PARAMETER none
   username: user 
   logfile: path to log file. This is optional. If omitted, the log file is created as "u:\scratch\<$username>-files.txt"
.EXAMPLE
    C:\ams\psscripts\list-files.ps1 plo
    Example: C:\ams\psscripts\list-files.ps1 plo u:\scratch\log.txt

#>

param (
    [string]$username,
    [string]$logfile
    )


# Load modules
Set-ExecutionPolicy Unrestricted
#Import-Module ActiveDirectory
#Add-PSSnapin Quest.ActiveRoles.ADManagement

function printHelp {
    Write-Host "This script will find all the files owned by a user. It scans \\dfs\groups"
    Write-Host "C:\ams\psscripts\list-files.ps1 user logfile (optional)"
    Write-Host "Example: C:\ams\psscripts\list-files.ps1 plo"
    Write-Host "Example: C:\ams\psscripts\list-files.ps1 plo u:\scratch\log.txt"
}

#StopWatch
$stopWatch = New-Object System.Diagnostics.Stopwatch
$stopWatch.Start()

if ($logfile -eq "") {
    $logfile = "e:\scratch\" + $username + "-files.txt"
    Write-Host "Setting log file to $logfile"
}

# you must use a UNC path
[String]$path = "\\test-server\testfolder\subfolder"
[String]$AD_username = "AMS\" + $username

# check that we have a valid AD user
if (!(Get-QADUser $AD_username)){
    Write-Host "ERROR: Not a valid AD User: $AD_username"
    Exit 0
}

Write-Output "Listing all files owned by $username from $path" | Out-File -FilePath $logfile 
Write-Host "Listing all files owned by $username from $path"
$d = Get-Date
Write-Output $d | Out-File -FilePath $logfile -Append

$stopWatch.Stop()
Write-Output ("Setup time: {0}." -f $stopWatch.Elapsed) | Out-File -FilePath $logfile -Append
$stopWatch.Reset()

$stopWatch.Start()
$files = Get-ChildItem $path -Recurse
$stopWatch.Stop()
Write-Output ("Got {0} files to process, took {1}" -f $files.Count, $stopWatch.Elapsed) | Out-File -FilePath $logfile -Append
$stopWatch.Reset()

$stopWatch.Start()
Foreach ($file in $files)
{
    $f = Get-Acl $file.FullName

    #$d = [string]::Compare($file.FullName, $username, $True)
    #if (($f.Owner -eq $username) -or ($f.Owner -eq $AD_username))
    if ($f.Owner -eq $AD_username)
    {
        Write-Host ("{0}" -f $file.FullName)
        Write-Output $file.FullName | Out-File -FilePath $logfile -Append
    }
}
$stopWatch.Stop()
Write-Output ("Processed {0} files, took {1}" -f $files.Count, $stopWatch.Elapsed) | Out-File -FilePath $logfile -Append

Write-Host "Completed"
exit 0

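Running this on our infrastructure produced the following results:
Got 37803 files to process, took 00:00:57.5834897
Processed 37803 files, took 00:10:42.2988004

Your original code took 15 minutes to process the same number of files:
Processed 37803 files, took 00:15:04.1024350

Adding @GeorgeR.Jenkins' in-memory string construction did not significantly reduce the processing time:
Processed 37803 files, took 00:10:26.7815446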

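Interestingly, piping Get-ChildItem into a where clause did not improve performance. Using

$files = Get-ChildItem $path -Recurse | where {(Get-Acl $_.FullName).Owner -eq $AD_username}

returns only the files with the correct owner, so no later processing is needed, but it took just as long:
Got 46 files to process, took 00:13:51.4940596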
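
For reference, the same filter can be streamed straight into the log, so the log is opened only once and the full file list is never held in $files. This is a minimal sketch, not a measured improvement: the timings above suggest the Get-Acl call dominates either way. The function name Export-OwnedFileList and its parameters are made up for illustration, and -LiteralPath assumes PowerShell v3 or later.

```powershell
# Hypothetical helper (not part of the original script): stream matching
# paths directly into the log instead of building $files first.
function Export-OwnedFileList {
    param(
        [string]$Path,     # root folder to scan (UNC path in the question)
        [string]$Owner,    # owner string to match, e.g. 'AMS\plo'
        [string]$LogFile   # output file, opened once by Out-File
    )
    Get-ChildItem -LiteralPath $Path -Recurse |
        Where-Object { (Get-Acl -LiteralPath $_.FullName).Owner -eq $Owner } |
        ForEach-Object { $_.FullName } |
        Out-File -FilePath $LogFile
}
```

With the script's variables it would be called as: Export-OwnedFileList -Path $path -Owner $AD_username -LogFile $logfile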

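In summary, if you are running on infrastructure similar to mine, I would expect your 619,238 files to take about 175 minutes at the fastest rate I have seen (59 files per second). The rate I got with your original code was 42 files per second, which would take about 246 minutes. Again, I suggest running a small subset of files on your system to work out how long the whole set will take.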
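
The arithmetic behind these estimates can be sketched as a small helper that extrapolates from a timed sample. The function name Get-EstimatedMinutes is hypothetical, and the sketch assumes the per-file rate stays constant across the whole tree, which may not hold on a different server or share.

```powershell
# Hypothetical helper: extrapolate total run time from a timed sample run.
function Get-EstimatedMinutes {
    param(
        [long]$SampleCount,      # items processed in the sample run
        [timespan]$SampleTime,   # elapsed time of the sample run
        [long]$TotalCount        # items in the full tree
    )
    $rate = $SampleCount / $SampleTime.TotalSeconds   # items per second
    [math]::Round($TotalCount / $rate / 60, 1)        # estimated minutes
}

# Figures taken from the timings above (37803 items in 00:10:42.2988004).
Get-EstimatedMinutes -SampleCount 37803 -SampleTime ([timespan]'00:10:42.2988004') -TotalCount 619238
```

With those figures the result comes out at roughly 175 minutes, in line with the estimate above.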

answered 2013-08-13T13:51:41.160

Try using this to record the files the user owns, since it does not write to disk until the whole process is finished:

if ($f.Owner -eq $username -or $f.Owner -eq $AD_username) { $fileowned += $file.FullName + "`r`n" }

Once you exit the loop, write the resulting $logfile to disk:

$fileowned|out-file $logfile
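
A variant of the same idea, sketched with a generic list instead of string concatenation so that each path stays a separate entry. This is only an illustration under assumptions: the function name Get-OwnedFilePath and its parameters are hypothetical, and -LiteralPath assumes PowerShell v3 or later.

```powershell
# Hypothetical helper: collect matching paths in memory and emit them
# only after the whole tree has been walked.
function Get-OwnedFilePath {
    param(
        [string]$Path,      # root folder to scan
        [string[]]$Owner    # accepted owner strings, e.g. 'plo', 'AMS\plo'
    )
    $owned = New-Object 'System.Collections.Generic.List[string]'
    foreach ($file in Get-ChildItem -LiteralPath $Path -Recurse) {
        $f = Get-Acl -LiteralPath $file.FullName
        # -contains is case-insensitive, like the -eq tests in the script
        if ($Owner -contains $f.Owner) {
            $owned.Add($file.FullName)
        }
    }
    $owned   # nothing is output until the loop has finished
}

# Usage with the script's variables: a single write to disk at the end.
# Get-OwnedFilePath -Path $path -Owner $username, $AD_username | Out-File $logfile
```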

By the way, you use $d twice: once to hold the date, and again for the string comparison.

answered 2013-08-12T17:09:19.193