Welcome to WuJiGu Developer Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


azcopy - PowerShell - carve out a part of the select-string output

I'm copying a large amount of data using AzCopy and I need to have a way to retrieve the files that failed to copy.

AzCopy creates a nice log for each job, and I can run something like this:

Select-String -Path C:\Users\XXX\.azcopy\Projects\304c22cc-d37d-d743-7597-a160ac0ebad2.log -Pattern 'UPLOADFAILED'

But the output looks like this:

.azcopy\Projects\304c22cc-d37d-d743-7597-a160ac0ebad2.log:25528:2021/01/04 16:45:19 ERR: [P#0-T#2357] UPLOADFAILED: %5C%5C\UNC\fileserver.contoso.network\PROJ$\AAAA\BBBB\CCCC\Eigenerkl+�rungen.pdf_DOC001719.pdf : 000 : Could not check destination file existence. -> github.com/Azure/azure-storage-file-go/azfile.newStorageError, /home/vsts/go/pkg/mod/github.com/!azure/[email protected]/azfile/zc_storage_error.go:42

I need to carve out only the file path and name from this output. In the example above, I need to carve out:

fileserver.contoso.network\PROJ$\AAAA\BBBB\CCCC\Eigenerkl+�rungen.pdf_DOC001719.pdf

Does anyone have any idea how? I can't search based on the file name because I have over 2000 files that have failed and I need to carve out all of them.

Kind regards, Wojciech



1 Answer

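$file = 'C:\Users\XXX\.azcopy\Projects\304c22cc-d37d-d743-7597-a160ac0ebad2.log'

# Define the pattern as a regex that captures only the part of interest.
# The lookbehind (?<=...) requires the fixed prefix without including it in the match;
# each literal \ is written as \\ so the regex engine treats it verbatim.
$pattern = '(?<=UPLOADFAILED: %5C%5C\\UNC\\)[^_]+'

# .Matches.Value returns only the matched text, one string per matching line.
(Select-String -Pattern $pattern -LiteralPath $file).Matches.Value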

The assumptions are:

  • %5C%5C\UNC\ is a fixed string preceding the path of interest (note how each \ is escaped as \\ in the regex in order to treat it verbatim).

  • A _ character marks the end of the path.

Also note that Select-String matches case-insensitively by default; use -CaseSensitive as needed.
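
The pattern and its assumptions can be tried against a single line before running it on the whole log. The sketch below is only an illustration: the sample line is shortened, its prefix and path separators are assumed to look as described in the assumptions above, and the second pattern (ending at the " : " delimiter seen in the sample output) is an assumption about the log format, not something AzCopy documents here.

```powershell
# Shortened sample line; the %5C%5C\UNC\ prefix and the \ separators are assumed.
$line = '2021/01/04 16:45:19 ERR: [P#0-T#2357] UPLOADFAILED: %5C%5C\UNC\fileserver.contoso.network\PROJ$\AAAA\BBBB\CCCC\Eigenerkl+?rungen.pdf_DOC001719.pdf : 000 : Could not check destination file existence.'

# Pattern from the answer: everything after the fixed prefix, up to the first underscore.
$pattern = '(?<=UPLOADFAILED: %5C%5C\\UNC\\)[^_]+'

# Select-String also accepts strings from the pipeline, so no file is needed for a quick check.
$result = ($line | Select-String -Pattern $pattern).Matches.Value
$result   # fileserver.contoso.network\PROJ$\AAAA\BBBB\CCCC\Eigenerkl+?rungen.pdf

# Hypothetical variant: stop at the ' : ' delimiter instead, so underscores in the name are kept.
$patternFull = '(?<=UPLOADFAILED: %5C%5C\\UNC\\).+?(?= : )'
$full = ($line | Select-String -Pattern $patternFull).Matches.Value
$full     # fileserver.contoso.network\PROJ$\AAAA\BBBB\CCCC\Eigenerkl+?rungen.pdf_DOC001719.pdf
```

If the underscore assumption does not hold for every file name, the second variant may be the safer choice, provided " : " never occurs inside a path.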

Finally, the presence of � (REPLACEMENT CHARACTER, U+FFFD) in your sample data suggests that the file's character encoding is being misinterpreted, which you may be able to fix via the -Encoding parameter. Then again, these characters may point to a prior problem that caused these paths to be listed as failed to begin with.
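
As a rough sketch of the -Encoding suggestion: the snippet below writes a stand-in log line to a temporary file as BOM-less UTF-8 and reads it back with -Encoding utf8. The temporary file, the shortened line, and the choice of UTF-8 are all assumptions for illustration; the actual encoding of an AzCopy log would have to be checked first.

```powershell
# Build a name containing U+00E4 without depending on the script file's own encoding.
$name = 'Eigenerkl' + [char]0xE4 + 'rungen.pdf'
$line = 'UPLOADFAILED: %5C%5C\UNC\fileserver.contoso.network\PROJ$\' + $name + '_DOC001719.pdf : 000 : ...'

# Write the line as UTF-8 without a BOM to a temporary file (hypothetical stand-in for the log).
$tmp = [System.IO.Path]::GetTempFileName()
[System.IO.File]::WriteAllText($tmp, $line, (New-Object System.Text.UTF8Encoding $false))

# Tell Select-String explicitly how to decode the file instead of relying on its default.
$pattern = '(?<=UPLOADFAILED: %5C%5C\\UNC\\)[^_]+'
$decoded = (Select-String -LiteralPath $tmp -Pattern $pattern -Encoding utf8).Matches.Value
$decoded

Remove-Item -LiteralPath $tmp
```

If the extracted paths still contain � with the encoding that matches the file, the characters were most likely already damaged when the log was written.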

