CMD or PowerShell to concatenate matching lines from two files
Is it possible to use CMD or PowerShell to combine 2 files into 1 file like this:
file1-line1 tab file2-line1
file1-line2 tab file2-line2
file1-line3 tab file2-line3
So it takes file 1 line 1, inserts a tab, and then inserts file 2 line 1. Then it does the same for all subsequent lines in each file?
Probably the simplest solution is to use a Windows port of the Linux utility paste (for example paste.exe from UnxUtils):
paste C:\path\to\file1.txt C:\path\to\file2.txt
From the man page:
DESCRIPTION
Write lines consisting of the sequentially corresponding lines from each FILE, separated by TABs, to standard output.
For a PowerShell(-ish) solution, I would use two stream readers:
$sr1 = New-Object IO.StreamReader 'C:\path\to\file1.txt'
$sr2 = New-Object IO.StreamReader 'C:\path\to\file2.txt'
while ($sr1.Peek() -ge 0 -or $sr2.Peek() -ge 0) {
  if ($sr1.Peek() -ge 0) { $txt1 = $sr1.ReadLine() } else { $txt1 = '' }
  if ($sr2.Peek() -ge 0) { $txt2 = $sr2.ReadLine() } else { $txt2 = '' }
  "{0}`t{1}" -f $txt1, $txt2
}
This avoids the need to read the two files completely into memory before merging them, which risks running out of memory for large files.
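For contrast, here is a minimal sketch of the in-memory approach that the stream readers avoid. It is only reasonable for small files. The temporary directory, file names and sample contents are made up for illustration and are not part of the original answer.

```powershell
# Hypothetical sample files in a temp directory, for illustration only.
$dir = Join-Path ([IO.Path]::GetTempPath()) ([IO.Path]::GetRandomFileName())
New-Item -ItemType Directory -Path $dir | Out-Null
$file1 = Join-Path $dir 'file1.txt'
$file2 = Join-Path $dir 'file2.txt'
Set-Content -Path $file1 -Value 'a1', 'a2', 'a3'
Set-Content -Path $file2 -Value 'b1', 'b2'

# Read both files completely; @(...) guarantees an array even for one line.
$lines1 = @(Get-Content -Path $file1)
$lines2 = @(Get-Content -Path $file2)
$count  = [Math]::Max($lines1.Count, $lines2.Count)

$merged = for ($i = 0; $i -lt $count; $i++) {
    # Check the index explicitly so the shorter file contributes ''.
    $left  = if ($i -lt $lines1.Count) { $lines1[$i] } else { '' }
    $right = if ($i -lt $lines2.Count) { $lines2[$i] } else { '' }
    "{0}`t{1}" -f $left, $right
}
$merged

# Clean up the sample files.
Remove-Item -Recurse -Force $dir
```

Both files are held in memory at once here, which is exactly the cost the StreamReader loop sidesteps.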
@echo off
setlocal EnableDelayedExpansion
rem The next line has a tab character after the equal sign:
set "TAB=	"
rem The first file is read with the FOR /F command
rem The second file is read via Stdin
< file2.txt (for /F "delims=" %%a in (file1.txt) do (
   rem Clear the variable first: SET /P leaves it unchanged when no more input is available
   set "line2="
   rem Read the next line from file2.txt
   set /P "line2="
   rem Echo the lines of both files separated by a tab
   echo %%a%TAB%!line2!
))
A generalized solution that supports multiple files, building on the memory-efficient System.IO.StreamReader solution by Ansgar Wiechers:
PowerShell's ability to invoke members (properties, methods) directly on a collection and have them automatically invoked on all items in the collection (member enumeration, v3+) makes it easy to generalize:
# Make sure .NET has the same current dir. as PS.
[System.IO.Directory]::SetCurrentDirectory($PWD)
# The input file paths.
$files = 'file1', 'file2', 'file3'
# Create stream-reader objects for all input files.
$readers = [IO.StreamReader[]] $files
# Keep reading while at least 1 file still has more lines.
while ($readers.EndOfStream -contains $false) {
  # Read the next line from each stream (file).
  # Streams that are already at EOF fortunately just return "".
  $lines = $readers.ReadLine()
  # Output the lines separated with tabs.
  $lines -join "`t"
}
# Close the stream readers.
$readers.Close()
Get-MergedLines (source code below; invoke it with -? for help) wraps the functionality in a function that:
- accepts a variable number of filenames, both as an argument and via the pipeline
- uses a configurable separator to join the lines (a tab by default)
- allows trimming trailing separator instances
function Get-MergedLines() {
<#
.SYNOPSIS
Merges lines from 2 or more files with a specifiable separator (default is tab).
.EXAMPLE
> Get-MergedLines file1, file2 '<->'
.EXAMPLE
> Get-ChildItem file? | Get-MergedLines
#>
  param(
    [Parameter(Mandatory, ValueFromPipeline, ValueFromPipelineByPropertyName)]
    [Alias('PSPath')]
    [string[]] $Path,
    [string] $Separator = "`t",
    [switch] $TrimTrailingSeparators
  )
  begin { $allPaths = @() }
  process { $allPaths += $Path }
  end {
    # Resolve all paths to full paths, which may include wildcard resolution.
    # Note: By using full paths, we needn't worry about .NET's current dir.
    #       potentially being different.
    $fullPaths = (Resolve-Path $allPaths).ProviderPath
    # Create stream-reader objects for all input files.
    $readers = [System.IO.StreamReader[]] $fullPaths
    # Keep reading while at least 1 file still has more lines.
    while ($readers.EndOfStream -contains $false) {
      # Read the next line from each stream (file).
      # Streams that are already at EOF fortunately just return "".
      $lines = $readers.ReadLine()
      # Join the lines.
      $mergedLine = $lines -join $Separator
      # Trim (remove) trailing separators, if requested.
      if ($TrimTrailingSeparators) {
        $mergedLine = $mergedLine -replace ('^(.*?)(?:' + [regex]::Escape($Separator) + ')+$'), '$1'
      }
      # Output the merged line.
      $mergedLine
    }
    # Close the stream readers.
    $readers.Close()
  }
}
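To see what the trailing-separator trimming in the function does, here is a small standalone sketch that applies the same regex to a sample merged line. The sample values are invented for illustration; the two $null entries stand in for files that have already run out of lines.

```powershell
$Separator = "`t"

# Two files still have lines, two are exhausted: the join leaves
# two trailing tabs after 'b'.
$mergedLine = 'a', 'b', $null, $null -join $Separator

# Same pattern as in the function: lazily capture everything up to
# the final run of one or more separators, then drop that run.
$pattern = '^(.*?)(?:' + [regex]::Escape($Separator) + ')+$'
$trimmed = $mergedLine -replace $pattern, '$1'

# A line without trailing separators does not match and stays as it is.
$untouched = "a`tb" -replace $pattern, '$1'

$trimmed
$untouched
```

Separators between non-empty fields are preserved because the pattern is anchored at the end of the line.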
Several recently closed [duplicate] questions link to this question, for example:
- Combine two CSV files into one with columns [duplicate]
- Merge 2 CSV files in PowerShell [duplicate]
I disagree that they are duplicates, because this question concerns plain text files whereas the others concern csv files. In general, I would advise against manipulating files that represent objects (e.g. xml, json and csv) as text. Instead, I recommend importing such files (into objects), making the appropriate changes, and then using the ConvertTo-*/Export-* cmdlets to write the results back to a file.
One example where all the general solutions given for this question produce an incorrect result for these "duplicates" is where both csv files have a common column (property) name.
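A minimal sketch of that common-column problem, using made-up csv content in which both inputs have an ID column. The line-by-line merge here is a simplified stand-in for the text-based solutions, not code taken from any of the answers.

```powershell
# Hypothetical csv content; both "files" share the ID column.
$csv1 = 'ID,Name', '1,Peter', '2,Dalas'
$csv2 = 'ID,Class', '1,Math', '2,Physic'

# Text-based merge: join the corresponding lines with a comma.
$merged = for ($i = 0; $i -lt $csv1.Count; $i++) {
    $csv1[$i], $csv2[$i] -join ','
}

# The merged header now contains the shared column name twice.
$header     = $merged[0] -split ','
$duplicates = $header | Group-Object | Where-Object Count -gt 1 | ForEach-Object Name

$merged[0]
$duplicates
```

A header with repeated column names no longer maps cleanly onto object properties, which is why an object-based join is the better fit for csv input.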
The general Join-Object cmdlet (see also: In Powershell, what's the best way to combine two tables into one?) joins two object lists side by side when the -on parameter is simply omitted. This solution is therefore a better fit for the other (csv) "duplicate" questions. Let's take Merge 2 CSV files in powershell [duplicate] from @Ender as an example:
$A = ConvertFrom-Csv @'
ID,Name
1,Peter
2,Dalas
'@
$B = ConvertFrom-Csv @'
Class
Math
Physic
'@
$A | Join $B
ID Name Class
-- ---- -----
1 Peter Math
2 Dalas Physic
Compared to the "text-based" merge solutions given for this question, the general Join-Object cmdlet can handle files of different lengths and lets you decide what to include (LeftJoin, RightJoin or FullJoin). You also control which columns to include ($A | Join $B -Property ID, Name), their order ($A | Join $B -Property ID, Class, Name), and much more that cannot be done by simply concatenating text.
Specific to this question:
Since this particular question is about text files and not csv files, you need to supply a header (property) name (for example -Header File1) when importing each file, and remove the header line (Select-Object -Skip 1) when exporting the result:
$File1 = Import-Csv .\File1.txt -Header File1
$File2 = Import-Csv .\File2.txt -Header File2
$File3 = $File1 | Join $File2
$File3 | ConvertTo-Csv -Delimiter "`t" -NoTypeInformation |
  Select-Object -Skip 1 | Set-Content .\File3.txt