CMD or PowerShell to merge (concatenate) corresponding lines from two files

Is it possible to use CMD or PowerShell to combine 2 files into 1 file like this:

file1-line1 tab file2-line1
file1-line2 tab file2-line2
file1-line3 tab file2-line3

So it takes file 1 line 1, inserts a tab, and then inserts file 2 line 1. Then it does the same for all subsequent lines in each file?

+2




6 answers


In PowerShell, assuming both files have exactly the same number of lines:



$f1 = Get-Content file1
$f2 = Get-Content file2

for ($i = 0; $i -lt $f1.Length; ++$i) {
  $f1[$i] + "`t" + $f2[$i]
}
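
If the two files can differ in length, the loop above stops at the end of file1 and drops any extra lines in file2. A minimal sketch of a variant that pads the shorter file with empty fields follows. The sample arrays stand in for the Get-Content output, the variable names are illustrative only, and it assumes strict mode is off (so that indexing past the end of an array yields $null instead of an error):

```powershell
# Sample data standing in for: $f1 = @(Get-Content file1); $f2 = @(Get-Content file2)
# The @(...) wrapper keeps a one-line file an array rather than a bare string.
$f1 = @('a1', 'a2', 'a3')
$f2 = @('b1', 'b2')

# Iterate up to the longer of the two line counts.
$count = [Math]::Max($f1.Count, $f2.Count)

$merged = for ($i = 0; $i -lt $count; ++$i) {
  # Indexing past the end yields $null, which -join renders as an empty field.
  ($f1[$i], $f2[$i]) -join "`t"
}

$merged
```

The last merged line here consists of a3 followed by a tab and an empty second field.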

      

+6




Probably the simplest solution is to use a Windows port of the Linux utility paste (for example paste.exe from UnxUtils):

paste C:\path\to\file1.txt C:\path\to\file2.txt

      

From the man page:

DESCRIPTION

Write lines consisting of sequentially matching lines from each FILE, separated by TABs, to standard output.




For a PowerShell(ish) solution, I would use two stream readers:

$sr1 = New-Object IO.StreamReader 'C:\path\to\file1.txt'
$sr2 = New-Object IO.StreamReader 'C:\path\to\file2.txt'

while ($sr1.Peek() -ge 0 -or $sr2.Peek() -ge 0) {
  if ($sr1.Peek() -ge 0) { $txt1 = $sr1.ReadLine() } else { $txt1 = '' }
  if ($sr2.Peek() -ge 0) { $txt2 = $sr2.ReadLine() } else { $txt2 = '' }

  "{0}`t{1}" -f $txt1, $txt2
}

# Release the file handles.
$sr1.Close()
$sr2.Close()

      

This avoids the need to read the two files completely into memory before merging them, which risks running out of memory for large files.
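
The same idea can be carried through to the output side, so that the merged result is not held in memory either. The following is only a sketch under assumptions: the temp-file names are hypothetical sample data, and a System.IO.StreamWriter is used for the output, which the answer itself does not do.

```powershell
# Create two small sample files in the temp directory (hypothetical data).
$dir = [IO.Path]::GetTempPath()
$in1 = Join-Path $dir 'merge-demo-1.txt'
$in2 = Join-Path $dir 'merge-demo-2.txt'
$out = Join-Path $dir 'merge-demo-out.txt'
Set-Content -Path $in1 -Value 'a1', 'a2', 'a3'
Set-Content -Path $in2 -Value 'b1', 'b2'

$sr1 = New-Object IO.StreamReader $in1
$sr2 = New-Object IO.StreamReader $in2
$sw  = New-Object IO.StreamWriter $out
try {
  # Keep going while at least one input still has lines.
  while (-not $sr1.EndOfStream -or -not $sr2.EndOfStream) {
    # ReadLine() returns $null at end of stream; -f formats $null as an empty string.
    $sw.WriteLine(("{0}`t{1}" -f $sr1.ReadLine(), $sr2.ReadLine()))
  }
}
finally {
  # Release all file handles even if an error occurred.
  $sw.Dispose(); $sr1.Dispose(); $sr2.Dispose()
}

# Read the result back to show it, then clean up the sample files.
$written = Get-Content $out
$written
Remove-Item $in1, $in2, $out
```

Only one line per file is in memory at any time, at the cost of slightly more ceremony than the pipeline-based version.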

+3




@echo off
setlocal EnableDelayedExpansion
rem The next line must have a literal TAB character (not spaces) after the equal sign:
set "TAB=   "
Rem First file is read with FOR /F command
Rem Second file is read via Stdin
< file2.txt (for /F "delims=" %%a in (file1.txt) do (
   Rem Read next line from file2.txt
   set /P "line2="
   Rem Echo lines of both files separated by tab
   echo %%a%TAB%!line2!
))

      


+2




A generalized solution that supports multiple files, building on the elegant, memory-efficient System.IO.StreamReader approach by Ansgar Wiechers:

PowerShell's ability to invoke members (properties, methods) directly on a collection and have them automatically invoked on every item in the collection (member enumeration, v3+) makes it easy to generalize:

# Make sure .NET has the same current dir. as PS.
[System.IO.Directory]::SetCurrentDirectory($PWD)

# The input file paths.
$files = 'file1', 'file2', 'file3'

# Create stream-reader objects for all input files.
$readers = [IO.StreamReader[]] $files

# Keep reading while at least 1 file still has more lines.
while ($readers.EndOfStream -contains $false) {

  # Read the next line from each stream (file).
  # Streams that are already at EOF fortunately just return "".
  $lines = $readers.ReadLine()

  # Output the lines separated with tabs.
  $lines -join "`t"

}

# Close the stream readers.
$readers.Close()
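
Member enumeration is what lets $readers.EndOfStream, $readers.ReadLine() and $readers.Close() act on every reader at once. A small stand-alone sketch of the mechanism, using plain strings instead of stream readers (the sample values are illustrative only):

```powershell
$words = 'apple', 'fig', 'banana'

# The array type has no ToUpper() method, so PowerShell calls it on each element.
$upper = $words.ToUpper()

# The same applies to EndsWith(); the result is one Boolean per element.
$flags = $words.EndsWith('g')

# Same test shape as: while ($readers.EndOfStream -contains $false)
$anyEndsWithG = $flags -contains $true

# Caveat: members that the collection itself has take precedence,
# so .Length is the array's element count, not the per-string lengths.
$arrayLength = $words.Length

$upper
$flags
$anyEndsWithG
$arrayLength
```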

      


Get-MergedLines (source code below; invoke with -? for help) wraps the functionality in a function that:

  • accepts a variable number of file names, both as an argument and via the pipeline

  • uses a configurable separator to join the lines (a tab by default)

  • allows trimming trailing separator instances

function Get-MergedLines() {
<#
.SYNOPSIS
Merges lines from 2 or more files with a specifiable separator (default is tab).

.EXAMPLE
> Get-MergedLines file1, file2 '<->'

.EXAMPLE
> Get-ChildItem file? | Get-MergedLines
#>
  param(
    [Parameter(Mandatory, ValueFromPipeline, ValueFromPipelineByPropertyName)]
    [Alias('PSPath')]
    [string[]] $Path,

    [string] $Separator = "`t",

    [switch] $TrimTrailingSeparators
  )

  begin { $allPaths = @() }

  process { $allPaths += $Path }

  end {

    # Resolve all paths to full paths, which may include wildcard resolution.
    # Note: By using full paths, we needn't worry about .NET current dir.
    #       potentially being different.
    $fullPaths = (Resolve-Path $allPaths).ProviderPath

    # Create stream-reader objects for all input files.
    $readers = [System.IO.StreamReader[]] $fullPaths

    # Keep reading while at least 1 file still has more lines.
    while ($readers.EndOfStream -contains $false) {

      # Read the next line from each stream (file).
      # Streams that are already at EOF fortunately just return "".
      $lines = $readers.ReadLine()

      # Join the lines.
      $mergedLine = $lines -join $Separator

      # Trim (remove) trailing separators, if requested.
      if ($TrimTrailingSeparators) {
        $mergedLine = $mergedLine -replace ('^(.*?)(?:' + [regex]::Escape($Separator) + ')+$'), '$1'
      }

      # Output the merged line.
      $mergedLine

    }

    # Close the stream readers.
    $readers.Close()

  }

}
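
The trailing-separator trimming can be tried in isolation. A short sketch using the same regex construction as the function, with '<->' as an arbitrary example separator and made-up line contents:

```powershell
$Separator = '<->'   # example separator; any string works because it is regex-escaped below

# A merged line where the 2nd and 3rd files had already reached EOF.
$mergedLine = 'file1-line3', '', '' -join $Separator

# Same pattern as in Get-MergedLines: lazily capture the content,
# then match one or more separators anchored at the end of the line.
$pattern = '^(.*?)(?:' + [regex]::Escape($Separator) + ')+$'
$trimmed = $mergedLine -replace $pattern, '$1'

# Empty fields in the middle of a line are left alone,
# because the line does not end with a separator.
$inner = ('a', '', 'c' -join $Separator) -replace $pattern, '$1'

$mergedLine   # file1-line3<-><->
$trimmed      # file1-line3
$inner        # a<-><->c
```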

      

+2




PowerShell solution:

$file1 = Get-Content file1
$file2 = Get-Content file2
$outfile = "file3.txt"

for($i = 0; $i -lt $file1.length; $i++) {
  "$($file1[$i])`t$($file2[$i])" | out-file $outfile -Append 
}
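
Calling Out-File -Append inside the loop reopens the output file for every line, which can be slow for large inputs. A possible variant, sketched under assumptions, collects the loop output first and writes it once. The sample arrays stand in for the Get-Content results and the temp-file path is hypothetical:

```powershell
# Sample data standing in for: $file1 = @(Get-Content file1); $file2 = @(Get-Content file2)
$file1 = @('a1', 'a2')
$file2 = @('b1', 'b2')
$outfile = Join-Path ([IO.Path]::GetTempPath()) 'merge-demo-file3.txt'

# Collect all merged lines; the for statement's output is captured as an array.
$lines = for ($i = 0; $i -lt $file1.Count; $i++) {
  "$($file1[$i])`t$($file2[$i])"
}

# Write the file in a single call instead of once per line.
$lines | Out-File $outfile

# Read the result back to show it, then remove the sample file.
$check = Get-Content $outfile
$check
Remove-Item $outfile
```

This trades memory for speed, since all merged lines are held in $lines before being written.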

      

+1




Several recently closed [duplicate] questions link to this question. I disagree with those closures, because this question deals with plain text files while the others deal with csv files. As a general rule, I would advise against manipulating files that represent objects (e.g. xml, json and csv) as text. Instead, I recommend importing these files (into objects), making the required changes, and using ConvertTo-*/Export-* to write the results back to a file.

One case where all the general solutions given in this question produce the wrong output for these "duplicates" is when both csv files have a common column (property) name.
The generic Join-Object cmdlet (see also: In Powershell, what's the best way to join two tables into one?) joins two object lists side by side when the -on parameter is simply omitted. This solution is therefore a better fit for the other (csv) "duplicate" questions. Take Merge 2 CSV files in powershell [duplicate] from @Ender as an example:

$A = ConvertFrom-Csv @'
ID,Name
1,Peter
2,Dalas
'@

$B = ConvertFrom-Csv @'
Class
Math
Physic
'@

$A | Join $B

ID Name  Class
-- ----  -----
1  Peter Math
2  Dalas Physic

      

Compared to the "text-based" merge solutions in the other answers, the generic Join-Object cmdlet can handle files of different lengths and lets you decide what to include (LeftJoin, RightJoin or FullJoin). You also control which columns to include ($A | Join $B -Property ID, Name), their order ($A | Join $B -Property ID, Class, Name), and much more that cannot be done by just concatenating text.

Specific to this question:

Since this particular question is about text files and not csv files, you need to supply a header (property) name (for example -Header File1) when importing each file and remove the header row (Select-Object -Skip 1) when exporting the result:

$File1 = Import-Csv .\File1.txt -Header File1 
$File2 = Import-Csv .\File2.txt -Header File2
$File3 = $File1 | Join $File2
$File3 | ConvertTo-Csv -Delimiter "`t" -NoTypeInformation |
    Select-Object -Skip 1 | Set-Content .\File3.txt

      

0








