Speed up my batch file parsing

I have a batch file that takes input from a txt file that looks like this.

Microsoft (R) Windows Script Host Version 5.8
Copyright (C) Microsoft Corporation. All rights reserved.


Server name lak-print01
Printer name Microsoft XPS Document Writer
Share name 
Driver name Microsoft XPS Document Writer
Port name XPSPort:
Comment 
Location 
Print processor WinPrint
Data type RAW
Parameters 
Attributes 64
Priority 1
Default priority 1
Average pages per minute 0
Printer status Idle 
Extended printer status Unknown 
Detected error state Unknown 
Extended detected error state Unknown 

Server name lak-print01
Printer name 4250_Q1
Share name 4250_Q1
Driver name Canon iR5055/iR5065 PCL5e
Port name IP_192.168.202.84
Comment Audit Department in Lakewood Operations
Location Operations Center
Print processor WinPrint
Data type RAW
Parameters 
Attributes 10826
Priority 1
Default priority 0
Average pages per minute 0
Printer status Idle 
Extended printer status Unknown 
Detected error state Unknown 
Extended detected error state Unknown 

Server name lak-print01
Printer name 3130_Q1
Share name 3130_Q1
Driver name Canon iR1020/1024/1025 PCL5e
Port name IP_192.168.202.11
Comment Canon iR1025 
Location Operations Center
Print processor WinPrint
Data type RAW
Parameters 
Attributes 10824
Priority 1
Default priority 0
Average pages per minute 0
Printer status Idle 
Extended printer status Unknown 
Detected error state Unknown 
Extended detected error state Unknown 

      

The batch file parses this to pull out certain fields (server name, printer name, driver name, and so on) and writes each block as a single comma-delimited line. The result is one line per printer, with each column holding one specific field. Some of these txt files contain over 100 entries, and each file takes 5-10 minutes to parse.

The parsing code looks like this:

:Parselak-print01
SETLOCAL enabledelayedexpansion
:: remove variables starting $
FOR  /F "delims==" %%a In ('set $ 2^>Nul') DO SET "%%a="
(FOR /f "delims=" %%a IN (lak-print01.txt) DO CALL :analyse "%%a")>lak-print01.csv
attrib +h lak-print01.csv
GOTO :EOF

:analyse
SET "line=%~1"
SET /a fieldnum=0
FOR %%s IN ("Server name" "Printer name" "Driver name"
            "Port name" "Location" "Comment" "Printer status" 
        "Extended detected error state") DO CALL :setfield %%~s
GOTO :eof

:setfield
SET /a fieldnum+=1
SET "linem=!line:*%* =!"
SET "linet=%* %linem%"
IF "%linet%" neq "%line%" GOTO :EOF 
IF "%linem%"=="%line%" GOTO :EOF
SET "$%fieldnum%=%linem%"
IF NOT DEFINED $8 GOTO :EOF 
SET "line="
FOR /l %%q IN (1,1,7) DO SET "line=!line!,!$%%q!"
ECHO !line:~1!
:: remove variables starting $
FOR  /F "delims==" %%a In ('set $ 2^>Nul') DO SET "%%a="
GOTO :eof

      

and the output I get is

lak-print01,Microsoft XPS Document Writer,Microsoft XPS Document Writer,XPSPort:,,,Idle 
lak-print01,4250_Q1,Canon iR5055/iR5065 PCL5e,IP_192.168.202.84,Operations Center,Audit Department in Lakewood Operations,Idle 
lak-print01,3130_Q1,Canon iR1020/1024/1025 PCL5e,IP_192.168.202.11,Operations Center,Canon iR1025 ,Idle 
lak-print01,1106_TRN,HP LaserJet P2050 Series PCL6,IP_172.16.10.97,Monroe,HP P2055DN,Idle 
lak-print01,1101_TRN,HP LaserJet P2050 Series PCL6,IP_10.3.3.22,Burlington,Training Room printer,Idle 
lak-print01,1096_Q3,Canon iR1020/1024/1025 PCL5e,IP_192.168.96.248,Silverdale,Canon iR 1025,Idle 
lak-print01,1096_Q2,Kyocera Mita KM-5035 KX,IP_192.168.96.13,Silverdale,Kyocera CS-5035 all in one,Idle 
lak-print01,1096_Q1,HP LaserJet P4010_P4510 Series PCL 6,IP_192.168.96.12,Silverdale,HP 4015,Idle 
lak-print01,1095_Q3,HP LaserJet P4010_P4510 Series PCL 6,IP_192.168.95.247,Sequim,HP LaserJet 4015x,Idle 

      

The output is correct and the code works as intended, but it's extremely slow.

How can I speed it up? The problem is that there is no true delimiter and the token positions change from field to field. For example, the value for Comment starts at token 2, but the value for Printer name starts at token 3.

Any help increasing the parsing speed would be appreciated.
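
To make the token problem concrete, here is a minimal sketch of the label-stripping substitution that :setfield relies on, run in isolation. The two sample lines are copied from the listing above; a single space between label and value is an assumption based on how the file appears here.

```batch
@echo off
setlocal EnableDelayedExpansion

rem Two sample lines copied from the listing above.
set "line=Printer name 4250_Q1"
rem The asterisk pattern matches everything up to and including the label
rem and the space after it, so replacing it with nothing leaves the value.
echo !line:*Printer name =!

set "line=Comment Audit Department in Lakewood Operations"
echo !line:*Comment =!
```

Because the label is removed as a whole string, it makes no difference whether it is one word or two, which sidesteps the changing token positions. Note that this kind of substitution matches case-insensitively, so a separate check that the line really starts with the label is still needed.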



3 answers


Using CALL is very slow. See if this gives you the result you want; it will be interesting to know how much faster it is by comparison.



@echo off
:Parselak-print01
SETLOCAL enabledelayedexpansion
(FOR /f "delims=" %%a IN (lak-print01.txt) DO (
for /f "tokens=1,2,*" %%b in ("%%a") do (
   if "%%b"=="Server"   set "server=%%d"
   if "%%b"=="Printer"  if "%%c"=="name" (set "printer=%%d") else (set "printerstatus=%%d")
   if "%%b"=="Driver"   set "driver=%%d"
   if "%%b"=="Port"     set "port=%%d"
   if "%%b"=="Location" for /f "tokens=1,*"   %%e in ("%%a") do set "location=%%f"
   if "%%b"=="Comment"  for /f "tokens=1,*"   %%e in ("%%a") do set "comment=%%f"
   if "%%b"=="Extended" for /f "tokens=1-4,*" %%e in ("%%a") do if "%%f"=="detected" set "extendeddetected=%%i"
   )
if defined extendeddetected (
   echo !server!,!printer!,!driver!,!port!,!location!,!comment!,!printerstatus!,!extendeddetected!
   set "server="
   set "printer="
   set "driver="
   set "port="
   set "location="
   set "comment="
   set "printerstatus="
   set "extendeddetected="
)
))>lak-print01.csv
attrib +h lak-print01.csv
pause
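
To get a rough feel for the overhead of CALL on a given machine, a small timing sketch like the one below may help. The loop count and the :bump label are hypothetical and are not taken from either script above; the timestamps are only as precise as %time%.

```batch
@echo off
setlocal

rem Hypothetical micro-benchmark. The loop count and the bump label are
rem illustrative choices only.
set /a n=0
echo CALL loop start:   %time%
for /L %%i in (1,1,5000) do call :bump
echo CALL loop end:     %time%

set /a n=0
echo Inline loop start: %time%
for /L %%i in (1,1,5000) do set /a n+=1
echo Inline loop end:   %time%
goto :eof

:bump
set /a n+=1
goto :eof
```

Both loops do the same arithmetic, so any difference between the two pairs of timestamps comes from the subroutine call itself. The original script issues one CALL per input line plus one per field, so that cost is multiplied accordingly.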

      



If speed is what you want, I would suggest Marpa, a general BNF parser in Perl.

It takes a while to get used to, but it gets the job done and gives you a very powerful tool that is easy to use. Notice how closely the grammar mirrors the input.



Hope it helps.



The solution below assumes the input file is in a fixed format, that is, it has two header lines followed by blocks of 18 lines, always in the same order. If so, this solution generates results very quickly; otherwise, it must be modified accordingly ...

@echo off
setlocal EnableDelayedExpansion

rem Create the array of variable names for the *desired rows* of data in the file
set "row[1]=Server name"
set "row[2]=Printer name"
set "row[4]=Driver name"
set "row[5]=Port name"
set "row[6]=Comment"
set "row[7]=Location"
set "row[15]=Printer status"

set i=0
(for /F "skip=2 delims=" %%a in (lak-print01.txt) do (
   set /A i+=1
   if defined row[!i!] (
      set "line=%%a"
      for %%i in (!i!) do for /F "delims=" %%v in ("!row[%%i]!") do set "%%v=!line:*%%v =!"
   )
   if !i! equ 18 (
      echo !Server name!,!Printer name!,!Driver name!,!Port name!,!Location!,!Comment!,!Printer status!
      set i=0
   )
)) > lak-print01.csv
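
If the fixed 18-line layout cannot be relied on, one possible modification is to match on the labels instead of on line positions, still without using CALL. The following is only a sketch under stated assumptions: no input line contains double quotes or exclamation marks, label and value are separated by a single space, and every block ends with an "Extended detected error state" line, as in the sample above. It has not been timed against the real files and is likely somewhat slower than the fixed-position version, since it tests every label against every line.

```batch
@echo off
setlocal EnableDelayedExpansion

rem Sketch only. File names are the ones used in the thread.
rem Assumes no line contains double quotes or exclamation marks.
(for /F "usebackq delims=" %%a in ("lak-print01.txt") do (
   set "line=%%a"
   rem Strip each wanted label and keep the value only when the line
   rem really starts with that label
   for %%s in ("Server name" "Printer name" "Driver name" "Port name" "Location" "Comment" "Printer status") do (
      set "rest=!line:*%%~s =!"
      if "%%~s !rest!"=="!line!" set "%%~s=!rest!"
   )
   rem The final line of each block triggers output of the record
   if "!line:~0,29!"=="Extended detected error state" (
      echo !Server name!,!Printer name!,!Driver name!,!Port name!,!Location!,!Comment!,!Printer status!
      for %%s in ("Server name" "Printer name" "Driver name" "Port name" "Location" "Comment" "Printer status") do set "%%~s="
   )
)) > lak-print01.csv
```

The equality test after the substitution matters because the substitution is case-insensitive: without it, the "Extended printer status" line would overwrite the value stored for "Printer status".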

      
