Count the most common occurrences of unknown lines in a file
I have a large file with lines like this:
19:54:05 10.10.8.5 [SERVER] Response sent: www.example.com. type A by 192.168.4.5
19:55:10 10.10.8.5 [SERVER] Response sent: ns1.example.com. type A by 192.168.4.5
19:55:23 10.10.8.5 [SERVER] Response sent: ns1.example.com. type A by 192.168.4.5
I don't need any of the other data, only what comes after "Response sent:". I need a list of the domain names, sorted by how often they occur. The problem is that I don't know all the domain names beforehand, so I can't simply search for a fixed string.
Using the example above, I would like the result to be along these lines:
ns1.example.com (2)
www.example.com (1)
... where the number in parentheses is the count for that name.
How can I do this on Windows, and with what? The input is a .txt file; the output file can be in any format. Ideally this would be a command-line process, but I am really lost, so I would be happy with anything.
Since answers are already being posted, I might as well try to help a little. This is a PowerShell solution. If you have trouble understanding how it works, I encourage you to research the individual parts.
If your text file is "D:\temp\test.txt", you could do something like this:
$results = Select-String -Path D:\temp\test.txt -Pattern "(?<=sent: ).+(?= type)" | Select -Expand Matches | Select -Expand Value
$results | Group-Object | Select-Object Name,Count | Sort-Object Count -Descending
With your input, this produces the following output:
Name             Count
----             -----
ns1.example.com.     2
www.example.com.     1
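If the exact "name (count)" layout from the question is wanted, the same pipeline can be extended a little. The following is only a sketch: the function name Get-DomainCount and the file paths in the usage line are assumptions made for illustration, and stripping the trailing dot with TrimEnd is one possible choice, not part of the solution above.

```powershell
# Hypothetical helper (the name is an assumption): returns one
# "name (count)" string per domain, most frequent first.
function Get-DomainCount {
    param([Parameter(Mandatory)][string]$Path)

    # Same pattern as above; TrimEnd('.') drops the trailing dot of each name,
    # then the names are grouped, sorted by count and formatted.
    Select-String -Path $Path -Pattern '(?<=sent: ).+(?= type)' |
        ForEach-Object { $_.Matches[0].Value.TrimEnd('.') } |
        Group-Object |
        Sort-Object Count -Descending |
        ForEach-Object { '{0} ({1})' -f $_.Name, $_.Count }
}
```

Assuming the input path used above, the result could be written to a file (the output name is arbitrary) with: Get-DomainCount -Path D:\temp\test.txt | Set-Content D:\temp\counts.txt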
Since a regex is involved, I saved a link that explains how it works.
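As a quick illustration of what the pattern does, here is a minimal sketch that applies it to a single sample line from the question. The variable names are only for illustration.

```powershell
# One sample line from the question
$line = '19:54:05 10.10.8.5 [SERVER] Response sent: www.example.com. type A by 192.168.4.5'

# (?<=sent: )  lookbehind: the match must be preceded by "sent: "
# .+           one or more characters (the domain name)
# (?= type)    lookahead: the match must be followed by " type"
$pattern = '(?<=sent: ).+(?= type)'

# Neither "sent: " nor " type" becomes part of the matched value
$domain = [regex]::Match($line, $pattern).Value
$domain   # www.example.com.
```

The lookbehind and lookahead only assert what surrounds the match, which is why Select-String returns the bare domain name without any extra trimming.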
Please keep in mind that SO is, of course, a site that helps programmers and programming enthusiasts. We volunteer our free time here, while some people get paid to do this kind of work.
Here is a batch solution:
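@echo off
setlocal enabledelayedexpansion
if exist temp.txt del temp.txt
rem Copy the 6th space-delimited token (the domain name) of each line to temp.txt
for /f "tokens=6" %%a in (input.txt) do (Echo %%a >> temp.txt)
for /f %%a in (temp.txt) do (
    set /a count=0
    set v=%%a
    rem A variable named after the domain marks names that were already counted
    if "!%%a!" EQU "" (
        for /f %%b in ('findstr /L "%%a" "temp.txt"') do set /a count+=1
        set "%%a=!count!"
        rem Print the name without its trailing dot
        Echo !v:~0,-1! ^(!count!^)
    )
)
del temp.txt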
As written, it prints the result to the screen. To redirect it to a text file, replace:
Echo !v:~0,-1! ^(!count!^)
with:
Echo !v:~0,-1! ^(!count!^) >> output.txt
With the sample data, this outputs:
www.example.com (1)
ns1.example.com (2)
This batch file solution should run faster:
@echo off
setlocal
rem Accumulate each occurrence in its corresponding array element
for /F "tokens=6" %%a in (input.txt) do set /A "count[%%a]+=1"
rem Show the result
for /F "tokens=2,3 delims=[]=" %%a in ('set count[') do echo %%a (%%b)
Output:
ns1.example.com. (2)
www.example.com. (1)
To save the result to a file, change the last line as follows:
(for /F "tokens=2,3 delims=[]=" %%a in ('set count[') do echo %%a (%%b^)) > output.txt
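For comparison, the same accumulate-per-name idea can be sketched in PowerShell with a hashtable in place of the batch "array" variables. This is an illustration under stated assumptions, not part of the batch answer: the function name Get-TokenCount is made up, the domain is assumed to be the 6th whitespace-separated token as in the batch code, and the result is sorted by count, whereas the `set count[` listing above comes out in alphabetical order.

```powershell
# Hypothetical helper (the name is an assumption): counts the 6th
# whitespace-separated token of every line in a single pass.
function Get-TokenCount {
    param([Parameter(Mandatory)][string]$Path)

    $count = @{}                        # domain name -> number of occurrences
    foreach ($line in Get-Content -Path $Path) {
        $tokens = -split $line          # unary -split breaks the line on whitespace
        if ($tokens.Count -ge 6) {
            $name = $tokens[5]          # 6th token, like "tokens=6" in the batch file
            $count[$name] += 1          # a missing key starts out as $null, so this yields 1
        }
    }

    # Emit "name (count)" lines, most frequent first
    $count.GetEnumerator() |
        Sort-Object Value -Descending |
        ForEach-Object { '{0} ({1})' -f $_.Key, $_.Value }
}
```

Like the batch version, this keeps the trailing dot of each name. Assuming the same input.txt, the output could be saved with: Get-TokenCount -Path input.txt | Set-Content output.txt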