Get line numbers where a specific word / text occurs
What I have: a file that is read line by line. The lines in the file are not numbered.
What I want to do: read every line in a single stream and return only the line numbers where certain text occurs.
What I have so far:
public static Integer findLineNums(String word)
        throws IOException {
    final Map<String, Integer> map = new HashMap<>();
    final List<String> lines = Files.lines(Paths.get(PATH)).collect(Collectors.toList());
    IntStream.rangeClosed(0, lines.size()-1).forEach(f -> map.put(lines.get(f), f+1));
    return map.get(word);
}
QUESTION: How can I do this using only one stream?
EDIT QUESTION: I would like to do everything inside a Stream, this includes accumulating into a list as well.
The best scenario would look something like this:
Files.lines(Paths.get(PATH)).superAwesomeStreamFuncs().collect(Collectors.toList());
EDIT: In my case only one integer would be returned, but I would like to get something like a List<Integer>.
It works:
int[] i = new int[]{0}; // trick to make it final
List<Integer> hits = <your stream>
        .map(s -> s.contains(word) ? ++i[0] : - ++i[0])
        .filter(n -> n > 0)
        .collect(Collectors.toList());
The main "trick" here is the use of an array whose reference does not change (i.e. it is "effectively final"), but which allows us to mutate its (only) element as a counter that is incremented inline. Lines that do not match are mapped to a negative number, so the subsequent filter discards them. Note that this relies on the stream being sequential.
Some test code:
String word = "foo";
int[] i = new int[]{0};
List<Integer> hits = Stream.of("foo", "bar", "foobar")
        .map(s -> s.contains(word) ? ++i[0] : - ++i[0])
        .filter(n -> n > 0)
        .collect(Collectors.toList());
System.out.println(hits);
Output:
[1, 3]
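The counter trick can also be wrapped in a small method so that any line stream can be passed in. The following is only a sketch: the names `CounterTrick` and `hits` are made up for illustration, an `AtomicInteger` stands in for the one-element array, and the stream is forced to be sequential because the counter would be wrong otherwise.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of the original answer.
final class CounterTrick {

    // Returns the 1-based line numbers of all lines containing the word.
    static List<Integer> hits(Stream<String> lines, String word) {
        AtomicInteger lineNo = new AtomicInteger(); // mutable counter, starts at 0
        return lines.sequential()                   // the counter needs encounter order
                .map(s -> {
                    int n = lineNo.incrementAndGet();
                    return s.contains(word) ? n : -n; // non-matches become negative
                })
                .filter(n -> n > 0)                 // drop the non-matches
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(hits(Stream.of("foo", "bar", "foobar"), "foo")); // prints [1, 3]
    }
}
```

For a file, the stream returned by `Files.lines` could be passed in, ideally opened in a try-with-resources block so that the file is closed afterwards.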
The following snippet creates a List<Integer> with the numbers of the lines containing the word:
String word = "foo";
List<Integer> matchedLines = new ArrayList<>();
final List<String> lines = Files.readAllLines(Paths.get("word_list.txt"));
IntStream.rangeClosed(0, lines.size() - 1).forEach(f -> {
    if (lines.get(f).contains(word)) {
        // f is a 0-based index, line numbers are 1-based
        matchedLines.add(f + 1);
    }
});
System.out.println("matchedLines = " + matchedLines);
Assuming the file word_list.txt contains
foo
bar
baz
foobar
barfoo
output
matchedLines = [1, 4, 5]
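The same index-based idea can be written without mutating an outside list, which also gets closer to the "everything inside a stream, including collecting" wish from the question. A minimal sketch, assuming the lines fit in memory; the names `IndexedMatches` and `findLineNums` are chosen here for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper, not part of the original answer.
final class IndexedMatches {

    // Returns the 1-based positions of all lines that contain the word.
    static List<Integer> findLineNums(List<String> lines, String word) {
        return IntStream.range(0, lines.size())
                .filter(i -> lines.get(i).contains(word)) // keep matching indices
                .map(i -> i + 1)                          // 0-based index -> 1-based line number
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("foo", "bar", "baz", "foobar", "barfoo");
        System.out.println(findLineNums(lines, "foo")); // prints [1, 4, 5]
    }
}
```

For a real file, the list could come from `Files.readAllLines(Paths.get("word_list.txt"))`. Because no shared state is modified, this variant does not depend on the order in which the stream processes its elements.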
Edit: To solve the problem with a single stream, create a custom Consumer
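import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.Stream;

public class MatchingLines {

    // Stateful consumer: counts every line it sees and records the matching ones
    static class MatchConsumer implements Consumer<String> {
        private int count = 0;
        private final List<Integer> matchedLines = new ArrayList<>();
        private final String word;

        MatchConsumer(String word) {
            this.word = word;
        }

        @Override
        public void accept(String line) {
            count++;
            if (line.contains(this.word)) {
                matchedLines.add(count);
            }
        }

        public List<Integer> getResult() {
            return matchedLines;
        }
    }

    public static void main(String[] args) throws IOException {
        MatchConsumer matchConsumer = new MatchConsumer("foo");
        // Files.lines keeps the file open, so close the stream when done
        try (Stream<String> lines = Files.lines(Paths.get("word_list.txt"))) {
            lines.forEach(matchConsumer);
        }
        System.out.println("matchedLines = " + matchConsumer.getResult());
    }
}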
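The counting logic of such a consumer can also be packaged as a `Collector`, so that the pipeline ends in `collect(...)` as the question asks. This is only a sketch under the assumption that the stream is sequential: the combiner deliberately throws, because line counts from separately processed chunks could not be merged correctly. `LineMatchCollector`, `Acc`, and `matching` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Stream;

// Hypothetical helper; intended for sequential streams only.
final class LineMatchCollector {

    // Mutable accumulator: the running line count plus the matches found so far.
    private static final class Acc {
        int count = 0;
        final List<Integer> matched = new ArrayList<>();
    }

    static Collector<String, ?, List<Integer>> matching(String word) {
        return Collector.<String, Acc, List<Integer>>of(
                Acc::new,
                (acc, line) -> {
                    acc.count++;                     // every line advances the counter
                    if (line.contains(word)) {
                        acc.matched.add(acc.count);  // remember 1-based line number
                    }
                },
                (left, right) -> {
                    // only invoked for parallel streams, which this sketch does not support
                    throw new UnsupportedOperationException("sequential streams only");
                },
                acc -> acc.matched);
    }

    public static void main(String[] args) {
        List<Integer> hits = Stream.of("foo", "bar", "baz", "foobar", "barfoo")
                .collect(matching("foo"));
        System.out.println("matchedLines = " + hits); // prints matchedLines = [1, 4, 5]
    }
}
```

With a file, the call would look like `lines.collect(LineMatchCollector.matching("foo"))`, where `lines` is the stream obtained from `Files.lines`.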
This method returns each matching line mapped to its line number in the file.
public static Map<String, Integer> findLineNums(Path path, String word) throws IOException {
    final Map<String, Integer> map = new HashMap<>();
    int lineNumber = 0;
    // Pattern.quote treats the word literally; \b restricts matches to whole words
    Pattern pattern = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
    try (BufferedReader reader = Files.newBufferedReader(path)) {
        String line;
        while ((line = reader.readLine()) != null) {
            lineNumber++;
            if (pattern.matcher(line).find()) {
                // identical lines share one key, so only the last line number is kept
                map.put(line, lineNumber);
            }
        }
    }
    for (Map.Entry<String, Integer> entry : map.entrySet()) {
        System.out.printf("%d %s%n", entry.getValue(), entry.getKey());
    }
    return map;
}
BufferedReader reads the file sequentially, line by line, just like the stream returned by Files.lines.
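Since the question asks for a list of line numbers rather than a map, the reader-based approach can be sketched with `LineNumberReader`, which keeps the line count itself. The names `WholeWordLines` and `find` are assumptions for illustration; the whole-word matching via `\b` follows the method above, with `Pattern.quote` added so that regex metacharacters in the word are taken literally.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
final class WholeWordLines {

    // Returns the 1-based numbers of the lines that contain the word as a whole word.
    static List<Integer> find(Reader source, String word) {
        Pattern pattern = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        List<Integer> hits = new ArrayList<>();
        try (LineNumberReader reader = new LineNumberReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (pattern.matcher(line).find()) {
                    // getLineNumber() reports how many lines have been read so far
                    hits.add(reader.getLineNumber());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hits;
    }

    public static void main(String[] args) {
        Reader text = new StringReader("foo\nbar\nbaz\nfoobar\nbar foo\n");
        System.out.println(find(text, "foo")); // prints [1, 5]
    }
}
```

Unlike `contains`, the word-boundary pattern does not match "foobar", which is why line 4 is missing from the result. For a file, `Files.newBufferedReader(path)` could be passed as the source.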