How to decode BSON with Java or Clojure
I want to decode a BSON file into a Clojure map.
This is my code:
(ns decode
  (:require [clojure.java.io :as cji]
            [clojure.walk])
  (:import [org.bson BasicBSONObject BasicBSONEncoder BasicBSONDecoder]))

(defonce encoder (BasicBSONEncoder.))
(defonce decoder (BasicBSONDecoder.))

;; Reads the whole file into a single byte array.
(defn read-file [file-path]
  (with-open [reader (cji/input-stream file-path)]
    (let [length (.length (clojure.java.io/file file-path))
          buffer (byte-array length)]
      (.read reader buffer 0 length)
      buffer)))

;; Decodes one BSON document from the byte array into a Clojure map.
(defn bson2map [^bytes b]
  (->> (.readObject decoder b) (.toMap) (into {})))

(defn read-bson
  [path]
  (clojure.walk/keywordize-keys (bson2map (read-file path))))
But when I decode a BSON file like this: (r/read-bson "test.bson"), it only decodes the first entry, and I want to decode them all. test.bson is too big. How can I decode it in fragments?
Then I found a class called LazyBSONDecoder and wrote some Java code with it. It works: it decodes all entries.
import org.bson.LazyBSONDecoder;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class Main {
    public static void main(String[] args) throws IOException {
        InputStream in = new FileInputStream("./_Installation.bson");
        LazyBSONDecoder decoder = new LazyBSONDecoder();
        Object obj;
        int count = 0;
        try {
            do {
                obj = decoder.readObject(in);
                System.out.println(obj);
                count++;
            } while (obj != null);
        } catch (Exception e) {
            // ignore
        }
        System.out.println(count);
    }
}
So I changed the Clojure code to use LazyBSONDecoder instead of BasicBSONDecoder, but it still only decodes the first entry every time.
(defonce decoder (LazyBSONDecoder.))

(defn bson2map [^bytes b]
  (do (print (.readObject decoder b))
      (print (.readObject decoder b))
      (print (.readObject decoder b))))
I looked at the LazyBSONDecoder source code. The parameter passed to bson2map should be an InputStream, not a byte array. When readObject is given a byte array, it wraps it in a new ByteArrayInputStream on every call, so reading always starts from the beginning and I always get the first entry.
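To illustrate why a single shared stream matters, here is a minimal sketch that uses only java.io and does not depend on org.bson. It relies on one fact from the BSON format: every document starts with a little-endian int32 holding its total size in bytes. The class BsonFrames and the method readNextDocument are hypothetical names made up for this example, not part of any driver API. Each returned array holds exactly one raw document, so a large file could be handled one document at a time instead of being loaded whole.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper (not part of org.bson): splits a stream of
// concatenated BSON documents into one byte[] per document.
final class BsonFrames {

    // Returns the raw bytes of the next document, or null at a clean end of stream.
    static byte[] readNextDocument(InputStream in) throws IOException {
        int first = in.read();
        if (first < 0) {
            return null; // no more documents
        }
        byte[] header = new byte[4];
        header[0] = (byte) first;
        DataInputStream data = new DataInputStream(in);
        data.readFully(header, 1, 3); // throws EOFException on a truncated header

        // BSON stores the total document size as a little-endian int32.
        int size = (header[0] & 0xFF)
                | ((header[1] & 0xFF) << 8)
                | ((header[2] & 0xFF) << 16)
                | ((header[3] & 0xFF) << 24);
        if (size < 5) {
            throw new IOException("invalid BSON document size: " + size);
        }

        byte[] doc = new byte[size];
        System.arraycopy(header, 0, doc, 0, 4);
        data.readFully(doc, 4, size - 4); // the rest of this document only
        return doc;
    }

    public static void main(String[] args) throws IOException {
        // Two documents back to back: an empty one (5 bytes) and {"a": null} (8 bytes).
        byte[] bytes = {5, 0, 0, 0, 0, 8, 0, 0, 0, 0x0A, 'a', 0, 0};

        // One shared stream: each call continues where the previous one stopped.
        InputStream shared = new ByteArrayInputStream(bytes);
        byte[] doc;
        while ((doc = readNextDocument(shared)) != null) {
            System.out.println("shared stream, document size: " + doc.length);
        }

        // A fresh stream per call starts at offset 0 again, so the first
        // document comes back every time.
        for (int i = 0; i < 2; i++) {
            doc = readNextDocument(new ByteArrayInputStream(bytes));
            System.out.println("fresh stream, document size: " + doc.length);
        }
    }
}
```

With the shared stream the sizes printed are 5 and then 8. With a fresh stream per call the size is 5 both times, which is the same symptom as passing a byte array to readObject repeatedly. In the Clojure code, the equivalent change would be to open one input stream with with-open and pass that same stream to every readObject call.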