How to get nested nodes using xml-> in clojure.data.zip?

Question

I find using xml-> extremely confusing. I have read the docs and examples but cannot figure out how to get the nested nodes of the xml document.

Suppose the following XML is in a zipper (as returned by xml-zip):

<html>
 <body>
  <div class='one'>
    <div class='two'></div>
  </div>
 </body>
</html>

I am trying to get back the div with class='two'.

I expected this to work:

(xml-> z :html :body :div :div)

Or this:

(xml-> z :html :body :div (attr= :class "two"))

Kind of like css selectors.

But it only returns the first level and does not search down the tree.

The only way to make it work is:

(xml-> z :html :body :div children leftmost?)

Is this what I have to do?

The whole reason I started using xml-> was for convenience, so that I would not have to move the zipper up and down and left and right by hand. If xml-> can't get nested nodes, then I can't see its value over plain clojure.zip.

Thanks.

xml clojure zipper

Scott · June 20, 2017 at 18:45

2 answers

akond · Answer 1 · 2017-06-20T20:19:18+0000

Two consecutive :div steps match the same node. You need to go down a level. I also believe you forgot to extract the node using zip/node.

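(ns reagenttest.sample
  (:require [clojure.xml :as xml] ; needed for xml/parse below
            [clojure.zip :as zip]
            [clojure.data.zip.xml :as data-zip]))

;; s holds the XML document as a string
(let [s   "..."
      doc (xml/parse (java.io.ByteArrayInputStream. (.getBytes s)))]
  (prn (data-zip/xml-> (zip/xml-zip doc)
                       :html :body :div
                       zip/down                      ; move down to the first child
                       (data-zip/attr= :class "two") ; keep it only if class is "two"
                       zip/node)))                   ; unwrap the location into its node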

Or you can write your own abstraction if you're not happy with xml->:

(defn xml->find [loc & path]
  ;; put zip/down between consecutive path steps and finish with zip/node
  (let [new-path (conj (vec (butlast (interleave path (repeat zip/down)))) zip/node)]
    (apply data-zip/xml-> loc new-path)))

Now you can do this:

(xml->find z :html :body :div :div)
(xml->find z :html :body :div (data-zip/attr= :class "two"))
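
For comparison, here is a minimal sketch of the same lookup using only clojure.xml and clojure.zip, with no xml-> involved. It walks every location depth-first with zip/next and keeps the nodes that satisfy a predicate, which is roughly the "search down the tree" behaviour the question asks for. The names sample-xml, parse-str, find-nodes and result are made up for this illustration and are not part of any library.

```clojure
(require '[clojure.xml :as xml]
         '[clojure.zip :as zip])

;; Hypothetical sample document, written without whitespace between tags
(def sample-xml
  "<html><body><div class='one'><div class='two'></div></div></body></html>")

(defn parse-str
  "Parses an XML string into a clojure.xml element tree."
  [s]
  (xml/parse (java.io.ByteArrayInputStream. (.getBytes s "UTF-8"))))

(defn find-nodes
  "Returns every node in the tree for which (pred node) is truthy."
  [root pred]
  (->> (zip/xml-zip root)
       (iterate zip/next)                  ; depth-first walk over all locations
       (take-while (complement zip/end?))  ; stop once the walk is finished
       (map zip/node)
       (filter pred)))

;; All div elements whose class attribute is "two", at any depth
(def result
  (find-nodes (parse-str sample-xml)
              #(and (map? %)
                    (= :div (:tag %))
                    (= "two" (get-in % [:attrs :class])))))
```

Because the walk visits every location, the predicate does not need to know how deeply the div is nested. The trade-off is that it scans the whole document instead of following a path.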

Alan Thompson · Answer 2 · 2017-06-20T22:15:58+0000

You can solve this problem using tupelo.forest from the Tupelo library. forest contains functions for searching and manipulating tree-like data. It is like Enlive on steroids. Here is a solution for your data:

(dotest
  (with-forest (new-forest)
    (let [xml-str         "<html>
                             <body>
                               <div class='one'>
                                 <div class='two'></div>
                               </div>
                             </body>
                           </html>"

          enlive-tree     (->> xml-str
                            java.io.StringReader.
                            en-html/xml-resource
                            only)
          root-hid        (add-tree-enlive enlive-tree)

          ; Removing whitespace nodes is optional; just done to keep things neat
          blank-leaf-hid? (fn [hid] (ts/whitespace? (hid->value hid))) ; whitespace pred fn
          blank-leaf-hids (keep-if blank-leaf-hid? (all-leaf-hids)) ; find whitespace nodes
          >>              (apply remove-hid blank-leaf-hids) ; delete whitespace nodes found

          ; Can search for inner `div` 2 ways
          result-1        (find-paths root-hid [:html :body :div :div]) ; explicit path from root
          result-2        (find-paths root-hid [:** {:class "two"}]) ; wildcard path that ends in :class "two"
    ]
       (is= result-1 result-2) ; both searches return the same path
       (is= (hid->bush root-hid)
         [{:tag :html}
          [{:tag :body}
           [{:class "one", :tag :div}
            [{:class "two", :tag :div}]]]])
      (is=
        (format-paths result-1)
        (format-paths result-2)
        [[{:tag :html}
          [{:tag :body}
           [{:class "one", :tag :div}
            [{:class "two", :tag :div}]]]]])

       (is (val= (hid->elem (last (only result-1)))
             {:attrs {:class "two", :tag :div}, :kids []})))))

There are many examples in the unit tests and in a demo file.
