Retrieve Wikipedia math expressions using the MediaWiki API

I want to get the content of <math> tags in a MediaWiki API response.

I tried using this query: https://en.wikipedia.org/w/api.php?action=query&prop=extracts&format=xml&titles=delta-v

I get a valid XML file, but in place of the formula I get:

<dl><dd></dd></dl>

      

I want to get:

<math>\Delta{v} = \int_{t_0}^{t_1} {\frac {|T|} {m}}\, dt</math>

      

That is what is available via the "Edit" button here: http://en.wikipedia.org/w/index.php?title=Delta-v&action=edit

Is it somehow accessible via the API?



2 answers


You need to query the entire page source using the built-in MediaWiki API:

  https://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvlimit=1&rvprop=content&format=xml&titles=delta-v



This will give you exactly what you see when editing the page.
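As a rough illustration of this approach, the sketch below requests the latest revision and pulls the TeX out of each <math> tag with a regular expression. The helper names `fetch_wikitext` and `extract_math` and the User-Agent string are made up for this example, and it assumes the legacy JSON layout in which the revision content sits under the "*" key; treat it as a starting point rather than a robust wikitext parser.

```python
import json
import re
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"

# Matches <math>...</math>, allowing attributes on the opening tag.
MATH_RE = re.compile(r"<math\b[^>]*>(.*?)</math>", re.DOTALL | re.IGNORECASE)


def extract_math(wikitext):
    """Return the TeX source of every <math> tag found in the wikitext."""
    return [m.strip() for m in MATH_RE.findall(wikitext)]


def fetch_wikitext(title):
    """Fetch the latest revision's wikitext for one page (network request)."""
    params = urllib.parse.urlencode({
        "action": "query",
        "prop": "revisions",
        "rvlimit": 1,
        "rvprop": "content",
        "format": "json",
        "titles": title,
    })
    request = urllib.request.Request(
        f"{API_URL}?{params}",
        # Hypothetical User-Agent; replace with one identifying your tool.
        headers={"User-Agent": "math-extract-example/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    page = next(iter(data["query"]["pages"].values()))
    # Assumption: legacy JSON format, where the content is under "*".
    return page["revisions"][0]["*"]
```

Calling `extract_math(fetch_wikitext("Delta-v"))` should then return a list of TeX strings, one per formula on the page. A regex is a simplification: it will also match <math> tags that appear inside comments or <nowiki> sections.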

The prop=extracts request is implemented by the TextExtracts extension and does not always work with content generated by other extensions (such as Math). This could arguably be considered a bug in TextExtracts, but given the complexity of page rendering in MediaWiki and the number of ways extensions can add to the content, it will probably never handle every case.



In the original link you get the MathML rendering, including the TeX code in the annotation element <annotation encoding="application/x-tex">.



<dd>
  <span>
    <span>
      <math xmlns="http://www.w3.org/1998/Math/MathML" alttext="{\displaystyle \Delta {v}=\int _{t_{0}}^{t_{1}}{\frac {|T(t)|}{m(t)}}\,dt}">
        <semantics>
          <mrow class="MJX-TeXAtom-ORD">
            <mstyle displaystyle="true" scriptlevel="0">
              <mi mathvariant="normal">Δ</mi>
              <mrow class="MJX-TeXAtom-ORD">
                <mi>v</mi>
              </mrow>
              <mo>=</mo>
              <msubsup>
                <mo>∫</mo>
                <mrow class="MJX-TeXAtom-ORD">
                  <msub>
                    <mi>t</mi>
                    <mrow class="MJX-TeXAtom-ORD">
                      <mn>0</mn>
                    </mrow>
                  </msub>
                </mrow>
                <mrow class="MJX-TeXAtom-ORD">
                  <msub>
                    <mi>t</mi>
                    <mrow class="MJX-TeXAtom-ORD">
                      <mn>1</mn>
                    </mrow>
                  </msub>
                </mrow>
              </msubsup>
              <mrow class="MJX-TeXAtom-ORD">
                <mfrac>
                  <mrow>
                    <mrow class="MJX-TeXAtom-ORD">
                      <mo stretchy="false">|</mo>
                    </mrow>
                    <mi>T</mi>
                    <mo stretchy="false">(</mo>
                    <mi>t</mi>
                    <mo stretchy="false">)</mo>
                    <mrow class="MJX-TeXAtom-ORD">
                      <mo stretchy="false">|</mo>
                    </mrow>
                  </mrow>
                  <mrow>
                    <mi>m</mi>
                    <mo stretchy="false">(</mo>
                    <mi>t</mi>
                    <mo stretchy="false">)</mo>
                  </mrow>
                </mfrac>
              </mrow>
              <mspace width="thinmathspace" />
              <mi>d</mi>
              <mi>t</mi>
            </mstyle>
          </mrow>
          <annotation encoding="application/x-tex">{\displaystyle \Delta {v}=\int _{t_{0}}^{t_{1}}{\frac {|T(t)|}{m(t)}}\,dt}</annotation>
        </semantics>
      </math>
    </span>
  </span>
</dd>
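If the markup comes back in this form, the TeX can be read from the annotation element rather than from the page source. The sketch below uses the standard library's ElementTree and assumes the fragment is well-formed XML, which real extract HTML may not always be; `extract_tex` is a name invented for this example.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def extract_tex(fragment):
    """Return the TeX strings from x-tex annotations in an XML fragment."""
    root = ET.fromstring(fragment)
    results = []
    # The default xmlns on <math> puts its descendants in the MathML namespace.
    for node in root.iter(f"{{{MATHML_NS}}}annotation"):
        if node.get("encoding") == "application/x-tex":
            results.append((node.text or "").strip())
    return results
```

Passing the <dd> fragment above as a string should yield a one-element list holding the `{\displaystyle ...}` source.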
      
