R: collecting data with rvest error - due to "nested" forms?

For content extraction with R, there is a new package called rvest from Hadley Wickham. It works great for simple sessions, e.g. obtaining a timetable for a railway connection. But when I try to use the advanced search, it fails:

url     <- "http://mobile.bahn.de/bin/mobil/query.exe/dox?country=DEU&rt=1&use_realtime_filter=1&webview=&searchMode=NORMAL"
sitzung <- html_session(url)
p1.form <- html_form(sitzung)[[1]]
p2      <- submit_form(sitzung, p1.form, submit='advancedProductMode')
p2.form <- html_form(p2)[[1]]
form.mod<- set_values( p2.form
                      ,REQ0JourneyStopsS0G     = "HH"
                      ,REQ0JourneyStopsZ0G     = "F"
                      )
final   <- submit_form(sitzung, form.mod, submit='start')

Error in vapply(elements, encode, character(1)) : 
  values must be length 1,
 but FUN(X[[18]]) result is length 0
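
As far as I can tell, the message itself is the generic base R `vapply` complaint when the function returns a zero-length result for one element. The sketch below reproduces it without rvest. `fields` and `encode_value` are hypothetical stand-ins I made up, not rvest's internals, and whether an empty field is really what happens at element 18 of my form is only a guess.

```r
# Hypothetical stand-in for form field values; "b" has no value at all.
fields <- list(a = "1", b = NULL, c = "x")

# Hypothetical encoder: as.character(NULL) returns character(0).
encode_value <- function(x) as.character(x)

# vapply() insists on exactly one string per element, so "b" triggers the error.
msg <- tryCatch(
  vapply(fields, encode_value, character(1)),
  error = function(e) conditionMessage(e)
)
print(msg)

# Positions of the elements that would fail the length-1 check.
empty <- which(vapply(fields, length, integer(1)) == 0)
print(empty)
```

If the guess is right, the same kind of length check on the values of the second form might point at the field that breaks the submit.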

      

I get the same result when I submit from the second session instead:

 submit_form(p2, form.mod, submit='start')

      

Any ideas? It works if I modify and submit the first form ("p1.form") directly. Content of the second form:

<form> '<unnamed>' (POST http://mobile.bahn.de/bin/mobil/query.exe/dox?ld=96240&n=8&i=c6.05923240.1417523354&rt=1&use_realtime_filter=1&webview=&OK#focus)
  <input hidden> 'queryPageDisplayed': yes
  <input hidden> 'REQ0JourneyStopsS0A': 1
  <input text> 'REQ0JourneyStopsS0G':
  <input hidden> 'REQ0JourneyStopsS0ID':
  <input hidden> 'REQ0JourneyStopsZ0A': 1
  <input text> 'REQ0JourneyStopsZ0G':
  <input hidden> 'REQ0JourneyStopsZ0ID':
  <input text> 'REQ0JourneyDate': 02.12.14
  <input text> 'REQ0JourneyTime': 13:40
  <input radio> 'REQ0HafasSearchForw': 1
  <input radio> 'REQ0HafasSearchForw': 0
  <input hidden> 'existProductNahverkehr': yes
  <input checkbox> 'REQ0JourneyProduct_prod_list': 4:0001111111000000
  <input hidden> 'REQ0Tariff_TravellerType.1': E
  <input hidden> 'REQ0Tariff_TravellerReductionClass.1': 0
  <input image> 'start':
  <input hidden> 'REQ0Tariff_Class': 2
  <input submit> 'chgBC=y&getstop': Reiseprofil ändern
  <input submit> 'HWAI=QUERY!options=hide!&getstop': Suchoptionen ausblenden
  <input hidden> 'REQ0JourneyStops1.0A': 1
  <input text> 'REQ0JourneyStops1.0G':
  <input hidden> 'REQ0JourneyStops2.0A': 1
  <input text> 'REQ0JourneyStops2.0G':
  <input submit> 'chgProd=y&getstop': Verkehrsmittelwahl ändern
  <select> 'REQ0HafasChangeTime' [0/9]
  <input hidden> 'existOptimizePrice': 1
  <input checkbox> 'REQ0HafasOptimize1': 0:1
  <input checkbox> 'REQ0JourneyProduct_opt0': 1
  <input checkbox> 'REQ0JourneyProduct_opt3': 1
  <input hidden> 'existOptionBits': yes
  <input hidden> 'immediateAvail': ON
  <input submit> 'start': Suchen

      
