Relenium

Selenium for R

Example

Relenium is implemented with reference classes. No prior knowledge of them is required, although the syntax may at first look a little different from standard R.

This demo can be run directly from the R console by typing:
require(relenium)
demo(webExample)

To open a new Firefox browser, call the new method.
firefox <- firefoxClass$new()

The methods available for this new browser are accessed via the $ operator. Use tab completion to list them.
firefox$ # and press the TAB key

To navigate to a URL, use the get method.
firefox$get("http://lluisramon.github.io/relenium/toyPageExample.html")

The HTML of the page can be obtained with getPageSource, which returns it as a "character", or displayed in a more legible way with printHtml.
firefox$getPageSource()
firefox$printHtml()

The elements in the HTML code are called web elements (they are objects of the WebElement class). To find elements, use the findElement family of functions (findElementByClassName, findElementByTagName, ...) to get the first match, or the findElements family to get all matches.

An easy way to locate an element (tag) in the HTML tree is with an XPath expression. You can work out the XPath yourself or, in simple cases, let the browser give it to you. In Chrome, for instance, open the inspector (Tools > Developer Tools), find the element in the HTML code, right-click it and choose "Copy XPath". That said, learning XPath is very useful for web scraping. For more information, see the clear example in the (unofficial) selenium-python documentation, and the links therein for more exhaustive documentation.
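
To experiment with an XPath expression without opening a browser, it can be evaluated offline with the XML package. The following is a minimal sketch: the HTML string is a hand-written stand-in for part of the toy page (an assumption, not the real page source), and htmlParse, getNodeSet and xmlGetAttr come from XML, not from relenium.

```r
library(XML)

# Hand-written stand-in for a fragment of the toy page (assumed structure).
html <- '<div id="main_content">
  <h4> Input box </h4>
  <div class="row">
    <form class="navbar-form navbar-left" role="search">
      <input type="text" class="form-control">
    </form>
  </div>
</div>'

# Parse the string as HTML rather than treating it as a file name.
doc <- htmlParse(html, asText = TRUE)

# Same XPath expression that is used for the input box in this example.
xpath <- "//*[@id='main_content']/div[1]/form/input"
nodes <- getNodeSet(doc, xpath)

# Number of matching nodes and the class attribute of the first one.
nMatches <- length(nodes)
inputClass <- xmlGetAttr(nodes[[1]], "class")
```

If the expression matches nothing, getNodeSet returns an empty list, which is a quick way to spot a mistyped path before passing it to findElementByXPath.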

We are going to give an example of three different elements: an input box, a button and a select.

Input Box

Searching for "input" in the web inspector, we find:
<h4> Input box </h4> 
<div class="row"> 
  <form class="navbar-form navbar-left" role="search"> 
     <input type="text" class="form-control">  
  </form> 
</div> 
The relevant element is the input tag, with XPath:
//*[@id='main_content']/div[1]/form/input
Now we get the input element with the following code.
inputElement <- firefox$findElementByXPath("//*[@id='main_content']/div[1]/form/input")
inputElement is an object of the webElement class.

To write into the input box, use the sendKeys function. It serves two purposes: typing text and emulating a key press.
inputElement$sendKeys("R Project")
inputElement$sendKeys(key = "ENTER")
A complete list of the available keys can be obtained from any webElement.
inputElement$keys

Button

The Data button is an anchor element (a) with the following HTML code:
 <a data-toggle="modal" href="#myModal" class="btn btn-primary btn-lg"> Data </a> 

To get the element and click the button execute:
buttonElement <- firefox$findElementByXPath("//*[@id='main_content']/a")
buttonElement$click()
This opens a modal window containing a data table.

You can read the table contents with the readHTMLTable function from the XML package.
library(XML)
infoTable <- firefox$findElementByXPath("//*[@id='myModal']/div/div/div/table")
readHTMLTable(infoTable$getHtml(), header = TRUE)[[1]] 
This is shown only as an example. In fact, it is much simpler to read every table on the page at once:
readHTMLTable(firefox$getPageSource(), header = TRUE)
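
readHTMLTable can also be tried on a plain HTML string, which is handy for checking how a table will be parsed before involving the browser. A minimal sketch, assuming a made-up two-row table (the data are illustrative and not taken from the toy page):

```r
library(XML)

# Illustrative table; the contents are invented for this sketch.
html <- '<html><body>
<table>
  <tr><th>Fruit</th><th>Units</th></tr>
  <tr><td>Mango</td><td>3</td></tr>
  <tr><td>Cherry</td><td>12</td></tr>
</table>
</body></html>'

# Parse the string explicitly as text, then read every table in it.
doc <- htmlParse(html, asText = TRUE)
tables <- readHTMLTable(doc, header = TRUE, stringsAsFactors = FALSE)

# readHTMLTable returns a list with one data frame per table found.
fruitTable <- tables[[1]]
```

Because the result is a list, a page with several tables needs an index (or a name, when the table has an id) to pick the one of interest.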

To close the window, we click the close button (named x in the HTML code):
buttonElement <- firefox$findElementByXPath("//*[@id='myModal']/div/div/div/button")
buttonElement$click()

Select

The select tag has more methods than the other tags. The following code finds the select element and prints its HTML code (the getHtml function is another way to obtain it).
selectElement <- firefox$findElementByXPath("//*[@id='main_content']/select")
selectElement$printHtml()
Result:
<select multiple="multiple">
  <option value="Mango"> Mango </option>
  <option value="Nectarine"> Nectarine </option>
  <option value="Cherry"> Cherry </option>
</select>

Now we can access its options.
optsList <- selectElement$getOptions() 
optsList is a list of webElements. To get the names of the options, we call getText on each of its elements.
sapply(optsList, function(optEle){
  optEle$getText()
})

The items in the list can be selected in many ways (see the package documentation). A simple example is:
selectElement$selectByValue("Mango")
selectElement$selectByValue("Nectarine")
To get a list of all selected options and afterwards deselect them all:
optsSel <- selectElement$getAllSelectedOptions() 
sapply(optsSel, function(optEle){
  optEle$getText()
})
selectElement$deselectAll()
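
The two sapply calls above repeat the same pattern, so it may be convenient to wrap it in a small helper. The sketch below uses a hypothetical function name, optionTexts, which is not part of relenium; it relies only on each element exposing getText. vapply is used instead of sapply so that the result is always a character vector, even for an empty list. The stand-in list mimics webElements so that the code runs without a browser.

```r
# Hypothetical helper (not part of relenium): collect the text of each element.
optionTexts <- function(elements) {
  vapply(elements, function(optEle) optEle$getText(), character(1))
}

# Stand-in objects that only imitate the getText method of a webElement.
fakeOption <- function(text) {
  list(getText = function() text)
}
fakeOptions <- list(fakeOption("Mango"), fakeOption("Nectarine"))

selectedNames <- optionTexts(fakeOptions)
emptyNames <- optionTexts(list())
```

With a live session, the same helper could be called as optionTexts(selectElement$getOptions()) or optionTexts(selectElement$getAllSelectedOptions()).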

Navigation

Finally, we show an example of back/forward navigation and we close the webdriver.
firefox$get("http://lluisramon.github.io/relenium/")
firefox$back()
firefox$close()
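
If a script stops with an error halfway through, the close call is never reached and the browser window may stay open. One possible safeguard, sketched here with a hypothetical helper withBrowser (not part of relenium), is to register the close call with on.exit. The fakeBrowser list is a stand-in so that the example runs without relenium; with a real session you would pass the firefox object instead.

```r
# Hypothetical helper (not part of relenium): run an action and always
# close the browser afterwards, even if the action raises an error.
withBrowser <- function(browser, action) {
  on.exit(browser$close(), add = TRUE)
  action(browser)
}

# Stand-in browser that records whether close was called.
closed <- FALSE
fakeBrowser <- list(
  get = function(url) invisible(url),
  close = function() closed <<- TRUE
)

result <- withBrowser(fakeBrowser, function(b) {
  b$get("http://lluisramon.github.io/relenium/")
  "done"
})
```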