Learn Object-Oriented Programming Through R | by Uğurcan Demir | Apr, 2022

An introduction to the OO system


R is a great programming language for many reasons. But an overwhelming majority of R users are unaware of all the benefits R provides for them.

Object-oriented programming is at the top of the list of benefits that R users constantly neglect. Part of the reason is that the R ecosystem is extremely well suited to a quick dive into programming, and this convenience has led the R community to underappreciate object-oriented software design.

Object-oriented programming is, along with procedural programming, one of the two widely used programming paradigms in the software world. We will not try to teach object-oriented programming from zero, and we will assume that the reader has some familiarity with the basic concepts of object-oriented programming such as classes, encapsulation, inheritance and polymorphism.

Compared to other object-oriented languages like Python, Java, and C++, R has a different syntax, but the basic concepts stay the same. There are two main class systems for applying object orientation in R, namely S3 and S4. We will examine them in detail. Let’s get started!

Here we create a simple linear model and discuss the structure of S3 classes on the basis of the following example:

v1 <- rnorm(n = 5 , mean = 100 , sd = 10)
v2 <- v1 + rnorm(n = 5 , mean = 0 , sd = 10)
reg <- lm(v2 ~ v1)
class(reg)
## [1] "lm"

Once we strip the class attribute with unclass(), an S3 object such as reg is essentially a list.
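The comparison between the classed and unclassed model is not reproduced here, so the following is a minimal sketch of it. It rebuilds the model from the example; the set.seed() call and the name reg_plain are additions made only for this illustration.

```r
# Rebuild the small model from the example (set.seed is added here only so
# the sketch is reproducible; it is not part of the original code).
set.seed(1)
v1 <- rnorm(n = 5, mean = 100, sd = 10)
v2 <- v1 + rnorm(n = 5, mean = 0, sd = 10)
reg <- lm(v2 ~ v1)

# With the class attribute in place, print() dispatches to the lm method
# and shows the compact "Call / Coefficients" summary.
print(reg)

# Without the class attribute, what remains is a plain list.
reg_plain <- unclass(reg)
class(reg_plain)   # "list"
names(reg_plain)   # the components stored inside an lm object
```

Calling print(reg_plain) would dump every one of those components in full, which is the difference the text refers to.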

To see why the print() function worked differently on classed and unclassed linear models, we inspect the print() function a little bit deeper.

print
## function (x, ...) 
## UseMethod("print")
## <bytecode: 0x0000000013082240>
## <environment: namespace:base>

To see all the class methods associated with the print() function, we run the code below:

methods(print)
## [1] print.acf*
## [2] print.AES*
## [3] print.anova*
...
## [133] print.lm*
...
## [228] print.xgettext*
## [229] print.xngettext*
## [230] print.xtabs*
## see '?methods' for accessing help and source code

In R 4.0.2, the version I am using right now, print() has 230 class methods, and print.lm* is number 133 on the list. The asterisk (*) means that the method is not exported from its package's namespace, so typing print.lm without parentheses, as we did with print above, does not show its source. We use getAnywhere() for that.
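As a hedged sketch of that lookup: getAnywhere() searches the loaded namespaces for the name, and getS3method() asks for the method by generic and class. Both live in the utils package; the variable names found and lm_printer are made up for this example.

```r
# print.lm is registered as an S3 method but is typically not exported from
# the stats namespace, so it has to be looked up explicitly.
found <- utils::getAnywhere("print.lm")   # searches loaded namespaces
found$where                               # where the matches were found

# Alternative: request the method by generic name and class name.
lm_printer <- utils::getS3method("print", "lm")
is.function(lm_printer)
```

Printing found or lm_printer at the console shows the source of the method.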

To list the generic functions that have a default method, run the following:

methods(class = 'default')
##   [1] add1            aggregate       AIC             all.equal
....
## [157] wilcox.test window with xtfrm
## see '?methods' for accessing help and source code

We have seen how S3 classes use methods and attributes. Now it is time to construct our own S3 class. The first step is to create a list whose elements are the attributes. Then we assign a class name to that list with the class() function. That completes the whole process. This might seem strange, since we create the object first and the class afterwards, whereas in other prominent object-oriented languages it is the opposite.

student1 <- list(name="Ugurcan", semester = 3 , statStudent = TRUE)
student2 <- list(name="Mark", semester = 7 , statStudent = FALSE)
class(student1) <- 'student'
class(student2) <- 'student'

Here, we first created two instances in the form of lists and then created their class simply by naming it.

What happens when we print them?
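The printed result is not shown here, so this is a small sketch of what to expect, assuming no print.student method has been defined yet in the session. The variable out is introduced only to hold the captured text.

```r
student1 <- list(name = "Ugurcan", semester = 3, statStudent = TRUE)
class(student1) <- "student"

# With no print.student available, dispatch falls through to print.default,
# which shows the list elements followed by the class attribute.
out <- capture.output(print(student1))
writeLines(out)
```

The last lines of the output are the attr(,"class") entry, which is the only visible sign that the list carries a class.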

As we mentioned, print() is a generic function, and so far there is no class method for student. This is why calling print() on those two instances prints them as if they were plain lists. Let’s create our class method for the generic function print().

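print.student <- function(x, ...) {
  # Use the same argument names as the print() generic: x and ...
  cat(x$name , "\n")
  cat("semester" , x$semester , "\n")
  cat("Is he/she a Statistics student ?" , x$statStudent , "\n")
  invisible(x)  # print methods conventionally return their argument invisibly
}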

Let’s check if it works below:
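The check itself is missing from the text, so here is a self-contained sketch of it. It repeats the class method with proper newline escapes so the block runs on its own; out1 and out2 are names used only for this illustration.

```r
student1 <- structure(list(name = "Ugurcan", semester = 3, statStudent = TRUE),
                      class = "student")
student2 <- structure(list(name = "Mark", semester = 7, statStudent = FALSE),
                      class = "student")

print.student <- function(x, ...) {
  cat(x$name, "\n")
  cat("semester", x$semester, "\n")
  cat("Is he/she a Statistics student ?", x$statStudent, "\n")
  invisible(x)
}

# print() now dispatches to print.student for both instances.
out1 <- capture.output(print(student1))
out2 <- capture.output(print(student2))
writeLines(c(out1, out2))
```

Typing student1 on its own at the console goes through the same method, because auto-printing calls print().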

Excellent! We have got two instances with appropriate attributes, one class, and one class method.

Inheritance is an important topic, and R implements it quite differently from other languages. To create an instance of a subclass, we again create the instance first, but instead of passing a single string as the class name, we pass a character vector whose elements give the class names in hierarchical order, from the most specific class to the most general.

student3 <- list(name="Lauren", semester = 6 , statStudent = FALSE , dorm = TRUE)
class(student3) <- c("dormStudents" , "student")
print(student3)
## Lauren
## semester 6
## Is he/she a Statistics student ? FALSE
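The output above comes from print.student, because there is no method for dormStudents and dispatch moves on to the next class in the vector. As a sketch of how a subclass could extend the parent behaviour, the hypothetical print.dormStudents below (not part of the original article) prints the extra field and then hands over with NextMethod().

```r
print.student <- function(x, ...) {
  cat(x$name, "\n")
  cat("semester", x$semester, "\n")
  invisible(x)
}

# Hypothetical subclass method: print the extra field first, then delegate
# to the next class in the class vector ("student") via NextMethod().
print.dormStudents <- function(x, ...) {
  cat("lives in dorm:", x$dorm, "\n")
  NextMethod()
  invisible(x)
}

student3 <- list(name = "Lauren", semester = 6, statStudent = FALSE, dorm = TRUE)
class(student3) <- c("dormStudents", "student")

inherits(student3, "student")        # TRUE: "student" is in the class vector
inherits(student3, "dormStudents")   # TRUE
out <- capture.output(print(student3))
writeLines(out)
```

The order of the class vector matters: methods are tried from left to right, so the most specific class should come first.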

We have created class methods for generic functions that already existed. When you work on big projects, you may need to create generic functions as well. This is how we do it in the S3 way.

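We create a function whose body calls UseMethod() with the generic's own name. UseMethod() then dispatches to the method that matches the class of the argument.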

advance <- function(x , ...) {
UseMethod(generic = "advance" )
}

That’s it. We have a generic function. Now we write a class method for that generic function.

advance.student <- function(x , ...) {
  # Accept ... so the method's arguments stay consistent with the generic
  x$semester <- x$semester + 1
  return(x)
}

We move student1 up one semester.

student1 <- advance(student1)
student1
## Ugurcan
## semester 4
## Is he/she a Statistics student ? TRUE
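If advance() is called on an object that has no matching class method, UseMethod() looks for advance.default as a last resort and otherwise stops with a "no applicable method" error. A minimal sketch of a fallback, where advance.default and its message are assumptions made for this example:

```r
advance <- function(x, ...) {
  UseMethod("advance")
}

advance.student <- function(x, ...) {
  x$semester <- x$semester + 1
  x
}

# Hypothetical fallback: reached when no advance.<class> method matches.
advance.default <- function(x, ...) {
  stop("advance() is not defined for class '", class(x)[1], "'")
}

s <- structure(list(name = "Mark", semester = 7, statStudent = FALSE),
               class = "student")
s <- advance(s)
s$semester   # 8

# A plain number has no student class, so the default method is used.
msg <- tryCatch(advance(42), error = function(e) conditionMessage(e))
msg
```

A default method like this gives a clearer error than the generic dispatch failure, at the cost of one more function to maintain.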

S4 classes are the other object-oriented system in R. The difference between S3 and S4 is best explained in terms of the historical development of the language.

R was inspired by another language, S, which was developed in the 1970s at Bell Laboratories. The S3 class system comes from the S language. Over time, as new object-oriented languages rose to prominence, S3 classes started to seem outdated because they lacked the robustness and safety that those languages offered.

In response, S4 classes were introduced to prevent the spelling mistakes and misassigned attributes to which S3 classes are very prone.

We define an S4 class with the function setClass().

# slots = is the current way to declare slots (representation = is the older form)
setClass(Class = "student" , slots = c(
  name = "character",
  semester = "numeric" ,
  statStudent = "logical"
))

Unlike S3, here we have first created the class and then the instance. Each instance is created with the new() function.
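The calls that create the instances are not shown in the text, so here is a minimal sketch. It repeats the class definition so it can run on its own; the slot values follow the S3 example and are otherwise illustrative.

```r
library(methods)

setClass("student", slots = c(
  name = "character",
  semester = "numeric",
  statStudent = "logical"
))

# new() takes the class name followed by one named argument per slot.
student1 <- new("student", name = "Uğurcan", semester = 3, statStudent = TRUE)
student2 <- new("student", name = "Mark", semester = 7, statStudent = FALSE)

is(student1, "student")   # TRUE
isVirtualClass("student") # FALSE: the class can be instantiated
```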

Note that the attributes of S4 classes are called slots. One way to access them is to use the @ sign, instead of $ as we did in S3 classes. The second way to access an S4 attribute is to use the slot() function. You can see both of the ways in the examples below:

student1@semester
## [1] 3
slot(object = student1 , name = "name")
## [1] "Uğurcan"

If we misspell a slot name or supply a value of the wrong type, S4 refuses to create the object and raises an error. S3 classes, on the other hand, are just glorified lists and perform no such checks.
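A short sketch of that difference, with the errors caught by tryCatch() so the script keeps running. The misspelled slot nmae and the value "seven" are deliberate mistakes invented for this example.

```r
library(methods)

setClass("student", slots = c(
  name = "character",
  semester = "numeric",
  statStudent = "logical"
))

# A misspelled slot name is rejected when the object is created.
typo <- tryCatch(
  new("student", nmae = "Mark", semester = 7, statStudent = FALSE),
  error = function(e) conditionMessage(e)
)

# A value of the wrong type is rejected as well.
wrong_type <- tryCatch(
  new("student", name = "Mark", semester = "seven", statStudent = FALSE),
  error = function(e) conditionMessage(e)
)

# The S3 version accepts both mistakes silently.
s3 <- list(nmae = "Mark", semester = "seven", statStudent = FALSE)
class(s3) <- "student"
is.null(s3$name)   # TRUE: the misspelling goes unnoticed until the field is used
```

Both typo and wrong_type end up holding the error message rather than an object, which is the safety net the text describes.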

To implement a class method on an S4 class, we use the setMethod() function. We will create a class method for the show() generic function, which is the S4 equivalent of S3’s print(). As you might guess, show() is what runs when we just type the name of an S4 object at the console, so calling show() explicitly gives the same output.

And this is how we use setMethod() to create S4 class methods:

setMethod("show" , "student" , function(object) {
  cat(object@name , "\n")
  cat("semester" , object@semester , "\n")
  cat("Is he/she a Statistics student ?" , object@statStudent , "\n")
})

Let’s see if it works.

show(student1)
## Uğurcan 
## semester 3
## Is he/she a Statistics student ? TRUE

To inherit another class in S4, we pass the name of the superclass to the contains parameter in the setClass() function.

setClass(Class = "dormStudents" ,
  slots = c(dorm = "logical") ,
  contains = "student")

Let’s create a new instance for our subclass:
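The instance creation is not shown in the text, so the following is a sketch of it. Both class definitions are repeated so the block runs on its own, and the slot values mirror the S3 example.

```r
library(methods)

setClass("student", slots = c(
  name = "character",
  semester = "numeric",
  statStudent = "logical"
))
setClass("dormStudents", slots = c(dorm = "logical"), contains = "student")

# The subclass instance takes the inherited slots plus its own.
student3 <- new("dormStudents", name = "Lauren", semester = 6,
                statStudent = FALSE, dorm = TRUE)

is(student3, "student")     # TRUE: a dormStudents object is also a student
student3@name               # inherited slot
student3@dorm               # slot added by the subclass
slotNames("dormStudents")   # own and inherited slot names
```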

Creating a generic function in the S4 system is similar to S3. We register the generic with setGeneric(), and its body hands control to method dispatch through standardGeneric().

setGeneric("advance" , function(object , ...) {
  standardGeneric("advance")
})

Now, we create a class method for that generic function.

# setMethod() registers the method itself, so no assignment is needed
setMethod(f = "advance" ,
  signature = "student" ,
  definition = function(object , ...) {
    object@semester <- (object@semester + 1)
    return(object)
  })

And it works.

student1 <- advance(student1)
student1
## Uğurcan
## semester 4
## Is he/she a Statistics student ? TRUE
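To tie the S4 pieces together, here is a self-contained sketch of the generic, its method, and how a subclass picks the method up through inheritance. It assumes the generic is registered with setGeneric(), and the checks with isGeneric() and hasMethod() are additions for illustration.

```r
library(methods)

setClass("student", slots = c(
  name = "character",
  semester = "numeric",
  statStudent = "logical"
))
setClass("dormStudents", slots = c(dorm = "logical"), contains = "student")

# Register the generic, then attach a method for the parent class.
setGeneric("advance", function(object, ...) standardGeneric("advance"))
setMethod("advance", "student", function(object, ...) {
  object@semester <- object@semester + 1
  object
})

isGeneric("advance")                    # TRUE
hasMethod("advance", "dormStudents")    # TRUE, found through inheritance

student3 <- new("dormStudents", name = "Lauren", semester = 6,
                statStudent = FALSE, dorm = TRUE)
student3 <- advance(student3)   # uses the method defined for "student"
student3@semester               # 7
is(student3, "dormStudents")    # TRUE: the subclass is preserved
```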

The method body is nearly the same as in the S3 version. Just note that we only changed $ to @.

R is addictive. It wouldn’t be surprising if you began R programming just by writing simple scripts. As your projects grow, you will need the weapon of object-oriented design to fight messy and unstructured code. Or, who knows, maybe you will start writing your own R packages. And when that time comes, I hope this article will be a helpful reference for you.
