Learn R in simple, easy steps, from basic to advanced concepts, with examples. If you are a beginner trying to understand the R programming language, this tutorial will give you a working understanding of almost all of its concepts, from which you can progress to higher levels of expertise.
R is one of the most popular analytics tools. Beyond analytics, R is also a programming language.
R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows, and macOS.
R is a programming language developed by Ross Ihaka and Robert Gentleman in 1993. R has an extensive catalog of statistical and graphical methods, including machine learning algorithms, linear regression, time series, and statistical inference, to name a few.
R is trusted not only in academia; many large companies also use the R programming language, including Uber, Google, Airbnb, and Facebook.
Data analysis with R is done in a series of steps: programming, transforming, discovering, modeling, and communicating the results.
Data science is shaping the way companies run their businesses. Companies that stay away from Artificial Intelligence and Machine Learning risk falling behind. The big question is: which tool or language should you use?
There are plenty of tools available on the market to perform data analysis, and learning a new language requires some time investment. The picture below depicts the learning curve compared to the business capability a language offers. The negative relationship implies that there is no free lunch: if you want to get the best insight from the data, you need to spend some time learning the appropriate tool, which is R.
A variable is a name given to a memory location that stores a value in a computer program. Variables in R can store numbers (real and complex), words, matrices, and even tables. R is a dynamically typed language, which means that, unlike statically typed languages, we do not have to declare the data type of a variable before using it in a program.
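As a small illustration of dynamic typing, the sketch below reuses one variable for two different kinds of value. The variable names are made up for this example.

```r
# `value` is a hypothetical variable name chosen for illustration.
value <- 42                     # the variable holds a number
class_before <- class(value)

value <- "forty-two"            # the same name now holds text
class_after <- class(value)

print(class_before)
print(class_after)
```

No type declaration is needed at any point; the class simply follows whatever object is currently assigned.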
For a variable to be valid, it should follow these rules
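In base R, a syntactically valid name uses only letters, digits, dots, and underscores, does not start with a digit or an underscore, does not start with a dot followed by a digit, and is not a reserved word. One rough way to check a candidate is to compare it with the output of `make.names()`, which rewrites a string into a valid name. The helper `is_valid_name()` below is a hypothetical convenience function, not part of R.

```r
# is_valid_name() is a hypothetical helper, not part of base R.
# make.names() turns any string into a syntactically valid name, so a
# string that comes back unchanged was already valid.
is_valid_name <- function(x) {
  make.names(x) == x
}

candidates <- c("total_sum", ".hidden", "2iota", ".2iota", "_iota", "if")
validity <- is_valid_name(candidates)
names(validity) <- candidates
print(validity)
```

Only the first two candidates pass the check; the others break one of the rules listed above.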
Data types describe the kind of information a variable stores. In R, we do not need to declare a variable as some data type. Variables are assigned R objects, and the data type of the R object becomes the data type of the variable. There are six main data types in R:
Vectors are the most basic R data objects and there are six types of atomic vectors. Below are the six atomic vectors:
Eg: 25, 7.1145, 96547
Eg: 45.479, -856.479, 0
Eg: 4+3i
Eg: "Edureka", 'R is Fun to learn'
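The six atomic vector types in base R are logical, integer, numeric (double), complex, character, and raw. The sketch below creates one value of each and asks R for its class. The variable names are chosen only for this example.

```r
# One value of each atomic type; variable names are illustrative only.
logical_value   <- TRUE
numeric_value   <- 7.1145
integer_value   <- 25L                 # the L suffix requests an integer
complex_value   <- 4 + 3i
character_value <- "R is fun to learn"
raw_value       <- charToRaw("A")      # raw bytes

types <- c(
  class(logical_value), class(numeric_value), class(integer_value),
  class(complex_value), class(character_value), class(raw_value)
)
print(types)
```

Note that a plain number such as 25 is stored as numeric unless the `L` suffix is used.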
Vtr = c(2, 5, 11, 24)
Or
Vtr <- c(2, 5, 11, 24)
Let us move forward and understand other data types in R.
Lists are similar to vectors, but a list is an R object that can contain elements of different types, such as numbers, strings, vectors, and even another list.
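A minimal sketch of a list that mixes types, including a nested list. The element names (`id`, `label`, `scores`, `nested`) are hypothetical.

```r
# Element names are made up for illustration.
record <- list(
  id     = 7,
  label  = "sample",
  scores = c(2, 5, 11, 24),
  nested = list(flag = TRUE)
)

str(record)                              # compact view of each element's type

second_score <- record$scores[2]         # access by name, then by position
nested_flag  <- record[["nested"]]$flag  # reach into the inner list
```

Elements can be reached with `$name` or `[["name"]]`; single brackets would return a sub-list instead of the element itself.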
Arrays in R are data objects that can store data in more than two dimensions. The array() function takes vectors as input and uses the values in the dim parameter to create an array.
The basic syntax for creating an array in R is:
array(data, dim, dimnames)
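Where data is the input vector, dim is a vector giving the size of each dimension, and dimnames is an optional list of names for each dimension.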
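As a sketch of the syntax in use, the example below builds a 3 x 3 x 2 array from two vectors. The vectors and dimension names are invented for illustration. Because only 9 values are supplied for 18 cells, R recycles the data to fill the second slice.

```r
# Two input vectors are combined and laid out column by column.
v1 <- c(1, 2, 3)
v2 <- c(4, 5, 6, 7, 8, 9)

arr <- array(
  c(v1, v2),
  dim = c(3, 3, 2),                  # 3 rows, 3 columns, 2 slices
  dimnames = list(
    c("r1", "r2", "r3"),
    c("c1", "c2", "c3"),
    c("m1", "m2")
  )
)

print(dim(arr))
element <- arr["r2", "c3", "m1"]     # row r2, column c3, first slice
```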
A matrix is an R object in which the elements are arranged in a two-dimensional rectangular layout.
The basic syntax for creating a matrix in R is:
matrix(data, nrow, ncol, byrow, dimnames)
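Where data is the input vector, nrow and ncol are the numbers of rows and columns, byrow is a logical value that fills the matrix row by row when TRUE (the default fills column by column), and dimnames is an optional list of row and column names.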
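A short sketch of `matrix()` with every argument spelled out. The row and column names are arbitrary.

```r
# Fill a 2 x 3 matrix row by row; names are illustrative only.
m <- matrix(
  1:6,
  nrow = 2,
  ncol = 3,
  byrow = TRUE,
  dimnames = list(c("row1", "row2"), c("a", "b", "c"))
)

print(m)
value <- m["row2", "b"]    # second row, second column
```

With `byrow = TRUE` the first row holds 1, 2, 3 and the second row holds 4, 5, 6; leaving it at the default would fill down the columns instead.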
A data frame is a table or two-dimensional array-like structure in which each column contains the values of one variable and each row contains one set of values, one from each column. Below are some of the characteristics of a data frame that need to be considered every time we work with one:
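A small sketch of creating and querying a data frame. The column names and values are invented for this example.

```r
# Column names and values are made up for illustration.
employees <- data.frame(
  id     = 1:3,
  name   = c("Ana", "Ben", "Chloe"),
  salary = c(52000, 61000.5, 58000),
  stringsAsFactors = FALSE
)

str(employees)             # one line per column, with its type

n_rows    <- nrow(employees)
high_paid <- employees$name[employees$salary > 55000]
```

Each column keeps its own type, and every column has the same number of rows.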
Factors are data objects used to categorize data and store it as levels. They can be built from both strings and integers, and they are useful in data analysis for statistical modeling.
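The sketch below turns a character vector into a factor with an explicit level order. The size labels are arbitrary sample data.

```r
# Sample category labels, chosen for illustration.
sizes <- c("small", "large", "medium", "small", "large")
size_factor <- factor(sizes, levels = c("small", "medium", "large"))

print(levels(size_factor))
print(table(size_factor))            # count of observations per level

codes <- as.integer(size_factor)     # the integer code stored for each value
```

Internally each value is stored as an integer code that points at its level, which is why factors work well as categorical inputs to models.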
There are four main types of operators in R, described below:
Arithmetic operators help us perform basic arithmetic operations such as addition, subtraction, and multiplication.
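For illustration, here are R's arithmetic operators applied to two sample numbers. The variable names are arbitrary.

```r
a <- 17
b <- 5

sum_ab       <- a + b     # addition
difference   <- a - b     # subtraction
product      <- a * b     # multiplication
quotient     <- a / b     # division
remainder    <- a %% b    # modulus (remainder after division)
int_quotient <- a %/% b   # integer division
power        <- a ^ 2     # exponentiation
```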
Relational operators help us perform comparisons, such as checking whether a variable is greater than, less than, or equal to another variable. The output of a relational operation is always a logical value.
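A brief sketch of the relational operators on sample values. Each comparison yields TRUE or FALSE, and comparisons on vectors are evaluated element by element.

```r
x <- 10
y <- 3

results <- c(
  greater   = x > y,
  less      = x < y,
  equal     = x == y,
  not_equal = x != y,
  at_least  = x >= y,
  at_most   = x <= y
)
print(results)

# Comparisons are vectorised: one logical result per element.
elementwise <- c(1, 5, 9) > 4
```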
Assignment operators are used to assign values to variables in R. An assignment can be written with either the arrow operator (<-) or the equals operator (=). A value can be assigned in two directions: left assignment (<- or =) and right assignment (->).
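The three common assignment forms can be sketched as follows. The variable names are chosen only to show which operator was used.

```r
left_arrow <- 5       # left assignment with <-
left_equal = 5        # left assignment with =
5 -> right_arrow      # right assignment with ->

all_same <- left_arrow == left_equal && left_equal == right_arrow
print(all_same)
```

All three forms bind the same value; `<-` is the form most R code uses by convention.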
Logical operators combine or negate boolean (logical) values. They are 'and' (& and &&), 'or' (| and ||), and 'not' (!).
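A short sketch of the logical operators on sample logical vectors. The single-character forms work element by element, while `&&` and `||` operate on single values.

```r
p <- c(TRUE, TRUE, FALSE)
q <- c(TRUE, FALSE, FALSE)

and_result <- p & q     # element-wise AND
or_result  <- p | q     # element-wise OR
not_result <- !p        # element-wise NOT

# && and || take single values and are typical inside if conditions.
scalar_and <- TRUE && FALSE
scalar_or  <- TRUE || FALSE
```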
The if statement evaluates a single expression as part of the flow. To perform this evaluation, write the if keyword (lowercase, since R is case sensitive) followed by the expression to be evaluated in parentheses.
The else if statement extends the flow created by the if statement with additional branches, giving you the opportunity to evaluate multiple conditions.
The else statement is used when all the other expressions have been checked and found to be FALSE. It is the last branch of the if / else if chain and runs only when no earlier branch does.
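The three branches can be sketched together in one small function. The helper `classify()` is hypothetical and exists only for this example.

```r
# classify() is a hypothetical helper used for illustration.
classify <- function(x) {
  if (x > 0) {
    "positive"
  } else if (x < 0) {
    "negative"
  } else {
    "zero"              # reached only when both conditions are FALSE
  }
}

labels <- c(classify(5), classify(-2), classify(0))
print(labels)
```

Conditions are tested from top to bottom, and only the first matching branch runs.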
A loop statement allows us to execute a statement or group of statements multiple times. There are three main types of loops in R:
The repeat loop executes a statement or group of statements over and over until a break statement is reached. It is a good example of an exit-controlled loop: the code is executed first, and only then is a condition checked (inside the body) to determine whether control should stay in the loop or exit from it.
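A minimal sketch of a repeat loop, assuming a simple counter as the exit condition. In R the exit check is written inside the body and leaves the loop with `break`.

```r
repeat_count <- 0

repeat {
  repeat_count <- repeat_count + 1   # the body runs before any exit check
  if (repeat_count >= 3) {
    break                            # explicit exit; without it the loop never ends
  }
}

print(repeat_count)
```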
The while loop repeats a statement or group of statements while a given condition is TRUE. It differs slightly from the repeat loop: it is an entry-controlled loop, where the condition is checked first and control enters the loop body only if the condition is TRUE.
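A small sketch of entry-controlled behaviour, using invented variable names. The second loop shows that the body is skipped entirely when the condition is FALSE at the start.

```r
while_count <- 0

while (while_count < 3) {        # condition is checked before every pass
  while_count <- while_count + 1
}

body_ran <- FALSE
while (FALSE) {                  # condition is FALSE at entry, so the body never runs
  body_ran <- TRUE
}
```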
The for loop repeats a statement or group of statements a fixed number of times, once for each element of a vector or sequence. Unlike the repeat and while loops, the for loop is used in situations where we know beforehand how many times the code needs to be executed. It is similar to the while loop in that the check comes first: the body runs only while there are elements left to process.
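The sketch below loops over a sample vector in two common ways: directly over the elements, and over their positions with `seq_along()`.

```r
values <- c(2, 5, 11, 24)

# Iterate over the elements themselves.
total <- 0
for (v in values) {
  total <- total + v
}

# Iterate over positions when the index is needed.
squares <- numeric(length(values))
for (i in seq_along(values)) {
  squares[i] <- values[i]^2
}

print(total)
print(squares)
```

The number of passes equals the length of the vector, so it is known before the loop starts.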
A function is a block of organized, reusable code that is used to perform a single, related action. There are mainly two types of functions in R:
A user-defined function is a function written by the user of a program or environment, in contrast to the functions that are already built into that program or environment.
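As a sketch, here is a user-defined function next to a call to a built-in one. The name `rectangle_area()` and its parameters are hypothetical.

```r
# rectangle_area() is a hypothetical user-defined function.
# height has a default value, so it can be omitted by the caller.
rectangle_area <- function(width, height = 1) {
  width * height        # the last evaluated expression is the return value
}

area_default <- rectangle_area(4)
area_named   <- rectangle_area(width = 4, height = 2.5)

# sum() is a built-in function, shown for comparison.
built_in_total <- sum(c(2, 5, 11, 24))
```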
The built-in rep() function repeats a value or a series a given number of times. For example, the series 1:6 can be repeated twice, as you can see in the image. Such data can be created whenever required, and the individual elements of a series can also be repeated.
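A minimal sketch of both kinds of repetition with base R's `rep()`. The `times` argument repeats the whole series and `each` repeats every element.

```r
twice      <- rep(1:6, times = 2)   # the whole series, two times over
each_twice <- rep(1:3, each = 2)    # every element repeated in place

print(twice)
print(each_twice)
```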