Recapitulation of Chapter-1
Bhaswati Ramanujam
In Chapter 1, we introduced R as a powerful statistical programming language widely used in data analysis. The chapter outlined R's key features, such as its extensive package ecosystem and strong data visualization capabilities, which make it a popular choice for analytics professionals. It also highlighted the benefits of R, including its flexibility, open-source nature, and active community support. Finally, it provided a brief guide on how to download and set up R, along with an overview of basic usage.
In Chapter 2, we are going to explore the basics of R programming in RStudio, how to assign variables, and the different kinds of variables that can be assigned.
1. Basics of R Programming
1.1. Writing and Executing Code:
Let's write some simple code in the R console:
In R, arithmetic operations are performed using standard operators such as + for addition, - for subtraction, * for multiplication, and / for division. R also supports exponentiation with ^ and modulus with %%. These operations can be applied to numbers, vectors, and matrices, allowing for versatile data manipulation and calculation.
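As a minimal sketch of the operators described above (the values and variable names are illustrative and not from the original article), the following can be typed at the console:

```r
# Basic arithmetic at the R console (illustrative values)
a <- 7 + 3        # addition
b <- 7 - 3        # subtraction
c_prod <- 7 * 3   # multiplication
d <- 7 / 2        # division

print(c(a, b, c_prod, d))   # [1] 10.0  4.0 21.0  3.5

# The same operators work element-wise on vectors
v <- c(1, 2, 3) * 2
print(v)                    # [1] 2 4 6
```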
In R, exponentiation can be performed using the ^ operator. This operator raises a number (the base) to the power of an exponent. Here's how it works:
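A minimal sketch of the ^ operator, using illustrative values that are not from the original article:

```r
# Exponentiation: base on the left, exponent on the right
cubed  <- 2^3            # 2 * 2 * 2
root   <- 9^0.5          # a fractional exponent gives a root
powers <- c(1, 2, 3)^2   # applied element-wise to a vector

print(cubed)    # [1] 8
print(root)     # [1] 3
print(powers)   # [1] 1 4 9
```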
In R, the modulus operation, which gives the remainder of a division, is performed using the %% operator.
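For instance, a short sketch with illustrative values (the variable names are assumptions made for this example):

```r
# Modulus: the remainder left after division
remainder <- 10 %% 3      # 10 = 3 * 3 + 1, so the remainder is 1
is_even   <- 8 %% 2 == 0  # a common test for even numbers

print(remainder)  # [1] 1
print(is_even)    # [1] TRUE
```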
In R, variables are assigned using the assignment operator <-. Variables can be of different types. Let's discuss each type of variable one by one:
1. Numeric Variables:
Numeric variables can store integers, floating-point numbers, or any real number. Here's how you can work with numeric variables in R:
Integers
Floating-point numbers
Real numbers
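The three cases listed above might look like this in practice. This is a sketch with made-up variable names and values; note that a whole number typed without the L suffix is still stored as a double:

```r
# Numeric variables (illustrative names and values)
count       <- 25      # a whole number, stored as a double unless written 25L
price       <- 19.99   # a floating-point number
temperature <- -4.5    # a negative real number

print(count)        # [1] 25
print(price)        # [1] 19.99
print(temperature)  # [1] -4.5
```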
To check whether a variable is numeric, you can use the class(), typeof(), or is.numeric() functions:
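A short sketch of these three checks, applied to a hypothetical variable named value:

```r
value <- 19.99   # hypothetical example variable

value_class   <- class(value)       # the class R reports for the object
value_type    <- typeof(value)      # the internal storage type
value_numeric <- is.numeric(value)  # TRUE for doubles and integers

print(value_class)    # [1] "numeric"
print(value_type)     # [1] "double"
print(value_numeric)  # [1] TRUE
```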
In R, you can convert a floating-point number to an integer and vice versa. There are several ways to convert a floating-point number to an integer, depending on how you want to handle the decimal part. Here are the most common approaches:
1. Using as.integer()
The as.integer() function converts a numeric value to an integer by truncating (removing) the decimal part.
> float_num<-3.99
> int_num<-as.integer(float_num)
> print(int_num)
[1] 3
2. Using floor()
The floor() function rounds down to the nearest integer.
> float_num <- 3.99
> int_num <- floor(float_num)
> print(int_num)
[1] 3
3. Using ceiling()
The ceiling() function rounds up to the nearest integer.
> float_num <- 3.99
> int_num <- ceiling(float_num)
> print(int_num)
[1] 4
4. Using round()
The round() function rounds to the nearest integer. You can also specify the number of decimal places to round to with a second argument.
> float_num <- 3.99
> int_num <- round(float_num)
> print(int_num)
[1] 4
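As a sketch of the decimal-places option mentioned above (the variable names are illustrative): round() takes the number of decimal places as its second argument, and a value exactly halfway between two integers is rounded to the even one.

```r
pi_approx <- 3.14159

whole       <- round(pi_approx)      # nearest integer
two_places  <- round(pi_approx, 2)   # keep two decimal places
half_to_even <- round(2.5)           # exact halves go to the even neighbour

print(whole)         # [1] 3
print(two_places)    # [1] 3.14
print(half_to_even)  # [1] 2
```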
5. Using trunc()
The trunc() function truncates the decimal part of the number, effectively similar to as.integer().
> float_num <- 3.99
> int_num <- trunc(float_num)
> print(int_num)
[1] 3
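Two details are easy to miss when comparing these functions, and the sketch below illustrates them with a negative value chosen for this example. First, only as.integer() returns a value of type integer; floor(), ceiling(), round(), and trunc() return whole numbers that are still stored as doubles. Second, the functions differ for negative numbers.

```r
x <- -3.99   # illustrative negative value

as_int  <- as.integer(x)  # drops the decimal part
truncd  <- trunc(x)       # same value as as.integer(), different type
floored <- floor(x)       # rounds toward minus infinity
ceiled  <- ceiling(x)     # rounds toward plus infinity

print(c(as_int, truncd, floored, ceiled))  # [1] -3 -3 -4 -3
print(typeof(as_int))   # [1] "integer"
print(typeof(floored))  # [1] "double"
```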
You can also convert an integer to a floating-point number. R performs this conversion automatically when needed, but you can do it explicitly with the as.numeric() function.
> int_num <- 53L
> print(typeof(int_num))
[1] "integer"
> float_num<-as.numeric(int_num)
> print(typeof(float_num))
[1] "double"
> print(float_num)
[1] 53
as.numeric() Function: This function converts an integer or other types to numeric (which in R is of type double). This allows for floating-point arithmetic and operations that require decimal precision.
Automatic Coercion: R automatically promotes integers to floating-point numbers when performing operations that require a floating-point result.
Example of Automatic Coercion
> int_num <- 42L
> result <- int_num + 0.5
> print(typeof(result))
[1] "double"
> print(result)
[1] 42.5
2. Character Variables:
You can create character variables in R by assigning text to a variable using quotes (" or ').
> name <- "Swati Ramanujam"
> print(name)
[1] "Swati Ramanujam"
To check whether a variable is a character, you can use the class(), typeof(), or is.character() functions:
> class(name)
[1] "character"
> typeof(name)
[1] "character"
> is.character(name)
[1] TRUE
Character Operations
1. Concatenation
You can combine multiple character strings using the paste() or paste0() functions.
> # Using paste() with the default space separator
> full_name<-paste("Swati","Ramanujam")
> print(full_name)
[1] "Swati Ramanujam"
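The paste0() function and the sep argument of paste() are mentioned above but not shown. A brief sketch, with variable names chosen for this example:

```r
# paste() with a custom separator
hyphenated <- paste("Swati", "Ramanujam", sep = "-")

# paste0() joins strings with no separator at all
joined <- paste0("Swati", "Ramanujam")

print(hyphenated)  # [1] "Swati-Ramanujam"
print(joined)      # [1] "SwatiRamanujam"
```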
2. String Length
The nchar() function returns the number of characters in a string.
> length_name <- nchar(full_name)
> print(length_name)
[1] 15
3. Substring Extraction
The substr() function extracts the characters between a start position and a stop position.
> # Extracting characters 1 to 5
> sub_name <- substr(name, 1, 5)
> print(sub_name)
[1] "Swati"
4. Changing Case
You can change the case of characters using tolower() and toupper().
> lower_name <- tolower(name)
> print(lower_name)
[1] "swati ramanujam"
> upper_name <- toupper(name)
> print(upper_name)
[1] "SWATI RAMANUJAM"
Handling Character Vectors
Character vectors can be manipulated similarly to other vector types in R.
> fruits <- c("apple", "banana", "cherry")
> print(fruits)
[1] "apple"  "banana" "cherry"
> more_fruits <- c(fruits, "date", "elderberry")
> print(more_fruits)
[1] "apple"      "banana"     "cherry"     "date"       "elderberry"
Converting Other Types to Character
You can convert numeric or other types of variables to character using the as.character() function.
> num <- 432
> char_num <- as.character(num)
> print(typeof(char_num))
[1] "character"
> print(char_num)
[1] "432"
3. Logical Variables:
Logical variables in R are used to represent Boolean values: TRUE and FALSE. They are fundamental in controlling the flow of programs through conditional statements, loops, and logical operations.
Creating Logical Variables
You can create logical variables by directly assigning the values TRUE or FALSE.
> is_honest<-TRUE
> is_dishonest<-FALSE
> print(is_honest)
[1] TRUE
> print(is_dishonest)
[1] FALSE
Checking the Type
You can check the type of a variable using the typeof() function.
> print(typeof(is_honest))
[1] "logical"
> print(typeof(is_dishonest))
[1] "logical"
Logical Operations
Logical operations can be performed on logical variables, resulting in TRUE or FALSE.
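AND (&) / OR (|)
The AND operation returns TRUE if both operands are TRUE.
The OR operation returns TRUE if at least one operand is TRUE.
The operators & and | work element-wise on vectors, while && and || compare single values, as in the OR example below.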
> result <- is_honest & is_dishonest
> print(result)
[1] FALSE
> result<-is_honest||is_dishonest
> print(result)
[1] TRUE
Logical Comparisons
Logical comparisons between numeric or character values return logical variables.
1. Equal to (==)
Checks if two values are equal.
> x<-5
> y<-7
> x==y
[1] FALSE
2. Not equal to (!=)
Checks if two values are not equal.
> x<-5
> y<-7
> x!=y
[1] TRUE
3. Greater than (>)
Checks if one value is greater than another.
> x<-5
> y<-7
> y>x
[1] TRUE
4. Less than (<)
Checks if one value is less than another.
> x<-5
> y<-7
> x<y
[1] TRUE
5. Greater than or equal to (>=)
Checks if one value is greater than or equal to another.
> x<-5
> y<-7
> y>=x
[1] TRUE
6. Less than or equal to (<=)
Checks if one value is less than or equal to another.
> x<-5
> y<-7
> x<=y
[1] TRUE
Logical Vectors
Logical variables can also be part of vectors, allowing you to perform element-wise logical operations.
> vec1 <- c(TRUE, FALSE, TRUE)
> vec2 <- c(FALSE, FALSE, TRUE)
> result <- vec1 & vec2
> print(result)
[1] FALSE FALSE  TRUE
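The element-wise OR (|) and NOT (!) operators work the same way. In this sketch the result names are illustrative; the last line relies on the fact that TRUE counts as 1 and FALSE as 0 when summed.

```r
vec1 <- c(TRUE, FALSE, TRUE)
vec2 <- c(FALSE, FALSE, TRUE)

either  <- vec1 | vec2   # TRUE where at least one element is TRUE
negated <- !vec1         # flips every element
n_true  <- sum(vec1)     # counts the TRUE values

print(either)   # [1]  TRUE FALSE  TRUE
print(negated)  # [1] FALSE  TRUE FALSE
print(n_true)   # [1] 2
```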
Logical Functions
There are several functions in R that operate on logical vectors:
1. any()
Returns TRUE if at least one element in a logical vector is TRUE.
> vec <- c(FALSE, FALSE, TRUE)
> result <- any(vec)
> print(result)
[1] TRUE
2. all()
Returns TRUE if all elements in a logical vector are TRUE.
> vec <- c(FALSE, TRUE, TRUE)
> result<-all(vec)
> print(result)
[1] FALSE
> vec <- c(TRUE, TRUE, TRUE)
> result<-all(vec)
> print(result)
[1] TRUE
Coercion to Logical Type
You can convert other data types to logical using the as.logical() function. Non-zero numeric values convert to TRUE, zero converts to FALSE, and NA stays NA. Strings such as "TRUE" and "false" are also recognized, while any other string, including the empty string, becomes NA.
> num <- 10
> log_val <- as.logical(num)
> print(log_val)
[1] TRUE
> num <- 0
> log_val <- as.logical(num)
> print(log_val)
[1] FALSE
> num <- NA
> log_val <- as.logical(num)
> print(log_val)
[1] NA
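Character strings can be coerced as well. A small sketch with illustrative inputs: recognized spellings of true and false convert cleanly, and anything else becomes NA.

```r
from_true  <- as.logical("TRUE")   # recognized spelling of TRUE
from_false <- as.logical("false")  # lower case is recognized too
from_other <- as.logical("yes")    # not recognized, so the result is NA
from_empty <- as.logical("")       # the empty string also gives NA

print(c(from_true, from_false, from_other, from_empty))
# [1]  TRUE FALSE    NA    NA
```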
Use Cases of Logical Variables
Logical variables are widely used in conditional statements, loops, and logical operations.
Logical variables and operations are fundamental for decision-making in R programs, allowing you to control the flow of execution based on conditions.
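To illustrate this decision-making role, here is a minimal sketch. The scores and the pass mark of 50 are invented for the example: a logical value drives an if/else branch, and a logical vector selects elements from another vector.

```r
# A logical value controlling an if/else branch
score  <- 72
passed <- score >= 50
if (passed) {
  status <- "pass"
} else {
  status <- "fail"
}
print(status)  # [1] "pass"

# A logical vector used to pick elements out of a vector
scores <- c(45, 72, 88, 30)
passing_scores <- scores[scores >= 50]
print(passing_scores)  # [1] 72 88
```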
4. Factor Variables:
Factor variables in R are used to represent categorical data, which can be either ordered or unordered. Factors are essential for handling categorical variables in statistical modeling and data analysis because they help R understand the data's categorical nature.
Creating Factor Variables
You can create a factor in R using the factor() function.
> fruits<-c("apple", "banana", "orange", "apple", "mango" )
> print(fruits)
[1] "apple"  "banana" "orange" "apple"  "mango"
> fruits <- factor(c("apple", "banana", "orange", "apple", "mango"))
> levels(fruits)
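[1] "apple"  "banana" "mango"  "orange"
In this example, factor() converts the character vector into a factor, and levels() lists its unique categories, sorted alphabetically.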
Checking the Type
You can check how a factor is stored and classified using the typeof() and class() functions.
> print(typeof(fruits))
[1] "integer"
> print(class(fruits))
[1] "factor"
Factors are stored as integers internally, with each unique category (level) mapped to a corresponding integer. The class() function returns "factor" because the variable is recognized as a factor.
Specifying Levels and Order
You can manually specify the levels of a factor, as well as their order, which is particularly important for ordinal data.
> sizes <- factor(c("small", "large", "medium", "large", "small"),
+ levels = c("small", "medium", "large"))
> print(sizes)
[1] small  large  medium large  small
Levels: small medium large
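Specifying levels fixes their display order, but the factor above is still unordered, so comparisons such as < are not meaningful on it. For ordinal data, one option is to add ordered = TRUE. The sketch below uses a separate hypothetical variable named sizes_ord:

```r
# An ordered factor: levels carry a ranking, not just a display order
sizes_ord <- factor(c("small", "large", "medium", "large", "small"),
                    levels = c("small", "medium", "large"),
                    ordered = TRUE)
print(sizes_ord)
# [1] small  large  medium large  small
# Levels: small < medium < large

# Comparisons now follow the level order
first_is_smaller <- sizes_ord[1] < sizes_ord[2]   # small < large
print(first_is_smaller)  # [1] TRUE
```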
Converting Factors to Numeric or Character
Sometimes, you may need to convert factors back to their original numeric or character form.
1. Converting to Character
You can convert a factor to a character using as.character().
> fruits_char <- as.character(fruits)
> print(fruits_char)
[1] "apple"  "banana" "orange" "apple"  "mango"
2. Converting to Numeric
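Calling as.numeric() directly on a factor returns its underlying integer codes, not the original values. To recover the original numbers, convert the factor to character first and then to numeric: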
> num_factor <- factor(c(10, 20, 10, 30))
> num_values <- as.numeric(as.character(num_factor))
> print(num_values)
[1] 10 20 10 30
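The difference between the two conversions is easy to see side by side. In this sketch, codes and values are names chosen for the example:

```r
num_factor <- factor(c(10, 20, 10, 30))

# Direct conversion returns the internal level codes
codes <- as.numeric(num_factor)
print(codes)   # [1] 1 2 1 3

# Going through character recovers the original numbers
values <- as.numeric(as.character(num_factor))
print(values)  # [1] 10 20 10 30
```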
Manipulating Factor Levels
You can manipulate the levels of a factor in various ways:
1. Renaming Levels
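You can rename the levels of a factor by assigning a new character vector to levels(), with one name per existing level, in the same order. The example below reassigns the same names, so the factor is unchanged: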
> levels(fruits)<-c("apple", "banana", "mango", "orange")
> print(fruits)
[1] apple  banana orange apple  mango
Levels: apple banana mango orange
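To see a rename that actually changes a label, here is a sketch on a separate hypothetical factor, fruit_f, in which the level "mango" is renamed to "kiwi". Every element that carried the old label picks up the new one.

```r
fruit_f <- factor(c("apple", "banana", "orange", "apple", "mango"))

# Rename a single level by matching on its current name
levels(fruit_f)[levels(fruit_f) == "mango"] <- "kiwi"

print(fruit_f)
# [1] apple  banana orange apple  kiwi
# Levels: apple banana kiwi orange
```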
2. Dropping Unused Levels
After subsetting a factor, you may end up with unused levels. You can drop these using the droplevels() function.
> fruits <- factor(c("apple", "banana", "orange", "apple", "mango"))
> subset_fruits <- fruits[1:3]
> print(subset_fruits)
[1] apple  banana orange
Levels: apple banana mango orange
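The output above still lists mango as a level, even though no element of the subset uses it. A minimal sketch of removing it with droplevels(), where cleaned_fruits is a name introduced for this example:

```r
fruits <- factor(c("apple", "banana", "orange", "apple", "mango"))
subset_fruits <- fruits[1:3]

# The unused level "mango" survives subsetting
print(levels(subset_fruits))
# [1] "apple"  "banana" "mango"  "orange"

# droplevels() keeps only the levels that actually occur
cleaned_fruits <- droplevels(subset_fruits)
print(levels(cleaned_fruits))
# [1] "apple"  "banana" "orange"
```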
Factors in Data Frames
Factors are often used in data frames to represent categorical variables, especially when working with datasets imported from CSV files.
> df <- data.frame(ID = 1:5, Fruits = fruits)
> print(df)
  ID Fruits
1  1  apple
2  2 banana
3  3 orange
4  4  apple
5  5  mango
We will discuss this again when we talk about data frames in R.
Use Cases of Factors
Factors are commonly used in statistical modeling, data analysis, and data visualization.
Important Considerations
Factor variables in R are powerful tools for handling categorical data, especially in statistical analysis and data visualization. Properly understanding and managing factors is crucial for accurate data analysis.
5. Complex Variables:
Complex variables in R are used to represent complex numbers, which consist of both a real and an imaginary part. Complex numbers are particularly useful in fields like engineering, physics, and certain areas of mathematics.
Creating Complex Variables
To create a complex number in R, you use the form a + bi, where a is the real part, and b is the imaginary part (indicated by i).
Example of Creating a Complex Variable:
> z <- 3 + 4i  # The real part is 3 and the imaginary part is 4
> print(z)
[1] 3+4i
Checking the Type of a Complex Variable
You can check the type of a complex variable using the typeof() and class() functions:
> typeof(z)
[1] "complex"
> class(z)
[1] "complex"
Operations on Complex Variables
R supports various operations on complex numbers, including addition, subtraction, multiplication, division, and more.
Addition and Subtraction:
> z1<-3+4i
> z2<-1-2i
> sum <- z1 + z2
> print(sum)
[1] 4+2i
> diff<-z1-z2
> print(diff)
[1] 2+6i
Multiplication and Division:
> prod <- z1 * z2
> print(prod)
[1] 11-2i
The product can be worked out by hand, using i^2 = -1:
z1 = 3 + 4i
z2 = 1 - 2i
z1 * z2 = (3 + 4i)(1 - 2i)
= 3 - 6i + 4i - 8i^2
= 3 - 2i + 8
= 11 - 2i
> quot <- z1 / z2
> print(quot)
[1] -1+2i
To divide by hand, multiply both the numerator and the denominator by the conjugate of the denominator. The conjugate of 1 - 2i is 1 + 2i, so:
z1 / z2 = (3 + 4i)/(1 - 2i) = [(3 + 4i)(1 + 2i)] / [(1 - 2i)(1 + 2i)]
Expand the numerator:
(3 + 4i)(1 + 2i) = 3 + 6i + 4i + 8i^2 = 3 + 10i - 8 = -5 + 10i
Expand the denominator:
(1 - 2i)(1 + 2i) = 1 + 2i - 2i - 4i^2 = 1 + 4 = 5
Combine the results:
(-5 + 10i)/5 = -1 + 2i
So, the result of dividing (3 + 4i) by (1 - 2i) is -1 + 2i.
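R can check this kind of arithmetic directly. The sketch below uses the base functions Re(), Im(), Conj(), and Mod() on z1 = 3 + 4i and z2 = 1 - 2i as defined earlier; the result names are illustrative. It also repeats the division by the conjugate method to confirm that it agrees with the / operator.

```r
z1 <- 3 + 4i
z2 <- 1 - 2i

real_part <- Re(z1)     # real part of z1
imag_part <- Im(z1)     # imaginary part of z1
conj_z2   <- Conj(z2)   # complex conjugate of z2
modulus   <- Mod(z1)    # distance from the origin, sqrt(3^2 + 4^2)

print(c(real_part, imag_part, modulus))  # [1] 3 4 5
print(conj_z2)                           # [1] 1+2i

# Division with the operator and by the conjugate method
direct    <- z1 / z2
by_conj   <- (z1 * Conj(z2)) / (z2 * Conj(z2))
print(direct)                            # [1] -1+2i
print(isTRUE(all.equal(direct, by_conj)))  # [1] TRUE
```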
Complex Vectors
Just like other types, you can create vectors of complex numbers.
> complex_vector <- c(1+2i, 3+4i, 5+6i)
> print(complex_vector)
[1] 1+2i 3+4i 5+6i
Summary
Complex variables in R are powerful tools for handling mathematical operations that involve both real and imaginary numbers. R provides a variety of functions to manipulate and analyze complex numbers, making it a versatile environment for working with complex data.
6. Integer Variables:
Creating Integer Variables
To create an integer variable in R, you use the L suffix after the number. This explicitly tells R that the number should be treated as an integer.
> x <- 42L
> y <- -7L
> x+y
[1] 35
Checking the Type of a Variable
To check if a variable is an integer, use the is.integer() function:
> is.integer(x)  # Returns TRUE if x is an integer
[1] TRUE
> is.integer(42) # Returns FALSE because 42 without L is a numeric type
[1] FALSE
Converting Between Numeric and Integer
You can convert a numeric variable to an integer using the as.integer() function:
> num <- 42.7
> int <- as.integer(num)
> print(int)
[1] 42
Arithmetic Operations with Integers
Integers in R can be used in arithmetic operations just like numeric variables:
> a <- 10L
> b <- 3L
> sum <- a + b  # Addition
> print(sum)
[1] 13
> diff <- a - b  # Subtraction
> print(diff)
[1] 7
> prod <- a * b  # Multiplication
> print(prod)
[1] 30
> quot <- a / b  # Division
> print(quot)
[1] 3.333333
Note that dividing two integers with / gives a floating-point (double) result. To get an integer result, use integer division (%/%); the %% operator gives the remainder:
> quot_int <- a %/% b
> print(quot_int)
[1] 3
> mod <- a %% b
> print(mod)
[1] 1
Summary
In R, integers are a subset of numeric variables: is.numeric() returns TRUE for both integers and doubles. Even so, R treats integer as a distinct type from the default double type. This distinction gives clarity and control over how data is stored and manipulated, which helps with memory management, type-specific operations, and precise data handling.
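The relationship between the two types can be checked at the console. In this sketch the result names are illustrative; it shows that 42L and 42 are equal in value but differ in storage type.

```r
int_type <- typeof(42L)   # written with the L suffix
dbl_type <- typeof(42)    # written without it

print(int_type)  # [1] "integer"
print(dbl_type)  # [1] "double"

# Both count as numeric
both_numeric <- is.numeric(42L) && is.numeric(42)
print(both_numeric)  # [1] TRUE

# Equal in value, but not identical objects because the types differ
same_value  <- 42L == 42
same_object <- identical(42L, 42)
print(same_value)   # [1] TRUE
print(same_object)  # [1] FALSE
```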
In this chapter, we covered 6 of the 11 different variable types available in R. These variable types provide R with the flexibility to manage a wide range of data and operations, making it a powerful tool for statistical computing and data analysis. The remaining variable types will be explored in the next chapter.