Python

Created

September 9, 2020

Modified

December 8, 2025

Python basics

Data type

In Python, understanding data types and structures is essential for writing effective code. Data types determine the kind of data a variable can hold, while data structures allow you to organize and manage that data efficiently.

  • Numbers: Represent numerical values, including integers and floating-point numbers.
  • Strings: Represent sequences of characters, used for text manipulation.
  • Booleans: Represent truth values, either True or False.
  • Lists: Ordered collections of items, allowing for duplicate values and mutable operations.
  • Tuples: Ordered collections of items, similar to lists but immutable.
  • Dictionaries: Key-value pairs that allow for efficient data retrieval based on unique keys. Since Python 3.7 they preserve insertion order.
  • Sets: Unordered collections of unique items, useful for membership testing and eliminating duplicates.
## Numbers and strings
integer_num = 42
float_num = 3.14
string_text = "Hello, Python!"

## List: mutable, ordered collection
fruits = ["apple", "banana", "cherry"]

## Tuple: immutable, ordered collection
dimensions = (1920, 1080)

## Dictionary: key-value pairs with unique keys
person = {"name": "Alice", "age": 30, "city": "New York"}

## Set: unordered collection of unique items
unique_numbers = {1, 2, 3, 4, 5}

print("Integer:", integer_num)
print("Float:", float_num)
print("String:", string_text)
print("List of fruits:", fruits)
print("Tuple of dimensions:", dimensions)
print("Dictionary of person:", person)
print("Set of unique numbers:", unique_numbers)
Integer: 42
Float: 3.14
String: Hello, Python!
List of fruits: ['apple', 'banana', 'cherry']
Tuple of dimensions: (1920, 1080)
Dictionary of person: {'name': 'Alice', 'age': 30, 'city': 'New York'}
Set of unique numbers: {1, 2, 3, 4, 5}
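
As a small sketch of how these types differ in practice, the example below checks each value with type(), contrasts a mutable list with an immutable tuple, and shows a set dropping duplicates. The variable names and values are illustrative only.

```python
# Inspect the type of a few values
values = [42, 3.14, "Hello, Python!", True, [1, 2], (1, 2), {"a": 1}, {1, 2}]
type_names = [type(v).__name__ for v in values]
print(type_names)

# Lists are mutable: items can be added or replaced in place
fruits = ["apple", "banana", "cherry"]
fruits.append("date")
fruits[0] = "apricot"

# Tuples are immutable: item assignment raises a TypeError
dimensions = (1920, 1080)
try:
    dimensions[0] = 1280
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Sets keep only unique items
unique_numbers = set([1, 2, 2, 3, 3, 3])

# Dictionaries look up values by key
person = {"name": "Alice", "age": 30}
age = person["age"]
print(fruits, tuple_changed, unique_numbers, age)
```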

Variable

  • Number
  • String
  • Tuple
  • List: Mutable, container
  • Dictionary: Mutable, container
  • Set: Mutable, container
  • None: empty value
## Avoid naming variables tuple, list, or dict: that would hide the built-in functions
my_tuple = (1, 2, 3)
my_list = [1, 2, 3]
my_dict = {"ele1": 1, "ele2": 2, "ele3": 3}

Operator

Numerical Operators:

  • < : less than
  • > : greater than
  • <= : less than or equal to
  • >= : greater than or equal to
  • == : equal to
  • != : not equal to

String Operators:

  • == : equal to
  • != : not equal to

Logical Operators:

  • and
  • or
  • not
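
A minimal sketch of these operators in use; the values of x and y are arbitrary and chosen only for illustration.

```python
x = 7
y = 10

# Comparison operators return booleans
print(x < y)    # True
print(x == y)   # False
print(x != y)   # True

# String comparison with == and != is case-sensitive
print("python" == "Python")  # False

# Logical operators combine boolean expressions
in_range = x > 5 and x < 10      # both conditions must hold
either = x > 100 or y == 10      # at least one condition must hold
negated = not (x > 5)            # inverts the result
print(in_range, either, negated)  # True True False
```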

Control flow

Control flow in Python allows you to make decisions and execute different blocks of code based on conditions. Loops enable you to repeat a block of code multiple times.

Best practices for control flow and loops include:

  • Keep conditions simple and clear. Break down complex conditions into smaller parts.
  • Use meaningful variable names to enhance readability.
  • Avoid deeply nested loops and conditions to maintain code clarity.
  • Use comments to explain the purpose of complex conditions or loops.
  • Test edge cases to ensure your control flow behaves as expected.
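
One way to apply the first two practices is to give each part of a condition its own name. The sketch below uses a hypothetical free-shipping rule; the function name, parameters, and threshold are assumptions made up for this example.

```python
def qualifies_for_free_shipping(cart_total, is_member, has_coupon):
    """Return True when a (hypothetical) order qualifies for free shipping."""
    # Name each part of the condition instead of writing one long expression
    large_order = cart_total >= 50
    has_discount = is_member or has_coupon
    return large_order and has_discount


# Edge case: a total exactly at the threshold still counts as a large order
print(qualifies_for_free_shipping(50, False, True))   # True
print(qualifies_for_free_shipping(60, False, False))  # False
```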

# Conditional statements
x = 10
if x > 5:
    print("x is greater than 5")
elif x == 5:
    print("x is equal to 5")
else:
    print("x is less than 5")

Iteration

## For loop: iterating over a range of numbers (0 to 4)
for i in range(5):
    print("Iteration:", i)

## While loop: repeats as long as its condition is true
count = 0
while count < 5:
    print("Count is:", count)
    count += 1

Conditional execution in Python is achieved using the if/else construct (if and else are reserved words).

# Conditional execution
x = 10
if x > 10:
    print("I am a big number")
else:
    print("I am a small number")

# Multi-way if/else
x = 10
if x > 10:
    print("I am a big number")
elif x > 5:
    print("I am kind of small")
else:
    print("I am really small")

Iteration/Loops

Two looping constructs in Python

  • For: used when the number of iterations (repetitions) is known in advance

  • While: used when the number of iterations (repetitions) cannot be determined in advance. Can lead to infinite loops if the loop condition is not handled properly

## collectCash() and giveGoods() are placeholders, not defined here
for customer in ["John", "Mary", "Jane"]:
    print("Hello", customer)
    print("Please pay")
    collectCash()
    giveGoods()

hour_of_day = 9
while hour_of_day < 17:
    moveToWarehouse()
    locateGoods()
    moveGoodsToShip()
    hour_of_day = getCurrentTime()

What happens if you need to stop early? We use the break keyword to do this.

It stops the loop immediately and moves on to the first statement after the loop.

while hour_of_day < 17:
    if shipIsFull():
        break
    moveToWarehouse()
    locateGoods()
    moveGoodsToShip()
    hour_of_day = getCurrentTime()
collectPay()

What happens when you want to just skip the rest of the steps? We can use the continue keyword for this.

It skips the rest of the steps but moves on to the next iteration.

for customer in ["John", "Mary", "Jane"]:
    print("Hello ", customer)
    print("Please pay")
    paid = collectCash()
    if not paid:
        continue
    giveGoods()
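
The loops above call functions that are not defined in this document. A self-contained sketch of the same break and continue behaviour follows, with hypothetical stand-ins: a payments dictionary replaces collectCash(), and a simple counter replaces the ship and the clock.

```python
# Hypothetical data standing in for collectCash(): who actually paid
payments = {"John": True, "Mary": False, "Jane": True}

served = []
for customer in ["John", "Mary", "Jane"]:
    if not payments[customer]:
        continue          # skip the rest of this iteration
    served.append(customer)

# break: stop loading as soon as the (hypothetical) ship capacity is reached
capacity = 3
loaded = 0
hour_of_day = 9
while hour_of_day < 17:
    if loaded == capacity:
        break             # leave the loop immediately
    loaded += 1
    hour_of_day += 1      # simulate time passing instead of reading a clock

print(served)               # ['John', 'Jane']
print(loaded, hour_of_day)  # 3 12
```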

Exceptions

  • Exceptions are errors detected during execution of a Python program.
  • If unhandled, they cause the program to stop.
  • However, we can handle them using the ‘try/except’ construct.
num = input("Please enter a number: ")
try:
    num = int(num)
    print("number squared is " + str(num**2))
except ValueError:  # raised by int() when the text is not a valid integer
    print("You did not enter a valid number")
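
A slightly fuller sketch of the same idea, written as a function so it runs without keyboard input. It catches a specific exception type and also shows the optional else and finally clauses. The function name square_text and the attempts list are made up for this example.

```python
attempts = []  # records every input that was tried


def square_text(text):
    """Convert text to an integer and return its square, or None if invalid."""
    try:
        num = int(text)
    except ValueError:
        # int() raises ValueError for text such as "abc" or "3.5"
        return None
    else:
        # Runs only when no exception was raised
        return num ** 2
    finally:
        # Runs in every case, which makes it useful for cleanup or logging
        attempts.append(text)


print(square_text("4"))    # 16
print(square_text("abc"))  # None
print(attempts)            # ['4', 'abc']
```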

General functions

help()
type()
len() 
range()
list()      
tuple()
dict()
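
A brief sketch of these built-ins in action; the example values are arbitrary.

```python
numbers = range(5)            # lazy sequence 0, 1, 2, 3, 4
print(type(numbers))          # <class 'range'>
print(len(numbers))           # 5

as_list = list(numbers)       # [0, 1, 2, 3, 4]
as_tuple = tuple(numbers)     # (0, 1, 2, 3, 4)
as_dict = dict([("a", 1), ("b", 2)])  # {'a': 1, 'b': 2}
print(as_list, as_tuple, as_dict)

# help(len) would print the built-in documentation for len()
```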

Python for R users

Install library

## Install library using pip
python3 -m pip install pandas numpy matplotlib
## Install package using install.packages()
install.packages("dplyr")
## Install package using devtools
install.packages("devtools")
devtools::install_github("tidyverse/dplyr")

## Install package using bioconductor
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("dplyr", force = TRUE, ask = FALSE)

## Install Python library using conda
conda install pandas numpy matplotlib

## Install R package using conda
conda install -n renv r-dplyr bioconductor-dplyr

Load library

## Load library
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

## Use a function from a library: first write the library alias, then
## the function name, separated by a dot:
np.log(7)

## Load library
library(dplyr)
suppressPackageStartupMessages(
    suppressWarnings(
        {
            library(ggplot2)
            library(tidyr)
        }
    )
)

Whitespace

  • Whitespace matters in Python.
  • In R, code blocks are defined by curly braces {}.
  • In Python, code blocks are defined by indentation (usually 4 spaces).
if(TRUE) {
    print("This is R")
    if(TRUE) {
        print("Nested block in R")
    }
}
## Python accepts tabs or spaces, but spaces are preferred
if True:
    print("This is Python")
    if True:
        print("Nested block in Python")
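
To see that indentation is part of the syntax, the sketch below passes two small source strings to the built-in compile() function. The helper name compiles and the two strings are made up for this example.

```python
good_source = "if True:\n    print('indented body')\n"
bad_source = "if True:\nprint('body is not indented')\n"


def compiles(source):
    """Return True if the source text is valid Python, False otherwise."""
    try:
        compile(source, "<example>", "exec")
        return True
    except IndentationError:
        # Raised when a block body is missing its indentation
        return False


print(compiles(good_source))  # True
print(compiles(bad_source))   # False
```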

Container types

  • In R, the list is a versatile container type that can hold elements of different types and structures.
  • There is no single direct equivalent of R’s list in Python that supports all the same features.
  • Instead, there are (at least) 4 different Python container types we need to be aware of:
    • list: ordered, mutable, allows duplicate elements, created using []
    • tuple: ordered, immutable, allows duplicate elements, created using ()
    • set: unordered, mutable, no duplicate elements, created using {} with at least one item, or set() for an empty set
    • dict: mutable, key-value pairs with unique keys, keeps insertion order (Python 3.7+), created using {}
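
A short side-by-side sketch of the four containers, using arbitrary example values. It also shows that bare {} creates an empty dict rather than an empty set.

```python
items = [1, 2, 2, 3]          # list: keeps order and duplicates
frozen = (1, 2, 2, 3)         # tuple: same contents, but immutable
unique = {1, 2, 2, 3}         # set: duplicates are dropped
lookup = {"a": 1, "b": 2}     # dict: key-value pairs

empty_dict = {}               # bare {} is an empty dict, not a set
empty_set = set()             # an empty set needs set()

print(len(items), len(frozen), len(unique), len(lookup))    # 4 4 3 2
print(type(empty_dict).__name__, type(empty_set).__name__)  # dict set
print(list(lookup))           # ['a', 'b'] (insertion order is kept)
```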

Lists

Python lists are created using bare brackets []. The built-in list() function is more of a coercion function, closer in spirit to R’s as.list().

  • The most important thing to know about Python lists is that they are mutable.
x = [1, 2, 3]
y = x    # `y` and `x` now refer to the same list!
x.append(4)
print("x is", x)
#> x is [1, 2, 3, 4]
print("y is", y)
#> y is [1, 2, 3, 4]
  • Some syntactic sugar around Python lists you might encounter is the usage of + and * with lists. These are concatenation and replication operators, akin to R’s c() and rep().
x = [1]
x
#> [1]
x + x
#> [1, 1]
x * 3
#> [1, 1, 1]
  • Index into lists with integers using trailing [], but note that indexing is 0-based
x = [1, 2, 3]

x[0]
#> 1
x[1]
#> 2
x[2]
#> 3

try:
  x[3]
except Exception as e:
  print(e)
#> list index out of range

## Negative numbers count from the end of the list
x[-1]
#> 3
x[-2]
#> 2
x[-3]
#> 1
  • Slice ranges of lists using : inside the trailing []. Note that the end index is exclusive. We can optionally specify a stride using a second :.
x = [1, 2, 3, 4, 5, 6] 
x[0:2] # get items at index positions 0, 1
#> [1, 2]
x[1:]  # get items from index position 1 to the end
#> [2, 3, 4, 5, 6]
x[:-2] # get items from the beginning up to, but not including, the 2nd to last
#> [1, 2, 3, 4]
x[:]   # get all the items (idiom used to copy the list so as not to modify in place)
#> [1, 2, 3, 4, 5, 6]
x[::2] # get all the items, with a stride of 2
#> [1, 3, 5]
x[1::2] # get all the items from index 1 to the end, with a stride of 2
#> [2, 4, 6]
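
Because lists are mutable and assignment only creates a second name for the same object, it helps to know how to make an independent copy. A minimal sketch with illustrative variable names:

```python
x = [1, 2, 3]
alias = x          # same list object as x
copy_a = x[:]      # shallow copy via slicing
copy_b = x.copy()  # shallow copy via the list method

x.append(4)
print(alias)   # [1, 2, 3, 4]
print(copy_a)  # [1, 2, 3]
print(copy_b)  # [1, 2, 3]
print(alias is x, copy_a is x)  # True False
```

Both copies are shallow: nested lists inside x would still be shared between the original and the copy.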

Tuples

  • Tuples behave like lists, but are immutable (cannot be changed after creation).
  • Created using bare parentheses (), but parentheses are not strictly required.
x = (1, 2) # tuple of length 2
type(x)
#> <class 'tuple'>
len(x)
#> 2
x
#> (1, 2)

x = (1,) # tuple of length 1
type(x)
#> <class 'tuple'>
len(x)
#> 1
x
#> (1,)

x = () # tuple of length 0
print(f"{type(x) = }; {len(x) = }; {x = }")
#> type(x) = <class 'tuple'>; len(x) = 0; x = ()
# the line above is an example of an interpolated string literal (f-string)

x = 1, 2 # also a tuple
type(x)
#> <class 'tuple'>
len(x)
#> 2

x = 1, # beware a single trailing comma! This is a tuple!
type(x)
#> <class 'tuple'>
len(x)
#> 1

  • Tuples are the container that powers the packing and unpacking semantics in Python.
    • Packing and unpacking tuples is a common idiom in Python.
    • Python provides the convenience of unpacking tuples into multiple variables in a single statement.
x = (1, 2, 3)
a, b, c = x
a
#> 1
b
#> 2
c
#> 3
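
A few common uses of packing and unpacking, as a sketch. The function min_max and the values are made up for this example.

```python
def min_max(values):
    """Return the smallest and largest value as a tuple (packing)."""
    return min(values), max(values)


low, high = min_max([3, 1, 4, 1, 5])   # unpack the returned tuple

a, b = 1, 2
a, b = b, a                            # swap without a temporary variable

first, *rest = (10, 20, 30, 40)        # starred name collects the remainder as a list

print(low, high)    # 1 5
print(a, b)         # 2 1
print(first, rest)  # 10 [20, 30, 40]
```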

Data frame

## R contains a native data frame
r_df <- data.frame(
    Name = c("Alice", "Bob", "Charlie"),
    Age = c(25, 30, 35),
    City = c("New York", "Los Angeles", "Chicago")
)
print(r_df)

## Python's data frame comes from the pandas library
import pandas as pd

## Here it is built from a dictionary of lists (one list per column)
py_df = pd.DataFrame(
    {
        'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']
    }
)

print(py_df)
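
The dictionary-of-lists layout can also be explored without pandas. The plain-Python sketch below (not a pandas feature, just an illustration of the column-oriented idea) reads one column, rebuilds the rows, and filters by age. The names columns, rows, and older are made up for this example.

```python
# One list per column, keyed by column name
columns = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
}

ages = columns["Age"]                    # a whole column
rows = list(zip(*columns.values()))      # rebuild row tuples from the columns
older = [name for name, age in zip(columns["Name"], columns["Age"]) if age > 28]

print(ages)     # [25, 30, 35]
print(rows[0])  # ('Alice', 25, 'New York')
print(older)    # ['Bob', 'Charlie']
```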
