2  Python Basics

2.1 Built-in Types: numeric types and str

This section is based on [1].

There are several built-in data types in Python. Here is an (incomplete) list:

  • None
  • Boolean – True, False
  • Numeric Types — int, float, complex
  • Text Sequence Type — str
  • Sequence Types — list
  • Map type - dict

We will cover numeric types and strings in this section. The rest are either simple enough to be self-explanatory, or more involved and will be discussed later.

2.1.1 Numeric types and math expressions

Numeric values are written as numbers. As long as there is no ambiguity, Python will automatically detect the type.

x = 1 # x is an int.
y = 2.0 # y is a float.

Python can do math just like other programming languages. The basic math operations are listed as follows.

  • +, -, *, /, >, <, >=, <= work as usual.
  • ** is the power operation.
  • % is the mod (remainder) operation.
  • != means “not equal”.
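
The operators listed above can be tried out directly. The following is a minimal sketch, assuming Python 3 (where / between two integers returns a float); the values 7 and 2 are arbitrary.

```python
# Basic arithmetic and comparisons; the numbers are arbitrary examples.
a = 7
b = 2

quotient = a / b     # division returns a float in Python 3
power = a ** b       # 7 to the power 2
remainder = a % b    # remainder of 7 divided by 2

print(quotient)      # 3.5
print(power)         # 49
print(remainder)     # 1
print(a != b)        # True
print(a >= b)        # True
```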

2.1.2 str

Scalars are written as numbers, and strings are enclosed in quotes. Example:

x = 1       # x is a scalar.
y = 's'     # y is a string with one letter.
z = '0'     # z looks like a number, but it is a string.
w = "Hello" # w is a string with double quotes.

Here are some facts.

  1. For strings, you can use either single quotes ' or double quotes ".
  2. \ is used to introduce escape sequences, such as \n for a newline. You may find the list here.
  3. There are several types of scalars, like int, float, etc. Usually Python will automatically determine the type of the data, but sometimes you may still want to specify it manually.
  4. You can use int(), str(), etc. to change types.
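
As an illustration of facts 2 and 4, here is a minimal sketch of type conversion and escape sequences. The variable names are chosen only for this example.

```python
z = '0'              # a string that looks like a number
n = int(z)           # convert the string to an int
print(n + 1)         # 1

s = str(42)          # convert an int to a string
print(s + z)         # 420 (string concatenation, not addition)

f = float('2.5')     # convert a string to a float
print(f * 2)         # 5.0

# \' and \n are escape sequences: a literal quote and a newline.
quote = 'It\'s on two lines:\nline 2'
print(quote)
```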

Although str is a built-in type, there are tons of tricks with str, and there are tons of packages related to strings. Generally speaking, to play with strings, we are interested in two types of questions.

  • Put information together to form a string.
  • Extract information from a string.

We briefly talk about these two tasks.

Note

There is a very subtle relation between a variable / constant and its name. We will talk about this later.

Example 2.1 Here is an example of playing with strings. Please play with this code and try to understand what it does.

import re

def clean_strings(strings):
    result = []
    for value in strings:
        value = value.strip()
        value = re.sub('[!#?]', '', value)
        value = value.title()
        result.append(value)
    return result

states = [' Alabama ', 'Georgia!', 'Georgia', 'georgia', 'FlOrIda',
          'south carolina##', 'West virginia?']
print(clean_strings(states))
['Alabama', 'Georgia', 'Georgia', 'Georgia', 'Florida', 'South Carolina', 'West Virginia']

2.2 Fundamentals

This section is mainly based on [2].

2.2.1 Indentation

One key feature of Python is that its structure (blocks) is determined by indentation.

Let’s compare with other languages. Let’s take C as an example.

/*This is a C function.*/
int f(int x){return x;}

The block is defined by {} and statements are terminated by ;. Spaces and newlines are not important when C runs the code. It is recommended to write code in a “beautiful, stylish” format for readability, as follows. However, it is not mandatory.

/*This is a C function.*/
int f(int x) {
   return x;
}

In Python, a block starts after : and is then determined by indentation. Therefore you won’t see a lot of {} in Python, and the “beautiful, stylish” format is mandatory.

# This is a Python function.
def f(x):
    return x

By convention, each indentation level is 4 spaces. Other widths are allowed as long as they are consistent within a block, but we will just use 4 spaces in this course.

Note

It is usually recommended that a line of code should not be very long. If you do have a long line that cannot be shortened, you may break it into multiple lines directly in Python. However, since indentation is so important in Python, when you break one line of code into several, please make sure that everything is aligned neatly. Please see the following example.

results = shotchartdetail.ShotChartDetail(
            team_id = 0,
            player_id = 201939,
            context_measure_simple = 'FGA',
            season_nullable = '2021-22',
            season_type_all_star = 'Regular Season')

2.2.2 Binary operators and comparisons

Most binary operators behave as you would expect. Here I just want to mention == and is.

  • == tests whether two objects have the same value.
  • is tests whether two objects are exactly the same object.

Note

You may use id(x) to check the id of the object x. Two objects are identical if they have the same id.
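
The difference between == and is can be seen with two lists that hold the same values. This is a minimal sketch; the names a, b and c are chosen only for illustration.

```python
a = [1, 2, 3]
b = [1, 2, 3]    # a second list with the same contents
c = a            # another name for the same list object

print(a == b)            # True: same value
print(a is b)            # False: two separate objects
print(a is c)            # True: one object with two names
print(id(a) == id(c))    # True: identical objects share an id
```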

2.2.3 import

In Python a module is simply a file with the .py extension containing Python code. Assume that we have a Python file example.py stored in the folder assests/codes/. The file is as follows.

# from assests/codes/example.py

def f(x):
    print(x)

A = 'You get me!'

You may get access to this function and this string in the following way.

from assests.codes import example

example.f(example.A)
You get me!

2.2.4 Comments

Any text preceded by the hash mark (pound sign) # is ignored by the Python interpreter. In many IDEs you may use hotkeys to comment or uncomment several lines at once. For example, in VS Code the default setting for toggling comments is ctrl+/.

2.2.5 Dynamic references, strong types

In some programming languages, you have to declare the variable’s name and what type of data it will hold. If a variable is declared to be a number, it can never hold a different type of value, like a string. This is called static typing because the type of the variable can never change.

Python is a dynamically typed language, which means you do not have to declare a variable or what kind of data the variable will hold. You can change the value and type of data at any time. This could be either great or terrible news.

On the other hand, “dynamically typed” doesn’t mean that types are not important in Python. You still have to make sure that the types of all variables meet the requirements of the operations used.

a = 1
b = 2
b = '2'
c = a + b
TypeError: unsupported operand type(s) for +: 'int' and 'str'

In this example, b was first assigned a number, and then it was reassigned a str. This is totally fine since Python is dynamically typed. However, later when adding a and b, a type error occurs since you cannot add a number and a str.

Note

You may always use type(x) to detect the type of the object x.
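
One possible way to deal with such a type mismatch is to check the types with type and then convert explicitly. The following is a minimal sketch under that assumption, not the only way to resolve the error.

```python
a = 1
b = '2'

print(type(a))     # <class 'int'>
print(type(b))     # <class 'str'>

c = a + int(b)     # convert b to an int, then add numbers
print(c)           # 3

d = str(a) + b     # convert a to a str, then concatenate strings
print(d)           # 12
```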

2.2.6 Everything is an object

Every number, string, data structure, function, class, module, and so on exists in the Python interpreter in its own “box”, which is referred to as a Python object.

Each object has an associated type (e.g., string or function) and internal data. In practice this makes the language very flexible, as even functions can be treated like any other object.

Each object might have attributes and/or methods attached.
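
As a small illustration, a function can be given another name or placed in a list like any other object, and objects expose attributes and methods. This is a minimal sketch; square is a hypothetical function defined only for this example.

```python
def square(x):
    return x * x

f = square               # a function is an object, so it can get a second name
print(f(3))              # 9
print(type(square))      # <class 'function'>

for g in [square, abs]:  # functions can be stored in a list
    print(g(-4))         # prints 16, then 4

word = 'python'
print(word.upper())      # PYTHON (upper is a method of str objects)

z = 3 + 4j
print(z.real)            # 3.0 (real is an attribute of complex objects)
```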

2.2.7 Mutable and immutable objects

An object whose internal state can be changed is mutable. On the other hand, an immutable object doesn’t allow any change once it has been created.

Some objects of built-in type that are mutable are:

  • Lists
  • Dictionaries
  • Sets

Some objects of built-in type that are immutable are:

  • Numbers (Integer, Rational, Float, Decimal, Complex & Booleans)
  • Strings
  • Tuples
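
The difference can be observed by trying to change one entry of a list and one character of a string. This is a minimal sketch; it uses try / except only to keep the script running after the error.

```python
numbers = [1, 2, 3]
numbers[0] = 100          # lists are mutable: changed in place
print(numbers)            # [100, 2, 3]

text = 'hello'
try:
    text[0] = 'H'         # strings are immutable: this raises TypeError
except TypeError as err:
    print('TypeError:', err)

new_text = 'H' + text[1:] # build a new string instead
print(new_text)           # Hello
print(text)               # hello (the original string is unchanged)
```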

Example 2.2 (Tuples are not really “immutable”) You can treat a tuple as a container, which contains some objects. The relations between the container and its contents are immutable, but the objects it holds might be mutable. Please check the following example.

container = ([1], [2])
print('This is `container`: ', container)
print('This is the id of `container`: ', id(container))
print('This is the id of the first list of `container`: ', id(container[0]))

container[0].append(2)
print('This is the new `container`: ', container)
print('This is the id of the new `container`: ', id(container))
print('This is the id of the first list (which is updated) of the new `container`: ', id(container[0]))
This is `container`:  ([1], [2])
This is the id of `container`:  1946833634880
This is the id of the first list of `container`:  1946833486272
This is the new `container`:  ([1, 2], [2])
This is the id of the new `container`:  1946833634880
This is the id of the first list (which is updated) of the new `container`:  1946833486272

You can see that the tuple container and its first object keep the same ids, although we added one element to the first object.
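
To complement Example 2.2, the following minimal sketch shows the part that is immutable: a slot of the tuple cannot be pointed at a different object, while a list stored in the tuple can still be modified.

```python
container = ([1], [2])

try:
    container[0] = [9]      # replacing an entry of the tuple is not allowed
except TypeError as err:
    print('TypeError:', err)

container[1].append(3)      # modifying a list inside the tuple is allowed
print(container)            # ([1], [2, 3])
```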

2.3 Flows and Logic

2.3.1 for loop

  • range(10)
  • list
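
A minimal sketch of the two kinds of for loops listed above, one over range(10) and one over a list. The list fruits is hypothetical example data.

```python
total = 0
for i in range(10):          # i takes the values 0, 1, ..., 9
    total = total + i
print(total)                 # 45

fruits = ['apple', 'banana', 'cherry']   # hypothetical example data
lengths = []
for fruit in fruits:         # loop directly over the items of a list
    lengths.append(len(fruit))
print(lengths)               # [5, 6, 6]
```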

2.3.2 if conditional control
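
A minimal sketch of if / elif / else, assuming a hypothetical helper function sign written only for this illustration.

```python
def sign(x):
    # Exactly one of the three branches is executed.
    if x > 0:
        return 'positive'
    elif x == 0:
        return 'zero'
    else:
        return 'negative'

for value in [3, 0, -2]:
    print(value, sign(value))
```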

2.4 list

Note

In Python, a list is an ordered sequence of objects and a string is an ordered sequence of characters.

  • Access to the data

  • Slicing

  • Methods

    • append and +
    • extend
    • pop
    • remove
  • in

  • for

  • list()

  • sorted

  • str.split

  • str.join
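
The items above can be sketched in one short script. The data are arbitrary and the names are chosen only for illustration.

```python
letters = ['a', 'b', 'c', 'd']
print(letters[0])            # a (indices start at 0)
print(letters[-1])           # d (negative indices count from the end)
print(letters[1:3])          # ['b', 'c'] (slicing excludes the end index)

letters.append('e')          # add one item at the end, in place
letters = letters + ['f']    # + creates a new list
letters.extend(['g', 'h'])   # add several items at the end, in place
last = letters.pop()         # remove and return the last item
letters.remove('a')          # remove the first occurrence of 'a'
print(letters)               # ['b', 'c', 'd', 'e', 'f', 'g']
print('c' in letters)        # True

print(sorted([3, 1, 2]))     # [1, 2, 3]
print(list('abc'))           # ['a', 'b', 'c']

words = 'to be or not'.split(' ')
print(words)                 # ['to', 'be', 'or', 'not']
print('-'.join(words))       # to-be-or-not
```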

2.4.1 List Comprehension

List comprehension is a convenient way to create lists based on the values of an existing list. Its main benefit is not a large gain in performance but shorter, more readable code.

The format of a list comprehension is

newlist = [expression for item in iterable if condition == True]
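
To make the format concrete, here is a minimal sketch that builds the same list twice, once with an explicit loop and once with a list comprehension. The data and the condition n > 2 are arbitrary.

```python
numbers = [1, 2, 3, 4, 5, 6]

# Version with an explicit loop.
squares_loop = []
for n in numbers:
    if n > 2:
        squares_loop.append(n ** 2)

# The same computation as a list comprehension.
squares = [n ** 2 for n in numbers if n > 2]

print(squares)                   # [9, 16, 25, 36]
print(squares == squares_loop)   # True
```

Note that the condition is written simply as n > 2; there is no need to compare it with True.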

2.5 dict

  • Access to the data
  • Methods
    • directly add items
    • update
    • get
    • keys
    • values
    • items
  • dict()
  • dictionary comprehension
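
The items above can be sketched as follows, assuming Python 3.7 or later (where dictionaries keep insertion order). The dictionary ages is hypothetical example data.

```python
ages = {'Ann': 30, 'Bob': 25}            # hypothetical example data
print(ages['Ann'])                       # 30

ages['Cid'] = 41                         # directly add an item
ages.update({'Bob': 26, 'Dee': 19})      # change one item and add another

print(ages.get('Eve'))                   # None (missing key, no error)
print(ages.get('Eve', 0))                # 0 (default value for a missing key)

print(list(ages.keys()))                 # ['Ann', 'Bob', 'Cid', 'Dee']
print(list(ages.values()))               # [30, 26, 41, 19]
for name, age in ages.items():           # loop over (key, value) pairs
    print(name, age)

pairs = dict([('x', 1), ('y', 2)])       # build a dict from pairs
print(pairs)                             # {'x': 1, 'y': 2}

name_lengths = {name: len(name) for name in ages}   # dictionary comprehension
print(name_lengths)                      # {'Ann': 3, 'Bob': 3, 'Cid': 3, 'Dee': 3}
```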

2.6 Exercises

Most problems are based on [3], [1] and [4].

Exercise 2.1 (Indentation) Please tell the differences between the following pieces of code. If you don’t understand for, don’t worry about it. Just focus on the indentation and try to understand how the code works.

for i in range(5):
    print('Hello world!')
print('Hello world!')

for i in range(5):
    print('Hello world!')
    print('Hello world!')

for i in range(5):
print('Hello world!')
print('Hello world!')

for i in range(5):
    pass
print('Hello world!')
print('Hello world!')

Exercise 2.2 (Play with built-in data types) Please first guess the results of all expressions below, and then run them to check your answers.

print(True and True)
print(True or True)
print(False and True)
print((1+1>2) or (1-1<1))

Exercise 2.3 (== vs is) Please explain what happens below.

a = 1
b = 1.0
print(type(a))
print(type(b))

print(a == b)
print(a is b)
<class 'int'>
<class 'float'>
True
False

Exercise 2.4 (Play with strings) Please execute the code below line by line and explain what happens in text cells.

# 1
answer = 10
wronganswer = 11
text1 = "The answer to this question is {}. If you got {}, you are wrong.".format(answer, wronganswer)
print(text1)

# 2
var = True
text2 = "This is {}.".format(var)
print(text2)

# 3
word1 = 'Good '
word2 = 'buy. '
text3 = (word1 + word2) * 3
print(text3)

# 4
sentence = "This is\ngood enough\nfor a exercise to\nhave so many parts. " \
           "We would also want to try this symbol: '. " \
           "Do you know how to type \" in double quotes?"
print(sentence)
The answer to this question is 10. If you got 11, you are wrong.
This is True.
Good buy. Good buy. Good buy. 
This is
good enough
for a exercise to
have so many parts. We would also want to try this symbol: '. Do you know how to type " in double quotes?

Exercise 2.5 (split and join) Please execute the code below line by line and explain what happens in text cells.

sentence = 'This is an example of a sentence that I expect you to split.'

wordlist = sentence.split(' ')

newsentence = '\n'.join(wordlist)
print(newsentence)

Exercise 2.6 (List reference) Please finish the following tasks.

  1. Given the list a, make a new reference b to a. Update the first entry in b to be 0. What happened to the first entry in a? Explain your answer in a text block.

  2. Given the list a, make a new copy b of the list a using the function list. Update the first entry in b to be 0. What happened to the first entry in a? Explain your answer in a text block.

Exercise 2.7 (List comprehension) Given a list of numbers, use list comprehension to remove all odd numbers from the list:

numbers = [3,5,45,97,32,22,10,19,39,43]

Exercise 2.8 (More list comprehension) Use list comprehension to find all of the numbers from 1-1000 that are divisible by 7.

Exercise 2.9 (More list comprehension) Count the number of spaces in a string.

Exercise 2.10 (More list comprehension) Use list comprehension to get the index and the value as a tuple for items in the list ['hi', 4, 8.99, 'apple', ('t,b', 'n')]. Result would look like [(index, value), (index, value), ...].

Exercise 2.11 (More list comprehension) Use list comprehension to find the common numbers in two lists (without using a tuple or set) list_a = [1, 2, 3, 4], list_b = [2, 3, 4, 5].

Exercise 2.12 (Probability) Compute the probability that at least two people out of 23 share the same birthday. The math formula for this is \[1-\frac{365!/(365-23)!}{365^{23}}=1-\frac{365}{365}\cdot\frac{365-1}{365}\cdot\frac{365-2}{365}\cdot\ldots\cdot\frac{365-22}{365}.\]

  1. To use the formula directly we need a factorial function that can handle very large integers, e.g. from the math module. Please use math.factorial to compute the above formula.

  2. Please use the right hand side of the above formula to compute the probability using the following steps.

    1. Please use a list comprehension to create a list \(\left[\frac{365}{365},\frac{365-1}{365},\frac{365-2}{365},\ldots,\frac{365-22}{365}\right]\).
    2. Use numpy.prod to compute the product of elements of the above list.
    3. Compute the probability by finishing the formula.
  3. Please use time to test which method mentioned above is faster.

2.7 Projects

Most projects are based on [2], [5].

Exercise 2.13 (Determine the indefinite article) Please finish the following tasks.

  1. Please construct a list aeiou that contains all vowels.
  2. Given a word word, we would like to find the indefinite article article before word. (Hint: the article should be an if the first character of word is a vowel, and a if not.)

Solution. Consider in, .lower() and if structure.

Exercise 2.14 (Datetime and file names) We would like to write a program to quickly generate N files. Every time we run the code, N files will be generated. We hope to store all files generated and organize them in a neat way. To achieve this, one way is to create a subfolder for each run and store all files generated during that run in the particular subfolder. Since we would like to make it fast, the real point of this task is to find a way to automatically generate the file names for the files generated and the folder names for the subfolders generated. You don’t need to worry about the contents of the files and empty files are totally fine for this problem.

Solution. One way to automatically generate file names and folder names is to use the date and the time when the code is run. Please check the datetime package for getting and formatting date/time, and the os package for working with files and folders.

Exercise 2.15 (Color the genomic data) We can use ANSI color codes in a string to change the color of the printed text, for example \033[91m for red and \033[94m for blue. See the following example.

print('\033[91m'+'red'+'\033[92m'+'green'+'\033[94m'+'blue'+'\033[93m'+'yellow')

Consider the (incomplete) genomic data given below, which is represented by a long sequence of A, C, T and G. Please color it using ANSI color codes.

Gnomicdata = 'TCGATCTCTTGTAGATCTGTTCTCTAAACGAACTTTAAAATCTGTGTGGCTGTCACTCGG'\
             'CTGCATGCTTAGTGCACTCACGCAGTATAATTAATAACTAATTACTGTCGTTGACAGGAC'\
             'ACGAGTAACTCGTCTATCTTCTGCAGGCTGCTTACGGTTTCGTCCGTGTTGCAGCCGATC'\
             'ATCAGCACATCTAGGTTTTGTCCGGGTGTGACCGAAAGGTAAGATGGAGAGCCTTGTCCC'\
             'TGGTTTCAACGAGAAAACACACGTCCAACTCAGTTTGCCTGTTTTACAGGTTCGCGACGT'\
             'GCTCGTACGTGGCTTTGGAGACTCCGTGGAGGAGGTCTTATCAGAGGCACGTCAACATCT'\
             'TAAAGATGGCACTTGTGGCTTAGTAGAAGTTGAAAAAGGCGTTTTGCCTCAACTTGAACA'\
             'GCCCTATGTGTTCATCAAACGTTCGGATGCTCGAACTGCACCTCATGGTCATGTTATGGT'\
             'TGAGCTGGTAGCAGAACTCGAAGGCATTCAGTACGGTCGTAGTGGTGAGACACTTGGTGT'\
             'CCTTGTCCCTCATGTGGGCGAAATACCAGTGGCTTACCGCAAGGTTCTTCTTCGTAAGAA'\
             'CGGTAATAAAGGAGCTGGTGGCCATAGTTACGGCGCCGATCTAAAGTCATTTGACTTAGG'\
             'CGACGAGCTTGGCACTGATCCTTATGAAGATTTTCAAGAAAACTGGAACACTAAACATAG'

Solution (Hint). You may use if to do the conversion. Or you may use dict to do the conversion.

Exercise 2.16 (sorted) Please read through the Key Functions section in this article, and sort the following two lists.

  1. Sort list1 = [[11,2,3], [2, 3, 1], [5,-1, 2], [2, 3,-8]] according to the sum of each list.

  2. Sort list2 = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4},{'a': 5, 'b': 2}] according to the b value of each dictionary.

Exercise 2.17 (Fantasy Game Inventory) You are creating a fantasy video game. The data structure to model the player’s inventory will be a dictionary where the keys are string values describing the item in the inventory and the value is an integer value detailing how many of that item the player has. For example, the dictionary value {'rope': 1, 'torch': 6, 'gold coin': 42, 'dagger': 1, 'arrow': 12} means the player has 1 rope, 6 torches, 42 gold coins, and so on.

Write a program to take any possible inventory and display it like the following:

Inventory:
12 arrow
42 gold coin
1 rope
6 torch
1 dagger
Total number of items: 62