Unit 02 Lab 1: Python Programming

Part 1: Overview


In this lab you will become familiar with the Jupyter notebook interactive computing environment and learn the basics of programming in Python 3.

Learning Outcomes

Upon completing this lab you will be able to:

  - Use the Jupyter notebook for Python programming.
  - Understand and apply the basics of the Python programming language.


To complete this lab you will need:

NOTE: We can’t teach you everything about Python, so we recommend the following resources:

Before You Begin

Before you start this lab you should:

From your Hadoop Client:

  1. Open a Linux command prompt.
  2. Start Jupyter Notebook:
    $ jupyter-notebook
  3. Open a browser and go to http://localhost:8888
    You should see the Jupyter Application in your browser.

Part 2: Walk-Through

Using Jupyter

The Jupyter notebook is an interactive computing environment where you can create notebook documents that include Python code, text, plots, and other content such as images and video. The code you write in the notebook is executable, and its output appears inside the browser.

The Jupyter system consists of the web application, kernels, and notebooks. Kernels run the code you write. In your setup, only the Python 3 kernel is installed, but Jupyter supports a variety of languages and environments. Through the web application we create notebooks, which contain code (for the kernel we selected), documentation, and more. This section will walk you through the basics of the Jupyter environment.

  1. Let’s create a new notebook. Click the New button and select Python 3 for the notebook.
  2. This opens up a new notebook:
    [Screenshot: new untitled notebook]
  3. One of the first things we should do is name the notebook. Click on Untitled and rename it Hello.
  4. We type our code and documentation in the green-bordered grey cell with the prompt In [ ]. For example, type in:
    print('Hello, Python!')
    So that the box looks like this:
    [Screenshot: hello python code]
    NOTE: The cell is bordered in green to help us identify the cell we are currently editing.
  5. To run the code in this cell, click the run cell button. When you do this, the prompt changes to In [1] and the program output Hello, Python! appears below.
    [Screenshot: hello python output]
    You should also notice a new cell has appeared beneath it and is active.
  6. Cells allow us to run our code in parts, but the code is shared across all cells in the notebook. For example in the new cell, type:
    yourname = input('What is your name?')
    Then click the run cell button to execute. The cell is now waiting for you to enter something:
    [Screenshot: python input example]
    You’ll also notice the prompt says In [*]. The asterisk * informs you that this cell is still running.
  7. Enter your name into the box and press Enter (NOTE: I entered Tommy Tutone):
    [Screenshot: python input example, enter your name]
    You should now see that the cell has completed execution, shows In [2], and has output.
  8. As we stated earlier, all the cells in a notebook share code and data. Let’s demonstrate this. In the third cell, type:
    print('Hello', yourname)
    and click the run cell button to execute. You should see this output:
    [Screenshot: hello tommy tutone]

Exploring the Jupyter Toolbar

It is impractical to demonstrate all of the features of the Jupyter UI, so I encourage you to play around with the commands in the menu and toolbar. They are rather intuitive, and some, including checkpoints and kernel restarts, can be very useful.
[Screenshot: jupyter menu and toolbar]

What’s running

Every notebook you open runs in a separate process. Closing a notebook does not stop it from running. To view running notebooks, click on the Jupyter logo, then click the Running tab. You will see a list of running notebooks like this:
[Screenshot: running notebooks]

Python Basics

Create a new notebook and rename it Python Basics.

Variables and types

Python infers the data type from the data itself. Type the following code into a new cell and execute it:

# variables and types
x = 10        # int
pi = 3.14     # float
name = 'bob'  # string
print(type(x), type(pi), type(name))

You should see the following output:
[Screenshot: variable types output]
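As a rough illustration of type inference, the sketch below uses the built-in type() function to ask Python which type it chose for each value. The list name inferred is only an illustrative helper, not part of the lab.

```python
# type() reports the type Python inferred for each value
x = 10
pi = 3.14
name = 'bob'

# collect the type names so they are easy to compare side by side
inferred = [type(x).__name__, type(pi).__name__, type(name).__name__]
print(inferred)

# a variable name is not tied to one type; it can be rebound later
x = 'ten'
print(type(x).__name__)
```

Because the type travels with the value rather than with the variable name, rebinding x to a string is allowed.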


Here are some common arithmetic operators in Python. Type the following code into a new cell and execute it:

# arithmetic operators
x, y = 5, 2
print(x + y)   # 5 + 2
print(x - y)   # 5 - 2
print(x * y)   # 5 * 2
print(x / y)   # 5 / 2 (true division)
print(x // y)  # 5 // 2 (floor division, discards the remainder)
print(x % y)   # remainder of 5 / 2

You should see the following output:
[Screenshot: arithmetic operators output]
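A few related operations can be sketched the same way. The example below assumes the same values of x and y and shows the built-in divmod() function, the ** exponent operator, and how floor division behaves with a negative number.

```python
x, y = 5, 2

# divmod() returns the floor quotient and the remainder together
quotient, remainder = divmod(x, y)
print(quotient, remainder)

# ** raises x to the power y
print(x ** y)

# floor division rounds down (toward negative infinity), not toward zero
print(-5 // 2)

# the quotient and remainder always rebuild the original number
print(quotient * y + remainder == x)
```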

Here’s an example of Python’s comparison operators, which evaluate to the Boolean value True or False. Type the following code into a new cell and execute it:

# comparison operators. evaluate to True or False
x, y = 5, 2
print(x == y)  # 5 equals 2 ?
print(x > y)   # 5 greater than 2 ?
print(x < y)   # 5 less than 2 ?
print(x != y)  # 5 not equal to 2 ?

You should see the following output:
[Screenshot: logical operators output]
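Comparisons like these are usually combined with the keywords and, or, and not. The following is a minimal sketch using the same x and y; the variable names both, either, negated, and in_range are illustrative only.

```python
x, y = 5, 2

both = x > y and y > 0      # True only when both sides are True
either = x < y or x == 5    # True when at least one side is True
negated = not (x == y)      # flips True to False and vice versa
in_range = 0 < y < x        # comparisons can be chained

print(both, either, negated, in_range)
```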

Lists and Comprehensions

Python lists store collections of similar items. List items are indexed with brackets [] and the starting index is 0. Type the following code into a new cell and execute it:

# lists / arrays: Collections of similar items
words = ['this', 'is', 'only', 'a', 'test']
temperatures = [90, 85, 88, 92, 80, 78, 72]

# indexing first, 3rd and last words
print(words[0], words[2], words[-1])

# slicing, gets [2], [3] and [4]
print(words[2:5])

# how many words?
print(len(words))

# sort the words
words.sort()
print(words)

You should see the following output:
[Screenshot: list output]
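To make the indexing rules concrete, here is a small sketch with the same word list. It assumes nothing beyond built-in list operations: a slice stops just before its end index, sorted() returns a new list, and append() changes the list in place.

```python
words = ['this', 'is', 'only', 'a', 'test']

# a slice includes the start index and excludes the stop index
middle = words[2:5]
print(middle)

# sorted() returns a new sorted list and leaves the original untouched
alphabetical = sorted(words)
print(alphabetical)
print(words)

# append() adds an item to the end of the existing list
words.append('really')
print(len(words))
```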

Python’s list comprehensions allow us to create new lists from lists. This example is fairly straightforward. Type it into a new cell and execute:

# comprehensions make new lists from lists.
numbers = list(range(1, 10))
squares = [n*n for n in numbers]
odd_squares = [n*n for n in numbers if n % 2 == 1]
print(numbers)
print(squares)
print(odd_squares)

You should see the following output:
[Screenshot: list comprehensions output]
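It can help to see a comprehension next to the loop it replaces. The sketch below builds the odd squares both ways; odd_squares_loop is an illustrative name.

```python
numbers = list(range(1, 10))

# the long way: start with an empty list and append inside a loop
odd_squares_loop = []
for n in numbers:
    if n % 2 == 1:
        odd_squares_loop.append(n * n)

# the comprehension expresses the same loop and filter in one line
odd_squares = [n * n for n in numbers if n % 2 == 1]

print(odd_squares_loop)
print(odd_squares_loop == odd_squares)
```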

Tuples and Dictionaries

Tuples, unlike lists, are for collections of dissimilar items. They are indexed like lists, but they are immutable: you can’t change them. Type this code into a new cell and execute:

# tuples: immutable collections of dissimilar items, indexed just like lists.
weather = ('Syracuse', 'NY', 75)
print(weather)

# get NY
print(weather[1])

You should see the following output:
[Screenshot: tuples output]
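Two tuple behaviors are worth trying for yourself: unpacking a tuple into separate variables, and what happens when you try to change one. This is a minimal sketch; the names city, state, temperature, and changed are illustrative.

```python
weather = ('Syracuse', 'NY', 75)

# unpacking assigns each item of the tuple to its own variable
city, state, temperature = weather
print(city, state, temperature)

# assigning to an index of a tuple raises a TypeError
try:
    weather[2] = 80
    changed = True
except TypeError:
    changed = False

print('tuple changed?', changed)
```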

Python dictionaries store key-value pairs. They’re like tuples, but rather than being indexed by number they are indexed by key. Unlike tuples, they can also be changed. Type this code into a new cell and execute:

# Dictionaries: Key-value pairs, indexed by key
student = {'Name': 'Bob', 'GPA': 3.45, 'Age': 22}

# print the name and age
print(student['Name'], student['Age'])

student['Age'] = 23  # Happy birthday
print(student['Name'], student['Age'])

You should see the following output:
[Screenshot: dictionaries output]
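Dictionaries support a few more everyday operations, sketched below. The 'Major' and 'Email' keys and the value 'IST' are made up for illustration and are not part of the lab data.

```python
student = {'Name': 'Bob', 'GPA': 3.45, 'Age': 22}

# assigning to a new key adds a new key-value pair (hypothetical key)
student['Major'] = 'IST'

# get() returns a default instead of raising an error for a missing key
email = student.get('Email', 'n/a')
print(email)

# items() lets a for loop visit every key together with its value
for key, value in student.items():
    print(key, '=', value)
```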

Functions and Lambdas

Python functions are code subroutines which can be re-used. They allow us to extend the language with our own commands which add modularity and programming clarity to our code. Type and execute:

# functions
def area(radius):
    return 3.14159 * radius * radius

def circumference(radius):
    return 3.14159 * 2 * radius

r = 1
print('area', area(r))
print('circumference', circumference(r))

You should see the following output:
[Screenshot: function output]
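As one possible refinement, the same two functions can take the value of pi from Python's standard math module instead of a typed-in constant. This is a sketch of that variation, not a change to the lab's expected output.

```python
import math

def area(radius):
    # math.pi is more precise than the literal 3.14159
    return math.pi * radius ** 2

def circumference(radius):
    return 2 * math.pi * radius

# once defined, the functions can be re-used with any radius
for r in (1, 2, 10):
    print(r, area(r), circumference(r))
```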

Lambda functions are un-named inline functions. They’re used when you’re just too lazy to create a function with def. Type and execute:

#lambda functions

r = 1
area = lambda radius: 3.14159 * radius * radius
circumference = lambda radius: 3.14159 * 2 * radius

print('area', area(r))
print('circumference', circumference(r))

You should see the following output:
[Screenshot: lambda functions output]

Lambdas and Lists: Lambdas are often used to apply an anonymous function through a filter() or map() operation. This is similar to a List Comprehension. Type this in and execute it:

# lambdas and lists
numbers = [1,6,2,4,9,8,3,6]

even_numbers = list(filter(lambda n: n%2==0, numbers))
squares = list(map(lambda n: n*n, numbers))

print('Even Numbers', even_numbers)
print('Squares', squares)

You should see the following output:
[Screenshot: lambdas and lists output]
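To see how filter() with a lambda relates to a list comprehension, the sketch below produces the even numbers both ways. It also shows another common place for a lambda, the key argument of sorted(). The names even_filter, even_comp, and by_length are illustrative.

```python
numbers = [1, 6, 2, 4, 9, 8, 3, 6]

# filter() with a lambda and a comprehension give the same list
even_filter = list(filter(lambda n: n % 2 == 0, numbers))
even_comp = [n for n in numbers if n % 2 == 0]
print(even_filter == even_comp)

# a lambda can also tell sorted() what to sort by, here word length
words = ['this', 'is', 'only', 'a', 'test']
by_length = sorted(words, key=lambda w: len(w))
print(by_length)
```

Words of equal length keep their original relative order, because Python's sort is stable.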


When we need to make decisions in our code and branch execution based on Boolean values, we use if…elif…else. Type this example and execute it three times, giving a different age each time so as to trigger each of the three different outputs:

# decisions
age = int(input('How old are you? '))
if age <= 21:
    print("You're just a kid!")
elif age < 65:
    print("Welcome to the daily grind!")
else:
    print("Time to retire!")

You should see the following output, when you enter 99 as input:
[Screenshot: decisions output]
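One way to check all three branches without typing an age each time is to wrap the decision in a function and call it with values on either side of each boundary. This is only a sketch; the function name life_stage is hypothetical.

```python
def life_stage(age):
    # the first condition that is True wins; later ones are skipped
    if age <= 21:
        return "You're just a kid!"
    elif age < 65:
        return "Welcome to the daily grind!"
    else:
        return "Time to retire!"

# try the values just below and just above each cut-off
for age in (21, 22, 64, 65, 99):
    print(age, life_stage(age))
```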


In Python the for loop repeats a series of commands for each item in a list. This example splits a sentence into a list of words and then, for each word, adds one to a count if the word is "is". Type this in and execute:

# loops
sentence = "this is a test this is only a test that is all it is"
words = sentence.split(' ') # make a list of words
count = 0
for word in words:
    if word == 'is':
        count = count + 1

print('The word "is" appears', count, 'times')

You should see the following output:
[Screenshot: loops output]
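Once the loop version makes sense, it may be useful to know that Python can do the same counting for you. The sketch below uses the list method count() and the Counter class from the standard collections module on the same sentence.

```python
from collections import Counter

sentence = "this is a test this is only a test that is all it is"
words = sentence.split(' ')

# count() tallies one specific item in a list
is_count = words.count('is')
print('The word "is" appears', is_count, 'times')

# Counter tallies every distinct word in one pass
counts = Counter(words)
print(counts['test'], counts['only'])
```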


Reading from data files with Python is simple. To do this we use the with open statement. Here’s an example to read the preamble.txt file. Type and execute:

# read a file
with open('datasets/text/preamble.txt') as file:
    for line in file:
        print(line)

You should see the following output:
[Screenshot: read file output]
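Each line read from a file still ends with its newline character, and print() adds another, which is why the output can look double-spaced. The sketch below shows one way to strip it. So that it runs anywhere, it writes a small stand-in file with the tempfile module first; the file contents are made up and are not the lab's preamble.txt.

```python
import os
import tempfile

# create a small stand-in text file (assumption: not the lab dataset)
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('first line\nsecond line\n')
    path = tmp.name

lines = []
with open(path) as file:
    for line in file:
        # rstrip('\n') removes the trailing newline from each line
        lines.append(line.rstrip('\n'))

os.remove(path)  # clean up the stand-in file
print(lines)
```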

In our final example we put a couple of ideas together to write a program that counts the number of words in the preamble. Type and execute:

# word counts:
count = 0
with open('datasets/text/preamble.txt') as file:
    for line in file:
        for word in line.split(' '):
            count = count + 1

print("There are", count, "words in the preamble")

You should see the following output:
[Screenshot: word count output]
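Splitting on a single space treats blank lines and doubled spaces as if they were words, so the count can come out higher than expected. The sketch below compares split(' ') with split() (no argument, which splits on any run of whitespace). It uses a made-up stand-in file rather than the lab's preamble.txt, so the numbers here say nothing about the real file.

```python
import os
import tempfile

# stand-in file with a double space and a blank line (assumption)
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('we the  people\n\nof the united states\n')
    path = tmp.name

count_space = 0  # counting with split(' ')
count_any = 0    # counting with split()
with open(path) as file:
    for line in file:
        count_space = count_space + len(line.split(' '))
        count_any = count_any + len(line.split())

os.remove(path)  # clean up the stand-in file
print(count_space, count_any)
```

The stand-in text holds seven real words, which is what split() reports, while split(' ') also counts the empty string between the doubled spaces and the blank line.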

Test Yourself

  1. What are the three basic data types we explored in this exercise?
  2. What are the three structured data types we explored in this exercise?
  3. Explain when you should use a list versus a tuple.
  4. How is a Dictionary different from a tuple?
  5. What is the advantage of a lambda over a regular function?

Part 3: On Your Own


Open a new notebook called Unit02Lab1Part3. Write code in each cell to answer each question.


Write each program in its own cell.

  1. Write a program to ask for an integer value and then print out that number squared. For example, when you input 3 the output would be 9.
  2. Write a program to print odd when the integer you input is an odd number or print even when your input is an even number.
  3. Create your own list of words. Write a program to loop through the words and print out the first letter of each word. For example, if your word list is ['what','the','fudge'] the output would be w t f
  4. Modify your program in the previous example to print out the last letter in each word.
  5. Write a program that asks you to enter a word at run time, then counts the number of times that word appears in datasets/text/zork1-walkthru.txt