Table Exercises

We work with the schools table from before.

from datascience import *
import numpy as np

schools = Table.read_table('data/cal_unis.csv')
schools
Name Institution City County Enrollment Founded
University of California, Berkeley UC Berkeley Alameda 42519 1869
University of California, Davis UC Davis Yolo 39152 1905
University of California, Irvine UC Irvine Orange 35220 1965
University of California, Los Angeles UC Los Angeles Los Angeles 45428 1882
University of California, Merced UC Merced Merced 8544 2005
University of California, Riverside UC Riverside Riverside 23278 1954
University of California, San Diego UC San Diego San Diego 38798 1960
University of California, Santa Barbara UC Santa Barbara Santa Barbara 24346 1891
University of California, Santa Cruz UC Santa Cruz Santa Cruz 19700 1965
California State University Maritime Academy CSU Vallejo Solano 1017 1929

... (22 rows omitted)

Pull up the Python Reference and try these exercises yourself! They are roughly in order of increasing difficulty.

Exercise 1: Variable names

How do we get all the column labels of schools?

schools.labels
('Name', 'Institution', 'City', 'County', 'Enrollment', 'Founded')

Exercise 2: Reorder columns

How do we reorder the columns to get the following table?

Hint: use one of select, drop, or with_columns.

schools.select(
  "Name", "Founded",
  "Institution", "City",
  "County", "Enrollment"
)
Name Founded Institution City County Enrollment
University of California, Berkeley 1869 UC Berkeley Alameda 42519
University of California, Davis 1905 UC Davis Yolo 39152
University of California, Irvine 1965 UC Irvine Orange 35220
University of California, Los Angeles 1882 UC Los Angeles Los Angeles 45428
University of California, Merced 2005 UC Merced Merced 8544
University of California, Riverside 1954 UC Riverside Riverside 23278
University of California, San Diego 1960 UC San Diego San Diego 38798
University of California, Santa Barbara 1891 UC Santa Barbara Santa Barbara 24346
University of California, Santa Cruz 1965 UC Santa Cruz Santa Cruz 19700
California State University Maritime Academy 1929 CSU Vallejo Solano 1017

... (22 rows omitted)

Exercise 3: Filtering

Let’s reset the schools table to our original dataset before continuing.

schools = Table.read_table('data/cal_unis.csv')
schools
Name Institution City County Enrollment Founded
University of California, Berkeley UC Berkeley Alameda 42519 1869
University of California, Davis UC Davis Yolo 39152 1905
University of California, Irvine UC Irvine Orange 35220 1965
University of California, Los Angeles UC Los Angeles Los Angeles 45428 1882
University of California, Merced UC Merced Merced 8544 2005
University of California, Riverside UC Riverside Riverside 23278 1954
University of California, San Diego UC San Diego San Diego 38798 1960
University of California, Santa Barbara UC Santa Barbara Santa Barbara 24346 1891
University of California, Santa Cruz UC Santa Cruz Santa Cruz 19700 1965
California State University Maritime Academy CSU Vallejo Solano 1017 1929

... (22 rows omitted)

  1. How do we get a table with all of the UC schools?
schools.where("Institution", "UC")
Name Institution City County Enrollment Founded
University of California, Berkeley UC Berkeley Alameda 42519 1869
University of California, Davis UC Davis Yolo 39152 1905
University of California, Irvine UC Irvine Orange 35220 1965
University of California, Los Angeles UC Los Angeles Los Angeles 45428 1882
University of California, Merced UC Merced Merced 8544 2005
University of California, Riverside UC Riverside Riverside 23278 1954
University of California, San Diego UC San Diego San Diego 38798 1960
University of California, Santa Barbara UC Santa Barbara Santa Barbara 24346 1891
University of California, Santa Cruz UC Santa Cruz Santa Cruz 19700 1965
  1. How do we get a table with all the schools in Los Angeles?
schools.where("City", "Los Angeles")
Name Institution City County Enrollment Founded
University of California, Los Angeles UC Los Angeles Los Angeles 45428 1882
California State University, Los Angeles CSU Los Angeles Los Angeles 27685 1947

Exercise 4: Rename Columns

How do we update schools such that the column Name is renamed University? Hint: Check out the method relabeled.

schools = schools.relabeled("Name", "University")
schools
University Institution City County Enrollment Founded
University of California, Berkeley UC Berkeley Alameda 42519 1869
University of California, Davis UC Davis Yolo 39152 1905
University of California, Irvine UC Irvine Orange 35220 1965
University of California, Los Angeles UC Los Angeles Los Angeles 45428 1882
University of California, Merced UC Merced Merced 8544 2005
University of California, Riverside UC Riverside Riverside 23278 1954
University of California, San Diego UC San Diego San Diego 38798 1960
University of California, Santa Barbara UC Santa Barbara Santa Barbara 24346 1891
University of California, Santa Cruz UC Santa Cruz Santa Cruz 19700 1965
California State University Maritime Academy CSU Vallejo Solano 1017 1929

... (22 rows omitted)

There are many ways to approach a problem. Suppose you didn’t know the method relabeled existed:

schools = schools.with_column("University", schools.column("Name")).drop("Name")
schools
Institution City County Enrollment Founded University
UC Berkeley Alameda 42519 1869 University of California, Berkeley
UC Davis Yolo 39152 1905 University of California, Davis
UC Irvine Orange 35220 1965 University of California, Irvine
UC Los Angeles Los Angeles 45428 1882 University of California, Los Angeles
UC Merced Merced 8544 2005 University of California, Merced
UC Riverside Riverside 23278 1954 University of California, Riverside
UC San Diego San Diego 38798 1960 University of California, San Diego
UC Santa Barbara Santa Barbara 24346 1891 University of California, Santa Barbara
UC Santa Cruz Santa Cruz 19700 1965 University of California, Santa Cruz
CSU Vallejo Solano 1017 1929 California State University Maritime Academy

... (22 rows omitted)

Questions for you as you read the above code:

  1. What methods are being chained together here? Which is executed first, with_column or drop?
  2. When is the column method being called?
  3. What would happen if you switched the two methods? (try it out)
  4. Why is University now the last column?

Exercise 5: Debugging show

The following cells exhibit a tricky, but potentially common bug. First, check out the Python Reference to understand what show does.

schools = Table.read_table('data/cal_unis.csv')
schools = schools.show(3)
Name Institution City County Enrollment Founded
University of California, Berkeley UC Berkeley Alameda 42519 1869
University of California, Davis UC Davis Yolo 39152 1905
University of California, Irvine UC Irvine Orange 35220 1965

... (29 rows omitted)

schools = schools.show(4)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[13], line 1
----> 1 schools = schools.show(4)

AttributeError: 'NoneType' object has no attribute 'show'

Like print, show is for display purposes. It does not return anything—i.e., it returns None. The second cell above therefore assigns schools to None!

Despite this, show—just like print—is useful when you want to display intermediate output for debugging purposes.

schools = Table.read_table('data/cal_unis.csv')
schools.show(3) # just the first three rows
schools = schools.relabeled("Name", "University")
schools.show(3) # and again
schools = schools.with_columns("Name", "City")
schools         # the last evaluated value in a cell
                # is displayed by default
Name Institution City County Enrollment Founded
University of California, Berkeley UC Berkeley Alameda 42519 1869
University of California, Davis UC Davis Yolo 39152 1905
University of California, Irvine UC Irvine Orange 35220 1965

... (29 rows omitted)

University Institution City County Enrollment Founded
University of California, Berkeley UC Berkeley Alameda 42519 1869
University of California, Davis UC Davis Yolo 39152 1905
University of California, Irvine UC Irvine Orange 35220 1965

... (29 rows omitted)

University Institution City County Enrollment Founded Name
University of California, Berkeley UC Berkeley Alameda 42519 1869 City
University of California, Davis UC Davis Yolo 39152 1905 City
University of California, Irvine UC Irvine Orange 35220 1965 City
University of California, Los Angeles UC Los Angeles Los Angeles 45428 1882 City
University of California, Merced UC Merced Merced 8544 2005 City
University of California, Riverside UC Riverside Riverside 23278 1954 City
University of California, San Diego UC San Diego San Diego 38798 1960 City
University of California, Santa Barbara UC Santa Barbara Santa Barbara 24346 1891 City
University of California, Santa Cruz UC Santa Cruz Santa Cruz 19700 1965 City
California State University Maritime Academy CSU Vallejo Solano 1017 1929 City

... (22 rows omitted)