This is a Jupyter notebook. We'll write all of our code in this class in a Jupyter notebook.
Today, don't worry about how any of this works. Throughout the semester, we'll learn how each of these pieces works.
Note: The maps in this notebook will not load correctly in Safari if you're on a Mac; use Chrome.
from datascience import *
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import plotly.graph_objects as go
Here, we'll load in data about all public universities in California. The data comes from this Wikipedia article.
uni = Table.read_table('data/california_universities.csv')
uni = uni.with_columns(
    'Enrollment', uni.apply(lambda s: int(s.replace(',', '')), 'Enrollment'),
    'Founded', uni.apply(lambda s: int(s.replace('*', '')), 'Founded')
)
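The two `lambda` functions above only strip stray characters before converting text to integers. Here is a minimal plain-Python sketch of that cleaning step. The helper name `to_int` and the sample strings are illustrative assumptions, not values read from the CSV file.

```python
def to_int(text, junk):
    """Remove every occurrence of `junk` from `text`, then convert to int."""
    return int(text.replace(junk, ''))

# Hypothetical raw values, formatted the way the cleaning code suggests
raw_enrollment = '42,519'
raw_founded = '1965*'

enrollment = to_int(raw_enrollment, ',')
founded = to_int(raw_founded, '*')
print(enrollment, founded)
```

Once the values are integers, they can be sorted and plotted as numbers rather than as text.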
Data is often stored in tables. In a few weeks, we'll become very familiar with how tables work. For now, let's just observe.
uni.show(15)
Name | City | County | Enrollment | Founded |
---|---|---|---|---|
University of California, Berkeley | Berkeley | Alameda | 42519 | 1869 |
University of California, Davis | Davis | Yolo | 39152 | 1905 |
University of California, Irvine | Irvine | Orange | 35220 | 1965 |
University of California, Los Angeles | Los Angeles | Los Angeles | 45428 | 1882 |
University of California, Merced | Merced | Merced | 8544 | 2005 |
University of California, Riverside | Riverside | Riverside | 23278 | 1954 |
University of California, San Diego | San Diego | San Diego | 38798 | 1960 |
University of California, Santa Barbara | Santa Barbara | Santa Barbara | 24346 | 1891 |
University of California, Santa Cruz | Santa Cruz | Santa Cruz | 19700 | 1965 |
California State University Maritime Academy | Vallejo | Solano | 1017 | 1929 |
California Polytechnic State University | San Luis Obispo | San Luis Obispo | 21812 | 1901 |
California State Polytechnic University, Pomona | Pomona | Los Angeles | 26443 | 1938 |
California State University, Bakersfield | Bakersfield | Kern | 10493 | 1965 |
California State University Channel Islands | Camarillo | Ventura | 7095 | 2002 |
California State University, Chico | Chico | Butte | 17488 | 1887 |
... (17 rows omitted)
Let's start asking questions.
uni.sort('Enrollment', descending = True)
Name | City | County | Enrollment | Founded |
---|---|---|---|---|
University of California, Los Angeles | Los Angeles | Los Angeles | 45428 | 1882 |
University of California, Berkeley | Berkeley | Alameda | 42519 | 1869 |
California State University, Fullerton | Fullerton | Orange | 39774 | 1957 |
University of California, Davis | Davis | Yolo | 39152 | 1905 |
University of California, San Diego | San Diego | San Diego | 38798 | 1960 |
California State University, Northridge | Northridge | Los Angeles | 38716 | 1958 |
California State University, Long Beach | Long Beach | Los Angeles | 36846 | 1949 |
University of California, Irvine | Irvine | Orange | 35220 | 1965 |
San Diego State University | San Diego | San Diego | 34881 | 1897 |
San Jose State University | San Jose | Santa Clara | 32828 | 1857 |
... (22 rows omitted)
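Sorting in descending order puts the largest enrollment first. The same idea can be sketched in plain Python with the built-in `sorted` function, using a few (name, enrollment) pairs copied from the table above. The list name `schools` is illustrative.

```python
# (name, enrollment) pairs copied from the table above
schools = [
    ('University of California, Berkeley', 42519),
    ('University of California, Merced', 8544),
    ('University of California, Los Angeles', 45428),
    ('University of California, Davis', 39152),
]

# reverse=True plays the role of descending = True
by_enrollment = sorted(schools, key=lambda pair: pair[1], reverse=True)

for name, enrollment in by_enrollment:
    print(name, enrollment)
```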
uni.sort('Enrollment', descending = True).barh('Name', 'Enrollment')
uni.sort('Founded')
Name | City | County | Enrollment | Founded |
---|---|---|---|---|
San Jose State University | San Jose | Santa Clara | 32828 | 1857 |
University of California, Berkeley | Berkeley | Alameda | 42519 | 1869 |
University of California, Los Angeles | Los Angeles | Los Angeles | 45428 | 1882 |
California State University, Chico | Chico | Butte | 17488 | 1887 |
University of California, Santa Barbara | Santa Barbara | Santa Barbara | 24346 | 1891 |
San Diego State University | San Diego | San Diego | 34881 | 1897 |
San Francisco State University | San Francisco | San Francisco | 29586 | 1899 |
California Polytechnic State University | San Luis Obispo | San Luis Obispo | 21812 | 1901 |
University of California, Davis | Davis | Yolo | 39152 | 1905 |
California State University, Fresno | Fresno | Fresno | 24995 | 1911 |
... (22 rows omitted)
uni_copy = uni.sort('Founded').with_columns('Total Universities', np.arange(1, uni.num_rows + 1))
uni_copy.plot('Founded', 'Total Universities')
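The running total works because the rows are sorted by founding year first: the earliest university gets 1, the next gets 2, and so on. Here is a small sketch of the same idea with NumPy alone, using a handful of founding years from the table above. The variable names are illustrative.

```python
import numpy as np

# A few founding years from the table above, deliberately out of order
founded = np.array([1905, 1869, 1965, 1882, 1857])

# Sort the years, then number them 1, 2, 3, ... to get a running count
founded_sorted = np.sort(founded)
total_universities = np.arange(1, len(founded_sorted) + 1)

for year, total in zip(founded_sorted, total_universities):
    print(year, total)
```

Each pair reads as "by this year, this many universities had been founded" within the sample.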
Let's add some spice.
fig = go.Figure()

# Markers: hovering over a point shows the university's name
fig.add_trace(
    go.Scatter(x = uni_copy.column('Founded'),
               y = uni_copy.column('Total Universities'),
               hovertext = uni_copy.column('Name'),
               mode = 'markers',
    )
)

# Line: connects the same points in blue
fig.add_trace(
    go.Scatter(x = uni_copy.column('Founded'),
               y = uni_copy.column('Total Universities'),
               line = dict(color = 'blue'),
    )
)

fig.update_layout(title = 'Total Number of Public Universities in California by Year',
                  xaxis_title = 'Year',
                  yaxis_title = 'Total Universities')
fig.show()