We looked at a simple API in the last mission. That API didn't require authentication, but most APIs do. Imagine that you're using the Reddit API to pull a list of your private messages. It would be a huge privacy breach for Reddit to give that information to anyone, so requiring authentication makes sense.
Authentication is also used by APIs so that they can perform rate limiting. APIs are usually created to enable users to build interesting applications or services. In order to ensure that the API is available and responsive for all users, APIs prevent you from making too many requests in too short of a time. This is known as rate limiting, and ensures that one user cannot overload the API server by making too many requests too fast.
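To make rate limiting a little more concrete, here's a minimal sketch of how a script might check how many requests it has left. It assumes the GitHub-style response headers X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset; the helper name rate_limit_status and the header values below are made up for illustration.

```python
def rate_limit_status(response_headers):
    """Summarize rate-limit information from a response's headers.

    Assumes GitHub-style header names; other APIs may use different ones.
    """
    limit = int(response_headers["X-RateLimit-Limit"])
    remaining = int(response_headers["X-RateLimit-Remaining"])
    # The reset value is a Unix timestamp (seconds) for when the quota refills.
    reset_at = int(response_headers["X-RateLimit-Reset"])
    return {
        "limit": limit,
        "remaining": remaining,
        "reset_at": reset_at,
        "exhausted": remaining == 0,
    }


# Illustrative header values, not taken from a real response.
example_headers = {
    "X-RateLimit-Limit": "5000",
    "X-RateLimit-Remaining": "4999",
    "X-RateLimit-Reset": "1440000000",
}
status = rate_limit_status(example_headers)
print(status)
```

With a real response from the requests library, you would pass response.headers to the helper instead of the example dictionary.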
In this mission, we'll be exploring the Github API and using it to pull some interesting data on repositories and users. Github is a site for hosting code (if you haven't looked at it, you should -- it's a great place to host a portfolio). Github has user accounts (example), repositories that contain code (example), and organizations that can be created by companies (example).
You can find documentation on the API here. In particular, make sure to pay attention to the authentication section.
2: API Authentication
In order to authenticate with the Github API, we'll need to use an access token. An access token is a credential: a string that the API can read and associate with your account. You can generate one here.
Having a token is preferable to using a username and password for a few reasons:
- Typically, you will be accessing an API from a script. If you put your username and password in the script, and someone manages to get their hands on it, they can take over your account. With an access token, you can revoke it if there is a security breach and someone gets access to it. This will cancel their access.
- Access tokens can have scopes and specific permissions. For instance, you can make a token that has permission to write to your Github repositories and make new ones, or you can make a token that can only read from your repositories. This enables you to enforce token security better, and use tokens that only have read access in potentially insecure or shared scripts.
After generating a token, you need to pass an Authorization header to the Github API. Just like the server sends headers in response to our request, we can send the server headers when we make a request. Headers contain metadata about the request. With the requests library, we just make a dictionary of headers, and then pass it into the request when we make it.
The Authorization header needs to have the word token, followed by our access token. Here's an example Authorization header:
{"Authorization": "token 1f36137fbbe1602f779300dad26e4c1b7fbab631"}
In this case, our access token is 1f36137fbbe1602f779300dad26e4c1b7fbab631. This token is generated by Github, and is associated with the account of Vik Paruchuri.
You should never share your token with anyone you don't want to give access to your account -- the token you'll be using throughout this mission has been revoked, so it isn't valid anymore. Consider a token somewhat equivalent to a password, and store it securely.
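One way to keep a token out of the script itself is to read it from the environment. The sketch below is only an illustration: the helper build_auth_headers and the environment variable name GITHUB_TOKEN are hypothetical, and the fallback value is a placeholder, not a real token.

```python
import os


def build_auth_headers(token):
    """Return a headers dictionary carrying the given access token."""
    if not token:
        raise ValueError("An access token is required.")
    # The header value is the word "token", a space, then the token itself.
    return {"Authorization": "token " + token}


# GITHUB_TOKEN is a hypothetical variable name; the fallback is a placeholder.
token = os.environ.get("GITHUB_TOKEN", "example-token-not-real")
headers = build_auth_headers(token)
print(sorted(headers.keys()))
```

The resulting headers dictionary can be passed to requests.get through the headers argument, and the token never has to appear in the source file.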
Instructions
- Make an authenticated request to https://api.github.com/users/VikParuchuri/orgs -- this will tell you which organizations a Github user is in.
- Assign the json content of the response to orgs (you can get this with response.json()).
import requests
# Create a dictionary of headers, with our Authorization header.
headers = {"Authorization": "token 1f36137fbbe1602f779300dad26e4c1b7fbab631"}
# Make a GET request to the Github API with our headers.
# This API endpoint will give us details about Vik Paruchuri.
response = requests.get("https://api.github.com/users/VikParuchuri", headers=headers)
# Print the content of the response. As you can see, this token is associated with the account of Vik Paruchuri.
print(response.json())
# Make an authenticated request for the organizations that VikParuchuri belongs to.
response1 = requests.get("https://api.github.com/users/VikParuchuri/orgs", headers=headers)
orgs = response1.json()
print(response1)
print(orgs)
Output
{'avatar_url': 'https://avatars.githubusercontent.com/u/913340?v=3', 'events_url': 'https://api.github.com/users/VikParuchuri/events{/privacy}', 'gists_url': 'https://api.github.com/users/VikParuchuri/gists{/gist_id}', 'disk_usage': 711120, 'subscriptions_url': 'https://api.github.com/users/VikParuchuri/subscriptions', 'followers': 100, 'type': 'User', 'html_url': 'https://github.com/VikParuchuri', 'bio': None, 'starred_url': 'https://api.github.com/users/VikParuchuri/starred{/owner}{/repo}', 'followers_url': 'https://api.github.com/users/VikParuchuri/followers', 'public_gists': 9, 'company': 'dataquest.io', 'blog': 'http://www.vikparuchuri.com', 'gravatar_id': '', 'private_gists': 1, 'collaborators': 4, 'following_url': 'https://api.github.com/users/VikParuchuri/following{/other_user}', 'public_repos': 60, 'updated_at': '2015-08-19T18:44:48Z', 'organizations_url': 'https://api.github.com/users/VikParuchuri/orgs', 'name': 'Vik Paruchuri', 'site_admin': False, 'location': 'Boston, MA', 'email': 'vik.paruchuri@gmail.com', 'login': 'VikParuchuri', 'url': 'https://api.github.com/users/VikParuchuri', 'id': 913340, 'following': 10, 'received_events_url': 'https://api.github.com/users/VikParuchuri/received_events', 'plan': {'space': 976562499, 'private_repos': 20, 'collaborators': 0, 'name': 'medium'}, 'created_at': '2011-07-13T18:18:07Z', 'total_private_repos': 17, 'repos_url': 'https://api.github.com/users/VikParuchuri/repos', 'owned_private_repos': 17, 'hireable': None}
<mock_requests.Response object at 0x7f9b29372588>
[{'id': 11148054, 'events_url': 'https://api.github.com/orgs/dataquestio/events', 'members_url': 'https://api.github.com/orgs/dataquestio/members{/member}', 'description': None, 'avatar_url': 'https://avatars.githubusercontent.com/u/11148054?v=3', 'login': 'dataquestio', 'repos_url': 'https://api.github.com/orgs/dataquestio/repos', 'url': 'https://api.github.com/orgs/dataquestio', 'public_members_url': 'https://api.github.com/orgs/dataquestio/public_members{/member}'}]
###########################################################################
3: Endpoints And Objects
APIs are usually set up to let you retrieve information about specific objects in the database. In the last screen, we retrieved information about a specific user object, VikParuchuri. We could also retrieve information about other Github users using the same endpoint. For example, https://api.github.com/users/torvalds would get us information about Linus Torvalds.
# headers is loaded in.
response = requests.get("https://api.github.com/users/torvalds", headers=headers)
torvalds = response.json()
print(torvalds)
response
Response (<class 'mock_requests.Response'>)
<mock_requests.Response at 0x7f69e8b19160>
headers
dict (<class 'dict'>)
{'Authorization': 'token 1f36137fbbe1602f779300dad26e4c1b7fbab631'}
torvalds
dict (<class 'dict'>)
{'avatar_url': 'https://avatars.githubusercontent.com/u/1024025?v=3',
'bio': None,
'blog': None,
'company': 'Linux Foundation',
'created_at': '2011-09-03T15:26:22Z',
'email': None,
'events_url': 'https://api.github.com/users/torvalds/events{/privacy}',
'followers': 29687,
'followers_url': 'https://api.github.com/users/torvalds/followers',
'following': 0,
'following_url': 'https://api.github.com/users/torvalds/following{/other_user}',
'gists_url': 'https://api.github.com/users/torvalds/gists{/gist_id}',
'gravatar_id': '',
'hireable': None,
'html_url': 'https://github.com/torvalds',
'id': 1024025,
'location': 'Portland, OR',
'login': 'torvalds',
'name': 'Linus Torvalds',
'organizations_url': 'https://api.github.com/users/torvalds/orgs',
'public_gists': 0,
'public_repos': 2,
'received_events_url': 'https://api.github.com/users/torvalds/received_events',
'repos_url': 'https://api.github.com/users/torvalds/repos',
'site_admin': False,
'starred_url': 'https://api.github.com/users/torvalds/starred{/owner}{/repo}',
'subscriptions_url': 'https://api.github.com/users/torvalds/subscriptions',
'type': 'User',
'updated_at': '2015-06-11T00:46:13Z',
'url': 'https://api.github.com/users/torvalds'}
##############################################################################
4: Other Objects
The Github API has a few other types of objects beyond users. For example, https://api.github.com/orgs/dataquestio will get you information about the Dataquest Github organization. https://api.github.com/repos/octocat/Hello-World will give you information about the Hello-World repository owned by the user octocat. Here's a link to that repository.
Find the full documentation for all the endpoints here.
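The endpoints we've seen so far follow a simple pattern: the object type, followed by the name of the object. As a small sketch of that pattern, the helpers below build the URLs used in this mission. The function names user_url, org_url, and repo_url are hypothetical and aren't part of any library.

```python
BASE_URL = "https://api.github.com"


def user_url(username):
    """Endpoint for a single user object."""
    return f"{BASE_URL}/users/{username}"


def org_url(org):
    """Endpoint for a single organization object."""
    return f"{BASE_URL}/orgs/{org}"


def repo_url(owner, repo):
    """Endpoint for a single repository, identified by its owner and name."""
    return f"{BASE_URL}/repos/{owner}/{repo}"


print(user_url("torvalds"))
print(org_url("dataquestio"))
print(repo_url("octocat", "Hello-World"))
```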
Instructions
- Make a GET request to the https://api.github.com/repos/octocat/Hello-World endpoint.
- Assign the decoded json result to hello_world.
# Enter your answer here.
response = requests.get("https://api.github.com/repos/octocat/Hello-World", headers=headers)
hello_world = response.json()
#################################################################
5: Pagination
Sometimes, a certain request can return a lot of objects. This can happen when you're doing something like listing out all of a user's repositories, for example. Returning too much data will take a long time, and will make the server slow down. For example, if a user has 1000+ repositories, requesting all of them might take 10+ seconds. This isn't a great user experience, so it's typical for API providers to implement pagination. This means that the API provider will only return a certain number of records per page. You can specify the page number that you want to access. To access all of the pages, you'll need to write a loop.
To get the repositories that a user has starred, or marked as interesting, we can use the following API route -- https://api.github.com/users/VikParuchuri/starred. We can add two pagination query parameters to it, page and per_page. page is the page that we want to access, and per_page is the number of records we want to see on each page. Typically, API providers enforce a cap on how high per_page can be, because setting it to an extremely high value defeats the purpose of pagination.
You can see the documentation on Github API pagination here.
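Here's a rough sketch of the kind of loop that walks through every page. It assumes that pages are numbered from 1 and that a page past the end comes back as an empty list. The function fetch_all_pages is hypothetical. It takes the request function as its first argument, so the example can run offline against a stand-in (fake_get) instead of the real API.

```python
def fetch_all_pages(get, url, headers=None, per_page=50, max_pages=100):
    """Collect the records from every page of a paginated endpoint.

    `get` is any function with the same calling convention as requests.get.
    Stops at the first empty page, or after max_pages as a safety limit.
    """
    records = []
    for page in range(1, max_pages + 1):
        params = {"per_page": per_page, "page": page}
        response = get(url, headers=headers, params=params)
        items = response.json()
        if not items:
            break
        records.extend(items)
    return records


class FakeResponse:
    """Stand-in for a response object; only implements json()."""

    def __init__(self, items):
        self._items = items

    def json(self):
        return self._items


def fake_get(url, headers=None, params=None):
    """Pretend API holding 120 records, sliced by page and per_page."""
    data = list(range(120))
    start = (params["page"] - 1) * params["per_page"]
    return FakeResponse(data[start:start + params["per_page"]])


all_items = fetch_all_pages(fake_get, "https://api.github.com/users/VikParuchuri/starred")
print(len(all_items))
```

Against the real API, the call would pass requests.get as the first argument along with the headers dictionary, for example fetch_all_pages(requests.get, url, headers=headers).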
Instructions
- Get the second page of starred repositories from the https://api.github.com/users/VikParuchuri/starred endpoint.
- Assign the loaded json of the response to page2_repos.
params = {"per_page": 50, "page": 1}
response = requests.get("https://api.github.com/users/VikParuchuri/starred", headers=headers, params=params)
page1_repos = response.json()
params = {"per_page": 50, "page": 2}
response = requests.get("https://api.github.com/users/VikParuchuri/starred", headers=headers, params=params)
page2_repos = response.json()
########################################################
6: User-Level Endpoints
So far, we've looked at endpoints where you need to explicitly provide the username of the user whose information you're looking up. An example is https://api.github.com/users/VikParuchuri/starred -- this pulls up the starred repositories for VikParuchuri.
Since we're authenticated with our token, the system knows who we are, and can show us some relevant information without us having to specify our username. These user-level endpoints are usually included because they enable you to get private information or perform actions that require authentication (like changing your user account information).
Making a GET request to https://api.github.com/user will give you information about the user that the authentication token is for.
There are other endpoints like this, that automatically give you information or allow you to take actions as the authenticated user.
Instructions
- Make a GET request to the "https://api.github.com/user" endpoint.
- Assign the decoded json of the result to the user variable.
# Enter your code here.
response = requests.get("https://api.github.com/user", headers=headers)
user = response.json()
#######################################################################
7: POST Requests
In the last mission, we mentioned different types of requests. So far, we've been making GET requests. GET requests are used to retrieve information from the server (hence the name GET). There are a few other types of requests.
One of them is called a POST request. POST requests are used to send information to the server, and create objects on the server. In our case, we can use POST requests to create new repositories.
Different API endpoints choose what types of requests they will accept. Not all endpoints will accept a POST request, and not all will accept a GET request. You'll have to consult API documentation to figure out which endpoints accept which types of requests.
We can make POST requests using requests.post. POST requests almost always include data, because we need the data to create the object on the server.
To pass in the data, we do something very similar to what we do with query parameters and GET requests:
payload = {"name": "test"}
requests.post("https://api.github.com/user/repos", json=payload)
The above code will make a new repository named test, owned by the currently authenticated user. It will convert the payload dictionary to json, and pass it along with the POST request.
Look at the repos api documentation for a full listing of what data can be passed with this POST request. A short listing:
- name -- required, the name of the repository
- description -- optional, the description of the repository
A successful POST request will usually return a 201 status code, indicating that the object was created successfully on the server. Sometimes, the json representation of the object that was created will be returned as the content of the response.
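As a sketch of how the payload and the 201 check fit together, the hypothetical helper create_repository below builds the payload from the two fields listed above and only returns the created object when the status code is 201. It assumes the endpoint returns json for the new object. The request function is passed in so the example can run offline against a stand-in (fake_post).

```python
def create_repository(post, headers, name, description=None):
    """Create a repository and return its json representation.

    `post` is any function with the same calling convention as requests.post.
    """
    payload = {"name": name}  # name is required
    if description is not None:
        payload["description"] = description  # description is optional
    response = post("https://api.github.com/user/repos", json=payload, headers=headers)
    if response.status_code != 201:
        raise RuntimeError(f"Repository was not created (status {response.status_code}).")
    return response.json()


class FakeResponse:
    """Stand-in for a response object with a status code and json body."""

    def __init__(self, status_code, body):
        self.status_code = status_code
        self._body = body

    def json(self):
        return self._body


def fake_post(url, json=None, headers=None):
    """Pretend server: rejects requests without an Authorization header."""
    if not headers or "Authorization" not in headers:
        return FakeResponse(401, {"message": "Requires authentication"})
    return FakeResponse(201, dict(json))


created = create_repository(fake_post, {"Authorization": "token example"}, "learning-about-apis")
print(created)
```

With the real library, requests.post would be passed as the first argument.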
Instructions
- Create a repository named learning-about-apis.
- Assign the status code of the response to the status variable.
# Create the data we'll pass into the API endpoint. This endpoint only requires the "name" key, but others are optional.
payload = {"name": "test"}
# We need to pass in our authentication headers!
response = requests.post("https://api.github.com/user/repos", json=payload, headers=headers)
print(response.status_code)
payload = {"name": "learning-about-apis"}
response = requests.post("https://api.github.com/user/repos", json=payload, headers=headers)
status = response.status_code
############################################
8: PUT/PATCH Requests
Sometimes, we don't want to make a new object, we just want to update an existing one. This is where PATCH and PUT requests come into play. We use PATCH requests when we want to change a few attributes of an object, and we don't want to send the whole object to the server (maybe we just want to change the name of our repository, for example). We use PUT requests when we want to send the whole object to the server, and replace the version on the server with the version we sent.
In practice, API developers don't always respect this convention, and sometimes API endpoints that accept PUT requests will treat them like PATCH requests, and not require that the whole object be sent back.
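The difference between the two conventions can be sketched with plain dictionaries, without making any requests. The functions apply_patch and apply_put below are hypothetical and only model what a server following the convention would do. The private attribute is included purely for illustration.

```python
def apply_patch(stored, changes):
    """PATCH convention: update only the attributes that were sent."""
    updated = dict(stored)
    updated.update(changes)
    return updated


def apply_put(stored, replacement):
    """PUT convention: replace the stored object with the one that was sent."""
    return dict(replacement)


# An illustrative stored object; the attribute names are for demonstration only.
repo = {"name": "test", "description": None, "private": False}

patched = apply_patch(repo, {"description": "The best repository ever!"})
replaced = apply_put(repo, {"name": "test", "description": "The best repository ever!"})

print(patched)
print(replaced)
```

After the PATCH-style update, the private attribute is still there because it was never mentioned. After the PUT-style update, it is gone because the replacement object didn't include it.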
We send a payload with PATCH requests, the same way we do with POST requests:
payload = {"description": "The best repository ever!", "name": "test"}
response = requests.patch("https://api.github.com/repos/VikParuchuri/test", json=payload)
The above code will change the description of the test repository to The best repository ever! (we didn't specify a description when we created it).
A PATCH request will usually return a 200 status code if everything goes fine.
Instructions
- Make a PATCH request to the https://api.github.com/repos/VikParuchuri/learning-about-apis endpoint that changes the description to Learning about requests!
- Assign the status code of the response to status.
payload = {"description": "The best repository ever!", "name": "test"}
response = requests.patch("https://api.github.com/repos/VikParuchuri/test", json=payload, headers=headers)
print(response.status_code)
payload = {"description": "Learning about requests!", "name": "learning-about-apis"}
response = requests.patch("https://api.github.com/repos/VikParuchuri/learning-about-apis", json=payload, headers=headers)
status = response.status_code
status
int (<class 'int'>)
200
payload
dict (<class 'dict'>)
{'description': 'Learning about requests!', 'name': 'learning-about-apis'}
headers
dict (<class 'dict'>)
{'Authorization': 'token 1f36137fbbe1602f779300dad26e4c1b7fbab631'}
######################################################################
9: DELETE Requests
The final major request type is the DELETE request. The DELETE request removes objects from the server. We can use the DELETE request to remove repositories.
response = requests.delete("https://api.github.com/repos/VikParuchuri/test")
The above code will delete the test repository from Github.
A successful DELETE request will usually return a 204 status code, indicating that the object has been deleted.
DELETE requests should be used carefully -- it's very easy to accidentally remove something important.
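One possible safeguard is to make the caller repeat the repository name before anything is sent. The sketch below is an assumption about how you might structure this, not something the API requires. The helper delete_repository is hypothetical, and the request function is passed in so the example can run offline against a stand-in (fake_delete).

```python
def delete_repository(delete, headers, owner, repo, confirm_name):
    """Delete a repository, but only if confirm_name matches its name.

    `delete` is any function with the same calling convention as requests.delete.
    Returns True when the server answers with a 204 status code.
    """
    if confirm_name != repo:
        raise ValueError("Confirmation name does not match; nothing was deleted.")
    url = f"https://api.github.com/repos/{owner}/{repo}"
    response = delete(url, headers=headers)
    return response.status_code == 204


class FakeResponse:
    """Stand-in for a response object; only carries a status code."""

    def __init__(self, status_code):
        self.status_code = status_code


deleted_urls = []


def fake_delete(url, headers=None):
    """Pretend server that records the URL and always reports success."""
    deleted_urls.append(url)
    return FakeResponse(204)


ok = delete_repository(
    fake_delete,
    {"Authorization": "token example"},
    "VikParuchuri",
    "learning-about-apis",
    confirm_name="learning-about-apis",
)
print(ok)
```

If confirm_name doesn't match, the function raises an error before any request is made.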
Instructions
- Make a DELETE request to the https://api.github.com/repos/VikParuchuri/learning-about-apis endpoint.
- Assign the status_code of the response to the variable status.
response = requests.delete("https://api.github.com/repos/VikParuchuri/test", headers=headers)
print(response.status_code)
response = requests.delete("https://api.github.com/repos/VikParuchuri/learning-about-apis", headers=headers)
status = response.status_code
print(status)
response
Response (<class 'mock_requests.Response'>)
<mock_requests.Response at 0x7fda16e777b8>
headers
dict (<class 'dict'>)
{'Authorization': 'token 1f36137fbbe1602f779300dad26e4c1b7fbab631'}
status
int (<class 'int'>)
204
10: Further Exploration
That's it for the major points of working with APIs, but feel free to explore more with your own API token. If you want to generate a Github access token, head here. Then, you can consult the API documentation to find good routes to query.