{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "5O5v478MwBjn"
},
"source": [
"# Guide to Extracting Data w/ APIs from the Fitbit Sense"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "z3QMsFOnzcu2"
},
"source": [
"\n",
"\n",
"A picture of the Fitbit Sense"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Lc7qnO7xxgFM"
},
"source": [
"At a price of around $250, the [Fitbit Sense](https://www.fitbit.com/global/us/products/smartwatches/sense) is a sleep and physical activity tracker that also tracks stress, exercise, ECG, SpO2, and more.\n",
"\n",
"This guide shows how to extract data from the Fitbit Sense using the Fitbit Web API. Links to external resources and official Fitbit documentation are provided throughout the guide for further reference.\n",
"\n",
"If you want to know more about the Fitbit Sense, see the [README](https://github.com/alrojo/wearipedia/tree/main/wearables/fitbit-Sense) for a detailed analysis of performance, sensors, data privacy, and extraction pipelines.\n",
"\n",
"\n",
"A list of the most important accessible data categories is provided below. For the full list, inspect the `api_data` variable in Section 3.\n",
"\n",
"Category Name (API version) | Parameter Name (subcategory) | Frequency of Sampling\n",
":-------------------:|:----------------------:|:----------------------:\n",
"sleep | date | during the night\n",
"sleep | duration | during the night\n",
"sleep | efficiency | during the night\n",
"sleep | end time | during the night\n",
"sleep | sleep levels | during the night\n",
"steps | date and time | daily\n",
"steps | value (number of steps) | daily\n",
"minutesVeryActive | date and time | daily\n",
"minutesVeryActive | value | daily\n",
"minutesFairlyActive | date and time | daily\n",
"minutesFairlyActive | value | daily\n",
"minutesLightlyActive | date and time | daily\n",
"minutesLightlyActive | value | daily\n",
"distance moved | date and time | daily\n",
"distance moved | value | daily\n",
"minutesSedentary | date and time | daily\n",
"minutesSedentary | value | daily\n",
"heart rate | resting heart rate | daily (per minute)\n",
"heart rate | heart rate zones | daily (per minute)\n",
"heart rate | heart rate variability | during sleep (per minute)\n",
"temperature | skin temperature | daily\n",
"temperature | core temperature | daily\n",
"SpO2 | date and time | during sleep\n",
"SpO2 | value | during sleep\n",
"\n",
"\n",
"\n",
"\n",
"In this guide, we sequentially cover the following **five** topics to extract data from the Fitbit API:\n",
"\n",
"1. **Setup**\n",
"   - 1.1: Study participant setup and usage\n",
"   - 1.2: Library imports\n",
"2. **Authentication/Authorization**\n",
"   - 2.1: Registering an application\n",
"   - 2.2: Authorizing the app\n",
"   - 2.3: Retrieving the authorization code\n",
"   - 2.4: Calling the API\n",
"3. **Data extraction**\n",
"   - Select the dates\n",
"4. **Data visualization**\n",
"   - 4.1: Visualizing non-wear days and filtering the data\n",
"   - 4.2: Visualizing heart rate\n",
"   - 4.3: Visualizing distance moved\n",
"5. **Data analysis**\n",
"   - 5.1: Finding outliers (anomaly detection). We provide two ways to find outliers in any set of output data.\n",
"   - 5.2: Checking for correlation between distance moved and heart rate\n",
"\n",
"\n",
"*Note: Full documentation of the Fitbit APIs can be found [here](https://dev.fitbit.com/build/reference).*"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "zvLGwn8RwCAA"
},
"source": [
"# Setup"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "NRlKPoM_1U0y"
},
"source": [
"## 1.1 Study participant setup and usage\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8dCBzeFd1NDx"
},
"source": [
"After creating your Fitbit account and charging the watch, download the Fitbit app from the App Store or Google Play, connect the watch to your account, and start using your Fitbit. You will see that the data is being collected and visualized in the app. Once you have some data, you can access it through the Fitbit Web API by following this notebook."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-FU5QEqw1GlJ"
},
"source": [
"## 1.2 Library imports"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "o4Wj4rtt1GLD"
},
"outputs": [],
"source": [
"import base64\n",
"import hashlib\n",
"import html\n",
"import json\n",
"import os\n",
"import re\n",
"import urllib.parse\n",
"import requests\n",
"import matplotlib.pyplot as plt\n",
"import pandas as pd\n",
"import seaborn as sns\n",
"import numpy as np\n",
"from sklearn.covariance import EllipticEnvelope\n",
"from scipy import stats"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Ocm9m7rLwCT_"
},
"source": [
"# Authentication/Authorization"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "zXGozB-G6nJ5"
},
"source": [
"To obtain access to the data using the Web API, authentication and authorization are required. Fitbit supports the OAuth 2.0 protocol with three different flows (read more about it [here](https://dev.fitbit.com/build/reference/web-api/authorization/)).\n",
"\n",
"Briefly, Fitbit offers three workflows for its Web API, listed from the lowest level of security to the highest:\n",
"* Implicit Grant Flow\n",
"* Authorization Code Grant Flow\n",
"* Authorization Code Grant Flow with PKCE (Proof Key for Code Exchange)\n",
"\n",
"\n",
"We will discuss the **Authorization Code Grant Flow**. Note that the code below sets `response_type` to `token`, which makes Fitbit return the access token directly in the redirect URL (the Implicit Grant Flow); set it to `code` to use the Authorization Code Grant Flow instead.\n",
"\n",
"The full documentation for all workflows is provided [here](https://dev.fitbit.com/build/reference/web-api/developer-guide/authorization/).\n",
"\n",
"*Note: Tokens have a limited TTL (time to live), set by the `expires_in` parameter. Once that time has passed, a new token must be issued.*"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "IK-J_4Ff7uv9"
},
"source": [
"## 2.1 Registering An Application"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "63Ry0mG8708f"
},
"source": [
"First, register an application [here](https://dev.fitbit.com/apps/new) while logged in. Set the OAuth 2.0 Application Type to **Client** or **Personal**. The Callback URL is the address at which you receive your token. `https://127.0.0.1:8080/` is used as an example (127.0.0.1 is the [localhost](https://en.wikipedia.org/wiki/Localhost) address and [8080](https://www.quora.com/What-is-port-8080-used-for#:~:text=Port%208080%20is%20typically%20used%20for%20a%20personally%20hosted%20web%20server) is the port), but any link accessible locally should suffice. The other fields have no particular requirements (e.g. https://google.com works for all website links). An image with the important sections highlighted is provided below for clarity."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "fq005A8o8I5b"
},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "_UmHCCDB-t4l"
},
"source": [
"The `client_id` and `client_secret` can be accessed under **Manage My Apps** as shown below."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "WgKbH_EC-1Oo"
},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "tdEVz0g8_SAZ"
},
"source": [
"Afterwards, we initialize a [dictionary](https://docs.python.org/3/tutorial/datastructures.html#dictionaries) to hold all variables relevant to authenticating, authorizing, and calling the API later."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "E5cPpq3l_8yi"
},
"outputs": [],
"source": [
"# PKCE code verifier: generated once and reused if this cell is re-run\n",
"if \"code_verifier\" not in locals():\n",
"    # strip the '=' padding so the verifier only contains characters allowed by RFC 7636\n",
"    code_verifier = base64.urlsafe_b64encode(\n",
"        os.urandom(43)\n",
"    ).decode(\"utf-8\").rstrip(\"=\")\n",
"# PKCE code challenge: base64url-encoded SHA-256 hash of the verifier, without padding\n",
"code_challenge = base64.urlsafe_b64encode(\n",
"    hashlib.sha256(code_verifier.encode(\"utf-8\")).digest()\n",
").decode(\"utf-8\").rstrip(\"=\")\n",
"\n",
"variables = dict()\n",
"\n",
"# user specified (replace with the values of your own application)\n",
"variables[\"client_id\"] = \"238RJ5\"\n",
"variables[\"client_secret\"] = \"f56c7bc6d3cce8cc78edf723ce37f444\"\n",
"variables[\"expires_in\"] = \"31536000\"  # expiry of token in seconds\n",
"\n",
"# constants or one-time generated\n",
"variables[\"code_verifier\"] = code_verifier\n",
"variables[\"code_challenge\"] = code_challenge\n",
"variables[\"code_challenge_method\"] = \"S256\"\n",
"variables[\"response_type\"] = \"token\"  # alternative: \"code\"\n",
"# scope and redirect_uri are stored already percent-encoded\n",
"variables[\"scope\"] = (\n",
"    \"weight%20location%20settings%20profile%20nutrition%20\" +\n",
"    \"activity%20sleep%20heartrate%20social\"\n",
")\n",
"variables[\"prompt\"] = \"none\"\n",
"variables[\"redirect_uri\"] = \"https%3A%2F%2F127.0.0.1%3A8080%2F\"\n",
"variables[\"grant_type\"] = \"authorization_code\"\n",
"# base64-encoded \"client_id:client_secret\" pair\n",
"variables[\"authorization\"] = base64.urlsafe_b64encode(\n",
"    bytes(variables[\"client_id\"] + \":\" + variables[\"client_secret\"], \"utf-8\")\n",
").decode(\"utf-8\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "n_Puc2IGBGfJ"
},
"source": [
"## 2.2 Authorize the app"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "5lV7uKCQBKiE"
},
"source": [
"\n",
"Next, we open Fitbit's authorization page by entering a specific URL in a web browser. A code verifier and a code challenge are required for the PKCE variant of the flow. The concept is comprehensively outlined in [RFC 7636](https://tools.ietf.org/html/rfc7636).\n",
"\n",
"The URL should consist of the following required parameters (split into \"variable\" and \"non-variable\" parameters).\n",
"\n",
"- Variable parameters\n",
"  * `client_id`: Fitbit API application ID (found under **Manage My Apps** as described in 2.1, link [here](https://dev.fitbit.com/apps), requires login)\n",
"  * `code_challenge`: base64url-encoded SHA-256 hash of the code verifier, can be obtained [here](https://example-app.com/pkce)\n",
"  * `code_challenge_method`: S256\n",
"\n",
"- Non-variable parameters\n",
"  * `scope`: space-delimited list of data collections requested by the application\n",
"  * `response_type`: `code` for the Authorization Code Grant Flow, or `token` (as set in this notebook) to receive the access token directly"
]
},
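{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of the `code_challenge` derivation described above, assuming the S256 method from RFC 7636, a verifier and challenge pair could be generated as follows. The helper name `make_pkce_pair` is hypothetical and not part of any Fitbit library.\n",
"\n",
"```python\n",
"import base64\n",
"import hashlib\n",
"import secrets\n",
"\n",
"\n",
"def make_pkce_pair(n_bytes=64):\n",
"    # hypothetical helper: returns (code_verifier, code_challenge)\n",
"    # token_urlsafe yields unpadded base64url text, 86 characters for 64 bytes\n",
"    verifier = secrets.token_urlsafe(n_bytes)\n",
"    digest = hashlib.sha256(verifier.encode('ascii')).digest()\n",
"    # base64url-encode the SHA-256 digest and drop the '=' padding\n",
"    challenge = base64.urlsafe_b64encode(digest).decode('ascii').rstrip('=')\n",
"    return verifier, challenge\n",
"\n",
"\n",
"verifier, challenge = make_pkce_pair()\n",
"print(len(verifier), len(challenge))\n",
"```\n",
"\n",
"The challenge goes into the authorization URL, while the verifier stays local and is only sent later when an authorization code is exchanged for a token."
]
},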
{
"cell_type": "markdown",
"metadata": {
"id": "o2iPirhyBpSB"
},
"source": [
"The resulting URL is demonstrated below."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "t2Mf-QlmBtns",
"outputId": "c1815d26-fb9e-4c3b-efdc-13bcdd875681"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"https://www.fitbit.com/oauth2/authorize?client_id=238RJ5&redirect_uri=https%3A%2F%2F127.0.0.1%3A8080%2F&code_challenge=GlrjTiiXE-cZe3MGpBSjMQeYMiJcGEs6dGk26hujl-E&code_challenge_method=S256&scope=weight%20location%20settings%20profile%20nutrition%20activity%20sleep%20heartrate%20social&response_type=token&expires_in=31536000\n"
]
}
],
"source": [
"# combine all parameters into the URL string\n",
"url = \"https://www.fitbit.com/oauth2/authorize\"  # authorization endpoint\n",
"keys = [\"client_id\", \"redirect_uri\", \"code_challenge\", \"code_challenge_method\", \"scope\", \"response_type\", \"expires_in\"]\n",
"# the values in `variables` are already percent-encoded, so they are joined as-is\n",
"url += \"?\" + \"&\".join(key + \"=\" + variables[key] for key in keys)\n",
"\n",
"print(url)"
]
},
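{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cell above joins values that were percent-encoded by hand. As an alternative sketch, the standard library can do the encoding from plain values. The parameter values below are placeholders chosen for illustration, not credentials from this guide.\n",
"\n",
"```python\n",
"from urllib.parse import quote, urlencode\n",
"\n",
"# placeholder values for illustration only; substitute your own\n",
"params = {\n",
"    'client_id': 'YOUR_CLIENT_ID',\n",
"    'redirect_uri': 'https://127.0.0.1:8080/',\n",
"    'scope': 'activity sleep heartrate',\n",
"    'response_type': 'token',\n",
"    'expires_in': '31536000',\n",
"}\n",
"\n",
"# quote_via=quote encodes spaces as %20 instead of '+'\n",
"query = urlencode(params, quote_via=quote)\n",
"auth_url = 'https://www.fitbit.com/oauth2/authorize?' + query\n",
"print(auth_url)\n",
"```\n",
"\n",
"Keeping the raw values in the dictionary and encoding them in one place makes it harder to end up with a half-encoded URL."
]
},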
{
"cell_type": "markdown",
"metadata": {
"id": "4cDsfD0IChZE"
},
"source": [
"Click the URL above to access the authorization page. Check **Allow All** and click the red **Allow** button, as shown below."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "CzbV7bvfC3yq"
},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "YKSQRaSQCD15"
},
"source": [
"## 2.3 Retrieving The Authorization Code "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "jpm_VeJ-CK5w"
},
"source": [
"If the `response_type` is `token`, the `access_token` is returned to you in the fragment (the part after `#`) of the redirect URL.\n",
"\n",
"https://127.0.0.1:8080/#access_token=eyJhbGciOiJIUzI1NiJ9.eyJhdWQiOiIyMzhSSjUiLCJzdWIiOiI5RlJHUFQiLCJpc3MiOiJGaXRiaXQiLCJ0eXAiOiJhY2Nlc3NfdG9rZW4iLCJzY29wZXMiOiJyc29jIHJzZXQgcmFjdCBybG9jIHJ3ZWkgcmhyIHJwcm8gcm51dCByc2xlIiwiZXhwIjoxNjkyODMzNDcwLCJpYXQiOjE2NjEyOTc0NzB9.9pcX9pYrk-si6OsvAFz68MvTVm21S1OZUna921g46NA&user_id=9FRGPT&scope=sleep+profile+weight+heartrate+location+settings+nutrition+activity+social&token_type=Bearer&expires_in=31536000\n",
"\n",
"In the example above, the `access_token` is **eyJhbGciOiJIUzI1NiJ9.eyJhdWQiOiIyMzhSSjUiLCJzdWIiOiI5RlJHUFQiLCJpc3MiOiJGaXRiaXQiLCJ0eXAiOiJhY2Nlc3NfdG9rZW4iLCJzY29wZXMiOiJyc29jIHJzZXQgcmFjdCBybG9jIHJ3ZWkgcmhyIHJwcm8gcm51dCByc2xlIiwiZXhwIjoxNjkyODMzNDcwLCJpYXQiOjE2NjEyOTc0NzB9.9pcX9pYrk-si6OsvAFz68MvTVm21S1OZUna921g46NA**.\n",
"\n",
"Store the `access_token` inside the `variables` dictionary."
]
},
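{
"cell_type": "markdown",
"metadata": {},
"source": [
"Instead of copying the token by hand, the redirect URL can be parsed. The sketch below assumes the token arrives in the URL fragment, as in the example above, and uses a shortened placeholder URL rather than a real token.\n",
"\n",
"```python\n",
"from urllib.parse import parse_qs, urlsplit\n",
"\n",
"# shortened placeholder URL; a real redirect carries a much longer token\n",
"redirect_url = (\n",
"    'https://127.0.0.1:8080/#access_token=EXAMPLE_TOKEN&user_id=EXAMPLE'\n",
"    '&scope=sleep+activity&token_type=Bearer&expires_in=31536000'\n",
")\n",
"\n",
"# everything after '#' is the fragment, formatted like a query string\n",
"fragment = urlsplit(redirect_url).fragment\n",
"fields = {key: values[0] for key, values in parse_qs(fragment).items()}\n",
"\n",
"access_token = fields['access_token']\n",
"print(access_token, fields['expires_in'])\n",
"```\n",
"\n",
"The resulting `access_token` string is what gets stored in `variables['access_token']`."
]
},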
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "6I6N9VOzDfyU"
},
"outputs": [],
"source": [
"variables[\"access_token\"] = 'eyJhbGciOiJIUzI1NiJ9.eyJhdWQiOiIyMzhSSjUiLCJzdWIiOiI5RlJHUFQiLCJpc3MiOiJGaXRiaXQiLCJ0eXAiOiJhY2Nlc3NfdG9rZW4iLCJzY29wZXMiOiJyc29jIHJzZXQgcmFjdCBybG9jIHJ3ZWkgcmhyIHJwcm8gcm51dCByc2xlIiwiZXhwIjoxNjkyODMzNDcwLCJpYXQiOjE2NjEyOTc0NzB9.9pcX9pYrk-si6OsvAFz68MvTVm21S1OZUna921g46NA'"
]
},
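{
"cell_type": "markdown",
"metadata": {},
"source": [
"The access token above consists of three dot-separated segments, which is the layout of a JSON Web Token (JWT). Assuming the middle segment is a base64url-encoded JSON payload with an `exp` claim (a Unix timestamp), a sketch like the following can show when a token expires. The helper `jwt_payload` is hypothetical, it does not verify the signature, and the toy token is built only so that the example runs without a real credential.\n",
"\n",
"```python\n",
"import base64\n",
"import json\n",
"from datetime import datetime, timezone\n",
"\n",
"\n",
"def jwt_payload(token):\n",
"    # hypothetical helper: decodes the middle (payload) segment of a JWT\n",
"    # without verifying the signature\n",
"    segment = token.split('.')[1]\n",
"    # restore the '=' padding that JWTs omit\n",
"    segment += '=' * (-len(segment) % 4)\n",
"    return json.loads(base64.urlsafe_b64decode(segment))\n",
"\n",
"\n",
"# build a toy token with only an 'exp' claim\n",
"toy_payload = base64.urlsafe_b64encode(\n",
"    json.dumps({'exp': 1692833470}).encode('utf-8')\n",
").decode('ascii').rstrip('=')\n",
"toy_token = 'header.' + toy_payload + '.signature'\n",
"\n",
"exp = jwt_payload(toy_token)['exp']\n",
"print(datetime.fromtimestamp(exp, tz=timezone.utc))\n",
"```\n",
"\n",
"Comparing `exp` with the current time is one way to decide whether a new token must be issued before calling the API."
]
},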
{
"cell_type": "markdown",
"metadata": {
"id": "uFNEPwuzF3Q9"
},
"source": [
"## 2.4 Calling the API"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "ZIJRSBhRGBUF"
},
"outputs": [],
"source": [
"# executes an HTTP request (GET by default) on the API and returns the parsed JSON body\n",
"def call_API(\n",
"    access_token: str,\n",
"    url: str,\n",
"    call: str = \"GET\"\n",
"):\n",
"    # the access token is sent as a Bearer token in the Authorization header\n",
"    headers = {\n",
"        \"Authorization\": \"Bearer \" + access_token\n",
"    }\n",
"    return requests.request(\n",
"        call, url=url, headers=headers).json()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "65nPJGWGGEF-",
"outputId": "78dc583f-3d19-453a-fee9-f84e840d7965"
},
"outputs": [
{
"data": {
"text/plain": [
"{'user': {'age': 31,\n",
" 'ambassador': False,\n",
" 'autoStrideEnabled': True,\n",
" 'avatar': 'https://static0.fitbit.com/images/profile/defaultProfile_100.png',\n",
" 'avatar150': 'https://static0.fitbit.com/images/profile/defaultProfile_150.png',\n",
" 'avatar640': 'https://static0.fitbit.com/images/profile/defaultProfile_640.png',\n",
" 'averageDailySteps': 14055,\n",
" 'challengesBeta': True,\n",
" 'clockTimeDisplayFormat': '12hour',\n",
" 'corporate': False,\n",
" 'corporateAdmin': False,\n",
" 'dateOfBirth': '1991-06-29',\n",
" 'displayName': 'Alexander J.',\n",
" 'displayNameSetting': 'name',\n",
" 'distanceUnit': 'en_US',\n",
" 'encodedId': '9FRGPT',\n",
" 'features': {'exerciseGoal': True},\n",
" 'firstName': 'Alexander',\n",
" 'foodsLocale': 'en_US',\n",
" 'fullName': 'Alexander Johansen',\n",
" 'gender': 'MALE',\n",
" 'glucoseUnit': 'en_US',\n",
" 'height': 182.8,\n",
" 'heightUnit': 'en_US',\n",
" 'isBugReportEnabled': False,\n",
" 'isChild': False,\n",
" 'isCoach': False,\n",
" 'languageLocale': 'en_US',\n",
" 'lastName': 'Johansen',\n",
" 'legalTermsAcceptRequired': False,\n",
" 'locale': 'en_US',\n",
" 'memberSince': '2021-06-07',\n",
" 'mfaEnabled': False,\n",
" 'offsetFromUTCMillis': -21600000,\n",
" 'sdkDeveloper': False,\n",
" 'sleepTracking': 'Normal',\n",
" 'startDayOfWeek': 'SUNDAY',\n",
" 'strideLengthRunning': 114.80000000000001,\n",
" 'strideLengthRunningType': 'auto',\n",
" 'strideLengthWalking': 75.9,\n",
" 'strideLengthWalkingType': 'auto',\n",
" 'swimUnit': 'en_US',\n",
" 'temperatureUnit': 'en_US',\n",
" 'timezone': 'America/Chicago',\n",
" 'topBadges': [{'badgeGradientEndColor': 'B0DF2A',\n",
" 'badgeGradientStartColor': '00A550',\n",
" 'badgeType': 'DAILY_STEPS',\n",
" 'category': 'Minions Badges',\n",
" 'cheers': [],\n",
" 'dateTime': '2022-12-12',\n",
" 'description': '22,222 steps in a day',\n",
" 'earnedMessage': 'Congrats on earning your first Minions: Kevin badge!',\n",
" 'encodedId': '22B656',\n",
" 'image100px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/100px/badge_daily_steps22222.png',\n",
" 'image125px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/125px/badge_daily_steps22222.png',\n",
" 'image300px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/300px/badge_daily_steps22222.png',\n",
" 'image50px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/badge_daily_steps22222.png',\n",
" 'image75px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/75px/badge_daily_steps22222.png',\n",
" 'marketingDescription': 'Cheers! Way to make it a mission to move. Next up: Take 32,100 steps in a day to get the Bob Badge.',\n",
" 'mobileDescription': 'Cheers! Way to make it a mission to move.',\n",
" 'name': 'Minions: Kevin (22,222 steps in a day)',\n",
" 'shareImage640px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/386px/shareLocalized/en_US/badge_daily_steps22222.png',\n",
" 'shareText': 'I took 22,222 steps in a day',\n",
" 'shortDescription': '22,222 steps in a day',\n",
" 'shortName': 'Minions: Kevin',\n",
" 'timesAchieved': 2,\n",
" 'value': 22222},\n",
" {'badgeGradientEndColor': '38D7FF',\n",
" 'badgeGradientStartColor': '2DB4D7',\n",
" 'badgeType': 'LIFETIME_DISTANCE',\n",
" 'category': 'Lifetime Distance',\n",
" 'cheers': [],\n",
" 'dateTime': '2022-07-23',\n",
" 'description': '70 lifetime miles',\n",
" 'earnedMessage': \"Whoa! You've earned the Penguin March badge!\",\n",
" 'encodedId': '22B8M7',\n",
" 'image100px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/100px/badge_lifetime_miles70.png',\n",
" 'image125px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/125px/badge_lifetime_miles70.png',\n",
" 'image300px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/300px/badge_lifetime_miles70.png',\n",
" 'image50px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/badge_lifetime_miles70.png',\n",
" 'image75px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/75px/badge_lifetime_miles70.png',\n",
" 'marketingDescription': \"By reaching 70 lifetime miles, you've earned the Penguin March badge!\",\n",
" 'mobileDescription': 'You matched the distance of the March of the Penguins—the annual trip emperor penguins make to their breeding grounds.',\n",
" 'name': 'Penguin March (70 lifetime miles)',\n",
" 'shareImage640px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/386px/shareLocalized/en_US/badge_lifetime_miles70.png',\n",
" 'shareText': 'I covered 70 miles with my #Fitbit and earned the Penguin March badge.',\n",
" 'shortDescription': '70 miles',\n",
" 'shortName': 'Penguin March',\n",
" 'timesAchieved': 1,\n",
" 'unit': 'MILES',\n",
" 'value': 70},\n",
" {'badgeGradientEndColor': '38D7FF',\n",
" 'badgeGradientStartColor': '2DB4D7',\n",
" 'badgeType': 'DAILY_FLOORS',\n",
" 'category': 'Daily Climb',\n",
" 'cheers': [],\n",
" 'dateTime': '2022-08-25',\n",
" 'description': '50 floors in a day',\n",
" 'earnedMessage': 'Congrats on earning your first Lighthouse badge!',\n",
" 'encodedId': '228TT7',\n",
" 'image100px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/100px/badge_daily_floors50.png',\n",
" 'image125px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/125px/badge_daily_floors50.png',\n",
" 'image300px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/300px/badge_daily_floors50.png',\n",
" 'image50px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/badge_daily_floors50.png',\n",
" 'image75px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/75px/badge_daily_floors50.png',\n",
" 'marketingDescription': \"You've climbed 50 floors to earn the Lighthouse badge!\",\n",
" 'mobileDescription': \"With a floor count this high, you're a beacon of inspiration to us all!\",\n",
" 'name': 'Lighthouse (50 floors in a day)',\n",
" 'shareImage640px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/386px/shareLocalized/en_US/badge_daily_floors50.png',\n",
" 'shareText': 'I climbed 50 flights of stairs and earned the Lighthouse badge! #Fitbit',\n",
" 'shortDescription': '50 floors',\n",
" 'shortName': 'Lighthouse',\n",
" 'timesAchieved': 4,\n",
" 'value': 50},\n",
" {'badgeGradientEndColor': 'FFDB01',\n",
" 'badgeGradientStartColor': 'D99123',\n",
" 'badgeType': 'LIFETIME_FLOORS',\n",
" 'category': 'Lifetime Climb',\n",
" 'cheers': [],\n",
" 'dateTime': '2022-09-06',\n",
" 'description': '500 lifetime floors',\n",
" 'earnedMessage': \"Yipee! You've earned the Helicopter badge!\",\n",
" 'encodedId': '228TB8',\n",
" 'image100px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/100px/badge_lifetime_floors500.png',\n",
" 'image125px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/125px/badge_lifetime_floors500.png',\n",
" 'image300px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/300px/badge_lifetime_floors500.png',\n",
" 'image50px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/badge_lifetime_floors500.png',\n",
" 'image75px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/75px/badge_lifetime_floors500.png',\n",
" 'marketingDescription': \"By climbing 500 lifetime floors, you've earned the Helicopter badge!\",\n",
" 'mobileDescription': \"You've reached the altitude of a helicopter. Upward and onward as you soar to new heights!\",\n",
" 'name': 'Helicopter (500 lifetime floors)',\n",
" 'shareImage640px': 'https://www.gstatic.com/fitbit/badge/images/badges_new/386px/shareLocalized/en_US/badge_lifetime_floors500.png',\n",
" 'shareText': 'I climbed 500 floors with my #Fitbit and earned the Helicopter badge.',\n",
" 'shortDescription': '500 floors',\n",
" 'shortName': 'Helicopter',\n",
" 'timesAchieved': 1,\n",
" 'value': 500}],\n",
" 'waterUnit': 'en_US',\n",
" 'waterUnitName': 'fl oz',\n",
" 'weight': 82.5,\n",
" 'weightUnit': 'en_US'}}"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# calls user profile\n",
"call_API(\n",
"    access_token=variables[\"access_token\"],\n",
"    url=\"https://api.fitbit.com/1/user/-/profile.json\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "GCoEb3ZcwCov"
},
"source": [
"# Data Extraction"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "q6K2MJSGGkCH"
},
"source": [
"Now, we can extract data by calling the API using the `call_API()` function. A full list of data types and endpoints is available [here](https://dev.fitbit.com/build/reference/web-api/explore/).\n",
"\n",
"In brief, the categories are:\n",
"* [Activity](https://dev.fitbit.com/build/reference/web-api/activity/)\n",
"* [Activity Intraday Time Series](https://dev.fitbit.com/build/reference/web-api/activity/#get-activity-intraday-time-series)\n",
"* [Activity Time Series](https://dev.fitbit.com/build/reference/web-api/activity/#get-activity-intraday-time-series)\n",
"* [Body and Weight](https://dev.fitbit.com/build/reference/web-api/body/)\n",
"* [Body and Weight Time Series](https://dev.fitbit.com/build/reference/web-api/body/#body-time-series)\n",
"* [Devices](https://dev.fitbit.com/build/reference/web-api/devices/)\n",
"* [Food and Water](https://dev.fitbit.com/build/reference/web-api/nutrition/)\n",
"* [Food and Water Time Series](https://dev.fitbit.com/build/reference/web-api/nutrition/)\n",
"* [Friends](https://dev.fitbit.com/build/reference/web-api/friends/)\n",
"* [Heart Rate Intraday Time Series](https://dev.fitbit.com/build/reference/web-api/heartrate-timeseries/)\n",
"* [Heart Rate Time Series](https://dev.fitbit.com/build/reference/web-api/heartrate-timeseries/)\n",
"* [Sleep](https://dev.fitbit.com/build/reference/web-api/sleep/)\n",
"* [Subscriptions](https://dev.fitbit.com/build/reference/web-api/subscription/)\n",
"* [User](https://dev.fitbit.com/build/reference/web-api/user/)\n"
]
},
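{
"cell_type": "markdown",
"metadata": {},
"source": [
"The activity time series endpoints used in this notebook share one URL pattern, so they can be assembled by a small helper. The sketch below assumes the `/1/user/-/activities/<resource>/date/<start>/<end>.json` layout; `time_series_url` and `BASE_URL` are hypothetical names introduced for illustration.\n",
"\n",
"```python\n",
"BASE_URL = 'https://api.fitbit.com'\n",
"\n",
"\n",
"def time_series_url(resource, start, end, version='1'):\n",
"    # hypothetical helper: builds a date-range URL for an activity resource\n",
"    # strip() guards against stray whitespace in the resource name\n",
"    resource = resource.strip()\n",
"    return (\n",
"        BASE_URL + '/' + version + '/user/-/activities/'\n",
"        + resource + '/date/' + start + '/' + end + '.json'\n",
"    )\n",
"\n",
"\n",
"for name in ['steps', 'distance', 'minutesSedentary ']:\n",
"    print(time_series_url(name, '2022-07-01', '2022-08-30'))\n",
"```\n",
"\n",
"Building every URL through one function means a stray space or tab in a resource name cannot silently produce an invalid endpoint."
]
},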
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 371
},
"id": "sNMnkFlKGr9b",
"outputId": "753f1f93-c84c-42eb-e6c2-98fa708a1b0b"
},
"outputs": [
],
"source": [
"#@title Set up the start and end dates (YYYY-MM-DD) or a single date for single-day data extraction\n",
"start_date = \"2022-07-01\" #@param {type:\"string\"}\n",
"end_date = \"2022-08-30\" #@param {type:\"string\"}\n",
"single_date = \"2022-12-11\" #@param {type:\"string\"}\n",
"\n",
"# store arguments for some of the categories\n",
"categories = {\n",
"    \"sleep\": {\n",
"        \"url\": \"https://api.fitbit.com/1.2/user/-/sleep/date/\" + start_date + \"/\" + end_date + \".json\"},\n",
"    \"steps\": {\n",
"        \"url\": \"https://api.fitbit.com/1/user/-/activities/steps/date/\" + start_date + \"/\" + end_date + \".json\"},\n",
"    \"minutesVeryActive\": {\n",
"        \"url\": \"https://api.fitbit.com/1/user/-/activities/minutesVeryActive/date/\" + start_date + \"/\" + end_date + \".json\"},\n",
"    \"minutesFairlyActive\": {\n",
"        \"url\": \"https://api.fitbit.com/1/user/-/activities/minutesFairlyActive/date/\" + start_date + \"/\" + end_date + \".json\"},\n",
"    \"minutesLightlyActive\": {\n",
"        \"url\": \"https://api.fitbit.com/1/user/-/activities/minutesLightlyActive/date/\" + start_date + \"/\" + end_date + \".json\"},\n",
"    \"distance\": {\n",
"        \"url\": \"https://api.fitbit.com/1/user/-/activities/distance/date/\" + start_date + \"/\" + end_date + \".json\"},\n",
"    # note: a stray tab after the resource name makes the endpoint invalid\n",
"    \"minutesSedentary\": {\n",
"        \"url\": \"https://api.fitbit.com/1/user/-/activities/minutesSedentary/date/\" + start_date + \"/\" + end_date + \".json\"},\n",
"    \"heart_rate_day\": {\n",
"        \"url\": \"https://api.fitbit.com/1/user/-/activities/heart/date/\" + single_date + \"/1d.json\"},\n",
"    \"hrv\": {\n",
"        \"url\": \"https://api.fitbit.com/1/user/-/hrv/date/\" + single_date + \".json\"},\n",
"    \"distance_day\": {\n",
"        \"url\": \"https://api.fitbit.com/1/user/-/activities/distance/date/\" + single_date + \"/1d.json\"},\n",
"    \"ECG\": {\n",
"        \"url\": \"https://api.fitbit.com/1/user/-/ecg/list.json?afterDate=\" + single_date + \"&sort=asc&limit=1&offset=0\"},\n",
"}\n",
"\n",
"\n",
"# initialize empty dictionary to aggregate values\n",
"api_data = dict()\n",
"\n",
"# loop api calls for all categories\n",
"for category, values in categories.items():\n",
"    response = call_API(\n",
"        url=values[\"url\"],\n",
"        access_token=variables[\"access_token\"]\n",
"    )\n",
"\n",
"    api_data[category] = [response]\n",
"\n",
"# initialize metadata with the top-level keys of each category's response\n",
"meta_api_data = {category: set(responses[0].keys()) for category, responses in api_data.items()}"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "pZtYSKuAGwz8"
},
"outputs": [],
"source": [
"meta_api_data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "gVAxzEYmG3GL"
},
"outputs": [],
"source": [
"api_data"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "_N29AffmwC9a"
},
"source": [
"# Data Visualization\n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "0P0eekSbUsyD"
},
"source": [
"## 4.1 Visualizing Non-Wear Days and Filtering The Data"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "bd0nGFRfV11M"
},
"source": [
"Here we are going to visualize what days data wasn't collected. Since the days in which data wasn't collected show \"zero\" values for the data, then filter this out to increase the accuracy of the analysis. This is because zero values will be consided data points in the analysis which will alter the results.\n",
"\n",
"So we will create a function that can come later in aid to delete the data which have no value, but first lets take a look at our sample of data."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "2up6mdKWW7Gb"
},
"outputs": [],
"source": [
"# First we are going to aggregate the data in arrays. We are taking steps as an example here\n",
"dates = []\n",
"steps = []\n",
"\n",
"for datapoint in api_data['steps'][0]['activities-steps']:\n",
" dates.append(datapoint['dateTime'])\n",
" steps.append(float(datapoint['value']))\n",
"\n",
"#Using a pandas dataframe to aggregate the data\n",
"d = {'Date': dates, 'steps': steps}\n",
"df = pd.DataFrame(data=d)\n",
"\n",
"with plt.style.context('fivethirtyeight'):\n",
" #Creating the plot \n",
" #Resizing the plot\n",
" fig,ax = plt.subplots()\n",
" fig.set_size_inches(12,10)\n",
"\n",
" sns.set_theme(style=\"dark\")\n",
" ax = sns.barplot(x=\"Date\", y=\"steps\", data=df, palette = 'Dark2_r' )\n",
"\n",
" # adjust tick sizes\n",
" plt.tick_params(axis='x', labelsize=8)\n",
" plt.tick_params(axis='y', labelsize=8)\n",
"\n",
" # rotates and right-aligns the x labels so they don't crowd each other.\n",
" for label in ax.get_xticklabels(which='major'):\n",
" label.set(rotation=90, horizontalalignment='right')\n",
" plt.show()\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "63X9CWHdcp-A"
},
"source": [
"As shown above, alot of the data is not recorded, and that's what we don't want to include in the analysis so lets filter this out."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "nsTNtYQhc982"
},
"outputs": [],
"source": [
"#since the data will consist of multiple arrays carrying multiple data points, \n",
"# we will create the function such that it gets the list of arrays as a parameter \n",
"# and the refrence index for the array to examine the data from\n",
"\n",
"def remove_non_wear(lst, refrence_index):\n",
" newlst = []\n",
" for i in range(len(lst)):\n",
" newlst.append([])\n",
"\n",
" for index in range(len(lst[refrence_index])):\n",
" if lst[refrence_index][index] != 0:\n",
" for array in lst:\n",
" newlst[lst.index(array)].append(array[index])\n",
" return newlst"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "ijb3XQ2EdIQM"
},
"outputs": [],
"source": [
"# Lets put that to test\n",
"\n",
"#Now let's test this\n",
"new_arrays = remove_non_wear([dates, steps], 1)\n",
"\n",
"#creating a new plot with the new data\n",
"dates = new_arrays[0]\n",
"steps = new_arrays[1]\n",
"\n",
"#creating a new plot with the new data\n",
"d = {'Date': dates, 'steps': steps}\n",
"df = pd.DataFrame(data=d)\n",
"\n",
"with plt.style.context('fivethirtyeight'):\n",
" #Resizing the plot\n",
" fig,ax = plt.subplots()\n",
" fig.set_size_inches(5,6)\n",
"\n",
" sns.set_theme(style=\"dark\")\n",
" ax = sns.barplot(x=\"Date\", y=\"steps\", data=df, palette = 'Dark2_r' )\n",
"\n",
" # adjust tick sizes\n",
" plt.tick_params(axis='x', labelsize=8)\n",
" plt.tick_params(axis='y', labelsize=8)\n",
"\n",
" # rotates and right-aligns the x labels so they don't crowd each other.\n",
" for label in ax.get_xticklabels(which='major'):\n",
" label.set(rotation=90, horizontalalignment='right')\n",
" plt.show()\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "nHBebu3udtI6"
},
"source": [
"Initially, you might think that the data on 22-07-25 wasn't filtered out but lets check if it has a non-zero value"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "z32qNDdXeBp7"
},
"outputs": [],
"source": [
"print(steps)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "I-cVS7QEeEUS"
},
"source": [
"The number of steps is only 11 which indicates that the fitbit was taken off most of the day. It is considered an outlier value for this reason and we will filter this out in section 5.1.\n",
"\n",
"Now that we know this filteration works, lets recreate some visuals from the app"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Qwveat0YUtfH"
},
"source": [
"## 4.2 Visualizing Heart Rate"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "DryU4NNlgoP3"
},
"source": [
"Here, we are going to replicate the following visual from the fitbit app that presents the heart rate in a single day.\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "f4OlemJGhnNc"
},
"source": [
"First we are going to collect the data in arrays then create the plot and make it look like the original plot. At the end, we place the labels and adjust the font sizes. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "fuxlKyVLhmku"
},
"outputs": [],
"source": [
"# First we are going to aggregate the data in arrays. We are taking steps as an example here\n",
"time = []\n",
"heart_rate = []\n",
"\n",
"for datapoint in api_data['heart_rate_day'][0]['activities-heart-intraday']['dataset']:\n",
" time.append(int(datapoint['time'][0:2]) * 60 + int(datapoint['time'][3:5]))\n",
" heart_rate.append(float(datapoint['value']))\n",
"\n",
"# Lets smooth out the data the way I think fitbit does something similar to\n",
"new_heart_rate = []\n",
"new_time = []\n",
"temp_time = []\n",
"counter = 0\n",
"for i in range(len(time)):\n",
" temp_time.append(time[i])\n",
" if counter % 9 == 0 or i == len(time)-1:\n",
" new_heart_rate.append(heart_rate[i])\n",
" new_time.append(sum(temp_time) / len(temp_time))\n",
" temp_time = []\n",
" counter+=1\n",
"\n",
"\n",
"\n",
"with plt.style.context('dark_background'):\n",
"\n",
" # creating the plot and setting the background color\n",
" fig = plt.figure(figsize=(9, 5))\n",
" fig.patch.set_facecolor('#0a7192')\n",
" # adjust the facecolor\n",
" plt.gca().set_facecolor('#0a7192')\n",
" \n",
" plt.plot(new_time, new_heart_rate, linewidth=3, color='#4fbecc')\n",
"\n",
"\n",
" # setting the width of the X axis\n",
" times = [0, 24*60]\n",
" plt.xticks(ticks=times, labels=['', ''])\n",
"\n",
" #keeping on only some of the y values by adjusting the labels\n",
" bpms = [54, 74, 94]\n",
" plt.yticks(ticks=bpms, labels=['54', '74', '94'])\n",
"\n",
" # removing the borders from four sides\n",
" plt.gca().spines['left'].set_visible(False)\n",
" plt.gca().spines['right'].set_visible(False)\n",
" plt.gca().spines['top'].set_visible(False)\n",
" plt.gca().spines['bottom'].set_visible(False)\n",
"\n",
" # adjusting the label size\n",
" plt.tick_params(axis='y', labelsize=16)\n",
" \n",
" # set the colors of the grid\n",
" plt.gca().grid(axis='y', color='#1f8ba0')\n",
"\n",
"\n",
" #adding labels\n",
" plt.figtext(0.5,1.0, 'Monday', fontsize=22, ha='center', color ='w', fontweight = 'bold')\n",
" plt.figtext(0.5,0.90, 'Beats Per Minute', fontsize=17, ha='center', color ='w')\n",
" plt.figtext(0.5,0.03, 'NOON', fontsize=18, ha='center', color ='w', fontweight = 'bold')\n",
" plt.figtext(0.15,0.02, '12:00 AM', fontsize=18, ha='center', color ='w', fontweight = 'bold')\n",
" plt.figtext(0.85,0.02, '11:59 PM', fontsize=16, ha='center', color ='w', fontweight = 'bold')\n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "kxfoYDaO2osm"
},
"source": [
"Now it looks very similar to the original plot!"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "q3_uzdBZUt9w"
},
"source": [
"## 4.3 Visualizing Distance\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "xoAAp6B24Lhq"
},
"source": [
"Now lets replicate this next visual which plots the distance walked in a certain day.\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6l0YWCH55peT"
},
"source": [
"To build this plot, we are going to collect the data in arrays, plot them with matplotlib, and edit the aesthetics to look similar to the original. One interesting thing that fitbit does is that it divides the day into sections and adds the data in those sections then visualizes them. That's what we are going to do despite that it may produce different aggregates numbers."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "vNXAAsnn6DIu"
},
"outputs": [],
"source": [
"# First we are going to aggregate the data in arrays with only 96 entries\n",
"# Since we have 1440 datapoint, each of the 96 entries will the sum of 15 of those\n",
"\n",
"time = []\n",
"distances = [0]*96\n",
"\n",
"counter = 0\n",
"entry = 0\n",
"for datapoint in api_data['distance_day'][0]['activities-distance-intraday']['dataset']:\n",
" if entry == 96:\n",
" break\n",
" distances[entry] += float(datapoint['value'])\n",
" if counter % 15 == 0:\n",
" time.append(int(datapoint['time'][0:2]) * 60 + int(datapoint['time'][3:5]))\n",
" entry += 1\n",
" counter += 1\n",
"\n",
"\n",
" \n",
"with plt.style.context('dark_background'):\n",
"\n",
" # creating the plot and setting the background color\n",
" fig = plt.figure(figsize=(9, 5))\n",
" fig.patch.set_facecolor('#02575c')\n",
" # adjust the facecolor\n",
" plt.gca().set_facecolor('#02575c')\n",
"\n",
" plt.bar(time, distances, color = 'w', edgecolor = '#80a9ab', width = 15)\n",
"\n",
" # removing the borders from four sides\n",
" plt.gca().spines['left'].set_visible(False)\n",
" plt.gca().spines['right'].set_visible(False)\n",
" plt.gca().spines['top'].set_visible(False)\n",
" plt.gca().spines['bottom'].set_visible(False)\n",
"\n",
" #keeping on only some of the y values by adjusting the labels\n",
" bpms = [0, 0.9]\n",
" plt.yticks(ticks=bpms, labels=['0', '0.9'])\n",
"\n",
" # Creating a horizontal lines at 0 and 0.9\n",
" plt.axhline(y=0.005, linewidth = 0.5) # can't make one at zero so using a small number\n",
" plt.axhline(y=0.9, linewidth = 0.5)\n",
"\n",
" # removing x labels\n",
" times = [0, 24*60]\n",
" plt.xticks(ticks=times, labels=['', ''])\n",
"\n",
" # adjusting the label size\n",
" plt.tick_params(axis='y', labelsize=16)\n",
" \n",
" #adding labels\n",
" plt.figtext(0.5,0.90, 'Distance', fontsize=19, ha='center', color ='w', fontweight = 'bold')\n",
" plt.figtext(0.5,0.02, 'NOON', fontsize=18, ha='center', color ='w', fontweight = 'bold')\n",
" plt.figtext(0.15,0.02, '12:00 AM', fontsize=18, ha='center', color ='w', fontweight = 'bold')\n",
" plt.figtext(0.85,0.02, '11:59 PM', fontsize=16, ha='center', color ='w', fontweight = 'bold')\n",
" \n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "voZtY46DDfEy"
},
"source": [
"Although the numbers are the same because we don't know how many parts are in the orginal graph, the pattern looks close enough."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "krHp2C8_wNC2"
},
"source": [
"# Data Analysis"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "fEzBJBzTUvG4"
},
"source": [
"## 5.1 Detecting Anomalies (Outliers)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "fTwDu3oifh4t"
},
"source": [
"Detecting outliers is a uselful thing to do before an analysis which is evident from the datapoint shown in section 4.1 where the device was worn for a very little time. Data points like this may affect the results of a data analysis done on the data that contain them. There are multiple ways of annotating outliers and here we are going to present two of them. "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "vR8yG6akLC3G"
},
"source": [
"### First Way to Detect Outliers: Iterquartile range\n",
"\n",
"According to the [National Institute of Standards and Technology's handbook](https://https://www.itl.nist.gov/div898/handbook/prc/section1/prc16.htm#:~:text=An%20outlier%20is%20an%20observation,what%20will%20be%20considered%20abnormal.), we assume that a \"mild outlier\" is a datapoint that is higher than or lower than the first quartile or higher than the third quartile by a distance of the interquartile range multiplied by 1.5 points, while \"extreme outliers\" are ones that distance multipled by 3 points instead. Lets create a function that applies this on a given array."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "gctkjYGsLhmK"
},
"outputs": [],
"source": [
"def find_Outliers(array):\n",
" #find interquartile range \n",
" quartiles = np.quantile(array, [0.25,0.75])\n",
" q1, q3 = quartiles[0],quartiles[1]\n",
" interquartile_Range = q3 - q1\n",
" \n",
" #append outliers\n",
" outliers = []\n",
" for item in array:\n",
" if item >= q3+(1.5* interquartile_Range) or item <= q1 - (1.5*interquartile_Range):\n",
" outliers.append(item) \n",
" return outliers"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "0Ku3BXl2L5dc"
},
"source": [
"Lets test this function"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "CjoVYLsFMO7s"
},
"outputs": [],
"source": [
"#we aleady have the new_heart_rate from the past plot\n",
"#lets inject an outlier value\n",
"new_heart_rate.append(12)\n",
"\n",
"#run the function\n",
"outliers = find_Outliers(new_heart_rate)\n",
"print(outliers)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "CgsEjPG0NgGL"
},
"source": [
"Looks like that works!"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Z5UaWZX1NqbG"
},
"source": [
"### Second Way To Detect Outliers: Elleptic Envelope\n",
"\n",
"The Elliptic Envelope algorithm is a machine learning algorith that creates a hypothetical ellipse around the set of data and points outside of this envelope are considered outliers. Check [this](https://towardsdatascience.com/machine-learning-for-anomaly-detection-elliptic-envelope-2c90528df0a6) to learn more about the algorithm."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"background_save": true
},
"id": "MO4JRUuVOUpr"
},
"outputs": [],
"source": [
"def find_outliers2(arr):\n",
" list_of_outliers = []\n",
" # Create a dataframe\n",
" d = {'arr': arr}\n",
" df = pd.DataFrame(data=d)\n",
"\n",
" # here we return the a list where the indexies with -1 values are where the\n",
" # outliers are at. learn more about the implementation here: \n",
" # https://www.datatechnotes.com/2020/04/anomaly-detection-with-elliptical-envelope-in-python.html\n",
" pred = EllipticEnvelope(assume_centered=False, contamination=0.02, random_state=None,\n",
" store_precision=True, support_fraction=None).fit_predict(df['arr'].array.reshape(-1, 1))\n",
" for i in range(len(pred)):\n",
" if pred[i] == -1:\n",
" list_of_outliers.append(arr[i])\n",
" return list_of_outliers"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "YSweE9utOlOz"
},
"source": [
"Lets test it"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"background_save": true
},
"id": "u2occ6f4OnPf"
},
"outputs": [],
"source": [
"#run the function\n",
"outliers = find_outliers2(new_heart_rate)\n",
"print(outliers)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "iEDduesoOxc9"
},
"source": [
"That works as well! There are numerous other ways to detect outliers. You can check [this](https://towardsdatascience.com/5-ways-to-detect-outliers-that-every-data-scientist-should-know-python-code-70a54335a623) out for more!\n",
"Now lets create a plot to highlight the outliers."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"background_save": true
},
"id": "vUL9zvWga1tH"
},
"outputs": [],
"source": [
"outlier_time = [123]\n",
"new_time.append(123)\n",
"\n",
"with plt.style.context('fivethirtyeight'):\n",
"\n",
" #creating the plot without highlighting outliers\n",
" plt.xlabel('heart rate')\n",
" plt.ylabel('dates')\n",
" plt.scatter(x = new_heart_rate, y = new_time, color = 'B')\n",
" plt.rcParams[\"figure.figsize\"] = (5,5)\n",
" plt.show(block=True)\n",
"\n",
" #recreating the plot with highlighting outliers\n",
" plt.xlabel('heart_rate')\n",
" plt.ylabel('dates')\n",
" plt.scatter(x = new_heart_rate, y = new_time, color = 'B')\n",
" plt.rcParams[\"figure.figsize\"] = (5,5)\n",
" plt.scatter(x = outliers, y = outlier_time, color='r')\n",
" plt.show(block=True)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6lAGrHSQUv3g"
},
"source": [
"## 5.2 Checking for correlation between distance walked and heart rate"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "QU2IigUXShZZ"
},
"source": [
"Here we are trying to see if there is a correlation between the distance walked and heart_rate. The very intuitive hypothesis is that walking for a longer distance will correlate with the heart rate at this time, and we are checking for the validity of this hyposthesis."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-v8dH4u7TH8c"
},
"source": [
"To test for this hypothesis, we are going to collect the data, make the times match and see if there is correlation by calculating its p-value"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "TPfet0KNTHXX"
},
"outputs": [],
"source": [
"# first, let's collect the data \n",
"\n",
"time_distances = []\n",
"distances = []\n",
"\n",
"for datapoint in api_data['distance_day'][0]['activities-distance-intraday']['dataset']:\n",
" distances.append(float(datapoint['value']))\n",
" time_distances.append(int(datapoint['time'][0:2]) * 60 + int(datapoint['time'][3:5]))\n",
"\n",
"time_heart = []\n",
"heart_rate = []\n",
"\n",
"for datapoint in api_data['heart_rate_day'][0]['activities-heart-intraday']['dataset']:\n",
" time_heart.append(int(datapoint['time'][0:2]) * 60 + int(datapoint['time'][3:5]))\n",
" heart_rate.append(float(datapoint['value']))\n",
"\n",
"\n",
"#since all times are collected in distances but not heart rate we have fix this\n",
"#to get matching size arrays\n",
"new_dist = []\n",
"new_bpm = []\n",
"\n",
"for i in range(len(distances)):\n",
" if distances[i] != 0:\n",
" if time_distances[i] in time_heart:\n",
" new_dist.append(distances[i])\n",
" new_bpm.append(heart_rate[time_heart.index(time_distances[i])])\n",
"\n",
"\n",
"#We remove the outliers after this\n",
"outliers1 = find_outliers2(new_dist)\n",
"outliers2 = find_outliers2(new_bpm)\n",
"for item in outliers1:\n",
" if item in new_dist:\n",
" index = new_dist.index(item)\n",
" new_dist.remove(item)\n",
" new_bpm.pop(index)\n",
"for item in outliers2:\n",
" if item in new_bpm:\n",
" index = new_bpm.index(item)\n",
" new_bpm.remove(item)\n",
" new_dist.pop(index)\n",
" \n",
"# create a dataframex \n",
"d = {'distance (miles)': new_dist, 'bpm': new_bpm}\n",
"df = pd.DataFrame(data=d)\n",
"\n",
"#plot the data\n",
"with plt.style.context('fivethirtyeight'):\n",
" graph = sns.lmplot(data=df, y=\"distance (miles)\", x=\"bpm\")\n",
" plt.show(block=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "nwgcapa_WzFL"
},
"outputs": [],
"source": [
"slope, intercept, r_value, p_value, std_err = stats.linregress(new_dist,new_bpm)\n",
"print(p_value)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "lpkWe-29XURl"
},
"source": [
"The p value here is about 0.00000057 which is definetly smaller than 0.05 which is the cutoff value for whether the correlation is significant or not. This means that there is a 0.000057% chance that this data is generated at random which means the correlation is significant! "
]
}
],
"metadata": {
"colab": {
"collapsed_sections": [
"n_Puc2IGBGfJ",
"YKSQRaSQCD15",
"uFNEPwuzF3Q9",
"0P0eekSbUsyD",
"Qwveat0YUtfH",
"q3_uzdBZUt9w",
"vR8yG6akLC3G",
"6lAGrHSQUv3g"
],
"provenance": []
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 0
}