{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "\n", "

Python for Business Analytics

\n", "

Working with data in Python (exercises)

\n", "
\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

Task 1

\n", "\n", "The marketing team at the financial services company that offers the credit card would like to better understand their clients in order to identify high-value existing and potential customers to send offers to. Using descriptive statistics, investigate the following:\n", "\n", "**(a)** Do gender or ethnicity seem related to the number of cards that a customer owns?\n", "\n", "**(b)** Which variables have the highest correlation with monthly credit card balance? Are there any other interesting correlations in the data?" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>Obs</th><th>Income</th><th>Limit</th><th>Rating</th><th>Cards</th><th>Age</th><th>Education</th><th>Gender</th><th>Student</th><th>Married</th><th>Ethnicity</th><th>Balance</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1</td><td>14.891</td><td>3606</td><td>283</td><td>2</td><td>34</td><td>11</td><td>Male</td><td>No</td><td>Yes</td><td>Caucasian</td><td>333</td></tr>\n",
"    <tr><th>1</th><td>2</td><td>106.025</td><td>6645</td><td>483</td><td>3</td><td>82</td><td>15</td><td>Female</td><td>Yes</td><td>Yes</td><td>Asian</td><td>903</td></tr>\n",
"    <tr><th>2</th><td>3</td><td>104.593</td><td>7075</td><td>514</td><td>4</td><td>71</td><td>11</td><td>Male</td><td>No</td><td>No</td><td>Asian</td><td>580</td></tr>\n",
"    <tr><th>3</th><td>4</td><td>148.924</td><td>9504</td><td>681</td><td>3</td><td>36</td><td>11</td><td>Female</td><td>No</td><td>No</td><td>Asian</td><td>964</td></tr>\n",
"    <tr><th>4</th><td>5</td><td>55.882</td><td>4897</td><td>357</td><td>2</td><td>68</td><td>16</td><td>Male</td><td>No</td><td>Yes</td><td>Caucasian</td><td>331</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " Obs Income Limit Rating Cards Age Education Gender Student Married \\\n", "0 1 14.891 3606 283 2 34 11 Male No Yes \n", "1 2 106.025 6645 483 3 82 15 Female Yes Yes \n", "2 3 104.593 7075 514 4 71 11 Male No No \n", "3 4 148.924 9504 681 3 36 11 Female No No \n", "4 5 55.882 4897 357 2 68 16 Male No Yes \n", "\n", " Ethnicity Balance \n", "0 Caucasian 333 \n", "1 Asian 903 \n", "2 Asian 580 \n", "3 Asian 964 \n", "4 Caucasian 331 " ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "data=pd.read_csv('credit.csv')\n", "data.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

Task 2

\n", "\n", "The population.xlsx file contains World Bank data on the total population of countries and regions in 1960 and 2015.\n", "\n", "**(a)** Use the appropriate pandas function to import the data and specify the country name as the index label (the column name is 'Country' in the original file).\n", "\n", "**(b)** Display the first five rows.\n", "\n", "**(c)** Display the data for Australia, China, and New Zealand only.\n", "\n", "**(d)** Display the population size for all countries with a population above 100 million in 2015.\n", "\n", "**(e)** Create a new dataframe, *large_population*, which contains a copy of the data selected in (d). Save it as an Excel file. Open it in Excel, do some basic formatting, and transfer the final table to Word, as you might when writing a report.\n", "\n", "The cell below starts the exercise." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import pandas as pd\n", "\n", "# Load the World Bank data, using the 'Country' column as the index\n", "data = pd.read_excel('population.xlsx', index_col='Country')\n", "data.head(5)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.0" } }, "nbformat": 4, "nbformat_minor": 2 }