Loading Data Churn_-_01_-_Import.ipynb

The Import step for this data set is simple:

Step 0: Create project tree

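The snippets in this notebook assume that two names were set in an earlier cell: COLAB (is the notebook running on Google Colab?) and ROOT (the project root folder). A minimal sketch of such a cell; the exact ROOT path is an assumption, adjust it to your own layout:

import os

# Detect whether this notebook is running on Google Colab
try:
  import google.colab  # noqa: F401
  COLAB = True
except ImportError:
  COLAB = False

# Project root: on Google Drive when on Colab, a local folder otherwise
# (assumed layout, change to suit your setup)
ROOT = "/content/gdrive/MyDrive/datasets/churn" if COLAB else "churn"

With COLAB and ROOT in place, Step 0 creates the project tree: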
import os

if COLAB:
  from google.colab import drive
  # Mount Google Drive (only needed once per session)
  if not os.path.isdir("/content/gdrive"):
    drive.mount("/content/gdrive")
  # Ensure the shared datasets folder and the project root exist
  d = "/content/gdrive/MyDrive/datasets"
  if not os.path.isdir(d): os.makedirs(d)
  if not os.path.isdir(ROOT): os.makedirs(ROOT)

def makedirs(d):
  # Create subfolder d of the project root if it does not already exist
  if COLAB:
    os.makedirs(f"{ROOT}/{d}", exist_ok=True)
  else:
    os.makedirs(f"{ROOT}/{d}", mode=0o777, exist_ok=True)

for d in ['orig', 'data', 'output']: makedirs(d)
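After this cell runs, the project tree ROOT/orig, ROOT/data and ROOT/output exists whether the notebook is running on Colab or locally.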

Step 1: Get data

Put the original data into the folder orig. Either download the files from the links provided (for this project we have a data file (CSV) and a datasheet file (YAML) specifying the label encoding), or run something like:

import urllib.request

BASE_URL = "https://SETU-DataMining2.github.io/live/resources/churn"

for filename in ['data.csv', 'datasheet.yaml']:
  source = f"{BASE_URL}/{filename}"
  target = f"{ROOT}/orig/{filename}"

  # Download only if we do not already have a local copy
  if not os.path.isfile(target):
    print(f"Downloading remote file {filename}")
    urllib.request.urlretrieve(source, target)
  else:
    print(f"Using local copy of {filename}")

You need to: