I'm writing my first Python script, which grabs some web data and saves it to a .csv file (it's working; see below).
The data structure is consistent, but the file starts with a header of some 17 rows. I want to import the csv into SQL, but the import chokes on the header data: even if I tell it to start reading from row 18, it can't see the data unless I manually delete rows 1-17.
I'm thinking the easiest option would be to simply delete rows 1-17 as part of my Python code below, but I have no idea where to start, so any tips are appreciated.
import urllib.request, urllib.parse, urllib.error

ASXCode = 'CSL'
url = 'http://chartapi.finance.yahoo.com/instrument/1.0/' + ASXCode + '.ax/chartdata;type=quote;range=1d/csv'
urllib.request.urlretrieve(url, "Intra_" + ASXCode + ".csv")  # download straight to Intra_CSL.csv
You could do it like this: first download the web data into a temporary file, then copy it to the final destination file while skipping the first 17 rows.
import csv
import os
import urllib.request

ASXCode = 'CSL'
local_filename = "Intra_" + ASXCode + ".csv"
url = ('http://chartapi.finance.yahoo.com/instrument/1.0/' + ASXCode +
       '.ax/chartdata;type=quote;range=1d/csv')

# download to a temporary file first
temp_filename, headers = urllib.request.urlretrieve(url)
with open(temp_filename, 'r', newline='') as inf, \
     open(local_filename, 'w', newline='') as outf:
    reader = csv.reader(inf)
    writer = csv.writer(outf)
    for _ in range(17):           # skip the first 17 rows
        next(reader)
    writer.writerows(reader)      # copy the rest
os.remove(temp_filename)          # clean up the temporary file
print('{} downloaded'.format(local_filename))
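
If you'd rather avoid the temporary file, a variation is to stream the response and drop the header lines as you write. This is just a sketch, assuming the endpoint returns plain UTF-8 text; it uses itertools.islice to skip rows 1-17 directly:

import itertools
import urllib.request

ASXCode = 'CSL'
local_filename = "Intra_" + ASXCode + ".csv"
url = ('http://chartapi.finance.yahoo.com/instrument/1.0/' + ASXCode +
       '.ax/chartdata;type=quote;range=1d/csv')

with urllib.request.urlopen(url) as response, \
     open(local_filename, 'w', newline='') as outf:
    # assumes the response body is UTF-8 text; adjust the encoding if the API differs
    lines = (raw.decode('utf-8') for raw in response)
    for line in itertools.islice(lines, 17, None):  # drop rows 1-17, keep the rest
        outf.write(line)
print('{} downloaded'.format(local_filename))

Either way, the file that lands in Intra_CSL.csv starts at what used to be row 18, so the SQL import can read it from the first line.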