Contents | Previous (3.5 Main module) | Next (4 Classes)
In this section we reconsider a design decision made earlier.
Compare these two programs that return the same output.
# Provide a filename
def read_data(filename):
records = []
with open(filename) as f:
for line in f:
...
records.append(r)
return records
d = read_data('file.csv')
# Provide lines
def read_data(lines):
records = []
for line in lines:
...
records.append(r)
return records
with open('file.csv') as f:
d = read_data(f)
- Which of these functions do you prefer? Why?
- Which of these functions is more flexible?
Duck Typing is a computer programming concept to determine whether an object can be used for a particular purpose. It is an application of the duck test.
If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck.
In the second version of read_data()
above, the function expects any
iterable object. Not just the lines of a file.
def read_data(lines):
records = []
for line in lines:
...
records.append(r)
return records
This means that we can use it with other lines.
# A CSV file
lines = open('data.csv')
data = read_data(lines)
# A zipped file
lines = gzip.open('data.csv.gz','rt')
data = read_data(lines)
# The Standard Input
lines = sys.stdin
data = read_data(lines)
# A list of strings
lines = ['ACME,50,91.1','IBM,75,123.45', ... ]
data = read_data(lines)
There is considerable flexibility with this design.
Question: Should we embrace or fight this flexibility?
Code libraries are often better served by embracing flexibility. Don't restrict your options. With great flexibility comes great power.
You've now created a file fileparse.py
that contained a
function parse_csv()
. The function worked like this:
>>> import fileparse
>>> portfolio = fileparse.parse_csv('Data/portfolio.csv', types=[str,int,float])
>>>
Right now, the function expects to be passed a filename. However, you can make the code more flexible. Modify the function so that it works with any file-like/iterable object. For example:
>>> import fileparse
>>> import gzip
>>> with gzip.open('Data/portfolio.csv.gz', 'rt') as file:
... port = fileparse.parse_csv(file, types=[str,int,float])
...
>>> lines = ['name,shares,price', 'AA,100,34.23', 'IBM,50,91.1', 'HPE,75,45.1']
>>> port = fileparse.parse_csv(lines, types=[str,int,float])
>>>
In this new code, what happens if you pass a filename as before?
>>> port = fileparse.parse_csv('Data/portfolio.csv', types=[str,int,float])
>>> port
... look at output (it should be crazy) ...
>>>
Yes, you'll need to be careful. Could you add a safety check to avoid this?
Fix the read_portfolio()
and read_prices()
functions in the
report.py
file so that they work with the modified version of
parse_csv()
. This should only involve a minor modification.
Afterwards, your report.py
and pcost.py
programs should work
the same way they always did.