Wayne's Github Page

A place to learn about statistics

Calling Built-in Functions in Python

Similar to calculators with basic transformations, Python has a suite of functions to help you work with data.

But unlike calculator functions that take in a single number (e.g. abs(-1)), there can be multiple inputs and inputs can be lists or other more complicated objects.

Functions with multiple arguments

Sometimes it’s useful to sort the data according to a feature before examining the data so you can focus on the outliers. sorted() is such a function that can help:

demo = [10, 9, 9, 1, 5, 8, 6, 7, 7, 10]
asc_demo = sorted(demo)
print(asc_demo)

dsc_demo = sorted(demo, reverse=True)
print(dsc_demo)

print(demo)

This example shows:

Passing arguments by position vs keywords

Continusing the example above, the first input demo was passed in as the first input which is considered a positional argument. The second input True was passed as a keyword argument because we referenced the keyword reverse.

Notice that keyword arguments cannot be used before position arguments since the intent is unclear: do you expect the keyword argument to count as one of the positions or not?

sorted(reverse=True, demo)
  File "<ipython-input-18-9383389be95f>", line 1
    sorted(reverse=True, demo)
                         ^
SyntaxError: positional argument follows keyword argument

How did we know what the keywords were called? Look up the documentation

help(sorted)

# or in iPython
?sorted

Writing your own function

Imagine that we would like to standardize text data by lowercasing the words then split the text up by spaces.

def standardize_text(text):
    return text.lower().split()

You should always test the function with various types of input to see what could go wrong. It is good to have cases that will for sure fail!

standardize_text('hello world!')
standardize_text('#2022HereWeComeAgain')
standardize_text('George Washington')
standardize_text('STOP')
standardize_text('')
standardize_text(['i said x', 'you said y'])

Setting defaults in functions

You can give your function default values for its parameters by assigning the parameter during its definition. For example, if we wanted a log function that defaulted to base=10 instead of base=e:

from math import log


def my_log(x, base=10):
    return log(x, base)    


my_log(100)
my_log(100, 100)

In-place functions

Python has a few in-place functions/methods that modify the object directly and usually returns nothing (this is a coding pattern in Python, not a result of in-place algorithms). Here’s an example of a common mistake

demo = [1, -1, 0]
sorted_demo = demo.sort()

print(demo)         # prints [-1, 0, 1]
print(sorted_demo)  # prints None

list.sort() sorts the list demo directly so demo is modified. The method, however, does not return anything so sorted_demo will be assigned None. The common mistake is to assume all functions will return an object and see code crash because sorted_demo is None instead of a list.

In-place operations are attractive because they do not require doubling the memory when working with data. If demo and sorted_demo both existed, then you now have two copies of the data (different order). If these lists were much larger, this will quickly become a problem.

There unfortunately is no standard around naming for in-place functions and you need to read the documentation. The popular package pandas often has options to do in-place operations although its default is to return a new copy of data.