diveintopython.org
Python for experienced programmers

 

2.6. Filtering lists

As you know, Python has powerful capabilities for mapping lists into other lists. This can be combined with a filtering mechanism, where some elements in the list are mapped while others are skipped entirely.

Example 2.14. List filtering syntax

[mapping-expression for element in source-list if filter-expression]

The first two thirds of the expression should look familiar, because it's the same structure as list mapping. The last part, starting with the if, is the filter expression. A filter expression can be any expression that evaluates to true or false (which in Python can be almost anything). Any element for which the filter expression is true is passed through the mapping expression and included in the output list. All other elements are skipped: they are never put through the mapping expression and never appear in the output list.
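
To make the skipping behavior concrete, here is a minimal sketch. The list words and its contents are made up for illustration and do not come from the book. The filter relies on Python's truth rules: an empty string and None both count as false, so they never reach the mapping expression (which would raise an error if it were ever applied to None).

```python
# Hypothetical sample data: real strings mixed with an empty string and None.
words = ["a", "", "mpilgrim", None, "foo"]

# Filter expression: just `w`, which is false for "" and for None.
# Mapping expression: w.upper(), applied only to elements that pass the filter.
shouted = [w.upper() for w in words if w]

# The same logic written as an explicit loop, for comparison.
shouted_loop = []
for w in words:
    if w:                               # filter
        shouted_loop.append(w.upper())  # map

print(shouted)       # ['A', 'MPILGRIM', 'FOO']
```

Both forms build the same list; the filtered mapping simply says it in one expression.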

Example 2.15. Introducing list filtering

>>> li = ["a", "mpilgrim", "foo", "b", "c", "b", "d", "d"]
>>> [elem for elem in li if len(elem) > 1]       # (1)
['mpilgrim', 'foo']
>>> [elem for elem in li if elem != "b"]         # (2)
['a', 'mpilgrim', 'foo', 'c', 'd', 'd']
>>> [elem for elem in li if li.count(elem) == 1] # (3)
['a', 'mpilgrim', 'foo', 'c']

(1) The mapping expression here is simple (it just returns the value of each element), so concentrate on the filter expression. As Python loops through the list, it runs each element through the filter expression; if the filter expression is true, the element is mapped and the result of the mapping expression is included in the returned list. Here you are filtering out all the one-character strings, so you're left with a list of all the longer strings.
(2) Here you are filtering out a specific value, b. Note that this filters all occurrences of b, since each time it comes up, the filter expression will be false.
(3) count is a list method that returns the number of times a value occurs in a list. You might think that this filter would eliminate duplicates from a list, returning a list containing only one copy of each value in the original list. But it doesn't, because values that appear twice in the original list (in this case, b and d) are excluded completely. There are ways of eliminating duplicates from a list, but filtering is not the solution.
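
For contrast with the count filter in the third example, here is one possible way to eliminate duplicates while keeping the original order. This is only a sketch, not the approach the book has in mind: unique is a hypothetical helper name, and the code assumes a Python version that has the built-in set type and list elements that are hashable.

```python
li = ["a", "mpilgrim", "foo", "b", "c", "b", "d", "d"]

def unique(items):
    """Return items with duplicates removed, keeping the first occurrence of each."""
    seen = set()   # values encountered so far
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

print(unique(li))    # ['a', 'mpilgrim', 'foo', 'b', 'c', 'd']
```

Unlike the count filter, this keeps one copy of b and one copy of d instead of dropping them entirely.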

Example 2.16. Filtering a list in apihelper.py

    typeList = (BuiltinFunctionType, BuiltinMethodType, FunctionType, MethodType, ClassType)
    methodList = [method for method in dir(object) if type(getattr(object, method)) in typeList]

This looks complicated, and it is complicated, but the basic structure is the same. The whole expression returns a list, which is assigned to the methodList variable. The first half of the expression is the list mapping part. The mapping expression is an identity expression; it returns the value of each element. dir(object) returns a list of the names of object's attributes and methods; that's the list you're mapping. So the only new part is the filter expression after the if.

The filter expression looks scary, but it's not. You already know about type, getattr, and in. As you saw in the previous section, the expression getattr(object, method) returns a function object if object is a module and method is the name of a function in that module.
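
As a small illustration, the filter expression can be evaluated by hand for a single name. This sketch uses the standard math module as a stand-in for object, and the names type_list, sqrt_ref, is_sqrt_kept, and is_pi_kept are made up for the example. ClassType is left out of the tuple so the code also runs on Python 3, where the types module no longer provides it.

```python
import math
from types import BuiltinFunctionType, BuiltinMethodType, FunctionType, MethodType

# Same idea as typeList in apihelper.py, minus ClassType (an assumption for Python 3).
type_list = (BuiltinFunctionType, BuiltinMethodType, FunctionType, MethodType)

# getattr with a string name returns a reference to the real function.
sqrt_ref = getattr(math, "sqrt")
print(sqrt_ref(16.0))                                   # 4.0

# The filter expression, evaluated for one function name and one non-function name.
is_sqrt_kept = type(sqrt_ref) in type_list              # True: sqrt is a built-in function
is_pi_kept = type(getattr(math, "pi")) in type_list     # False: pi is a float
print(is_sqrt_kept)
print(is_pi_kept)
```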

So this expression takes an object, named object, gets a list of the names of its attributes, methods, functions, and a few other things, and then filters that list to weed out everything we don't care about. We do the weeding out by taking the name of each attribute/method/function and getting a reference to the real thing via the getattr function. Then we check the type of that object with the type function and see whether it is one of the types we care about. Specifically, we care about methods and functions, both built-in (like the pop method of a list) and user-defined (like the buildConnectionString function of the odbchelper module). We don't care about other attributes, like the __name__ attribute that's built in to every module.

You may have noticed that the filter expression also includes objects whose type is ClassType. Don't worry about this for now; we'll discuss Python classes ad nauseam when we get into object-oriented concepts in chapter 3.
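
The snippet from apihelper.py relies on ClassType, which the types module in Python 3 no longer provides. As a rough sketch of the same filter under that assumption (ClassType simply omitted, and list_methods being a hypothetical helper name rather than a function from apihelper.py), the whole expression can be wrapped in a function and tried on a list and on the math module:

```python
import math
from types import BuiltinFunctionType, BuiltinMethodType, FunctionType, MethodType

def list_methods(obj):
    """Hypothetical helper: names of obj's attributes that are functions or methods."""
    # ClassType is omitted here: it does not exist in Python 3's types module.
    type_list = (BuiltinFunctionType, BuiltinMethodType, FunctionType, MethodType)
    return [name for name in dir(obj) if type(getattr(obj, name)) in type_list]

list_names = list_methods([])      # methods of a list object
math_names = list_methods(math)    # functions of the math module

print("pop" in list_names)         # True: built-in method of a list
print("sqrt" in math_names)        # True: built-in function in the module
print("__name__" in math_names)    # False: a plain string attribute
```

The list and the module go through exactly the same code, because dir and getattr work on any object.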