
Chapter 14. Collections module

14.1. Counter
14.2. defaultdict
14.3. namedtuple

In this chapter we will learn about a module called collections. This module provides some nice data structures which will help you solve various real-life problems.
>>> import collections

This is how you import the module. Now we will look at the classes it provides.

14.1. Counter

Counter is a dict subclass for counting hashable objects. Elements are stored as dictionary keys and their counts are stored as the values. A count can be any integer, including zero or a negative number.
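As a minimal sketch of this behaviour (the colour names and variable names here are made up for illustration), the snippet below counts items from a list, looks up a key that was never counted, and uses subtract to push a count below zero.

```python
from collections import Counter

# Hypothetical sample data: count hashable items from any iterable.
votes = Counter(["red", "blue", "red", "green", "red"])

# Counts are stored as dict values; a missing key reports 0
# instead of raising KeyError.
red_count = votes["red"]
missing_count = votes["pink"]

# subtract() lowers counts and may take them to zero or below.
votes.subtract(["blue", "blue"])
blue_count = votes["blue"]

print("red=%d missing=%d blue=%d" % (red_count, missing_count, blue_count))
# red=3 missing=0 blue=-1
```

Looking up "pink" does not add it to the counter; it only returns 0.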

The example below counts how often each word occurs in the Python LICENSE file and shows the ten most frequent words.
Example 14.1. Counter example
>>> from collections import Counter
>>> import re
>>> path = '/usr/share/doc/python-2.7.3/LICENSE'
>>> words = re.findall(r'\w+', open(path).read().lower())
>>> Counter(words).most_common(10)
[('2', 97), ('the', 80), ('or', 78), ('1', 76), ('of', 61), ('to', 50), ('and', 47), ('python', 46), ('psf', 44), ('in', 38)]
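The LICENSE path above may not exist on your system. As a rough sketch of the same technique, the snippet below runs the word count on a short in-memory string instead. The sample text and variable names are assumptions made for illustration and are not part of the original example.

```python
import re
from collections import Counter

# Hypothetical sample text standing in for the contents of the LICENSE file.
text = "The cat and the hat. The cat sat."

# A raw string keeps the backslash in \w intact for the regex engine.
words = re.findall(r"\w+", text.lower())

# Ask for the two most frequent words and their counts.
top_two = Counter(words).most_common(2)
print(top_two)
# [('the', 3), ('cat', 2)]
```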

Counter objects have a method called elements which returns an iterator over the elements, repeating each one as many times as its count. Elements are returned in arbitrary order, and any element whose count is less than one is skipped.
>>> c = Counter(a=4, b=2, c=0, d=-2)
>>> list(c.elements())
['a', 'a', 'a', 'a', 'b', 'b']
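Since the order of elements is not guaranteed, one option is to sort the result when you need a stable listing. The following sketch (variable names are illustrative) also checks that only the positive counts contribute to the output.

```python
from collections import Counter

c = Counter(a=4, b=2, c=0, d=-2)

# Sort the expanded elements so the result does not depend on iteration order.
expanded = sorted(c.elements())

# Only positive counts contribute: 4 for 'a' plus 2 for 'b'.
positive_total = sum(count for count in c.values() if count > 0)

print(expanded)
# ['a', 'a', 'a', 'a', 'b', 'b']
print(len(expanded) == positive_total)
# True
```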

most_common is a method which returns the most common elements and their counts, ordered from the most common to the least.
>>> Counter('abracadabra').most_common(3)
[('a', 5), ('r', 2), ('b', 2)]