
5 April 2013

Learn Python - File Walk App

In this program we'll use the powerful os module introduced in the first article. This time our focus will be on its walk function, which works through a directory tree and reports the files it finds along with their locations. We'll use this function to search for files whose names end with a particular string.

Let's start by creating another new file with Geany and saving it into the Desktop's Python folder with the name 'walk.py'. Now enter the source code, saving frequently and taking care with quotes, colons and line indentations.

 1  # A file search program
 2  # Created by David Briddock
 3
 4  import os
 5
 6  root = input("Start directory? ")
 7  ending = input("File ending? ")
 8
 9  # starting from the specified root directory
10  # walk through all the files and sub-directories
11  for path, dirs, files in os.walk(root):
12
13    # step through each file in the collection
14    for fileName in files:
15
16      # does the file name ending match?
17      if fileName.endswith(ending):
18        print(path + "/" + fileName)

Now, let's step through the code. The first two lines contain our program description comments. These are followed by the 'import os' statement on line 4.

Next we obtain some user-supplied search data. On line 6 we use the input function to capture the starting directory for our search, and store it in 'root'. Line 7 does a similar thing to capture a file ending string and store it in 'ending'.

The For Loop

Now we come to the meat of the program, contained in lines 11 to 18.

As you can see, we actually have two for loops, with the second one inside the first. With Python, as with most languages, we can have as many loops within loops as we like. Creating a hierarchy of loops is a frequently used programming technique.

Both for loops use the same format we saw earlier. However, the one on line 11 looks a little different as it has three variables before the in keyword. I won't dwell on this for now, but note that handing back several values at once, as os.walk does on each pass through the loop, is just one of Python's powerful features.
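To illustrate the three-variable form, here is a small sketch that is separate from walk.py. The list 'entries' and everything in it are made up for the example. The only point is that a for loop can split each item of a list into several variables at once.

```python
# A made-up list of three-item groups (not part of walk.py)
entries = [
  ("/home/pi", ["Desktop"], ["notes.txt"]),
  ("/home/pi/Desktop", [], ["walk.py", "hello.py"]),
]

# each group is split into three variables,
# in the same way as the outer loop in walk.py
for path, dirs, files in entries:
  print(path, dirs, files)
```

Each pass through the loop fills 'path', 'dirs' and 'files' with the next group of three values.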

What does os.walk do? Well, as the comment suggests, it finds all the files and sub-directories in the supplied start directory. Then, for each sub-directory, it again finds all the files and sub-directories. This process is repeated until all the sub-directories have been searched.

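Now for the three returned variables. The first, 'path', is just a simple string holding the directory currently being visited. However, the next two, 'dirs' and 'files', are list-type variables. A list-type variable holds a collection of items. Here 'dirs' lists the sub-directories found in 'path', and 'files' lists the file names found there. We don't need 'dirs' in this program, so our attention is on 'files'.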
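If you'd like to see these three variables for yourself, the sketch below may help. It is not part of walk.py. It uses the standard tempfile module to build a small throwaway folder, and the names 'sub', 'a.py' and 'b.txt' are invented purely for the demonstration.

```python
import os
import tempfile

# keep a record of what os.walk hands back at each step
steps = []

# create a throwaway folder that is deleted again afterwards
with tempfile.TemporaryDirectory() as top:
  os.mkdir(os.path.join(top, "sub"))
  open(os.path.join(top, "a.py"), "w").close()
  open(os.path.join(top, "sub", "b.txt"), "w").close()

  for path, dirs, files in os.walk(top):
    steps.append((path, dirs, files))
    print("path :", path)
    print("dirs :", dirs)
    print("files:", files)
```

You should see two steps: first the top folder, with 'sub' in 'dirs' and 'a.py' in 'files', then the 'sub' folder with an empty 'dirs' list and 'b.txt' in 'files'.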

As 'files' contains a list of files we can use the 'for' loop on line 14 to extract each file name from the list in turn, and assign it to a variable called 'fileName'.

Now we have a file name we can see if it ends with the string we entered earlier. This is done with the if statement on line 17 and a string method called endswith (one of many Python string methods).

Let's imagine we entered '.py' as the file ending string. The test will see if 'fileName' ends with '.py'. If it does the test will be 'True', and the indented print on line 18 will run. If no match is found, the test is 'False' and nothing is printed. So, only files which end in '.py' will be printed out.
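Here is a quick way to try the endswith test on its own, away from the file system. The file names in the list are made up for the example.

```python
ending = ".py"

# four made-up file names to test
names = ["walk.py", "notes.txt", "copy.py.bak", "PHOTO.PY"]

for fileName in names:
  # prints the name followed by True or False
  print(fileName, fileName.endswith(ending))
```

Only 'walk.py' gives 'True'. The name 'copy.py.bak' contains '.py' but doesn't end with it, and 'PHOTO.PY' fails because the comparison is case-sensitive.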

Execute And Experiment

And that's the end of our 'walk' program. Use the Geany 'Build->Execute' menu option, or press the F5 key, to run the program. Try entering '/home' as the starting directory and '.py' as the file ending. If you see an error message instead, go through the debugging process we described above until it runs successfully.

Feel free to experiment with different 'ending' strings. For example, using '.conf' should display some configuration files. You can also try different start directories. A single period string '.' means the current directory. A value of '/etc' will find more Python files, while entering '/' will scan the complete file system - something that might take a little time to run, and could generate hundreds of file matches.

You can, of course, change the program too. One idea is to replace the 'endswith' method with 'startswith' to find all files starting with a specific string.
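As one possible take on that idea, here is a sketch of the startswith variation. It goes a little beyond walk.py: the search is wrapped in a function, which I've named 'find_starting' for this example only, and the path is built with os.path.join rather than by adding "/" by hand. Treat it as a starting point, not the definitive version.

```python
import os

# 'find_starting' is a made-up helper name for this sketch
def find_starting(root, start):
  matches = []
  for path, dirs, files in os.walk(root):
    for fileName in files:
      # does the file name start with the given string?
      if fileName.startswith(start):
        matches.append(os.path.join(path, fileName))
  return matches
```

Calling find_starting('.', 'walk') would return a list of every file under the current directory whose name begins with 'walk', which you could then print out one by one with a for loop.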

A post from my Learn Python on the Raspberry Pi tutorial.