In Python we teach "you should never add to or remove from a list while you are going through that list with a for-each loop." But you may have tried this and it seemed to work correctly. Plus, going through all the hassle of marking a bunch of things for removal and then removing them in their own separate loop later seems pretty tedious and indirect. So why is it important?
Changing a list during a for-each loop that uses ("iterates over") the same list can create some subtle errors that can be difficult to track down. Let's look at a simple example that can help shed light on why it is dangerous to change the length of a list as you're iterating over it.
my_list = ["apple", "banana", "cherry", "dragonfruit"]
for fruit in my_list:
    if fruit == "banana":
        my_list.remove(fruit)
    print(fruit)
When you run the program above, you might expect the output to read "apple", "cherry", "dragonfruit". But it doesn't. Instead, the output looks like this:
apple
banana
dragonfruit
The "cherry" entry got skipped. So why did this happen? Let's take a closer look at how the code "thinks" about a for-each loop in order to investigate.
Behind the scenes, a for-each loop over a list works like a counting loop. Python keeps a hidden counter that starts at index 0 and moves up one index at a time until it has gone past the end of the list. So when the loop begins it will get list item 0, then list item 1, then list item 2, and so on until it is done with the list. Each time we start a new loop iteration, the item from the list at the current index is assigned to the loop variable (in this case, fruit).
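As a rough mental model (a sketch of the behavior, not Python's actual internal code), a for-each loop over a list acts like the counting loop below. The names index and visited are made up for this illustration.

```python
my_list = ["apple", "banana", "cherry", "dragonfruit"]
visited = []  # hypothetical helper list: records what the loop "saw"

index = 0  # the hidden counter that a for-each loop keeps for us
while index < len(my_list):
    fruit = my_list[index]   # the loop variable is assigned from the current index
    visited.append(fruit)    # stand-in for the body of the loop
    index += 1               # move on to the next index

print(visited)
```

Nothing in this loop remembers which items have already been handled. It only remembers a number, which is why shifting the items around underneath it causes trouble.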
So let's trace the events in the loop above. We start at the top of the loop. Since we are starting a new iteration, we assign the loop variable. It is iteration 0, so the fruit variable is the item at index 0 of my_list. fruit becomes "apple".
The loop runs, "apple" is printed, and we repeat. Now we are on iteration 1. The fruit variable is the item at index 1 of my_list. This is "banana". The fruit variable becomes "banana". We start to go down through the loop code: oh look! This code says that we should remove "banana" from the list! So we do. Now my_list has been changed, so it has a new value:
["apple", "cherry", "dragonfruit"]
However, since we haven't started a new iteration of the loop yet, fruit hasn't been re-assigned. It still contains the value "banana". So "banana" gets printed out.
Now we're ready to start loop iteration 2, and here's where the problem occurs. When we re-start the loop, we are on iteration 2. Therefore we should be getting the item at index 2 in my_list. In the original list, that item would have been "cherry". HOWEVER, our list has changed. Look at the new value of my_list. Because we removed an item, the indexes of everything after that item have changed. Instead of index 2 being "cherry", index 2 is now "dragonfruit".
The code has no way of knowing we've made this change, so it continues happily from the number 2, effectively skipping the "cherry" entry entirely and using "dragonfruit" instead.
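To watch the trace above happen, here is a small sketch that prints the index, the loop variable, and the current contents of the list on every iteration. The enumerate call and the seen list are additions for illustration only; they do not change how the loop walks through the list.

```python
my_list = ["apple", "banana", "cherry", "dragonfruit"]
seen = []  # hypothetical name: every value the loop variable took

for index, fruit in enumerate(my_list):
    if fruit == "banana":
        my_list.remove(fruit)      # the list shrinks while the loop is running
    seen.append(fruit)
    print(index, fruit, my_list)   # show the loop's position and the list right now

print(seen)
```

On the last iteration the index is 2 and the loop variable is "dragonfruit". At no point does the loop variable hold "cherry".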
This is the problem with changing a for-loop's count AS it's counting: it loses its place. Removing items can cause the loop to skip elements of the list without meaning to. Adding items may cause the loop to view the same list item more than once. Of course this depends on the index of the item that you're removing, but in general the more you change the list, the more you throw off the loop's count. The problems this creates can be subtle, and therefore difficult to track down.
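Here is a minimal sketch of the "adding" case, assuming we insert a new item at the front of the list. The added flag and the seen list are hypothetical helpers: the flag makes sure we only insert once, because inserting on every iteration would keep the loop from ever finishing.

```python
my_list = ["banana", "cherry"]
seen = []      # hypothetical name: every value the loop variable took
added = False  # hypothetical guard so that we only insert one time

for fruit in my_list:
    if fruit == "banana" and not added:
        my_list.insert(0, "apple")  # every existing item shifts one index to the right
        added = True
    seen.append(fruit)

print(seen)
```

The loop visits "banana" twice, because "banana" moved from index 0 to index 1 just before the loop moved on to index 1. The new "apple" item is never visited at all.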
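In languages other than Python, this can even cause crashes! Python's for-each loop checks the list's current length every time it asks for the next item, so when the list gets shorter the loop simply stops early instead of reaching for an item that is no longer there. Other languages are less forgiving. Java, for example, will usually notice that a list was changed in the middle of a for-each loop and throw a ConcurrentModificationException, which stops the program unless it is handled. Python is only forgiving up to a point, too: if you keep adding items to the list inside the loop, the loop may never reach the end and will run forever.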
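Python's for-each loop is forgiving here, but a loop that counts the indexes itself is not. In the sketch below, range(len(my_list)) fixes how high the loop will count before the loop starts, so removing an item leaves a number with no item to go with it. The try/except and the error_message variable are additions for illustration, so that the example finishes instead of stopping with an error.

```python
my_list = ["apple", "banana", "cherry", "dragonfruit"]
error_message = None  # hypothetical name: holds the error text if the loop fails

try:
    for index in range(len(my_list)):  # the count is decided once, up front
        if my_list[index] == "banana":
            my_list.remove("banana")   # the list is now one item shorter
except IndexError as error:
    # The loop asked for an index that no longer exists.
    error_message = str(error)

print(error_message)
```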
So again, in certain situations, depending on where in the list you're removing the item from, it might not cause issues, or the issues might be hard to see. However, even though it might work sometimes, it's not a safe thing to do. So in general we just try to remember: "don't change the length of a list while you're using a for-loop to go through that same list!"
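Two common ways to stay safe are sketched below. They are not the only options, and the names other_list and kept are made up for this example. The first loops over a copy of the list and removes from the original. The second builds a new list that keeps only the items you want.

```python
my_list = ["apple", "banana", "cherry", "dragonfruit"]

# Option 1: loop over a copy (my_list[:]) and remove from the original list.
# The copy never changes, so the loop cannot lose its place.
for fruit in my_list[:]:
    if fruit == "banana":
        my_list.remove(fruit)

# Option 2: build a new list that contains only the items to keep.
other_list = ["apple", "banana", "cherry", "dragonfruit"]
kept = [fruit for fruit in other_list if fruit != "banana"]

print(my_list)
print(kept)
```

Both approaches visit every item exactly once, including "cherry".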