IEnumerable, why use it?

Okay, first off: I really love the IEnumerable<T> interface, and I use it excessively throughout my applications.

Almost any list or collection throughout the .NET framework implements this interface. It exposes only one method (GetEnumerator), but this method is your friend in almost any circumstance.

This method returns an Enumerator, which again exposes methods used to traverse (or loop through) the collection. This means you can make a method definition like this:

public void MyMethod(IEnumerable<string> someEnumerable)
{

}

I could call this method with any sort of collection, it could be string[], List<string>, IList<string>, IQueryable<string> or even a basic IEnumerable<string>. I don’t give a damn. I don’t care what the implementation is, I just want a collection to work with, nothin’ more.

This means I don’t need to force other developers who use my code, to pass an array or a List or some custom collection, they just pass whatever collection they have, and I will be able to work with it. That’s awesome!

But wait! How do you get an IEnumerable<T> in the first place? How can we make our own?

Well, in “ye olde times” one would make an abomination like this:

public List<string> MyMethod()
{
List<string> myResult = newList<string>();
for (int i = 0; i < dataSource.Length; i++)
{
// Do some work.
myResult.Add(dataSource[i].SomeProperty);
}
return myResult;
}

Can you see what’s wrong here?

There’s actually more than one thing that’s wrong with this snippet.

First off: When we return a specific implementation, in this case List<T>, we force our code to use this implementation. We kinda dictates the use of our result which is a major drawback on agile Development. What if the consumer don’t want or need a list?

Second: In line 3, we create an instance of an object we actually don’t need, so that we can return a list.

Third: We loop through a collection of an unknown size. This Collection could potentially contain a billion items.

Fourth: We have to wait until the loop has ended. Which could take forever. Think about what would happen to your app, when you loop though a billion items and only need the first five.

This is where .NET provides a concept of Enumerators. These are methods that only returns data when said data is ready to be returned on not when all data is ready to be returned. And it is actually the same method called from the GetEnumerator()-method.

To convert our method from above into an Enumerator, we must make some changes. First we need to return an IEnumerable<string> instead of a List. Second we need to remove the list-object. And finally rewrite the loop to use the keyword “yield”.

This is how our method would then look like:

public IEnumerable<string> MyMethod()
{
for (int i = 0; i < dataSource.Length; i++)
{
// Do some work.
yield return dataSource[i].SomeProperty;
}
yield break;
}

As you can see, the code looks way simpler. We don’t force our consumer to take a list. And the yield-keyword, changes the method into an Enumerator, which gives us a LOT of benefits:

The yield return, returns our data when the property Current is called on the Enumerator-instance. This property is usually called each time the MoveNext()-method is called. This Means you do not have to wait until all items are looped. You can stop at any moment. the yield-return is only called when you request the next item.

So if your consumer needs to loop through each of your items, you will only have to loop once, whereas the List-thingy above, would loop through the same items twice. That’s two billion items! Secondly the first example would always loop though each item, but the second one only loops through as many items as you want it to. Nothin more nothin less.

 

The essence of it all: Always use IEnumerable, only use lists where it cannot be done without it.

For performance reasons, always use Enumerators. They will save the day.

 

This is my first attempt on a “professional”-programming guide. Please provide feedback!