June 2016 – What William Said

Meet Janie. Janie has an ever-changing menagerie of pugs, and would like to use a data structure to keep track of them. What data structure would be most suitable?

Janie could keep her pugs in an array, but this isn’t a good choice for several reasons. Arrays are best for collections of constant size, while the size of her menagerie varies. Further, any ordering of the pugs will require re-sorting the array as pugs enter and leave, and without this ordering we lose the benefit of being able to find a given pug in O(1) time by indexing into the array.

Queues and stacks don’t have the constant size problem – we expect them to grow and shrink. However, they’re unsuitable because they define the order in which pugs have to be removed after they’re added; the oldest pug might usually tend to come off the queue first, but not always, and we certainly hope the newest pug won’t be popped off the stack right away.

Consider a linked list, however. A linked list is a data structure in which we have some number of nodes, each of which points to the next node on the list; by maintaining a pointer to the head of the list, we can obtain the rest of the objects by following the next pointers. In a doubly-linked list, each node contains a pointer to the next node and a pointer to the previous node.

We’ll modify this slightly to get a pugly-linked list. The head of each pug (its head pointer) will point towards the previous pug in the list, and the tail of each pug (its tail pointer) will point to the next pug in the list. (Normally, we’d keep one head and one tail pointer, pointing to the first and last elements of the doubly-linked list, while each node would have previous and next pointers…but we’ll abuse the terminology a bit to make it pug-specific). If the tail pointers are curly this will, naturally, give us a circular list.

If we need to find a specific pug, we can iterate through the list, which is the same thing we’ll need to do with an unsorted array – but we won’t have any empty cells from missing pugs. We just look at the first pug and follow the tail pointers until we find the pug we want. We can add pugs in constant time – just set its head to point in the same direction as the first pug’s head, set its tail to point at the first pug, and set the first pug’s head to point the new pug, which has now become the first pug in this pugly-linked list. We can remove pugs (provided we have a pointer to them) in constant time as well, by updating the next and previous pugs to point at each other rather than the pug being removed.

And this is how Janie can keep track of all her pets with a pugly-linked list.

Consider test-driving a new car, of a make and model you’re not familiar with. It’s unlikely you’ll need any instruction in how to do it: rightmost pedal makes the car go, pedal next to it makes the car stop. Even though there are many different cars, they have a reasonably consistent interface that allows most people to drive most cars.

Subclassing Car

Suppose we want to implement this in code. Let’s say I want to use a Camry object. A Camry “is a” car, so it makes sense that the Camry class inherits (possibly indirectly) from the Car class. We would never actually instantiate a generic “car” object, so Car can be an abstract class.

What methods would Car have? Two obvious ones could be ApplyGas and ApplyBrake. Maybe they’re implemented something like this:

In this case, the comments are standing in for some code that physically allows more gas into the engine or applies the brakes. This works fine….as long as every car works exactly the same way. But what happens if we have, for example, a Prius? Now stomping on the gas pedal may or may not actually be sending more gas to the engine, which means that we need to override the behavior in the derived class. More generally, for each type of car that derives from the Car class and uses different hardware, we’ll need to rewrite the methods for that car’s hardware – and if we forget, and the method tries to use hardware that isn’t there, then things can go horribly, horribly wrong.

Ok, so we’ll make ApplyGas an abstract method; now each derived class will be forced to implement it appropriately. The same will go for StartCar (do we need a key?), ChangeGear (which gears are available?), and many of the other methods, but the Car class is accomplishing our goal of ensuring that each derived class has a consistent interface.

However, we should step back and ask – what are we gaining here? We’re not sharing implementations; we simply want a consistent interface. So…maybe we should consider using an interface?

Defining an Interface

While inheritance allows us to share behavior between related objects, an interface is simply a contract that defines certain methods each class implementing that interface promises to provide. As a driver, I don’t care how each method is implemented; I simply want to know that, given an instantiation of any class which implements ICar, stomping on the gas will make me go faster and using the brake will make me slow down. I want to learn the interface, and from that, be able to handle anvy new car that might be released.

Let’s define a simple interface for car:

Car interface

Here we’ve defined various methods that we’d expect to apply to any car. There’s a way to turn it on and off, which could be by means of a key (most vehicles), by the location of the key (Tesla), or some other method entirely – we don’t know what it is, just that there is one. There’s a way to accelerate and a way to slow down. We can get the current speed and we can change direction. As is standard practice, we’ve named the interface starting with a capital I.

Notice two things that none of the methods have: an access modifier and an implementation. Since an interface is just a contract, all methods are public by default – an implementing class can’t agree to implement the methods if it doesn’t know what they are! Similarly, the interface doesn’t define how any method is to be implemented, just that they must be.

Implementing the Interface

Suppose we go ahead and create a class that implements ICar:

Car class without implementation

Here we’ve made the claim that this class (unhelpfully named Program – we should really choose a better name) implements ICar, but we haven’t actually provided any method definitions, which gives us the red squigglies. Hovering over ICar shows all the methods that we’re required to implement before the program will compile. This is one of the advantages of using interfaces – we can’t possibly forget to provide a custom implementation for one of the methods and end up using the default implementation without meaning to.

If we hit control-., Visual Studio will offer to implement the interface for us, either implicitly or explicitly.

Car class implement interface

If we choose to implement the interface, then each of the required methods will be created; since all interface methods are public, that access modifier is automatically assigned. If we implement the interface explicitly, then instead of the access modifier the method is explicitly called out as being the implementation of an interface. This is important because it’s possible that our class implements several interfaces which have methods with the same signature that need different implementations; declaring which implementation the method is associated with allows us to have several methods with the name name and call the appropriate method for the interface we’re currently using.

Car class notimplementedexception

Multiple Interfaces

Let’s look at an example. Here I’ve added an interface IBoat which also has a GetCurrentSpeed method, and updated that method in ICar so that both methods return strings. Since I’ve implemented ICar implicitly, IBoat must be implemented explicitly (and Visual Studio will do this no matter what option I choose when I tell it to implement the new interface). I’ve renamed Project to something that makes more sense (Vehicle) and written a main() routine that creates two vehicle objects, one for each of the two classes. When I run the program, each call to GetCurrentSpeed will automatically find the correct method for its interface, and in fact the boat object cannot even see the methods which are specific to the car interface.

Vehicle class multiple interfaces

Programming to the Interface

You may have heard that you should program to the interface, not to the implementation. This means that if we know what interface an object implements, we don’t need to know or care what the object actually is. In our example above, we know that car implements ICar, but we don’t actually need to care that it instantiates Vehicle (as opposed to ToyotaCamry, FordTaurus, etc). If in the future we decide to change the implementation of car to be TeslaModelS instead of Vehicle, we only have to change one line of code. Because car is defined as an ICar, we can only call methods defined in ICar (or Object, which ICar inherits from) on it, and those methods are guaranteed to exist in every class that implements ICar.

Thus, although interfaces can be used as a type of multiple inheritance, their real value is in allowing components to be loosely coupled: when one part of the program changes, it becomes less likely that this will result in other parts of the program needing to change, which means less work and (hopefully) fewer errors. By restricting what we’re allowed to do (in this case, restricting objects to using only methods required by the interface), we actually make those objects more usable.

Month: June 2016

Building a Pugly-Linked List

Interfaces in C#