How To Write Code Well¶
Introduction¶
This video is not for you if you don’t already know how to write at least a little bit of code.
If you do know a little Python and you want to write code, but you just don’t know what to do or how to get started, then this is the video for you.
Hi, I’m Jonathan Gardner, and this is part of my Theory of Python series of videos. I’m going to try and share with you my nearly 20 years of professional experience writing code.
The Process¶
Let’s talk about the overall process I follow. When I’m writing code, I follow a simple formula. It’s the same formula I use to solve a particularly vexing math or physics problem. It’s the same formula I use in life.
Own the Problem¶
The first step is, I believe, the most important and yet oft-ignored. How many times have you seen a problem, then walked away because you just couldn’t care to solve it?
If you want to solve a problem, you have to “own” the problem. Make it “your” problem, not someone else’s. By internalizing the problem at a spiritual or psychic level, you will invest emotionally, physically, and mentally to find the solution to the problem.
Understand the Problem¶
After you have taken responsibility, the next step is to understand the problem. I have seen too many people solve the wrong problem simply because they didn’t understand it very well. I count myself among those people.
I look for the following things to gain an understanding of something.
What are the terms and their specific definitions? Without a specific language, it’s really hard to reason about a problem. I often will keep a little dictionary or terminology guide. It will help me explain the problem to others, and the potential solution.
What will the solution look like? That is, what do you need to get success? In a math problem, there is usually a question at the end. That should be the central focus. In a business setting, the problem is always “How do we make more profit (i.e., increase value or decrease cost)?” The solution is one that will be cheap and easy yet deliver maximum value to our customers.
Connect the problem with knowns and unknowns. Sometimes I will draw a map connecting ideas of the problem together. I’ll even have to fill in gaps. If I need a magical machine, then I’ll spell that out. Maybe the real problem is finding that machine. Maybe I don’t need the machine after all.
Find similar problems and understand their solutions, as well as the solutions that didn’t work. Learn from people who have gone before you. Learn from their mistakes, and build on their successes. There is very rarely a new challenge in software engineering. Most of what we do is applying old solutions to new situations. If we stick with what we know works, then we’re almost guaranteed to find another solution that works as well. When we do ignore previous solutions, it’s only because we actually have a better solution, not because we didn’t understand what our predecessors were doing and why.
Break the big problem down into little problems. Keep doing this until you start seeing problems you can solve. A particularly vexing physics problem was really easy to see once I started taking apart each component of the problem. I didn’t know at the beginning that it would come down to solving an integral, but by breaking the problem down into smaller and smaller pieces, eventually I got to the step where I knew exactly what to do.
In software development, the importance of the last point, breaking the big problem down, cannot be overstated. We can write one big complex program, or we can write many small, simple programs. I’m a huge fan of small, simple programs, even if you have to make thousands of them.
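As a minimal sketch of what "many small, simple programs" can look like, the example below breaks one made-up problem (finding the most common words in a text) into small functions. The problem and the function names are assumptions chosen for illustration; they are not from the video.

```python
from collections import Counter


def split_words(text):
    """Break text into lowercase words, dropping surrounding punctuation."""
    words = (word.strip(".,;:!?\"'()").lower() for word in text.split())
    return [word for word in words if word]


def count_words(words):
    """Count how many times each word appears."""
    return Counter(words)


def most_common_words(text, n):
    """Solve the big problem by combining the small solutions."""
    counts = count_words(split_words(text))
    return [word for word, _ in counts.most_common(n)]


print(most_common_words("The cat and the dog and the bird.", 2))
```

Each small function can be described in one sentence and tested on its own, which is the point of breaking the problem down.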
Designing the Solution¶
Once you understand the problem, you can start putting together a solution. Over my nearly two decades of experience, I have built up a huge arsenal of potential solutions I can bring to bear on any problem. If you’re new, you don’t have that luxury.
But if you’ve broken the big problem down into small, solvable problems, the solutions should practically write themselves.
Designing big solutions requires systematic organization of the smaller solutions. The ways we organize the solutions are as follows:
A library. A library is a set of functions that are all related. If you have lots of little functions that all act on the same kind of data or are united together to solve a bigger problem, you may want a library. This library is going to be imported into running processes and services.
A database. A database stores data and makes it available for other processes to access or modify.
A process. A process is running code. You might have one or many, but you have to think of each one as independent of all the others. Processes unite input, output, and libraries into one cohesive whole.
A service. A service is a passive set of processes waiting for instructions from clients. Nowadays we put our services behind the HTTP protocol, and we often use fancy terms like RESTful or RPC to describe them. The core element of a service is that there are some operations that need to reach across processes. Services help to coordinate these operations.
The solution generally has two faces. One face describes functions stored in libraries, imported by processes and used by services. The other face is the user’s story. It explains how everything fits together to solve the problem.
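The split between a library and a process might be sketched as follows. Everything here is hypothetical: the function names, the record layout (a dict with `name` and `phone`), and the sample data are assumptions for illustration only.

```python
import sys


# Library: related functions that all act on the same kind of data.

def sort_friends(friends):
    """Return the friends sorted alphabetically by name, ignoring case."""
    return sorted(friends, key=lambda friend: friend["name"].lower())


def format_friend(friend):
    """Return one display line for a friend record."""
    return "{name}: {phone}".format(**friend)


# Process: running code that unites input, output, and the library.

def main(friends, out=sys.stdout):
    """Write the sorted, formatted friends list to the output stream."""
    for friend in sort_friends(friends):
        out.write(format_friend(friend) + "\n")


if __name__ == "__main__":
    # Hypothetical input; a real process might read this from a database.
    main([
        {"name": "Zoe", "phone": "555-0102"},
        {"name": "adam", "phone": "555-0101"},
    ])
```

The library functions know nothing about where the data comes from or where the output goes, so another process or a service could import and reuse them.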
Use Case Document¶
User stories aren’t hard to write. Just think of something the user might want to do, and then write down the steps it takes to get it done. Think first of what buttons the user has to push, or what commands he has to type. Think about how those translate to function calls that call functions and store data in databases.
This document should have the following repeated again and again:
A description of what the user is trying to accomplish. E.g., “I want to call my friend.”
A list of actions the user can take to accomplish the purpose. E.g., “I open the friends list by clicking on the friends icon. I click on my friend’s face. I see the phone dialing and hear a dial tone.” Etc…
A description of how the software reacts to those actions at every level. E.g., “When the user clicks on the friends list, the database is queried to retrieve the list of friends. They are sorted alphabetically, and they are shown as a list with their image, name, and phone number.”
Even for simple projects, it’s good practice to do this.
Architecture Document¶
I think the architecture design is pretty self-explanatory. Just list out the libraries, databases, processes and services and what they do and how they do it. Every function should be listed, along with what parameters it takes and how it behaves.
It’s ok to leave out the details of things you understand well. Just be sure to fill it in when you’re dealing with someone who doesn’t have your experience or when they ask questions for more details.
Every function should list the following:
Where the function is found.
The function name.
The function parameters and their meaning.
What the function does.
For the usual or general case.
For the unusual or specific cases where bad input is provided or when the system is in a weird state.
If you can describe this with simple English, then writing, testing, and documenting the code is a simple task.
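One way such a plain-English description could carry over into code is as a doc string. The function below, its module path, and its behavior are all invented for illustration; the layout simply follows the items listed above.

```python
def normalize_phone(raw):
    """Return the digits of a phone number as a string.

    Found in: friends/phones.py (hypothetical module path).

    Parameters:
        raw: the phone number as the user typed it,
            for example "(555) 010-1234".

    Usual case: punctuation and spaces are dropped and the digits
    are returned in their original order.

    Unusual cases: raises TypeError if raw is not a string, and
    ValueError if raw contains no digits at all.
    """
    if not isinstance(raw, str):
        raise TypeError("raw must be a string")
    digits = "".join(ch for ch in raw if ch in "0123456789")
    if not digits:
        raise ValueError("no digits found in {!r}".format(raw))
    return digits


print(normalize_phone("(555) 010-1234"))
```

Because the description names the usual case and the unusual cases separately, each sentence maps directly onto a test case.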
I should mention how “big” functions should be. In my opinion, they should be very small. If it takes more than a few sentences to describe it, it’s probably too complicated.
The issue with “big” functions is they are often poorly defined, difficult to implement, and hard to test.
Compare the following two functions:
find_friends looks up friends in the database and returns them. It uses the query() function to retrieve records from the friends table.
find_friends looks up friends in the database and returns them. It creates a connection using the connection string. If the database is down, it raises an exception. It then constructs a query for the friends table, which may be different depending on which database you are connected to or what time of day it is. It then sends the query to the database and waits for a response. If it gets back an unusual response, or it takes too long, it retries, but only up to 3 times or 1 minute. When it receives the response, it decodes the response based on the type of fields in the friends table…
The first function is small. It relies on other functions to handle the work needed to generate the query and return the results.
The second function is too big. You should be able to see how we can remove parts of it and place it into separate functions that can be reused by other parts of the code.
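A rough sketch of the small version might look like this. The video only says that find_friends uses a query() function, so the signature of query(), the use of sqlite3, and the table columns are all assumptions made for this example.

```python
import sqlite3


def query(connection, sql, parameters=()):
    """Run one SQL statement and return all of its rows.

    Hypothetical helper: connection handling, retries, and decoding
    would live in functions like this one, not in find_friends.
    """
    cursor = connection.execute(sql, parameters)
    return cursor.fetchall()


def find_friends(connection):
    """Look up friends in the database and return them."""
    return query(connection, "SELECT name, phone FROM friends ORDER BY name")


# Usage with an in-memory database, so nothing is written to disk.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE friends (name TEXT, phone TEXT)")
connection.executemany(
    "INSERT INTO friends VALUES (?, ?)",
    [("Zoe", "555-0102"), ("Adam", "555-0101")],
)
print(find_friends(connection))
```

With the work split this way, find_friends stays a one-sentence function, and query() can be reused by any other part of the code that reads from the database.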
Implementation¶
Implementation is often messy. It starts with a prototyping phase where we quickly assemble the components to see if we missed anything in our design. (We always do. We always do!)
After we get our initial prototype working and demonstrating that our design is probably sound, we go into the development stage where we begin filling in all the details.
While we write our code, we’re writing documentation for ourselves and our users. We’re also writing unit tests and integration tests. We’re thinking about how we will deploy the solution.
Writing Code¶
In the prototype phase, I am very careful to add doc strings to each function and such. I try to organize the code the way I would in the final project. I’ll also add in “scaffolding” – stubs that either do nothing or do some kind of dummy behavior. For instance, I might have a function “find_friends” that looks up friends in the database. In the prototype phase, this will just return a fixed list of friends without even talking to the database. Later on I can change it to actually talk to the database.
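The scaffolding described above could be sketched like this. The record layout and the show_friends caller are assumptions added for illustration.

```python
def find_friends():
    """Look up friends in the database and return them.

    Prototype stub: returns a fixed list and never touches the
    database. Replace the body once the design has settled.
    """
    return [
        {"name": "Alice", "phone": "555-0100"},
        {"name": "Bob", "phone": "555-0101"},
    ]


def show_friends():
    """Return display lines for the friends list."""
    return ["{name} ({phone})".format(**friend) for friend in find_friends()]


print(show_friends())
```

Code that calls find_friends, such as show_friends here, does not need to change when the stub is later replaced by a real database lookup, as long as the returned records keep the same shape.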
As I gain confidence through preliminary testing that the prototype is sound, then I start paying a lot more attention to tests and documentation. I also make sure my code is solid. Exceptions should be raised and handled appropriately.
Testing¶
Once you have some code, and you think it’s pretty good, you need to check that it actually does what you think it does.
Every line of code in my videos I have tested by entering it into a Python interactive session. I do this because I know that I am very prone to making mistakes. I know myself better than anyone else, and that’s been my experience: One mistake after the other! But I know that if I am careful and check my work, I can find and eliminate those mistakes. The finished product is perfect, but it takes a lot of testing to get there.
If you’re a beginning programmer and you want to write some code, I urge you to STOP. Think about what you will do with the code you write. How will you know it works the way you think it should? Already your head should be thinking of test cases.
Test cases are the unit of testing. You have multiple test cases that test specific aspects of the software. Some might be very specific – testing that a function when called with a particular set of parameters does something very specific. Another might be more general – perhaps when the user presses the ‘a’ key an ‘a’ is printed on their screen.
Note that you can write an infinite number of test cases, and miss the most important test cases! For instance, suppose I wrote the recursive Fibonacci function, and I called it for fib(1), fib(2), and fib(10). Thinking it is working fine, I send it to my friend who complains that it takes too long to calculate fib(1000), and it won’t even try to calculate fib(10000).
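To make the Fibonacci example concrete, here is one possible recursive fib next to an iterative alternative. The exact function in the video is not shown, so both definitions (including the convention that fib(1) and fib(2) are 1) are assumptions.

```python
def fib(n):
    """Naive recursive Fibonacci: simple, but repeats work many times."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


def fib_iterative(n):
    """Iterative Fibonacci: one loop, no recursion."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


# The three small test cases pass for both versions.
for n in (1, 2, 10):
    assert fib(n) == fib_iterative(n)

# A large input is the test case the small ones never exercise.
# Only the iterative version is called here.
print(len(str(fib_iterative(1000))))
```

The small cases say nothing about how the function behaves on large inputs, which is why a test such as fib_iterative(1000) belongs in the list of important test cases.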
There are entire books of how to find good test cases. Let me guide you on your path.
Unit tests. That is, testing the unit of programming, the function. Focus on each function, seeing if it calls the functions it is supposed to with the right parameters, and responds appropriately to their result.
Coverage. Try to get a set of test cases that will run every single line of code, and every condition. That is, if there is an ‘and’ or an ‘or’, it will test for True and False of both sides of those operators.
Limits and extremes. If you want to test a function that adds two numbers, you have an infinite number of choices for the numbers to use. When you have to choose numbers, choose some of these interesting numbers:
0, especially if your operation involves division.
-1.
2. It is said that there are only 3 important numbers in computer science: 0, 1, and 2.
1000 or 1000000, and negative the same. Try to pick extreme numbers that exceed the bounds of what you anticipate.
You might get imaginative when you think about what the function is doing, and come up with your own. For instance, pi might be interesting if you are using sines and cosines. e would make a fantastic choice for exponents and logarithms.
Also, for strings, choose strings that have actual Unicode characters in them, like Chinese and special symbols and Arabic or Hebrew. By exercising the full set of Unicode characters, or at least a sampling of the languages you intend to support (the French and Germans do not necessarily speak English all the time!) you will learn the limits of your strings.
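The guidance on limits, extremes, and non-English strings might translate into test cases like the ones below, written with the standard unittest module. The three functions under test (add, divide, shout) are hypothetical stand-ins invented for this sketch.

```python
import unittest


def add(a, b):
    """Add two numbers."""
    return a + b


def divide(a, b):
    """Divide a by b."""
    return a / b


def shout(text):
    """Return the text in upper case."""
    return text.upper()


class LimitsAndExtremesTests(unittest.TestCase):
    def test_add_with_interesting_numbers(self):
        cases = [
            (0, 0, 0),
            (-1, 1, 0),
            (2, 2, 4),
            (1000000, -1000000, 0),
        ]
        for a, b, expected in cases:
            with self.subTest(a=a, b=b):
                self.assertEqual(add(a, b), expected)

    def test_divide_by_zero(self):
        # 0 is especially interesting when division is involved.
        with self.assertRaises(ZeroDivisionError):
            divide(1, 0)

    def test_shout_with_non_english_text(self):
        # German: upper-casing "ß" yields "SS", so the length changes.
        self.assertEqual(shout("straße"), "STRASSE")
        # Chinese has no upper case, so the text is unchanged.
        self.assertEqual(shout("你好"), "你好")
```

Saved in a file whose name starts with `test`, these cases can be run with `python -m unittest`. The German case shows the kind of limit that English-only test strings would never reveal: the result is one character longer than the input.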
There are tools to help you see if you’ve covered all the code with your unit tests. (I recommend the unittest module and nose!)
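For the coverage point about testing True and False on both sides of an 'and' or an 'or', a small sketch follows. The can_call function and its parameters are made up for illustration.

```python
def can_call(friend_is_online, has_phone_number):
    """Return True only when both conditions for placing a call hold."""
    return friend_is_online and has_phone_number


# One test case per combination, so each side of the 'and'
# is exercised with both True and False.
CASES = {
    (True, True): True,
    (True, False): False,
    (False, True): False,
    (False, False): False,
}

for (online, has_number), expected in CASES.items():
    assert can_call(online, has_number) == expected
```

A single test with both arguments True would run every line of can_call, yet it would miss three of the four combinations.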
The only tool I know of to help you analyze which values to use is the grey matter between your ears. I have heard of programs that can analyze code for you, but I don’t think they will work well for Python, nor do I think they can ever do a better job than you. The issue is that these programs look at your code. They can’t read your mind. You need to find out if the code matches what you wish the code would be.
Now, keep in mind that I don’t do a lot of testing during the prototype phase. That is an experimental phase, and I typically have one or two bigger tests I am trying to get to pass. During prototyping, the design is subject to change, and that means the tests need to change as well.
However, as the project progresses, and the design begins to settle down, I’ll take the time to write a test document that describes all the tests we want to run and all the important test cases.
Documentation¶
Setting up Sphinx to auto-build your documentation from your doc strings is not hard. Do it from the beginning!
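A minimal sketch of the relevant part of a Sphinx conf.py is shown below. The project name and the directory layout (conf.py in a docs/ directory, code one level up) are assumptions; sphinx.ext.autodoc is the Sphinx extension that reads doc strings.

```python
# conf.py: a minimal sketch of a Sphinx configuration.
import os
import sys

# Make the package importable so autodoc can read its doc strings.
# Assumes conf.py lives in docs/ and the code lives one directory up.
sys.path.insert(0, os.path.abspath(".."))

project = "friends"  # hypothetical project name

extensions = [
    "sphinx.ext.autodoc",  # pulls documentation out of doc strings
]

# In a .rst page, a hypothetical module named "friends" could then be
# documented with:
#
#     .. automodule:: friends
#        :members:
```

Running `sphinx-quickstart` generates a fuller conf.py to start from; the lines above are the parts that matter for building documentation from doc strings.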
As your code grows in complexity, you’ll want to add pages and additional documentation to explain why the code is organized and functions the way it does.
A set of user docs should be kept separate from the code documentation. These user docs are for the user to read and understand how the software functions.
The best set of user docs is no user docs at all! If the interface is completely intuitive, then you have done something amazing.
However, users will appreciate being able to read more about the software and understand more about how it works and how they can use it. I might not read the docs very much when I first experiment with the code, but once I’ve bought into it emotionally, I’m going to study it deeply and learn all of its secrets.
Some people try to substitute mailing lists, chat logs, and even Stack Overflow-style websites for their user docs. I don’t recommend this. Nothing can compare to a well-written set of accurate user docs. The others are often far worse than even poorly maintained documentation in my experience.
Conclusion¶
I want to work with you to develop a simple software project. There are a few bits and pieces of Python you need to know about, pieces that I want to elaborate on, but I think you are ready at this point to do so.
I will start a video series on such a project. You can follow along with me as I work through the various stages on camera, explaining what I am doing and why I am doing it.