This Plan-Code-Improve-Polish (PCIP) System is an approach that we can apply to any data analysis or programming project. The goal behind this system is to help structure our thinking so that we can spend our time productively rather than getting stuck and spinning our wheels. Check out each part of the system and the questions we might ask ourselves in general, in programming contexts, and when making data visualizations (both tables and plots).
Plan
Planning is the most important step we can take. This is where we will articulate what we are ultimately trying to do. Keep in mind that good plans include the following three core elements and one optional element.
- Statement of Goal(s): Identify what you are ultimately/primarily trying to do. As you do so, be open to breaking this goal down into smaller sub-goals and tasks.
- Statement of Needs: Identify what you'll need to achieve each of your goals and sub-goals. This would include noun-type objects (e.g., data files, image files, packages, resources) and verb-type objects (e.g., functions, actions).
- List of Steps: Using your list of needs, you can bring the nouns and verbs together to create a list of steps/actions that will help you achieve each sub-goal and then your ultimate/primary goal. Numbering the steps can help you organize your actions in terms of which goals rely on other goals.
- Make a sketch: (Optional) If appropriate to your goal(s), try making a sketch of what you currently imagine the end product looking like. The sketch does not need to be fancy, as its purpose is to give you a visual reminder of what you're aiming for. This is useful when creating data visualizations, dashboards, or apps.
A great example of a plan is any recipe. A recipe will have a statement of what you're making (i.e., the goal), a list of ingredients (i.e., noun-type needs), and a series of instructions telling you what to do (i.e., the steps). Many recipes will also include pictures of the final product and/or key moments in the process.
Key Questions to Ask Ourselves During the Planning Stage
Here are a few questions that we can ask ourselves to help us generate a plan.
- What is your goal? What are you trying to do?
- What data do you need? What data do you have?
- Is the data tidy? Do you need to tidy or otherwise wrangle the data?
- In a Programming Context
- What do you need in terms of nouns and verbs?
- What do you have?
- What steps do you need to work through so that the computer can help you reach your goal?
- In a Data Visualization Context
- What steps do you need to carry out to create the visualization?
- What kind of visualization do you want? What geometry (-ies) do you need?
- Sketch what you're looking to create.
- What goes into your Framework? Content? Labels?
- How many layers do you need?
- Do you need to use an interactive system to get started?
Keep in mind that planning is not always linear. Sometimes you'll be writing out your steps and realize that you're going to need something that you didn't previously think of. Similarly, you might decide that you should create a new sub-goal so you can have sharper focus on a particular task.
Code
After we have made a plan, we can shift to enacting or following it. A good habit for us to adopt for programming and Statistics/Data Science contexts comes again from cooking: read the recipe (plan) several times, top to bottom. This helps you better internalize the plan and make sure that you have everything you might need. In cooking/baking, this is known as mise en place: getting everything together and put into place.
In programming and/or Statistics/Data Science contexts, we can swap out mise en place of ingredients for creating a transition document in our programming environment. These transition documents will help guide our work according to our plan.
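As a rough illustration, a transition document might start as an R script whose comments mirror the plan's goal, needs, and steps, with code filled in beneath each comment as we go. The goal, the use of R's built-in mtcars data as a stand-in for a real data file, and the wording of each step are hypothetical placeholders chosen for this sketch; they are not part of the PCIP system itself.

```r
# Transition document (sketch): comments mirror the plan; code is filled in step by step.
# Goal (hypothetical): summarize fuel efficiency by number of cylinders.

# Needs ----
# Nouns: the built-in mtcars data frame (a stand-in for your own data file)
# Verbs: aggregate() to compute group means

# Step 1: Load packages (none needed here; this sketch uses base R only)

# Step 2: Load the data
cars <- mtcars

# Step 3: Tidy/wrangle as needed (treat the number of cylinders as a category)
cars$cyl <- factor(cars$cyl)

# Step 4: Compute the mean miles per gallon for each cylinder group
mpg_by_cyl <- aggregate(mpg ~ cyl, data = cars, FUN = mean)

# Step 5: Check the result against the goal
print(mpg_by_cyl)
```

Writing the comments first means the script already reads like the plan, even before any of the steps contain working code.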
Key Checks when Coding
Here are a few checks that we can do to make sure that we are in the strongest position possible as we transform our plans into code.
- Load necessary packages (install them as needed).
- Download/obtain necessary data files; load them into the environment.
- Tidy, wrangle/clean your data as needed.
- Create a Transition Document to scaffold your coding environment according to your current plan.
- In a Programming Context
- Use comments to help you organize your code.
- Use your plan to help you write your code in a logical way.
- If you are creating custom/new functions that you are going to re-use, define these first.
- Store outputs as necessary in appropriate [new] objects.
- In a Data Visualization Context
- Using an interactive tool? (e.g., esquisse::esquisser)
- Launch that tool.
- Explore your data.
- Copy and save the code.
- Plots/Graphics via {ggplot2}
- Create the framework and specify the mappings.
- Create content (geometries, glyphs, aesthetics).
- Create the labels (axes, gridlines, scales, legends, titles).
- Tables
- Use the right package for the job.
- {janitor} for frequency tables.
- {knitr} and {kableExtra} for summary tables.
- If knitting to Word, use {flextable}.
- Identify what the rows and columns need to represent.
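The three {ggplot2} steps in the checklist above (framework, content, labels) can be sketched as follows. This is a minimal example that assumes {ggplot2} is installed and uses R's built-in mtcars data; the object names and label text are illustrative choices, not prescribed by the system.

```r
library(ggplot2)

# Framework: the data and the mappings from variables to aesthetics
framework <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg))

# Content: add a geometry layer (points) on top of the framework
content <- framework + geom_point(size = 2)

# Labels: axis labels and a title
car_plot <- content +
  labs(
    x = "Weight (1000 lbs)",
    y = "Fuel efficiency (miles per gallon)",
    title = "Heavier cars tend to get fewer miles per gallon"
  )

print(car_plot)
```

Storing each stage in its own object is only for illustration; in practice the layers are usually chained in a single expression.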
As you code, feel free to work in smaller chunks. This is where having sub-goals can come in handy. You don't have to code every single element before you test out your code. Keep in mind that in this Coding step, perfection is the enemy of progress. We do not need to have the best possible code from the start; rather, we need some initial code that we can improve upon in the next stage.
Improve
As we write code, we will start testing out our code. This helps us debug our code, check the reasonableness of the results, and assess the efficiency of our code.
No matter how much experience you have, you should always carry out the Improve stage at least once. This is as true when you are first learning a new tool as when you've used the tool for decades. At its core, the Improvement stage is about answering one core question: Did you achieve your stated goal(s)?
- No?
- Identify what is going wrong; what the barriers might be.
- Plan your modifications, then Code updates.
- Yes?
- Were there any warning messages you ignored?
- Are there any places where your code could be more efficient?
- Plan your modifications, then Code updates.
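One way to avoid overlooking warnings is to record them as the code runs. The sketch below uses base R's withCallingHandlers() for this; the input vector and object names are hypothetical, and in everyday work simply reading the console may be enough.

```r
# Hypothetical input: one entry cannot be converted to a number
raw_values <- c("12", "7.5", "n/a", "3")

caught <- character(0)  # collects warning messages instead of ignoring them

numbers <- withCallingHandlers(
  as.numeric(raw_values),
  warning = function(w) {
    caught <<- c(caught, conditionMessage(w))  # record the message
    invokeRestart("muffleWarning")             # then let the code continue
  }
)

print(caught)                  # what R warned about
print(which(is.na(numbers)))   # which entries caused the problem
```

Here the code "works" in the sense that it returns a result, but the recorded warning points to an entry that silently became NA, which is exactly the kind of issue the Improve stage is meant to surface.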
Key Checks when Improving
As we work to improve our product (code, data visualizations, etc.), here are some key aspects to focus upon.
- Do this step at least once.
- Run your code. Look at the console for any warnings or error messages.
- Did you achieve your stated goal?
- Plan your modifications, then code.
- Focus on fundamental aspects, not aesthetics.
- In a Programming Context
- Critically look at your code.
- Are you using meaningful names?
- Are there any places where you can be more efficient?
- Keep track of your bugs and how you fixed them.
- Incorporate comments to help you make sense of your code in the future.
- Do you have DRY code? That is, have you applied the principle of Don't Repeat Yourself to your code?
- In a Data Visualization Context
- Critically examine your visualization in terms of EPTs, Tufte's Principles of Analytical Design, and Kosslyn's Eight-fold Way.
- What improvements can you make to your visualization?
- Do you need to add or remove layers to highlight the message you want to convey?
- Do you need to adjust glyph sizes and/or add aesthetics to make the graph more readable?
- Do your readers make sense of the visualization (and the data) in a way that is consistent with (but not necessarily identical to) what you intended?
- Did you make a “duck”?
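To make the DRY check from the programming list concrete, here is a small base R sketch. The rescaling task, the helper name rescale_01, and the use of the built-in mtcars data are assumptions made for illustration.

```r
# Repetitive version: the same rescaling logic is typed out for each column
scaled_mpg <- (mtcars$mpg - min(mtcars$mpg)) / (max(mtcars$mpg) - min(mtcars$mpg))
scaled_wt  <- (mtcars$wt - min(mtcars$wt)) / (max(mtcars$wt) - min(mtcars$wt))

# DRY version: define the logic once, then reuse it
rescale_01 <- function(x) {
  # Rescale a numeric vector so its minimum is 0 and its maximum is 1
  (x - min(x)) / (max(x) - min(x))
}

scaled <- lapply(mtcars[c("mpg", "wt")], rescale_01)

# Both approaches give the same result
stopifnot(isTRUE(all.equal(scaled$mpg, scaled_mpg)))
```

With the function in place, a bug fix or a change to the rescaling rule only has to be made in one spot, and adding another column means adding one name rather than another copied line.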
Cycling Through Plan-Code-Improve
You'll notice that the first three stages of the PCIP system are connected in the diagram with a loop. This reflects that we should anticipate moving through Planning, Coding, and Improving several times. Iterating through this cycle is something that programmers, statisticians, data scientists, data analysts, and researchers do all of the time. Eventually, we reach a point where our work is good enough to move on to the last stage.
Polish
The final stage of the PCIP system is the Polishing stage. Here the idea is to take a few moments and make sure that our work is in the best shape that it can be in. While it might seem like Improving and Polishing are the same thing, they are different. The Improvement stage can lead to large changes in your final product; the Polishing stage will not. Improving is about making sure your code works (efficiently); polishing is about making sure your code is readable. Improving your data visualization focuses on ensuring that your message is conveyed while polishing focuses on artistic choices for your final visualization.
In the Polishing stage we will want to focus on critically appraising our work from two standpoints: how readable is our code and how appealing are our end products? You will want to think about how someone else might interpret your code and/or understand the data from your visualization.
Key Checks when Polishing
When polishing your work, here are some key things to focus on.
- Only polish after you've gone through at least one round of Plan-Code-Improve.
- Critically appraise your work.
- How appealing is your work?
- How easily could someone (including your future self) understand the work?
- Be open to the possibility of going back for another cycle of Plan-Code-Improve.
- In a Programming Context
- Look through your code again.
- Did you use meaningful names?
- How readable is your code?
- Is your code consistently formatted?
- Did you follow a coding Style Guide?
- Where might you add additional comments to improve the readability and re-usability of your code in the future?
- In a Data Visualization Context
- Check spelling and grammar of all parts of the visualization.
- Double check that you have all key elements labeled.
- Do you have a graph title or caption? Add one if not.
- Did you set alt text for a plot/graph?
- Check legibility (e.g., do you need to change font sizes?).
- Do you have any vibrations, grids, or other chartjunk that need to be addressed/removed?
- Are there any final improvements to be made to enhance the appeal of the plot in addition to the utility of the visualization?
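Several of the visualization checks above (title, caption, alt text, font size, grid lines) map onto specific {ggplot2} arguments. The sketch below assumes a version of {ggplot2} recent enough to support the alt argument of labs() (3.3.4 or later, as far as we know) and uses the built-in mtcars data; the wording of the labels and the theme choices are illustrative only.

```r
library(ggplot2)

polished_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(size = 2) +
  labs(
    x = "Weight (1000 lbs)",
    y = "Fuel efficiency (miles per gallon)",
    title = "Heavier cars tend to get fewer miles per gallon",
    caption = "Data: mtcars data set included with R",
    alt = "Scatterplot of fuel efficiency against weight for 32 cars; points trend downward from left to right."
  ) +
  theme_minimal(base_size = 14) +            # larger base font for legibility
  theme(panel.grid.minor = element_blank())  # drop minor grid lines

print(polished_plot)
```

None of these changes alter what the plot says about the data; they only affect how easily a reader can take it in, which is the distinction between Improving and Polishing.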