Do you love clicking around interactive maps? Frustrated you can’t get your spreadsheet charts to match your vision? If you’re drawn to data and visual design, learning the fundamentals of data visualization could be a fruitful way to bring value to an employer—and grow your career.
By becoming handy with data viz, as it’s often called, you’ll have a valuable new tool to apply to fields such as design, engineering, journalism, or data analysis. Someday, you might even be building and designing charts and other visual representations as a full-time job as a data visualization designer or engineer.
In this guide, we’ll walk you through the important concepts in data visualization, the tools of the trade, and how to learn them. Spoiler alert: CodeSignal Learn offers data visualization lessons for free. You can also find courses on the fundamentals of Python, which you’ll need in order to use the most common data viz libraries.
Hold onto your X-axis as we chart your way to data viz success!
Introduction to data visualization
Famed data visualization thinker Edward Tufte said, “The commonality between science and art is in trying to see profoundly—to develop strategies of seeing and showing.”
Data visualization does more than convert numbers into lines and shapes. Thoughtfully crafted charts can change minds, influence budgets, and inspire movements. Less dramatically, they’re a more efficient, easier-to-digest way to look at data than tables.
While statisticians and others have been visually representing data in print for over two centuries, the practice has shifted to computers and the internet. Today, there are three main ways to visualize data:
Spreadsheet apps like Excel and Google Sheets make serviceable charts, and with some practice and a few tricks, you can customize them to some extent.
Business intelligence and data analysis platforms, such as Tableau or Microsoft’s Power BI, offer many ways to manipulate and visualize large amounts of data from different sources without learning code.
Code frameworks, including Matplotlib and Plotly for Python or d3.js for JavaScript, are the most flexible and powerful.
What is data visualization?
Data visualization transforms information into images. It’s usually based on numbers, whether tracking single data points (say, the daily high temperature in a certain place) or counts (for instance, the number of babies in Canada given the name Michael every year).
Scottish engineer William Playfair introduced the world to line, bar, and pie charts nearly 250 years ago. Other types of charts came about in the 19th and 20th centuries, and in the 21st, interactive computing and an explosion in data have combined to make all sorts of visualization possible.
In this guide, we will focus on modern tools and techniques that use code or specialized platforms to create highly customizable and often interactive visualizations.
Why is data visualization important?
A picture may be worth a thousand words, but a chart can tell the story of millions of data points. Charts, maps, and even well-formatted tables can turn numbers into stories. Persuasive presenters use them to back up bold assertions. Curious investigators uncover patterns or anomalies in the lines, shapes, and shades.
The science and art of data visualization have coevolved with the internet and Big Data. Enormous datasets and interactive charts open up so much of the world to exploration and understanding.
In dynamic situations, from monitoring company-wide sales to a region’s incidence of disease, decision-makers often rely on dashboards with many charts to stay on top of dozens or even hundreds of variables at once. In more static situations, such as looking at combinations of historical data to inform strategic choices, charts can be even more customized, with annotations and other design choices specific to the story to be drawn out of the data.
All indications are that data viz will continue to be an important part of any role that requires understanding or communicating about data.
Is data visualization easy to learn?
Whether you’ll find data viz easy to learn depends on your current skills, your aptitude for code, and how far you want to go.
You should be able to grasp the basics of no-code business intelligence (BI) software, such as Tableau or Power BI, within 10 to 15 hours of concerted study. Getting certified with BI software could take weeks to a few months.
To use Python data visualization libraries, you’ll need to learn the fundamentals of the Python programming language if you haven’t already. That can take a few months, but don’t stress—it’s become such a common language in part because it’s easier to learn than most others, and CodeSignal Learn can help get you there. Once you know Python, learning how to use the individual libraries is a matter of practice: you can follow our learning paths quickly, but mastery takes time.
If you’re not already up to speed with core statistics concepts, expect the process to go a bit slower. You’ll need to either familiarize yourself at the outset or look up terms as you go along.
What is the difference between data analysis and data visualization?
Data analysis and data visualization are very close friends. In fact, most data analysts end up doing some data visualization regardless of their particular training.
Data analysis covers all the tools and techniques for making sense of data. Much of data analysis involves manipulating data: sums, averages, comparisons, regressions, and the like. The results end up as one or more data points, which the analyst may display as a chart, and that’s where visualization comes in. The overlap between the two is in the choices about how to display data, such as the type of chart, what data to include and exclude, and how to scale the axes.
We enter the realm of pure data viz with aesthetics and interaction design. This includes choices made to improve understanding, such as colors and labels, and ways users can engage with charts, such as hovering, clicking to expand, and searching across particularly large data sets.
Essential skills for breaking into data visualization
As we’ve mentioned, data viz is a balance of art and science. We’ll start by discussing design considerations such as which type of chart to choose. Then we’ll look at the technical options that are commonly used in data visualization today.
Understanding the different chart types and how to choose between them
What type of data are you displaying and what story do you want it to tell? Are you looking at how a single factor changes over time? Are you comparing different data series? Is the data geographical? Does it follow a multi-step process? This analysis is the first part of your chart selection strategy.
Let’s examine some of the most common types of charts and when to use them.
Line charts
Line charts show points in a data series connected by a line. The Y-axis (the vertical one) represents the value of the data, while the X-axis (the horizontal one) specifies each point in the series. Most commonly, the X-axis is time, but it can be some other unit, such as price or size. A line chart can contain multiple lines, allowing the comparison of two or more data series.
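As a rough illustration of the description above, here is a minimal Matplotlib sketch of a line chart with two series. The product names and monthly values are invented for the example, not real data.

```python
import matplotlib.pyplot as plt

# Invented monthly values for two hypothetical products (illustration only).
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
product_a = [12, 15, 14, 18, 21, 24]
product_b = [9, 11, 13, 12, 16, 19]

fig, ax = plt.subplots()

# Each plot() call adds one line. The X-axis holds the points in the series
# (here, months) and the Y-axis holds their values.
line_a, = ax.plot(months, product_a, marker="o", label="Product A")
line_b, = ax.plot(months, product_b, marker="o", label="Product B")

ax.set_xlabel("Month")
ax.set_ylabel("Units sold (thousands)")
ax.set_title("Two data series on one line chart")
ax.legend()

plt.show()
```

Putting both lines on the same axes is what makes the comparison possible: the reader can see where the series converge or diverge without reading a single number.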
Bar charts
Bar charts are used to compare values across categorical variables, meaning variables with clearly separate groups such as blood type or political party affiliation. (They can also be used for ranges like income bands or years.) A bar chart can compare one or multiple data series; usually, color is used to distinguish the series.
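One way to sketch a bar chart that compares two series across categories is shown below, using Matplotlib. The two groups and their percentages are made up for illustration, and placing the bars side by side is just one possible layout.

```python
import matplotlib.pyplot as plt

# Invented shares (in percent) for two hypothetical sample groups.
blood_types = ["O", "A", "B", "AB"]
group_1 = [45, 40, 11, 4]
group_2 = [38, 35, 20, 7]

bar_width = 0.4
positions = list(range(len(blood_types)))

fig, ax = plt.subplots()

# Shift each series half a bar width left or right so the bars sit side by side.
bars_1 = ax.bar([p - bar_width / 2 for p in positions], group_1,
                width=bar_width, color="tab:blue", label="Group 1")
bars_2 = ax.bar([p + bar_width / 2 for p in positions], group_2,
                width=bar_width, color="tab:orange", label="Group 2")

# Label each pair of bars with its category.
ax.set_xticks(positions)
ax.set_xticklabels(blood_types)
ax.set_xlabel("Blood type")
ax.set_ylabel("Share of sample (%)")
ax.legend()

plt.show()
```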
Pie charts
Pie charts show the components of a whole, whether on a percentage or absolute basis. A donut chart is simply a pie chart with a hole in the middle; which one to use is purely an aesthetic decision.
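To make the pie versus donut comparison concrete, the sketch below draws the same invented budget shares both ways with Matplotlib. The only difference is the `wedgeprops` width setting, which narrows the wedges and leaves the center empty.

```python
import matplotlib.pyplot as plt

# Invented budget shares for illustration; they sum to 100.
categories = ["Housing", "Food", "Transport", "Other"]
shares = [50, 25, 15, 10]

fig, (pie_ax, donut_ax) = plt.subplots(1, 2, figsize=(9, 4))

# Standard pie: each wedge's angle is proportional to its share of the total.
pie_ax.pie(shares, labels=categories, autopct="%1.0f%%", startangle=90)
pie_ax.set_title("Pie chart")

# Donut: the same call, with narrower wedges so the middle is left open.
donut_ax.pie(shares, labels=categories, autopct="%1.0f%%", startangle=90,
             wedgeprops={"width": 0.4})
donut_ax.set_title("Donut chart")

plt.show()
```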
Histograms
Histograms look similar to bar charts, but they show the distribution of values within a given data series. A large data set is divided into equal-width ranges, often called bins (say, household income in a country, broken down by $5,000 increments), and the height of each bar shows how many values fall in that range. Histograms show more nuance than summary statistics like the average and median.
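The household income example can be sketched as follows. The incomes are synthetic, drawn from a seeded random generator purely for illustration, and the distribution parameters are arbitrary assumptions. The mean and median are drawn as vertical lines to show how little of the shape those two numbers capture.

```python
import random
import statistics

import matplotlib.pyplot as plt

# Synthetic, right-skewed "household incomes" (not real data).
# Values are capped just below 150,000 so every one lands in a bin.
rng = random.Random(42)
incomes = [min(rng.lognormvariate(11, 0.5), 149_999) for _ in range(1000)]

# Bin edges every $5,000 from 0 to 150,000, which gives 30 bins.
bin_edges = list(range(0, 150_001, 5_000))

fig, ax = plt.subplots()
counts, edges, _ = ax.hist(incomes, bins=bin_edges, edgecolor="white")

# Summary statistics, drawn on top of the distribution for comparison.
mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)
ax.axvline(mean_income, color="tab:red", linestyle="--", label="Mean")
ax.axvline(median_income, color="tab:green", linestyle=":", label="Median")

ax.set_xlabel("Household income ($)")
ax.set_ylabel("Number of households")
ax.legend()

plt.show()
```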
Scatter plots
Scatter plots position each data point by two values, one on each axis, so you can see every point in a series at once and spot relationships between the two variables. You’ll often see these when comparing a sizable but manageable number of data points, such as statistics across countries of the world or the sports teams in a league. Often, they’ll have a best-fit line, which is a mathematical function that represents the overall trend of the data.
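Here is a minimal sketch of a scatter plot with a best-fit line, assuming a straight-line (least-squares) fit computed with NumPy’s `polyfit`. The team statistics are invented, and other kinds of trend line are possible.

```python
import numpy as np
import matplotlib.pyplot as plt

# Invented season statistics for ten hypothetical teams in a league.
goals_scored = np.array([38, 45, 52, 41, 60, 35, 48, 55, 43, 50], dtype=float)
points_earned = np.array([40, 52, 61, 44, 75, 36, 55, 68, 50, 58], dtype=float)

# Fit a degree-1 polynomial (a straight line) by least squares.
slope, intercept = np.polyfit(goals_scored, points_earned, deg=1)

fig, ax = plt.subplots()

# One marker per team, positioned by its two values.
ax.scatter(goals_scored, points_earned, label="Teams")

# Draw the fitted line across the observed range of the X values.
x_line = np.linspace(goals_scored.min(), goals_scored.max(), 100)
ax.plot(x_line, slope * x_line + intercept, color="tab:red", label="Best-fit line")

ax.set_xlabel("Goals scored")
ax.set_ylabel("Points earned")
ax.legend()

plt.show()
```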
Bubble charts
Most bubble charts are scatter plots with variously sized circles adding another dimension of comparison. For instance, when comparing countries’ life expectancy, the circle size could represent the amount spent on healthcare per person. Color can denote a category, such as continent. There are also linear bubble charts, which are an alternative to line charts, and bubble clouds, which have no axes but simply represent labeled data points in proportionally sized circles.
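The life expectancy example might be sketched like this in Matplotlib. The countries, continents, and all figures are placeholders invented for illustration, and the scaling from spending to bubble area is an arbitrary choice made so the circles are a readable size.

```python
import matplotlib.pyplot as plt

# Placeholder rows: (name, continent, income per person in $ thousands,
# life expectancy in years, health spending per person in $).
countries = [
    ("Country A", "Continent 1", 12, 68, 400),
    ("Country B", "Continent 1", 25, 74, 1500),
    ("Country C", "Continent 1", 40, 79, 3500),
    ("Country D", "Continent 2", 8, 63, 200),
    ("Country E", "Continent 2", 30, 77, 2500),
    ("Country F", "Continent 2", 55, 82, 6000),
]
continent_colors = {"Continent 1": "tab:blue", "Continent 2": "tab:orange"}

# Matplotlib's `s` is the marker area in points squared, so dividing spending
# by a constant keeps bubble area proportional to spending.
DOLLARS_PER_AREA_UNIT = 5

fig, ax = plt.subplots()
for continent, color in continent_colors.items():
    rows = [c for c in countries if c[1] == continent]
    ax.scatter(
        [r[2] for r in rows],                          # X: income per person
        [r[3] for r in rows],                          # Y: life expectancy
        s=[r[4] / DOLLARS_PER_AREA_UNIT for r in rows],  # size: health spending
        color=color, alpha=0.6, label=continent,       # color: continent
    )

ax.set_xlabel("Income per person ($ thousands)")
ax.set_ylabel("Life expectancy (years)")
ax.legend(title="Continent")

plt.show()
```

The chart encodes four variables at once: two through position, one through size, and one through color.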
Heatmaps
Heatmaps represent data values using color. These are particularly useful for showing data within a spatial context, such as on a map. They can also be used on a grid to reveal patterns that a simple line graph may not show as clearly, such as phenomena more likely to happen on a certain day of the week. In other cases, they simply make it easier to see the range of values in a large number of data points at a glance.
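For the grid case, a minimal Matplotlib sketch could look like the following. The event counts per day and time of day are invented, and `imshow` with the `viridis` colormap is just one reasonable way to draw the grid.

```python
import matplotlib.pyplot as plt

days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
time_blocks = ["Morning", "Afternoon", "Evening", "Night"]

# Invented event counts: one row per day, one column per time block.
event_counts = [
    [5, 8, 12, 3],
    [4, 9, 11, 2],
    [6, 7, 13, 3],
    [5, 10, 14, 4],
    [7, 12, 25, 15],
    [10, 18, 30, 22],
    [9, 15, 16, 6],
]

fig, ax = plt.subplots()

# Each cell is colored according to its value.
image = ax.imshow(event_counts, cmap="viridis", aspect="auto")

ax.set_xticks(range(len(time_blocks)))
ax.set_xticklabels(time_blocks)
ax.set_yticks(range(len(days)))
ax.set_yticklabels(days)

# The colorbar is the legend that maps colors back to values.
fig.colorbar(image, ax=ax, label="Number of events")

plt.show()
```

In this made-up grid, the bright cells cluster on Friday and Saturday evenings, a day-of-week pattern that would be harder to spot in a single line.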