Tips for Data Processing in Google Sheets
Sometimes, data can be inherently messy, incomplete, or too detailed, etc. and so, diving right into the data visualization without processing might result in a turmoil. In this course, you will learn a small set of Sheets functions for the basic data processing. It also sheds light on a histogram – the simplest graphical representation of the distribution of a dataset.
5 Best Formulas for Data Processing
- make “percent frequency”
- create a bar chart to make it visual
TRIM clears away all redundant spaces between words, meaning all spaces before the first character (and after the last character), plus all duplicate spaces between characters. It, thus returns a cell that has only single spaces between words, and is helpful when data has lengthy leading or trailing spaces in the string.
The function is originally meant to replace existing text with something new. (So, it’s similar to the shortcut key “find and replace” that searches a worksheet for a text and replace it with what you want.) Taking advantage of this nature, however, it’s also widely used to replace all spacing – i.e. “_” with none – i.e. “”. ( _ that are enclosed in double quotation marks means there’s a single spacing.)
= SUBSTITUTE(A2, “_”, “”)
CONCATENATE refers to uniting values of two or more cells in a spreadsheet into a third different cell. It is effective when you come across a large dataset that requires to combine the contents next to each other. While the syntax is simple, there’s the option to make it joined with a space in the middle. You can do it by adding “_” between two values to the function.
= CONCATENATE(A2, “_”, B2)
SUMIF is the function designed to conditionally sum numbers. As the name sounds, it sums numbers only if a value in a range meets to a certain condition you set. In this example, the prices of “Pear” are only summed up.
= SUMIF(A2:A10, “Pear”, B2:B10)
COUNTIF is set to count the number of times that a value shows up across an indicated data range that meets a specified condition. As it lets us count the number of each item within a list of data, it’s often used for analyzing relative frequency distribution.
= COUNTIF(A2:A21, A2)
Using COUNTIF for Frequency Distribution
Though it seems overwhelming, it’s pretty easy. To get the percent frequency, just multiply the relative frequency by 100, or click “format as percent” in the toolbar. You can use the $ to enclose the cell D6 by pressing “F4” so a location to the reference will be fixed. This “absolute reference” makes you easily copy and paste the formula.
To select more than one separate column, drag over the ranges while holding the Control (Windows) or Command (Mac) key. Then, you will go to click “Insert Chart” button and it returns the relative and percent distribution of each item. You can also add the titles by editing “X-AXIS” tab.