Quote of the Day
In times to come, when we are all gone, people not yet born will walk in the sunshine of their own days because of what women and men did at this hour to feed the sick, to heal and to comfort.
While COVID rages around the US, I have been staying isolated at my lake cabin in northern Minnesota. Because of my love for data, I track various lake metrics such as height relative to sea level, clarity, and temperature. I gather this information into a report that I send to the other folks who live on the lake. In this post, I show how I put together a chart of lake water temperature vs time using Excel and Power Query. I also demonstrate how to use the FILTERXML function to parse some rather messy data. I find myself using FILTERXML for data parsing more and more.
For those who like to follow along, you can download my workbook here.
I measure the lake temperature at the end of my dock using a K-type thermocouple. The thermocouple is mounted 15 centimeters below the water surface. While I take the readings manually today, I am setting up an Arduino system to automate the temperature reading next year. I begin taking readings as soon as the ice is off the lake and stop taking readings when the lake is ice-covered. The lake is not ice-covered yet, but I decided to start putting my report together early this year.
Air Temperature Readings
Because my wife likes to see how air temperature and lake temperature are related, I also plot the air temperature as recorded by a nearby National Oceanic and Atmospheric Agency (NOAA) site in Hibbing, Minnesota. They have an excellent web site that allows you to download all sorts of US weather data (link).
Parsing with FILTERXML
I have never cared for Excel’s Text-to-Column function because it does not update automatically. I now tend to parse text using functions or FILTERXML. For this example, I will be using FILTERXML. Because you may not be familiar with it, I include an example of how it can parse an XML string in Figure 2. For general parsing, you often can convert a string to an XML string using the substitute command and then apply FILTERXML to parse the modified string. For more details, see the Chandoo blog for a great example.
I recorded my lake temperature data on Google Keep over time. When it was time to report, I simply pasted the data into the LakeData tab of the Excel Workbook. I then used FILTERXML to parse the data. Similarly, I downloaded the air temperature data from the NOAA website, converted from PDF to CSV using Tabula, and pasted the data into the NOAA tab of the workbook. Both raw tables were loaded into Power Query for general cleaning (e.g. converting strings to dates, etc). The clean data is then charted in Figure 3.
I should mention that the downloadable data from NOAA is always two days behind.
Figure 3 shows the plot that I will use in my lake metrics report for lake and air temperature. The light yellow area chart shows the range of daily temperatures. The purple diamonds show the lake temperature data. The chart does show that the lake temperature and air temperature tend to track most of the year, but diverge as winter arrives because the lake does not cool off as quickly as the air.
This workbook is a good example of how to gather data, parse it, clean it, merge it, and plot sensor data. It shows how to parse the data using FILTERXML that updates automatically as more data is collected.