## Medal of Honor Statistics Using Power Query

Quote of the Day

Their drills are bloodless battles, and their battles bloody drills.

Publius Flavius Vegetius Renatus, Roman writer on the Roman military. An older version of "train like you fight, fight like you train."

Figure 1: US Air Force Medal of Honor. (Source)

In this post, I will use Power Query (PQ) to gather (aka web scrape) the US Medal of Honor (MOH) recipient names, rank, service, and conflict from a website called the Congressional Medal of Honor Society (CMOHS), which states that there have been  3505 MOH recipients. They have a beautiful website with the records stored on 140 ages with 25 names per page and one page with 5 names. Normally, I would use Python to scrape a large number of web pages but I am trying to use PQ more because my customers all have Excel. My approach as a contractor is always to work within the existing infrastructure of my customers. A copy of my workbook is available here.

I should mention that the Wikipedia states there have been 3522 MOH recipients. I do not know the reason for the discrepancy, but I do know that I gathered every MOH recipient from the CMOHS pages.

The web scraping process was straightforward:

• Create a list of all web pages with MOH recipients.
This was easier than you might think. There were 141 pages, each with the base URL and an appended number (1 through 141). You can see the format in the attached workbook.
• Build a parsing template by parsing one webpage
This was a little harder than you might think. PQ could not determine how to identify the information on the web page because it was not table-oriented. However, I took a standard web query and modified it by converting a WebPage command into a WebContents command –  it loaded in the raw HTML. I then parsed the HTML to extract the information. I illustrate the process in Figure 2, a figure which is large and you must click on it to view. The HTML parsing was routine and I just refer you to the query to see how I did it.

Figure 2: Grabbing HTML..

• Turn the single-page parsing routine into a function
This is a routine process. You add a function header (aka signature) and a closing phrase. I illustrate the process in Figure 3.

Figure 3: Turn a Query Into a Function.

Make the list of web page addresses into a table and load it into PQ.
• Run the function over every web address, producing a column of tables.
I used the "Invoke Custom Function" command from the menu to create a new column with the parsed data. I illustrate the process in Figure 4.

Figure 4: Calling the Parsing Function.

• Expand the column of tables into its separate fields.
You now have all the MOH recipients. At this point, I can begin filtering and pivoting.

I know it sounds complicated but it is simple when you see it in the spreadsheet.

Now that we have the data, let's see what we can learn. I decided to focus on WW2 and later conflicts. I wanted to know the number of MOH recipients by:

• conflict
• service
• rank

I now could generate the pivot tables that I wanted (Figures 5 through 7).

Figure 5: Number of MOH Recipients By Conflict.

Figure 6: Number of MOH Recipients By Service.

Figure 7: Number of MOH Recipients By Rank (Top 10).

The results make sense:

• WW2 was such a massive conflict that the number of recipients should be larger.
• The US Army is the largest of the US military services, so it would make sense for it to have the most recipients.
• Enlisted ranks would be more likely to be in the life and death situations that would call for such acts of bravery.
Posted in Excel | 3 Comments

## Computing the Number of Friday the 13ths in a Year Using Excel

Quote of the Day

What is our intention for this meeting? What’s important? What matters?

Oprah Winfrey. She starts every meeting with these three questions.

Figure 1: Number of Friday the 13ths By Year.

I have been tutoring math at the local library and using Excel as a vehicle for encouraging people to explore everyday math. While at the library, I heard a young man ask "How many Friday the 13ths are in a year?" Since I am always looking for computational examples, I showed him how to use Excel to find the answer for himself. This post shows how I taught him to solve the problem. The solution turned out to be a good example of using Excel's date and array capabilities.

Figure 1 shows a simple table with the number of Friday the 13ths by year. I deliberately made the table 28 years per column to show that calendars repeat with a period of 28 years.  I use this characteristic of calendars to buy old calendars that I can reuse; I love old Christmas and tool company calendars.

Figure 2 shows the Excel array formula I used to compute the number of Friday the 13ths. My workbook is here.

Figure 2: Formula Breakdown for Calculating the Number of Friday the 13ths in a Year.

Posted in Excel | 3 Comments

## Excel VBA Code to Center a Shape in a Cell

Quote of the Day

The worst pain a man can suffer is to have insight into much and power over nothing.

Herodotus. This quote reminds me a bit of the curse of Cassandra.

Figure 1: Animated GIF Showing Operation of VBA Shape Centering Macro.

I recently finished a job where the customer wanted an Excel dashboard that displayed metrics for test case completion and various success metrics. This dashboard contained many control shapes that I wanted to be centered in cells. I do not like to manually adjust objects so I googled for a VBA routine that would center a shape. I soon found a nice piece of code by HipGecko on the Mr. Excel forum that centered pictures in the active cell. A simple modification of this code allows it to center shapes, an object type that includes pictures and controls.

Figure 1 shows an animated GIF of the macro centering a flower picture a button shape in two different cells. The code will center the object in the cell if the upper-left-hand corner of the object is in the cell. If multiple shapes are in the cell, the code will center all the shapes on top of one another.

You can download a workbook with the code here. The source is shown below.

## Tensions and Angles in a Simple Rope Rigging

Quote of the Day

Once you get to earth orbit, you’re halfway to anywhere in the solar system.

Figure 1: Great Rope Tension Infographic. (Source)

I have been tutoring math and physics at the local library for the last few months. As part of this tutoring, I have been looking for good graphics that illustrate basic science concepts. One common high-school physics problem involves computing the tension in ropes tied to an anchor by a pulley. Figure 1 is a graphic that nicely illustrates the tension between two ropes connected to an anchor point by a carabiner.

In this post, I will show how to derive a couple of formulas for the various angles and forces shown in Figure 1. The derivation assumes that there is no friction associated with the carabiner, which is not true. This idealization would be better if the carabiner was replaced with a pulley.

I find infographics like this useful because I frequently use ropes, pulleys, and winches at my cabin to perform tasks like removing tree stumps, pulling docks out of the water, and helping folks whose cars are stuck in the mud.

Figure 2 shows a free body diagram of the rigging and my derivation of the angles and forces involved.

Figure 2: Free Body Diagram and Derivation.

I created an Excel Workbook that computes the values shown in Figure 3. It also computes the angle of the gray rope in Figure 1, which is not shown in that diagram.

Figure 3: Excel Table of Rope Tensions and Angles.

In Figure 2, I derived $F=T\cdot \frac{{\sin \left( \theta \right)}}{{\sin \left( \varphi \right)}}$. This equation is not defined when φ = 0. You can circumvent this discontinuity using one of three techniques:

• Assume that φ is very small rather than 0. The classic approach for an engineer and the one I typically use.
• Derive an alternate expression that removes the discontinuity, which is shown in Figure 4.
• Just eliminate φ from the equation for F as shown in Figure 5.

Figure 4: Derivation of Alternate Solution without a Discontinuity at φ =0.

Figure 5: Formula Eliminating Φ.

Posted in Cabin, Construction, Daily Math | 10 Comments

## Measuring Countersink Angle Using Gage Balls

Quote of the Day

A wealth of information creates a poverty of attention.

Herbert Simon, economist

Figure 1: Example of a Countersink Specification. (Source)

This post will cover how to measure a countersink angle using gage balls. Figure 1 shows how a countersink is normally specified on an engineering drawing. I frequently use countersinks in my wood and metal working hobbies. In addition, using gage balls to measure the countersink angle provides a good example of how to apply basic geometry concepts to a practical problem.  I use this example in my role as a volunteer adult math tutor at our local library.

Two gage balls of different diameters are used to measure the countersink angle (θ). Figure 2 shows the key variables involved: the diameter of ball 1 (B1), the diameter of ball 2 (B2), and the height difference when placed into the countersink (M). Figure 2 also presents the formula for the countersink angle θ.  The derivation requires only to apply the definition of the sine of a right triangle.

Figure 2: Illustration Defining the Variables Used in This Derivation.

Figure 3 shows a scale drawing of a countersink angle measurement example.

Figure 3: Countersink Angle Determination Example.

Figure 4 shows how I performed the calculations in Mathcad.

Figure 4: Example Calculations.

## US and Canada Aircraft Production During WW2

Quote of the Day

Followers will take on the personality and the character of the leader. As a leader, make sure you show the team your very best.

Mark Hertling. No truer words on management have ever been spoken. Every leader is the model for their team.

Figure 1: The Consolidated B-24 Liberator was the Most Produced WW2 Heavy Bomber. (Photo)

While looking for some good summer history reading, I found the book America's Hundred Thousand: U.S. Production Fighters of World War II. This book covers the production miracle associated with scaling up up the US aircraft industry to supply planes for every front during WW2. Its title refers to the fact that the US produced ~100K fighter aircraft during WW2, which lasted for 44 months for the US (Figure 2). I decided that I would look at the numbers for all forms of aircraft produced by the US during WW2. Fortunately, the Hyperwar website has put the Army Air Forces Statistical Digest online, which gives me easy access to the data. The Digest contains aircraft production data for both the US and Canada. Figure 1 shows the production numbers for the 11 categories of aircraft production listed in the Statistical Digest. In addition to 100K fighter aircraft, there were nearly 200K of other aircraft manufactured as well.

I should mention that different WW2 sources have somewhat different totals. Some databases show more than 100K fighters having been produced by the US and Canada during WW2.  This is because of a number of experimental and special purpose versions of aircraft being manufactured. The tables I chose in the Hyperwar database include a class of aircraft called "other" that includes these unusual units. Other databases eliminate the "other" category and include these aircraft in a related category.

Figure 2: Hyperwar Data on US Aircraft Production During WW2. (Source)

The increase in the level of aircraft production is easy to see using Figure 3.

Figure 3: WW2 US and Canada Aircraft Production Numbers By Year.

You can see the scope of the effort in the US by looking at the number of states where the production occurred – 24 states out of 48 were producing aircraft (Figure 4). I would have expected that California would dominate all the other states in terms of production because it was the prewar heart of US aircraft production. The table shows that other states, like New York and Kansas, also had significant production.

Figure 4: Aircraft Production By State Plus Canada.

There were 53 companies making aircraft during WW2, plus thousands of subcontractors. Figure 5 shows the top 15. Notice how Ford is in the top 15 of aircraft manufacturers. During WW2, they set up an assembly line for B-24 bombers at Willow Run. At one point, 70% of the B-24 production was from Willow Run.

Figure 5: Top 15 Companies For Aircraft Production.

In Figure 6, I broke out the production for North American Aviation because it was so massive. Its portfolio included iconic aircraft like the P-51, B-25, and B-24. I should mention that the A-36 was a ground attack version of the P-51, so you could argue that production of the AT-6 and P-51 were about equal.

Figure 6: Aircraft Produced By North American Aviation During WW2.

The US and Canada produced nearly 300K airframes during WW2. Figure 7 shows the percentage breakdown by major category. While fighter planes may have dominated in quantity, the B-29 very heavy bomber dominated in terms of development cost. At $3 billion dollars, it was the single most expensive development program during WW2 – its cost beat the Manhattan project by$1 billion.

Figure 7: Aircraft Manufactured Percentages By Category.

For those who are interested, my workbook is here.

## Using Excel to Compute Effective Tax Rate

Quote of the Day

A man who is empty on the inside must decorate himself on the outside.

— Bruce Lee

Figure 1: Income Tax Rate Versus Income When Married Filing Jointly.

I have been working as a contractor for the last few months and it is now time for me to make a quarterly tax payment to the folks at the Internal Revenue Service (IRS). This involves using some tables listed on Form 1040ES (see Appendix A). The US tax code uses has a graduated tax rate; the tax rates on higher amounts of income are higher than on lower amounts. As I calculated my tax payment, I became curious as to my effective tax rate. I am going to use Excel to duplicate a graph that I saw on the Wikipedia, updated with 2018 tax rates. My version of the graph is shown in Figure 1.

I am not a tax accountant, so please do not take anything I say as tax advice – we are just looking at how to compute your effective tax rate using Excel. One complicating factor is that you must choose between four different income graduation schedules: married jointly, married separate, single, and head of household. Figure 2 shows these different schedules in one table. The IRS normally shows them as four different tables (Appendix A). I will be filing as a married person with my wife.

My Excel workbook is shown in Figure 3. You can download the workbook from here.

Figure 3: Worksheet Excerpt.

The calculation is fairly straightforward. Here are some calculation details (Figure 4):

• You select your filing category and income.
• The spreadsheet selects the correct column of income graduations (aka hurdles).
• The tax function uses Excel's SUMPRODUCT for the calculation.
• The function is designed to apply the lowest tax rate to every dollar earned, plus the appropriate tax increment for each income hurdle.
• To make the calculation of the tax increments simpler, I need to include a 0 as the heading value in the tax table. I show the correct heading label by using a custom format.

Figure 4: Effective Tax Rate Calculation Details.

## Appendix A: IRS Rate Information From 1040ES

The rate information shown in Figure 5 is from Form 1040ES, which is available here.

Figure 5: IRS Rate Information from Form 1040ES.

## Using Excel Custom Formatting To Hide Redundant Table Information

Quote of the Day

To see things in the seed, that is genius.

Figure 1: Requirements Dissemination Process.

I currently am doing some contract work as a system engineer. In this role, I am working on generating and managing lists of requirements to drive the engineering design work. The requirements are stored in a database that I query to generate requirements lists for specific subsystems, like a receiver or transmitter assembly. These requirements are disseminated to the engineers in various subsystem documents (Figure 1).

The query outputs need some cleanup and must be augmented with information from other databases. I do this work in Excel. Figure 2 shows a simplified example of a query output as it looks in Excel initially; there actually are about a dozen columns in the database output.

Figure 2: Raw Requirement Output from the Requirements Database.

I prepare these lists for publication using Excel, tables, and structured addressing. The company style guide wants to see the requirements listed as shown in Figure 3. This style requires that the column values are not repeated if they are the same as the previous column value. I wanted to duplicate this style in my Excel workbook.

Figure 3: Requirements Formatted Per Company Style Guide For Requirement Lists.

The quickest way to format a table in this manner is to use custom formatting. The column contents are hidden by making the font color the same as the cell background; all the data is still in the table, it is just not visible. The presence of banded rows complicates the situation a bit. Figure 4 shows how I used a custom format to hide the redundant text. If the column contents are different than the previous contents, the black default font color is used.

Figure 4: Custom Formatting Dialog.

The details of the custom format formulas are shown in Figure 5.

Figure 5: Custom Format Formula Breakdown.

This table can be pasted into a Word document and it will retain the formatting from Excel.

For those who are interested, my example workbook is here.

## Web Scraping WW2 Landing Ship Data

Quote of the Day

The destinies of two great empires seemed to be tied up in some God
damned things called LSTs.

— Winston Churchill

## Introduction

Figure 1: LSTs on Omaha Beach Shortly After D-Day. (Source)

I have been working on improving my web scraping abilities by analyzing WW2 data. I have focused on topics related to how the US took the 14th largest military in the world and in roughly 18 months turned it into a military that could fight anywhere in the world. In this post, I want to look in detail at how war materials were delivered to beaches around the world using a vessel called a Landing Ship Tank (LST). I have wanted to write about the LST for a while, but the web data was distributed on about 1200 separate pages – one for each ship. While a Python script would have worked nicely, I wanted to try gathering the data without doing any programming. I found some software that did a good job automating this task, which I will discuss later in this post.

To understand the role of the LST, you need to understand the US war material supply chain during WW2. To supply its forces, the US needed to:

• manufacture the goods (e.g., tanks)
• transfer the goods to ports (e.g., trains and barges)
• ship the goods on ocean-going transports (e.g., Liberty and Victory ships) to staging areas
• move the goods to a beach using landing ships – the LST is the largest landing ship and it handles the heaviest material (Figure 1).
• move the goods from the beach to the troops using ground transport (e.g. Red Ball Express)

The Landing Ship Tank (LST) was the landing ship class tasked with placing heavy equipment, like tanks and railcars, onto beaches around the world during WW2 (Figure 2). The US and UK jointly developed the LST design during 1941 with production starting in 1942. The Wikipedia provides a comprehensive list of the 1,198 LSTs that were built by the US, but some of these vessels were built after WW2; 1,046 LSTs were built by the US during the war years (Pearl Harbor to VJ day). The US shipyards average 24 LSTs per month throughout the war. Shipyards in Canada and the UK built ~80 LSTs during the war years. The LSTs were so numerous that most were referred to by their hull designations and were not assigned formal names. Because the LSTs were plentiful, some were repurposed as utility vessels. For example, LST-490 was redesignated as USS Agenor, a landing craft repair ship.

While the  LSTs were a critical part of the war supply chain, the ships were not popular with their crews. They were not particularly seaworthy and the crews were often seasick. Because their bow design was focused on opening to deploy tanks on a beach and not for low drag, the ships were slow (maximum speed of ~12 knots)  and were vulnerable to submarine and E-boat attack. The crews often grimly said that LSTs stood for "Large Slow Target" – a sinking LST was not a pretty place. In fact, 40 US LSTs were sunk for all causes during WW2 (see Appendix A).

## Background

Two websites have pages assigned to individual LSTs: navsource.org and uboat.net. Because each LST's data was on an individual web page, I needed to scrape 1198 pages. The magnitude of this task meant that I wanted to automate this work.

There are two software tools that I used to gather the data:

• Data Miner
Data Miner is a chrome app that I used this tool to gather the individual LST URLs from navsource.org and uboat.net.
• Octoparse
I fed this tool URLs from Data Miner and used it to parse the individual LST URLs. It has a very useful wizard that guides you through its use.

Octoparse produced an Excel spreadsheet with the LST data I was looking for: LST name, date laid, date launched, data commissioned, and manufacturer. There was some cleanup needed, but Power Query handled that nicely. The rest of the work was standard Excel processing – graphs, computing averages, etc.

I should mention that the navsource.org website did not appreciate being scraped and kept asking me if I was a robot. I had to break my parsing up into small bits over a period of days.

My Excel worksheet is here. There is not much there because most of the work was done by Data Miner and Octoparse.

## Analysis

### Build Time

I am defining build times as the interval between the date the keel was laid to when the LST was commissioned. Table 1 shows how the mean construction time for an LST varied by year. 1942 production began in traditional shipyards, which resulted in relatively short build times.  By 1943, manufacturing was being moved to non-traditional manufacturers and the build times extended. The build times dramatically reduced in 1944 as the manufacturers gained experience. Times grew again in 1945, which I have seen for other WW2 production. The war was clearing drawing to a close by the summer of 1945 and the government was beginning to reduce the priority of war production.

 Table 1: Mean LST Build Time By Year.

### Monthly Production

Figure 3 shows the number of LSTs commissioned by US manufacturer per month during WW2. Notice how the monthly commissioning rate peaked just before D-Day. This is not a coincidence as the Anzio landing and D-Day landing created a need for LSTs in Europe. Later production during 1944 and 1945 was needed for action in the Pacific theater.

Figure 3: Monthly US LST Production During WW2.

### Average LST Production Per Month

Table 2 shows how the mean US monthly LST production varied by year.  Production averaged 24 units per month throughout the war.

 Table 2: Monthly Average LST Production.

### Shipyards Involved

Table 3 shows the 17 US companies that produced LSTs during WW2. Because the coastal shipyards were busy building warships, the Pentagon awarded the bulk of the LST contracts to companies with no experience building military ships. During peacetime, these companies produced bridges and river craft. They had the welding technology, heavy equipment infrastructure, and management experience needed for switching their manufacturing lines to LSTs and they produced 68% of all the vessels manufactured.

 Table 3: US Companies Producing LSTs During WW2.

## Conclusion

The LST may not be pretty and its lack of seaworthiness may have sickened its crews, but it was a critical component in bringing supplies to Allied forces during WW2. The first LSTs entered service early in 1942 and the ship class stayed in service with the US Navy until the 1980s.  As with the M4 Sherman tank and the B-24 Liberator bomber, the LST was not loved by its crews. While many historians dwell on its shortcomings, LSTs were produced quickly and in sufficient quantity to make a major impact on both the Atlantic and Pacific battlefields (Figure 4).  The Germans took an alternative approach to war material – they produced some excellent systems but not in time or in numbers sufficient to make a difference (e.g. Type XXI submarine, Me 262).

I am in awe of the fact that in a time before computers, CAD, and robots, over 1000 large ships designed as part of an international partnership could be produced in 44 months that did the job required.

## Appendix A: US LSTs Lost During WW2.

Table 4 shows the US LSTs lost for all causes during WW2.

 Table 4: List of US LSTs Lost During WW2 For All Causes.

## 1968 Vietnam War Statistics

Quote of the Day

Captain's log. Using the lightspeed breakaway factor, the Enterprise has moved back through time to the twentieth century. We are now in extended orbit around Earth, using our ship's deflector shields to remain unobserved. Our mission, historical research. We are monitoring Earth communications to find out how our planet survived desperate problems in the year 1968.

Captain Kirk, Opening for the episode "Assignment: Earth" from Star Trek TOS. People were very aware at the time that 1968 was an unusual year.

Figure 1: US Military Deaths in Vietnam By Year. All data from the Combat Area Casualties Current File (CACCF) hosted by Duke University.

It has been 50 years since 1968, and I have been seeing quite a few retrospectives on television about that tumultuous year. I was in 6th-grade in 1968 and the chaos of that year is still very clear in my memory – I remember spending quite a bit of class time on the Paris peace talks. One lesson was about how the Paris Peace negotiators argued about the shape of the table at which they would sit. Arguing about the shape of a table while people were dying seemed ridiculous to a 12-year boy. After hearing all these recent discussions about 1968, I decided to look at the US Vietnam casualty data (Figure 1) to see what insights I could gain on that year. All my work is done in Excel and my workbook is here.

The war was a regular topic at my family's dinner table. My father was a Republican and my mother was a Democrat, which meant that they did not agree on the war at all. To show you how strange the situation was, my father supported Nixon because Nixon was going to turn up the war's intensity. My mother thought Nixon might work out because he was Quaker and they are opposed to war. Dad's view of Nixon turned out to be closer to fact.

My family used to watch Walter Cronkite on the news every weeknight, and the rise in the weekly casualty reports was very worrisome. You could feel a change in people's attitudes when Walter gave a devastating war critique that created real doubt about the future of the war (Figure 2).

 Figure 2: Walter Cronkite Vietnam War commentary on 27-Feb-1968. I remember this news broadcast. It left people stunned.

In Figure 3, you can see how the war's intensity ramped up by looking at how the monthly casualty rates varied. I have highlighted in red the two months (February and May) with the highest casualty rates of the war; these months correspond to two major Vietnamese pushes during the Tet Offensive.

Figure 3: US Vietnam War Dead By Month. All data from the Combat Area Casualties Current File (CACCF) hosted by Duke University.