
Wingspan Data Analysis | R-bloggers

August 18, 2024

Wingspan is a great game, even though I’ve only played it a few times. The mechanics are great, there are a lot of bird variations, and a ton of different strategies to try. There are 170 birds and I’ve probably only seen 30 of them. So, true to form, I did some data analysis to get a sense of all the different types of cards in the game.

Open source wins again with the {wingspan} R package. It contains the details of every bird in the core, European, Oceania and Swift Start sets. I will only use the core set for this analysis, because it is the only one I am somewhat familiar with.

What type of food is eaten most often?

There are five types of food: invertebrates (let’s face it, it’s grubs, and I’d rather choose 1 syllable than 4), seeds, fruits, fish, and rodents (it’s rats). Grubs are certainly the most common food, but by how much?

Seeds and grubs are 2.5-3x more common food costs than the other three foods when added up across all 170 cards. If you’re not looking for specific foods, picking grubs and seeds from the feeder gives you more options. Once you’ve played a few times, this becomes pretty clear.
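The food comparison boils down to summing each food column across the cards. Here is a minimal base R sketch of that idea on a toy table; the rows and the object name `toy_birds` are made up for illustration, and the column names are assumed to mirror the package's food columns.

```r
# Toy stand-in for the bird table: one row per card, one column per food type.
# All values are invented for illustration; they are not real card data.
toy_birds <- data.frame(
  common_name  = c("bird_a", "bird_b", "bird_c", "bird_d"),
  invertebrate = c(1, 2, 0, 1),
  seed         = c(1, 0, 2, 0),
  fruit        = c(0, 0, 1, 0),
  fish         = c(0, 1, 0, 0),
  rodent       = c(0, 0, 0, 1)
)

food_cols <- c("invertebrate", "seed", "fruit", "fish", "rodent")

# Total number of each food token required across all cards, most common first.
food_totals <- sort(colSums(toy_birds[food_cols]), decreasing = TRUE)
print(food_totals)

# How many times more common each food is than the rarest one.
print(food_totals / min(food_totals))
```

Swapping the toy table for the filtered core set should give the kind of ratios quoted above, assuming the column names match.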

What is the average egg capacity?

The average egg capacity is 2.85, although the distribution of egg capacity and its relationship to victory points is more interesting.

- A bird with 4 victory points and an egg capacity of 2 is the most common combination.
- Only 4 birds have an egg capacity of 6.

Each egg is worth one victory point at the end of the game, so let’s add victory points and egg capacity together:
- The bird with the highest total is the Wild Turkey, with 13.
- 17 birds (10%) have a combined total of 10 or more.
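The combined score is simple arithmetic: victory points plus egg capacity per card. A small sketch on toy data, where the rows and the `vp_potential` column name are my own placeholders rather than anything from the package:

```r
# Toy rows, invented for illustration only.
toy_birds <- data.frame(
  common_name    = c("bird_a", "bird_b", "bird_c", "bird_d", "bird_e"),
  victory_points = c(8, 4, 2, 5, 7),
  egg_capacity   = c(5, 2, 4, 6, 1)
)

# Potential end-of-game value if every egg slot is filled.
toy_birds$vp_potential <- toy_birds$victory_points + toy_birds$egg_capacity

# Card with the highest combined total.
best <- toy_birds[which.max(toy_birds$vp_potential), ]
print(best)

# Share of cards with a combined total of 10 or more.
share_10_plus <- mean(toy_birds$vp_potential >= 10)
print(share_10_plus)
```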

This is useful to know in terms of the chance of drawing valuable cards from the tray or deck. Sure, some of the lower value cards will have great activations, but at the end of the game you’ll be looking for the big ones.

What is the habitat distribution?

The number of birds in the habitats is almost equal: 83 birds in the forest and grassland and 85 in the wetlands.

The distribution is somewhat interesting:

- 45 birds live exclusively in the wetlands, which is the largest group.
- 27 birds can be played in any habitat.
- Only 2 birds can be played in the forest or the wetlands but not in the grassland.
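Counting habitat combinations, rather than habitats one at a time, is what separates the wetland-only birds from the play-anywhere birds. A rough sketch of one way to do it, using made-up rows and assuming one logical column per habitat:

```r
# Toy rows, invented for illustration only.
toy_birds <- data.frame(
  common_name = c("bird_a", "bird_b", "bird_c", "bird_d", "bird_e", "bird_f"),
  forest      = c(TRUE,  FALSE, TRUE, FALSE, TRUE,  FALSE),
  grassland   = c(FALSE, FALSE, TRUE, TRUE,  FALSE, FALSE),
  wetland     = c(FALSE, TRUE,  TRUE, FALSE, TRUE,  TRUE)
)

habitats <- c("forest", "grassland", "wetland")

# Per-habitat totals (a bird counts once for every habitat it can live in).
habitat_totals <- colSums(toy_birds[habitats])
print(habitat_totals)

# Label each bird with the exact set of habitats it can be played in.
combo <- apply(
  toy_birds[habitats], 1,
  function(row) paste(habitats[row], collapse = " + ")
)

# Number of birds per habitat combination.
combo_counts <- table(combo)
print(combo_counts)
```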

What is the most common power?

Card drawing (or tucking) is the most common power, aside from ‘Other’. The ‘Other’ powers usually involve drawing more bonus cards or moving the bird to a different habitat.

There are only 6 birds without powers, all of which are high VP birds. I was surprised that cards with egg laying, card drawing or food from the supply powers only make up 11% of the cards.

Predicting victory points

I expect victory points to correlate with egg capacity, food cost, activation power, and habitat. By using a model to predict victory points, we can see which cards are good value for money.

Or, what I would actually expect is that cards with lower than expected victory points would have strong activation powers to compensate. However, I assume that there has been a lot of testing and the cards have been adjusted to be balanced.

Data prep

A few things to note regarding the data prep:

- I filtered to birds with a single fixed food cost, which excludes birds that can be paid for with one food or another, for example a grub or a seed.
- Birds without a power category were coded as ‘No power’ rather than left as NA, which would have dropped them from the model.

Fitting the model

I fitted a GLM with victory points as the response and food cost, egg capacity, habitat and power category as predictors. I removed the intercept from the model formula because it makes the coefficients easier to interpret.

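library(wingspan)
library(tidyverse)

# Core-set birds with a single fixed food cost.
# Assumption: the source table is the `birds` dataset shipped with {wingspan}.
df <- birds |>
  rename(vp = victory_points) |>
  filter(
    set == "core",
    !food_cost_div    # drop birds whose food cost is an "or" choice
    ) |>
  mutate(power_category = replace_na(power_category, "No power")) |>
  mutate_at(c("forest", "grassland", "wetland"), as.numeric)

# Gaussian GLM with the intercept removed (`- 1`).
# Assumption: the formula is reconstructed from the description above; the
# seed, fruit, fish and rodent column names are assumed, not shown in the output.
mod <- glm(
  vp ~ egg_capacity + invertebrate + seed + fruit + fish + rodent +
    forest + grassland + wetland + power_category - 1,
  data = df
)
summary(mod)

                Estimate Std. Error t value Pr(>|t|)
egg_capacity    -0.42837    0.06754  -6.342 4.15e-09 ***
invertebrate     1.57877    0.14491  10.895
...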


This is pretty neat; almost all are significant predictors of victory points. The takeaways:

- The higher the egg capacity, the fewer the victory points. Logical, because every egg on the board counts for a VP.
- The higher the food cost, the more VPs. Logical.
- Fruit, fish and rodents yield more VPs than grubs and seeds. This makes sense, because fewer birds need fruit, fish or rodents.
- Forest and grassland birds are worth fewer VPs, but for wetland birds there is no difference. Nice!
- Birds without powers have far more VPs (5.9) than the other birds. This makes sense, since they have no other VP-generating potential.
- Birds with egg-laying powers have the fewest VPs (1.6), because the power has high VP potential: an egg is laid with each activation.
- Birds with flocking powers have the second fewest VPs (1.8), given their potential to generate VPs.
- Birds with caching powers are worth 2.1 VPs.
- Birds with card drawing, food from the supply, hunting or food from the feeder powers are worth ~3 VPs. Their powers do not generate VPs directly, but they do allow you to play birds earlier.
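Those per-power numbers are readable as baselines because the intercept was dropped: with `- 1`, every level of the factor gets its own coefficient instead of being measured against a reference level. A toy sketch of that mechanic, where the data and the power labels are invented and the model has only the one factor (so the coefficients reduce to plain group means, which is not the case in the full model):

```r
# Toy data, invented purely to show the mechanics.
toy <- data.frame(
  vp    = c(6, 5, 2, 1, 3, 4),
  power = factor(c("No power", "No power", "Egg-laying",
                   "Egg-laying", "Card-drawing", "Card-drawing"))
)

# With an intercept: the first level becomes the baseline and the other
# coefficients are differences from it.
with_intercept <- glm(vp ~ power, data = toy)
print(coef(with_intercept))

# Without an intercept: one coefficient per power level.
without_intercept <- glm(vp ~ power - 1, data = toy)
print(coef(without_intercept))

# In this one-factor toy model those coefficients equal the group means.
group_means <- tapply(toy$vp, toy$power, mean)
print(group_means)
```

In the full model the power coefficients are better read as the baseline VPs for a power category before egg capacity, food cost and habitat are added in.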

I also added the power color to the model, but it was not a significant predictor. That surprised me, because I expected birds with brown and pink powers to have fewer VPs than those with white powers. You can see where they sit in the residual plot below.

Residual plot

By plotting the victory points against the residual, we can see whether a bird has more or fewer victory points than expected. Those above the line have more VPs than expected, and those below the line have fewer VPs than expected. For birds with an ‘or’ condition in their food cost, I used the grub option to score them.

Upon inspection, the birds with weaker powers, or powers that activate for all players, are usually above the line. They need more VPs to be worth playing. Those below the line usually have decent powers considering the cost.

The “Power 4”, as they are popularly known, are:

- Common Raven
- Chihuahuan Raven
- Franklin’s Gull
- Killdeer

All four are well below the line, which is some evidence to support my theory that birds with fewer VPs than expected have strong activation powers. The Common Raven is slightly higher, suggesting it is the best of the four. This is cool because it may help you identify other strong cards that you have overlooked.

This isn’t always the case though. For example, the bird at the bottom of the 5 VP column, below the Common Raven, is the Indigo Bunting. It costs a grub, a seed, and a fruit. Its power is to get a grub or a fruit from the feeder. That is not as good as discarding 1 egg to get 2 whatevers, or even getting a single grub from the supply. In this case I’d say it needs to be cheaper, worth more VPs, or both. Probably not worth paying the cost, in my opinion.

The bird at the bottom of the 4 VP column is the Brown Pelican. It costs 2 fish; when you play it, you get 3 fish from the supply. That’s it. In my opinion, it needs more VPs or a better power.

The Northern Bobwhite is a great card: 5 victory points, estimated VPs of 3.3 (good bang for your buck), an egg capacity of 6, and an activation power to lay an egg on the card. It’s a great card at any stage of the game.

This analysis doesn’t directly indicate which cards are the best or worst, but I found it useful in determining whether a card is a good value for the money or too expensive.

Each bird in the graphs above is also listed in the lookup table below. ‘Est. VPs’ is the VPs estimated by the model, and ‘res’ is the residual (VPs – Est. VPs). It’s a great lookup table for comparing birds.
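For anyone wanting to build a similar table, the two columns fall straight out of a fitted model: predict the VPs, then subtract. A minimal sketch on toy data; the rows, the `food_total` column and the simplified formula are placeholders of mine, not the model fitted above.

```r
# Toy rows, invented for illustration only.
toy <- data.frame(
  common_name  = c("bird_a", "bird_b", "bird_c", "bird_d", "bird_e", "bird_f"),
  vp           = c(5, 4, 3, 6, 2, 7),
  egg_capacity = c(2, 3, 4, 1, 5, 2),
  food_total   = c(2, 1, 1, 3, 1, 3)
)

# A deliberately small stand-in model.
fit <- glm(vp ~ egg_capacity + food_total, data = toy)

# 'Est. VPs' is the model prediction; 'res' is actual minus estimated.
toy$est_vp <- predict(fit, newdata = toy)
toy$res    <- toy$vp - toy$est_vp

# Most negative residuals first: fewer VPs than the model expects,
# so these are the cards worth checking for strong powers.
lookup <- toy[order(toy$res), c("common_name", "vp", "est_vp", "res")]
print(lookup)
```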

Follow the link to view the table in a new window.

Final thoughts

There are some interesting things that came out of the analysis, especially with the model. The predicted VPs and the residual plot are useful for critically assessing each card and whether it is worth the cost. I have referred to the table far more often than I expected.

It would be great to have data on actual games: the final boards, which birds were played, which turn each bird was played on, the final VPs, who won, etc. That would reveal some pretty cool stuff, I think. If you know of such a dataset, let me know!

Anyway, have fun birdwatching!

The post Wingspan Data Analysis appeared first on Dan Oehm | Gradient Descending.