Wordle Scores

A wealth of word whimsy.

Published

March 31, 2023

Modified

April 17, 2024

Introduction

Unless you spent 2022 on the moon, you’ve heard of Wordle, but just in case you haven’t, here’s the story. You get up to six guesses to identify a secret five-letter word. After each guess, the game tells you whether each letter appears in the word and, if it does, whether it’s in the correct place.
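
To make the feedback rule concrete, here is a minimal sketch in base R. The function name `wordle_feedback` and the colour labels are my own, and the handling of repeated letters (each letter of the secret word can only be "used" once) is my understanding of how the game behaves rather than anything official.

```r
# Score one guess against the secret word (hypothetical helper).
# Returns one label per letter of the guess:
#   "green"  = right letter, right place
#   "yellow" = letter is in the word, but somewhere else
#   "grey"   = letter is not in the word (or all its copies are used up)
wordle_feedback <- function(guess, secret) {
  g <- strsplit(toupper(guess), "")[[1]]
  s <- strsplit(toupper(secret), "")[[1]]
  stopifnot(length(g) == length(s))

  result <- rep("grey", length(g))

  # first pass: exact matches
  exact <- g == s
  result[exact] <- "green"

  # letters of the secret that were not consumed by an exact match
  remaining <- s[!exact]

  # second pass: right letter, wrong place; each secret letter is used once
  for (i in which(!exact)) {
    hit <- match(g[i], remaining)
    if (!is.na(hit)) {
      result[i] <- "yellow"
      remaining <- remaining[-hit]
    }
  }
  result
}

wordle_feedback("crane", "react")
# "yellow" "yellow" "green" "grey" "yellow"
```

The two passes matter for repeated letters: guessing "speed" against "abide" marks only the first E as yellow, because the secret word contains a single E.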

Data

My Scores

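# packages used throughout this post (inferred from the functions called)
library(tidyverse) # read_csv, dplyr verbs, stringr, ggplot2
library(vtable)    # st() summary tables
library(gt)        # gt() tables
library(ggExtra)   # ggMarginal()

read_csv("wordle_scores.csv", col_types = "nn-") |>
  mutate(source = "me", puzzle = row_number()) ->
  wordle_scores

wordle_scores |>
  select(-source, -puzzle) |>
  st(title = "My Wordle Scores: Summary Statistics")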

My Wordle Scores: Summary Statistics
Variable	N	Mean	Std. Dev.	Min	Pctl. 25	Pctl. 75	Max
score	466	4	1	2	3	5	7

Scores from Twitter

I found two data sets of Wordle-related tweets on Kaggle: this one and this other one.

We can combine them, taking care to remove duplicates. There’s surely some bias here, since people are more likely to share their scores on social media when they do well.

read_csv("wordle_tweets_1.zip", col_types = "nc--c") |>
  rename(puzzle = wordle_id) ->
  twitter_scores_1

read_csv("wordle_tweets_2.zip", col_types = "ncc") |>
  rename(puzzle = WordleID, tweet_id = ID, tweet_text = Text) ->
  twitter_scores_2

twitter_scores_1 |>
  rbind(twitter_scores_2) |>
  mutate(score = str_extract(tweet_text, "Wordle [0-9]{3} ([1-6X])/6", 1),
         score = as.numeric(case_match(score, "X" ~ "7", .default = score)),
         source = "twitter",
         .keep = "unused") |>
  drop_na() |>
  distinct(tweet_id, .keep_all = TRUE) |>
  select(-tweet_id) ->
  twitter_scores

rm(twitter_scores_1, twitter_scores_2)
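
To see what the score-extraction pattern above does and doesn't match, here is a small base-R sketch run on a few made-up tweets. The tweets are invented for illustration, and I use `regexec()` and `regmatches()` instead of `str_extract()` only so that the example has no package dependencies.

```r
# made-up tweets for illustration only
tweets <- c(
  "Wordle 210 3/6 so close to a two!",
  "Wordle 254 X/6 ugh",
  "I love Wordle",
  "Wordle 1000 4/6"
)

# same pattern as in the pipeline above
pattern <- "Wordle [0-9]{3} ([1-6X])/6"

# regexec() records the full match plus each capture group;
# the second element is the score character inside the parentheses
matches <- regmatches(tweets, regexec(pattern, tweets))
raw_score <- vapply(
  matches,
  function(m) if (length(m) >= 2) m[2] else NA_character_,
  character(1)
)

# recode a failed puzzle ("X") as 7, then convert to numeric
score <- as.numeric(ifelse(raw_score == "X", "7", raw_score))
score
# 3 7 NA NA
```

Tweets with no score come back as `NA` and are removed by `drop_na()` in the real pipeline. The last made-up tweet also comes back as `NA`, because the pattern only accepts three-digit puzzle numbers.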

twitter_scores |> 
  select(-source, -puzzle) |>
  st(title = "Wordle Scores from Twitter: Summary Statistics")

Wordle Scores from Twitter: Summary Statistics
Variable	N	Mean	Std. Dev.	Min	Pctl. 25	Pctl. 75	Max
score	3090345	4.1	1.2	1	3	5	7

We should be careful when thinking about mean scores, since we’ve coded the “X” representing a failed puzzle as a 7. If we filter those out, we get the average number of guesses per solved puzzle.

wordle_scores |>
  full_join(twitter_scores) |>
  filter(score < 7) |>
  group_by(source) |>
  summarize(filtered_mean = mean(score)) |>
  gt()

source	filtered_mean
me	3.945770
twitter	4.085486

Looks like I’m slightly better on average than the people who tweeted their scores, not accounting for any failed puzzles.
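
A toy example shows how much the coding of failures can move the mean. The five scores below are hypothetical, not taken from either data set.

```r
# hypothetical scores; 7 stands for a failed puzzle ("X")
scores <- c(3, 4, 4, 5, 7)

# mean with the failure counted as a 7
mean_all <- mean(scores)                 # 23 / 5 = 4.6

# mean number of guesses per solved puzzle
mean_solved <- mean(scores[scores < 7])  # 16 / 4 = 4

# share of puzzles that were failed
failure_rate <- mean(scores == 7)        # 1 / 5 = 0.2
```

Reporting the filtered mean alongside the failure rate avoids treating an "X" as if it were a seventh guess.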

Visualization

First let’s make one of those trend-and-distribution charts that I love so much. See the post about my Jeopardy! Coryat scores for another one.

For a more detailed comparison against the scores from Twitter, we’ll look at the cumulative distributions.
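
As a quick reminder of what a cumulative distribution is, here is a base-R sketch on ten hypothetical scores. Fixing the factor levels at 1 to 7 keeps scores that never occur in the counts as zeros, which is the same job `complete()` does in the tidyverse version.

```r
# hypothetical scores; 7 stands for a failed puzzle ("X")
scores <- c(2, 3, 3, 4, 4, 4, 5, 6, 7, 4)

# count every possible score from 1 to 7, including those that never occur
counts <- table(factor(scores, levels = 1:7))

# share of puzzles finished in at most that many guesses
percentile <- cumsum(counts) / sum(counts)
percentile
#   1   2   3   4   5   6   7
# 0.0 0.1 0.3 0.7 0.8 0.9 1.0
```

A source whose curve sits higher at a given score finished a larger share of its puzzles within that many guesses.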

Line Chart and Histogram
Cumulative Distributions

Code for the line chart and histogram

# main plot
(wordle_scores |>
  ggplot() +
  # aesthetic mapping
  aes(x = puzzle, y = score) +
  # visual elements representing the data
  geom_line(colour = "#b4b4b4") +
  geom_smooth(se = FALSE, colour = "black") +
  geom_point(colour = "#dc2828") +
  # scales
  scale_y_continuous(limits = c(7.25, 0.75), 
                     breaks = 1:7, 
                     labels = c(1:6, "X"), 
                     trans = "reverse") +
  scale_x_continuous(expand = c(0, 0), breaks = NULL) +
  # labels
  labs(title = "My Wordle Scores",
       subtitle = "Trend and distribution",
       x = "", 
       y = "Score") +
  # theming
  theme_bw() +
  theme(panel.grid.minor = element_blank())) |>
  # add the marginal histogram
  ggMarginal(type = "histogram", 
             margins = "y", 
             fill = "#b4b4b4",
             yparams = list(bins = 7, center = 0, binwidth = 1))

Code for the cumulative distributions

# combine the scores from twitter with my own
wordle_scores |>
  full_join(twitter_scores) |>
  # count the occurrences of each possible combination of source and score
  group_by(source, score) |>
  count() |>
  ungroup() |>
  complete(source, score, fill = list(n = 0)) |>
  # get the cumulative percentiles for each source
  arrange(source, score) |>
  group_by(source) |>
  mutate(percentile = cumsum(n)/sum(n)) |>
  # add some helper columns for evil secondary axis trickery later
  mutate(axis = case_when(percentile == 1 ~ "n",
                          source == "me" ~ "l",
                          TRUE ~ "r")) |>
  mutate(
    r_label_colour  = case_match(axis, "r" ~ "grey30"),
    l_label_colour  = case_match(axis, "l"  ~ "grey30"),
    r_tick_linetype = case_match(axis, "r" ~ "solid", .default = "blank"),
    l_tick_linetype = case_match(axis, "l"  ~ "solid", .default = "blank")) -> 
  # need to save this dataframe so we can refer to it within the ggplot call
  temp

temp |>
  ggplot() +
  aes(x = score, y = percentile, fill = source) +
  # visual elements representing the data
  geom_line(linetype = "dotted") +
  geom_point(size = 3, shape = 21, colour = "black") +
  # scales
  scale_x_continuous(breaks = 1:7, 
                     labels = c(1:6, "X"),
                     expand = c(0, 0)) +
  ## evil secondary axis trickery part one
  scale_y_continuous(breaks = temp$percentile,
                     labels = scales::label_percent(accuracy = 0.1),
                     limits = c(0, 1),
                     expand = c(0, 0),
                     sec.axis = dup_axis()) +
  scale_fill_manual(values = c("#dc2828", "white")) +
  # labels
  labs(y = "", 
       x = "Score", 
       title = "My Wordle Scores vs Twitter", 
       subtitle = "Cumulative distributions",
       fill = "Source") +
  # theming
  theme_bw() +
  ## evil secondary axis trickery part two
  theme(panel.grid.minor   = element_blank(),
        axis.text.y.right  = element_text(colour = temp$r_label_colour),
        axis.text.y.left   = element_text(colour = temp$l_label_colour),
        axis.ticks.y.right = element_line(linetype = temp$r_tick_linetype),
        axis.ticks.y.left  = element_line(linetype = temp$l_tick_linetype))

Observations

I solve fewer puzzles in two or fewer guesses than those who posted their scores to Twitter, but I do better at the harder words. I suspect this is the result of the bias mentioned above: many people probably tweeted only because they got the puzzle in two guesses.

References & Further Reading

Wordle-solving state of the art: all optimality results so far