Munro Tidy Tuesday
I was immediately obsessed when I saw the Tidy Tuesday theme was Scottish Munros - we have climbed 58/282 so far. As I assume is the case for the majority of baggers, we use walkhighlands for all our munro info and routes. walkhighlands is one of the unequivocally wonderful bits of the internet and I can’t believe it’s free. We donate to keep it going, and if you’re a bagger who uses it often, I’d encourage you to do the same. When I saw the Tidy Tuesday dataset, I knew I wanted to try and combine the provided data from the Database of British and Irish Hills with what’s available on walkhighlands.
Because I’ve made certain career choices, my day-to-day activity now involves a lot less coding and a lot more admin and I realised I was starting to lose some of my R so this has been a nice excuse to refresh. I’m gearing up for another semester of teaching R to students in the age of AI and it has been interesting to reflect on how I’m using AI myself. Pleasingly, many of the solutions to the many problems I had to solve came from my knowledge of Munros and Gaelic. AI sometimes provided the code but it’s a nice reminder it can only provide the answers to questions you know to ask.
Walkhighlands munro info
The reason I wanted to use walkhighlands data is that it has a bunch of route information that I could use for exploration that wasn’t contained in the Database of British and Irish Hills that is the base of the Tidy Tuesday data:
- Region
- Estimated length of walk in hours
- Distance of walk
- Total ascent of route (not just of each individual Munro)
- Route descriptions
I’ve decided not to include the code I used for web scraping from walkhighlands. There’s nothing in their terms of service I can find that says I shouldn’t have done it, but I don’t want to annoy them because again, they’re the best thing on the internet so I’ll just describe roughly what I did.
The approach was to first scrape walkhighlands for the list of Munros, the region in which they are located, and their height from the Munro A-Z page. Then it took one route for each munro (the first listed), the min and max estimated walk time, distance in km, total ascent in metres, and used regex to look for certain words that describe walk features that might be of interest (scramble, exposed, arete, river, spate, bog). walkhighlands also provides a Grade rating for each walk as well as a bog factor, however, these are represented as images, and try as I (well, AI) might, I could not get it to parse this information.
An important note for those of you who are familiar with walkhighlands: I included one route per munro, the first one listed. This can make a big difference to the walk. For example, which route you take up Ben Nevis significantly changes the fear factor and technicality. Text mining is also a blunt tool and only looks at whether a word is contained in the walk report rather than its context - a route that reads “there is no scrambling required” would still have been included in the “scramble” category.
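To show how blunt the keyword approach is, here is a minimal sketch in base R. The route snippets are made up (the first echoes the example above), and the stem `scrambl` is my assumption for a pattern that catches both "scramble" and "scrambling"; it is not necessarily the pattern the scraper used.

```r
# Made-up route snippets for illustration only.
reports <- c(
  "There is no scrambling required on this route.",
  "The final ridge involves an exposed scramble.",
  "A pleasant walk on good paths throughout."
)

# Case-insensitive keyword match: context is ignored entirely.
mentions_scramble <- grepl("scrambl", reports, ignore.case = TRUE)
mentions_scramble

# A crude guard (an assumption, not what I actually ran):
# discount mentions that directly follow the word "no".
negated <- grepl("\\bno\\s+scrambl", reports, ignore.case = TRUE, perl = TRUE)
needs_scramble <- mentions_scramble & !negated
needs_scramble
```

The first snippet is flagged by the plain match even though it says the opposite. The guard only handles one phrasing, so it would still miss things like "scrambling can be avoided".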
It took a long time to get the AI to provide code that worked and there were a number of issues - at one point it was matching the route to the wrong Munro, then it didn’t return all Munros, then it was missing a bunch of routes. I had to manually create a file of some routes to load in because I could not work out why this handful were failing. Because I could not have done any of this type of scraping without AI, I really have no idea why it works when it does or why it failed when it didn’t. This is intellectually unsatisfying but also, the idea you’d be willing to trust this black box of “knowledge” with something more serious than an obsessive deep dive into your favorite mountains is madness.
Here’s what the walkhighlands data looks like.
munro | region | height | first_route_title | first_route_url | scramble | exposed | spate | bog | river | arete | time_hours_min | time_hours_max | distance_km | ascent |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
A' Bhuidheanach Bheag | Cairngorms | 936 | Càrn na Caim and A' Bhuidheanach Bheag from Drumochter | https://www.walkhighlands.co.uk/cairngorms/carn-na-caim.shtml | FALSE | FALSE | FALSE | TRUE | FALSE | FALSE | 5 | 6 | 19.0 | 824 |
A' Chailleach (Fannichs) | Ullapool | 997 | Sgùrr Breac and a' Chailleach from near Braemore | https://www.walkhighlands.co.uk/ullapool/sgurrbreac.shtml | FALSE | FALSE | FALSE | TRUE | FALSE | FALSE | 6 | 8 | 16.0 | 1,127 |
A' Chailleach (Monadhliath) | Cairngorms | 930 | The Monadhliath Munros: a' Chailleach, Càrn Sgulain and Càrn Dearg from Glen Banchor | https://www.walkhighlands.co.uk/cairngorms/monadhliath.shtml | FALSE | FALSE | TRUE | TRUE | FALSE | FALSE | 8 | 10 | 24.5 | 946 |
A' Chralaig | Kintail | 1,120 | a' Chralaig and Mullach Fraoch Choire from Cluanie | https://www.walkhighlands.co.uk/kintail/Achralaig.shtml | TRUE | FALSE | FALSE | TRUE | FALSE | FALSE | 6 | 8 | 14.5 | 1,150 |
A' Ghlas-bheinn | Kintail | 918 | a' Ghlas Bheinn and the Falls of Glomach from Morvich | https://www.walkhighlands.co.uk/kintail/Aghlasbheinn.shtml | FALSE | FALSE | TRUE | FALSE | TRUE | FALSE | 7 | 9 | 21.0 | 1,196 |
A' Mhaighdean | Ullapool | 967 | The Fisherfield 6, from Shenavall | https://www.walkhighlands.co.uk/ullapool/fisherfield-6.shtml | TRUE | FALSE | TRUE | TRUE | TRUE | FALSE | 12 | 18 | 29.0 | 2,254 |
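Some heights and ascents in the table show a comma as a thousands separator. If those columns arrive as text, they need converting before any arithmetic. A small sketch of that conversion, using values copied from the sample rows above and assuming readr's default locale (comma as the grouping mark):

```r
library(readr)

# Values copied from the sample rows above, held as text.
height_raw <- c("936", "997", "1,120")
ascent_raw <- c("824", "1,127", "2,254")

# parse_number() drops the grouping comma and returns numeric values.
height_num <- parse_number(height_raw)
ascent_num <- parse_number(ascent_raw)

height_num
ascent_num
```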
Database of British and Irish Hills
Next it was time to load in the Tidy Tuesday dataset which is from the Database of British and Irish Hills. In order to be able to join this with my walkhighlands database, I had to do quite a lot of wrangling although thankfully I was not reliant on AI and mainly able to achieve it because of my existing knowledge of Munros and Gaelic.
I wasn’t that bothered about the Munro status changes over the years as the walkhighlands database allowed me to do other more interesting analyses so I dropped these bits.
library(tidyverse)
library(fuzzyjoin)
library(ggthemes)
library(ggridges)
library(flextable)
library(stringi)
library(tidytext)
library(sf)
library(plotly)
library(rnaturalearth)
scottish_munros <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2025/2025-08-19/scottish_munros.csv')
raw_data <- read_csv("https://www.hills-database.co.uk/munrotab_v8.0.1.csv")
scottish_munros <- raw_data |>
filter(`2021` == "MUN") |>
select(
`DoBIH Number`, Name,
`Height (m)`, xcoord, ycoord, "Grid Ref",
) |>
drop_na(`DoBIH Number`) |>
rename(
munro = "Name",
height = `Height (m)`,
number = `DoBIH Number`,
grid_ref = "Grid Ref"
)
rm(raw_data)
What’s in a name (issues)
The first problem to solve before trying to join the two datasets was that some of the names of the Munros differ between the two - sometimes this is because there are variants in the Gaelic (Carn Eighe/Càrn Eige), sometimes the Anglicised version is used, and sometimes it’s because multiple Munros have the same name so they have additional information added in parentheses or square brackets in one of the files (Stuc an Lochain [Stuchd an Lochain]/Stùcd an Lochain).
scottish_munros <- scottish_munros |>
# Standardise names in the 'dobih' dataset to Walkhighlands spellings
mutate(
munro = case_when(
munro == "A' Chraileag [A' Chralaig]" ~ "A' Chralaig",
munro == "Beinn Challuim [Ben Challum]" ~ "Ben Challum",
munro == "Beinn Sheasgarnaich [Beinn Heasgarnich]" ~ "Beinn Heasgarnich",
munro == "Beinn a' Bhuird North Top" ~ "Beinn a' Bhùird",
munro == "Ben Klibreck - Meall nan Con" ~ "Ben Klibreck",
munro == "Blabheinn [Bla Bheinn]" ~ "Blà Bheinn",
munro == "Cac Carn Beag (Lochnagar)" ~ "Lochnagar",
munro == "Carn Eighe" ~ "Càrn Eige",
munro == "Carn a' Choire Bhoidheach" ~ "Càrn a' Choire Bhòidheach",
munro == "Creag a' Mhaim" ~ "Creag a'Mhàim",
munro == "Glas Leathad Mor (Ben Wyvis)" ~ "Ben Wyvis",
munro == "Leabaidh an Daimh Bhuidhe (Ben Avon)" ~ "Ben Avon",
munro == "Meall Garbh" ~ "Meall Garbh (Ben Lawers)",
munro == "Meall na Aighean" ~ "Creag Mhòr (Meall na Aighean)",
munro == "Sgurr Dearg - Inaccessible Pinnacle" ~ "Inaccessible Pinnacle",
munro == "Sgurr Mhor (Beinn Alligin)" ~ "Sgùrr Mòr (Beinn Alligin)",
munro == "Sgurr na h-Ulaidh [Sgor na h-Ulaidh]" ~ "Sgòr na h-Ulaidh",
munro == "Sgurr nan Ceathramhnan [Sgurr nan Ceathreamhnan]" ~ "Sgùrr nan Ceathreamhnan",
munro == "Stob Coir' an Albannaich" ~ "Stob Coir an Albannaich",
munro == "Stuc an Lochain [Stuchd an Lochain]" ~ "Stùcd an Lochain",
munro == "Càrn nan Gobhar (Strathfarrar)" ~ "Càrn nan Gobhar (Loch Mullardoch)",
TRUE ~ munro
)
)
After standardising the names, to facilitate the join I also had to convert them to lower case, remove accents (whether they’re used differs between the datasets), and strip any parenthetical information from the Munro names.
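To make that normalisation concrete, here is a minimal sketch of what the join keys end up looking like for three names from the recoding step. `make_key()` is a hypothetical helper name used only for this illustration.

```r
library(stringr)
library(stringi)

# Names taken from the recoding step above.
munro_names <- c(
  "Càrn nan Gobhar (Loch Mullardoch)",
  "Stuc an Lochain [Stuchd an Lochain]",
  "Blà Bheinn"
)

# make_key() is a hypothetical helper, not a function from any package.
make_key <- function(x) {
  x |>
    str_replace_all("\\s*\\([^)]*\\)", "") |>     # drop text in ( )
    str_replace_all("\\s*\\[[^\\]]*\\]", "") |>   # drop text in [ ]
    stri_trans_general("Latin-ASCII") |>          # remove accents
    str_to_lower() |>
    str_squish()
}

keys <- make_key(munro_names)
keys
```

All three come out as plain lower-case ASCII, so "Blà Bheinn" and "Bla Bheinn" would now produce the same key.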
After this cleaning, because multiple Munros have the same name, I needed to join on height to distinguish them as thankfully, there aren’t two Munros with the same name and height. However, a problem I wasn’t anticipating is that height measurements differed between the datasets. A lot of these can be put down to rounding - walkhighlands uses whole numbers whilst the DoBIH uses two decimal places. However, this doesn’t explain them all (the largest difference is 8.1 metres, that’s a lot!). I don’t know which one is “correct” or why they differ but given DoBIH is numerically more precise, I decided to use that as my measure of height in any analysis.
My AI-fuelled discovery was fuzzyjoin, which allows you to set a tolerance level for the join and pick a best match. I’ve never needed this before but it proved itself to be extremely useful - with a bit of trial and error I set a tolerance of 10m and manually checked the output to ensure everything had lined up correctly.
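Before the full version, a toy sketch of the idea on two Munros that share a name. The walkhighlands heights (997 and 930) are from the sample table earlier; the decimal heights standing in for DoBIH are made up for illustration.

```r
library(fuzzyjoin)

# Two Munros with the same normalised name, told apart only by height.
wh <- data.frame(
  munro_key = c("a' chailleach", "a' chailleach"),
  height    = c(997, 930),
  stringsAsFactors = FALSE
)
# Hypothetical decimal heights, NOT real DoBIH values.
dobih <- data.frame(
  munro_key = c("a' chailleach", "a' chailleach"),
  height    = c(996.8, 929.3),
  stringsAsFactors = FALSE
)

tol_m <- 10

# Name must match exactly; height must agree to within tol_m metres.
matched <- fuzzy_inner_join(
  wh, dobih,
  by = c("munro_key" = "munro_key", "height" = "height"),
  match_fun = list(`==`, function(a, b) abs(a - b) <= tol_m)
)

matched
```

An exact join on name alone would return four rows here (every combination). The height tolerance cuts that down to the two correct pairs.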
# 0) Choose a tolerance in metres (use Inf if you want “nearest regardless”)
tol_m <- 10
# 1) Normalise names in BOTH tables: remove (...) and [...], drop accents, lower-case, squish
x <- walkhighlands %>%
mutate(
munro_key = munro %>%
str_replace_all("\\s*\\([^)]*\\)", "") %>% # remove text in ( )
str_replace_all("\\s*\\[[^\\]]*\\]", "") %>% # remove text in [ ]
stri_trans_general("Latin-ASCII") %>%
str_to_lower() %>%
str_squish(),
height = parse_number(as.character(height)),
row_id_x = row_number()
)
y <- scottish_munros %>%
mutate(
munro_key = munro %>%
str_replace_all("\\s*\\([^)]*\\)", "") %>%
str_replace_all("\\s*\\[[^\\]]*\\]", "") %>%
stri_trans_general("Latin-ASCII") %>%
str_to_lower() %>%
str_squish(),
height = parse_number(as.character(height)),
row_id_y = row_number()
)
# 2) Fuzzy FULL join on exact munro_key + height within tolerance
candidates <- fuzzy_full_join(
x, y,
by = c("munro_key" = "munro_key", "height" = "height"),
match_fun = list(`==`, function(a, b) abs(a - b) <= tol_m)
) %>%
# standardise suffixes for older fuzzyjoin that uses .x/.y
rename_with(~ str_replace(.x, "\\.x$", "_wh")) %>%
rename_with(~ str_replace(.x, "\\.y$", "_dobih")) %>%
mutate(
height_diff = abs(height_wh - height_dobih),
height_diff = if_else(is.na(height_diff), Inf, height_diff)
)
# 3) Reduce to one nearest match per row on each side, preserving FULL-join behaviour
best_for_left <- candidates %>% group_by(row_id_x) %>% slice_min(height_diff, with_ties = FALSE) %>% ungroup()
best_for_right <- candidates %>% group_by(row_id_y) %>% slice_min(height_diff, with_ties = FALSE) %>% ungroup()
joined_dat <- bind_rows(best_for_left, best_for_right) %>%
distinct(row_id_x, row_id_y, .keep_all = TRUE) %>%
select(munro_wh, munro_dobih,
munro_key_wh, munro_key_dobih,
height_wh, height_dobih, height_diff, region,
xcoord:grid_ref, scramble:distance_km, ascent, spate, bog, first_route_title)%>%
mutate(
site_key = coalesce(grid_ref, paste0(xcoord, "_", ycoord)),
munro_wh_clean = str_replace(munro_wh, "\\s*\\([^)]*\\)$", "") # drop "(Loch Mullardoch)" etc.
) %>%
group_by(site_key) %>%
slice_min(height_diff, with_ties = FALSE) %>% # keep the single closest pair for that site
ungroup() %>%
mutate(munro_wh = munro_wh_clean) %>%
select(-munro_wh_clean) |>
mutate(time = (time_hours_min + time_hours_max) / 2) |>
select(-munro_dobih:-munro_key_dobih) |>
rename(munro = munro_wh)|>
mutate(scramble_exposed = case_when(
scramble & exposed ~ "Both",
scramble & !exposed ~ "Scramble",
!scramble & exposed ~ "Exposed",
TRUE ~ "Neither"
)) |>
mutate(scramble_exposed = factor(scramble_exposed,
levels = c("Neither", "Scramble", "Exposed", "Both"))) |>
mutate(wet = case_when(
spate & bog ~ "Both",
spate & !bog ~ "Large river",
!spate & bog ~ "Boggy",
TRUE ~ "Neither"
)) |>
mutate(wet = factor(wet,
levels = c("Neither", "Large river", "Boggy", "Both")))
rm(x,y, best_for_left, best_for_right, candidates, tol_m)
I also created some manual colour scales on a nature theme:
nature_5 <- c(
"#355E3B",
"#4B4F58",
"#4682B4",
"#E07B39",
"#BDB76B"
)
nature_4 <- c(
"#355E3B",
"#4B4F58",
"#4682B4",
"#E07B39"
)
nature_13 <- c(
"#355E3B", # Pine green
"#6B8E23", # Moss
"#BDB76B", # Dry grass
"#8B5A2B", # Earth brown
"#D2B48C", # Sand
"#87CEEB", # Sky blue
"#4682B4", # Loch blue
"#191970", # Mountain shadow (midnight blue)
"#7D7D7D", # Granite grey
"#A9A9A9", # Slate grey
"#8E6C88", # Heather purple
"#E07B39", # Sunset orange
"#FFD700" # Sun yellow
)
Height differences
Now we are cooking. First I wanted to look into the height differences between walkhighlands and Database of British and Irish Hills a little more.
# 1. Extract the Munros with largest discrepancies
outliers <- joined_dat %>%
filter(height_diff >3) %>%
select(munro, height_diff) %>%
mutate(
x_pos = c(2, 2, 4.5, 5.9),
y_pos = c(45, 75, 95, 125),
y_arrow = 5
)
# 2. Plot with arrows + horizontal text
ggplot(joined_dat, aes(height_diff)) +
geom_histogram(boundary = 0, colour = "black", fill = "steelblue") +
scale_x_continuous(breaks = seq(0, 8, 1)) +
labs(title = "walkhighlands vs DoBIH heights",
subtitle = "wh measurements are always larger",
y = "Count",
x = "Difference in metres") +
theme_minimal(base_size = 14) +
geom_segment(data = outliers,
aes(x = height_diff, xend = height_diff,
y = y_arrow, yend = y_pos),
arrow = arrow(length = unit(0.2, "cm")),
colour = "black") +
# text labels off to the right
geom_text(data = outliers,
aes(x = x_pos, y = y_pos + 10,
label = munro),
hjust = 0, size = 4)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Scrambling and exposure
I have a reasonably bad fear of heights so anything that mentions scrambling or exposure worries me. First question, are higher munros scarier?
Big thank you to Jessica Moore for the inspiration for this one.
ggplot(joined_dat,
aes(height_dobih, scramble_exposed, fill = scramble_exposed)) +
geom_density_ridges(quantile_lines = TRUE, quantile_fun = mean,
vline_linetype = "dashed",
aes(color = "Mean height (m)")) +
scale_y_discrete(expand = c(0.01, 0)) +
scale_x_continuous(expand = c(0.01, 0)) +
scale_color_manual(values = c("Mean height (m)" = "black")) +
theme_economist() +
scale_fill_manual(values = nature_4) +
labs(x = NULL, y = NULL,
title = "Are higher Munros scarier?",
colour = NULL,
subtitle = "Route descriptions that mention exposure tend to be on higher Munros")+
guides(fill = "none") +
theme(legend.position = "bottom",
legend.position.inside = c(0.8,0.10))
## Picking joint bandwidth of 38.4
For locating the scary munros, I decided that an interactive plotly map was called for so that you could easily isolate the different types.
Have I mentioned that I dislike heights and exposure?
munros_map <- joined_dat %>%
select(munro, xcoord, ycoord, height_dobih, scramble_exposed, wet) %>%
na.omit()
# Convert OSGB36 coordinates to sf object
munros_sf <- munros_map %>%
st_as_sf(coords = c("xcoord", "ycoord"),
crs = 27700) # EPSG:27700 is OSGB36 / British National Grid
# Transform to WGS84 (lat/long) for easier plotting
munros_lat_long <- munros_sf %>%
st_transform(crs = 4326)
# Extract coordinates for ggplot
munros_coords <- munros_lat_long %>%
mutate(
longitude = st_coordinates(.)[,1],
latitude = st_coordinates(.)[,2]
) %>%
st_drop_geometry() %>%
arrange(-height_dobih)
uk_map <- rnaturalearth::ne_countries(scale = "large",
country = "United Kingdom",
returnclass = "sf")
# Specify shape codes for each category (16 = circle, 17 = triangle, etc.)
shape_values <- c(
"Neither" = 16, # filled circle
"Scramble" = 17, # filled triangle
"Exposed" = 15, # filled square
"Both" = 18 # filled diamond
)
p <- ggplot() +
geom_sf(data = uk_map, fill = "lightgray", color = "darkgrey", size = 0.3) +
coord_sf(xlim = c(-8, -1.5), ylim = c(57, 58.6)) +
geom_jitter(data = munros_coords,
aes(x = longitude,
y = latitude,
shape = scramble_exposed,
colour = scramble_exposed,
text = munro),
size = 1,
height = .1,
width = .1) +
scale_x_continuous(breaks = NULL) +
scale_shape_manual(values = shape_values) +
scale_y_continuous(breaks = NULL) +
scale_colour_brewer(palette = "Dark2") +
labs(title = "Where are the scary Munros?",
subtitle = "Walk descriptions that reference:",
colour = NULL, shape = NULL) +
theme_economist() +
theme(
axis.text = element_blank(),
axis.ticks = element_blank(),
axis.title = element_blank(),
panel.grid = element_blank(),
panel.border = element_blank(),
legend.text = element_text(size = 10)
)
## Warning in geom_jitter(data = munros_coords, aes(x = longitude, y = latitude, :
## Ignoring unknown aesthetics: text
ggplotly(p, tooltip = "text")
By region
I also thought it would be fun to look into regional differences. The munros vary massively in character depending on where you are in the country (you can imagine hobbits living in the Cairngorms whilst Skye would be home to dragons), but how is this reflected in the walk features?
Which region has the tallest Munros?
region_height <- joined_dat |>
group_by(region) |>
summarise(avg_height = mean(height_dobih, na.rm = TRUE), .groups = "drop") |>
slice_max(avg_height, n = 5)
ggplot(
semi_join(joined_dat, region_height, by = "region"),
aes(
x = height_dobih,
y = fct_reorder(region, height_dobih, .fun = mean, .desc = FALSE),
fill = region
)
) +
geom_density_ridges(
quantile_lines = TRUE, quantile_fun = mean,
vline_linetype = "dashed",
aes(colour = "Mean height (m)")
) +
scale_y_discrete(expand = c(0.01, 0)) +
scale_x_continuous(expand = c(0.01, 0)) +
scale_colour_manual(values = c("Mean height (m)" = "black")) +
theme_economist() +
scale_fill_manual(values = nature_5) +
labs(
x = NULL, y = NULL,
title = "Which region has the highest Munros?",
colour = NULL,
subtitle = "Top 5 regions displayed. On average, Loch Ness has the highest Munros"
) +
guides(fill = "none") +
theme(
legend.position = "inside",
legend.position.inside = c(0.8, 0.1),
legend.text = element_text(size = 10)
)
## Picking joint bandwidth of 36.4
Which region has the longest walks on average by distance?
# Top 5 regions by mean distance
region_distance <- joined_dat |>
group_by(region) |>
summarise(avg_distance = mean(distance_km, na.rm = TRUE), .groups = "drop") |>
slice_max(avg_distance, n = 5)
# Keep only those regions in the raw data
top_dat <- semi_join(joined_dat, region_distance, by = "region")
ggplot(
top_dat,
aes(
x = distance_km, # use the per-route variable here
y = fct_reorder(region, distance_km, .fun = mean, .desc = FALSE),
fill = region
)
) +
geom_density_ridges(
quantile_lines = TRUE, quantile_fun = mean,
vline_linetype = "dashed",
aes(colour = "Mean distance (km)")
) +
scale_y_discrete(expand = c(0.01, 0)) +
scale_x_continuous(expand = c(0.01, 0)) +
scale_colour_manual(values = c("Mean distance (km)" = "black")) +
theme_economist() +
scale_fill_manual(values = nature_5) +
labs(
x = NULL, y = NULL,
title = "Which region has the longest walks?",
colour = NULL,
subtitle = "Top 5 regions displayed. On average, Loch Ness has the longest walks"
) +
guides(fill = "none") +
theme(
legend.position = "inside",
legend.position.inside = c(0.82, .1),
legend.text = element_text(size = 10)
)
## Picking joint bandwidth of 2.91
Which region has the most ascent?
# Top 5 regions by mean ascent
region_ascent <- joined_dat |>
group_by(region) |>
summarise(avg_ascent = mean(ascent, na.rm = TRUE), .groups = "drop") |>
slice_max(avg_ascent, n = 5)
# Keep only those regions in the raw data
top_ascent <- semi_join(joined_dat, region_ascent, by = "region")
ggplot(
top_ascent,
aes(
x = ascent, # use the per-route variable here
y = fct_reorder(region, ascent, .fun = mean, .desc = FALSE),
fill = region
)
) +
geom_density_ridges(
quantile_lines = TRUE, quantile_fun = mean,
vline_linetype = "dashed",
aes(colour = "Mean ascent (m)")
) +
scale_y_discrete(expand = c(0.01, 0)) +
scale_x_continuous(expand = c(0.01, 0)) +
scale_colour_manual(values = c("Mean ascent (m)" = "black")) +
theme_economist() +
scale_fill_manual(values = nature_5) +
labs(
x = NULL, y = NULL,
title = "Which region has the most ascent?",
colour = NULL,
subtitle = "Top 5 regions displayed. On average, Loch Ness has the most ascent"
) +
guides(fill = "none") +
theme(
legend.position = "inside",
legend.position.inside = c(0.82, .1),
legend.text = element_text(size = 10)
)
## Picking joint bandwidth of 115
And finally, where do you get the most bang for your buck?
Ullapool
joined_dat |>
count(region, first_route_title, name = "n") |>
group_by(region) |>
summarise(avg_count = mean(n), .groups = "drop") |>
ggplot(aes(
x = fct_reorder(region, avg_count),
y = avg_count,
fill = region
)) +
geom_col() +
scale_fill_manual(values = nature_13) +
coord_flip() +
guides(fill = "none") +
labs(x = NULL, y = "Number of Munros",
title = "Average number of Munros bagged per route by region")
Ascent
As you would expect, total ascent correlates strongly with total time, but there are some points of interest. Bidein a’ Choire Sheasgaich and Lurg Mhòr have the longest estimated walk time but their remoteness means that they’re an outlier in terms of the amount of ascent you’d expect for that time. The Fisherfield 6 claim the prize for most ascent by some distance but are bang on in terms of the ascent/time relationship. Meall Buidhe wins the award for being the quickest munro to bag, with the least ascent.
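A quick numeric check of that relationship could look like the sketch below. It uses only the six sample rows from the walkhighlands table shown earlier, not the full dataset, so the numbers it prints will differ from the full picture. Time is the midpoint of the min and max estimates, as in the join code.

```r
# Ascent and time range for the six sample rows in the earlier table.
sample_routes <- data.frame(
  munro = c("A' Bhuidheanach Bheag", "A' Chailleach (Fannichs)",
            "A' Chailleach (Monadhliath)", "A' Chralaig",
            "A' Ghlas-bheinn", "A' Mhaighdean"),
  ascent         = c(824, 1127, 946, 1150, 1196, 2254),
  time_hours_min = c(5, 6, 8, 6, 7, 12),
  time_hours_max = c(6, 8, 10, 8, 9, 18)
)

# Midpoint of the estimated time range.
sample_routes$time <-
  (sample_routes$time_hours_min + sample_routes$time_hours_max) / 2

# Pearson correlation between total ascent and time.
r <- cor(sample_routes$ascent, sample_routes$time)
r

# Residuals from a straight-line fit: positive means the route takes
# longer than its ascent alone would predict.
fit <- lm(time ~ ascent, data = sample_routes)
sample_routes$residual <- round(residuals(fit), 1)
sample_routes[order(-sample_routes$residual),
              c("munro", "ascent", "time", "residual")]
```

Sorting by residual is one simple way to surface routes like the remote ones mentioned above, where the walk-in adds hours without adding much ascent.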
ggplot(joined_dat, aes(x = ascent, y = time)) +
geom_jitter() +
scale_x_continuous(breaks = seq(500, 2500, 250)) +
scale_y_continuous(breaks = seq(0, 20, 2)) +
annotate(geom = "curve",
x = 1200, y = 15,
xend = 1450, yend = 16,
curvature = -0.3,
arrow = arrow(length = unit(0.5, "lines"))) +
annotate("text",
x = 1150, y = 14.5,
label = "Bidein a' Choire Sheasgaich\nand Lurg Mhòr") +
annotate(geom = "curve",
x = 2050, y = 13.5,
xend = 2220, yend = 15,
curvature = -0.3,
arrow = arrow(length = unit(0.5, "lines"))) +
annotate("text",
x = 2100, y = 13,
label = "Fisherfield 6")+
annotate(geom = "curve",
x = 575, y = 2.9,
xend = 1250, yend = 4,
curvature = 0.3,
arrow = arrow(length = unit(0.5, "lines"))) +
annotate("text",
x = 1450, y = 4.5,
label = "Meall Buidhe")+
theme_economist() +
labs(x = "Ascent (m)",
y = "Time (hours)",
title = "Total ascent by time") +
theme(legend.position = "inside",
axis.title.x = element_text(margin = margin(t = 8)),
axis.title.y = element_text(margin = margin(r = 8)),
legend.position.inside = c(0.9, .2),
legend.text = element_text(size = 10)
)
And here’s an interactive version of that plot that adds in scrambling and exposure because had I mentioned, I am scared of heights.
p1 <- ggplot(joined_dat, aes(x = ascent, y = time)) +
geom_jitter(aes(shape = scramble_exposed,
colour = scramble_exposed,
text = munro),
size = 1, width = .3, height = .3) +
scale_x_continuous(breaks = seq(500,2500, 250)) +
scale_y_continuous(breaks = seq(0,20,2)) +
scale_shape_manual(values = shape_values) +
scale_colour_manual(values= nature_4) +
labs(x = "Ascent (m)",
y = "Time (hours)",
title = "Ascent by time",
shape = NULL,
colour = NULL) +
theme_economist() +
theme(
axis.title.x = element_text(margin = margin(t = 8)),
axis.title.y = element_text(margin = margin(r = 8)),
legend.text = element_text(size = 10)
)
## Warning in geom_jitter(aes(shape = scramble_exposed, colour = scramble_exposed,
## : Ignoring unknown aesthetics: text
ggplotly(p1, tooltip = "text")
Where’s wet?
If the text-mining for scrambling and exposure is a blunt tool then my approach here is even blunter. There are multiple words I could have searched for regarding the presence of water - river, stream, burn - but many of those represent features that don’t make a difference to the walk if they present no difficulty (“cross the bridge over the river”). I decided to use the word “spate” because when the river is large, walkhighlands often highlights that it would be difficult or impossible to cross “in spate”.
So these aren’t all the rivers, just ones where the description indicates crossing them might present an issue.
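Here is a small sketch of why I went with "spate" rather than "river". The snippets are made up around the two phrases quoted above, and the matching is a plain case-insensitive keyword search in base R.

```r
# Made-up snippets for illustration only.
wet_reports <- c(
  "Cross the bridge over the river and follow the track.",
  "The river may be impassable when in spate.",
  "Follow the ridge to the summit cairn."
)

flags <- data.frame(
  river = grepl("river", wet_reports, ignore.case = TRUE),
  spate = grepl("spate", wet_reports, ignore.case = TRUE)
)

flags
```

"river" flags the harmless bridge crossing as well as the awkward one, whereas "spate" only picks up the crossing that might actually stop you.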
# Shape codes for the wet categories; names must match the levels of `wet`
shape_values <- c(
"Neither" = 16, # filled circle
"Large river" = 17, # filled triangle
"Boggy" = 15, # filled square
"Both" = 18 # filled diamond
)
p2 <- ggplot() +
geom_sf(data = uk_map, fill = "lightgray", color = "darkgrey", size = 0.3) +
coord_sf(xlim = c(-8, -1.5), ylim = c(57, 58.6)) +
geom_jitter(data = munros_coords,
aes(x = longitude,
y = latitude,
shape = wet,
colour = wet,
text = munro),
size = 1,
height = .05,
width = .05) +
scale_x_continuous(breaks = NULL) +
scale_shape_manual(values = shape_values) +
scale_y_continuous(breaks = NULL) +
scale_colour_brewer(palette = "Dark2") +
guides(shape = "none") +
labs(title = "Where is wet?\n(Everywhere, it's Scotland)",
subtitle = "Walk descriptions that reference:",
colour = NULL, shape = NULL) +
theme_economist() +
theme(
axis.text = element_blank(),
axis.ticks = element_blank(),
axis.title = element_blank(),
panel.grid = element_blank(),
panel.border = element_blank(),
legend.text = element_text(size = 10)
)
## Warning in geom_jitter(data = munros_coords, aes(x = longitude, y = latitude, :
## Ignoring unknown aesthetics: text
ggplotly(p2, tooltip = "text")
I may have become a bit obsessed.
I must stop.
But if you have any other suggestions for analysis... just ask.