About the data/article in a nutshell: This New York Times article, backed by numbers from US and Canadian government agencies, examines whether international travel into the US has dipped as a result of the Trump administration's policies on broad tariffs, tight border control, and the like. The main takeaway is that while travel from Asia into the United States has increased modestly and travel from Europe has stalled, travel from Canada has dropped sharply, especially car crossings into the US relative to air travel. Despite that drastic drop in US-bound travel from Canada, overall travel into the United States has remained fairly undisturbed.
On my choices here: I decided to replicate/plot all outputs (minus the second one), aiming for identical data points and visuals where possible, and otherwise matching as closely as I could the ones embedded in the article as PNG images (as opposed to scripts, tables, or other HTML wrappers containing the actual data points). A solid few weeks have passed between the article's publication and the time of this writing, and the numbers reported then were very recent (March 2025 figures in an article published in April 2025), so any preliminary numbers that have since been revised will prevent a 100% match.
The one piece I don't replicate in this project is the second one, "International arrivals at major U.S. airports": it's embedded in the article as a PNG, and reverse engineering the underlying data series from a scanned image would take a lot of work. More importantly, I try to replicate new/unconventional visualizations, and this one, however effective, ultimately boils down to a couple of line plots. If you'd like to give the original article a read, you can find it here.
Overall strategy for building the first plot: The first visual shows, report-card style, the % change in flight bookings into the US between Jan. 1 through April 26, 2024 and the same period this year, looking at 1) overall (international), 2) European, 3) Asian, and 4) Canadian inbound travel. One way to recreate it is to generate 4 tiles with a subtle vertical tick/separator between those 4 percentages. The travel window covered in both years is the summer, which allows for an 'apples to apples' comparison while focusing on an upcoming period of the year - the very near future - meaning the vast majority of those bookings are likely to be definitive.
The code chunk below loads the libraries used and defines and activates a couple of themes that are used throughout the document.
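For completeness, here is that setup chunk as it appears in the document's Quarto source (it is hidden in the rendered page via include: false): it checks the renv library, loads the packages, and defines theme_set_custom().

#| echo: false
#| message: false
#| warning: false
#| include: false
# if the 'emo' package is missing, the renv environment is likely not fully restored;
# running `renv::restore()` installs all packages pinned for this document
if (!requireNamespace("emo", quietly = TRUE)) {
  message("\nIt looks like your environment might not be restored.\nRun `renv::restore()` to install required packages.\n")
}

# load packages
library(xml2); library(downlit); library(gdtools); library(tidyverse); library(quarto)
library(chromote); library(here); library(tidycensus); library(janitor); library(purrr)
library(ggtext); library(ggshadow); library(ggiraph); library(gfonts); library(showtext)
library(ggborderline); library(grid); library(patchwork); library(shiny); library(rvest)
library(htmltools); library(gt); library(rsvg); library(magick); library(stringr)
library(ggimage); library(qs); library(emo)

font_add(
  family = "franklin-medium",
  regular = "renv/library/macos/R-4.5/aarch64-apple-darwin20/sysfonts/fonts/Libre_Franklin/static/LibreFranklin-Medium.ttf"
)

theme_set_custom <- function() {
  # load Google fonts
  sysfonts::font_add_google("Libre Franklin", "franklin")
  sysfonts::font_add(
    family = "franklin-medium",
    regular = "renv/library/macos/R-4.5/aarch64-apple-darwin20/sysfonts/fonts/Libre_Franklin/static/LibreFranklin-Medium.ttf"
  )
  showtext::showtext_auto()
  # apply the ggplot2 theme
  ggplot2::theme_set(
    ggplot2::theme_minimal(base_family = "franklin") +
      ggplot2::theme(
        panel.background = ggplot2::element_rect(fill = "#F9F9F9", color = NA),
        plot.background  = ggplot2::element_rect(fill = "#F9F9F9", color = NA)
      )
  )
}
theme_set_custom()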
#| echo: false
#| message: false
#| warning: false
#| include: true
theme_set_custom()

p1_tribble <- tribble(
  ~perc_change, ~label, ~region, ~fill, ~width_cm, ~length_cm,
  '-1.5%', 'International arrivals\n at U.S. airports', 'International', '#969696', 170/300*2.54, 80/300*2.54, # converting into actual cm, controlling for the resolution set (300, or print quality)
  '-2%',   'Summer flight\n bookings from Europe',      'Europe',        '#969696', 130/300*2.54, 77/300*2.54,
  '+4%',   'Summer flight\n bookings from Asia',        'Asia',          '#2b9d6c', 130/300*2.54, 77/300*2.54,
  '-21%',  'Summer flight\n bookings from Canada',      'Canada',        '#d65f00', 164/300*2.54, 77/300*2.54
)

# here I go for an iterative process whereby each plot is created separately in a list of plots;
# one main reason is that the boxes/rectangles are of different sizes
tile_plot_rounded <- function(perc_change, label_text, fill, width_cm, length_cm, scaler = 3) { # 3 was best here
  library(grid) # for some reason, roundrectGrob() wasn't running without first loading grid here too (even if called earlier when loading all packages)
  ggplot() +
    ggtitle(label_text) + # set the label as the plot sub-title
    annotation_custom(
      grob = roundrectGrob(
        width  = unit(width_cm * scaler, "cm"),
        height = unit(length_cm * scaler, "cm"),
        r      = unit(0.1, "npc"), # corner radius; the higher the value the more pronounced the roundedness
        gp     = gpar(fill = fill, col = NA)
      ),
      xmin = 0, xmax = 6, ymin = 0, ymax = 3
    ) +
    annotate(
      "text", x = 3, y = 1.5, hjust = 0.5, vjust = 0.5,
      size = 10, # text size for perc_change
      label = perc_change, color = 'white', family = 'franklin', fontface = 'bold'
    ) +
    xlim(0, 6) + ylim(0, 3) +
    coord_fixed(ratio = 1) +
    theme_void() +
    theme(
      plot.margin = margin(4, 2, 2, 2), # small margins added around each plot for more subtitle room
      plot.title = element_text(hjust = 0.5, size = 14, family = 'franklin-medium', margin = margin(b = 5)) # a midway font face between plain and bold
    )
}

# loop through labels, fills, and dimensions, including 'label' for the title
tile_plots <- pmap(
  list(
    perc_change = p1_tribble$perc_change,
    label_text  = p1_tribble$label,
    fill        = p1_tribble$fill,
    width_cm    = p1_tribble$width_cm,
    length_cm   = p1_tribble$length_cm
  ),
  tile_plot_rounded
)

# a single row so that all plots sit side by side and are not stacked vertically
p1 <- wrap_plots(tile_plots, nrow = 1)

# add an overall title and caption after wrapping
p_final <- p1 + plot_annotation(
  title   = 'Travel compared with last year',
  caption = 'Sources: U.S. Customs and Border Protection and the Airlines Reporting Corporation',
  theme = theme(
    plot.title   = element_text(size = 20, family = 'franklin', face = "bold", hjust = 0.5, margin = margin(t = -10, b = 10)),
    plot.caption = element_text(size = 9, family = 'franklin-medium', face = 'bold', colour = '#727272', hjust = 0.15)
  )
)
p_final
Confidence, but Warning Signs
For the visualization below, I opted for gt() from the gt package, as it naturally comes to mind when trying to build 'great tables' (yes, pun intended). A few things worth mentioning to explain my thought process:
In the original, the tables clearly extend past the left/right margins of the article layout. While our page dimensions may already allow all 3 tables to fit side by side, for replication purposes we 'push out' (i.e., reduce) the margins so that all 3 tables sit side by side just as in the original.
Second, another thing that stands out is the heavy customization of the third column, the actual percent-change values, where a sort of target bar plot is drawn. One way to replicate that (potentially the only way) is text_transform(), in tandem with a custom function passed to the fn argument that emits mostly CSS. This allows customization down to the minutiae.
That last column is also centered and anchored on a small vertical tick (most visible in the middle table, where values alternate between negative, to the left of the tick, and positive, to the right).
From that tick, each value is scaled against the maximum for its table, so that each bar's width in pixels is proportional to the greatest (absolute) value in that table. A small sketch of that scaling follows this list.
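As a minimal sketch of that scaling logic, standalone and outside the gt pipeline (the toy values, the 100% half-track width, and the CSS skeleton are assumptions for illustration; the real HTML emitted by the fn function below is more involved):

library(stringr)

# toy percent changes for one table
x <- c(-21, -17, -4)

# each bar's width, as a share of the largest absolute value in the table
widths <- abs(x) / max(abs(x)) * 100  # 100 = the full half-track, so the largest value spans it entirely

# one bar per value; negative values extend leftwards from the central tick at 50%
bars <- str_glue(
  '<div style="position: absolute; right: 50%; width: {widths}%; height: 8px; background: #d65f00;"></div>'
)
bars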
Below I show the code that pulls in the data, which is embedded in an HTML table; once the node (or in this case the CSS class name) is identified, the rest is fairly easy.
Show the code
#| echo: false
#| message: false
#| warning: false
#| include: true
## p3; table to embed in gt()
# read in the corresponding table node
url <- 'https://www.nytimes.com/interactive/2025/04/30/world/us-travel-decline.html'
raw_table <- read_html(url) |>
  html_element(css = '.svelte-5z6yzk') |>
  html_table() |>
  select(country = 1, perc_change = 2) |>
  filter(perc_change != '')

# processing the tibble
individual_country_changes <- raw_table |>
  mutate(
    perc_change_num = parse_number(perc_change),
    left_right_ended = if_else(perc_change_num <= -.02, 'left', 'right'),
    # below is also going to be used to section off the table into 3 columns
    change_direction = case_when(
      between(perc_change_num, -2, 1) ~ 'stalled',
      perc_change_num < 0 ~ 'decreased',
      .default = 'increased'
    )
  ) |>
  mutate(
    groupings = case_when(
      n() == 11 ~ 1,
      n() == 6 ~ 2,
      .default = 3
    ),
    .by = change_direction
  )
Show the code
#| echo: false
#| message: false
#| warning: false
#| include: false
#| results: 'asis'
gt_table_custom_ <- function(hex, direction, title = NULL) {
  title_ <- title
  hex_to_pass_in_fn <- hex
  dir_to_pass_in_fn <- direction
  individual_country_changes |>
    filter(groupings == {{direction}}) |>
    select(1:3) |>
    gt() |>
    text_transform(
      locations = cells_body(columns = perc_change_num),
      fn = function(x) {
        x_num <- as.numeric(x)
        max_val <- max(abs(x_num), na.rm = TRUE)
        if (max_val == 0) max_val <- 1
        # default scaling may look misleading for 'no change' countries, so massively reduced here,
        # otherwise the center table values get way inflated for what they are ([-2; 1])
        widths_for_glue <- abs(x_num) / max_val * if (dir_to_pass_in_fn == 2) 8 else 100
        # loop through the values of the perc_change column and isolate each element so that it can pass
        # through str_glue() without the length(x) > 1 error from the if/else statement
        purrr::map_chr(seq_along(x_num), function(x) {
          width <- widths_for_glue[x]
          val <- x_num[x]
          if (dir_to_pass_in_fn == 2) {
            direction_to_go <- if (val < 0) "right" else "left"
            translate_offset <- if (val <= -2) "20%" else "0" # translate back to the origin (applied only to the no-change/middle table)
            str_glue(
              '<div style="display: flex; align-items: center; justify-content: center;">
                 <div style="position: relative; width: 100%; height: 8px;">
                   <div style="position: absolute; left: 50%; top: -4px; bottom: 4px; width: 2px; height: 16px; background: #ccc;"></div>
                   <div style="position: absolute; {direction_to_go}: 50%; top: 0; background: {hex_to_pass_in_fn}; height: 8px; width: {width}%; transform: translateX({translate_offset}); border-radius: 3px;"></div>
                 </div>
               </div>'
            )
          }
          # below, direction 1 (decreases) and 3 (increases) are treated virtually the same;
          # the former goes from center to left while the latter goes from center to right
          else {
            direction_to_go <- if (direction == 1) "right" else "left"
            str_glue(
              '<div style="display: flex; align-items: center; justify-content: center;">
                 <div style="position: relative; width: 50%; height: 8px;">
                   <div style="position: absolute; left: 50%; top: -8px; width: 2px; height: 16px; background: #ccc; transform: translateX(-1px);"></div>
                   <div style="position: absolute; {direction_to_go}: 50%; top: -4px; background: {hex_to_pass_in_fn}; height: 8px; width: {width}%; border-radius: 3px;"></div>
                 </div>
               </div>'
            )
          }
        })
      }
    ) |>
    cols_align(
      align = "center",
      columns = perc_change
    ) |>
    tab_style(
      style = cell_text(color = hex),
      locations = cells_body(columns = perc_change)
    ) |>
    tab_style(
      style = css("white-space" = "nowrap"),
      locations = cells_body(columns = country)
    ) |>
    cols_label(
      country = md("**Country**"),
      perc_change = "",
      perc_change_num = htmltools::HTML("<span style='white-space: nowrap;'><strong>Change vs. 2024</strong></span>")
    ) |>
    opt_table_font(
      font = c("franklin-medium"),
      weight = 500
    ) |>
    tab_options(
      heading.align = "left",
      table.width = pct(100),
      heading.padding = px(10),
      table.border.top.style = "hidden",
      column_labels.border.bottom.style = "solid",
      column_labels.border.bottom.width = px(1),
      column_labels.border.bottom.color = "white"
    ) |>
    cols_width(
      country ~ pct(50),
      perc_change ~ pct(50),
      perc_change_num ~ pct(50)
    )
}

# generate and assign the tables
tbl1 <- gt_table_custom_(hex = '#d65f00', direction = 1) |>
  tab_header(
    title = md("**Where U.S.-bound summer flight bookings have <span style='color: #d65f00;'> decreased </span>...**")
  )

tbl2 <- gt_table_custom_(hex = '#666666', direction = 2) |>
  tab_header(
    title = md("<span style='color: #666666;'>**... stayed about the same ...**</span>")
    # subtitle = md("<span style='color: #FFFFFF;'>adding artificial padding.**</span>")
  )

tbl3 <- gt_table_custom_(hex = '#2b9d6c', direction = 3) |>
  tab_header(
    title = md("**... and <span style='color: #2b9d6c;'>increased</span>.**")
    # subtitle = md("<span style='color: #FFFFFF;'>adding artificial.** padding</span>")
  )

tables_row_layout <- htmltools::tags$div(
  style = "display: flex; justify-content: space-between; align-items: flex-start; gap: 20px; width: 100%; padding: 0 10px;",
  htmltools::tags$div(style = "flex: 1; min-width: 0;", tbl1),
  htmltools::tags$div(style = "flex: 1; min-width: 0;", tbl2),
  htmltools::tags$div(style = "flex: 1; min-width: 0;", tbl3)
)

caption_text <- "<span style='color:#707070;'>Source: Airlines Reporting Corporation • Data covers flight bookings made between Feb. 1 and April 13 for travel between Memorial Day and Labor Day, compared to the same period last year.</span>"
caption_html_content <- HTML(caption_text)

left_padding_value <- "20px" # padding so that the caption moves slightly to the right to match the original's caption position
caption_styled_div <- htmltools::tags$div(
  style = paste(
    "width: 50%;",        # overall visual span
    "max-width: 1000px;", # cap the max width of the caption
    "margin: 5px auto 0 auto;",
    "text-align: left;",
    "font-family: 'franklin-medium', 'Libre Franklin', Arial, sans-serif;", # prioritize 'franklin-medium'
    "font-weight: 500;",
    "font-size: 0.85em;",
    "line-height: 1.1;",
    paste0("padding-left: ", left_padding_value, ";")
  ),
  caption_html_content
)

# the parent div stacks them vertically and can be used to constrain the overall width
overall_final_layout <- htmltools::tags$div(
  style = "width: 100%; max-width: 1900px; margin: 0 auto; padding-bottom: 1px;",
  tables_row_layout,
  caption_styled_div
)

# final output
overall_final_layout
Where U.S.-bound summer flight bookings have decreased …

| Country | Change vs. 2024 |
|:---|---:|
| Canada | -21% |
| Netherlands | -17% |
| Germany | -12% |
| Ecuador | -11% |
| Mexico | -9% |
| Dom. Rep. | -9% |
| Switzerland | -8% |
| China | -7% |
| South Korea | -5% |
| India | -4% |
| Poland | -4% |

… stayed about the same …

| Country | Change vs. 2024 |
|:---|---:|
| U.K. | -2% |
| France | -2% |
| Colombia | -2% |
| Italy | 0% |
| Philippines | +1% |
| Greece | +1% |

… and increased.

| Country | Change vs. 2024 |
|:---|---:|
| Costa Rica | +3% |
| Brazil | +4% |
| Australia | +6% |
| Spain | +8% |
| Portugal | +8% |
| Japan | +11% |
| Ireland | +11% |
| Argentina | +39% |

Source: Airlines Reporting Corporation • Data covers flight bookings made between Feb. 1 and April 13 for travel between Memorial Day and Labor Day, compared to the same period last year.
Unlike the line graphs in the original article showing international arrivals, which were embedded as an image, here I was able to extract the actual data: I used SelectorGadget - a free, reliable Chrome extension for identifying the CSS selector or XPath of page elements - to find the tag pointing to the table, in tandem with rvest to read the embedded data from that selector into a tibble. Note that gtExtras already ships a New York Times-style table theme, gtExtras::gt_theme_nytimes(), but it did not closely match the header styling or the overall look of these tables; it's still good to know about if you are working with your own data (not replicating another piece) and just want a starter template or boilerplate to test how things look, if nothing else. Overall, I think this is a fairly solid replication of the original tables, and a piece I enjoyed creating since it involved fairly advanced customization - deceptively simple-looking, but worth it. One thing to call out, however, is that these tables are optimized for desktop viewing due to their 100% width; mobile users may find the display cramped. I'll let you in on a secret - it turns out I don't stand a chance against the financial might of the NYT!
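For reference, a minimal sketch of that starter theme (a quick-and-dirty alternative, not the custom styling used above; it assumes gtExtras is installed and reuses the individual_country_changes tibble from the scraping chunk):

library(gt)
library(gtExtras)

# NYT-flavoured starter table out of the box
individual_country_changes |>
  dplyr::select(country, perc_change) |>
  gt() |>
  gt_theme_nytimes() |>
  tab_header(title = "U.S.-bound summer flight bookings vs. 2024")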
A Canadian boycott
Show the code
#| echo: false
#| message: false
#| warning: false
#| include: false
theme_set_custom()

car_xing_ca_us_borders_py <- read_csv('car_crossing_ca_us_borders.csv') |> distinct()
car_xing_ca_us_borders_cy <- read_csv('car_crossing_ca_us_borders2.csv') |> distinct()

# filters used below:
# 1st, 2nd filters focus only on Canadian nationals/residents (the article focuses on Canadian sentiment, with
#   strong wording such as 'boycott', so we look at Canada returns only, not international returns from the US into Canada)
# 3rd filter -> 'Vehicles' is a major group ('Automobiles' and 'Motorcycles' are subgroups); the plot caption in the
#   original article mentions 'by automobile', so we filter to 'Automobiles' so as not to inflate the counts
# 4th filter -> same idea; 'Travellers' is the major rollup group, confirmed with a quick one-day check where all
#   values for that day sum up to the 'Travellers' value
filtered_to_canadian_cy <- car_xing_ca_us_borders_cy |>
  filter(
    `Traveller characteristics` == 'Canadian resident visitors returning to Canada' &
    `Vehicle licence plate` == 'Canadian-plated vehicles entering Canada' &
    GEO == 'Canada' &
    `Vehicle type` == 'Automobiles' &
    `Traveller type` == 'Travellers'
  ) |>
  # the combination of filters above gives exactly 1 record per day (without any aggregations/rollups) and
  # 116 days (Jan 1 through April 26), so we stick to those filters; likely correct
  select(
    crossing_date = 1,
    travel_characteristic = 6,
    travel_type = 7,
    vehicle_ttl = ncol(car_xing_ca_us_borders_cy) - 4
  )

# same for py (prior year)
filtered_to_canadian_py <- car_xing_ca_us_borders_py |>
  filter(
    `Traveller characteristics` == 'Canadian resident visitors returning to Canada' &
    `Vehicle licence plate` == 'Canadian-plated vehicles entering Canada' &
    GEO == 'Canada' &
    `Vehicle type` == 'Automobiles' &
    `Traveller type` == 'Travellers'
  ) |>
  select(
    crossing_date = 1,
    travel_characteristic = 6,
    travel_type = 7,
    vehicle_ttl = ncol(car_xing_ca_us_borders_py) - 4
  ) |>
  # 117 days for py since it was a leap year, so filter out Feb. 29 to keep a day-over-day comparison
  filter(crossing_date != '2024-02-29')

# combine both sets
car_xing_change <- filtered_to_canadian_cy |>
  select(1, last_col()) |>
  arrange(crossing_date) |>
  bind_cols(
    filtered_to_canadian_py |>
      arrange(crossing_date) |>
      select(ly_ttls = last_col())
  ) |>
  mutate(
    change = vehicle_ttl - ly_ttls,
    change_perc = (vehicle_ttl - ly_ttls) / ly_ttls
  ) |>
  filter(crossing_date < '2025-04-10') |>
  # add a rolling 7-day average
  mutate(
    rolling_avg_7day = zoo::rollmean(change_perc, k = 7, fill = NA, na.rm = TRUE)
  ) |>
  filter(!is.na(rolling_avg_7day))

# the first two months get abbreviated and followed by a dot, while the last two don't get abbreviated
month_custom_labeller <- map_chr(
  .x = car_xing_change |> pull(crossing_date) |> lubridate::month() |> unique() |> sort(),
  .f = function(x) case_when(
    x > 2 ~ month(x, label = TRUE, abbr = FALSE),
    .default = paste0(month(x, label = TRUE, abbr = TRUE), '.')
  )
)

caption_text <- "<span style='color:#707070;'>Sources: Statistics Canada/K. Kardous Note: Data shows journeys by Canadian residents returning to Canada from the<br> U.S. by automobile. A 7-day rolling average is shown.</span>"

car_xing_change |>
  mutate(
    y_pos = pmax(rolling_avg_7day, 0), # positive values only; capping at 0 essentially creates two separate series (one positive, one negative)
    y_neg = pmin(rolling_avg_7day, 0)  # negative values only
  ) |>
  ggplot(aes(x = crossing_date)) +
  geom_area(aes(y = y_pos), fill = '#2b9d6c', outline.type = 'lower') +
  geom_area(aes(y = y_neg), fill = '#d65f00', outline.type = 'lower') +
  geom_hline(yintercept = 0) +
  labs(x = NULL, y = NULL, caption = caption_text) +
  scale_y_continuous(labels = scales::percent) +
  scale_x_date(
    breaks = seq(as.Date('2025-01-01'), as.Date('2025-04-09'), 'm'),
    labels = month_custom_labeller
  ) +
  # arrow and text for the negative region
  annotate(
    geom = "segment",
    x = as.Date('2025-04-04'), xend = as.Date('2025-04-04'),
    y = -0.02, yend = -0.1,
    colour = "black",
    arrow = arrow(length = unit(0.1, "cm"), angle = 50, type = 'open'),
    linewidth = .35
  ) +
  annotate(
    geom = 'text',
    x = as.Date('2025-04-02'), y = -0.06,
    label = 'Less travel\nthan last year',
    hjust = 1
  ) +
  # arrow and text for the positive region
  annotate(
    geom = "segment",
    x = as.Date('2025-04-04'), xend = as.Date('2025-04-04'),
    y = 0.025, yend = 0.1,
    colour = "black",
    arrow = arrow(length = unit(0.1, "cm"), angle = 50, type = 'open'),
    linewidth = .35
  ) +
  annotate(
    geom = 'text',
    x = as.Date('2025-04-02'), y = 0.06,
    label = 'More travel\nthan last year',
    hjust = 1
  ) +
  ggtitle(
    label = "<span style='color:black;'>Car crossings at the Canada-U.S. border",
    subtitle = "<span style='color:#777777;'>Change compared with same day last year"
  ) +
  theme(
    panel.grid.minor.x = element_blank(),
    panel.grid.major.x = element_blank(),
    plot.title = element_markdown(hjust = .5),
    plot.subtitle = element_markdown(hjust = .5),
    plot.caption = element_markdown(
      size = 8, hjust = 0,
      margin = margin(l = -20, r = 10, t = 10, b = -5)
    ),
    axis.text = element_text(family = 'franklin-medium', face = 'bold'),
    axis.text.x = element_text(hjust = -.9)
  )
If you compare this with the original, the overall trends match unquestionably - the message conveyed here and in the original article is fundamentally the same - but there are a few spikes I did not manage to replicate. I am confident in my methods, and that the numbers reflect a true 7-day moving average of car crossings, so I can only assume some of the data has been revised from preliminary to definitive in the time elapsed between the article's publication and this writing.
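For clarity, a minimal sketch of the rolling average used above (the values here are toy numbers; the real series is the day-over-day % change in crossings):

library(zoo)

# toy day-over-day % changes
change_perc <- c(-0.05, -0.08, -0.12, -0.20, -0.25, -0.31, -0.35, -0.33)

# centered 7-day mean; the ends are padded with NA and dropped afterwards, as in the chunk above
rollmean(change_perc, k = 7, fill = NA)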
Big drops along the border
About the data: The data is pushed as 2 separate CSV files, one for March 2024 and the other for March 2025, combined as part of the pre-processing steps in the code below. It tracks the number of automobiles that crossed the major Canada -> US border stations, focusing only on ports with at least 1,000 crossings in 2025. To match the numbers in the article as closely as possible, I applied the filters below (a short filter sketch follows this list):
traveller_type is set to “Travellers” (aggregate roll up)
traveller_characteristics is set to detect the “Canadian residents” pattern, regardless of direction (going to or returning from the US)
Wherever a "select specific levels only" radio filter was displayed, I kept it blank/unchecked
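A minimal sketch of the first two filters as they are applied in the mapping code further below (the map_data_ca_xings tibble and its column names come from the janitor::clean_names() output of the StatCan extract assembled there):

library(dplyr)
library(stringr)

march_xings_filtered <- map_data_ca_xings |>
  filter(
    traveller_type == 'Travellers',                              # aggregate rollup only
    str_detect(traveller_characteristics, 'Canadian residents')  # either direction of travel
  )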
I had to filter out "British Columbia", which was returned by the tidygeocoder wrapper around the ArcGIS API used to get approximate/realistic coordinates around which the triangles (for YoY changes) would be plotted; the reason is that in that one case it returned a whole province, when a much smaller geography is needed. For more info on the pull itself, feel free to consult Statistics Canada.
Data validation: The above link lands you (figuratively) on the Niagara Falls - Queenston Bridge border crossing, which I used along with another data point to validate that I am getting results fairly close, if not identical, to the NYT's original. More often than not I like to start simple, looking at a very specific example and then 'working my way up', generalizing once the foundation is solid - known, pompously, as hypothetico-inductive reasoning. The data matches the article's output very closely for Niagara Falls (combining the split into Niagara Falls - Queenston Bridge and Niagara Falls - Rainbow Bridge), both in raw numbers (101,250 fewer passengers vs. the article's 'nearly 100,000' YoY change for March) and in % change (-41% vs. -42%); but I did get a significantly different figure for the Douglas area (West coast): ~90k fewer passengers vs. the article's reported ~55k fewer crossings. Given the color scheme in the article's legend, the bars for Douglas and Niagara share a similarly dark shade of orange, and given that the Niagara Falls numbers match, I cannot account for that difference. I will not dwell on investigating it further, as the idea is to replicate, or create an output inspired by, a piece that caught my eye, while remaining highly faithful to the fundamental message it conveys.
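The Niagara check itself is simple arithmetic; the March counts below are the ones quoted in the comments of the mapping chunk further down:

# March 2025 vs. March 2024, Canadian residents, 'Travellers' rollup
rainbow_2025   <- 56331;  rainbow_2024   <- 103307   # Niagara Falls - Rainbow Bridge
queenston_2025 <- 90202;  queenston_2024 <- 144738   # Niagara Falls - Queenston Bridge

# raw YoY change across the two bridges: roughly 101k fewer crossings
(rainbow_2025 + queenston_2025) - (rainbow_2024 + queenston_2024)

# pooled % change across the two bridges: ~ -0.41, i.e. about a 41% decline (the article reports ~42%)
((rainbow_2025 + queenston_2025) - (rainbow_2024 + queenston_2024)) /
  (rainbow_2024 + queenston_2024)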
Data pull/wrangling steps: The whole point of replicating (or drawing inspiration from) this article's data and output was this very visualization - in a way, the "pièce de résistance" - so it warranted a lot of work to replicate the map below, and it took exactly that! At a high level, 4 main steps were crucial to replicating, or coming close to replicating, the plot:
After pulling the raw data from Statistics Canada, the first step was to map out actual coordinates for the border crossings. From the tidygeocoder package, I used the geo() wrapper to call the ArcGIS API, after appending a string to each port name - str_c(' Canada US border') - to help zero in, literally, on the coordinates needed. A bounded region had to be set, outside of which candidate coordinates were discarded; those coordinates lie along the Canada-US border, which to a significant extent follows the 49th parallel in the western/central regions, plus various other boundaries through the Great Lakes and the eastern/north-eastern regions.
I like the vertical fade into the "Great White North" - poetically apt here. This entailed first measuring the distance between the center coordinates, or centroids (using st_centroid() from the sf package), and the points diverging continuously from that center toward the top and/or bottom. That gradual distance was then mapped to RGB values moving incrementally from light grey to pure white, rgb(255, 255, 255), to create the gradient.
Then the triangles had to be drawn. I chose st_polygon(), also from sf, as it offers a high degree of customization where the defaults fall short: I needed very small bases and potentially very large heights (the default shapes are all variations on an equilateral triangle). The fill had to depict varying shades of the color, while the triangle heights conveyed differences in % magnitude (downward triangles for negative changes - almost all of them - and upward ones for positive % changes in the number of crossings). This did not rely on the 'fourth dimension' of the RGB color system, the alpha value ranging from 0-1, as I needed a change in actual color, not color intensity. (A minimal sketch of the triangle construction follows this list.)
Finally, the data callouts for Douglas (west) and Niagara (east) were displayed as text boxes via geom_textbox(), each linked to its spike by a quarter circle - the former traced from right to left (counter-clockwise), the latter from left to right.
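As a minimal, self-contained sketch of the triangle construction (the toy point, % change, and scaling constants are assumptions for illustration; the full version in the chunk below also normalizes heights across ports and works entirely in a projected CRS):

library(sf)

# a narrow, tall triangle anchored at (x, y); units are meters in a projected CRS
make_spike <- function(x, y, perc_change, base_width = 8e4, height_scale = 4e5) {
  height <- abs(perc_change) * height_scale
  tip_y  <- if (perc_change < 0) y - height else y + height   # down for declines, up for increases
  st_polygon(list(matrix(
    c(x - base_width, y,
      x + base_width, y,
      x,              tip_y,
      x - base_width, y),   # close the ring back at the first vertex
    ncol = 2, byrow = TRUE
  )))
}

spike <- make_spike(x = 0, y = 0, perc_change = -0.41)
st_sfc(spike, crs = "+proj=aeqd +lat_0=45 +lon_0=-100")   # wrap in an sfc to plot or combine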
Show the code
#| echo: false
#| message: false
#| warning: false
#| include: false
library(ggplot2)
library(tidyverse)
library(janitor)
library(rnaturalearth)
library(tidygeocoder)
library(sf)
library(rnaturalearthdata)
theme_set_custom()

# steps to recreate the last plot/map displaying drops in border crossings between Canada and the US
# tested on 1 example/1 port, keeping only ports with at least 1,000 crossings in March 2025,
# and the results match the visual to within a percentage point or so
## link for 2024:
# https://www150.statcan.gc.ca/t1/tbl1/en/cv!recreate.action?pid=2410005301&selectedNodeIds=1D146,2D3,2D18,2D41,2D42,2D75,2D91&checkedLevels=2D1&refPeriods=20240301,20240301&dimensionLayouts=layout2,layout3,layout2,layout2&vectorDisplay=false
## link for 2025:
# https://www150.statcan.gc.ca/t1/tbl1/en/cv!recreate.action?pid=2410005301&selectedNodeIds=1D146,2D41,2D42&checkedLevels=2D1,2D2&refPeriods=20250301,20250301&dimensionLayouts=layout3,layout3,layout3,layout2&vectorDisplay=false
# filters applied to test on Niagara Falls - Queenston Bridge:
#   GEOGRAPHY
#     Select specific levels only -> keep blank
#     Niagara Area -> check only Niagara Falls - Queenston Bridge
#   TRAVELLER CHARACTERISTICS
#     Select specific levels only -> keep blank
#     Canadian-resident visitors returning to Canada -> check only Canadian residents returning from the United States of America
#   TRAVELLER TYPE
#     Keep defaults (all checked; 'Travellers' is the final rollup number/total)
#   REFERENCE PERIOD
#     March 2024
# SAME EXACT STEPS FOR 2025
# DOWNLOAD BOTH AS CSVs and select 'Download selected data (for database loading)'
# (56331 - 103307) / 103307   # Niagara Falls - Rainbow Bridge
# (90202 - 144738) / 144738   # Niagara Falls - Queenston Bridge
# (0.3768 + 0.4547) / 2

# let's try to recreate those figures now that we have loaded the full datasets
map_data_ca_xings <- read.csv('2024_ca_us_ttl_xing_main_ports.csv') |>
  # mutate(year = 2025) |>
  clean_names() |>
  bind_rows(
    read_csv('2025_ca_us_ttl_xing_main_ports.csv') |>
      # mutate(year = 2024) |>
      clean_names()
  ) |>
  as_tibble()

invisible({
  map_data_ca_xings |>
    mutate(ref_year = word(ref_date, 1, sep = '\\-')) |>
    filter(
      traveller_type == 'Travellers' &
      geo %in% c('Niagara Falls - Queenston Bridge', 'Niagara Falls - Rainbow Bridge') &
      str_detect(traveller_characteristics, 'Canadian residents')
    ) |>
    count(ref_year, geo, crossings = value) |>
    summarise(
      perc_change_niagara_busiest_2ports =
        (sum(if_else(ref_year == 2025, crossings, 0)) - sum(if_else(ref_year == 2024, crossings, 0))) /
        sum(if_else(ref_year == 2024, crossings, 0))
    )
  # 41% decline - the article says 42%; close enough; some values, especially for 2025, may have been
  # preliminary and slightly adjusted since
})

# a function that loops through all major port names and generates a port name by % decline (2-column tibble)
port_name_list <- map_data_ca_xings |> pull(geo) |> unique()

perc_change_by_port <- function(x, port_name) {
  map_data_ca_xings |>
    mutate(ref_year = word(ref_date, 1, sep = '\\-')) |>
    filter(
      traveller_type == 'Travellers' &
      str_detect(traveller_characteristics, 'Canadian residents') &
      geo %in% port_name
    ) |>
    count(ref_year, geo, crossings = value) |>
    filter(crossings >= 1000) |>
    summarise(
      perc_change = (sum(if_else(ref_year == 2025, crossings, 0)) - sum(if_else(ref_year == 2024, crossings, 0))) /
        sum(if_else(ref_year == 2024, crossings, 0)),
      .by = geo
    ) |>
    filter(geo != 'Canada' & !is.na(perc_change) & between(abs(perc_change), 0, 1)) |>
    mutate_if(is.numeric, function(x) round(x, 2))
}

port_names_with_perc_changes <- map_dfr(.x = port_name_list, ~ perc_change_by_port(port_name = .x))
# Canada as a whole country has seen 24% fewer passengers in March 2025 vs. March 2024

# adding lat/longs so they can be added as a layer on the map
port_coords <- tribble(
  ~geo, ~latitude, ~longitude,
  "Niagara Falls - Queenston Bridge", 43.1633, -79.0507,
  "Niagara Falls - Rainbow Bridge", 43.0896, -79.0703,
  "Niagara Falls - Whirlpool Bridge", 43.1051, -79.0645,
  "St. Stephen - 3rd Bridge", 45.1856, -67.2792,
  "St. Stephen - ferry and other locations", 45.2025, -67.2783,
  "St-Armand/Philipsburg", 45.0412, -73.0486,
  "Stanstead: Route 55", 45.0036, -72.0981,
  "Stanstead: Route 143", 45.0066, -72.1047,
  "Lacolle: Route 221", 45.0878, -73.3717,
  "Lacolle: Route 223", 45.0861, -73.3597,
  "St-Bernard-de-Lacolle: Highway 15", 45.0051, -73.3723,
  "Douglas", 49.0033, -122.7578,
  "Pacific Highway", 49.0038, -122.7382,
  "Abbotsford/Huntingdon", 49.0027, -122.2555
)

# not run
# port_names_with_perc_changes |>
#   left_join(port_coords) |>
#   summarise(na_prop = 1 - mean(is.na(latitude)))
# currently only a 10% match rate from online searches, so let's explore the tidygeocoder package to get more granular
# and see if we can extend the matching to at least 60-70%; then we can go from there
yet_to_match <- port_names_with_perc_changes |>
  left_join(port_coords) |>
  filter(is.na(latitude)) |>
  pull(geo) |>
  str_c(' Canada US border')

# check whether geocode_arcgis has already been cached to a temp qs file, so that we avoid redundant API calls
# (every time we run/render the doc)
cache_file <- "geocoded.qs"
geocode_arcgis <- if (file.exists(cache_file)) {
  qread(cache_file)
} else {
  geocode_arcgis <- map_dfr(yet_to_match, ~ geo(address = .x, method = "arcgis"))
  qsave(geocode_arcgis, cache_file)
  geocode_arcgis
}
# this massively improved the matches - but let's see if those coordinates can realistically be considered
# border regions between CA and US
# now onto the actual building of the plot elements that will go into the final output

#| echo: false
#| message: false
#| warning: false
#| include: false
#| eval: false
library(ggplot2)
library(tidyverse)
library(janitor)
library(rnaturalearth)
library(tidygeocoder)
library(sf)
library(rnaturalearthdata)
theme_set_custom()

invisible({
  validate_border_crossings <- function(geocoded_df, max_km_from_49th = 200) {
    # rough bounds for Canada
    canada_lat <- c(41.5, 84.0)
    canada_lon <- c(-141.0, -52.0)
    # simple distance to the 49th parallel (main border)
    distance_to_49th <- function(lat) abs(lat - 49.0) * 111 # ~111 km per degree latitude
    geocoded_df |> # geocode_arcgis was here, changed to geocoded_df
      filter(!is.na(lat), !is.na(long)) |>
      mutate(
        within_canada = lat >= canada_lat[1] & lat <= canada_lat[2] & long >= canada_lon[1] & long <= canada_lon[2],
        km_from_49th = distance_to_49th(lat),
        is_valid = within_canada & km_from_49th <= max_km_from_49th,
        status = case_when(
          !within_canada ~ "outside canada bounds",
          km_from_49th > max_km_from_49th ~ paste0("too far out (", round(km_from_49th), "km)"),
          TRUE ~ "valid"
        )
      )
  }

  validated_coords <- validate_border_crossings(geocode_arcgis, max_km_from_49th = 300)
  # using the above threshold to somewhat account for the US 'curvature' into Canada, especially in the North-East,
  # as the border is not just one latitude/'horizontal' line
  valid_coords <- validated_coords |>
    filter(is_valid) |>
    select(geo = address, lat, long)
  # this moved from 10 matches up to 96 out of 140, ~ 70% match rate
  # looking at the number of spikes in the article, 96 should hopefully cover most if not all spike locations;
  # from visual inspection, around 50 are drawn on the Canada/US map/border

  # combine all tibbles into one with perc_changes
  # spike_prep is defined but not explicitly used later; spike_prep_sf is the key one
  spike_prep <- port_names_with_perc_changes |>
    filter(geo != 'British Columbia') |> # filter out BC early
    left_join(port_coords, join_by(geo == geo)) |>
    left_join(valid_coords |> mutate(geo = str_remove(geo, ' Canada US border')), join_by(geo == geo)) |>
    mutate(
      latitude = coalesce(latitude, lat),
      longitude = coalesce(longitude, long)
    ) |>
    filter(!is.na(latitude)) |>
    select(geo:longitude)
})

## starting the work on the map
# get map data
invisible({
  countries <- ne_countries(scale = "medium", returnclass = "sf") %>%
    filter(name %in% c("United States of America", "Canada"))
})

invisible(capture.output({
  # get the Great Lakes
  lakes <- ne_download(scale = "medium", type = "lakes", category = "physical", returnclass = "sf") %>%
    filter(name %in% c("Lake Ontario", "Lake Erie", "Lake Huron", "Lake Michigan", "Lake Superior"))
}))

# crop to the focus area - remove Alaska
map_bounds <- list(xmin = -125, xmax = -65, ymin = 41, ymax = 52)

# filter out Alaska from the US (longitude > -140 is roughly Alaska)
continental_us_canada <- countries |>
  mutate(
    # create geometries without Alaska
    geometry = case_when(
      name == "United States of America" ~ st_crop(geometry, xmin = -130, ymin = 20, xmax = -60, ymax = 50),
      TRUE ~ geometry
    )
  ) |>
  # remove any empty geometries after cropping
  filter(!st_is_empty(geometry))

# combine all tibbles into one with perc_changes
spike_prep_sf <- port_names_with_perc_changes |>
  filter(geo != 'British Columbia') |> # filter out "British Columbia" here
  left_join(port_coords, join_by(geo == geo)) |>
  left_join(valid_coords |> mutate(geo = str_remove(geo, ' Canada US border')), join_by(geo == geo)) |>
  mutate(
    latitude = coalesce(latitude, lat),
    longitude = coalesce(longitude, long)
  ) |>
  filter(!is.na(latitude)) |>
  select(geo:longitude) |>
  # make sure perc_change is carried through if it exists in port_names_with_perc_changes
  st_as_sf(coords = c("longitude", "latitude"), crs = 4326, agr = "constant", remove = FALSE)

# generating regions for easier filtering of spikes
crossing_data <- spike_prep_sf |> # this now uses the BC-filtered data
  filter(
    abs(perc_change) <= .50 &
    longitude >= -122.76 & # exclude the far west/Vancouver area and start at Douglas
    !geo %in% c('Kelowna', 'Calgary Area', 'Calgary', 'Regina', '')
  ) |>
  mutate(perc_change = if_else(between(perc_change, -.39, -.23), perc_change / 10, perc_change)) |>
  # filter to match the article criteria - only significant declines
  filter(abs(perc_change) <= .45) %>%
  # add regional grouping for better selection
  mutate(
    region = case_when(
      longitude < -115 ~ "west_coast",
      longitude >= -115 & longitude < -100 ~ "prairie",
      longitude >= -100 & longitude < -85 ~ "central",
      longitude >= -85 ~ "east",
      TRUE ~ "other"
    ),
    normalized_heights = if (n() > 0 && any(!is.na(perc_change))) abs(perc_change) / max(abs(perc_change), na.rm = TRUE) else 0 # handle empty or all-NA cases
  )

# select representative spikes by region to match the article count (~40-50 total)
selected_spikes <- crossing_data |>
  filter(latitude <= 49.5 & longitude < -65) |>
  group_by(region) |>
  arrange(perc_change) |> # most negative first
  mutate(
    no = as.integer(case_when(
      region == "west_coast" ~ 12,
      region == "prairie" ~ 8,
      region == "central" ~ 6,
      region == "east" ~ 15,
      TRUE ~ 3
    ))
  ) |>
  ungroup() |>
  slice(.by = region, 1:no[1]) |>
  select(-no) |>
  bind_rows(
    # pull select positive changes
    crossing_data |>
      filter(perc_change > 0) |>
      filter(longitude < -65)
  )

# transform points to a projected coordinate system; the original is more elliptic in shape rather than a 'straight' plane
selected_spikes_projected <- selected_spikes |> # already BC-filtered
  st_transform(crs = "+proj=aeqd +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0")

# extract projected coordinates
selected_spikes_coords <- selected_spikes_projected |>
  mutate(
    proj_x = st_coordinates(geometry)[, 1],
    proj_y = st_coordinates(geometry)[, 2]
  ) |>
  st_drop_geometry()

# create triangles using projected coordinates
create_projected_triangles <- function(data, height_scalar = 1) {
  local_height_mult <- height_scalar # avoid a global assignment

  # function to create a triangle for each point using projected coordinates
  create_triangle <- function(x, y, perc_change, normalized_heights, current_height_mult) {
    # base width and height in projected units (meters)
    base_width <- 8e4 # 80km base width
    height <- abs(normalized_heights) * current_height_mult * 4e5 # scale height based on the normalized perc change

    # triangle vertices - purely vertical orientation
    if (perc_change < 0) {
      # downward-pointing triangle for the vast majority of cases
      vertices <- data.frame(
        x = c(x - base_width, x + base_width, x, x - base_width),
        y = c(y, y, y - height, y)
      )
    } else {
      # upward-pointing triangle; only for a couple of instances
      vertices <- data.frame(
        x = c(x - base_width, x + base_width, x, x - base_width),
        y = c(y, y, y + height, y)
      )
    }
    # create the polygon geometry
    st_polygon(list(as.matrix(vertices)))
  }

  # create triangles for each point
  triangles <- data |>
    rowwise() |>
    mutate(triangle_geom = list(create_triangle(proj_x, proj_y, perc_change, normalized_heights, local_height_mult))) |>
    ungroup()

  # convert to an sf object with triangle geometries
  triangle_sf <- triangles |>
    mutate(geometry = st_sfc(triangle_geom, crs = "+proj=aeqd +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0")) |>
    st_sf()

  return(triangle_sf)
}

# create triangle geometries using projected coordinates
# ensure selected_spikes_coords is not empty before calling
triangle_sf <- create_projected_triangles(selected_spikes_coords, height_scalar = 2.5)

# also add the raw number of crossings to base the color fill on: a low raw change gets a lighter orange, a higher one darker;
# so spike height reflects the % change while color reflects the raw change in crossings
# create the crossings data with error checking
crossings_data_for_join <- map_data_ca_xings |>
  mutate(ref_year = word(ref_date, 1, sep = '\\-')) |>
  filter(traveller_type == 'Travellers' & str_detect(traveller_characteristics, 'Canadian residents')) |>
  reframe(
    crossings_raw_change = sum(if_else(ref_year == "2025", value, 0)) - # year as character since ref_year is character
      sum(if_else(ref_year == "2024", value, 0)),
    .by = geo
  ) |>
  filter(!is.na(crossings_raw_change))

# perform the join with error checking
triangle_sf <- triangle_sf |>
  left_join(crossings_data_for_join, by = "geo") |>
  # handle any missing values
  mutate(crossings_raw_change = coalesce(crossings_raw_change, 0)) |>
  # normalize raw changes for alpha scaling
  mutate(
    abs_raw_change = abs(crossings_raw_change),
    # normalize to a 0-1 range for alpha values, with a safeguard for division by zero
    max_abs_change = if (n() > 0 && any(!is.na(abs_raw_change))) max(abs_raw_change, na.rm = TRUE) else 0,
    normalized_raw_change = if_else(max_abs_change > 0, abs_raw_change / max_abs_change, 0),
    # alpha values: min 0.2 for very small changes, max 0.8 for the largest changes
    alpha_value = 0.2 + (normalized_raw_change * 0.8)
  ) |>
  select(-max_abs_change)

# transform the countries first
countries_proj <- countries |> st_transform(crs = "+proj=aeqd +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0")
lakes_proj <- lakes |> st_transform(crs = "+proj=aeqd +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0")

# get the bounding box and expand it significantly for a better fade effect
bbox <- st_bbox(countries_proj)
x_range <- bbox[3] - bbox[1]
y_range <- bbox[4] - bbox[2]

# expand the bbox by 50% to create more fade area
bbox_expanded <- c(
  xmin = bbox[1] - 0.5 * x_range,
  ymin = bbox[2] - 0.5 * y_range,
  xmax = bbox[3] + 0.5 * x_range,
  ymax = bbox[4] + 0.5 * y_range
)

# create a much finer grid for a smoother gradient
grid_resolution <- 400 # increased for an even smoother gradient
x_seq <- seq(bbox_expanded[1], bbox_expanded[3], length.out = grid_resolution)
y_seq <- seq(bbox_expanded[2], bbox_expanded[4], length.out = grid_resolution)

# create a vertical gradient for within-country areas
gradient_data <- expand.grid(x = x_seq, y = y_seq) |>
  mutate(
    # convert to sf points to check intersection with the countries
    geometry = st_sfc(map2(x, y, ~ st_point(c(.x, .y))), crs = st_crs(countries_proj))
  ) |>
  st_sf() |>
  # check which points are within the countries
  mutate(within_countries = lengths(st_intersects(geometry, countries_proj)) > 0) |>
  st_drop_geometry() |>
  mutate(
    # calculate the vertical position for the gradient (center = 0)
    y_center = 0, # center of our projection
    y_range_total = bbox_expanded[4] - bbox_expanded[2],
    # normalize the y position from the center (-1 to 1, where 0 is the center)
    y_normalized = (y - y_center) / (y_range_total * 0.5),
    # vertical gradient: darker at the center, lighter toward the top/bottom
    dist_from_horizontal_center = abs(y_normalized),
    # apply a smooth transition function; the higher the exponent, the 'faster' the fade away from the center; caps at 1
    vertical_fade = pmin(dist_from_horizontal_center^0.6, 1),
    # color values: #e0e0e0 (224, 224, 224) at the center fading to white (255, 255, 255) at the edges
    red_value = 224 + (255 - 224) * vertical_fade,
    green_value = 224 + (255 - 224) * vertical_fade,
    blue_value = 224 + (255 - 224) * vertical_fade,
    # create the hex color
    color_hex = rgb(red_value, green_value, blue_value, maxColorValue = 255),
    # only apply the gradient within the countries, white elsewhere
    final_color = if_else(within_countries, color_hex, "white"),
    # alpha - fully opaque within the countries, transparent outside (this is gradient_data's alpha)
    alpha_value = if_else(within_countries, 1.0, 0.0)
  )

# create a mask for areas within the bounding box
bbox_polygon <- st_polygon(list(rbind(
  c(bbox_expanded[1], bbox_expanded[2]),
  c(bbox_expanded[3], bbox_expanded[2]),
  c(bbox_expanded[3], bbox_expanded[4]),
  c(bbox_expanded[1], bbox_expanded[4]),
  c(bbox_expanded[1], bbox_expanded[2])
))) |>
  st_sfc(crs = st_crs(countries_proj))

# create the main plot with the vertical gradient within the countries
main_plot <- ggplot() +
  # white background
  geom_raster(data = gradient_data, aes(x = x, y = y), fill = "white") +
  # the vertical gradient within the countries only
  geom_raster(data = gradient_data |> filter(within_countries), aes(x = x, y = y, fill = final_color, alpha = alpha_value)) + # alpha here is for the gradient
  # identity scales for the custom colors
  scale_fill_identity() +
  scale_alpha_identity() + # applies to the gradient's alpha, and later the legend rect's alpha
  # country boundaries
  geom_sf(data = countries_proj, fill = NA, color = "white", linewidth = 0.3) +
  # Great Lakes
  geom_sf(data = lakes_proj, fill = "white", color = "white", linewidth = 0.1) +
  # triangle spikes with alpha based on raw change magnitude
  geom_sf(
    data = triangle_sf,
    aes(fill = if_else(perc_change < 0, I('#d65f00'), I('#2b9d6c'))),
    size = 0.8,
    alpha = pmin(triangle_sf$alpha_value, .7)
  ) +
  scale_fill_identity() + # triangle color scale
  # coordinate system
  coord_sf(xlim = c(-4e6, 4e6), ylim = c(-1.75e6, 2.1e6), expand = FALSE, clip = 'off') +
  # clean theme
  theme_void() +
  theme(
    plot.title = element_markdown(hjust = 0.5, size = 14, family = 'franklin', margin = margin(t = -10, b = 10)),
    plot.background = element_rect(fill = "white", color = NA),
    panel.background = element_rect(fill = "white", color = NA),
    legend.position = 'none' # ggplot's auto legend is off, we are making a manual one
  )

# function to create the 1/4 circles for the Douglas and Niagara Falls callouts
draw_quarter_circles <- function(start, end, radius = 400e3, x_offset = 0, y_offset = 0) {
  # angles and coords for a quarter of a circle
  angles <- seq(start, end, length.out = 1000)
  x_coords <- radius * cos(angles)
  y_coords <- radius * sin(angles)
  # test with different offsets to match the article's callout positioning
  x_coords_with_offset <- x_coords + x_offset
  y_coords_with_offset <- y_coords + y_offset
  quarter_circle_data <- tibble(x = x_coords_with_offset, y = y_coords_with_offset)
  # draw said 1/4 circle
  path <- geom_path(data = quarter_circle_data, aes(x = x, y = y))
  return(path)
}

theme_set_custom()

# draw the 1/4 circles
main_plot_with_qrt_circles <- main_plot +
  geom_text(aes(x = -150e3, y = -550e3, label = 'UNITED STATES'), family = 'franklin', size = 6, color = '#a3a3a3') +
  geom_text(aes(x = -200e3, y = 950e3, label = 'CANADA'), family = 'franklin', size = 6, color = '#a3a3a3') +
  # Douglas 1/4 circle
  draw_quarter_circles(start = pi, end = 3 * pi / 2, x_offset = -150e3 * 11.2, y_offset = -550e3 + 100e4) +
  # Niagara 1/4 circle
  draw_quarter_circles(start = 0, end = pi / 2, x_offset = -150e3 * -11.5, y_offset = -550e3 - 50e4)

call_outs_tibble <- tibble(
  label = c(
    "<b>~90,000 fewer<br> people crossed at<br> Douglas</b> in March<br>2025, a 43<br>percent decline.",
    "<b>Nearly 102,000 fewer <br> people crossed at <br>Niagara Falls,</b> a 41 <br> percent decline."
  ),
  x = c(-150e3 * 15, -150e3 * -14),
  y = c(-550e3 + 100e4, -550e3 - 75e4),
  hjust = c(.5, 0.05),
  vjust = c(0, 0.6)
)

# add legend elements to the plot
legend_box <- pmin(triangle_sf$alpha_value, .7) |>
  sort() |>
  enframe() |>
  mutate(
    normalized_values_01 = (value - min(value)) / (max(value) - min(value)),
    color_bins = case_when(
      normalized_values_01 <= .25 ~ '#e3cabd-10',
      normalized_values_01 <= .50 ~ '#e0b69e-20',
      normalized_values_01 <= .75 ~ '#db956a-30',
      normalized_values_01 <= 1 ~ '#d76c21-40%',
      TRUE ~ '#c7d8d0'
    ),
    labels = paste0('-', word(color_bins, -1, sep = '-')),
    color_bins = word(color_bins, 1, sep = '-')
  ) |>
  select(color_bins, labels) |>
  distinct() |>
  ggplot(aes(x = labels, y = .5, fill = I(color_bins))) +
  geom_tile() +
  labs(x = NULL, y = 'Percentage change vs. last year') +
  theme(
    axis.title.y = element_text(angle = 0, size = 7, vjust = .5, face = 'bold'),
    axis.text.x = element_text(size = 7, face = 'bold'),
    axis.text.y = element_blank(),
    plot.background = element_rect(fill = "white", color = NA),
    panel.background = element_rect(fill = "white", color = NA),
    panel.grid.major.x = element_blank(),
    panel.grid.major.y = element_blank(),
    legend.position = 'none',
    plot.margin = margin(t = 8)
  )

# plot the callouts
final_plot <- main_plot_with_qrt_circles +
  geom_textbox(
    data = call_outs_tibble,
    aes(x = x, y = y, label = label, hjust = hjust, vjust = vjust),
    family = 'Libre Franklin',
    width = unit(0.2, "npc"),
    fill = NA,
    size = c(3, 2.5),
    color = 'black',
    box.colour = NA,
    orientation = "upright"
  )

# plot the legend box along with its (legend) name
final_plot_with_legend <- final_plot +
  annotation_custom(grob = ggplotGrob(legend_box), xmin = -2e6, xmax = 3e6, ymin = 2.5e6, ymax = 2.8e6)

# plot the final map with a title added via geom_text() to avoid margin/rendering errors, and add the caption
final_plot_with_legend +
  geom_text(
    aes(x = 0, y = 2.85e6, label = "Big drops along the border"),
    family = 'franklin', size = 3.5, fontface = 'bold', color = 'black', hjust = 0.3, vjust = 0.5
  ) +
  labs(
    caption = "<span style='color:#707070; text-align: justify; width: 100%; display: block; padding-top: 25px;'>Source: Statistics Canada Note: The height of each spike reflects the change in passenger numbers this<br>March versus the same month last year. Only ports with at least 1,000 crossings in March 2025 are shown.</span>"
  ) +
  theme(plot.caption = element_markdown(margin(t = 50), size = 5, hjust = .5))
Overall, I found this article powerful and clear, and its authors original in their use of data, so it was truly an enjoyable experience to 'deconstruct' the article's backend and match its outputs as closely as I could. Thank you for taking the time to read this - more to come!
Arrivals, where the graph was embedded as an image, here I was able to extract the actual data using the tag id pointing to the table, using [Selector Gadget](https://g.co/kgs/zQmpZoP) - a free, reliable Chrome extension that can be used to identify either the CSS selector or XPath of site elements - *in tandem* with `{rvest}` to read the embedded data from said selector into a tibble.<br> Note that `{gtExtras}` already ships a New York Times-style table theme, `gtExtras::gt_theme_nytimes()`, but it did not closely match the style of the headers or the overall look of these tables; it's still good to know about if you are working with your own data (not replicating another piece) and just want a starter template or boilerplate to at least test how things look, if nothing else. <br> Overall, I think this is a fairly solid replication of the tables from the original, and a piece I've enjoyed creating as it involved fairly advanced customization - I'd say it looks deceptively simple, but re-creating it was worth the effort. <br> **One thing to call out**, however, is that these tables are optimized for desktop viewing due to their 100% width formatting. Mobile users may experience a cramped display; I will let you in on a secret - it turns out that I do not stand a chance against the financial might of the NYT!<br>## **A Canadian boycott** {.text-justify}```{r}#|echo: false#|message: false#|warning: false#|include: falsetheme_set_custom()car_xing_ca_us_borders_py <-read_csv('car_crossing_ca_us_borders.csv') |>distinct()car_xing_ca_us_borders_cy <-read_csv('car_crossing_ca_us_borders2.csv') |>distinct()# I am testing with the below filters # 1st and 2nd filters should only focus on Canadian nationals and/or residents (article focuses on Canadian sentiment, with the use of a strong word such as 'boycott', so assuming we should look into Canada returns only, not Intl.
returns from US into Canada)# 3nd filter -> 'Vehicles' is a major group (when 'automobiles' and 'motorcycles' are subgroups) so filtering to 'Vehicles' so that not to inflate the counts, but the plot caption in the original article mentions 'by automobile' so we filter t# 4rd filter same idea as in 2nd but here -> 'Travellers' is a major group after a quick calculation for one day as an example where all values for that day sumn up to 'Travellers' value filtered_to_canadian_cy <- car_xing_ca_us_borders_cy |>filter(`Traveller characteristics`=='Canadian resident visitors returning to Canada'&`Vehicle licence plate`=='Canadian-plated vehicles entering Canada'& GEO =='Canada'&`Vehicle type`=='Automobiles'&`Traveller type`=='Travellers' ) |># above combination of filters is giving me exactly 1 record per day (without any aggregations/roll ups) and it also gives me 116 days (Jan 1 through to Apr 26th, so we stick to those filters; likely correct)select(crossing_date =1,travel_characteristic =6,travel_type =7, vehicle_ttl =ncol(car_xing_ca_us_borders_cy) -4 ) # same with py (or prior year)filtered_to_canadian_py <- car_xing_ca_us_borders_py |>filter(`Traveller characteristics`=='Canadian resident visitors returning to Canada'&`Vehicle licence plate`=='Canadian-plated vehicles entering Canada'& GEO =='Canada'&`Vehicle type`=='Automobiles'&`Traveller type`=='Travellers' ) |>select(crossing_date =1,travel_characteristic =6,travel_type =7, vehicle_ttl =ncol(car_xing_ca_us_borders_py) -4 ) |># 117 days for py since it was leap year, so we filter out feb 29th to have a day over day comparisonfilter(crossing_date !='2024-02-29')# combining both sets car_xing_change <- filtered_to_canadian_cy |>select(1, last_col()) |>arrange(crossing_date) |>bind_cols( filtered_to_canadian_py |>arrange(crossing_date) |>select(ly_ttls =last_col()) ) |>mutate(change = vehicle_ttl - ly_ttls,change_perc = (vehicle_ttl - ly_ttls) / ly_ttls ) |>filter(crossing_date <'2025-04-10') |># add a rolling 7 day avgmutate(rolling_avg_7day = zoo::rollmean(change_perc, k =7, fill =NA, na.rm =TRUE) ) |>filter(!is.na(rolling_avg_7day))# first two months get abbreviated followed by a dot, while last two don't get abbreviatedmonth_custom_labeller <-map_chr(.x = car_xing_change |>pull(crossing_date) |> lubridate::month() |>unique() |>sort(), .f =function(x) case_when( x >2~month(x, label =TRUE, abbr =FALSE),.default =paste0(month(x, label =TRUE, abbr =TRUE), '.') ) )caption_text <-"<span style='color:#707070;'>Sources: Statistics Canada/K. Kardous Note: Data shows journeys by Canadian residents returning to Canada from the<br> U.S. by automobile. A 7-day rolling average is shown.</span>"car_xing_change |>mutate(y_pos =pmax(rolling_avg_7day, 0), # positive values only; this caps at 0 for < 0 so to create essentially two seperate series (one pos. 
and one negative)y_neg =pmin(rolling_avg_7day, 0) # negative values only ) |>ggplot(aes(x = crossing_date)) +geom_area(aes(y = y_pos), fill ='#2b9d6c', outline.type ='lower') +geom_area(aes(y = y_neg), fill ='#d65f00', outline.type ='lower') +geom_hline(yintercept =0) +labs(x =NULL, y =NULL,caption = caption_text ) +scale_y_continuous(labels = scales::percent ) +scale_x_date(breaks =seq(as.Date('2025-01-01'), as.Date('2025-04-09'), 'm'),labels = month_custom_labeller ) +# arrow and text for negative regionannotate(geom ="segment",x =as.Date('2025-04-04'), xend =as.Date('2025-04-04'), y =-0.02, yend =-0.1, colour ="black", arrow =arrow(length =unit(0.1, "cm"), angle =50, type ='open'),linewidth = .35 ) +annotate(geom ='text',x =as.Date('2025-04-02'),y =-0.06,label ='Less travel\nthan last year', hjust =1, ) +# arrow and text for positive regionannotate(geom ="segment",x =as.Date('2025-04-04'), xend =as.Date('2025-04-04'), y =0.025, yend =0.1, colour ="black", arrow =arrow(length =unit(0.1, "cm"), angle =50, type ='open'),linewidth = .35 ) +annotate(geom ='text',x =as.Date('2025-04-02'),y =0.06,label ='More travel\nthan last year', hjust =1 ) +ggtitle(label ="<span style='color:black;'>Car crossings at the Canada-U.S. border",subtitle ="<span style='color:#777777;'>Change compared with same day last year" ) +theme(panel.grid.minor.x =element_blank(),panel.grid.major.x =element_blank(),plot.title =element_markdown(hjust = .5),plot.subtitle =element_markdown(hjust = .5),plot.caption =element_markdown(size =8, hjust =0, margin =margin(l =-20, r =10, t =10, b =-5) ),axis.text =element_text(family ='franklin-medium', face ='bold'),axis.text.x =element_text(hjust =-.9) )```If you look at the original, while the trends overall match unquestionably, meaning the intent/message conveyed here and in the original article is fundamentally the same, there are a few spikes that I did not manage to replicate; while I am confident in my methods; and that the numbers reflect a true 7 day moving average of car crossings; I can only assume some the data has been re-actualized, updated to go from preliminary to definitive in the time ellapsed between article publication and now (as of this writing).## **Big drops along the border**::: text-justify**About the data:** You will find the data pushed as 2 separate csv files, one for March 2024, the other for March 2025, and combined as part of the pre-processing steps in the below code. The data tracks the no. of automobiles that crossed the CA -\> US large/known border stations, focusing only on ports that with at least 1,000 crossings for 2025. To match the numbers in the article as closely as possible. I applied below filters:- traveller_type is set to "Travellers" (aggregate roll up)- traveller_characteristics is set to detect the "Canadian residents" pattern, regardless of direction (going to or returning from the US)- Wherever specific levels radio filter selection were displayed, I kept them all blank/unchecked- I had to filter out "British Columbia" returned as part of the `{tidygeocoder}` wrapper for the `arcgis` API to get approximate/realistic coordinates around which triangles (for YoY changed would be plotted); the reason is that it returned (in that case only) a whole province, when a much smaller geography is needed. 
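Since that geocoding step comes up again in the wrangling further below, here is a minimal sketch under stated assumptions (the port names are hypothetical and the 300 km tolerance is illustrative, not necessarily the exact threshold used later):

```{r}
#| eval: false
# sketch: geocode crossing names with the arcgis method, then keep only hits that
# plausibly sit near the Canada-U.S. border (rough latitude-based check)
library(dplyr)
library(stringr)
library(tidygeocoder)

ports <- c("Pacific Highway", "Emerson")  # hypothetical port names for illustration

geo(
  address = str_c(ports, " Canada US border"),  # appended hint string to help the API zero in
  method  = "arcgis"
) |>
  mutate(km_from_49th = abs(lat - 49) * 111) |>  # ~111 km per degree of latitude
  filter(between(long, -141, -52), km_from_49th <= 300)
```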
<br>For more info on the pull itself, feel free to consult [Statistics Canada](https://www150.statcan.gc.ca/t1/tbl1/en/cv!recreate.action?pid=2410005301&selectedNodeIds=1D146,2D3,2D18,2D41,2D42,2D75,2D91&checkedLevels=2D1&refPeriods=20240301,20240301&dimensionLayouts=layout2,layout3,layout2,layout2&vectorDisplay=false). <br>**Data Validation:** The above link lands you on the Niagara Falls - Queenston Bridge border crossing (figuratively), which I then used, along with another data point, to validate that I am getting results fairly close, if not identical, to the NYT's original. <br> More often than not I like to start simple, looking at one very specific example and then 'working my way up', or generalizing, once the foundation is solid - known, somewhat pompously, as hypothetico-inductive reasoning. <br> While the data matches the article's output very closely for Niagara Falls (both the Niagara Falls - Queenston Bridge and Niagara Falls - Rainbow Bridge splits), in raw numbers (101,250 fewer passengers vs. the article's 'nearly 100,000' YoY change for March) as well as in % change (-41% vs. -42%), I did get a significantly different figure for the Douglas area (West Coast): \~90k fewer passengers vs. the article's reported \~55k fewer crossings. <br> Given the color scheme used in the article's legend, the bars for Douglas and Niagara share a similarly dark shade of orange, and given that the Niagara Falls numbers match, I cannot explain/account for said difference. I will not dwell on investigating the matter further, as the idea is to either replicate or create an output inspired by a piece that caught my eye, while remaining highly faithful to the fundamental message conveyed by said piece.<br>**Data Pull/Wrangling steps:** The whole point of replicating, or getting inspired by, this article's data and output was this very visualization - in a way this is the "pièce de résistance" - so it warranted a lot of work to replicate the below map, which it did! <br> On a high level, 4 main steps were crucial to replicate, or come close to replicating, said plot:- After pulling the raw data from Statistics Canada, the first step was to map out actual coordinates of border crossings. From the `{tidygeocoder}` package, I used the `geo` wrapper function to call the `arcgis` API, after appending a string that would help better zero in, literally, on the likely coordinates needed, via `str_c(' Canada US border')`. A bounded region had to be set, outside of which contending coordinates were not included. Said coordinates would lie along the Canada-US border, which includes to a significant extent the 49th parallel/latitude in the western/central regions, as well as various other boundaries through the Great Lakes and eastern/north-eastern regions.- I like the vertical fading into the "Great White North"; poetically adequate here. This entailed first measuring the distance between the center coordinates, called centroids (using `st_centroid()` from the `{sf}` package), and the points that diverge *continuously* from that center toward the top and/or bottom. This gradual distance was then linked to RGB codes that incrementally go from light grey to pure white, finally creating that gradient into pure white, `rgb(255, 255, 255)`.- Then triangles had to be displayed; I chose `st_polygon()`, also from `{sf}`, as it offers a high degree of customization where defaults fail to do so.
I needed very small bases and potentially very large heights (defaults shapes all offer variations of an equilateral triangle). The filling had to depict varying shades of the [that color]{style="background-color: #d65f00; color: white; font-weight: bold; padding: 4px 8px; border-radius: 4px;"} When the triangle heights would imply differences in % magnitude (downward triangles for negative - almost all of them, and upward ones for positive % change in no. of crossings). This did not utilize the 'fourth dimension' of the rgb color system, its alpha value; ranging from 0-1, as I needed a change in actual color, not color intensity.- Finally, the data callouts for Douglas (West bound) and Niagara (East bound) were displayed using a textboxes using `geom_textbox()` that were linked to their respective spikes through a quarter circle, the former traced/going from right to left (counter-clockwise), the latter from left to right.:::```{r}#|echo: false#|message: false#|warning: false#|include: falselibrary(ggplot2)library(tidyverse)library(janitor)library(rnaturalearth)library(tidygeocoder)library(sf)library(rnaturalearthdata)theme_set_custom()# steps to recreate last plot/map displaying drops in border crossings between Canada and US# tested it on 1 example/1 port with Only ports with at least 1,000 crossings in March 2025# and the results match the visual to a percentage + or -## go to below lins for 2024# https://www150.statcan.gc.ca/t1/tbl1/en/cv!recreate.action?pid=2410005301&selectedNodeIds=1D146,2D3,2D18,2D41,2D42,2D75,2D91&checkedLevels=2D1&refPeriods=20240301,20240301&dimensionLayouts=layout2,layout3,layout2,layout2&vectorDisplay=false## go to below link for 2025# https://www150.statcan.gc.ca/t1/tbl1/en/cv!recreate.action?pid=2410005301&selectedNodeIds=1D146,2D41,2D42&checkedLevels=2D1,2D2&refPeriods=20250301,20250301&dimensionLayouts=layout3,layout3,layout3,layout2&vectorDisplay=false# filters applied to test on Niagara Falls - Queenston Bridge# " GEOGRAPHY# Select specific levels only -> keep blank# Niagara Area -> check only Niagara Falls - Queenston Bridge# TRAVELLER CHARACTERISITCS# Select specific levels only -> keep blank# Canadian-resident visitors returning to Canada -> check only Canadian residents returning from the United States of America # TRAVELLER TYPE# Keep defaults (all checked; 'Travellers' is a the final rollup number/totl)# REFERENCE PERIOD# March 2024# # SAME EXACT STEPS FOR 2025# DOWNLOAD BOTH AS CSVs and select 'Download selected data (for database loading) "# (56331 - 103307) / 103307 # Niagara Falls - Rainbow Bridge# (90202 - 144738) / 144738 # Niagara Falls - Queenston Bridge# (0.3768 + 0.4547 ) / 2# let's try to recreate those figures now that we have loaded the full datasetsmap_data_ca_xings <-read.csv('2024_ca_us_ttl_xing_main_ports.csv') |># mutate(year = 2025) |> clean_names() |>bind_rows(read_csv('2025_ca_us_ttl_xing_main_ports.csv') |># mutate(year = 2024) |> clean_names() ) |>as_tibble()invisible( { map_data_ca_xings |>mutate(ref_year =word(ref_date, 1, sep ='\\-') ) |>filter( traveller_type =='Travellers'& geo %in%c('Niagara Falls - Queenston Bridge', 'Niagara Falls - Rainbow Bridge') &str_detect(traveller_characteristics, 'Canadian residents') ) |>count(ref_year, geo, crossings = value) |>summarise(perc_change_niagara_busiest_2ports = (sum(if_else(ref_year ==2025, crossings, 0)) -sum(if_else(ref_year ==2024, crossings, 0)) ) /sum(if_else(ref_year ==2024, crossings, 0)) ) # 41% decline - article says 42%; close enough; some values, especially for 2025 
may have been preliminary and may have been slightly adjusted })# let's create a function now that loops thru all major ports name and generate a port name by % decline (2 column tibble)port_name_list <- map_data_ca_xings |>pull(geo) |>unique()perc_change_by_port <-function(x, port_name){ map_data_ca_xings |>mutate(ref_year =word(ref_date, 1, sep ='\\-') ) |>filter( traveller_type =='Travellers'&str_detect(traveller_characteristics, 'Canadian residents') & geo %in% port_name ) |>count(ref_year, geo, crossings = value) |>filter(crossings >=1000) |>summarise(perc_change = (sum(if_else(ref_year ==2025, crossings, 0)) -sum(if_else(ref_year ==2024, crossings, 0)) ) /sum(if_else(ref_year ==2024, crossings, 0)),.by = geo ) |>filter(geo !='Canada'&!is.na(perc_change) &between(abs(perc_change), 0, 1)) |>mutate_if(is.numeric, function(x) round(x, 2))}port_names_with_perc_changes <-map_dfr(.x = port_name_list, ~perc_change_by_port(port_name = .x)) # Canada as a whole country, has seen 24% fewer passengers in March 2025 vs. March 2024# adding lat/longs to then be able to add as a alayer on the mapport_coords <-tribble(~geo, ~latitude, ~longitude,"Niagara Falls - Queenston Bridge", 43.1633, -79.0507,"Niagara Falls - Rainbow Bridge", 43.0896, -79.0703,"Niagara Falls - Whirlpool Bridge", 43.1051, -79.0645,"St. Stephen - 3rd Bridge", 45.1856, -67.2792,"St. Stephen - ferry and other locations", 45.2025, -67.2783,"St-Armand/Philipsburg", 45.0412, -73.0486,"Stanstead: Route 55", 45.0036, -72.0981,"Stanstead: Route 143", 45.0066, -72.1047,"Lacolle: Route 221", 45.0878, -73.3717,"Lacolle: Route 223", 45.0861, -73.3597,"St-Bernard-de-Lacolle: Highway 15", 45.0051, -73.3723,"Douglas", 49.0033, -122.7578,"Pacific Highway", 49.0038, -122.7382,"Abbotsford/Huntingdon", 49.0027, -122.2555)# not run# port_names_with_perc_changes |># left_join(# port_coords# ) |># summarise(# na_prop = 1 - mean(is.na(latitude))# ) # currently only a 10% match rate from online searches, so let's explore tidygeoder package to get more granular# and see if we can extend the matching to at least 60-70% then we can at least go from thereyet_to_match <- port_names_with_perc_changes |>left_join( port_coords ) |>filter(is.na(latitude)) |>pull(geo) |>str_c(' Canada US border')# writing an if statement to check if geocode_arcgis is already created a temp qs file, so that we avoid redundant api calls (every time we run/render the doc)cache_file <-"geocoded.qs"geocode_arcgis <-if (file.exists(cache_file)) {qread(cache_file)} else { geocode_arcgis <-map_dfr(yet_to_match, ~geo(address = .x, method ="arcgis"))qsave(geocode_arcgis, cache_file) geocode_arcgis}# this massively improved the matches- but let's see if those coordinates realisticall can be considered # border regions between CA and US # now onto actual building of plot elements that will go into final output#|echo: false#|message: false#|warning: false#|include: false#|eval: falselibrary(ggplot2)library(tidyverse)library(janitor)library(rnaturalearth)library(tidygeocoder)library(sf)library(rnaturalearthdata)theme_set_custom()invisible( { validate_border_crossings <-function(geocoded_df, max_km_from_49th =200) {# canada rough bounds canada_lat <-c(41.5, 84.0) canada_lon <-c(-141.0, -52.0)# simple distance to 49th parallel (main border) distance_to_49th <-function(lat) abs(lat -49.0) *111# ~111 km per degree latitude geocoded_df |># geocode_arcgis was here, changed to geocoded_dffilter(!is.na(lat), !is.na(long)) |>mutate(within_canada = lat >= canada_lat[1] & lat <= canada_lat[2] & long >= 
canada_lon[1] & long <= canada_lon[2],km_from_49th =distance_to_49th(lat),is_valid = within_canada & km_from_49th <= max_km_from_49th,status =case_when(!within_canada ~"outside canada bounds", km_from_49th > max_km_from_49th ~paste0("too far out (", round(km_from_49th), "km)"),TRUE~"valid" ) ) } validated_coords <-validate_border_crossings(geocode_arcgis, max_km_from_49th =300)# using above threshold to sorta account for US 'curvature' into Canada, especially on the North-East region,# as the border is not jsut one latitude/'horizontal' line valid_coords <- validated_coords |>filter(is_valid) |>select(geo = address, lat, long) # this has moved from 10 matches up to 96 out of 140 ~ 70% match rate# looking at no. of spikes in the article, 96 spikes should hopefully cover most if not all location of spikes in# the article, which from visual insepction, looks like around 50 are draw/plotted on the Canada/US map/border# combine all tibbles into one with perc_changes# spike_prep is defined but not explicitly used later, spike_prep_sf is the key one spike_prep <- port_names_with_perc_changes |>filter(geo !='British Columbia') |># filter out BC earlyleft_join( port_coords,join_by(geo == geo) ) |>left_join( valid_coords |>mutate(geo =str_remove(geo, ' Canada US border') ),join_by(geo == geo) ) |>mutate(latitude =coalesce(latitude, lat),longitude =coalesce(longitude, long) ) |>filter(!is.na(latitude)) |>select(geo:longitude) })## starting the work on the map# get map datainvisible({ countries <-ne_countries(scale ="medium", returnclass ="sf") %>%filter(name %in%c("United States of America", "Canada"))})invisible(capture.output({# get great lakes lakes <-ne_download(scale ="medium",type ="lakes",category ="physical",returnclass ="sf" ) %>%filter( name %in%c("Lake Ontario", "Lake Erie", "Lake Huron","Lake Michigan", "Lake Superior") ) }))# crop to focus area - remove alaskamap_bounds <-list(xmin =-125, xmax =-65,ymin =41, ymax =52)# filter out Alaska from US (longitude > -140 is roughly Alaska)continental_us_canada <- countries |>mutate(# create geometries without Alaskageometry =case_when( name =="United States of America"~st_crop( geometry, xmin =-130, ymin =20, xmax =-60, ymax =50 ),TRUE~ geometry ) ) |># remove any empty geometries after croppingfilter(!st_is_empty(geometry))# combine all tibbles into one with perc_changesspike_prep_sf <- port_names_with_perc_changes |>filter(geo !='British Columbia') |># filter out "British Columbia" hereleft_join( port_coords,join_by(geo == geo) ) |>left_join( valid_coords |>mutate(geo =str_remove(geo, ' Canada US border') ),join_by(geo == geo) ) |>mutate(latitude =coalesce(latitude, lat),longitude =coalesce(longitude, long) ) |>filter(!is.na(latitude)) |>select(geo:longitude) |># Make sure perc_change is carried through if it exists in port_names_with_perc_changesst_as_sf(coords =c("longitude", "latitude"),crs =4326, agr ="constant",remove =FALSE )# generating regions for easier filtering of spikescrossing_data <- spike_prep_sf |># This now uses the BC-filtered datafilter(abs(perc_change) <= .50& longitude >=-122.76&#exclude far west/Vancouver area and start at Douglas!geo %in%c('Kelowna', 'Calgary Area', 'Calgary', 'Regina', '') ) |>mutate(perc_change =if_else(between(perc_change, -.39, -.23), perc_change /10, perc_change) ) |># filter to match article criteria - only significant declinesfilter(abs(perc_change) <= .45 ) %>%# add regional grouping for better selectionmutate(region =case_when( longitude <-115~"west_coast", longitude >=-115& longitude 
<-100~"prairie", longitude >=-100& longitude <-85~"central", longitude >=-85~"east",TRUE~"other" ),normalized_heights =if (n() >0&&any(!is.na(perc_change))) abs(perc_change) /max(abs(perc_change), na.rm =TRUE) else0# handle empty or all NA cases )# select representative spikes by region to match article count (~40-50 total)selected_spikes <- crossing_data |>filter(latitude <=49.5& longitude <-65) |>group_by(region) |>arrange(perc_change) |># most negative firstmutate(no =as.integer(case_when( region =="west_coast"~12, region =="prairie"~8, region =="central"~6, region =="east"~15,TRUE~3) ) ) |>ungroup() |>slice(.by = region, 1:no[1]) |>select(-no) |>bind_rows(# pull select positive changes crossing_data |>filter(perc_change >0) |>filter(longitude <-65) )# transform points to projected coordinate system; original is more elliptic in shape rather than a 'straight' planeselected_spikes_projected <- selected_spikes |># This is now BC-filtered# filter(geo != 'British Columbia') |> # already filteredst_transform(crs ="+proj=aeqd +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0")# extract projected coordinatesselected_spikes_coords <- selected_spikes_projected |>mutate(proj_x =st_coordinates(geometry)[, 1],proj_y =st_coordinates(geometry)[, 2] ) |>st_drop_geometry()# create triangles using projected coordinatescreate_projected_triangles <-function(data, height_scalar =1) {# height_mult <<- height_scalar # Avoid using global assignment if possible local_height_mult <- height_scalar# function to create a triangle for each point using projected coordinates create_triangle <-function(x, y, perc_change, normalized_heights, current_height_mult) {# base width and height in projected units (meters) base_width <-8e4# 80km base width height <-abs(normalized_heights) * current_height_mult *4e5# scale height based on normalized perc change# triangle vertices - purely vertical orientationif (perc_change <0) {# downward pointing triangle for vast majority of cases vertices <-data.frame(x =c(x - base_width, x + base_width, x, x - base_width),y =c(y, y, y - height, y) ) } else {# upward pointing triangle; only for a couple of instances vertices <-data.frame(x =c(x - base_width, x + base_width, x, x - base_width),y =c(y, y, y + height, y) ) }# create polygon geometryst_polygon(list(as.matrix(vertices))) }# create triangles for each point triangles <- data |>rowwise() |>mutate(triangle_geom =list(create_triangle(proj_x, proj_y, perc_change, normalized_heights, local_height_mult)) ) |>ungroup()# convert to sf object with triangle geometries triangle_sf <- triangles |>mutate(geometry =st_sfc(triangle_geom, crs ="+proj=aeqd +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0") ) |>st_sf()return(triangle_sf)}# create triangle geometries using projected coordinates# ensure selected_spikes_coords is not empty before callingtriangle_sf <-create_projected_triangles(selected_spikes_coords, height_scalar =2.5)# add no. of crossing as well to base color fill on, low raw no. of crossings/change get ligher orange color, vs. darker for higher ones# so spike height becomes a factor % change while color that of the raw no. 
change# create the crossings data with error checkingcrossings_data_for_join <- map_data_ca_xings |>mutate(ref_year =word(ref_date, 1, sep ='\\-') ) |>filter( traveller_type =='Travellers'&str_detect(traveller_characteristics, 'Canadian residents') ) |>reframe(crossings_raw_change =sum(if_else(ref_year =="2025", value, 0)) -# ensure year is character for comparison if ref_year is charactersum(if_else(ref_year =="2024", value, 0)),.by = geo ) |>filter(!is.na(crossings_raw_change))# perform the join with error checkingtriangle_sf <- triangle_sf |>left_join(crossings_data_for_join, by ="geo") |># handle any missing valuesmutate(crossings_raw_change =coalesce(crossings_raw_change, 0) ) |># normalize raw changes for alpha scalingmutate(abs_raw_change =abs(crossings_raw_change),# normalize to 0-1 range for alpha values, with safeguard for division by zeromax_abs_change =if (n() >0&&any(!is.na(abs_raw_change))) max(abs_raw_change, na.rm =TRUE) else0,normalized_raw_change =if_else( max_abs_change >0, abs_raw_change / max_abs_change,0),# create alpha values: min 0.2 for very small changes, max 0.8 for largest changesalpha_value =0.2+ (normalized_raw_change *0.8) ) |>select(-max_abs_change)# transform countries firstcountries_proj <- countries |>st_transform(crs ="+proj=aeqd +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0")lakes_proj <- lakes |>st_transform(crs ="+proj=aeqd +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0")# get the bounding box and expand it significantly for better fade effectbbox <-st_bbox(countries_proj)x_range <- bbox[3] - bbox[1]y_range <- bbox[4] - bbox[2]# expand bbox by 50% to create more fade areabbox_expanded <-c(xmin = bbox[1] -0.5* x_range,ymin = bbox[2] -0.5* y_range,xmax = bbox[3] +0.5* x_range,ymax = bbox[4] +0.5* y_range)# create a much finer grid for smoother gradientgrid_resolution <-400# increased for even smoother gradientx_seq <-seq(bbox_expanded[1], bbox_expanded[3], length.out = grid_resolution)y_seq <-seq(bbox_expanded[2], bbox_expanded[4], length.out = grid_resolution)# create vertical gradient for within-country areasgradient_data <-expand.grid(x = x_seq, y = y_seq) |>mutate(# convert to sf points to check intersection with countriesgeometry =st_sfc(map2(x, y, ~st_point(c(.x, .y))), crs =st_crs(countries_proj)) ) |>st_sf() |># check which points are within countriesmutate(within_countries =lengths(st_intersects(geometry, countries_proj)) >0 ) |>st_drop_geometry() |>mutate(# calculate vertical position for gradient (center = 0)y_center =0, # center of our projectiony_range_total = bbox_expanded[4] - bbox_expanded[2],# normalize y position from center (-1 to 1, where 0 is center)y_normalized = (y - y_center) / (y_range_total *0.5),# create vertical gradient: darker at center, lighter toward top/bottom# distance from horizontal center linedist_from_horizontal_center =abs(y_normalized),# apply smooth transition functionvertical_fade =pmin(dist_from_horizontal_center^0.6, 1), # the higher the exponent, the 'faster' the fade; away from center, capts at 1# create color values: #e0e0e0 at center fading to white at edges# convert #e0e0e0 (224, 224, 224) to white (255, 255, 255)red_value =224+ (255-224) * vertical_fade,green_value =224+ (255-224) * vertical_fade,blue_value =224+ (255-224) * vertical_fade,# create hex colorcolor_hex =rgb(red_value, green_value, blue_value, maxColorValue =255),# only apply gradient within countries, white elsewherefinal_color =if_else(within_countries, color_hex, "white"),# alpha - fully opaque within countries, transparent outsidealpha_value 
=if_else(within_countries, 1.0, 0.0) # this is for gradient_data alpha )# create a mask for areas within the bounding boxbbox_polygon <-st_polygon(list(rbind(c(bbox_expanded[1], bbox_expanded[2]),c(bbox_expanded[3], bbox_expanded[2]),c(bbox_expanded[3], bbox_expanded[4]),c(bbox_expanded[1], bbox_expanded[4]),c(bbox_expanded[1], bbox_expanded[2])))) |>st_sfc(crs =st_crs(countries_proj))# create the main plot with vertical gradient within countriesmain_plot <-ggplot() +# add white backgroundgeom_raster(data = gradient_data,aes(x = x, y = y),fill ="white") +# add the vertical gradient within countries onlygeom_raster(data = gradient_data |>filter(within_countries),aes(x = x, y = y, fill = final_color, alpha = alpha_value)) +# alpha here is for gradient# use identity scales for the custom colorsscale_fill_identity() +scale_alpha_identity() +# this will apply to gradient's alpha, and later legend rect's alpha# add country boundariesgeom_sf(data = countries_proj,fill =NA, color ="white", linewidth =0.3) +# Great Lakesgeom_sf(data = lakes_proj,fill ="white", color ="white", linewidth =0.1) +# triangle spikes with alpha based on raw change magnitudegeom_sf(data = triangle_sf, aes(fill =if_else(perc_change <0, I('#d65f00'), I('#2b9d6c'))), size =0.8,alpha =pmin(triangle_sf$alpha_value, .7) ) +scale_fill_identity() +# triangle color scale# scale_color_manual(# values = c("negative" = "#d65f00", "positive" = "#2b9d6c"),# guide = "none"# ) +# coordinate systemcoord_sf(xlim =c(-4e6, 4e6),ylim =c(-1.75e6, 2.1e6),expand =FALSE,clip ='off' ) +# clean themetheme_void() +theme(plot.title =element_markdown(hjust =0.5, size =14, family ='franklin', margin =margin(t =-10, b =10)),plot.background =element_rect(fill ="white", color =NA),panel.background =element_rect(fill ="white", color =NA),legend.position ='none'# ggplot's auto legend is off, we are making a manual one )# function to create the 1/4 circles for Douglas and Niagara Falls calloutsdraw_quarter_circles <-function(start, end, radius =400e3, x_offset =0, y_offset =0){# get angles and coords for a quarter of a acircle angles <-seq(start, end, length.out =1000) # reduced length.out for efficiency# radius <- radius # already an argument x_coords <- radius *cos(angles) y_coords <- radius *sin(angles)# test with different offesets to match article's callout positioning x_coords_with_offset <- x_coords + x_offset y_coords_with_offset <- y_coords + y_offset# assign values to a tibble quarter_circle_data <-tibble(x = x_coords_with_offset, y = y_coords_with_offset)# draw said 1/4 circles path <-geom_path(data = quarter_circle_data, aes(x = x, y = y) # simplified aes )return(path)}theme_set_custom() # draw 1/4 circlesmain_plot_with_qrt_circles <- main_plot +geom_text(aes(x =-150e3,y =-550e3, label ='UNITED STATES'),family ='franklin',size =6,color ='#a3a3a3' ) +geom_text(aes(x =-200e3,y =950e3, label ='CANADA'),family ='franklin',size =6,color ='#a3a3a3' ) +# douglas 1/4 circledraw_quarter_circles(start = pi, end =3* pi/2, x_offset =-150e3*11.2, y_offset =-550e3+100e4) +# niagara 1/4 circledraw_quarter_circles(start =0, end = pi/2, x_offset =-150e3*-11.5, y_offset =-550e3-50e4)call_outs_tibble <-tibble(label =c("<b>~90,000 fewer<br> people crossed at<br> Douglas</b> in March<br>2025, a 43<br>percent decline.","<b>Nearly 102,000 fewer <br> people crossed at <br>Niagara Falls,</b> a 41 <br> percent decline." 
),x =c(-150e3*15, -150e3*-14),y =c(-550e3+100e4, -550e3-75e4),hjust =c(.5, 0.05),vjust =c(0, 0.6) )# add legend elements to the plotlegend_box <-pmin( triangle_sf$alpha_value, .7 ) |>sort() |>enframe() |>mutate(normalized_values_01 = (value -min(value)) / (max(value) -min(value)),color_bins =case_when( normalized_values_01 <= .25~'#e3cabd-10', normalized_values_01 <= .50~'#e0b69e-20', normalized_values_01 <= .75~'#db956a-30', normalized_values_01 <=1~'#d76c21-40%',TRUE~'#c7d8d0' ),labels =paste0('-', word(color_bins, -1, sep ='-')),color_bins =word(color_bins, 1, sep ='-') ) |>select(color_bins, labels) |>distinct() |>ggplot(aes(x = labels, y = .5, fill =I(color_bins))) +geom_tile() +labs(x =NULL, y ='Percentage change vs. last year' ) +# ggtitle(label = '<b>Big drops along the border<b>') +theme(axis.title.y =element_text(angle =0, size =7, vjust = .5, face ='bold'),axis.text.x =element_text(size =7, face ='bold'),# vjust = .5, hjust = -1, axis.text.y =element_blank(),plot.background =element_rect(fill ="white", color =NA),panel.background =element_rect(fill ="white", color =NA),panel.grid.major.x =element_blank(),panel.grid.major.y =element_blank(),legend.position ='none',plot.margin =margin(t =8) )# plot calloutsfinal_plot <- main_plot_with_qrt_circles +geom_textbox(data = call_outs_tibble,aes(x = x, y = y, label = label, hjust = hjust, vjust = vjust),family ='Libre Franklin',width =unit(0.2, "npc"),fill =NA,size =c(3, 2.5),color ='black',box.colour =NA,orientation ="upright" ) # plot legend box along with its (legend) namefinal_plot_with_legend <- final_plot +annotation_custom(grob =ggplotGrob(legend_box),xmin =-2e6, xmax =3e6,ymin =2.5e6, ymax =2.8e6 )# plot final with title using geom_text() to avoid running into margin/rendering erros + adding caption to plotfinal_plot_with_legend +geom_text(aes(x =0, y =2.85e6, label ="Big drops along the border"),family ='franklin',size =3.5,fontface ='bold',color ='black',hjust =0.3,vjust =0.5 ) +labs(caption ="<span style='color:#707070; text-align: justify; width: 100%; display: block; padding-top: 25px;'>Source: Statistics Canada Note: The height of each spike reflects the change in passenger numbers this<br>March versus the same month last year. Only ports with at least 1,000 crossings in March 2025 are shown.</span>" ) +theme(plot.caption =element_markdown(margin(t =50), size =5, hjust = .5) ) ```Overall, I found this article powerful, clear, and its authors original in their use of data, so truly an enjoyable experience to have worked on 'deconstructing' the article backend to match the outputs as closely as I could.. thank you for taking the time to read this and more to come !