This week’s TidyTuesday dataset contains information about palm trees. I decided to make a table about this data using the Pandas and great_tables Python packages. The table counts the number of palm tree species by fruit color and gives some statistics about fruit width.
First, let’s import the packages needed for this analysis.
import pandas as pd
import great_tables
Then we download the data from GitHub into a Pandas dataframe.
base_url = "https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2025/2025-03-18"
df = pd.read_csv(f"{base_url}/palmtrees.csv", encoding="windows-1252")
# Count species with no reported main fruit color. The table's source note
# uses n_missing; this definition is an assumption about how it was computed.
n_missing = df["main_fruit_colors"].isna().sum()
The dataset includes information about 2,557 palm tree species. Of these, 758 do not report a main fruit color.
Next we wrangle the data into the desired tabular format. This is done by creating one table that holds information about all species for each fruit color, and then joining that table with a second table that holds information about one randomly sampled species for each fruit color. There is also some data cleaning that makes the text nicer for the desired table.
# Create one row for each fruit color and remove missing values
df_base = (
    df.assign(fruit_color=df["main_fruit_colors"].str.split("; "))
    .explode("fruit_color")
    .dropna(subset=["main_fruit_colors"])
)
# Compute summary count and width statistics by fruit color
df_all_species = df_base.groupby("fruit_color").agg(
    n=("spec_name", "size"),
    min_average_fruit_width_cm=("average_fruit_width_cm", "min"),
    max_average_fruit_width_cm=("average_fruit_width_cm", "max"),
)
# Sample name and width of one species per fruit color
df_one_species = (
    df_base[["fruit_color", "spec_name", "average_fruit_width_cm"]]
    .groupby("fruit_color")
    .sample(1, random_state=1)
)
# Join dataframes and clean up for final table presentation
df_table = (
    df_all_species.merge(df_one_species, on="fruit_color")
    .reset_index(drop=True)
    .assign(
        fruit_color=lambda x: x["fruit_color"]
        .str.capitalize()
        .replace("Straw-coloured", "Straw"),
    )
    .sort_values("n", ascending=False)
)
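To see what the split-and-explode step and the named aggregation do in isolation, here is a minimal sketch on a made-up three-row dataframe. The species names and widths are hypothetical and are not taken from the palm trees dataset; only the column names mirror the real data.

```python
import pandas as pd

# Made-up data for illustration only (not from the palm trees dataset)
toy = pd.DataFrame(
    {
        "spec_name": ["Species A", "Species B", "Species C"],
        "main_fruit_colors": ["red; black", "red", None],
        "average_fruit_width_cm": [1.0, 2.5, 4.0],
    }
)

# Split the delimited string into a list, give each list element its own row,
# and drop species that report no fruit color at all
toy_long = (
    toy.assign(fruit_color=toy["main_fruit_colors"].str.split("; "))
    .explode("fruit_color")
    .dropna(subset=["main_fruit_colors"])
)

# Named aggregation: each keyword argument becomes one output column
toy_summary = toy_long.groupby("fruit_color").agg(
    n=("spec_name", "size"),
    max_average_fruit_width_cm=("average_fruit_width_cm", "max"),
)
print(toy_summary)
```

Species A is counted under both red and black, while Species C is dropped because it has no reported color. This is why the per-color counts in the final table can sum to more than the number of species with a reported color.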
Now we create the desired table with the great_tables package.
(
    great_tables.GT(df_table)
    .tab_header(
        title="Palm Tree Fruit Characteristics",
        subtitle="A guide for relating fruit size to fruit color",
    )
    .tab_spanner(
        label="Across all Species",
        columns=[
            "n",
            "min_average_fruit_width_cm",
            "max_average_fruit_width_cm",
        ],
    )
    .tab_spanner(
        label="Sample Species",
        columns=[
            "spec_name",
            "average_fruit_width_cm",
        ],
    )
    .cols_label(
        spec_name="Species Name",
        fruit_color="Fruit Color",
        n="Number of Species",
        average_fruit_width_cm="Average Fruit Width (cm)",
        min_average_fruit_width_cm="Min Average Fruit Width (cm)",
        max_average_fruit_width_cm="Max Average Fruit Width (cm)",
    )
    .fmt_number(
        columns=[
            "average_fruit_width_cm",
            "min_average_fruit_width_cm",
            "max_average_fruit_width_cm",
        ],
        decimals=2,
        use_seps=False,
    )
    .tab_source_note(
        source_note="TidyTuesday: 2025, week 11 | PalmTraits 1.0 Database."
    )
    .tab_source_note(
        "Note, some species can have multiple fruit colors, and "
        f"{n_missing} species have no reported main fruit color."
    )
    .opt_row_striping()
    # Save table as an image for the blog listing, also shows the table
    .save("./image.png")
)
Palm Tree Fruit Characteristics
A guide for relating fruit size to fruit color

| Fruit Color | Across all Species | | | Sample Species | |
|---|---|---|---|---|---|
| | Number of Species | Min Average Fruit Width (cm) | Max Average Fruit Width (cm) | Species Name | Average Fruit Width (cm) |
| Red | 501 | 0.21 | 11.00 | Bactris schultesii | 0.85 |
| Brown | 484 | 0.40 | 20.00 | Dypsis coursii | 2.00 |
| Black | 462 | 0.40 | 20.00 | Pinanga auriculata | 0.85 |
| Orange | 265 | 0.40 | 15.50 | Bactris killipii | 0.90 |
| Yellow | 206 | 0.35 | 6.00 | Daemonorops macroptera | 1.20 |
| Green | 195 | 0.30 | 14.00 | Calamus erinaceus | 1.00 |
| Purple | 175 | 0.40 | 20.00 | Burretiokentia koghiensis | 1.05 |
| White | 87 | 0.30 | 5.00 | Pinanga albescens | 5.00 |
| Pink | 36 | 0.20 | 3.20 | Pinanga annamensis | 1.20 |
| Straw | 22 | 0.60 | 3.17 | Calamus symphysipus | 0.65 |
| Blue | 19 | 0.47 | 2.80 | Geonoma triandra | 0.55 |
| Cream | 11 | 0.50 | 1.30 | Calamus vestitus | 1.16 |
| Grey | 10 | 0.47 | 2.00 | Licuala orbicularis | 1.00 |
| Ivory | 9 | 0.60 | 4.00 | Calamus psilocladus | 0.80 |

TidyTuesday: 2025, week 11 | PalmTraits 1.0 Database.
Note, some species can have multiple fruit colors, and 758 species have no reported main fruit color.
We see that red is the most common fruit color among palm tree species and that ivory is the least common. Brown, black, and purple fruit share the largest maximum average fruit width of 20 cm, while cream-colored fruit have the smallest maximum average fruit width of 1.3 cm. The minimum average fruit widths vary far less across fruit colors (0.20 to 0.60 cm) than the maximum average fruit widths (1.30 to 20.00 cm).
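As a quick sanity check on the comparison of minimum and maximum widths, the sketch below recomputes the spread of each column from the values printed in the table above. The numbers are copied by hand from that table, and the names `widths`, `min_spread`, and `max_spread` are introduced here purely for illustration.

```python
import pandas as pd

# Min and max average fruit widths (cm) per color, copied from the table above
widths = pd.DataFrame(
    {
        "fruit_color": [
            "Red", "Brown", "Black", "Orange", "Yellow", "Green", "Purple",
            "White", "Pink", "Straw", "Blue", "Cream", "Grey", "Ivory",
        ],
        "min_width": [
            0.21, 0.40, 0.40, 0.40, 0.35, 0.30, 0.40,
            0.30, 0.20, 0.60, 0.47, 0.50, 0.47, 0.60,
        ],
        "max_width": [
            11.00, 20.00, 20.00, 15.50, 6.00, 14.00, 20.00,
            5.00, 3.20, 3.17, 2.80, 1.30, 2.00, 4.00,
        ],
    }
)

# Spread (largest minus smallest value) of each column across fruit colors
min_spread = widths["min_width"].max() - widths["min_width"].min()
max_spread = widths["max_width"].max() - widths["max_width"].min()
print(f"Spread of minimum widths: {min_spread:.2f} cm")
print(f"Spread of maximum widths: {max_spread:.2f} cm")

# Colors that share the largest maximum average fruit width
largest = widths.loc[widths["max_width"] == widths["max_width"].max(), "fruit_color"]
print(largest.tolist())
```

The minimum widths span roughly 0.4 cm across colors, while the maximum widths span roughly 18.7 cm.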
Overall, I suspect this table can be made nicer with additional styling, such as adding a border or rearranging columns, but this was only meant to be a quick analysis so I’ll leave it here for now!