Migrating from Trello to Airtable: Working with JSON Data in R
Tutorial on how to migrate your current Trello board to an Airtable base from start to finish.
By Silvia Canelón in R Tutorial
May 12, 2020
Airtable is a user-friendly and powerful tool that until recently I’d been using for personal projects (i.e. document organizing, apartment hunting, etc.). A couple of weeks ago I leaned on Airtable to create a base designed for the Philadelphia Reproductive Freedom Collective to support our COVID-19 mutual aid efforts.
Shout out to the Silicon Valley Democratic Socialists of America who provided inspiration and support! Their mutual aid network is extensive and operates at an impressive scale. Get connected with the South Bay Mutual Aid network here: https://www.aidnetwork.org/
Having fallen in some Airtable deep-work I figured maybe it was time to retire my Trello boards in favor of some task bases. Airtable accepts CSV files from Trello but, alas, my free Trello account only gave me the option to print as a PDF or export as JSON. I decided this would be a good opportunity to learn how to parse JSON data and export it as a CSV ready for import to Airtable.
What to expect
The code provided in the R Markdown file requires the tidyverse
, JSONlite
, and lubridate
packages as well as a JSON file containing your Trello data. It will generate the following:
tidy_cards.csv
file ready to import to Airtableattachments
folder containing any content you’ve attached to your cards using a URL.attachment_errors.csv
file containing the card name, or task, as well as the URL/file path of any attachments that failed to download (i.e. links to files stored locally), so that you exactly which links you’ll need to locate manually.
By the end of the tutorial you’ll have imported a CSV file into Airtable and configured Airtable to match your Trello board. Specifically, you’ll be able to migrate labels, lists, attachments, card names, card descriptions, card due dates, date last modified, and whether the cards were marked “complete” or archived.
What not to expect
- Rendered markdown from your card descriptions. Airtable does support markdown in its “long text” field type, but it doesn’t automatically render from imported text.
Choose your own adventure
- Download the R Markdown file linked above and run with it. Jump down to section Formatting in Airtable for pointers on how to get your Airtable base configured like your Trello board.
- Keep reading for a code-through of the R Markdown file, using the example featured in the picture above, with tips and resources along the way.
Follow along with me!
The Trello board used for this code-through is this
Program Management Template created by Soniya Ahuja. To download the Trello board data, go to Menu > More > Print and Export and then right-click on “Export as JSON” to save the JSON file in your working directory. Or save a little time and download the program-mgmt.json
file linked above .
You may also choose to read through Making Sense of Trello’s JSON Export.
Last but not least – we’ll be keeping things tidy, with the help of the tidyverse collection of R packages, to ensure each variable is in its own column, observations are in rows, and values are in cells. You can read more about tidy data in Chapter 12 of R for Data Science.
Importing JSON file
# clearing environment
rm(list = ls())
# loading libraries
library(jsonlite)
library(lubridate)
library(tidyverse)
trello <- stream_in(file("program-mgmt.json")) # produces a data frame
glimpse(trello)
A lot of information about the Trello board is contained in JSON data. We’ll be parsing information about the cards
as well as the lists
and then joining them together by a common identifier id
or idShort
.
Parsing information
Parsing information from JSON data was not something I was familiar with and found myself coming back to Kan Nishida’s post Working with JSON data in a very simple way time and time again throughout this process.
Cards
The first step is to extract information about the Trello cards
themselves. This information is contained within a list of data frames and requires flattening which makes the nested hierarchical data structure into a flatter structure by assigning each of the nested variables its own column as much as possible. Then, the most important variables are selected as cards_trim
before moving on to extracting label
information.
# selecting cards information
cards <- trello$cards # list of 1
# flattening
cards_flat <- flatten(cards) #list of 37
# tibble time
cards_flat_tbl <- as_tibble(cards_flat) # 32 obs of 37 variables
glimpse(cards_flat_tbl)
# selecting wanted variables
cards_trim <- cards_flat_tbl %>%
select(id, idShort, idList, dateLastActivity, name, desc, dueComplete, due,
labels, attachments, shortUrl, closed) %>%
arrange(desc(dateLastActivity))
Labels
Relevant information about the labels is selected and the unnest
function is used to flatten because labels
is a list of data frames. Again, I found
Kan’s post helpful here! Particularly for saving the label details as a character list, which is helpful later on. Once we get to Airtable it’ll be important that label information for each card
be structured as a simple list of words (i.e. label1, label2, label3).
We get close once the labels
are contained within a character list labelList
, but there are still “c"s and parentheses that need to be removed. String manipulation is something I’m still learning about so the code below is far from elegant!
# extracting labels details
labels_info <- cards_trim %>%
select(id, idShort, labels) %>%
unnest() %>% # no arguments because the nested items don't have names
rename(labelName = name) %>%
select(id, idShort, labelName) %>%
group_by(id, idShort) %>%
summarize(labelList = list(labelName)) %>%
mutate(labelList = as.character(labelList)) %>%
mutate(labelList_tidy = str_remove_all(labelList, pattern = "\"")) %>%
mutate(labelList_tidy = str_remove_all(labelList_tidy, pattern ="c\\(")) %>%
mutate(labelList_tidy = str_remove_all(labelList_tidy, pattern ="\\)")) %>%
unique()
knitr::kable(labels_info %>% head(n = 3L))
# joining back with main cards data frame
ct_labels <- left_join(cards_trim %>% select(-labels), labels_info %>% select(-labelList))
Attachments
The next step is to download all of the items attached to the cards onto a local folder. I found this StackOverflow post really helpful. When I tried this out on my own Trello board I also found that I couldn’t download the few attachments I had made from my local drive. This StackOverflow post helped me figure out how to flag and catch these download errors so that I could create a list of the urls with “attachment errors” that I could follow up with manually.
# expanding the attachment lists into separate url records
att_urls <- ct_labels %>%
select(idShort, attachments) %>%
unnest() %>%
select(idShort, url) %>%
mutate(url = as.character(url),
attachmentError = 'FALSE')
knitr::kable(att_urls %>% head(n = 3L))
# creating directory for attachments
dirAttachments <- "attachments/"
dir.create(dirAttachments)
# downloading urls and checking for errors using try()
for (i in 1:length(att_urls$url)){
locAttachments <-
paste(dirAttachments, "/", att_urls$idShort[i], "_", basename(att_urls$url[i]), sep = "")
step_to_try <- try(attachment_check <- download.file(att_urls$url[i], destfile = locAttachments))
if("try-error" %in% class(step_to_try)) {
cat("Error row: ", i, "\n", "Error message: ", step_to_try[1], sep = "")
att_urls$attachmentError[i] = 'TRUE'
}
}
The following selects the attachment records with errors, renames somes variables, and exports the data frame as a CSV.
# preparing data frame for export to CSV
attachment_errors <- att_urls %>%
filter(attachmentError == TRUE) %>%
rename(Task_Id = idShort, Attachment_URL = url, Attachment_Error = attachmentError)
# exporting to CSV
write.csv(attachment_errors, file = "attachment_errors.csv")
Aside: If you have a lot of attachments per card, you may want to create a directory folder for each card. This for loop will get you there – use it instead of the one above:
# creates individual directory folders for each card id
for (i in 1:length(att_urls$url)){
dirAttachments <- paste(dirFiles, "attachments", att_urls$idShort[i], sep = "/")
dir.create(dirAttachments) # creates directory for each unique card id
locAttachments <- paste(dirAttachments, basename(url[i]), sep = "/")
download.file(url[i], destfile = locAttachments)
}
Records in the main cards
data frame are labeled “TRUE” within the attachments
column if they have attachments and “FALSE” if they don’t.
# converts the attachment column to a categorical variable in the main cards+labels data frame
ct_labels <- ct_labels %>%
mutate(attachments = ifelse(idShort %in% att_urls$idShort, TRUE, FALSE))
Lists
The lists
information is extracted similarly to the cards
information, but flattening is a little more straightforward because it involves only one data frame. With more data frames, the unnest
function is a better choice.
# selecting lists information
lists <- trello$lists # list of 1 data frame
glimpse(lists)
# flattening
lists_flat <- lists[[1]] # 17 obs of 9 variables
# selecting wanted variables
lists_trim <- lists_flat %>%
select(id, name, closed) %>%
rename(idList = id, nameList = name, closedList = closed)
knitr::kable(lists_trim %>% head(n = 3L))
# joining back with main cards+labels data frame
ct_labels_list <- left_join(ct_labels, lists_trim) %>%
select(id:shortUrl, labelList_tidy:nameList, closed, closedList)
Exporting to CSV
Data prepping
Columns in the new ct_labels_list
data frame are given new names, and the lubridate
package is used next to convert the date fields.
This resource was helpful in understanding date conversions and formatting.
# changing variable names
tidy_cards <- ct_labels_list %>%
select(-id, -idList, -closedList) %>%
rename(Task = name, Task_ID = idShort, Notes = desc, Done = dueComplete, Date_Due = due,
Labels = labelList_tidy, Trello_List = nameList, Trello_Last_Modified = dateLastActivity,
Trello_Url = shortUrl, Trello_Attachments = attachments, Archived = closed) %>%
select(Task, Task_ID, Notes, Done, Date_Due, Labels, Trello_List, Trello_Last_Modified,
Trello_Url, Trello_Attachments, Archived) %>%
mutate(Trello_Last_Modified = as_datetime(Trello_Last_Modified, tz = ""),
Date_Due = as_datetime(Date_Due, tz = ""),
Done = ifelse(is.na(Date_Due) == TRUE, 'NA', Done)) # ensures only undone tasks assigned a due date get marked as "FALSE"
The last step before exporting the final data frame tidy_cards
is to check the unique number of tasks to make sure it matches the number of records in the data frame (i.e. one task per observation).
# determining the number of unique tasks
length(unique(tidy_cards$Task_ID))
# final look at tidy_cards
glimpse(tidy_cards)
Data exporting
write.csv(tidy_cards, file = "tidy_cards.csv")
Formatting in Airtable
Import CSV into Airtable as a new base. Add a base > Import a spreadsheet > CSV file. Upload the cards_tidy.csv
file. Pick a name, icon, and color for your brand new base!
More detailed instructions can be found in
this Airtable article.
Task
- Because we know each observation in our table is unique, we can copy and paste
Task
into the first column and hide/delete the original column.Task
is now the primary field.
Task_ID
- Convert
Task_ID
field type to “number”.
Notes
- Convert
Notes
field type to “long text” and enable rich text formatting. This gives us the option of using Markdown in the future, but sadly doesn’t automatically recognize fully formatted Markdown in the imported text.
Labels
-
Change
Labels
field type to “multiple select” so that it turns each item in each list into a label. -
Optional: Create a new
Projects
column next toLabels
and use the labels to guide you in creatingProject
labels/categories: Group the records byLabels
field and add toProjects
field as appropriate.I recommend creating an
NA
project from theNA
labels so that these tasks aren’t marked as “uncategorized” in theProjects
column. Having records with an “empty” assignment gets in the way whenever you want to group by that category. To that end, it’s helpful to group byProject
and make sure any “empty” records get assigned to the “NA”Project
. -
Delete from
Labels
any labels that were converted toProjects
and ungroup the records.
Trello_Lists
-
Convert
Trello_Lists
column field type into “single select”. This gives us the option of using the Kanban style we were used to in Trello.Every record should already be associated with a Trello_List.
-
If you want to replicate the Trello kanban layout, change the order of the single select options in
Trello_Lists
to match the order from left-to-right of your Trello board
Trello_URL
- Convert the
Trello_URL
field type to “URL” and then hide it if you don’t think you’ll reference it often.
Trello_Attachments
-
Convert
Trello_Attachments
field type to “single select” -
Create a new
Attachments
column with field type “attachment”. This is where you’ll upload your downloaded attachments. -
Filter your records by
Trello_Attachments
so only show “TRUE” results -
Sort records by
Task_ID
and simplify your view by temporarily hiding all columns except forTask_Name
(primary field),Task_ID
,Trello_Attachments
, and the newAttachments
column. -
Open your local
attachments
folder and drag and drop the files to their correspondingAttachments
field according to theirTask_ID
in the filename.Increase the height of the records for this step. It’ll make it easier to make sure you’re dragging and dropping to the correct record
-
If you encountered errors downloading some of your attachment URLs, now is the time to check your local
attachment_errors.csv
file for the URLs with errors during the download process. These are attachments you’ll have to find elsewhere and upload to theAttachments
field as needed. -
Remove the filter to your view and unhide any columns you wish to remain visible.
Done & Archived
-
Convert the
Done
andArchived
field types to “Checkbox” and it will automatically assign a “check” to all records marked “TRUE” and leave the ones marked “FALSE” or “NA” unchecked. So easy! -
There is no direct option to “archive” tasks that have been completed like you can do in Trello, but you can apply a filter to your table view to hide the tasks that are complete. To do this, set the filter so that the
Done
andArchived
fields are unchecked.This must be repeated for each saved View of your records.
Moving forward
Dates
- Modify
Date_Due
andTrello_Last_Modified
field types to “Date” with time. - You can sort the records by
Trello_Last_Modified
if that’s helpful, but otherwise you can hide the column and keep it for historical reference. - Create a new column
Last_Modified
with field type “Last modified time” and select all columns you want to track changes to on a date/time basis moving forward.
Record Views
- Select Kanban from the Views options and group by
Trello_List
to see your tasks similar to how you saw them in Trello, complete with attachment covers! A bonus is that if you have multiple images attached to a card, you can view them without expanding the card by just hovering over the attachment cover! - Move, collapse, and delete stacks as you see fit. Customize cards with as little or as much information as you want.
Tasks vs Subtasks
There are probably many ways to parallel the checklist option Trello gives you within a card.
- The most straightforward is to use the basic checklist formatting within the
Description
field to create lists - Another is to think of your primary field
Tasks
instead as ‘subtasks’ and create a new column to serve as the umbrella ‘task’. This new ‘task’ column would be field type “single selection”, then you could group your records by ‘task’.