This repository includes data and code to accompany the manuscript 'Topography drives variability in recent circumpolar permafrost thaw pond expansion' by Abolt et al. The data include satellite imagery and derived maps of thermokarst pools from twenty-seven survey areas in North America and Siberia. The code, written in MATLAB (R2021a), demonstrates the workflow for generating the maps. The demonstrations cover four tasks: training a generalized UNet for mapping thermokarst pools using data from three survey areas, 'fine tuning' the UNet for use at a specific survey area using transfer learning, applying a trained UNet to infer thermokarst pool extent within satellite imagery, and performing histogram matching as a pre-processing step to improve satellite imagery contrast.
If you wish only to use the satellite imagery and final maps, all relevant data are stored in the directory 'final_products'. Data are organized by survey area and date. All data are projected to the local UTM coordinate system. Within 'final_products', the subdirectory 'imagery' contains panchromatic satellite imagery from the WorldView satellites at 50 cm horizontal resolution. Maps of thermokarst pool extent are presented in two formats. The subdirectory 'pools_raster' contains binary maps, also at 50 cm horizontal resolution, in which thermokarst pools are labeled 'true'. The subdirectory 'pools_vector' contains the same maps in vector format. The subdirectory 'masks' contains binary images labeling clouds (where present in the satellite imagery) and land (for the two survey areas that include seawater). These masks were used to calculate the percentage of land covered by thermokarst pools. All rasters in 'final_products' are in geotiff format, and all vector data are shapefiles.
Finally, 'final_products' contains the tables 'PoolExtent.xlsx' and 'PoolExtent_all_areas.csv', which hold time series of thermokarst pool extent at each survey area, expressed as a fraction of the total land surface.
If you wish to explore the workflow for mapping thermokarst pools, you must first run the script 'UnpackData.m'. This script segments the data from 'final_products' into one-square-kilometer tracts, on which the workflow was designed to operate. Satellite imagery is segmented into tracts with 14 m (28 pixel) buffers on all sides to prevent edge effects during inference of thermokarst pools. These data are stored in the directory 'workspace\imagery'. Final maps of thermokarst pools are also segmented into one-square-kilometer tracts, without a buffer, and stored in 'workspace\results\post_CRF'. Each tract is named for the UTM coordinates, in kilometers, of the lower left corner of its spatial extent. For example, the tract '430_7746' at Prudhoe Bay extends from 430000 to 431000 m easting and from 7746000 to 7747000 m northing in the local UTM system.
After executing 'UnpackData.m', you may explore several functions in the directory 'workspace'. The script 'MappingWorkflowDemo.m' demonstrates the use of the functions 'getSites.m', 'getTracts.m', 'getDates.m', 'applyUNet.m', 'applyCRF.m', and 'overlayPools.m'. This demonstration maps and visualizes thermokarst pools in one tract of the Prudhoe Bay survey area using a pre-trained UNet. Note that the function 'applyCRF.m' relies on code for executing a fully connected conditional random field from Krahenbuhl and Koltun (2011). To use this function, first unzip 'include.zip', found at 'workspace\crf'. The script 'HistogramMatchingDemo.m' demonstrates the histogram matching procedure used to improve the consistency of satellite image contrast, using the Anaktuvuk survey area as an example.
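The tract naming convention and buffer described above can be illustrated with a short sketch. It is written in Python purely for illustration and is not part of the repository's MATLAB code. The names 'tract_bounds' and 'tract_width_pixels' are hypothetical, and the buffered image width is an assumption that follows from the stated 50 cm resolution and 14 m buffer, not a value taken from the data.

```python
from typing import NamedTuple

PIXEL_SIZE_M = 0.5    # imagery resolution stated in this README (50 cm)
TRACT_SIZE_M = 1000   # tracts are one kilometer on a side
BUFFER_M = 14         # buffer applied to imagery tracts (28 pixels)


class Bounds(NamedTuple):
    easting_min: int
    easting_max: int
    northing_min: int
    northing_max: int


def tract_bounds(name: str, buffer_m: int = 0) -> Bounds:
    """Return UTM bounds in meters for a tract named '<easting km>_<northing km>'.

    The name gives the lower left corner of the unbuffered tract.
    """
    easting_km, northing_km = (int(part) for part in name.split("_"))
    e0 = easting_km * 1000
    n0 = northing_km * 1000
    return Bounds(
        e0 - buffer_m,
        e0 + TRACT_SIZE_M + buffer_m,
        n0 - buffer_m,
        n0 + TRACT_SIZE_M + buffer_m,
    )


def tract_width_pixels(buffer_m: int = 0) -> int:
    """Width of a square tract in pixels, including the buffer on both sides."""
    return int((TRACT_SIZE_M + 2 * buffer_m) / PIXEL_SIZE_M)


# Unbuffered tract, as used for the final pool maps.
print(tract_bounds("430_7746"))
# Buffered tract, as used for the imagery.
print(tract_bounds("430_7746", BUFFER_M))
print(tract_width_pixels(), tract_width_pixels(BUFFER_M))
```

Under these assumptions, an unbuffered pool map tract is 2000 pixels wide, and a buffered imagery tract is 2056 pixels wide (2000 plus 28 on each side).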
The subdirectory 'workspace\training_data' contains data and code for training a neural network from scratch and for performing transfer learning. Run the script 'TrainUNet.m' to train a generalized UNet using imagery from Prudhoe Bay, Kuparuk, and Wrangel Island, and run 'TransferLearning.m' to fine tune the generalized UNet for performance at Anaktuvuk. These scripts train neural networks using samples of satellite imagery stored in the subdirectory 'workspace\training_data\imagery' and corresponding labeled images stored in the subdirectory 'workspace\training_data\pools'.
Please contact Chuck Abolt (permanent email: chuck.abolt@gmail.com) with any questions regarding the use of these data.