API reference

class WholeSlideImage(path, *, attributes=None, hdf5_file=None)
__init__(path, *, attributes=None, hdf5_file=None)

WholeSlideImage object for handling WSI.

Parameters:
path: Path

Path to WSI file or URL. If URL is given, the file will be downloaded to a temporary directory in the filesystem.

attributes: dict[str, tp.Any]

Optional dictionary with attributes to store in the object.

hdf5_file: Path

Path to file used to save tile coordinates (and images). Default is path.with_suffix(".h5").

Attributes:
path: Path

Path to WSI file.

attributes: dict[str, tp.Any]

Dictionary with attributes to store in the object.

name: str

Name of the WSI file.

wsi: openslide.OpenSlide

A handle to the low-level OpenSlide object.

hdf5_file: Path

Path to file used to save tile coordinates (and images).

level_downsamples: list[tuple[float, float]]

List of tuples with downsample factors for each level.

level_dim: list[tuple[int, int]]

List of tuples with dimensions for each level.

contours_tissue: list[np.ndarray]

List of tissue contours.

contours_tumor: list[np.ndarray]

List of tumor contours.

holes_tissue: list[np.ndarray]

List of holes in tissue contours.

target: None

Placeholder for target (e.g. label) for the WSI.

Returns:
WholeSlideImage

WholeSlideImage object.
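
The default file locations described in this reference are all derived from the slide path with with_suffix. The following is a minimal sketch of that derivation using only pathlib; the slide path is a hypothetical example, and the snippet does not call the WholeSlideImage class itself.

```python
from pathlib import PurePosixPath

# Hypothetical slide path, used only for illustration.
slide_path = PurePosixPath("/data/slides/sample.svs")

# Default HDF5 file: same directory and stem, suffix replaced with ".h5".
hdf5_file = slide_path.with_suffix(".h5")

# Default plot outputs documented for plot_segmentation and plot_tile_graph.
segmentation_png = slide_path.with_suffix(".segmentation.png")
tile_graph_png = slide_path.with_suffix(".tile_graph.png")

print(hdf5_file)         # /data/slides/sample.h5
print(segmentation_png)  # /data/slides/sample.segmentation.png
print(tile_graph_png)    # /data/slides/sample.tile_graph.png
```

Note that with_suffix only replaces the last suffix, so a file named sample.v2.svs would map to sample.v2.h5.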

Methods

__init__(path, *[, attributes, hdf5_file])

WholeSlideImage object for handling WSI.

get_thumbnail([level])

Get array representing a low resolution image of the whole slide image.

segment([method, params])

Segment the WSI for tissue and background.

plot_segmentation([output_file, ...])

Plot the segmentation of the WSI.

save_segmentation([hdf5_file, mode])

Save slide segmentation results to an HDF5 file.

load_segmentation([hdf5_file])

Load slide segmentation results from an HDF5 file.

tile([patch_level, patch_size, step_size, ...])

Tile the WSI.

plot_tile_graph([output_file])

Plot a graph of tile spatial proximity.

save_tile_images(output_dir[, ...])

Save tile images as individual files to disk.

has_tile_coords()

Check if the WSI has tile coordinates saved in its HDF5 file.

has_tile_images()

Check if the WSI has tile images in its HDF5 file.

has_tissue_contours()

Check if the WSI has tissue contours saved.

get_tile_coordinate_level_size([hdf5_file])

Retrieve level and size of tiles from HDF5 file.

get_tile_coordinates([hdf5_file])

Retrieve coordinates of tiles from HDF5 file.

get_tile_graph([query_type, max_dist])

Retrieve a graph of tile spatial proximity.

get_tile_images([hdf5_file, as_generator])

Get tile images from HDF5 file.

get_tile_polygons()

Retrieve polygons of tile bounds.

get_tile_tissue_piece()

Retrieve which tile overlaps which tissue contour.

inference([model, model_repo, device, ...])

Inference on the WSI using a pretrained model.

as_tile_bag(**kwargs)

Return a torch.utils.data.Dataset of tiles from the whole slide image.

as_data_loader([batch_size, with_coords, ...])

Return a data loader for the whole slide image.

as_torch_geometric_data([feats, coords, ...])

Return a torch_geometric.data.Data object for the whole slide image.

as_data_loader(batch_size=32, with_coords=False, tile_bag_kwargs={}, data_loader_kwargs={})

Return a data loader for the whole slide image.

Parameters:
batch_size: int

Number of images per batch in data loader. Default is 32.

with_coords: bool

Whether to include coordinates in data loader. Default is False.

tile_bag_kwargs: dict

Additional keyword arguments to pass to as_tile_bag().

data_loader_kwargs: dict

Additional keyword arguments to pass to torch.utils.data.DataLoader.

Returns:
torch.utils.data.DataLoader

as_tile_bag(**kwargs)

Return a torch.utils.data.Dataset of tiles from the whole slide image.

The keyword arguments customize the dataset, for example which transform functions are applied. See wsi.utils.WholeSlideBag for more details.

Parameters:
kwargs: dict

Additional keyword arguments to pass to wsi.utils.WholeSlideBag.

as_torch_geometric_data(feats=None, coords=None, model_name=None, data_loader_kws={})

Return a torch_geometric.data.Data object for the whole slide image.

Parameters:
feats: np.ndarray

Array of features. By default, features extracted for tiles using self.inference() with model_name are used.

coords: np.ndarray

Array of coordinates. By default, the tile coordinates produced by self.tile() are used.

model_name: str

Name of the model to use for inference.

data_loader_kws: dict

Additional keyword arguments to pass to torch.utils.data.DataLoader.

Returns:
torch_geometric.data.Data

get_thumbnail(level=None)

Get array representing a low resolution image of the whole slide image.

Parameters:
level: int

Which pyramid level to retrieve image at.

get_tile_coordinate_level_size(hdf5_file=None)

Retrieve level and size of tiles from HDF5 file.

By default uses the self.hdf5_file attribute, but can be overridden.

Parameters:
hdf5_file: Path

Path to HDF5 file containing tile coordinates.

Returns:
tuple[int, int]

Level and size of tiles.

get_tile_coordinates(hdf5_file=None)

Retrieve coordinates of tiles from HDF5 file.

By default uses the self.hdf5_file attribute, but can be overridden.

Parameters:
hdf5_file: Path

Path to HDF5 file containing tile coordinates.

Returns:
np.ndarray

Array of tile coordinates with shape (N, 2).

get_tile_graph(query_type='distance', max_dist=None)

Retrieve a graph of tile spatial proximity.

Parameters:
query_type: str

Type of query. Either "distance" or "knn".

max_dist: float

Maximum distance for distance-based queries. If None, use the tile size centered on tile centroids.

Returns:
np.ndarray

Array with edges of shape (2, N) where N is the number of edges.
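
To illustrate the (2, N) edge layout for a distance-based query, the sketch below builds edges between tiles whose coordinates lie within max_dist of each other. It is a plain-Python illustration, not the library's implementation: the function name distance_edges, the inclusive comparison, and the choice to keep both directions of each pair are assumptions. For equally sized tiles, distances between top-left corners equal distances between centroids.

```python
import math


def distance_edges(coords, max_dist):
    """Return [sources, targets] index lists for tile pairs within max_dist."""
    sources, targets = [], []
    for i, a in enumerate(coords):
        for j, b in enumerate(coords):
            # Skip self-loops; keep both directions of every pair (assumption).
            if i != j and math.dist(a, b) <= max_dist:
                sources.append(i)
                targets.append(j)
    return [sources, targets]


# Hypothetical 2 x 2 grid of 224-pixel tiles (top-left corners).
coords = [(0, 0), (224, 0), (0, 224), (224, 224)]
edges = distance_edges(coords, max_dist=224)
print(edges)  # [[0, 0, 1, 1, 2, 2, 3, 3], [1, 2, 0, 3, 0, 3, 1, 2]]
```

With max_dist equal to the tile size, each tile connects to its horizontal and vertical neighbours only; diagonal neighbours are farther apart (about 316.8 pixels here) and are excluded.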

get_tile_images(hdf5_file=None, as_generator=True)

Get tile images from HDF5 file.

By default it returns a generator, but this can be overridden to return all images as a single array with a batch dimension. By default uses the self.hdf5_file attribute, but can be overridden.

Parameters:
hdf5_file: Path

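Path to HDF5 file containing tile images.

as_generator: bool

Whether to return a generator (True, the default) or a single array with a batch dimension (False).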

Returns:
generator

Each element is an array with shape (3, H, W).

get_tile_polygons()

Retrieve polygons of tile bounds.

Returns:
List of tile shapely.Polygon objects.
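
As a rough illustration of what a tile bound looks like, the sketch below computes the four corners of a square tile from its top-left coordinate and size. The helper name tile_corners is hypothetical, and the sketch assumes the coordinate and the size are expressed in the same pixel reference frame; how the library handles pyramid levels is not covered here.

```python
def tile_corners(x, y, size):
    """Corners of a square tile, starting at the top-left and walking the outline."""
    return [
        (x, y),                # top-left
        (x + size, y),         # top-right
        (x + size, y + size),  # bottom-right
        (x, y + size),         # bottom-left
    ]


# Hypothetical tile at (224, 448) with a side of 224 pixels.
corners = tile_corners(224, 448, 224)
print(corners)  # [(224, 448), (448, 448), (448, 672), (224, 672)]
```

A corner list of this form is the kind of coordinate sequence that shapely.geometry.Polygon accepts.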

get_tile_tissue_piece()

Retrieve which tile overlaps which tissue contour.

Returns:
np.ndarray

Array of shape (N, M) where N is the number of tiles and M is the number of tissue contours.

has_tile_coords()

Check if the WSI has tile coordinates saved in its HDF5 file.

Returns:
bool

True if tile coordinates exist in the HDF5 file.

has_tile_images()

Check if the WSI has tile images in its HDF5 file.

Returns:
bool

True if tile images exist in the HDF5 file.

has_tissue_contours()

Check if the WSI has tissue contours saved.

Returns:
bool

True if tissue contours exist.

inference(model=None, model_repo='pytorch/vision', device=None, data_loader_kws={})

Inference on the WSI using a pretrained model.

Parameters:
model: str

Name of the model to use for inference.

model_repo: str

Repository to load the model from. Default is "pytorch/vision".

data_loader_kws: dict

Keyword arguments to pass to the data loader.

Returns:
Tuple[np.ndarray, np.ndarray]

Tuple of (features, coordinates).

load_segmentation(hdf5_file=None)

Load slide segmentation results from an HDF5 file.

Parameters:
hdf5_file: Path

Path to file used to save segmentation. If None, the segmentation results will be loaded from self.hdf5_file.

Returns:
None

plot_segmentation(output_file=None, per_contour=False, level=None, **kwargs)

Plot the segmentation of the WSI.

This plot is an overlay of a low resolution image of the WSI and the contours of the tissue and holes.

Parameters:
output_file: Path

Path to save the plot to. If None, save to self.path.with_suffix(".segmentation.png").

kwargs: dict

Additional keyword arguments to pass to vis_wsi.

Returns:
None

plot_tile_graph(output_file=None)

Plot a graph of tile spatial proximity.

Parameters:
output_file: Path

Path to output file. If None, save to self.path.with_suffix(".tile_graph.png").

Returns:
None

save_segmentation(hdf5_file=None, mode='a')

Save slide segmentation results to an HDF5 file.

Parameters:
hdf5_file: Path

File path used to save segmentation. If None, the segmentation results will be saved to self.hdf5_file.

mode: str

File open mode. Default is 'a'.

Returns:
None

save_tile_images(output_dir, output_format='jpg', attributes=True, n=None, frac=1.0)

Save tile images as individual files to disk.

Parameters:
output_dir: Path

Directory to save tile images to.

output_format: str

File format to save images as.

attributes: bool

Whether to include attributes in filename.

n: int

Number of tiles to save. Default is to save all.

frac: float

Fraction of tiles to save. Default is to save all.

Returns:
None

segment(method='manual', params=None)

Segment the WSI for tissue and background.

Segmentations are saved as a list of contours and holes in the contours_tissue and holes_tissue attributes. This object is then saved to disk as a pickle file, by default in the same directory as the WSI with the same name but with a .segmentation.pickle suffix.

A visualization of the segmentation will also be plotted by calling plot_segmentation and saved as a PNG file (by default in the same directory as the WSI, with the same name but with a .segmentation.png suffix).

Parameters:
params: dict[str, tp.Any]

Parameters for the segmentation method.

method: str

Segmentation method to use. Either "manual" or "CLAM". The CLAM method uses the parameters given in params or the default parameters (bwh_biopsy) if params is None.

Returns:
None

tile(patch_level=0, patch_size=224, step_size=224, contour_subset=None)

Tile the WSI.

Parameters:
patch_level: int

WSI level to extract patches from. Default is 0, which by convention is the highest resolution, although this is not always the case.

patch_size: int

Size of patches to extract in pixels.

step_size: int

Step size between patches in pixels.

contour_subset: list[int]

Index of which contours to use (0-based). If None, use all contours.

Returns:
None
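
The interaction between patch_size and step_size can be sketched with a plain grid over an image of known dimensions. This is an illustration only, not the library's tiling code: the function name grid_coordinates is hypothetical, and the sketch ignores tissue contours, contour_subset, and pyramid levels, keeping only patches that fit fully inside the image.

```python
def grid_coordinates(width, height, patch_size=224, step_size=224):
    """Top-left (x, y) coordinates of patches that fit fully inside the image."""
    xs = range(0, width - patch_size + 1, step_size)
    ys = range(0, height - patch_size + 1, step_size)
    # Row-major order: all x positions for the first row, then the next row.
    return [(x, y) for y in ys for x in xs]


# step_size == patch_size gives non-overlapping patches.
non_overlapping = grid_coordinates(1000, 500)
# step_size < patch_size gives overlapping patches (50% overlap here).
overlapping = grid_coordinates(1000, 500, step_size=112)

print(len(non_overlapping))  # 8
print(len(overlapping))      # 21
```

Halving the step size roughly quadruples the number of tiles on a large region, which is worth keeping in mind when tile images are also stored.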

vis_wsi(vis_level=0, color=(0, 255, 0), hole_color=(0, 0, 255), annot_color=(255, 0, 0), line_thickness=250.0, max_size=None, top_left=None, bot_right=None, custom_downsample=1.0, view_slide_only=False, number_contours=False, seg_display=True, annot_display=True)

Visualize the whole slide image.

Parameters:
vis_level: int

The level to visualize.

color: tuple

The color of the tissue.

hole_color: tuple

The color of the holes.

annot_color: tuple

The color of the annotations.

line_thickness: float

The thickness of the annotations.

max_size: int

The maximum size of the image.

top_left: tuple

The top left corner of the region to visualize.

bot_right: tuple[int, int]

The bottom right corner of the region to visualize.

custom_downsample: float

The custom downsample factor.

view_slide_only: bool

Whether to only visualize the slide.

number_contours: bool

Whether to number the contours.

seg_display: bool

Whether to display the segmentation.

annot_display: bool

Whether to display the annotations.

Returns:
Image