Chapter 4: Multidimensional Data Structures
In Chapter 3: Network Communication, we learned how to share DICOS data between different systems. Now, let’s dive into how pyDICOS actually represents and stores that data internally.
Why Special Data Structures?
Imagine you’re working with security scan images. You might have:
- 2D X-ray images (like a flat photograph)
- 3D CT scans (like a stack of many thin slices showing the inside of a bag)
You could use regular Python lists or NumPy arrays to store this data, but security imaging has unique requirements:
- Very large datasets (CT scans can be gigabytes)
- Special memory handling for efficient processing
- Integration with security-specific algorithms
That’s why pyDICOS provides specialized data structures designed specifically for security imaging data.
Basic Concept: Dimensions in Imaging
Let’s understand dimensions in imaging with a simple analogy:
- 1D data is like a line of pixels (rarely used alone)
- 2D data is like a photograph (width × height)
- 3D data is like a CT scan (width × height × depth)
graph LR
A[1D: Line of pixels] --> B[2D: Flat image]
B --> C[3D: Volume of slices]
In security imaging, we mostly work with 2D and 3D data. Let’s explore the classes pyDICOS provides for each.
Array2D: Working with Flat Images
Array2D
represents two-dimensional data - perfect for X-ray images or individual slices of a CT scan.
Think of Array2D
as a grid of pixels, where each pixel has a value representing its intensity (how bright it is).
Creating and Using Array2D
Here’s a simple example of creating and working with an Array2D
:
from pydicos import Array2DUInt16 # For 16-bit unsigned integers
import matplotlib.pyplot as plt
# Create a new 2D array (width=100, height=80)
image = Array2DUInt16(100, 80)
# Set some pixel values
for x in range(40, 60):
for y in range(30, 50):
image.Set(x, y, 65000) # Set to a bright value
# Display the image
plt.imshow(image, cmap='gray')
plt.title("Simple 2D Array")
plt.show()
This code creates a 100×80 pixel image and sets a 20×20 square in the middle to a bright value. The rest of the image remains black (value 0).
Accessing Pixel Values
You can access or modify individual pixels using the Get
and Set
methods:
# Get a pixel value
pixel_value = image.Get(50, 40)
print(f"Pixel value at (50,40): {pixel_value}")
# Set a pixel value
image.Set(50, 40, 32000) # Set to half brightness
You can also access the raw buffer directly, which is useful for faster operations:
# Get the raw buffer (pointer to the data)
buffer = image.GetBuffer()
# Get the dimensions
width = image.GetWidth()
height = image.GetHeight()
print(f"Image dimensions: {width}×{height}")
Array3DLarge: Working with Volumetric Data
When you have 3D data like a CT scan, you need Array3DLarge
. It represents a 3D volume as a stack of 2D slices.
Creating and Using Array3DLarge
Here’s how to create and work with a 3D volume:
from pydicos import Array3DLargeUInt16
from pyDICOS import VOLUME_MEMORY_POLICY
import numpy as np
# Create a volume (width=100, height=80, depth=50)
volume = Array3DLargeUInt16(100, 80, 50, VOLUME_MEMORY_POLICY.OWNS_SLICES)
# Access a specific slice (2D array)
slice_10 = volume[10] # Get the 10th slice
# Set values in a specific region of the volume
for z in range(20, 30):
slice_z = volume[z]
for x in range(40, 60):
for y in range(30, 50):
slice_z.Set(x, y, 65000) # Create a bright cube
This creates a 3D volume and fills a 20×20×10 cube inside it with a bright value.
The OWNS_SLICES
policy means that the Array3DLarge
object is responsible for managing the memory of each slice. This is important because 3D data can be very large.
Working with Slices
A key feature of Array3DLarge
is that you can work with individual 2D slices:
# Get a specific slice
slice_25 = volume[25]
# Work with the slice like any Array2D
middle_value = slice_25.Get(50, 40)
print(f"Value at position (50,40,25): {middle_value}")
# Modify the slice
slice_25.Set(50, 40, 30000)
This is very useful for processing CT scans, where you might want to analyze or modify one slice at a time.
Volume: Higher-Level Functionality
While Array3DLarge
provides basic 3D data storage, the Volume
class offers higher-level functionality for working with volumetric imaging data:
- Handles different data types automatically
- Provides iterators for efficiently processing slices
- Manages memory more flexibly
Creating and Using Volume
Here’s a basic example:
from pydicos import Volume
from pyDICOS import IMAGE_DATA_TYPE
import numpy as np
# Create a Volume with 16-bit unsigned integer data
vol = Volume()
vol.Allocate(IMAGE_DATA_TYPE.enumUnsigned16Bit, 100, 80, 50)
# Set data from a NumPy array
data = np.zeros((50, 80, 100), dtype=np.uint16)
data[20:30, 30:50, 40:60] = 65000 # Create a bright cube
Volume.set_data(vol, data)
# Get dimensions
print(f"Volume dimensions: {vol.GetWidth()}×{vol.GetHeight()}×{vol.GetDepth()}")
This code creates a Volume object and fills it with data from a NumPy array. The set_data
method is a convenient way to transfer data between NumPy (which you might use for processing) and pyDICOS.
Iterating Through Slices
One powerful feature of the Volume
class is its ability to iterate through slices:
# Create an iterator for processing all slices
slice_iter = vol.Begin()
# Process each slice
while slice_iter != vol.End():
# Get the current slice as a 16-bit unsigned array
current_slice = slice_iter.AsUnsigned16()
# Do something with the slice
avg_value = np.mean(current_slice)
print(f"Slice {slice_iter.GetCurrentSlice()} average: {avg_value}")
# Move to the next slice
slice_iter = slice_iter + 1
This makes it easy to process an entire volume slice by slice, which is often more memory-efficient than working with the whole volume at once.
Converting Between NumPy and pyDICOS
Since many data processing libraries in Python use NumPy arrays, pyDICOS makes it easy to convert between its specialized data structures and NumPy:
import numpy as np
from pydicos import Array2DUInt16, Volume
# Convert an Array2D to NumPy
image = Array2DUInt16(100, 80)
# ... set some values ...
numpy_image = np.array(image) # Automatic conversion
# Convert a Volume to NumPy
vol = Volume()
vol.Allocate(IMAGE_DATA_TYPE.enumUnsigned16Bit, 100, 80, 50)
# ... set some values ...
# Get the data as a NumPy array
numpy_volume = None
if vol.GetUnsigned16() is not None:
array3d = vol.GetUnsigned16()
numpy_volume = np.array(array3d)
print(f"NumPy volume shape: {numpy_volume.shape}")
This interoperability makes it easy to use pyDICOS with other Python libraries for image processing, machine learning, or visualization.
Under the Hood: How the Data Structures Work
Let’s look at what happens when you create and use these data structures:
sequenceDiagram
participant User as Your Code
participant A2D as Array2D
participant A3D as Array3DLarge
participant Vol as Volume
participant Mem as Memory
User->>A2D: Create Array2D(width, height)
A2D->>Mem: Allocate width*height*sizeof(T) bytes
User->>A3D: Create Array3DLarge(w, h, d)
A3D->>Mem: Allocate d slices
loop For each slice
A3D->>A2D: Create Array2D(w, h)
A2D->>Mem: Allocate memory for slice
end
User->>Vol: Create Volume()
User->>Vol: Allocate(type, w, h, d)
Vol->>A3D: Create appropriate Array3DLarge
A3D->>Mem: Allocate memory
When you create an Array2D
, it allocates a contiguous block of memory to store all the pixel values in row-major order (rows are stored one after another in memory).
Array3DLarge
doesn’t store all its data in one contiguous block. Instead, it manages an array of Array2D
objects (slices). This design allows for more flexible memory management, especially with large datasets.
Volume
adds another layer that handles different data types. Based on the data type you specify (like unsigned 16-bit), it creates and manages the appropriate Array3DLarge
object internally.
Memory Considerations
Security imaging data can be very large. A typical CT scan might be 512×512×600 voxels with 16 bits per voxel, which is over 300MB! That’s why memory management is critical.
The Array3DLarge
class offers different memory policies:
from pydicos import Array3DLargeUInt16
from pyDICOS import VOLUME_MEMORY_POLICY
# Create a volume that owns its slices (default)
volume1 = Array3DLargeUInt16(100, 80, 50, VOLUME_MEMORY_POLICY.OWNS_SLICES)
# Create a volume that doesn't own its slices
# (useful for referencing data that exists elsewhere)
volume2 = Array3DLargeUInt16(100, 80, 50, VOLUME_MEMORY_POLICY.BORROWS_SLICES)
OWNS_SLICES
: TheArray3DLarge
is responsible for allocating and freeing memory for each slice.BORROWS_SLICES
: TheArray3DLarge
references slices allocated elsewhere, without taking ownership.
This flexibility is crucial for complex systems where memory efficiency matters.
Practical Examples
Example 1: Processing a CT Scan Slice by Slice
Here’s how you might process a CT scan using pyDICOS data structures:
from pydicos import dcsread, Volume
import numpy as np
import matplotlib.pyplot as plt
# Read a CT scan
ct_scan = dcsread("baggage_scan.ct")
# Get the volume data
volume = ct_scan.get_volume_data()
# Process each slice
for z in range(volume.shape[0]):
# Get the slice
slice_z = volume[z]
# Simple processing: thresholding
threshold = 1000
binary_slice = (slice_z > threshold).astype(np.uint8) * 255
# Save or display the processed slice
plt.imsave(f"processed_slice_{z}.png", binary_slice, cmap='gray')
# Just process first 5 slices for this example
if z >= 4:
break
This example reads a CT scan, extracts the volume data, and processes each slice by applying a threshold to create a binary image.
Example 2: Creating a 3D Visualization
from pyDICOS import dcsread
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
# Read a CT scan
ct_scan = dcsread("baggage_scan.ct")
# Get the volume data
volume = ct_scan.get_volume_data()
# Create a 3D visualization (simplified)
# Extract points above a threshold
threshold = 1000
x, y, z = np.where(volume > threshold)
# Downsample for faster rendering
sample_rate = 50
x = x[::sample_rate]
y = y[::sample_rate]
z = z[::sample_rate]
# Plot the points
fig = plt.figure(figsize=(10, 8))
ax = fig.add_subplot(111, projection='3d')
ax.scatter(x, y, z, c='red', marker='.')
plt.title("3D Visualization of CT Scan")
plt.show()
This example creates a simple 3D visualization of a CT scan by extracting points above a threshold and plotting them in 3D space.
Summary
In this chapter, we’ve explored the multidimensional data structures in pyDICOS:
Array2D
for working with 2D imagesArray3DLarge
for handling 3D volumetric dataVolume
for higher-level functionality with different data types
These specialized structures are optimized for security imaging data, making it efficient to:
- Store large datasets
- Process individual slices of 3D volumes
- Convert between pyDICOS and NumPy for compatibility with other libraries
We’ve seen how to create, access, and modify these data structures, as well as how they manage memory behind the scenes.
In the next chapter, Geometry Representation, we’ll learn how pyDICOS represents the physical geometry of scans, which is essential for mapping the pixel/voxel data to real-world locations.
Generated by AI Codebase Knowledge Builder