用Python計算柵格資料的真實面積

skypanxh發表於2024-11-11

用Python計算柵格資料的真實面積

在地理空間分析中,柵格資料的畫素值通常代表某種屬性,比如土地利用比例、植被覆蓋率等。這些資料往往基於經緯度網格表示的比例值,而為了更直觀地理解這些資料的空間意義,我們需要將這些比例值轉化為實際面積(如平方米或公頃)。

對於高解析度的大尺寸柵格檔案(GeoTIFF),一次性載入整個檔案會耗費大量記憶體,甚至導致處理失敗。為了解決這個問題,我們可以透過分塊處理 的方式,逐步計算畫素的真實面積,確保即使是大檔案也能順利處理。

一. 應用場景

該方法尤其適用於以下場景:

1. 比例值柵格資料:

當柵格畫素值表示某個區域的比例(例如森林覆蓋率)時,透過面積計算,可以得出真實的覆蓋面積。

2. 土地利用分析:

計算農田、草地、建築用地等各類土地利用型別的真實面積。

3. 遙感影像處理:

在遙感影像中將畫素值的屬性對映到真實的地理空間。

二.程式碼實現

import sys
import os
import rasterio
import numpy as np
from glob import glob

def haversine(coord1, coord2):
    """
    Calculate the great circle distance in kilometers between two points
    on the earth (specified in decimal degrees, with coordinates in the order of longitude, latitude).

    Arguments:
    coord1 -- tuple containing the longitude and latitude of the first point (lon1, lat1)
    coord2 -- tuple containing the longitude and latitude of the second point (lon2, lat2)

    Returns:
    distance in kilometers.
    """
    # Extract longitude and latitude, then convert from decimal degrees to radians
    lon1, lat1 = np.radians(coord1)
    lon2, lat2 = np.radians(coord2)

    # Haversine formula
    dlat = abs(lat2 - lat1)
    dlon = abs(lon2 - lon1)

    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    c = 2 * np.arcsin(np.sqrt(a))
    r = 6371  # Radius of earth in kilometers. Use 3956 for miles
    return c * r


# Read geotiff file in data folder
# Calculate area for a single GeoTIFF file
def calculate_area_for_tif(input_tif, chunk_size=1024):
    # Check if input file exists
    if not os.path.isfile(input_tif):
        raise FileNotFoundError(f"Input file '{input_tif}' does not exist.")

    # Construct output file path
    output_path = input_tif.split('.')[0] + '_area.tif'

    # Skip if the area raster already exists
    if os.path.exists(output_path):
        print(f"Area of {input_tif} already calculated. Skipping...")
        return

    # Open the source raster
    with rasterio.open(input_tif) as src:
        # Check if the raster is in a geographic coordinate system
        if not src.crs.is_geographic:
            raise ValueError(f"The raster {input_tif} is not in a geographic coordinate system!")

        # Get raster metadata and size
        meta = src.meta.copy()
        width, height = src.width, src.height
        transform = src.transform

        # Update metadata for output
        meta.update(compress='lzw', dtype=rasterio.float32, count=1)

        # Calculate the total number of chunks
        total_chunks = ((height + chunk_size - 1) // chunk_size) * ((width + chunk_size - 1) // chunk_size)
        chunk_count = 0  # Track processed chunks

        # Create the output raster file
        with rasterio.open(output_path, 'w', **meta) as dst:
            print(f"Processing {input_tif} in chunks...")

            for row_start in range(0, height, chunk_size):
                row_end = min(row_start + chunk_size, height)

                for col_start in range(0, width, chunk_size):
                    col_end = min(col_start + chunk_size, width)

                    # Read a chunk of the source raster
                    window = rasterio.windows.Window(
                        col_start, row_start, col_end - col_start, row_end - row_start
                    )
                    chunk_transform = src.window_transform(window)

                    rows, cols = np.meshgrid(
                        np.arange(row_start, row_end),
                        np.arange(col_start, col_end),
                        indexing='ij'
                    )

                    # Apply geotransform to get geographical coordinates
                    x_coords, y_coords = chunk_transform * (cols, rows)
                    x_coords_right, y_coords_right = chunk_transform * (cols + 1, rows)
                    x_coords_bottom, y_coords_bottom = chunk_transform * (cols, rows + 1)

                    xy_coords = np.stack((x_coords, y_coords), axis=0)
                    xy_coords_right = np.stack((x_coords_right, y_coords_right), axis=0)
                    xy_coords_bottom = np.stack((x_coords_bottom, y_coords_bottom), axis=0)

                    # Calculate the area for this chunk
                    length_right = haversine(xy_coords, xy_coords_right)
                    length_bottom = haversine(xy_coords, xy_coords_bottom)
                    area_chunk = length_right * length_bottom

                    # Write the chunk to the output raster
                    dst.write(area_chunk.astype(np.float32), 1, window=window)

                    # Update progress
                    chunk_count += 1
                    progress = (chunk_count / total_chunks) * 100
                    print(f"Progress: {progress:.2f}% ({chunk_count}/{total_chunks} chunks)")

            print(f"\nArea of {input_tif} calculated and saved to {output_path}")


if __name__ == '__main__':
    # Specify the input file path
    input_tif = 'path/to/your/input_file.tif'

    # Run the function
    calculate_area_for_tif(input_tif, chunk_size=1024)
    print('\nArea calculation complete!')

! 本程式碼修改自迪肯大學王金柱博士的程式碼

相關文章