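I am setting up an EfficientNetB0-based CNN model in Colab using the qubvel/segmentation_models library. Everything works, except that the first epoch takes several hours even on a GPU. I eventually ran out of resources, so I upgraded to Colab Pro and reduced the training set to 30,000 images and 30,000 masks. The validation set has 3,000 images and 3,000 masks. When I run model.fit, I get this error: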

TypeError: unsupported operand type(s) for /: 'NoneType' and 'int'

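The problem seems to be that the .jpg files are not being retrieved, so they are read as None instead of NumPy arrays. But the source folders clearly contain the image files, and the counts are correct: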

len(os.listdir(x_train_dir)) = 30000
len(os.listdir(y_train_dir)) = 30000
len(os.listdir(x_valid_dir)) = 3000
len(os.listdir(y_valid_dir)) = 3000

Nevertheless, when I create the dataloaders, only a fraction of the files seems to be retrieved for the train dataloader:

len(train_dataloader) = 3750
len(valid_dataloader) = 3000
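
A note on reading these numbers: the Dataloder class in this post defines `__len__` as `len(self.indexes) // self.batch_size`, so the values above count batches rather than files. A minimal sketch of that arithmetic follows. The batch sizes of 8 and 1 are assumptions inferred from the reported lengths (they are not shown in the post), and `batches_per_epoch` is a hypothetical helper name.

```python
def batches_per_epoch(num_samples: int, batch_size: int) -> int:
    """Mirror Dataloder.__len__: count full batches only, dropping any remainder."""
    return num_samples // batch_size


# Batch sizes are assumptions inferred from the reported dataloader lengths.
train_batches = batches_per_epoch(30000, 8)
valid_batches = batches_per_epoch(3000, 1)
print(train_batches, valid_batches)
```

If those batch sizes match the actual setup, a length of 3750 would be consistent with all 30,000 training files being used.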

The code that raises the error is:

history = model.fit(
    train_dataloader, 
    steps_per_epoch=len(train_dataloader), 
    epochs=EPOCHS, 
    callbacks=callbacks, 
    validation_data=valid_dataloader, 
    validation_steps=len(valid_dataloader),
)

The Dataset and Dataloder code is:

import os

import cv2
import numpy as np
import tensorflow


class Dataset:
    """
    Read images, apply augmentation and preprocessing transformations.
    
    Args:
        images_dir (str): path to images folder
        masks_dir (str): path to segmentation masks folder
        #class_values (list): values of classes to extract from segmentation mask
        augmentation (albumentations.Compose): data transformation pipeline
            (e.g. flip, scale, etc.)
        preprocessing (albumentations.Compose): data preprocessing
            (e.g. normalization, shape manipulation, etc.)
    
    """
           
    def __init__(
            self, 
            images_dir, 
            masks_dir, 
            classes=None, 
            augmentation=None, 
            preprocessing=None,
    ):        
        self.ids = os.listdir(images_dir)
        self.images_fps = [os.path.join(images_dir, image_id) for image_id in self.ids]
        self.masks_fps = [os.path.join(masks_dir, image_id) for image_id in self.ids]
              
        self.augmentation = augmentation
        self.preprocessing = preprocessing
    
    def __getitem__(self, i):
        
        # read data
        image = cv2.imread(self.images_fps[i])
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        mask = cv2.imread(self.masks_fps[i], 0)
        mask = mask / 255
        mask = np.expand_dims(mask, axis=2)
        mask = mask.astype(np.float32)
        
        # apply augmentations
        if self.augmentation:
            sample = self.augmentation(image=image, mask=mask)
            image, mask = sample['image'], sample['mask']
        
        # apply preprocessing
        if self.preprocessing:
            sample = self.preprocessing(image=image, mask=mask)
            image, mask = sample['image'], sample['mask']
            
        return image, mask
        
    def __len__(self):
        return len(self.ids)
    
    
class Dataloder(tensorflow.keras.utils.Sequence):
    """Load data from dataset and form batches
    
    Args:
        dataset: instance of Dataset class for image loading and preprocessing.
        batch_size: Integer number of images in batch.
        shuffle: Boolean, if `True` shuffle image indexes each epoch.
    """
    
    def __init__(self, dataset, batch_size=1, shuffle=False):
        self.dataset = dataset
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.indexes = np.arange(len(dataset))

        self.on_epoch_end()

    def __getitem__(self, i):
        
        # collect batch data
        start = i * self.batch_size
        stop = (i + 1) * self.batch_size
        data = []
        for j in range(start, stop):
            data.append(self.dataset[self.indexes[j]])  # go through self.indexes so shuffling takes effect
        
        # transpose list of lists
        batch = [np.stack(samples, axis=0) for samples in zip(*data)]
        
        return batch
    
    def __len__(self):
        """Denotes the number of batches per epoch"""
        return len(self.indexes) // self.batch_size
    
    def on_epoch_end(self):
        """Callback function to shuffle indexes each epoch"""
        if self.shuffle:
            self.indexes = np.random.permutation(self.indexes) 
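
One way to test the hypothesis that some files are read as None: `cv2.imread` returns None instead of raising when it cannot read a path, and the Dataset above builds every mask path from the image file name. A mask with a different name or extension would therefore come back as None and fail at `mask / 255`. The sketch below uses only the standard library to list image file names that have no same-named file in the masks folder. `find_missing_masks` is a hypothetical helper, not part of segmentation_models.

```python
import os


def find_missing_masks(images_dir, masks_dir):
    """Return image file names that have no same-named file in masks_dir."""
    missing = []
    for image_id in sorted(os.listdir(images_dir)):
        # Same path construction as Dataset.__init__ uses for masks_fps.
        mask_path = os.path.join(masks_dir, image_id)
        if not os.path.isfile(mask_path):
            missing.append(image_id)
    return missing
```

Calling `find_missing_masks(x_train_dir, y_train_dir)` and `find_missing_masks(x_valid_dir, y_valid_dir)` would show whether any names fail to match. An empty list does not rule out files that exist but cannot be decoded.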