

So, every pixel at index position (4,4) of each patch of the label is extracted and stored in a 1D array $Y = [y_1, y_2, \ldots, y_n]$, where $y_1, y_2, \ldots, y_n$ are the center pixels corresponding to the patches $p_1, p_2, \ldots, p_n$. We denote $(X, Y)$ as our input dataset, given by equation 5:

$$(X, Y) = \{(x_i, y_i)\}_{i=1}^{n} \qquad (5)$$

where each pair $(x_i, y_i)$ indicates a raster instance $x_i$ having the corresponding label instance $y_i$.
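For illustration, a minimal sketch of this center-pixel extraction is given below. It assumes the label patches are held in a NumPy array of shape (n, 9, 9), since a center at index (4, 4) implies a 9×9 patch; the array and function names are hypothetical, not from the paper.

```python
import numpy as np

def extract_center_labels(label_patches: np.ndarray) -> np.ndarray:
    """Extract the pixel at index (4, 4) of every patch into a 1D label array Y.

    label_patches: assumed shape (n, 9, 9), one label patch per raster instance.
    Returns an array Y of shape (n,) where Y[i] is the label y_i for instance x_i.
    """
    return label_patches[:, 4, 4]
```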

8) Balancing of the Dataset Based on Lowest Pixel Class: In the presence of multiple classes, balanced datasets outperform imbalanced ones [40, 41], but real-life data are generally imbalanced. Several methods have been proposed in the literature to deal with imbalanced datasets [40, 41, 42]. Two common approaches are oversampling the weak classes or down-sampling [43] the stronger classes until a balanced dataset is obtained [40, 42]. To keep processing time and resource requirements manageable, we used the down-sampling method to balance the dataset. The sample frequency of each class is evaluated, and the number of samples per class is then down-sampled to match that of the minimum-frequency class. The outcome of this process is a dataset balanced in terms of the number of samples per class, as sketched below.
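A minimal sketch of this down-sampling step, assuming the samples and labels are held in NumPy arrays; the function and variable names are illustrative, not the paper's implementation.

```python
import numpy as np

def downsample_to_minority(X: np.ndarray, Y: np.ndarray, seed: int = 0):
    """Down-sample every class to the frequency of the least-represented class."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(Y, return_counts=True)
    n_min = counts.min()                       # frequency of the weakest class
    keep = []
    for c in classes:
        idx = np.flatnonzero(Y == c)           # indices of all samples of class c
        keep.append(rng.choice(idx, size=n_min, replace=False))
    keep = np.concatenate(keep)
    rng.shuffle(keep)                          # mix the classes back together
    return X[keep], Y[keep]
```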

B. Model Training

The model is trained using a convolutional neural network. The training involves the following steps.

1) Scaling of Training and Testing Data: Training the model involves splitting the dataset into training, validation, and testing samples. In this research, the dataset is split into training, validation, and testing sets in the ratio 50:30:20, respectively.

The validation samples are used to fine-tune the hyperparameters. Data normalization is an important part of training a CNN. Each pixel value in the dataset lies in the range 0 to 255, so we divided the train and test raster data by 255.0 to normalize the dataset. A minimal sketch of this splitting and scaling step is given below.
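The sketch assumes shuffled NumPy arrays and uses the 50:30:20 ratio and the 255.0 divisor stated above; everything else (names, seed) is an illustrative assumption.

```python
import numpy as np

def split_and_scale(X: np.ndarray, Y: np.ndarray, seed: int = 0):
    """Shuffle, split 50:30:20 into train/validation/test, and scale to [0, 1]."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    X, Y = X[order], Y[order]
    n = len(X)
    i1, i2 = int(0.5 * n), int(0.8 * n)        # 50% train, 30% val, 20% test
    X_train, X_val, X_test = X[:i1], X[i1:i2], X[i2:]
    Y_train, Y_val, Y_test = Y[:i1], Y[i1:i2], Y[i2:]
    # Pixel values lie in [0, 255]; dividing by 255.0 normalizes them.
    X_train, X_val, X_test = (a.astype("float32") / 255.0
                              for a in (X_train, X_val, X_test))
    return (X_train, Y_train), (X_val, Y_val), (X_test, Y_test)
```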

2) Multilayer 2D CNN: Our multilayer 2D CNN is implemented by applying multiple convolution layers for feature extraction, followed by a fully connected layer. The CNN takes an input tensor of height $n_h$, width $n_w$, and $n_c$ channels. Each input to a 2D convolution layer is convolved with $p$ filters of size $m \times m$ to generate $p$ feature maps. The filters are tensors used to extract spatial information, such as smooth curves and edges, from objects in the convolution layer. Although each input patch was extracted with a stride equal to the patch height, within the network we shift the kernel by only one pixel (stride 1) to preserve the spatial dimensions of the feature maps. Activation functions decide the output of a neuron and estimate the relevant information of that neuron. Tanh and rectified linear units (ReLU) are used in the hidden layers. The tanh activation works well with negative values, so we used tanh during feature extraction, as our dataset contains negative values.

The next step is flattening the matrix for feeding it to a fully connected layer. In the fully connected network, we used the ReLU activation function: a ReLU yields an output $x$ if $x$ is positive and zero otherwise. The model is trained for a certain number of epochs, backpropagating to update the weights and compute the loss. As we have multiple classes to predict, we used the softmax activation function in the output layer. The objective function, which quantifies the distance between the predicted and actual values on the training set, is used to find the best parameters. We minimize this objective (loss) function using forward propagation and backpropagation.

3) Forward Propagation: Here the input is propagated through the entire network in batches of 16 samples. We then evaluate the objective function to find the error in the predicted value, obtained as the difference between the actual and predicted values for the different rows. A minimal sketch of the model and this training procedure is given below.
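The following Keras sketch is consistent with the description above: tanh convolutions with stride 1 for feature extraction, a flatten step, a ReLU fully connected layer, a softmax output, and training in batches of 16. The filter counts, kernel sizes, dense width, optimizer, and epoch count are illustrative assumptions, not the paper's exact configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(nh: int, nw: int, nc: int, n_classes: int) -> tf.keras.Model:
    """Multilayer 2D CNN: tanh conv layers, flatten, ReLU dense layer, softmax."""
    model = models.Sequential([
        layers.Input(shape=(nh, nw, nc)),
        # Stride 1 shifts the kernel by one pixel, preserving spatial detail.
        layers.Conv2D(32, (3, 3), strides=1, activation="tanh", padding="same"),
        layers.Conv2D(64, (3, 3), strides=1, activation="tanh", padding="same"),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(n_classes, activation="softmax"),
    ])
    # Cross-entropy quantifies the distance between predicted and actual values;
    # backpropagation updates the weights to minimize it.
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Forward and backward passes run over the training set in batches of 16:
# model = build_model(nh, nw, nc, n_classes)
# model.fit(X_train, Y_train, validation_data=(X_val, Y_val),
#           batch_size=16, epochs=50)   # epoch count is illustrative
```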
