Classify Raster Operation

The ProRaster User Guide is available for download as a PDF.

Watch this video on YouTube to learn how to use this processing operation. Use the chapter indexing to skip to the detail you need.

A classified raster field contains an “Index” band which contains a zero-based unsigned integer index value for each cell. This index maps directly to the row number of a class table. The class table may contain any number of rows and columns. Each column has a particular data type – for example it may contain a color type like RGB or a text string. The raster engine presents the Index band and all the columns in the class table as raster bands which you are free to render or use in processing operations, just like any other kind of data band.
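As a rough illustration of this lookup (a Python sketch, not ProRaster's API; the table contents and names such as `class_table` and `column_as_band` are hypothetical), each cell's index selects a row in the class table, and any column can then be read out as if it were a band:

```python
# Hypothetical class table: one dict per row; the row number is the class index.
class_table = [
    {"Label": "No Water", "Colour": (128, 82, 0)},
    {"Label": "Ocean", "Colour": (0, 0, 192)},
    {"Label": "Lake", "Colour": (87, 128, 255)},
]

# A tiny 2 x 3 "Index" band of zero-based class indices.
index_band = [
    [0, 1, 1],
    [2, 2, 0],
]

def column_as_band(index_band, class_table, column):
    """Expand one class table column into a band by looking up each cell's row."""
    return [[class_table[i][column] for i in row] for row in index_band]

label_band = column_as_band(index_band, class_table, "Label")
colour_band = column_as_band(index_band, class_table, "Colour")
```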

The Classify operation converts an existing raster band to a classified field in a virtual raster. The primary use case is to turn a pseudo-classified raster (generally a TIFF file) into a true classified raster – in other words, to reclassify an existing raster. You can also classify continuous band data.

To perform this operation, you must supply a class table containing the data for each class and a data transform to map the values in the source raster band to zero-based row numbers in the class table.

First, specify the source raster and select the field and band you will target. You may select a stand-alone raster or a raster source, which may contain multiple rasters. Then specify the output virtual raster (MVR). This output MVR will be displayed as soon as you complete the operation.

In a classified raster, the overview pyramid levels cannot be constructed by averaging or interpolating adjacent cell values. If you do so, you may generate class values in the overview pyramid that do not exist at that location in the base level raster. These errors are generally easy to spot in ProRaster as you zoom in and out. Instead, when the overviews are generated, the system must select the value that is most likely to occupy the cell.
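The difference can be sketched in Python for a single 2 x 2 block of class indices. This is only an illustration: it assumes "most likely to occupy the cell" means the most common value in the block, which may not be exactly how ProRaster selects the value, and the function names are hypothetical.

```python
from collections import Counter

def majority(block):
    """Return the most common class index in a block of cells."""
    return Counter(block).most_common(1)[0][0]

def mean(block):
    """Average the cell values, as you would for continuous data."""
    return sum(block) / len(block)

block = [1, 1, 1, 5]            # four base-level cells feeding one overview cell

averaged = mean(block)          # 2.0: class 2 does not occur anywhere in the block
selected = majority(block)      # 1: a class that really is present
```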

Unfortunately, the virtual raster that is generated by this operation does not guarantee that this is the case as it depends on the existing overview pyramid for the source raster and has no control over how it was generated. If the overview pyramid is supplied with the file (for example in formats like COG, ECW, and JPEG2000) then you have no control over it. If, on the other hand, ProRaster must generate an overview pyramid cache, then you can exercise some control over it via the “Advanced” properties in the Raster Source Editor, shown adjacent. Typically, you will override the field type of the source raster to “Continuous” and override the band value type for the source index band to “Unique”. Then “Clean” the raster source and “Prepare” it. This will generate overview pyramid cache rasters that correctly maintain cell data values.

An alternative solution (which I strongly recommend) is to execute an Export operation, once you are happy with your classified MVR, to convert the MVR into an MRR. You can include metadata in the MRR in which you can document the class table and your processing procedure. To minimise the file size, use high quality compression, and consider using run-length encoding as a pre-compression phase. This can be an effective technique to maximise compression in large, high resolution classified rasters.
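To see why run-length encoding suits classified data, here is a minimal Python sketch of the general technique. It is not ProRaster's encoder, and the function name is hypothetical. Long runs of identical class indices collapse to (value, count) pairs:

```python
from itertools import groupby

def run_length_encode(cells):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(cells)]

row = [1, 1, 1, 1, 2, 2, 0]
encoded = run_length_encode(row)
```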

Now specify the “Index” band data type. You need a data type with sufficient bit-width to represent every row number in your table. An 8-bit unsigned integer is sufficient for a table with up to 256 rows and suits most purposes. A 16-bit unsigned integer can represent large tables of up to 65,536 rows. In very rare cases, if you have a very large number of rows, you might have to use a 32-bit integer.
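The sizing rule above can be written as a small Python helper. This is a sketch using the row limits quoted in the text; the function name is hypothetical, and it assumes no index value is reserved for other purposes.

```python
def smallest_index_bits(row_count):
    """Return the narrowest unsigned integer width (in bits) that can index row_count rows."""
    for bits in (8, 16, 32):
        if row_count <= 2 ** bits:
            return bits
    raise ValueError("class table is too large for a 32-bit index")
```

For example, a table of 300 rows does not fit in 8 bits, so `smallest_index_bits(300)` returns 16.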

Sometimes, the data transform you specify may fail to classify a source data value. In other words, it fails to convert a source data value to a valid class index. By default, this will result in an invalid and empty cell. Alternatively, you can specify a class index to assign to these cells. Check “Classify failure index” and enter the zero-based class index.
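The effect of the failure index can be sketched in Python as follows. The names are hypothetical, the lookup values are illustrative, and `None` stands in for an invalid, empty cell:

```python
def classify(values, lookup, failure_index=None):
    """Map source values to class indices; unmatched values get failure_index."""
    return [lookup.get(value, failure_index) for value in values]

# Illustrative mapping of source data values to zero-based class indices.
lookup = {10: 0, 20: 1, 28: 2, 103: 3}

default_result = classify([10, 99, 103], lookup)                    # 99 is left empty
fallback_result = classify([10, 99, 103], lookup, failure_index=0)  # 99 becomes class 0
```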

You now need to supply the class table and the data transform. This can be done in two ways – by providing a Legend file (which defines both) or by providing two CSV format text files (one for the class table and one for the data transform).

Legend Format

A legend file is a simple text file with a *.LEG extension that you can create and edit manually. The format was invented by Encom Technology and is used in products like Profile Analyst and Discover. It is similar in structure and appearance to the color lookup table format introduced in ERMapper. It is used to describe a set of vector rendering styles (such as color, line style, and fill style).

ProRaster supports legend files in the Color Table Editor, and they can be used as a color table for any raster. You can create legend files for an existing classified raster in the Color Table Editor. You can also export legend files from the Data Transform Editor.

A legend contains a data transform and a simple table, so it meets the minimum requirement for classification. A simple legend file is illustrated below. All the legend files you create will have this basic structure.

LegendTable Begin
DataType = NUMERIC
LEG = {
10,"No Water","RGBA(128,82,0,255)",0,0,0,0,0,0
20,"Ocean","RGBA(0,0,192,255)",0,0,0,0,0,0
28,"Lake","RGBA(87,128,255,255)",0,0,0,0,0,0
103,"River","RGBA(0,105,128,255)",0,0,0,0,0,0
}
LegendTable End

First, you can declare some tags including Version, Name, Description, NULLColor and DataType. The only tag ProRaster uses is “DataType”, which declares what kind of data transform the legend employs. It can be declared as NUMERIC, NUMERICDISCRETE, NUMERICCONTINUOUS, or TEXT. Then, between the braces {}, add a row for each class in the classified raster. You can add as many rows as you need. The data in each row must be comma delimited, and any text strings ought to be enclosed in quotation marks.

NUMERIC: Transforms a single defined source data value to a row index. The first value in each row is the source data value. In the example given above, source data values of 103 will be transformed to class index 3 (“River”). The values do not have to be declared in any particular order.

NUMERICDISCRETE: Transforms a range of data values to a row index. The first value in each row is the minimum value in the source data range and the second value is the maximum value. Any source data values that are >= the minimum value and <= the maximum value will be assigned the row index. In a transform like this the ranges can have gaps between them, and they do not have to be declared in any particular order. In the example shown below, all values between 5101 and 6347 (inclusive) will be assigned class index 0 (“Vegetation”).

LegendTable Begin
DataType = NUMERICDISCRETE
LEG = {
5101,6347,"Vegetation","RGBA(0,255,0,255)",0,0,0,0,0,0
6978,7593,"Bare Earth","RGBA(128,255,0,255)",0,0,0,0,0,0
8124,8839,"Built-up","RGBA(255,254,0,255)",0,0,0,0,0,0
}
LegendTable End
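The inclusive range test used by NUMERICDISCRETE can be sketched in Python using the ranges from this example. The function name is hypothetical, and returning `None` for a value that falls in a gap is an assumption standing in for a failed classification:

```python
# (minimum, maximum) per legend row; the row position is the class index.
RANGES = [
    (5101, 6347),   # class 0, "Vegetation"
    (6978, 7593),   # class 1, "Bare Earth"
    (8124, 8839),   # class 2, "Built-up"
]

def classify_discrete(value, ranges):
    """Return the index of the first range with minimum <= value <= maximum, else None."""
    for index, (low, high) in enumerate(ranges):
        if low <= value <= high:
            return index
    return None
```

Both end points belong to the range, so 5101 and 6347 each map to class 0, while 6500 falls in the gap between the first two ranges and is not classified.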

NUMERICCONTINUOUS: Transforms a range of data values to a row index. The first value in each row is the minimum value in the source data range and the second value is the maximum value. Any source data values that are >= the minimum value and < the maximum value will be assigned the row index. There should be no gaps between the ranges and they ought to be monotonically increasing.

TEXT: Transforms a string to a row index. The first value in each row will be the string enclosed in quotation marks.

Following the data transform values is a label string in quotation marks. It is important to supply a label for each class.

Then specify the class color. This can be done in a wide variety of ways. Specify a 32-bit numeric value or use a Red/Green/Blue/Alpha macro enclosed in quotation marks. Note that an alpha value of zero is fully transparent and a transparency value of zero is fully opaque.

12345453            32-bit 0xAABBGGRR
RGB(R,G,B)          8-bit components (0 – 255) or Float (0 – 1)
RGB(R,G,B,A)        8-bit + opacity
RGBT(R,G,B,T)       8-bit + transparency
RGBA(R,G,B,A)       8-bit + opacity or Float (0 – 1)
RGBF(R,G,B)         Float (0 – 1)
RGBTF(R,G,B,T)      Float (0 – 1) + transparency
RGBAF(R,G,B,A)      Float (0 – 1) + opacity
RGB16(R,G,B)        16-bit
RGBT16(R,G,B,T)     16-bit + transparency
RGBA16(R,G,B,A)     16-bit + opacity
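Two details above are easy to get wrong: the byte order of the 32-bit value, and the direction of the transparency scale. The Python sketch below illustrates both. The function names are hypothetical, and treating 8-bit transparency as the simple complement of alpha is an assumption consistent with the note above, not a documented ProRaster formula.

```python
def unpack_abgr(value):
    """Split a 32-bit 0xAABBGGRR color into (red, green, blue, alpha)."""
    red = value & 0xFF
    green = (value >> 8) & 0xFF
    blue = (value >> 16) & 0xFF
    alpha = (value >> 24) & 0xFF
    return red, green, blue, alpha

def transparency_to_alpha(transparency, maximum=255):
    """Convert transparency (0 = opaque) to alpha (0 = transparent)."""
    return maximum - transparency

components = unpack_abgr(0xFF0080C0)   # alpha FF, blue 00, green 80, red C0
```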

You can also use one of these standard color identifiers. In some cases, multiple identifiers map to the same color as shown in the list below.

WHITE, SILVER             RGB(255,255,255)
LIGHTGRAY, LIGHTGREY      RGB(192,192,192)
GRAY, GREY                RGB(128,128,128)
DARKGRAY, DARKGREY        RGB(64,64,64)
BLACK                     RGB(0,0,0)
RED, HIGHRED              RGB(255,0,0)
MAROON, LOWRED            RGB(128,0,0)
YELLOW                    RGB(255,255,0)
OLIVE                     RGB(128,128,0)
GREEN, LIME, HIGHGREEN    RGB(0,255,0)
LOWGREEN, DARKGREEN       RGB(0,128,0)
CYAN, AQUA                RGB(0,255,255)
TEAL                      RGB(0,128,128)
BLUE                      RGB(0,0,255)
NAVY, DARKBLUE            RGB(0,0,128)
MAGENTA, FUCHSIA          RGB(255,0,255)
PURPLE                    RGB(128,0,128)
BROWN                     RGB(128,64,64)

After the class color there are six additional variables (background color, fill pattern, line color, line style, line width, symbol index) but ProRaster does not use any of these so they can all be zero.
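If you want to check a legend file before using it, the structure described above is simple enough to read with a few lines of Python. The sketch below is not part of ProRaster. It assumes straight quotation marks, one class per line, and a NUMERIC transform, and `parse_legend` is a hypothetical helper.

```python
import csv
import io

LEGEND_TEXT = '''LegendTable Begin
DataType = NUMERIC
LEG = {
10,"No Water","RGBA(128,82,0,255)",0,0,0,0,0,0
20,"Ocean","RGBA(0,0,192,255)",0,0,0,0,0,0
28,"Lake","RGBA(87,128,255,255)",0,0,0,0,0,0
103,"River","RGBA(0,105,128,255)",0,0,0,0,0,0
}
LegendTable End
'''

def parse_legend(text):
    """Return (data_type, rows); each row is a list of string fields."""
    data_type, body, in_rows = None, [], False
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("DataType"):
            data_type = line.split("=", 1)[1].strip()
        elif line.startswith("LEG") and line.endswith("{"):
            in_rows = True
        elif line == "}":
            in_rows = False
        elif in_rows and line:
            body.append(line)
    # The csv module keeps the commas inside quoted color macros intact.
    rows = list(csv.reader(io.StringIO("\n".join(body))))
    return data_type, rows

data_type, rows = parse_legend(LEGEND_TEXT)

# For NUMERIC, field 0 is the source value and the row position is the class index.
value_to_index = {float(row[0]): index for index, row in enumerate(rows)}
labels = [row[1] for row in rows]
```

Each parsed row has nine fields: the source value, the label, the class color, and the six unused variables.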

You can create a legend file for an existing raster from the Data Transform Editor by hitting the “Export to a Legend file” button at the top of the main dialog. The legend will reflect the properties of the current data transform and will be generated for the currently selected raster and the currently selected color table.

The best way to use this export mechanism is to start by displaying the raster you want to target in the main ProRaster dialog. Make sure you select the color table you want to use, and then open the Data Transform Editor. The raster and color table will be used in the dialog and in the legend export. Check this on the “Raster” property page and make sure you select the raster you want to target and the appropriate field and band. If statistics are not available for this raster-field-band then hit the “Compute statistics” button to compute them. On the “Transform” page you will generally want to uncheck the “Full color table” button to turn off color interpolation.

It is likely that you will select either a “Ranges” or “Tables” style transform. For a “Ranges” transform make sure you specify the “Number of ranges”. When you select one of the following transform types, the table will be automatically populated using the statistics of the raster and the color table size as a guide.

Table: Index for value
Table: Match value
Table: Match value range
Table: Match continuous value range

If the raster band data type is a small integer, then the “Table: Match value” transform can be populated with each individual value in the raster. A message box will open, shown on the left, asking you if you want this table auto filled.

CSV File Format

If you want to include more information in the classification table, then you will have to use the CSV method. You need to supply two comma delimited text files (*.CSV) – one for the table and one for the data transform. Both files have a format which you must adhere to.

You ought to prepare your class table file in a spreadsheet editor like Excel. An example spreadsheet is provided that you can use as a template. It contains drop-lists you can use to select the appropriate tags for each column. The table can be any size you want, supporting a large number of columns (each column becomes a band in the raster) and rows (each row contains the data for a class).

The first row of the spreadsheet contains the column labels. It is important to label each column.

The second row of the spreadsheet contains the column special type identifiers. These are used to declare to the raster engine that a column has a particular special use. The identifiers are listed below.

Label       The class label string
Colour      The class color – generally RGB or RGBA
ColourR     The class color – red component
ColourG     The class color – green component
ColourB     The class color – blue component
Class       The original class index
Value       The primary data field
Data        A secondary data field

As a minimum, it is customary to provide a “Label” and a “Colour” in every table. These are used by the raster engine to render the raster and in cell value reporting. Color is usually supplied as an RGB or RGBA value. Alternatively, you can supply the red, green, and blue components (0 – 255) as separate columns. The “Class” identifier is used to record the original class index. For example, you may reclassify a TIFF file that contains values 10, 20, 30, 103 that are mapped to classes 0 – 3. Those original values can be remembered in the “Class” column. All other columns ought to be designated “Data” columns and they can contain anything you want. If you have a primary data value, you can use the “Value” tag to identify it.

The third row of the spreadsheet contains the column data type identifiers. Some of the available data types are listed below.

UNSIGNED_INT8          8-bit unsigned integer (also 16, 32, and 64-bit)
SIGNED_INT8            8-bit signed integer (also 16, 32, and 64-bit)
REAL4                  32-bit floating point real
REAL8                  64-bit floating point real
DATETIME               A decimal date-time
STANDARD_DATETIME      An integer date-time
STRING                 An ASCII string
STRING_UTF8            A UTF-8 string
STRING_UTF16           A UTF-16 string
RED, GREEN, BLUE       Color components 0 – 255
ALPHA                  Opacity component 0 – 255
RGB, RGBA              Color

To specify color, you can use the same kind of methods shown previously for the legend file – for example RGB(R,G,B). To specify date-time, you can use English nomenclature. You can specify the date or time or both. For example, “14 January 2024 17:56:04” is a valid representation. When you are typing date-time information into Excel, what you type into the cell will be decoded and presented differently in the editing bar, and differently again when you export to CSV! This serves to highlight how badly computing systems deal with date-time information in general. You may need to edit the CSV file to wrap the date-time data in quotation marks.

Each subsequent row contains the data for a class.
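If you would rather generate the class table with a script than a spreadsheet, the three header rows and the class rows can be written with Python's standard csv module. The sketch below only illustrates the layout described above. The column labels, the area values, and the pairing of identifiers with data types are made-up examples, not a template supplied with ProRaster.

```python
import csv
import io

rows = [
    ["Name", "Colour", "Original", "Area_km2"],       # row 1: column labels
    ["Label", "Colour", "Class", "Data"],             # row 2: special type identifiers
    ["STRING", "RGB", "UNSIGNED_INT8", "REAL4"],      # row 3: data type identifiers
    ["No Water", "RGB(128,82,0)", 10, 12.5],          # class 0
    ["Ocean", "RGB(0,0,192)", 20, 340.0],             # class 1
    ["Lake", "RGB(87,128,255)", 28, 7.25],            # class 2
]

buffer = io.StringIO()
csv.writer(buffer).writerows(rows)   # fields containing commas are quoted automatically
csv_text = buffer.getvalue()
```

Writing through the csv module, rather than joining strings by hand, ensures that a color macro such as RGB(128,82,0) is wrapped in quotation marks and is not split into three columns.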

The data transform CSV file has a simple format. In the first row, declare the type of the transform. This can be NUMERIC, NUMERICDISCRETE, or NUMERICCONTINUOUS. In each subsequent row, declare the data map values and then the class index. All values ought to be comma delimited. The class index is a zero-based unsigned integer that maps to a row in the class table. Class indices do not have to be declared in row order.

NUMERIC: Transforms one or more defined source data values to a supplied class index. On each row, list the data values that will be transformed, followed by the zero-based class index at the end. The values do not have to be declared in any particular order. Note that this is more flexible than the same transform in a legend file as you can map multiple values to the same class index.

NUMERIC
10,11,12,0
20,21,1
30,31,32,33,2
100,101,110,111,120,3
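To make the row layout concrete, the Python sketch below reads the example above into a value-to-class lookup. It is an illustration only, and `parse_numeric_transform` is a hypothetical helper. The last field on each row is taken as the class index and every field before it as a source data value.

```python
import csv
import io

TRANSFORM_CSV = """NUMERIC
10,11,12,0
20,21,1
30,31,32,33,2
100,101,110,111,120,3
"""

def parse_numeric_transform(text):
    """Build {source value: class index} from a NUMERIC transform file."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if rows[0][0].strip() != "NUMERIC":
        raise ValueError("expected a NUMERIC transform")
    lookup = {}
    for row in rows[1:]:
        *values, class_index = row          # last field is the class index
        for value in values:
            lookup[float(value)] = int(class_index)
    return lookup

lookup = parse_numeric_transform(TRANSFORM_CSV)
```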

NUMERICDISCRETE: Transforms a range of data values to a supplied class index. On each row enter the minimum value in the source data range, the maximum value, and the class index. Any source data values that are >= the minimum value and <= the maximum value will be assigned the class index. The ranges can have gaps between them, and they do not have to be declared in any particular order.

NUMERICDISCRETE
10,12,0
20,21,1
30,33,2
100,120,3

NUMERICCONTINUOUS: Transforms a range of data values to a supplied class index. The first value in each row is the minimum value in the source data range and the second value is the maximum value. Any source data values that are >= the minimum value and < the maximum value will be assigned the class index. There should be no gaps between the ranges and they ought to be monotonically increasing.

NUMERICCONTINUOUS
0,0.5,0
0.5,5.0,1
5.0,50.0,2
50.0,500.0,3
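Because the ranges are contiguous and increasing, a value can be located with a binary search. The Python sketch below applies the half-open test (>= minimum, < maximum) to the example above. The function name is hypothetical, and returning `None` for values outside every range is an assumption standing in for a failed classification.

```python
from bisect import bisect_right

# (minimum, maximum, class index) rows from the example, in increasing order.
ROWS = [(0.0, 0.5, 0), (0.5, 5.0, 1), (5.0, 50.0, 2), (50.0, 500.0, 3)]

def classify_continuous(value, rows):
    """Return the class index whose range satisfies minimum <= value < maximum."""
    minimums = [low for low, _high, _index in rows]
    position = bisect_right(minimums, value) - 1   # last row with minimum <= value
    if position < 0:
        return None                                # below the first range
    _low, high, class_index = rows[position]
    return class_index if value < high else None   # the maximum itself is excluded
```

A value that sits exactly on a shared boundary, such as 0.5, belongs to the upper range (class 1), and 500.0 is not classified because the final maximum is excluded.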