The information in this section on ARC/INFO data structure was taken from ArcDoc, Version 7.1.2, the online help documentation contained in ARC/INFO Version 7.1.2.
ARC/INFO GIS uses a hierarchical directory structure. Data sets are organized into ARC/INFO workspaces, which are directories that contain one or more geographic data sets, a local INFO database, and other supporting data.
Coverages represent the fundamental data source for ARC/INFO. A coverage contains a set of features, each represented by a feature class such as arc, node, label point, annotation or polygon. The coverage supports the georelational model: it contains both the spatial (location) and attribute (descriptive) data for geographic features. The georelational model is the fundamental data model used in ARC/INFO.
A coverage is stored as a directory containing a set of files, with corresponding data in an INFO database identified as the INFO directory. The directory name is usually the coverage name. INFO is the relational database product integral to ARC/INFO.
The combination of feature classes present in a coverage depends on the geographic phenomena being represented. Each feature class stores attribute information in a corresponding feature attribute table. These tables reside in the workspace INFO database where the coverage resides.
Three topological concepts are used to define features: arc-node, left-right and area definition. Arc-node topology defines the connectivity of arcs; arcs connect at nodes. Polygons are defined using left-right and area definition topologies. A polygon is defined as an ordered set of connected arcs, with the constraint that the first and last arcs must connect (area-definition). For each arc, the left and right polygons are identified (left-right). Arcs also have a direction (to-from) that is defined by the location of the beginning and ending node of an arc.
Grids are ARC/INFO's raster data structure used to represent categorical data, and the GRID program extension is used for raster analysis of grid data sets. Each grid represents a spatial variable. Both raster images and maps can be stored in GRID.
Grid-based systems divide the world into discrete uniform units called cells. Every cell represents a certain specified area of the earth and is given a value to correspond to the feature or characteristic that is located at or describes the site. Location is not defined as an attribute but is inherent in the storage structure.
Grid systems treat points, lines, polygons and surfaces, and their locational structures, the same way: as cells in a grid. Analysis and computation using grids are generally very fast. Once the grids are registered, computing or deriving a value for an output cell from two or more input cells is a matter of direct value computation. No geometric detection, topology building, or error checking is necessary.
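The direct value computation described above can be sketched as follows. This is an illustration in Python, not ARC/INFO or GRID syntax; the grid values and the combine helper are hypothetical.

```python
# Two registered grids with the same rows and columns (hypothetical values).
elevation_class = [
    [1, 1, 2],
    [1, 2, 2],
]
landuse_class = [
    [10, 20, 20],
    [10, 10, 30],
]


def combine(grid_a, grid_b, func):
    """Apply func to each pair of cells that share a row and column."""
    same_shape = len(grid_a) == len(grid_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(grid_a, grid_b)
    )
    if not same_shape:
        raise ValueError("grids must have the same number of rows and columns")
    return [
        [func(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(grid_a, grid_b)
    ]


# Each output cell depends only on the input cells at the same position.
combined = combine(elevation_class, landuse_class, lambda a, b: a + b)
```

No geometry is consulted at any point; the row and column position alone pairs the input cells.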
The uniform cells are organized into a Cartesian matrix consisting of rows and columns. A row identifies all cells equidistant from the top or bottom boundary of a grid. Columns identify all cells equidistant from the left or right boundary of the grid. Each Cartesian matrix is called a grid. Every cell in a grid has a unique row and column identifier.
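Because location is inherent in the row and column position, a cell's map coordinates can be derived rather than stored. The sketch below is a Python illustration under stated assumptions: square cells, zero-based rows counted from the top boundary, zero-based columns counted from the left boundary, and a hypothetical origin and cell size.

```python
def cell_center(row, col, xmin, ymax, cellsize):
    """Return the (x, y) center of a cell.

    Assumes zero-based rows counted down from the top boundary (ymax)
    and zero-based columns counted right from the left boundary (xmin).
    """
    x = xmin + (col + 0.5) * cellsize
    y = ymax - (row + 0.5) * cellsize
    return (x, y)


# Hypothetical grid: upper-left corner at (100, 200), 10-unit cells.
upper_left_cell = cell_center(0, 0, xmin=100.0, ymax=200.0, cellsize=10.0)
other_cell = cell_center(1, 2, xmin=100.0, ymax=200.0, cellsize=10.0)
```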
A grid is similar to an ARC/INFO coverage. A grid is stored in an ARC/INFO workspace. The grid, like a coverage, is stored as a separate directory with associated tables and files that contain specific information about the grid.
Descriptive attributes of geographic features are stored in rows of a table. Each attribute is stored in a field or item, with one record (or row) of attributes for each feature. In this way, feature attribute tables can be related or linked to geographic features. The columns contain values for particular attributes, such as area, soil type, or drawing symbol. ARC/INFO manages three kinds of attribute tables: feature attribute tables, INFO data files, and external attribute tables from a relational database management system (RDBMS) such as ORACLE.
Images store photographs in rows and columns as a set of cells called pixels. Images represent two types of information: map images and descriptive images. Map images can be aerial photos or satellite imagery. Descriptive (picture) images are items such as photos and scanned documents.
TIN, or triangulated irregular network, is the data structure used to represent surfaces and the data model for the TIN software extensions to ARC/INFO. Tins are useful for representing surfaces that are highly variable, and contain discontinuities and breaklines. The main components of a tin are triangles, nodes, and edges. Nodes are locations defined by x, y, and z values (xyz) from which a tin is constructed. Triangles are formed by connecting each node with its neighbors. Edges are the sides of the triangles. The exact structure of a tin is based on certain triangulation rules that control tin creation.
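The relationship between nodes, triangles, and edges can be sketched in Python. This is only an illustration of the three components; it does not reproduce the TIN file format or the triangulation rules, and the node values are hypothetical.

```python
# Nodes: node number -> (x, y, z). Hypothetical values.
nodes = {
    1: (0.0, 0.0, 10.0),
    2: (1.0, 0.0, 12.0),
    3: (0.0, 1.0, 11.0),
    4: (1.0, 1.0, 15.0),
}

# Triangles: each is formed by connecting three neighboring nodes.
triangles = [(1, 2, 3), (2, 4, 3)]


def triangle_edges(tris):
    """Return the unique edges (sides) of the triangles as sorted node pairs."""
    edges = set()
    for a, b, c in tris:
        for u, v in ((a, b), (b, c), (c, a)):
            edges.add((min(u, v), max(u, v)))
    return sorted(edges)


edges = triangle_edges(triangles)
```

The edge shared by the two triangles, between nodes 2 and 3, appears only once in the result.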
The following are just some of the ARC/INFO coverage features and are the features that are used when converting data between the ARC/INFO and GRASS software products.
Points
Points represent geographic features that have no area or length, or features that are too small for their boundaries to be apparent for the given input map scale. A single x, y coordinate and an internal sequence number describe each point. In ARC/INFO, points are stored in a LAB file.
A point attribute table (PAT) is used to hold the attributes for the points. There is one record in the PAT for each point. The record is related to the point by the sequence number. At a minimum the PAT contains four items:
AREA Holds the area of a polygon. The value is 0 for points.
PERIMETER Holds the perimeter of a polygon. The value is 0 for points.
<cover># (where <cover> is the name of the ARC/INFO coverage) Internal sequence number (i.e., the record number) of the point feature in the LAB file.
<cover>-ID User-assigned feature ID for each point
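The link between a PAT record and its point through the internal sequence number can be sketched in Python. The coverage name WELLS, the coordinates, and the point_for_user_id helper are hypothetical, and the dictionaries stand in for the LAB file and PAT rather than reproducing their binary layout.

```python
# LAB stand-in: internal sequence number -> (x, y). Hypothetical coordinates.
lab = {
    1: (512300.0, 4178200.0),
    2: (512950.0, 4178810.0),
}

# PAT stand-in for a hypothetical point coverage named WELLS.
pat = [
    {"AREA": 0.0, "PERIMETER": 0.0, "WELLS#": 1, "WELLS-ID": 101},
    {"AREA": 0.0, "PERIMETER": 0.0, "WELLS#": 2, "WELLS-ID": 205},
]


def point_for_user_id(user_id):
    """Find a point's coordinates from its user-assigned ID."""
    for record in pat:
        if record["WELLS-ID"] == user_id:
            # The internal sequence number relates the record to the point.
            return lab[record["WELLS#"]]
    raise KeyError(user_id)


location = point_for_user_id(205)
```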
Arcs
Arcs represent both linear features and the borders of areal features. Linear features represented by arcs can have length, but no area (e.g., elevation contours), or can be long narrow features whose width is not apparent at a given map scale (e.g., streets). Each linear feature may be made up of many arcs. Nodes indicate the endpoints and intersections of arcs. In addition, nodes can represent point features, which connect segments of a linear feature.
Arcs are stored in two coverage files: ARC and AAT. The ARC file contains one record for each arc. Each record contains the arc's user-id, location and shape information defined as a series of x,y coordinates, the from-node and to-node, and the left and right polygon numbers. Descriptive data about arcs are stored in an arc attribute table (AAT). There is one record in the AAT for each arc in the coverage. The record is related to the feature by the internal sequence number stored for each arc. At a minimum, the following items are contained in the AAT:
FNODE# Internal sequence number of the from-node
TNODE# Internal sequence number of the to-node
LPOLY# Internal sequence number of the left polygon; set to 0 if the coverage does not contain polygons.
RPOLY# Internal sequence number of the right polygon; set to 0 if the coverage does not contain polygons.
LENGTH Length in coverage units.
<cover># Internal sequence number (i.e., the record number) of the arc in the ARC file.
<cover>-ID User-assigned feature ID.
The from-node number (FNODE#) and the to-node number (TNODE#) identify which arcs are connected (share a common node). The left polygon number (LPOLY#) and the right polygon number (RPOLY#) identify which polygons are contiguous (share a common arc).
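Both relationships can be read directly from the AAT items, as the following Python sketch shows. The coverage name PARCELS and the three arc records are hypothetical; polygon number 1 stands for the area outside the two parcels in this example.

```python
from collections import defaultdict

# AAT stand-in: three arcs running between two nodes, bounding polygons 2 and 3.
aat = [
    {"PARCELS#": 1, "FNODE#": 1, "TNODE#": 2, "LPOLY#": 2, "RPOLY#": 1},
    {"PARCELS#": 2, "FNODE#": 2, "TNODE#": 1, "LPOLY#": 2, "RPOLY#": 3},
    {"PARCELS#": 3, "FNODE#": 1, "TNODE#": 2, "LPOLY#": 1, "RPOLY#": 3},
]


def arcs_by_node(records):
    """Arc-node topology: list the arcs that meet at each node."""
    result = defaultdict(list)
    for rec in records:
        for node in {rec["FNODE#"], rec["TNODE#"]}:
            result[node].append(rec["PARCELS#"])
    return dict(result)


def contiguous_polygons(records):
    """Left-right topology: polygon pairs that share a common arc."""
    pairs = set()
    for rec in records:
        left, right = rec["LPOLY#"], rec["RPOLY#"]
        if left and right and left != right:  # 0 means no polygon topology
            pairs.add((min(left, right), max(left, right)))
    return sorted(pairs)


connected = arcs_by_node(aat)
neighbors = contiguous_polygons(aat)
```

No coordinates are needed for either query; the four topology items are sufficient.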
Polygons
Polygons are used to represent area features. A polygon is defined by a series of arcs comprising its border and by a label point positioned inside the polygon. The user-id is assigned to the label point.
Polygons are stored topologically using left-right topology stored in the ARC file. The polygon arc list (PAL) file contains a list of all arcs and nodes defining each polygon's boundary. There is one record in the PAL for each polygon in the coverage. The CNT (centroid) file stores the label point numbers for each polygon. Label point coordinates are stored in the LAB file. Polygons require at least one label point in order to associate attributes. Descriptive data about polygons is stored in a polygon attribute table (PAT). There is one record in the PAT for each polygon, which is related to the polygon using the polygon's internal sequence number. At a minimum, the PAT contains four items:
AREA Holds the area of a polygon.
PERIMETER Holds the perimeter of a polygon.
<cover># Internal sequence number of a polygon.
<cover>-ID User-assigned feature ID for each polygon.
Nodes
Node coordinates are not stored explicitly within a coverage. Instead, node locations are stored as a part of each arc, as the arc's beginning and ending vertices. Internal numbers of nodes are automatically assigned and stored as part of the arc information in the ARC file. When an AAT is built for a coverage, it contains the FNODE# and TNODE# items.
When nodes are used to represent point features, descriptive data is stored in a node attribute table (NAT). There is one record in the NAT for each node. The record is related to the node by the node's internal sequence number. At a minimum, the following items are contained in a NAT:
ARC# Internal sequence number of one of the arcs that connects at the node location. If more than one arc shares the node, the arc with the lowest internal number is used. This allows the x,y coordinate for the node to be read from the arc's record in the ARC file.
<cover># Internal sequence number of the node.
<cover>-ID User-assigned feature ID. When a NAT is initially created, node IDs are automatically set equal to the node's internal sequence number.
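The way ARC# lets a node's coordinate be recovered from an arc record can be sketched in Python. The coverage name STREAMS, the arc data, and the node_xy helper are hypothetical; the dictionaries stand in for the ARC file and NAT.

```python
# ARC file stand-in: arc number -> from-node, to-node, and vertices.
arcs = {
    1: {"fnode": 1, "tnode": 2, "vertices": [(0.0, 0.0), (5.0, 1.0), (10.0, 0.0)]},
    2: {"fnode": 2, "tnode": 3, "vertices": [(10.0, 0.0), (10.0, 8.0)]},
}

# NAT stand-in for a hypothetical coverage named STREAMS.
# Node 2 is shared by arcs 1 and 2, so ARC# holds the lowest arc number (1).
nat = [
    {"ARC#": 1, "STREAMS#": 1, "STREAMS-ID": 1},
    {"ARC#": 1, "STREAMS#": 2, "STREAMS-ID": 2},
    {"ARC#": 2, "STREAMS#": 3, "STREAMS-ID": 3},
]


def node_xy(record):
    """Read a node's coordinate from the first or last vertex of its arc."""
    arc = arcs[record["ARC#"]]
    node = record["STREAMS#"]
    if arc["fnode"] == node:
        return arc["vertices"][0]
    if arc["tnode"] == node:
        return arc["vertices"][-1]
    raise ValueError("arc %d does not connect at node %d" % (record["ARC#"], node))


node_locations = [node_xy(record) for record in nat]
```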
Tics
A tic is a registration or geographic control point for a coverage. Tics allow coverage coordinates to be registered to a common coordinate system. All tic information for a coverage is stored in the TIC file. It contains the following items:
IDTIC The user-id for each tic
XTIC The tic's x-coordinate
YTIC The tic's y-coordinate
Coverage extent represents the outer boundary of a coverage. It is the minimum bounding rectangle that defines the coordinate limits (extreme minimum and maximum coordinates) of coverage arcs and label points, and by definition, polygons, route-systems, and regions.
The BND is typically used to set a map extent for coverage drawing and display operations. It is often used as a default map extent for quick coverage display. Many spatial processes use the BND to determine whether one coverage overlaps another and to sort coverage features by location for processing.
All extent information for a coverage is stored in the BND file, which contains the following items:
XMIN The x-coordinate of the coverage extent's lower-left corner
YMIN The y-coordinate of the coverage extent's lower-left corner
XMAX The x-coordinate of the coverage extent's upper-right corner
YMAX The y-coordinate of the coverage extent's upper-right corner
A coverage containing no arcs or label points (or only a single label point) will have an undefined BND.
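The minimum bounding rectangle and the undefined case can be sketched in Python. The coverage_extent helper and the sample coordinates are hypothetical, and returning None for an undefined extent is a choice made for this sketch, not how ARC/INFO stores it.

```python
def coverage_extent(arcs, label_points):
    """Return the BND items as a dict, or None when the extent is undefined.

    arcs is a list of vertex lists; label_points is a list of (x, y) pairs.
    """
    coords = [xy for arc in arcs for xy in arc] + list(label_points)
    if len(coords) < 2:
        # No arcs or label points, or only a single label point.
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return {"XMIN": min(xs), "YMIN": min(ys), "XMAX": max(xs), "YMAX": max(ys)}


bnd = coverage_extent(
    arcs=[[(2.0, 3.0), (8.0, 9.0)], [(8.0, 9.0), (12.0, 1.0)]],
    label_points=[(5.0, 5.0)],
)
undefined = coverage_extent(arcs=[], label_points=[(5.0, 5.0)])
```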
A grid cell is a discrete, uniform cell unit. Every cell represents a specific area on the earth. Each cell is given a value to correspond to the feature or characteristic that is located at or describes the site. An integer value is normally associated with each cell, which defines the group, class, or category the cell belongs to.
All integer grids include an INFO table called the value attribute table (VAT). The VAT always contains at least two items: VALUE and COUNT. VALUE is the value assigned to the cells in the grid, and COUNT is the number of cells in the grid that are assigned that value. Any number of additional items representing other attributes of the group, class, or category can be added or related to the VAT. There are no feature-IDs in a VAT.
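The two mandatory VAT items can be derived from an integer grid as in the Python sketch below. The grid values and the build_vat helper are hypothetical, and ordering the records by VALUE is an assumption of this sketch.

```python
from collections import Counter

# Hypothetical integer grid; each value is a class or category code.
grid = [
    [1, 1, 2],
    [3, 2, 2],
    [1, 1, 3],
]


def build_vat(cells):
    """Return one record per distinct cell value with its cell count."""
    counts = Counter(value for row in cells for value in row)
    return [{"VALUE": value, "COUNT": counts[value]} for value in sorted(counts)]


vat = build_vat(grid)
```

The COUNT items always sum to the number of cells in the grid, and the records carry no feature IDs.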
An ARC/INFO coverage can contain both the spatial and attribute information. Feature class attributes are stored in feature tables (i.e., AAT, NAT, PAT, and TIC).
Feature attribute tables are generated by ARC/INFO when you create a feature class topology. For each feature in the coverage feature class there is one record in the feature attribute table. The attribute tables contain a mandatory set of attribute items required by ARC/INFO.
In INFO, a column in an attribute table is called an item. The mandatory items vary between feature classes. You can add further items to the feature attribute table to hold whatever information you need to record about the features in your database.
There is always one record in the feature attribute table corresponding to each feature in the coverage. Both the spatial information used to define the coverage feature and the corresponding record in the feature attribute table contain the feature number so that a one-to-one correspondence is maintained between the feature and its record in the feature attribute table. Even though the records in a feature attribute table maintain a one-to-one correspondence with coverage features, one-to-many and many-to-many relationships can be managed between the feature attribute table and corresponding tables.
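A one-to-many relationship between a feature attribute table and a related table can be sketched in Python. The coverage name PARCELS, the OWNERS table, and the relate helper are hypothetical; this illustrates the idea only and is not the ARC/INFO relate mechanism.

```python
from collections import defaultdict

# Feature attribute table stand-in: exactly one record per polygon.
pat = [
    {"PARCELS#": 2, "PARCELS-ID": 101},
    {"PARCELS#": 3, "PARCELS-ID": 102},
]

# Related table stand-in: a parcel may have several owners.
owners = [
    {"PARCELS-ID": 101, "OWNER": "Smith"},
    {"PARCELS-ID": 101, "OWNER": "Jones"},
    {"PARCELS-ID": 102, "OWNER": "Lee"},
]


def relate(features, related, key):
    """Group related records under each feature's value of the key item."""
    index = defaultdict(list)
    for record in related:
        index[record[key]].append(record)
    return {feature[key]: index.get(feature[key], []) for feature in features}


owners_by_parcel = relate(pat, owners, "PARCELS-ID")
```

The feature attribute table keeps its one record per feature; the many side of the relationship lives entirely in the related table.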
The specification of the format for each record in the data file is referred to as the item definition. Each record can be up to 4,096 characters (bytes) long. Any number of items can be defined for the data file. Items are defined by their name, the data type, the number of characters (or bytes) used to store values, a display width, and (for real numbers) the number of decimals you wish to display. INFO uses the following conventions to define the format of each item in a data file:
Item Name Any name with up to 16 alphanumeric characters.
Item Width Number of spaces (or bytes) used to store item values.
Output Width Number of spaces used to display the item values.
Item Type The data type of the item.
No. of decimals The number of digits to the right of the decimal place for item types that hold decimal numbers.
The following INFO file item types are supported:
B Whole numbers stored as binary integers (width of 2 or 4 bytes only). The maximum value for a width of 2 is 32,767; for a width of 4 it is 2,147,483,647.
C Character (width up to 320 alphanumeric characters)
D Dates in the form DD/MM/YY or DD/MM/YYYY (item width is fixed at 8 and stored internally as YYYYMMDD).
F Decimal numbers stored in internal floating-point representation (width of 4 or 8 bytes only). A 4-byte width is single-precision real (7 digits of precision), and 8 bytes is double precision (14 digits of precision).
I Integers stored as 1 byte per digit (width from 1 to 16, maximum value possible is 2,147,483,647).
N Decimal numbers stored as 1 byte per digit (width from 1 to 16).
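The limits quoted above can be checked with a short Python sketch. The ItemDef tuple, the sample item definitions, and both helper functions are hypothetical; treating type B values as signed integers is an assumption that matches the stated maximums.

```python
from collections import namedtuple

ItemDef = namedtuple("ItemDef", "name width output_width item_type decimals")

MAX_RECORD_BYTES = 4096  # record length limit given in the text
MAX_NAME_CHARS = 16      # item name limit given in the text

# Hypothetical item definitions for a coverage named SOILS.
items = [
    ItemDef("AREA", 4, 12, "F", 3),
    ItemDef("PERIMETER", 4, 12, "F", 3),
    ItemDef("SOILS#", 4, 5, "B", None),
    ItemDef("SOILS-ID", 4, 5, "B", None),
    ItemDef("SOIL-TYPE", 12, 12, "C", None),
]


def max_binary_value(width):
    """Largest value a signed binary integer of the given byte width holds."""
    if width not in (2, 4):
        raise ValueError("type B items are 2 or 4 bytes wide")
    return 2 ** (8 * width - 1) - 1


def record_length(item_defs):
    """Sum the item widths after checking the name and record limits."""
    for item in item_defs:
        if len(item.name) > MAX_NAME_CHARS:
            raise ValueError("item name too long: " + item.name)
    total = sum(item.width for item in item_defs)
    if total > MAX_RECORD_BYTES:
        raise ValueError("record exceeds %d bytes" % MAX_RECORD_BYTES)
    return total


length = record_length(items)
```

The item width, not the output width, determines the record length.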
Both feature attribute tables and related INFO files are stored in INFO databases. The INFO database is a file-based system. Each ARC/INFO workspace contains an INFO database directory; thus a multi-workspace ARC/INFO database contains many INFO databases.
Feature attribute tables must be stored in an INFO database located in the coverage workspace. In ARC/INFO, a workspace is a directory that contains a set of coverages and their INFO subdirectory. The INFO subdirectory contains all of the feature attribute tables for those coverages plus any other associated INFO files.
An important point about INFO is that sets of INFO files are managed as a unit in an INFO database. An INFO directory is a collection of INFO files stored in a single-user workspace. These files are only accessible from the ARC/INFO modules and the INFO database program.
Any selected set of INFO file records or related records will, by default, be returned in the order that the records were inserted into the INFO file. If you wish to change this order, then you have to sort the INFO file. INFO files can be sorted on one or more item values so that records are returned in a predictable order. Since feature attribute tables must always remain ordered by the cover#, sorting these files on any other item will corrupt your coverage.
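Sorting on more than one item, while leaving the stored record order alone, can be sketched in Python. The coverage name SOILS and the records are hypothetical; this illustrates the ordering concept only and is not an INFO command.

```python
from operator import itemgetter

# Feature attribute table stand-in, stored in cover# order.
pat = [
    {"SOILS#": 1, "SOILS-ID": 30, "SOIL-TYPE": "loam"},
    {"SOILS#": 2, "SOILS-ID": 10, "SOIL-TYPE": "clay"},
    {"SOILS#": 3, "SOILS-ID": 20, "SOIL-TYPE": "clay"},
]

# sorted() returns a new list ordered by SOIL-TYPE, then SOILS-ID;
# the stored records keep their original order.
by_type = sorted(pat, key=itemgetter("SOIL-TYPE", "SOILS-ID"))

stored_order = [record["SOILS#"] for record in pat]
report_order = [record["SOILS#"] for record in by_type]
```

Working on a sorted copy gives a predictable reporting order without disturbing the order of the records in the table itself.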