Attribute Fields¶
A field has a name, a type, and further characteristics that depend on the type, such as the precision of decimal numbers. Geometry fields are treated separately from other fields. Non-geometry fields are called attribute fields here.
Fields are retrieved from layer definitions and are therefore layer-dependent. When the layer number is not given explicitly, layer 0 is used.
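Fields are described by osgeo.ogr.FieldDefn. Used incorrectly, it produces unexpected behaviour: a FieldDefn obtained from a layer definition is only valid while the dataset it belongs to is alive.
For example, the last two print statements in the code block below run after the dataset has been deleted. The program does not crash, but the field name is wrongly reported as “S”.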
from osgeo import ogr
filename = 'D:/tmp/girs/DEU_adm_shp/DEU_adm4.shp'
dataset = ogr.Open(filename, 0)  # 0 = open read-only
if dataset is None:
    raise ValueError('Could not open {}'.format(filename))
lyr = dataset.GetLayer(0)
print(lyr)
ldf = lyr.GetLayerDefn()
print(ldf)
fd = ldf.GetFieldDefn(0)
print('Field name: {}'.format(fd.GetName()))
print(fd)
del dataset  # This causes the problem: fd outlives the dataset it came from
print('Field name: {}'.format(fd.GetName()))  # does not crash, but prints the wrong name "S"
print(fd)
Output:
<osgeo.ogr.Layer; proxy of <Swig Object of type 'OGRLayerShadow *' at 0x0000000002536E70> >
<osgeo.ogr.FeatureDefn; proxy of <Swig Object of type 'OGRFeatureDefnShadow *' at 0x0000000002536EA0> >
Field name: ID_0
<osgeo.ogr.FieldDefn; proxy of <Swig Object of type 'OGRFieldDefnShadow *' at 0x0000000002536F60> >
Field name: S
<osgeo.ogr.FieldDefn; proxy of <Swig Object of type 'OGRFieldDefnShadow *' at 0x0000000002536F60> >
Class FieldDefinition¶
The girs class girs.feat.layers.FieldDefinition stores the field properties obtained from osgeo.ogr.FieldDefn, avoiding this unexpected behaviour. The class has the following attributes:
class FieldDefinition:
- name
- oft_type
- width
- precision
- oft_subtype
- nullable
- default
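The idea of detaching field properties from the OGR object can be sketched as follows. This is not the girs implementation: FieldSnapshot and snapshot_field are hypothetical names, and the sketch only assumes that the object passed in offers the osgeo.ogr.FieldDefn getters GetName(), GetType(), GetWidth(), GetPrecision(), GetSubType(), IsNullable() and GetDefault().

```python
from collections import namedtuple

# Hypothetical stand-in for girs.feat.layers.FieldDefinition (not the girs code).
FieldSnapshot = namedtuple(
    'FieldSnapshot',
    ['name', 'oft_type', 'width', 'precision', 'oft_subtype', 'nullable', 'default'])


def snapshot_field(fd):
    """Copy the properties of an ogr.FieldDefn-like object into plain Python values."""
    return FieldSnapshot(
        name=fd.GetName(),
        oft_type=fd.GetType(),
        width=fd.GetWidth(),
        precision=fd.GetPrecision(),
        oft_subtype=fd.GetSubType(),
        nullable=fd.IsNullable(),
        default=fd.GetDefault())
```

Called as snapshot_field(ldf.GetFieldDefn(0)) while the dataset is still open, the returned tuple holds only Python values, so it can still be read after the dataset has been deleted.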
Field properties¶
Number of fields¶
from girs.feat.layers import LayersReader
lrs = LayersReader('D:/tmp/girs/DEU_adm_shp/DEU_adm4.shp')
print(lrs.get_field_count())  # layer number 0
print(lrs.get_field_count(layer_number=0))  # same result
Field names¶
from girs.feat.layers import LayersReader
lrs = LayersReader('D:/tmp/girs/DEU_adm_shp/DEU_adm4.shp')
print(lrs.get_field_names())
Output:
['ID_0', 'ISO', 'NAME_0', 'ID_1', 'NAME_1', 'ID_2', 'NAME_2', 'ID_3', 'NAME_3', 'ID_4',
'NAME_4', 'VARNAME_4', 'CCN_4', 'CCA_4', 'TYPE_4', 'ENGTYPE_4']
Field numbers¶
Fields are numbered from 0 to n-1, where n is the number of fields in a layer (see get_field_count()). To get the field numbers (also called field indices) of given field names:
from girs.feat.layers import LayersReader
lrs = LayersReader('D:/tmp/girs/DEU_adm_shp/DEU_adm4.shp')
print(lrs.get_field_numbers(['NAME_0', 'NAME_1', 'NAME_2', 'NAME_3', 'NAME_4']))  # layer 0
Output:
[2, 4, 6, 8, 10]
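These numbers are simply the positions of the names in the list returned by get_field_names(). A minimal plain-Python sketch of that lookup, using the field names printed in the previous section: field_numbers is a hypothetical helper, not part of girs, and returning -1 for an unknown name is an assumption modelled on ogr.FeatureDefn.GetFieldIndex().

```python
FIELD_NAMES = ['ID_0', 'ISO', 'NAME_0', 'ID_1', 'NAME_1', 'ID_2', 'NAME_2', 'ID_3',
               'NAME_3', 'ID_4', 'NAME_4', 'VARNAME_4', 'CCN_4', 'CCA_4', 'TYPE_4',
               'ENGTYPE_4']


def field_numbers(all_names, wanted):
    """Return the 0-based position of each wanted name, or -1 if it is missing."""
    positions = {name: i for i, name in enumerate(all_names)}
    return [positions.get(name, -1) for name in wanted]


print(field_numbers(FIELD_NAMES, ['NAME_0', 'NAME_1', 'NAME_2', 'NAME_3', 'NAME_4']))
# prints [2, 4, 6, 8, 10]
```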
Field definition¶
from girs.feat.layers import LayersReader
lrs = LayersReader('D:/tmp/girs/DEU_adm_shp/DEU_adm4.shp')
print(lrs.get_field_definition('NAME_0'))
print(lrs.get_field_definition('NAME_0', layer_number=0))  # same result
Field definitions¶
from girs.feat.layers import LayersReader
lrs = LayersReader('D:/tmp/girs/DEU_adm_shp/DEU_adm4.shp')
print(lrs.get_field_definitions(field_names=['NAME_0', 'NAME_1']))
print(lrs.get_field_definitions())
Output:
[NAME_0 (String), NAME_1 (String)]
[ID_0 (Integer64), ISO (String), NAME_0 (String), ID_1 (Integer64), NAME_1 (String),
ID_2 (Integer64), NAME_2 (String), ID_3 (Integer64), NAME_3 (String), ID_4 (Integer64),
NAME_4 (String), VARNAME_4 (String), CCN_4 (Integer64), CCA_4 (String), TYPE_4 (String),
ENGTYPE_4 (String)]
Field definitions as data frame¶
A full description of all fields in a layer is shown with get_field_definitions_data_frame():
from girs.feat.layers import LayersReader
lrs = LayersReader('D:/tmp/girs/DEU_adm_shp/DEU_adm4.shp')
print(lrs.get_field_definitions_data_frame())
Output:
         name  type  type_name  width  precision  subtype  subtype_name  nullable  default
0        ID_0    12  Integer64     10          0        0          None         1     None
1         ISO     4     String      3          0        0          None         1     None
2      NAME_0     4     String     75          0        0          None         1     None
3        ID_1    12  Integer64     10          0        0          None         1     None
4      NAME_1     4     String     75          0        0          None         1     None
5        ID_2    12  Integer64     10          0        0          None         1     None
6      NAME_2     4     String     75          0        0          None         1     None
7        ID_3    12  Integer64     10          0        0          None         1     None
8      NAME_3     4     String     75          0        0          None         1     None
9        ID_4    12  Integer64     10          0        0          None         1     None
10     NAME_4     4     String    100          0        0          None         1     None
11  VARNAME_4     4     String    100          0        0          None         1     None
12      CCN_4    12  Integer64     10          0        0          None         1     None
13      CCA_4     4     String     20          0        0          None         1     None
14     TYPE_4     4     String     35          0        0          None         1     None
15  ENGTYPE_4     4     String     35          0        0          None         1     None
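In this table the type column holds the numeric OGR field type code and type_name its readable form. For reading such codes without GDAL at hand, the values of the OGRFieldType enumeration can be written out as a plain dictionary. OFT_NAMES and type_name are hypothetical names used only in this sketch; with GDAL installed, ogr.GetFieldTypeName(code) should give the same names for these codes.

```python
# OGRFieldType codes and their readable names (illustrative copy of the enumeration).
OFT_NAMES = {
    0: 'Integer', 1: 'IntegerList', 2: 'Real', 3: 'RealList',
    4: 'String', 5: 'StringList', 6: 'WideString', 7: 'WideStringList',
    8: 'Binary', 9: 'Date', 10: 'Time', 11: 'DateTime',
    12: 'Integer64', 13: 'Integer64List',
}


def type_name(code):
    """Return the readable name of an OGR field type code."""
    return OFT_NAMES.get(code, '(unknown)')


print(type_name(12))  # prints Integer64
print(type_name(4))   # prints String
```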
Field values¶
get_field_values() returns the field values as a pandas DataFrame indexed by feature ID (FID). The geometry field is not included.
from girs.feat.layers import LayersReader
lrs = LayersReader('D:/tmp/girs/DEU_adm_shp/DEU_adm4.shp')
print(lrs.get_field_values(field_names=['NAME_1', 'NAME_2', 'NAME_3']))
Output:
                  NAME_1           NAME_2            NAME_3
FID
0      Baden-Württemberg  Alb-Donau-Kreis      Allmendingen
1      Baden-Württemberg  Alb-Donau-Kreis      Allmendingen
2      Baden-Württemberg  Alb-Donau-Kreis        Blaubeuren
...                  ...              ...               ...
11299          Thüringen    Weimarer Land  Nordkreis Weimar
11300          Thüringen    Weimarer Land  Nordkreis Weimar
11301          Thüringen           Weimar            Weimar
OGR field definitions¶
If it is necessary to work with instances of osgeo.ogr.FieldDefn, the least error-prone way is to use the method fields(layer_number), which iterates over all fields in the layer definition. Even so, it is better avoided where possible:
from girs.feat.layers import LayersReader
lrs = LayersReader('D:/tmp/girs/DEU_adm_shp/DEU_adm4.shp')
for fd in lrs.fields(layer_number=0):
    print(fd.GetName())  # don't use fd outside the loop if lrs is deleted!
Use instead:
from girs.feat.layers import LayersReader
lrs = LayersReader('D:/tmp/girs/DEU_adm_shp/DEU_adm4.shp')
for field_name in lrs.get_field_names():
    print(field_name)