drv_georss.html 10.3 KB
Newer Older
xuebingbing's avatar
xuebingbing committed
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226
<html>
<head>
<title>OGR GeoRSS driver</title>
</head>

<body bgcolor="#ffffff">

<h1>GeoRSS :  Geographically Encoded Objects for RSS feeds</h1>

(Driver available in GDAL 1.7.0 or later)<p>

GeoRSS is a way of encoding location in RSS or Atom feeds.<p>

OGR has support for GeoRSS reading and writing. Read support is only available if GDAL is built with <i>expat</i> library support<p>

The driver supports RSS documents in RSS 2.0 or Atom 1.0 format.<p>

It also supports the <a href="http://georss.org/model">3 ways of encoding location</a> :
GeoRSS simple, GeoRSS GML and W3C Geo (the later being deprecated).<p>

The driver can read and write documents without location information as well.<p>

The default datum for GeoRSS document is the WGS84 datum
(EPSG:4326). Although that GeoRSS locations are encoded in
latitude-longitude order in the XML file, all coordinates reported or
expected by the driver are in longitude-latitude order. The
longitude/latitude order used by OGR is meant for compatibility with
most of the rest of OGR drivers and utilities. For locations encoded
in GML, the driver will support the srsName attribute for describing
other SRS.<p>

Simple and GML encoding support the notion of a <i>box</i> as a geometry. This will be decoded as a rectangle (Polygon geometry) in OGR Simple Feature model.<p>

A single layer is returned while reading a RSS document.
Features are retrieved from the content of &lt;item&gt; (RSS document) or &lt;entry&gt; (Atom document) elements.<p>

<h2>Encoding issues</h2>

Expat library supports reading the following built-in encodings :
<ul>
<li>US-ASCII</li>
<li>UTF-8</li>
<li>UTF-16</li>
<li>ISO-8859-1</li>
</ul>

OGR 1.8.0 adds supports for Windows-1252 encoding (for previous versions, altering the encoding
mentioned in the XML header to ISO-8859-1 might work in some cases).<p>

The content returned by OGR will be encoded in UTF-8, after the conversion from the
encoding mentioned in the file header is.<p>

If your GeoRSS file is not encoded in one of the previous encodings, it will not be parsed by the
GeoRSS driver. You may convert it into one of the supported encoding with the <i>iconv</i> utility
for example and change accordingly the <i>encoding</i> parameter value in the XML header.<br>
<p>

When writing a GeoRSS file, the driver expects UTF-8 content to be passed in.<p>

<h2>Field definitions</h2>

While reading a GeoRSS document, the driver will first make a full scan of the document to get the field definitions.<p>

The driver will return elements found in the base schema of RSS channel or Atom feeds. It will also return extension elements,
that are allowed in namespaces.<p>

Attributes of first level elements will be exposed as fields.<p>

Complex content (elements inside first level elements) will be returned as an XML blob.<p>

When a same element is repeated, a number will be appended at the end of the attribute name for the repetitions. This is useful for the &lt;category&gt; element in RSS and Atom documents for example.<p>

<p>
The following content :
<pre>
    &lt;item&gt;
        &lt;title&gt;My tile&lt;/title&gt;
        &lt;link&gt;http://www.mylink.org&lt;/link&gt;
        &lt;description&gt;Cool description !&lt;/description&gt;
        &lt;pubDate&gt;Wed, 11 Jul 2007 15:39:21 GMT&lt;/pubDate&gt;
        &lt;guid&gt;http://www.mylink.org/2007/07/11&lt;/guid&gt;
        &lt;category&gt;Computer Science&lt;/category&gt;
        &lt;category&gt;Open Source Software&lt;/category&gt;
        &lt;georss:point&gt;49 2&lt;/georss:point&gt;
        &lt;myns:name type="my_type"&gt;My Name&lt;/myns:name&gt;
        &lt;myns:complexcontent&gt;
            &lt;myns:subelement&gt;Subelement&lt;/myns:subelement&gt;
        &lt;/myns:complexcontent&gt;
    &lt;/item&gt;
</pre>
will be interpreted in the OGR SF model as :
<pre>
  title (String) = My title
  link (String) = http://www.mylink.org
  description (String) = Cool description !
  pubDate (DateTime) = 2007/07/11 15:39:21+00
  guid (String) = http://www.mylink.org/2007/07/11
  category (String) = Computer Science
  category2 (String) = Open Source Software
  myns_name (String) = My Name
  myns_name_type (String) = my_type
  myns_complexcontent (String) = &lt;myns:subelement&gt;Subelement&lt;/myns:subelement&gt;
  POINT (2 49)
</pre>
<p>

<h2>Creation Issues</h2>

On export, all layers are written to a single file. Update of existing
files is not supported.<p>
If the output file already exits, the writing will not occur. You have to delete the existing file first.<p>

A layer that is created cannot be immediately read without closing and reopening the file. That is to say that a dataset is read-only or write-only in the same session.<p>

Supported geometries :
<ul>
<li>Features of type wkbPoint/wkbPoint25D.</li>
<li>Features of type wkbLineString/wkbLineString25D.</li>
<li>Features of type wkbPolygon/wkbPolygon25D.</li>
</ul>
Other type of geometries are not supported and will be silently ignored.
<p>

The GeoRSS writer supports the following <i>dataset</i> creation options:
<ul>
<li> <b>FORMAT</b>=RSS|ATOM: whether the document must be in RSS 2.0 or Atom 1.0 format. Default value : RSS<p>
<li> <b>GEOM_DIALECT</b>=SIMPLE|GML|W3C_GEO (RSS or ATOM document): the encoding of location information. Default value : SIMPLE<br>
W3C_GEO only supports point geometries.<br>
SIMPLE or W3C_GEO only support geometries in geographic WGS84 coordinates.<p>
<li> <b>USE_EXTENSIONS</b>=YES|NO. Default value : NO. If defined to YES, extension fields (that is to say fields not in the base schema of RSS or Atom documents) will be written. If the field name not found in the base schema matches the foo_bar pattern, foo will be considered as the namespace of the element, and a &lt;foo:bar&gt; element will be written. Otherwise, elements will be written in the &lt;ogr:&gt; namespace.<p>
<li> <b>WRITE_HEADER_AND_FOOTER</b>=YES|NO. Default value : YES. If defined to NO, only &lt;entry&gt; or &lt;item&gt; elements will be written. The user will have to provide the appropriate header and footer of the document. Following options are not relevant in that case.<p>
<li> <b>HEADER</b> (RSS or Atom document): XML content that will be put between the &lt;channel&gt; element and the first &lt;item&gt; element for a RSS document, or between the xml tag and the first &lt;entry&gt; element for an Atom document. If it is specified, it
will overload the following options.<p>
<li> <b>TITLE</b> (RSS or Atom document): value put inside the &lt;title&gt; element in the header. If not provided, a dummy value will be used as that element is compulsory.<p>
<li> <b>DESCRIPTION</b> (RSS document): value put inside the &lt;description&gt; element in the header. If not provided, a dummy value will be used as that element is compulsory.<p>
<li> <b>LINK</b> (RSS document): value put inside the &lt;link&gt; element in the header. If not provided, a dummy value will be used as that element is compulsory.<p>
<li> <b>UPDATED</b> (Atom document): value put inside the &lt;updated&gt; element in the header. Should be formatted as a XML datetime. If not provided, a dummy value will be used as that element is compulsory.<p>
<li> <b>AUTHOR_NAME</b> (Atom document): value put inside the &lt;author&gt;&lt;name&gt; element in the header. If not provided, a dummy value will be used as that element is compulsory.<p>
<li> <b>ID</b> (Atom document): value put inside the &lt;id&gt; element in the header. If not provided, a dummy value will be used as that element is compulsory.<p>
</ul>
<p>

<p>
When translating from a source dataset, it may be necessary to rename the field names from the source dataset to the expected RSS or ATOM attribute names, such as &lt;title&gt;, &lt;description&gt;, etc...
This can be done with a <a href="drv_vrt.html">OGR VRT</a> dataset, or by using the "-sql" option of the ogr2ogr utility (see <a href="http://trac.osgeo.org/gdal/wiki/rfc21_ogrsqlcast">RFC21: OGR SQL type cast and field name alias</a>)

</li>

<h2>VSI Virtual File System API support</h2>

(Some features below might require OGR >= 1.9.0)<p>

The driver supports reading and writing to files managed by VSI Virtual File System API, which include
"regular" files, as well as files in the /vsizip/ (read-write) , /vsigzip/ (read-write) , /vsicurl/ (read-only) domains.<p>

Writing to /dev/stdout or /vsistdout/ is also supported.<p>

<h2>Example</h2>

<li>The ogrinfo utility can be used to dump the content of a GeoRSS datafile :

<pre>
ogrinfo -ro -al input.xml
</pre>
</li>
<p><br>

<li>The ogr2ogr utility can be used to do GeoRSS to GeoRSS translation. For example, to translate a Atom document into a RSS document

<pre>
ogr2ogr -f GeoRSS output.xml input.xml "select link_href as link, title, content as description, author_name as author, id as guid from georss"
</pre>
<br>
Note : in this example we map equivalent fields, from the source name to the expected name of the destination format.
</li>
<p><br>


<li>The following Python script shows how to read the content of a online GeoRSS feed
<pre>
    #!/usr/bin/python
    import gdal
    import ogr
    import urllib2

    url = 'http://earthquake.usgs.gov/eqcenter/catalogs/eqs7day-M5.xml'
    content = None
    try:
        handle = urllib2.urlopen(url)
        content = handle.read()
    except urllib2.HTTPError, e:
        print 'HTTP service for %s is down (HTTP Error: %d)' % (url, e.code)
    except:
        print 'HTTP service for %s is down.' %(url)

    # Create in-memory file from the downloaded content
    gdal.FileFromMemBuffer('/vsimem/temp', content)

    ds = ogr.Open('/vsimem/temp')
    lyr = ds.GetLayer(0)
    feat = lyr.GetNextFeature()
    while feat is not None:
        print feat.GetFieldAsString('title') + ' ' + feat.GetGeometryRef().ExportToWkt()
        feat.Destroy()
        feat = lyr.GetNextFeature()

    ds.Destroy()

    # Free memory associated with the in-memory file
    gdal.Unlink('/vsimem/temp')
</pre>
</li>

<h2>See Also</h2>

<ul>
<li> <a href="http://georss.org/">Home page for GeoRSS format</a><p>
<li> <a href="http://en.wikipedia.org/wiki/GeoRSS">Wikipedia page for GeoRSS format</a><p>
<li> <a href="http://en.wikipedia.org/wiki/RSS">Wikipedia page for RSS format</a><p>
<li> <a href="http://www.rssboard.org/rss-specification">RSS 2.0 specification</a><p>
<li> <a href="http://en.wikipedia.org/wiki/Atom_(standard)">Wikipedia page for Atom format</a><p>
<li> <a href="http://www.ietf.org/rfc/rfc4287.txt">Atom 1.0 specification</a><p>
</ul>

</body>
</html>