Tuesday, October 13, 2009

Rambling Thoughts

Just thinking out loud... I am sure it's crazy talk.

Several colleagues and I have been discussing this for quite some time, so I figure it's past time to get input. The premise is - Get out of the way! The topic is data.

As coordinators we work with numerous data providers. Inevitably we get rich data from locals, often attempt to standardize it, and publish it out for consumption in a host of formats and services. We (AR GIO) have been repeating this process for about 7 years. We have gotten better, but we are still not efficient. The time from manufacture to shelf is far too long: months instead of days or hours.

This is all well and good, but what if we are the problem because we are in the way? There are a host (no pun intended) of companies/organizations that can standardize and publish the data for consumption far more efficiently than we can. Yes, I concede they have a great deal more money, staff, and likely brain power than I do. This is not a bad thing, but it raises the question: how do I point, provide, enable the process, and get out of the way?

A couple of us were just chatting and came to the conclusion many of you have already come to. If a company/organization uses the rich data and publishes it in a format that can be consumed by all of the major GIS packages, then that is one less thing we have to do. Let's go find more data to feed the monster.

Next series of questions.
1) How do we feed the monster? Shapefile/GeoTIFF and FTP? No one can argue with the efficiency.
2) Do we continue to maintain all of the web services? Once integrators provide the data back as web services that the major GIS packages can consume, the answer likely changes.
3) I'm at a loss but feel sure there is a 3rd, 4th, and so on.

So maybe we are quickly coming full circle? It's just a question. I am really interested in solving the reconciliation of deltas from all the various sources (Google, OSM, city, county, state) and feeding that data back out (city, county, state, etc.). Bet someone has that figured out. Ping me if you do.
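In the meantime, here is a minimal sketch of the "latest edit wins" flavor of that reconciliation, in Python. It assumes every source can hand over features keyed by a shared ID with a last-updated timestamp; the field names and toy records are made up for illustration, not anything our sources actually publish today.

    from datetime import datetime

    def reconcile(sources):
        """Merge feature snapshots from several sources, keeping the most
        recently updated copy of each feature (latest edit wins)."""
        merged = {}
        for source_name, features in sources.items():
            for feature_id, record in features.items():
                best = merged.get(feature_id)
                if best is None or record["updated"] > best["record"]["updated"]:
                    merged[feature_id] = {"source": source_name, "record": record}
        return merged

    # Toy example: two sources disagree on one road segment.
    sources = {
        "county": {"RD-001": {"name": "Main St", "updated": datetime(2009, 9, 1)}},
        "osm": {
            "RD-001": {"name": "Main Street", "updated": datetime(2009, 10, 1)},
            "RD-002": {"name": "Oak Ave", "updated": datetime(2009, 8, 15)},
        },
    }

    for fid, entry in sorted(reconcile(sources).items()):
        print(fid, "from", entry["source"], "->", entry["record"]["name"])

Latest-edit-wins is obviously naive (an authoritative county edit can lose to a sloppy crowd edit), so the real answer probably weights sources, but the bookkeeping looks roughly like this.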

3 comments:

  1. Most data changes come from the geospatial folks catching up to speed by adding or heavily revising existing data. As our data matures, the deltas will likely become much smaller in number and size. New features will be easy to pick up visually for things like streets, where the existing features don't fill the canvas (the way parcels do). But attribute-level changes are harder to detect. Some basic change detection tools would go a long way (do we want to work together to define what these might look like? a rough sketch of one follows these comments) until we get everyone following a consistent standard for documenting their changes in some sort of feature-level metadata scheme (which could be as simple as a last-update date stored in a field).

  2. Agreed. What I am interested in is what the (hate to even use this term) full life cycle looks like, even if it is a pie-in-the-sky concept. How are deltas ingested, redistributed, (re)attributed/updated/modified, and republished?

    As I read this, those words even look dated. #coal

  3. I hope we transcend ETL...it just doesn't need to be part of the equation anymore...(see Google)

    GDB Replication...serious issues all around but possible with the right situation

    Centralized databases (we don't need 3141 county road databases, do we? Or 51 state databases for that matter)...if we get the data model right, this is the answer...with crowd-sourced verification and edit suggestions.

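Per the first comment above, here is a rough sketch of what a basic change detection pass could look like: two snapshots of the same layer keyed by feature ID, compared for adds, deletes, and attribute edits. The IDs and attribute names are hypothetical, and real layers would also need geometry comparison, which this skips.

    def detect_changes(old, new):
        """Compare two snapshots of the same layer, keyed by feature ID.
        Returns feature IDs that were added, removed, or had attribute edits."""
        old_ids, new_ids = set(old), set(new)
        added = new_ids - old_ids
        removed = old_ids - new_ids
        changed = {fid for fid in old_ids & new_ids if old[fid] != new[fid]}
        return added, removed, changed

    old_snapshot = {
        "RD-001": {"name": "Main St", "surface": "paved"},
        "RD-003": {"name": "Elm St", "surface": "gravel"},
    }
    new_snapshot = {
        "RD-001": {"name": "Main Street", "surface": "paved"},  # attribute edit
        "RD-002": {"name": "Oak Ave", "surface": "paved"},      # new feature
    }

    added, removed, changed = detect_changes(old_snapshot, new_snapshot)
    print("added:", added, "removed:", removed, "changed:", changed)

A last-update date stored in a field, as the comment suggests, would let you skip the full comparison and only diff features touched since the last pull.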
