WYND Environmental Data & Oracles Service
WYND is committed to making environmental conditions visible by supplying the data needed to measure their current state accurately. To achieve this, WYND is actively building data services that fetch orbital (satellite) data as well as ground-based data from communities, organizations, and volunteers.
There is now a critical need for an intelligent data system, integrated with a marketplace economy, to fully leverage the vast potential of these data sets. Satellite observations need ground data for validation and calibration. Ground measurements of forest fires, land cover changes, and air pollution are critical to building a bridge between the global picture painted by the satellites and the contextual reality of what happens on the ground. Data collected from different sources can be challenging to locate and, given the diversity of formats, to understand. Published scientific analyses can be complex to reproduce for new use cases and are sometimes not easily comparable with other metrics and algorithms. Many scientific data products do not follow common data protocols or even offer a clean API that can be queried for another task. Finally, and perhaps most importantly, the collaborative sharing of data and algorithms often stalls because it is hard to incentivize and reward continued interest and community development. Inconsistent funding often precipitates the breakdown of relationships between data users and providers, which eventually renders data sets obsolete for operational management decisions and environmental action.
WYND data services are founded on the following tenets:
- simple trusted data
- reproducibility
- interoperability
- free exchange
- easy collaboration
This approach provides a foundation that will enable a multitude of conservation and business applications focused on ecological health, as well as turbo-charging research and analysis to advance the state of the art.
Open Data
One essential aspect of this project is that all source data is open. This ensures we don’t end up as another Microsoft Earth, a pay-to-play initiative that many cannot access and that one organization controls. We are establishing a data commons and providing value-added services on top of it. The original datasets we are working with are usually under an open license: generally Public Domain, CC0, CC-BY (attribution required), or CC-BY-SA (share-alike: derivative works must also be open).
This means that the original data (in an increasingly normalized format) will be available to all. The indexes used to search and organize that data will also be fully open, as will the various algorithms we use to process it.
However, many projects uploading “ground truth” need continuous funding to be sustainable. While we require all remote sensing data to be under an open license, we will allow such user-uploaded data to be released under a non-open license, as many groups are concerned about reuse without payment.
To provide a viable revenue stream for all data providers, we focus on value-added products on top of the data. Other teams can build products directly on the primary data for free or pay to use these extensions.
Data Value Enrichment
WYND will collect user-specific data and other open data sources, then extract, transform, and load (ETL) that data onto the base map, keyed by the base map’s location index (H3 Index). Once the user-specific data has been enriched with all other data sources, additional model weights may be applied by federated machine learning and deep learning models to further enrich the user-published data. Quality indexes are then calculated from this fused dataset to give a shareable representation of user and site behavior as it changes over time. This brings unprecedented visibility of the site and its data publishers to the whole community.
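The location-keyed join described above can be sketched roughly as follows. This is a minimal illustration, not WYND's pipeline: the record fields (`lat`, `lng`, `value`), the function names, and the `CELL_DEG` grid size are all hypothetical, and a plain latitude/longitude grid stands in for the H3 index so that the example runs with the Python standard library only. The federated learning and quality-index steps are not modeled.

```python
from collections import defaultdict
from math import floor
from statistics import mean

# Hypothetical grid size in degrees. A real pipeline would pick an H3 resolution instead.
CELL_DEG = 0.1


def cell_id(lat, lng):
    """Stand-in for an H3 index lookup: snap a point to a fixed lat/lng grid cell."""
    return (floor(lat / CELL_DEG), floor(lng / CELL_DEG))


def load_by_cell(records):
    """Extract, transform, load: group raw readings by the cell that contains them."""
    cells = defaultdict(list)
    for rec in records:
        cells[cell_id(rec["lat"], rec["lng"])].append(rec["value"])
    return cells


def enrich(user_records, open_records):
    """Fuse user readings with open-data readings that fall in the same cell."""
    user_cells = load_by_cell(user_records)
    open_cells = load_by_cell(open_records)
    fused = {}
    for cell, values in user_cells.items():
        fused[cell] = {
            "user_mean": mean(values),
            # None marks cells where no open data is available for comparison.
            "open_mean": mean(open_cells[cell]) if cell in open_cells else None,
            "n_user": len(values),
        }
    return fused
```

Because both sources are reduced to the same cell key before they are combined, adding a further data source only requires another `load_by_cell` call, whatever the source's original format.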
WYND aims to present a framework that serves as a market for innovation and that will evolve as observing techniques and algorithms advance. Still, innovation will be contained within four basic categories of descriptors and metrics of ecosystem health:
- Carbon / Living Biomass: These metrics describe the total mass of living organisms in an area or region, and can be associated with carbon stored in living plants.
- Biodiversity: These metrics represent the characterization of the breadth of the varied species within a region or ecosystem, including plant, animal, fungal, and microbial classifications.
- Degree of human intrusion: These metrics represent objective measures of human presence and landscape alteration, and include variables such as mining, logging, agricultural density and type, road density, human population density, and nighttime lights.
- Pollution: These metrics describe the degree of toxicity (i.e., the ability to cause harmful effects over long periods) of the air, water, or soil, often associated with foreign chemical contamination.
These metric classifications were designed for their comprehensiveness as well as for their orthogonality. They are comprehensive in that these metrics will conclusively describe the health of a region or ecosystem. They are orthogonal in that any one of these metrics may change in isolation from the others. For instance, while it is likely that many of these metrics would be consistent or correlated for an impacted region (e.g., the city of Los Angeles), environmental action may shift only one (e.g., with a conversion to electric vehicles, the pollution metric alone may improve). As we consider a rapidly changing ecosystem, it is likely that these metrics could evolve independently or in sequence but may not be temporally or spatially correlated.