Qld Environment Dept pioneers cloud-based analytics platform
In the 1960s, the area now known as East Trinity Reserve was a natural wetland, rich with native plants and animals. Today the reserve is serving as a pilot for a cloud-based data and analytics transformation for Queensland’s Department of Environment and Science (DES), designed to break open the Department’s on-premise silos of information and make them more widely accessible and useful to accredited scientists.
The DES has six directorates focused on areas ranging from environmental management to climate modelling. Data collections cover water, air and soil measurements, as well as flora and fauna surveys, and encompass remote sensing and satellite imagery.
Although the Department has its own high-performance computing environment to manage data processing, its data collections have been largely siloed in the past, with different scientists using different systems to store, manage and analyse their data. The DES was thus concerned that it was missing opportunities to share data among scientists and risking duplication of effort. There was also a risk that the key people in the Department who understood where data was stored and how it could be accessed might retire or leave the organisation, taking with them the knowledge accumulated over the years.
To encourage greater data access, enhance analytic opportunities and improve the governance and security surrounding its collections, the DES is modernising its approach to data and analytics, starting with a proof of concept using data from East Trinity Reserve.
Enhanced data governance and improved access
Working with Microsoft partner Versor, the DES identified East Trinity Reserve as a good candidate for its data and analytics project designed to demonstrate the value of cracking open data silos. Since remediation of the reserve began in 2001, the DES has collected data every 10 minutes from sensors located in 15 stations. A significant amount of data has been collected over the past 20 years — but it’s always been siloed and hard to analyse.
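To put that volume in rough terms, the arithmetic can be sketched as below. This is a back-of-envelope estimate only: it assumes a single reading per station per 10-minute interval, no outages, and 365-day years, none of which the DES has stated. Stations with several sensors would multiply the figure accordingly.

```python
# Rough estimate of the number of readings collected at East Trinity Reserve.
# Assumptions (not from the DES): one reading per station per interval,
# no gaps in collection, and leap days ignored.
INTERVAL_MINUTES = 10
STATIONS = 15
YEARS = 20

readings_per_hour = 60 // INTERVAL_MINUTES            # 6
per_station_per_year = readings_per_hour * 24 * 365   # 52,560
total_readings = per_station_per_year * STATIONS * YEARS

print(f"{per_station_per_year:,} readings per station per year")
print(f"{total_readings:,} readings in total")
```

Under those assumptions the total comes to roughly 15.8 million timestamped readings.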
In the past, scientists relied on manual processes to manage sensor data cleansing, error analysis, storage and version management. This meant there was a huge backlog of unvalidated monitoring data and low confidence about the overall quality of data, and scientists spent more time wrangling data than conducting scientific work.
The data project, which was rolled out in around two months, involved transferring scientific data into Azure Data Lake, where it is cleansed using Azure Databricks and loaded into an SQL database, which is then available for analysis using Power BI. Daniel Brough, Science Leader for Science Information Services at the DES, noted, “This is a way to kickstart that modernisation of not just our process, but also our governance and how we deal with data as well.
“I think moving to some of these new systems … is really improving the way we manage the data so that it’s repeatable.”
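The cleanse-then-load pattern described above can be illustrated with a minimal local sketch. It is not the DES or Versor implementation and does not use Azure Data Lake, Azure Databricks or Power BI: Python's built-in sqlite3 module stands in for the SQL database, and the station IDs, the pH variable, the -999 fault sentinel and the function names are all hypothetical.

```python
import sqlite3
from datetime import datetime

# Hypothetical raw readings: (station_id, ISO timestamp, pH value as text).
RAW_READINGS = [
    ("ST01", "2021-03-01T00:00:00", "6.8"),
    ("ST01", "2021-03-01T00:00:00", "6.8"),   # exact duplicate
    ("ST01", "2021-03-01T00:10:00", "-999"),  # assumed sensor-fault sentinel
    ("ST02", "2021-03-01T00:00:00", "7.1"),
    ("ST02", "2021-03-01T00:10:00", "n/a"),   # unparseable value
]

PH_MIN, PH_MAX = 0.0, 14.0  # conventional bounds of the pH scale


def cleanse(rows):
    """Drop duplicates and readings that are unparseable or out of range."""
    seen = set()
    clean = []
    for station, stamp, raw_value in rows:
        key = (station, stamp)
        if key in seen:
            continue  # a valid reading for this station and time already exists
        try:
            value = float(raw_value)
            datetime.fromisoformat(stamp)  # reject malformed timestamps
        except ValueError:
            continue
        if not PH_MIN <= value <= PH_MAX:
            continue
        seen.add(key)
        clean.append((station, stamp, value))
    return clean


def load(rows, conn):
    """Write cleansed readings to a SQL table ready for reporting queries."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings ("
        "station TEXT, observed_at TEXT, ph REAL, "
        "PRIMARY KEY (station, observed_at))"
    )
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
clean_rows = cleanse(RAW_READINGS)
load(clean_rows, conn)
print(conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0])
```

Because the validation rules live in code rather than in a manual routine, the same checks run identically on every batch, which is the repeatability Brough describes.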
More robust data governance also sets up opportunities to augment data collections with external information. For example, adding in weather forecasts would provide scientists with early warnings of overcast conditions that might prompt them to replace sensor batteries that would normally be charged by solar.
Improved data governance also brings peace of mind regarding data sharing both within the DES and externally. In the East Trinity Reserve case, for example, there are Indigenous-owned ecotourism businesses that are hungry for data about the health of the environment. Power BI dashboards can be developed for specific users to deliver the information they need, with the confidence that accurate and timely data has been used to generate the reports.
“The East Trinity team is looking forward to being able to do more with the data than we have in the past, and that’s partly because of the access on a single platform to all the tools,” said Evan Thomas, Science Leader Soil and Land Resources at the DES. “So there’s a bit of skilling up that we’ve got to do, but I think the opportunity is there now to do it, and to be a little more able to service our own needs internally. And also reach out to others within the Department for more ready and reliable support, which … just wasn’t possible.”
So far, the pilot has saved the equivalent of half a full-time employee's effort. The initial workload indicates that once the system is in full production with all data loaded, authorised scientists will have unfettered access to data and tools that should simplify analysis, allowing them to focus on the science.
Widespread opportunities to improve science outcomes
DES Business Analyst Jennifer Richards says that following the East Trinity pilot there are plans to broaden the strategy in order to support the entire science division. She says the focus is on getting better use from the division’s data, making it more findable, accessible, interoperable and reusable.
“The science division is a division built on data … [but] at the moment we have all these roadblocks in making that data reusable,” Richards said. “The ultimate aim is to get data out onto open data and in use outside of the Department as well as inside of the Department.”
Brough added that the modern data and analytics platform also frees up scientists so they have more time to do science, rather than chasing and converting data. He quoted Michelle Martens, DES Land Resource Officer, saying she was “really looking forward to being able to actually get in and analyse the data … not just looking at a graph every morning to work out if everything’s still working”.
Martens added, “I’m really looking forward to having more time to not just wrangle equipment and keep things operating on site, but to actually look at some of the longer-term trends. In the stuff we were graphing before, I couldn’t graph more than three months at a time, or it would fall over because of the dataset.”
For scientists, that transparency and access to entire longitudinal datasets, and the ability to use cutting-edge analytical tools to see patterns and trends emerge, are critically important, allowing them to make evidence-backed decisions with real and lasting impact. At present, two of the 20 years of East Trinity Reserve data have been loaded onto the platform, with another 18 to follow.
“I’m so excited at the prospect of looking at 20 years of data from the one station,” Martens concluded. “That’s something I haven’t been able to do all in one big, long series before.”