Job Specifications
The Position
Global Fishing Watch (GFW) started its journey almost 10 years ago, creating the first global dataset of fishing based on public vessel tracking data. Since then, we have not only created tools for users to make that dataset actionable through an online platform and APIs, but also continued to expand our released data products.
Our pipeline is a series of processes that ingest data sources (e.g. Automatic Identification System (AIS), satellite imagery, vessel registries, external partner data sources), compile, clean, and aggregate the information, and then apply a variety of algorithms and machine learning models to classify vessel activity (e.g. fishing, port visits, transshipments), behavior, and identity. The data pipeline outputs are production BigQuery tables that fuel our public-facing products (APIs, Map, and Data Download Portal), our analytic reports, and research papers.
We then make these datasets available through our multiple products:
Public open map platforms with advanced data navigation, visualization, and real-time report generation allow users with no GIS or big data experience to obtain the data they need,
Application Programming Interfaces (APIs) that allow users to download our data or connect it to their own platforms,
Bulk data downloads,
Scientific journal publications
From a team of 5 researchers and software engineers, we have expanded to an organization of ~100 people with dedicated teams, including Research, Engineering, and Product. For this reason, we are looking to expand our Data Team. The Data Team began four years ago to manage the growing need for cross-team coordination and communication on data product releases. Its sole focus is to manage the quality, documentation, accessibility, and ethical review of data products, from initial integration through to public production. As we’ve grown, the quality assurance work under the Data Team has also continued to evolve.
Role
The Data Quality Analyst will play a key role in the Data Team by helping to ensure the highest quality of data generated and data delivery in all Global Fishing Watch products. This will be done in multiple ways:
Conducting quality assurance checks and monitoring of evolving Big Data datasets, including AIS, VMS, SAR, VIIRS, and other types of vessel identification data from different sources,
Making the level of QA applied to each dataset visible to the organization and the public, and owning the roadmap for adding QA to existing and new data,
Spreading a QA culture in the different areas of Global Fishing Watch through aiding in cross-team QA coordination and communication throughout each data pipeline
We are looking for a person who has software development or data QA experience. The role requires well-developed analytical skills, the ability to conduct impact assessments, cross-discipline communication skills, creativity, and an understanding of data changes within APIs of evolving Big Data datasets. The person must not be afraid of big challenges.
Principal Duties And Responsibilities Include
End-to-End Review
Provide thorough reviews and assessments of methodology changes in data developed by the Research and Engineering teams at GFW, including different types of data such as AIS, VMS, SAR, VIIRS, etc.
Work collaboratively across teams, and specifically with Engineering, to develop automated and continual quality control checks on production pipeline outputs to detect anomalies immediately and ensure that the data published in APIs is consistent with the pipeline outputs
Understand and act as a knowledge source of GFW data sources and pipelines, from ingestion of data to published data products in APIs
Manage QA tasks in JIRA
Aid in transitioning public facing data pipelines from an early ‘prototype’ stage to ‘production’ by improving release processes, pipeline consistency and stability, as well as promoting quality assurance best practices across various teams
Communication
Work with other team members to create, organize, and track metrics that communicate quality assurance improvements and changes to both internal and external audiences
Highlight the improvements and progress in quality assurance at GFW to internal and external audiences through the creation of public dashboards and performance metrics
Aid in standardizing metadata requirements and data documentation
Planning and Development of QA Policies and Culture
Work closely with the other Data Team members to institutionalize Global Fishing Watch standard QA protocols across all projects and teams
Along with the Data Team and Engineering Team, continue to advance the understanding and cohesive use of data staging concepts and of the requirements datasets must meet to move from proof-of-concept to published
Aid in the standardization and improvement of QA tools used for pipeline review, including review templates for change impact assessment, automated monitoring alerts, and anomaly detection
Other
Develop and maintain an informed awareness of relevant topics to continually innovate on the application of GFW products to aid in the sustainability of our ocean
Assist other team members with any ongoing needs, such as review of user reported data errors, data communication with external partners, and answering questions regarding data and our methodologies from external partners and internal teams
Job Requirements
Required Experience and Competencies
Master's degree in a relevant area or equivalent professional experience (such as Bachelor’s degree with 5+ years of experience in quality assurance, software engineering, data analytics, and/or fisheries science)
Demonstrated track record of attention to detail, for example, catching issues while reviewing pull requests or conducting root cause analysis of issues
Demonstrated skills in QA/QC processes, specifically the ability to develop and carry out testing methodology to quantify change, inaccuracies, and inconsistencies in the data
Trained in presenting QA findings and metrics in a digestible format to various audiences, with a level of detail suited to each
Fluency with SQL and another coding language, such as Python or R
Proficiency in data visualization either through BI tools such as Power BI or Looker, or through packages such as ggplot2, Shiny, matplotlib, or plotly
Experience with QA of public facing data products
Demonstrated knowledge of best practices in evolving data and data quality
Knowledge of agile methodologies to work on sprints and participate in planning meetings
Preferred
Skilled data communicator and educator, trained in data ethics and accessibility standards
Knowledge of big data pipelines and APIs
Experience in fisheries compliance and/or monitoring, control and surveillance, especially regarding fisheries supply chain
Fluency in Google BigQuery
Knowledge of AIS, Visible Infrared Imaging Radiometer Suite (VIIRS), synthetic aperture radar (SAR), VMS and other sa...