broadinstitute/monster

The Monster Team

Monster Slack | Monster CI Slack

New to the team? Start here.

People

Name          Role               GitHub
Jeff Korte    Product Owner      @JeffKorte
Quazi Hoque   Software Engineer  @quazi-broad
Drew Herbst   Tech Lead          @aherbst-broad

GitHub Teams

  • DSP Monsters - Team for repositories under the broadinstitute org
  • Monster - Team for repositories under the DataBiosphere org

Projects

Data Modeling

Linked Data definitions for the Terra Core Data Model, with extensions for unmodeled datasets.

Documentation

GitHub repos

Data Ingest

Pipelines for moving data into the Jade Data Repository.

Documentation

GitHub repos

  • ClinVar - ETL pipeline for the ClinVar dataset
  • ENCODE - ETL pipeline for the ENCODE dataset
  • Dog Aging - ETL pipeline for the Dog Aging Project dataset
  • HCA - ETL pipeline for the Human Cell Atlas (HCA) dataset

Ingest Utilities

Tools and libraries used to support the top-level ingest pipelines.

GitHub repos

  • Base utilities - Common utilities shared across our batch ETL projects
  • XML-to-JSON-list - Command-line tool for mechanical conversion of XML into Beam-friendly JSON

Operations

Infrastructure, configuration, and shared code used to manage developing and deploying our services.

GitHub repos

Semi-Archived

The repositories in this section are still in use, but we're trying to move away from them.

Data Ingest Framework

Our first attempts at data ingest envisioned a framework of dataset-agnostic services. We moved away from that pattern because it introduced significant overhead compared with custom pipelines built from common command-line tools.

GitHub repos

  • Transporter - Bulk file-transfer system
  • Storage Libs - Utility libraries for I/O against external storage systems

About

Hub for the Monster team in DSP Data Engineering
