Creative Information Technology, Inc.

Data Engineer – Baltimore City, MD

Creative Information Technology, Inc.  •  Falls Church, VA (Onsite)  •  15 days ago

Job Description

About us

Creative Information Technology, Inc. (CITI) is an IT enterprise recognized for its customer service and innovation. We serve both government and commercial sectors, offering solutions in Healthcare IT, Human Services, Identity Credentialing, Cloud Computing, and Big Data Analytics. With clients in the US and abroad, we hold key contract vehicles including GSA IT Schedule 70, NIH CIO-SP3, GSA Alliant, and DHS-Eagle II.

Join us in driving growth and seizing new business opportunities.

Background

The client is seeking a hands-on Data Engineer to design, develop, and optimize large-scale data pipelines in support of its Enterprise Data Warehouse (EDW) and Data Lake solutions. This role requires deep technical expertise in coding, pipeline orchestration, and cloud-native data engineering on AWS. The Data Engineer will be directly responsible for implementing ingestion, transformation, and integration workflows, ensuring data is high-quality, compliant, and analytics-ready. This role may support other projects or teams within MDH as needed.

The Data Engineer is responsible for designing, building, and maintaining data pipelines and infrastructure to support data-driven decisions and analytics. Specific tasks include:

  1. Design, develop, and maintain data pipelines and extract, transform, load (ETL) processes to collect, process, and store structured and unstructured data
  2. Build data architecture and storage solutions, including data lakehouses, data lakes, data warehouses, and data marts, to support analytics and reporting
  3. Develop data reliability, efficiency, and quality checks and processes
  4. Prepare data for data modeling
  5. Monitor and optimize data architecture and data processing systems
  6. Collaborate with multiple teams to understand requirements and objectives
  7. Administer testing and troubleshooting related to performance, reliability, and scalability
  8. Create and update documentation

Role and Responsibilities

Hands-On Data Pipeline Development

  • Design, code, and deploy ETL/ELT pipelines across bronze, silver, and gold layers of the Data Lakehouse.
  • Build ingestion pipelines for structured (SQL), semi-structured (JSON, XML), and unstructured data using PySpark/Python on AWS Glue or EMR.
  • Implement incremental loads, deduplication, error handling, and data validation.
  • Actively troubleshoot, debug, and optimize pipelines for scalability and cost efficiency.

EDW & Data Lake Implementation

  • Develop dimensional data models (Star Schema, Snowflake Schema) for analytics and reporting.
  • Build and maintain tables in Iceberg, Delta Lake, or equivalent open table formats (OTF).
  • Optimize partitioning, indexing, and metadata for fast query performance.

Healthcare Data Integration

  • Build ingestion and transformation pipelines for EDI X12 transactions (837, 835, 278, etc.).
  • Implement mapping and transformation of EDI data with FHIR and HL7 frameworks.
  • Work hands-on with AWS HealthLake (or equivalent) to store and query healthcare data.

Data Quality, Security & Compliance

  • Develop automated validation scripts to enforce data quality and integrity.
  • Implement IAM roles, encryption, and auditing to meet HIPAA and CMS compliance standards.
  • Maintain lineage and governance documentation for all pipelines.

Collaboration & Delivery

  • Work closely with the Lead Data Engineer, analysts, and data scientists to deliver pipelines that support enterprise-wide analytics.
  • Actively contribute to CI/CD pipelines, Infrastructure-as-Code (IaC), and automation.
  • Continuously improve pipelines and adopt new technologies where appropriate.

Minimum Qualifications

Specialized experience: The candidate should have experience as a data engineer or in a similar role, with a strong understanding of data architecture and ETL processes. The candidate should be proficient in programming languages for data processing and knowledgeable about distributed computing and parallel processing.

  • This position requires a bachelor’s or master’s degree from an accredited college or university with a major in computer science, statistics, mathematics, economics, or a related field. Three (3) years of equivalent experience in a related field may be substituted for the Bachelor’s degree.
  • 3+ years of hands-on experience building, deploying, and maintaining data pipelines on AWS or equivalent cloud platforms.
  • Strong coding skills in Python and SQL (Scala or Java a plus).
  • Proven experience with Apache Spark (PySpark) for large-scale processing.
  • Hands-on experience with AWS Glue, S3, Redshift, Athena, EMR, Lake Formation.
  • Strong debugging and performance optimization skills in distributed systems.
  • Hands-on experience with Iceberg, Delta Lake, or other open table formats (OTF).
  • Experience with Airflow or other pipeline orchestration frameworks.
  • Practical experience in CI/CD and Infrastructure-as-Code (Terraform, CloudFormation).
  • Practical experience with EDI X12, HL7, or FHIR data formats.
  • Strong understanding of Medallion Architecture for data lakehouses.
  • Hands-on experience building dimensional models and data warehouses.
  • Working knowledge of HIPAA and CMS interoperability requirements.

About Creative Information Technology, Inc.

Who We Are

Creative Information Technology, Inc. (CITI) continues to prove itself as a forward-thinking information technology company, one that leverages the latest technologies to provide its clients with solutions to complex real-world problems. Over the last 20 years, CITI has been recognized for its dedication to customer service and commitment to innovation by many of the departments and agencies it has served in both the government and commercial sectors. CITI has grown into a diversely talented and motivated IT enterprise with clients in the US and abroad.

Game-changing technologies

CITI’s core lines of business include Healthcare IT Solutions and Services, Human Services, Identity Credentialing and Access Management, Cloud and Mobile Computing, Big Data Analytics, and Business Intelligence.

Investing in the future

Our facilities include a state-of-the-art development integration lab and testing facility where we develop our software systems and solutions and explore the next best technologies on behalf of our customers.

Government Contract Vehicles

CITI holds multiple government contract vehicles that encourage a close relationship with the government agencies we support and make our products more visible. Some of the contract vehicles we hold are: GSA IT Schedule 70, NIH CIO-SP3, GSA Alliant, DHS-Eagle II, and various IT Service contracts.

The CITI Advantage

• CMMI Level 4 appraised

• ISO 9001, ISO 27001, ISO 20000

• Oracle and Microsoft Partner

• SBA Champion Award

• Virginia Fantastic 50

• Inc. 500

• ACT-IAC Top 20 Excellence.Gov

• ACT-IAC Igniting Innovation Award

Industry: IT & Software
Company Size: 201-500 employees
Headquarters: Falls Church, VA
Year Founded: 1996