SageCor Solutions

Senior HPC Administrator (IMC - 002)

SageCor Solutions  •  Maryland / Alaska (Onsite)  •  29 days ago

Job Description

Serving Maryland and the Greater Washington, D.C. area, SageCor Solutions (SageCor) is a growing company that brings complete engineering services and true full-lifecycle System Engineering services to areas requiring (or desiring) nationally recognized expertise in high-performance computing, large data analytics, and cutting-edge information technologies.

Active TS/SCI w/ Polygraph required.

Requirements

• Configure and manage Linux and Windows (or other applicable) operating systems; install and load operating system software; troubleshoot, configure, and maintain the integrity of network components; and implement operating system enhancements to improve security, reliability, and performance
• Administer, monitor, and maintain HPC systems, including compute nodes, storage, networking, and software stacks
• Support IT systems, including day-to-day operations, monitoring, and problem resolution for client, server, storage, network, and mobile devices
• Implement and maintain automation tools for system provisioning, configuration management, and monitoring
• Provide support for implementation, troubleshooting and maintenance of IT systems
• Manage the daily activities of configuration and operation of IT systems
• Provide assistance to users in accessing and using IT systems
• Optimize system operations and resource utilization, and perform system capacity analysis and planning
• Apply in-depth experience in troubleshooting IT systems
• Analyze and resolve complex problems associated with server hardware, applications and software integration
• Contribute to performance benchmarking, system tuning, and capacity planning
• Support researchers by providing technical expertise and resolving IT-related roadblocks or issues
• Document system administration procedures and contribute to knowledge-sharing initiatives

Technical skills:

• Experience administering Linux-based servers and HPC clusters, including job schedulers (e.g., Slurm, LSF, PBS)
• Experience configuring and managing Virtual Private Network (VPN) clients and servers
• Scripting/programming skills (C and Python)
• Knowledge of:
o System automation tools (e.g., Ansible)
o System provisioning tools (e.g., Warewolf)
o Distributed storage systems (e.g., Lustre, BeeGFS)
o Containerization (e.g., Docker, Apptainer)
o Installing, maintaining and using infrastructure and performance monitoring and optimization tools (e.g., Grafana, Prometheus)
o Setting up and executing benchmarks in an HPC environment and analyzing their results systematically

Qualifications:

• Active Top Secret/SCI clearance with polygraph
• Preferably meets DoD 8140.01 or DoD 8570.01-M training and certification requirements

Consistent with federal and state law where SageCor conducts business, SageCor Solutions provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability or veteran status, or any other protected class.

About SageCor Solutions

SageCor is a complete engineering services company that employs top-level talent with the expertise to provide customers with true full-lifecycle System Engineering Services. SageCor offers services beginning as early as the Research phase, where basic analysis is performed and proof-of-concept prototypes are built.

During the crucial planning phase of a system, SageCor’s team of Systems Engineers, SETA Services professionals, and Knowledge Management experts makes sure each customer is prepared to define its program requirements, begin design activities, and manage the planning for development through lifecycle support and sustainment.

After System Development begins, SageCor has Hardware and Software engineers prepared to work alongside our customers in the Design and Build of the system. Our processes ensure robust technical expertise, access to time-tested procedures, and successful program execution. When the primary system components are completed, SageCor’s team of Integration and Test engineers will integrate and test the customer’s system to guarantee the components function as part of the larger system. Upon completion of Acceptance Testing, SageCor employs Systems Administrators to ensure that the system remains sustained and operational for the length of the mission. SageCor’s “cradle-to-grave” experts give our customers the peace of mind that we can provide the expertise to meet any challenge.

Industry: IT & Software
Company Size: 11-50 employees
Headquarters: Baltimore, Maryland
Year Founded: Unknown