Job Description

At Bank of America, we are guided by a common purpose to help make financial lives better through the power of every connection. We do this by driving Responsible Growth and delivering for our clients, teammates, communities and shareholders every day.

Being a Great Place to Work is core to how we drive Responsible Growth. This includes our commitment to being an inclusive workplace, attracting and developing exceptional talent, supporting our teammates’ physical, emotional, and financial wellness, recognizing and rewarding performance, and making an impact in the communities we serve.

Bank of America is committed to an in-office culture with specific requirements for office-based attendance, while allowing an appropriate level of flexibility for our teammates and businesses based on role-specific considerations.

At Bank of America, you can build a successful career with opportunities to learn, grow, and make an impact. Join us!

This role is responsible for designing, developing, and operating container‑based application platforms that support Generative AI and Large Language Model (LLM) workloads at enterprise scale. The engineer will partner closely with application developers, data scientists, and platform teams to ensure AI workloads are deployed securely, efficiently, and reliably across Kubernetes‑based environments.

The successful candidate will focus on building and managing GPU‑accelerated containerized services, enabling scalable inference platforms, and supporting production‑grade AI frameworks. This role operates within a large‑scale enterprise environment and contributes to Agile delivery, DevOps automation, and continuous platform improvement.

Key responsibilities include:

Designing, deploying, and maintaining containerized applications on Kubernetes and OpenShift platforms.
Supporting Generative AI inference environments, including model packaging, deployment, scaling, and performance optimization.
Enabling GPU‑based workloads and ensuring efficient resource utilization and isolation.
Collaborating with cross‑functional teams to deliver secure, resilient, and production‑ready AI platforms.
Contributing to CI/CD pipelines, infrastructure automation, and operational best practices.
Participating in Agile ceremonies and supporting iterative, high‑quality software delivery.

Required Skills:

8+ years in a technology environment, with 5+ years of experience using container tools.
Strong hands‑on experience with Kubernetes, including OpenShift, and container tools such as Docker and Podman.
Deep understanding of container orchestration concepts, including scheduling, networking, storage, configuration, and secrets management.
Experience operating container platforms supporting GPU‑accelerated workloads.

Proficiency in Python for developing and operationalizing AI‑driven applications.
Hands‑on experience with Large Language Models (LLMs) and inference‑focused frameworks, including:
- vLLM
- NVIDIA Triton Inference Server
- NVIDIA NeMo framework
Understanding of AI workload patterns, including real‑time and batch inference, scaling strategies, and high‑throughput serving.

Experience working in large‑scale enterprise environments with strong requirements for security, reliability, and compliance.
Familiarity with CI/CD pipelines and DevOps practices for containerized applications.
Experience contributing within Agile frameworks (Scrum, Kanban, or SAFe).
Working knowledge of infrastructure‑as‑code and automated deployment approaches.

Strong problem‑solving skills and ability to troubleshoot complex platform issues.
Clear, concise communication with technical and non‑technical stakeholders.
Ability to work effectively across engineering, infrastructure, security, and data science teams.

Desired Skills:

Experience operating container platforms in regulated or highly secure environments.
Exposure to observability tools (logging, metrics, tracing) for distributed and AI‑driven systems.
Experience supporting multi‑tenant platforms or shared AI inference services at scale.

Skills:

Application Development
Automation
Collaboration
DevOps Practices
Solution Design
Agile Practices
Architecture
Result Orientation
Solution Delivery Process
User Experience Design
Analytical Thinking
Data Management
Risk Management
Technical Strategy Development
Test Engineering

Shift:

1st shift (United States of America)

Hours Per Week:

About Bank of America

Bank of America is one of the world's largest financial institutions, serving individuals, small- and middle-market businesses and large corporations with a full range of banking, investing, asset management and other financial and risk management products and services. The company serves approximately 56 million U.S. consumer and small business relationships. It is among the world's leading wealth management companies and is a global leader in corporate and investment banking and trading.

Industry

Finance & Insurance

Company Size

10,000+ employees

Headquarters

Charlotte, NC

Website

bankofamerica.com

ADS AI Services Software Engineer
