Director, Cloud Operations
Published: 31st August 2022
Closing Date: 30th November 2022
At Bristol Myers Squibb, we are inspired by a single vision – transforming patients’ lives through science. In oncology, hematology, immunology and cardiovascular disease – and one of the most diverse and promising pipelines in the industry – each of our passionate colleagues contributes to innovations that drive meaningful change. We bring a human touch to every treatment we pioneer. Join us and make a difference.
BMS is seeking an experienced, hands-on Director of Cloud Operations reporting to the Executive Director, IT Hosting Operations.
As Director of Cloud Operations, you will work alongside cloud architects and engineers, network, cyber, and application development teams to support the transition and operation of BMS business applications in the AWS, Azure, and GCP clouds.
Responsibilities will include technical design, implementation, and ongoing support services assuring quality and compliance in a regulated Pharmaceutical environment. Areas of focus will include IaaS, implementation and cost optimization, delivering mission critical infrastructure and ensuring the highest levels of availability, performance and security to our internal customers.
- Hands-on leadership of cloud operations and alignment with on-premises data center operations
- Serve as a senior-level technical point of contact for enterprise customers
- Deploying and managing highly available, fault-tolerant systems on AWS, Azure, and GCP
- Deploying and maintaining Linux (RHEL), Windows, and open-source databases (MySQL, MariaDB, PostgreSQL)
- Selecting the appropriate cloud service based on compute, data, or security requirements
- Identifying appropriate use of cloud operational best practices
- Understanding of AWS, Azure, GCP usage costs and FinOps practices for cloud cost optimization
- Receive and record incident-related information using tools and processes, select appropriate actions to resolve issues, and communicate the solution or action plan to the client
- Manage and perform design and testing of disaster recovery plans for mission-critical applications
- Collaborate with the Cloud Computing DevOps team and associated SMEs as advisors on all technology issues
- Understand high level cloud application architecture and provide initial analysis on P1 and P2 incidents
- Evaluate effectiveness of the company’s systems and make recommendations for improvement
- Coordinate and engineer cloud zones and regional architecture for data protection, DR, and manual redundancy failovers
- Provide daily, weekly & monthly integrated service management reports across all solutions.
- Provide support in the configuration and integration of ITIL tools (ServiceNow).
- Bridge support between Cloud Engineering, Cyber, and the various consumers of cloud services (application support, COTS vendors, and power users)
- Responsible for ensuring production services hosted in the Cloud are compliant and protected
- Will provide 2nd level support to DevOps teams and assist with the implementation of applications when the scope demands
- Will evaluate applications against business requirements for HA and DR (eliminate or minimize cross zone dependencies and single points of failure)
- Responsible for enforcing the standards as set forth by Hosting, Cyber and Cloud Engineering
- Responsible for ensuring Cloud services, standards, and optimization requirements are maintained
- Responsible for security and patch remediation enforcement, and will manage Cyber CSIRT events when required
- Will provide technical and subject matter expert support to MSPs
The ideal candidate will need to demonstrate:
Deep technical skill in cloud service infrastructure, software architecture, and cloud computing, with a demonstrated ability to think tactically and strategically about solutions to business, product, and technical challenges. The candidate will have significant experience with cloud performance monitoring tools such as Amazon CloudWatch and Azure Monitor, and deep technical knowledge of AWS EC2, VPC, Elastic Load Balancing, Auto Scaling, Lambda, container management, S3, S3 Intelligent-Tiering, EBS, RDS, and AWS Systems Manager automation, as well as the Azure and GCP equivalents.
The ability to use professional knowledge and problem determination / source identification skills to resolve problems involving APIs, application services, IaaS, PaaS, SaaS, microservices, containers, Kubernetes nodes, middleware components, network, security, and infrastructure issues alike. If unable to resolve an incident, the candidate will triage and route it to the appropriate level of support.
The candidate will be experienced with the ITIL processes of Incident (including Critical Incident Management), Problem, Change management and Integrated Service Level Management.
The duties include liaising with external Service Providers (SPs) and software vendors to ensure compliance and acting as a point of escalation for IT Hosting issues. With SPs and software vendors, and in alignment with BMS internal teams, this position supports Incident/Problem management, CMDB quality assurance, and Problem/CAPA coordination. Duties also include monitoring Hosted Services for quality, performance, cost, and reliability. The position is responsible for assuring SPs' delivery against committed service level agreements (SLAs) and managing formal recovery plans that result from SLA misses.
Required Technical and Professional Expertise
- At least 15 years of experience in an IT professional role and 5+ years in a hands-on management role supporting a public/private cloud environment such as AWS, Azure, or GCP
- Professional experience managing/operating production systems on cloud service provider infrastructure
- Experience with proactive monitoring, backup and recovery, and auditing systems
- Hands-on experience with the AWS CLI and SDKs/API tools
- Familiarity with multi-cloud Financial Management (FinOps) tools such as Cloudability, MyXalytics or others
- AWS data protection and lifecycle experience using technologies like EBS snapshots, S3/Glacier, data replication between availability zones and regions, etc.
- Good grasp of fundamental security concepts with hands-on experience implementing security controls and compliance requirements
- Extensive experience using ITIL processes, ideally managed using ServiceNow.
- Required to be on call on a 24×7 basis to manage major outages or escalations.
- Working knowledge of Pharmaceutical industry regulatory requirements, qualification and validation of applications.
- Must be organized in documenting qualified processes so that Cloud tenants adhere to the standards required to maintain operations in the Cloud
- Strong verbal and written communication skills (content creation such as whitepapers, proofs of concept, etc.). Ability to present complex technology decisions to both technical and non-technical audiences at all levels in the organization
- Skill and experience with managing a budget, including forecasting projections, modeling future consumption, and meeting financial commitments
- Experience responding to regulatory authorities such as the FDA, including internal/external audits on items such as GxP and SOX
- Demonstrated ability to develop complex business justifications and to implement and manage new IT capabilities
- Experience with building and leading effective technical teams and developing staff members
- Partnering with IT Business Partners, Digital Capability Management, developers, and researchers to understand business objectives and priorities and to develop aligned IT solutions
Preferred Certification Requirements:
- AWS Certified Solutions Architect, Associate or Professional (preferred)
- AWS Certified SysOps Administrator Associate (preferred)
- AWS Certified DevOps Engineer Professional (a plus)
BS or equivalent in information technology, computer engineering, computer science, or related field
- Company: Bristol Myers Squibb
- Type: Full-time
- Seniority: Entry-level Contributor
- FinOps Certifications Required: None